In online advertising, advertisers use auto-bidding tools on demand-side platforms to take part in ad auctions.
Researchers have introduced the R* Decision Transformer (R* DT) to improve automated bidding by addressing two limitations of the conventional Decision Transformer (DT): a DT conditions on a return-to-go (RTG) value that must be set in advance, yet no such preset value is readily available, and its policy is limited by training data made up of mixed-quality trajectories.
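To make the RTG term concrete, here is a minimal sketch that assumes the usual undiscounted definition from the Decision Transformer literature: the RTG at step t is the sum of rewards from step t to the end of the episode. The function name `returns_to_go` and the reward values are illustrative assumptions, not taken from the work summarized here.

```python
from typing import List


def returns_to_go(rewards: List[float]) -> List[float]:
    """RTG at step t = sum of rewards from step t to the end of the episode."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step adds its own reward to the tail sum.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


# Hypothetical per-step rewards for one bidding episode.
episode_rewards = [1.0, 0.0, 2.0, 1.0]
print(returns_to_go(episode_rewards))  # [4.0, 3.0, 3.0, 1.0]
```

In logged data the RTG can be computed this way after the fact. At decision time the future rewards are unknown, which is why a DT needs some target RTG to be supplied.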
R* DT is built in three steps. First, R DT stores actions conditioned on the state and the RTG. Second, R^ DT predicts the highest RTG for a given state and conditions on it to derive a suboptimal policy. Third, R* DT uses R^ DT to generate trajectories that raise the quality of the training data, moving the policy toward the optimal one.
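The three steps can be illustrated with a toy, tabular sketch. This is not the authors' implementation: lookup tables stand in for the transformer models, the state is simply the time step, and the simulator reward (`reward_fn`), the binary action space, and the one-step perturbation used to generate candidate trajectories are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[int, int, float]   # (state, action, reward); state is the time step here
Trajectory = List[Step]
Policy = Callable[[int], int]


def with_rtg(traj: Trajectory) -> List[Tuple[int, int, float]]:
    """Replace each reward by its return-to-go (sum of rewards from that step on)."""
    out, running = [], 0.0
    for state, action, reward in reversed(traj):
        running += reward
        out.append((state, action, running))
    return out[::-1]


def fit(dataset: List[Trajectory]) -> Policy:
    # Step 1 (R DT analogue): memorize the action seen for each (state, RTG) pair.
    action_for: Dict[Tuple[int, float], int] = {}
    # Step 2 (R^ DT analogue): remember the highest RTG seen for each state.
    best_rtg: Dict[int, float] = {}
    for traj in dataset:
        for state, action, rtg in with_rtg(traj):
            action_for[(state, rtg)] = action
            best_rtg[state] = max(best_rtg.get(state, rtg), rtg)
    # Act by conditioning on the highest RTG known for the current state.
    return lambda state: action_for[(state, best_rtg[state])]


def reward_fn(state: int, action: int) -> float:
    """Hypothetical simulator reward: action 1 pays 1, action 0 pays nothing."""
    return float(action)


def rollout(actions: List[int]) -> Trajectory:
    return [(t, a, reward_fn(t, a)) for t, a in enumerate(actions)]


def total_return(traj: Trajectory) -> float:
    return sum(reward for _, _, reward in traj)


def augment(dataset: List[Trajectory], policy: Policy, horizon: int) -> List[Trajectory]:
    # Step 3 (R* DT analogue): generate trajectories around the current policy
    # and keep only those that beat the best return already in the dataset.
    base = [policy(t) for t in range(horizon)]
    best = max(total_return(traj) for traj in dataset)
    kept = []
    for t in range(horizon):
        candidate = base.copy()
        candidate[t] = 1 - candidate[t]   # assumed exploration: flip one binary action
        traj = rollout(candidate)
        if total_return(traj) > best:
            kept.append(traj)
    return dataset + kept


HORIZON = 3
# Two mixed-quality logged trajectories; neither is optimal.
dataset = [rollout([1, 0, 0]), rollout([0, 1, 1])]

policy = fit(dataset)
before = [policy(t) for t in range(HORIZON)]

policy = fit(augment(dataset, policy, HORIZON))
after = [policy(t) for t in range(HORIZON)]

print(before, after)  # [0, 1, 1] [1, 1, 1]
```

In this toy setting, conditioning on the highest RTG seen per state only reproduces the best logged trajectory (return 2), which is suboptimal. After one augmentation round adds a higher-return trajectory, refitting yields the optimal action sequence (return 3).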
Tests on a public bidding dataset demonstrate the effectiveness of R* DT in handling mixed-quality trajectories and improving the RTG of bidding actions.