Open access
  • Su, Yi; Wang, Xiangyu; Le, Elaine Ya; Liu, Liang; Li, Yuening; Lu, Haokai; Lipshitz, Benjamin; Badam, Sriraj; Heldt, Lukasz; Bi, Shuchao; Chi, Ed; Goodrow, Cristos; Wu, Su-Lin; Baugher, Lexi; Chen, Minmin

    arXiv.org, 02/2024
    Paper, Journal Article

    Effective exploration is believed to improve the long-term user experience on recommendation platforms, but determining its exact benefits has been challenging. Regular A/B tests on exploration often measure neutral or even negative engagement metrics while failing to capture its long-term benefits. Here we introduce new experiment designs that formally quantify the long-term value of exploration by examining its effects on the content corpus and by connecting content corpus growth to the long-term user experience in real-world experiments. Having established the value of exploration, we investigate the Neural Linear Bandit algorithm as a general framework for introducing exploration into any deep-learning-based ranking system. We conduct live experiments on one of the largest short-form video recommendation platforms, which serves billions of users, to validate the new experiment designs, quantify the long-term value of exploration, and verify the effectiveness of the adopted Neural Linear Bandit algorithm for exploration.
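
    The abstract names the Neural Linear Bandit algorithm without describing it. As a rough illustration only, the sketch below follows the commonly described formulation: a deep network's last-layer embedding is treated as a fixed feature vector, Bayesian linear regression is fit on top of it, and Thompson sampling draws a weight vector from the posterior to choose which item to show. The class name, the parameters `prior_precision` and `noise_var`, and the use of Thompson sampling are assumptions made for this example; the paper's production variant may differ.

```python
import numpy as np


class NeuralLinearBandit:
    """Bayesian linear regression over fixed embeddings (hypothetical sketch)."""

    def __init__(self, dim, prior_precision=1.0, noise_var=1.0, seed=0):
        # Posterior precision matrix, initialised from the Gaussian prior.
        self.A = prior_precision * np.eye(dim)
        # Accumulated, noise-scaled sum of embedding * reward.
        self.b = np.zeros(dim)
        self.noise_var = noise_var
        self.rng = np.random.default_rng(seed)

    def posterior(self):
        """Return the posterior mean and covariance of the linear weights."""
        cov = np.linalg.inv(self.A)
        mean = cov @ self.b
        return mean, cov

    def select(self, embeddings):
        """Thompson sampling: sample weights, pick the highest-scoring item."""
        mean, cov = self.posterior()
        theta = self.rng.multivariate_normal(mean, cov)
        return int(np.argmax(embeddings @ theta))

    def update(self, embedding, reward):
        """Incorporate one observed (embedding, reward) pair."""
        self.A += np.outer(embedding, embedding) / self.noise_var
        self.b += embedding * reward / self.noise_var


if __name__ == "__main__":
    # Stand-in for last-layer embeddings of 5 candidate items (random here).
    rng = np.random.default_rng(1)
    candidates = rng.normal(size=(5, 8))
    bandit = NeuralLinearBandit(dim=8)
    chosen = bandit.select(candidates)
    # Hypothetical feedback signal for the chosen item.
    bandit.update(candidates[chosen], reward=1.0)
    print("chosen item:", chosen)
```

    Items whose embeddings have rarely been observed keep a wide posterior, so sampled scores for them vary more and they are occasionally ranked first; that is the source of exploration in this sketch.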