Bootstrap your own latent: a new approach to self-supervised learning JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ... Advances in neural information processing systems 33, 21271-21284, 2020 | 6604 | 2020 |
Koray Kavukcuoglu, Remi Munos, and Michal Valko. Bootstrap your own latent: a new approach to self-supervised learning JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ... Advances in neural information processing systems 33, 21271-21284, 2020 | 479 | 2020 |
Data distributional properties drive emergent in-context learning in transformers S Chan, A Santoro, A Lampinen, J Wang, A Singh, P Richemond, ... Advances in Neural Information Processing Systems 35, 18878-18891, 2022 | 228 | 2022 |
BYOL works even without batch statistics PH Richemond, JB Grill, F Altché, C Tallec, F Strub, A Brock, S Smith, ... arXiv preprint arXiv:2010.10241, 2020 | 103 | 2020 |
Continuous diffusion for categorical data S Dieleman, L Sartran, A Roshannai, N Savinov, Y Ganin, PH Richemond, ... arXiv preprint arXiv:2211.15089, 2022 | 77 | 2022 |
Generalized preference optimization: A unified approach to offline alignment Y Tang, ZD Guo, Z Zheng, D Calandriello, R Munos, M Rowland, ... arXiv preprint arXiv:2402.05749, 2024 | 34 | 2024 |
Understanding self-predictive learning for reinforcement learning Y Tang, ZD Guo, PH Richemond, BA Pires, Y Chandak, R Munos, ... International Conference on Machine Learning, 33632-33656, 2023 | 28 | 2023 |
On Wasserstein reinforcement learning and the Fokker-Planck equation PH Richemond, B Maginnis arXiv preprint arXiv:1712.07185, 2017 | 24 | 2017 |
Categorical SDEs with simplex diffusion PH Richemond, S Dieleman, A Doucet arXiv preprint arXiv:2210.14784, 2022 | 20 | 2022 |
Zipfian environments for reinforcement learning SCY Chan, AK Lampinen, PH Richemond, F Hill Conference on Lifelong Learning Agents, 406-429, 2022 | 16 | 2022 |
SemPPL: Predicting pseudo-labels for better contrastive representations M Bošnjak, PH Richemond, N Tomasev, F Strub, JC Walker, F Hill, ... arXiv preprint arXiv:2301.05158, 2023 | 10 | 2023 |
Scaling Instructable Agents Across Many Simulated Worlds MA Raad, A Ahuja, C Barros, F Besse, A Bolt, A Bolton, B Brownfield, ... arXiv preprint arXiv:2404.10179, 2024 | 6 | 2024 |
The edge of orthogonality: A simple view of what makes BYOL tick PH Richemond, A Tam, Y Tang, F Strub, B Piot, F Hill International Conference on Machine Learning, 29063-29081, 2023 | 6 | 2023 |
Offline Regularised Reinforcement Learning for Large Language Models Alignment PH Richemond, Y Tang, D Guo, D Calandriello, MG Azar, R Rafailov, ... arXiv preprint arXiv:2405.19107, 2024 | 5 | 2024 |
Memory-efficient episodic control reinforcement learning with dynamic online k-means A Agostinelli, K Arulkumaran, M Sarrico, P Richemond, AA Bharath arXiv preprint arXiv:1911.09560, 2019 | 5 | 2019 |
Sample-efficient reinforcement learning with maximum entropy mellowmax episodic control M Sarrico, K Arulkumaran, A Agostinelli, P Richemond, AA Bharath arXiv preprint arXiv:1911.09615, 2019 | 5 | 2019 |
Human alignment of large language models through online preference optimisation D Calandriello, D Guo, R Munos, M Rowland, Y Tang, BA Pires, ... arXiv preprint arXiv:2403.08635, 2024 | 4 | 2024 |
Scaling instructable agents across many simulated worlds M Abi Raad, A Ahuja, C Barros, F Besse, A Bolt, A Bolton, B Brownfield, ... arXiv e-prints, arXiv: 2404.10179, 2024 | 4 | 2024 |
A short variational proof of equivalence between policy gradients and soft Q learning PH Richemond, B Maginnis arXiv preprint arXiv:1712.08650, 2017 | 4 | 2017 |
Biologically inspired architectures for sample-efficient deep reinforcement learning PH Richemond, A Kolbeinsson, Y Guo arXiv preprint arXiv:1911.11285, 2019 | 3 | 2019 |