Workshop «Bridging the Gap between Editors and LLMs – Reinforcement Learning, DPO, and beyond»
Topic and target audience:
This workshop covers theory and practical examples for post-training techniques, in particular reinforcement learning (RL). It is pitched at an intermediate level and is ideal for people who are just getting started with post-training generative models and have already trained transformer models in the past.
Generative AI is increasingly used in the news industry, where grounding content in reliable sources is essential. This workshop covers topics such as Reinforcement Learning from Human Feedback (RLHF), Direct Preference Optimisation (DPO), and Online DPO.
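As a taste of the material, the sketch below shows the core of the DPO objective: the policy is trained to widen the log-probability margin between a preferred and a rejected response, relative to a frozen reference model. The function and argument names are illustrative, and the summed per-token log-probabilities are assumed to be precomputed.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct Preference Optimisation loss over a batch of preference pairs.

    Each argument holds the summed log-probabilities of the chosen or
    rejected response under the trainable policy or the frozen reference.
    """
    # Implicit reward: log-ratio of policy to reference for each response
    chosen_logratios = policy_chosen_logps - ref_chosen_logps
    rejected_logratios = policy_rejected_logps - ref_rejected_logps
    # Push the chosen response's log-ratio above the rejected one's
    return -F.logsigmoid(beta * (chosen_logratios - rejected_logratios)).mean()
```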
Who’s leading the workshop:
Daniele Giofre: Daniele, a leader in AI with a background in theoretical physics, pioneered deep learning methodologies during his PhD at EPFL before the emergence of TensorFlow and PyTorch. After applying AI in corporate settings at the World Economic Forum, he now serves as Lead Scientist at Thomson Reuters Labs, specialising in LLM research for legal NLP challenges. His current interests span from optimised multi-node training on AI supercomputers to fine-tuning and reinforcement learning.
Timing:
1.5 hours.
Prerequisites and what to bring:
- All participants need to bring a laptop
- No special software is required on the laptop
- People should be familiar with Python and Jupyter Notebooks
- A Kaggle account is required for GPU usage
- Basic familiarity with the Transformer architecture is required