

2025-04-14

RL for LLM

I'm writing this more for my own understanding than to teach anyone.

An LLM solves the following problem

RL for LLMs solves the following problem

Major doubt I have:

To do RL for an LLM, you have to train the following:

How to train these models:

RL for safety
