(Graduate) Reinforcement Learning

Prof. Giseop Noh

RL Theory | Q-learning | PPO | RLHF | Imitation Learning

Lecture plan

Weekly materials

  1. Introduction to RL (PDF)
  2. What does RL Learn? (PDF)
  3. RL Taxonomy & MDP Basics (PDF)
  4. Markov Decision Process in RL (PDF)
  5. Dynamic Programming in RL (PDF)
  6. Monte Carlo Methods in RL (PDF)
  7. Temporal Difference (PDF)
  8. MAB, Exploration vs. Exploitation (PDF)
  9. Multi-Armed Bandit Practice (MD)
  10. Q-Learning (PDF)
  11. Deep Q-learning Network (DQN) (MD)
  12. Practice for Code Repair using RL & LLM (MD)
  13. Double DQN (PDF)
  14. Dueling DQN (PDF)
  15. Policy Gradient (PDF)