From 4812948cea066c2eb420264e327ed60fd4fbcc29 Mon Sep 17 00:00:00 2001
From: callummcdougall
Date: Fri, 28 Jul 2023 12:07:15 +0100
Subject: [PATCH] update planned changes page

---
 .../exercises/part3_resnets/solutions.py |  1 -
 general_information/Home.py              | 27 +++++++------------
 2 files changed, 9 insertions(+), 19 deletions(-)

diff --git a/chapter0_fundamentals/exercises/part3_resnets/solutions.py b/chapter0_fundamentals/exercises/part3_resnets/solutions.py
index a52d2f79..5f49f1eb 100644
--- a/chapter0_fundamentals/exercises/part3_resnets/solutions.py
+++ b/chapter0_fundamentals/exercises/part3_resnets/solutions.py
@@ -835,4 +835,3 @@ def val_dataloader(self):
 # %%
-
diff --git a/general_information/Home.py b/general_information/Home.py
index 174eb648..5bce38b8 100644
--- a/general_information/Home.py
+++ b/general_information/Home.py
@@ -36,7 +36,7 @@
 r"""
 If you have any other suggestions for material or features that you think ARENA could benefit from, please reach out via email `cal.s.mcdougall@gmail.com` or on the [Slack group](https://join.slack.com/t/arena-la82367/shared_invite/zt-1uvoagohe-JUv9xB7Vr143pdx1UBPrzQ).
-***Note** - some of the dates below have been moved backwards recently, because I've become quite busy with other fieldbuilding projects.*
+***Note** - some of the dates below have been moved backwards recently, because I've become quite busy with other fieldbuilding projects and working on a research paper. I've unfortunately had to remove most of the fixed date estimates. If there is anything here you're particularly interested in then please reach out, so we know what to prioritise.*
 """)
 st.markdown(
@@ -47,7 +47,7 @@
 This section details all planned changes, i.e. things we're confident will be implemented and which we do have a good idea of the timeline for. They're roughly ordered by priority / release date (i.e. the first ones in this list will be coming first).
-🗺️ Smoothing the path from PPO to RLHF
Planned release: July 2023
+🗺️ Smoothing the path from PPO to RLHF
Planned release: July-August 2023
 ---
@@ -57,7 +57,7 @@
 Some more information about each new section:
-#### **PPO (Atari / MuJoCo)** - 9th July 2023 (this has now been released)
+#### **PPO (Atari / MuJoCo)**
 This will extend the basic PPO material.
@@ -65,19 +65,19 @@
 MuJoCo is more challenging than Atari, because it introduces a **continuous action space**. This will require a redesign of several components of the PPO algorithm.
-#### **RLHF (Atari / MuJoCo)** - 23th July 2023
+#### **RLHF (Atari / MuJoCo)**
 Practice RLHF in a simple environment, before we get to language models. You'll be able to try RLHF on Atari games, or attempt to replicate results like OpenAI's famous "backflipping noodle" experiment.
-#### **PPO (transformers)** - 25th July 2023
+#### **PPO (transformers)**
 Now, we move on to transformer architecture. Before we introduce the human feedback element, we'll use the PPO algorithm to train the transformer to do a simple task (generating more periods, i.e. shorter sentences). This will involve some concepts that came up in the PPO Atari material (e.g. the policy and value networks as having shared architecture), as well as some new ideas (e.g. keeping the transformer on-distribution using a KL divergence penalty term).
-#### **RLHF (transformers)** - 28th July 2023
+#### **RLHF (transformers)**
 This section combines the conceptual ideas from the last 2 sections, by having you implement RLHF on transformers. You'll be doing all the coding from the ground up, i.e. with no assistance from libraries like `trlx`.
-#### **RLHF (trlx), re-release** - 1st August 2023
+#### **RLHF (trlx), re-release**
 Lastly, we plan to restructure the RLHF & `trlx` chapter, to make it easier to follow & flow better with the rest of the chapter. Learning how to use `trlx` is still valuable, because once you have a conceptual understanding of the RLHF process, there's no need to go through all the low-level implementational details.
@@ -90,7 +90,7 @@
 We'll start to release a monthly sequence on mechanistic interpretability, in the same vein as Steven Caspar's sequence (they will also be published on LessWrong). Each challenge will involve interpreting a transformer trained on a particular LeetCode problem. The balanced brackets transformer exercises are a good example (although I expect the average task in this sequence to be slightly easier).
-**The first one is planned for release on Saturday 15th July.**
+**The first one is currently live.**
@@ -125,16 +125,6 @@
-
-📽️ Video walkthroughs of solutions
Planned release: July - August 2023
-
----
-
-Over the next couple of months, video walkthroughs to most of the exercise sets are planned to be released. The mechanistic interpretability chapter will be highest priority; after that we'll work on videos for the RL chapter.
-
-
-
-
 ## Speculative
@@ -142,5 +132,6 @@
 * A chapter on diffusion models: GANs, VAEs, diffusion models, stable diffusion.
 * Evals (possibly as a subsection of a chapter, or as its own micro section).
+* Video walkthroughs of solutions to exercises (this depends on demand).
 """, unsafe_allow_html=True)