Commit

update planned changes page
callummcdougall committed Jul 28, 2023
1 parent b98c1f7 commit 4812948
Showing 2 changed files with 9 additions and 19 deletions.
1 change: 0 additions & 1 deletion chapter0_fundamentals/exercises/part3_resnets/solutions.py
@@ -835,4 +835,3 @@ def val_dataloader(self):


# %%
-
27 changes: 9 additions & 18 deletions general_information/Home.py
@@ -36,7 +36,7 @@
r"""
If you have any other suggestions for material or features that you think ARENA could benefit from, please reach out via email `[email protected]` or on the [Slack group](https://join.slack.com/t/arena-la82367/shared_invite/zt-1uvoagohe-JUv9xB7Vr143pdx1UBPrzQ).
-***Note** - some of the dates below have been moved backwards recently, because I've become quite busy with other fieldbuilding projects.*
+***Note** - some of the dates below have been moved backwards recently, because I've become quite busy with other fieldbuilding projects and working on a research paper. I've unfortunately had to remove most of the fixed date estimates. If there is anything here you're particularly interested in then please reach out, so we know what to prioritise.*
""")

st.markdown(
@@ -47,7 +47,7 @@
This section details all planned changes, i.e. things we're confident will be implemented and which we do have a good idea of the timeline for. They're roughly ordered by priority / release date (i.e. the first ones in this list will be coming first).
<details>
-<summary>🗺️ <b>Smoothing the path from PPO to RLHF</b><br>Planned release: <b>July 2023</b></summary>
+<summary>🗺️ <b>Smoothing the path from PPO to RLHF</b><br>Planned release: <b>July-August 2023</b></summary>
---
@@ -57,27 +57,27 @@
Some more information about each new section:
-#### **PPO (Atari / MuJoCo)** - 9th July 2023 (this has now been released)
+#### **PPO (Atari / MuJoCo)**
This will extend the basic PPO material.
The most important conceptual ideas Atari introduces relate to the design of the actor / critic architectures (because the observation space is now an image of pixels rather than four values). We'll use CNNs for our agent, and we'll have shared architecture for the actor and critic (an important idea which reappears when we look at transformers later on).
MuJoCo is more challenging than Atari, because it introduces a **continuous action space**. This will require a redesign of several components of the PPO algorithm.
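The shared actor/critic idea described above can be sketched as follows. This is a minimal illustration only, not the course's implementation: it assumes the common Atari preprocessing of 4 stacked 84×84 grayscale frames, and the layer sizes follow the widely used Nature-DQN convolutional trunk.

```python
import torch
import torch.nn as nn

class SharedActorCritic(nn.Module):
    """Sketch of an actor-critic with a shared CNN trunk for pixel
    observations. All names and layer sizes here are illustrative."""

    def __init__(self, num_actions: int):
        super().__init__()
        # Shared feature extractor: input is (batch, 4, 84, 84)
        self.trunk = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
        )
        # Separate heads on top of the shared features
        self.actor = nn.Linear(512, num_actions)  # policy logits
        self.critic = nn.Linear(512, 1)           # state-value estimate

    def forward(self, obs: torch.Tensor):
        hidden = self.trunk(obs)
        return self.actor(hidden), self.critic(hidden)
```

For a continuous action space (the MuJoCo case), the actor head would instead output the parameters of a distribution, e.g. a mean per action dimension alongside a learned log-standard-deviation.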
-#### **RLHF (Atari / MuJoCo)** - 23th July 2023
+#### **RLHF (Atari / MuJoCo)**
Practice RLHF in a simple environment, before we get to language models. You'll be able to try RLHF on Atari games, or attempt to replicate results like OpenAI's famous "backflipping noodle" experiment.
-#### **PPO (transformers)** - 25th July 2023
+#### **PPO (transformers)**
Now, we move on to transformer architecture. Before we introduce the human feedback element, we'll use the PPO algorithm to train the transformer to do a simple task (generating more periods, i.e. shorter sentences). This will involve some concepts that came up in the PPO Atari material (e.g. the policy and value networks having a shared architecture), as well as some new ideas (e.g. keeping the transformer on-distribution using a KL divergence penalty term).
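The KL divergence penalty mentioned above can be sketched in a few lines. This is an illustrative toy, not the course's code: the coefficient value and the mean-over-tokens reduction are arbitrary choices. The idea is to compare the trained policy's token distribution against a frozen reference model's, and add the divergence to the loss so the policy can't drift far off-distribution.

```python
import torch
import torch.nn.functional as F

def kl_penalty(logits: torch.Tensor, ref_logits: torch.Tensor,
               coef: float = 0.1) -> torch.Tensor:
    """Penalty term KL(policy || reference), averaged over positions.

    `logits` / `ref_logits`: (..., vocab_size) outputs of the trained
    policy and a frozen reference copy on the same tokens.
    `coef` is an illustrative penalty coefficient.
    """
    logp = F.log_softmax(logits, dim=-1)
    ref_logp = F.log_softmax(ref_logits, dim=-1)
    # Sum KL over the vocab dimension, then average over all positions
    kl = (logp.exp() * (logp - ref_logp)).sum(-1).mean()
    return coef * kl
```

The penalty is zero when the policy matches the reference exactly, and grows as the fine-tuned model's distribution drifts away from it.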
-#### **RLHF (transformers)** - 28th July 2023
+#### **RLHF (transformers)**
This section combines the conceptual ideas from the last 2 sections, by having you implement RLHF on transformers. You'll be doing all the coding from the ground up, i.e. with no assistance from libraries like `trlx`.
-#### **RLHF (trlx), re-release** - 1st August 2023
+#### **RLHF (trlx), re-release**
Lastly, we plan to restructure the RLHF & `trlx` chapter, to make it easier to follow & flow better with the rest of the chapter. Learning how to use `trlx` is still valuable, because once you have a conceptual understanding of the RLHF process, there's no need to go through all the low-level implementational details.
@@ -90,7 +90,7 @@
We'll start to release a monthly sequence on mechanistic interpretability, in the same vein as Steven Caspar's sequence (they will also be published on LessWrong). Each challenge will involve interpreting a transformer trained on a particular LeetCode problem. The balanced brackets transformer exercises are a good example (although I expect the average task in this sequence to be slightly easier).
-**The first one is planned for release on Saturday 15th July.**
+**The first one is currently live.**
</details>
@@ -125,22 +125,13 @@
</details>
-<details>
-<summary>📽️ <b>Video walkthroughs of solutions</b><br>Planned release: <b>July - August 2023</b></summary>
----
-Over the next couple of months, video walkthroughs to most of the exercise sets are planned to be released. The mechanistic interpretability chapter will be highest priority; after that we'll work on videos for the RL chapter.
-</details>
## Speculative
This section lists a few changes we're thinking about making, but which we aren't certain about yet.
* A chapter on diffusion models: GANs, VAEs, diffusion models, stable diffusion.
* Evals (possibly as a subsection of a chapter, or as its own micro section).
+* Video walkthroughs of solutions to exercises (this depends on demand).
""", unsafe_allow_html=True)
