
Commit

deploy: 1dd7e4a
n1o committed Mar 28, 2024
1 parent b2c4714 commit 23505f6
Showing 1 changed file with 3 additions and 5 deletions.
8 changes: 3 additions & 5 deletions posts/t5-the-old-new-thing/index.html
@@ -33,15 +33,13 @@
<a class=heading-link href=#s-denosing><i class="fa fa-link" aria-hidden=true title="Link to heading"></i>
<span class=sr-only>Link to heading</span></a></h3><p>The S-Denoising objective aligns with the standard autoregressive (causal) language modeling approach. We corrupt the input text by removing all tokens that appear after a chosen cut-off point, and the model is then trained to reconstruct the original text from this corrupted version. This objective emphasizes the generation of fluent and coherent text.</p><h1 id=to-other-relating-papers>Relation to other papers
<a class=heading-link href=#to-other-relating-papers><i class="fa fa-link" aria-hidden=true title="Link to heading"></i>
-<span class=sr-only>Link to heading</span></a></h1><p>We have previously explored how UL2 introduced multiple pretraining objectives. To further enhance this approach, we can introduce additional task-specific pretraining objectives.</p><h2 id=codet5-codet5>CodeT5 CodeT5+
-<a class=heading-link href=#codet5-codet5><i class="fa fa-link" aria-hidden=true title="Link to heading"></i>
+<span class=sr-only>Link to heading</span></a></h1><p>We have previously explored how UL2 introduced multiple pretraining objectives. To further enhance this approach, we can introduce additional task-specific pretraining objectives.</p><h2 id=codet5-and-codet5>CodeT5 and CodeT5+
+<a class=heading-link href=#codet5-and-codet5><i class="fa fa-link" aria-hidden=true title="Link to heading"></i>
<span class=sr-only>Link to heading</span></a></h2><p>CodeT5 introduces two additional pretraining objectives: <strong>Identifier Tagging</strong> and <strong>Masked Identifier Prediction</strong>.</p><h3 id=identifier-tagging>Identifier Tagging
<a class=heading-link href=#identifier-tagging><i class="fa fa-link" aria-hidden=true title="Link to heading"></i>
<span class=sr-only>Link to heading</span></a></h3><p>The goal is to teach the model whether or not a code token is an identifier, which can be viewed as a form of syntax highlighting.</p><h3 id=masked-identifier-prediction>Masked Identifier Prediction
<a class=heading-link href=#masked-identifier-prediction><i class="fa fa-link" aria-hidden=true title="Link to heading"></i>
-<span class=sr-only>Link to heading</span></a></h3><p>Here we mask a random identifier and replace all of its occurrences with a sentinel token. This can be viewed as a form of code obfuscation: renaming an identifier has no impact on the code's behavior. In effect, this should teach the model to perform deobfuscation.</p><h2 id=codet5>CodeT5
-<a class=heading-link href=#codet5><i class="fa fa-link" aria-hidden=true title="Link to heading"></i>
-<span class=sr-only>Link to heading</span></a></h2><p>CodeT5+ builds upon the foundation of CodeT5. It adds instruction tuning and uses a shallow encoder together with a deep decoder.</p><p>In the context of pretraining objectives, we distinguish between unimodal and bimodal pretraining. Bimodal pretraining incorporates both text and code, a strategy also employed in CodeT5. The goal is to equip the model with the ability to generate code from text (and vice versa). However, in the case of CodeT5, this approach led to reduced performance on code-to-code tasks, such as translating code from one programming language to another or detecting code defects.</p><p>For CodeT5+, the bimodal pretraining objectives include the following:</p><h3 id=text-code-contrastive-learning>Text-Code Contrastive Learning
+<span class=sr-only>Link to heading</span></a></h3><p>Here we mask a random identifier and replace all of its occurrences with a sentinel token. This can be viewed as a form of code obfuscation: renaming an identifier has no impact on the code's behavior. In effect, this should teach the model to perform deobfuscation.</p><p><strong>CodeT5+</strong></p><p>CodeT5+ builds upon the foundation of CodeT5. It adds instruction tuning and uses a shallow encoder together with a deep decoder.</p><p>In the context of pretraining objectives, we distinguish between unimodal and bimodal pretraining. Bimodal pretraining incorporates both text and code, a strategy also employed in CodeT5. The goal is to equip the model with the ability to generate code from text (and vice versa). However, in the case of CodeT5, this approach led to reduced performance on code-to-code tasks, such as translating code from one programming language to another or detecting code defects.</p><p>For CodeT5+, the bimodal pretraining objectives include the following:</p><h3 id=text-code-contrastive-learning>Text-Code Contrastive Learning
<a class=heading-link href=#text-code-contrastive-learning><i class="fa fa-link" aria-hidden=true title="Link to heading"></i>
<span class=sr-only>Link to heading</span></a></h3><p>In this objective, we work with positive and negative pairs of code and text. For positive pairs, the code and text representations should lie close together in the representation space, while negative pairs should be pushed apart. This task activates only the encoder, which maps the text and code snippets into a continuous representation space.</p><h3 id=text-code-matching>Text-Code Matching
<a class=heading-link href=#text-code-matching><i class="fa fa-link" aria-hidden=true title="Link to heading"></i>
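
The S-denoising objective described above is easy to sketch in code. Below is a minimal, illustrative corruption routine, assuming whitespace tokenization and a single T5-style sentinel token; the function name and defaults are invented for illustration and are not UL2's actual implementation.

import random

def s_denoise(tokens, min_prefix=1, sentinel="<extra_id_0>"):
    # Keep a random prefix as the corrupted input and ask the model to
    # reconstruct everything that came after the cut point.
    cut = random.randint(min_prefix, len(tokens) - 1)
    corrupted_input = tokens[:cut] + [sentinel]   # prefix plus a marker for the removed span
    target = [sentinel] + tokens[cut:]            # the model must generate the suffix
    return corrupted_input, target

tokens = "the model is trained to reconstruct the original text".split()
print(s_denoise(tokens))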
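
For identifier tagging, the training signal is a binary label per code token. The sketch below uses a toy regex tokenizer and Python's keyword module to assign the labels; CodeT5 itself derives identifier information from the parsed code, so treat this purely as an illustration.

import keyword
import re

def tag_identifiers(code):
    # Toy tokenizer: identifiers/keywords, numbers, and single punctuation characters.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z0-9_]", code)
    labeled = []
    for tok in tokens:
        is_identifier = tok.isidentifier() and not keyword.iskeyword(tok)
        labeled.append((tok, int(is_identifier)))  # 1 = identifier, 0 = everything else
    return labeled

for tok, label in tag_identifiers("def add(a, b): return a + b"):
    print(tok, label)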

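Masked identifier prediction can be sketched in the same spirit: pick one identifier, replace every occurrence of it with a sentinel token, and train the model to recover the original name. The regex-based identifier extraction below is a rough stand-in for what a real parser would provide.

import keyword
import random
import re

def mask_identifier(code, sentinel="<extra_id_0>"):
    # Candidate identifiers (toy approximation; a parser would be more precise).
    identifiers = sorted({t for t in re.findall(r"[A-Za-z_]\w*", code)
                          if not keyword.iskeyword(t)})
    name = random.choice(identifiers)
    # Obfuscate: every occurrence of the chosen name becomes the sentinel token.
    masked = re.sub(rf"\b{re.escape(name)}\b", sentinel, code)
    target = f"{sentinel} {name}"   # the model must predict the original name
    return masked, target

print(mask_identifier("def total(prices): return sum(prices)"))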

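Finally, text-code contrastive learning can be expressed as an InfoNCE-style loss over a batch of paired text and code embeddings. The snippet below assumes PyTorch and already-computed encoder outputs; it illustrates the idea and is not the CodeT5+ training code.

import torch
import torch.nn.functional as F

def text_code_contrastive_loss(text_emb, code_emb, temperature=0.07):
    # text_emb[i] and code_emb[i] form a positive pair; all other pairings in
    # the batch act as in-batch negatives.
    text_emb = F.normalize(text_emb, dim=-1)
    code_emb = F.normalize(code_emb, dim=-1)
    logits = text_emb @ code_emb.T / temperature   # scaled cosine similarities
    labels = torch.arange(text_emb.size(0))        # diagonal entries are the positives
    # Symmetric loss: text-to-code and code-to-text retrieval directions.
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels)) / 2

text_emb = torch.randn(8, 256)   # stand-ins for encoder text representations
code_emb = torch.randn(8, 256)   # stand-ins for encoder code representations
print(text_code_contrastive_loss(text_emb, code_emb).item())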