chore: ii-atsuhiko hands-on

xhiroga · Nov 12, 2024 · d719677
1 parent 8c3cebb
Showing 5 changed files with 930 additions and 255 deletions.
diff --git a/computer-science/machine-learning/05-large-scale-pretrained-model.md b/computer-science/machine-learning/05-large-scale-pretrained-model.md
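@@ -76,6 +76,13 @@ In next-token prediction, training generally runs for only one epoch.
 3. Deduplication filtering (1 trillion characters → 350 billion characters)
 4. Hostname filtering (350 billion characters → 310 billion characters)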
 
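+Continued pretraining is also used to develop domain-specific LLMs. Candidate approaches to domain specialization include continued pretraining, SFT, and RAG.
+
+Some argue that RAG performs better when the model only needs to look up knowledge, whereas continued pretraining has the advantage when the model must apply that knowledge, for example in logical reasoning.[^pfn_2024]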
+[^pfn_2024]: <https://tech.preferred.jp/ja/blog/llm_knowledge_injection/>
+
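+One technique for domain-specific continued pretraining is to feed the model not the raw text itself but text converted into questions that test reading comprehension of it.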
+
 #### Vocabulary expansion
 
 <!-- TODO -->
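
Note on the added paragraph about converting raw text into reading-comprehension questions: the commit does not say how that conversion is done, so the following is only a minimal sketch under assumptions. The `QA` type, the `make_cloze` and `to_reading_comprehension` helpers, and the `Question:`/`Answer:` layout are all hypothetical names chosen for illustration, and the fill-in-the-blank question is just one simple way to derive a comprehension task from a passage.

```python
from dataclasses import dataclass


@dataclass
class QA:
    """One question/answer pair derived from a passage (hypothetical type)."""
    question: str
    answer: str


def make_cloze(sentence: str, term: str) -> QA:
    """Blank out the first occurrence of `term` to form a fill-in-the-blank question."""
    if term not in sentence:
        raise ValueError(f"{term!r} does not occur in the sentence")
    return QA(question=sentence.replace(term, "____", 1), answer=term)


def to_reading_comprehension(passage: str, qa_pairs: list[QA]) -> str:
    """Join a raw passage and its questions into a single training document."""
    parts = [passage.strip(), ""]
    for qa in qa_pairs:
        parts.append(f"Question: {qa.question}")
        parts.append(f"Answer: {qa.answer}")
        parts.append("")
    return "\n".join(parts).strip()


# Illustrative input; a real corpus would supply domain text instead.
passage = "Continued pretraining adapts a base model to a new domain."
example = to_reading_comprehension(
    passage,
    [make_cloze(passage, "Continued pretraining")],
)
print(example)
```

Each resulting document keeps the original passage and appends tasks that require using its content, so the model sees the domain text together with questions about it rather than the text alone.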

diff --git a/computer-science/machine-learning/_src/ii-atsuhiko-hands-on/.gitignore b/computer-science/machine-learning/_src/ii-atsuhiko-hands-on/.gitignore
@@ -1,2 +1,4 @@
-
-Hands-on
+Hands-on
+NEFTune_*
+output_neftune
+wandb