feat: Word count for Fine-tuning Topic
jermnelson committed Sep 7, 2024
1 parent 6a569f9 commit 4509580
Showing 3 changed files with 53 additions and 7 deletions.
2 changes: 1 addition & 1 deletion checklist.md
@@ -109,7 +109,7 @@
- [x] LLMs copyedit
- [x] References copied into resources
- [ ] Fine-tuning or Training LLMs
- [ ] 250 words
- [x] 250 words
- [ ] LLMs copyedit
- [ ] References copied into resources
- [ ] Generative AI Use Cases for FOLIO
29 changes: 26 additions & 3 deletions exploring-llms/training-llms.html
@@ -32,12 +32,35 @@ <h2>OpenAI Training</h2>
<ul>
<li>Create an Embedding</li>
<li>Select a model</li>
<li></li>
</ul>
<p><strong>NOTE:</strong> Fine-tuning a ChatGPT model is significantly more costly than simply using OpenAI's
API for model inference.</p>
<h2>Local Claude</h2>
<h2>Exercise</h2>
<h2>Fine-tuning LLMs with Llama.cpp</h2>
<p>The <a href="https://github.com/ggerganov/llama.cpp">LLaMA.cpp</a> project allows you to run and fine-tune
LLaMA models on your local computer. <a href="https://github.com/ggerganov/llama.cpp">LLaMA.cpp</a> provides
lower-level access to these open-source LLMs. There is
also a <a href="https://github.com/abetlen/llama-cpp-python">Python SDK</a> for integrating with
<a href="https://github.com/folio-labs/edge-ai">edge-ai</a>. <a href="https://github.com/ggerganov/llama.cpp">LLaMA.cpp</a> also provides an
OpenAI API compatible server that allows us to integrate with <a href="https://dspy-docs.vercel.app/">DSPy</a>.</p>
<p><a href="https://github.com/ggerganov/llama.cpp">LLaMA.cpp</a> can also be run with Docker<sup id="fnref:DOCKER"><a class="footnote-ref" href="#fn:DOCKER">1</a></sup> on your computer if you
don't want to or can't compile the C++ source code yourself.</p>
<h3>Downloading a LLaMA-based Model</h3>
<p><a href="https://github.com/ggerganov/llama.cpp">LLaMA.cpp</a> uses the <a href="https://github.com/ggerganov/ggml/blob/master/docs/gguf.md">GGUF</a>
format for model inference and training. Look for GGUF models on <a href="https://huggingface.co/">HuggingFace</a>,
and if you compiled <a href="https://github.com/ggerganov/llama.cpp">LLaMA.cpp</a> with <code>libcurl</code> support, you can use the <code>llama-cli</code> command-line
client to download one:</p>
<p><code>./llama-cli --hf-repo lmstudio-community/Reflection-Llama-3.1-70B-GGUF --hf-file Reflection-Llama-3.1-70B-Q4_K_M.gguf</code></p>
<p>(Substitute the <code>--hf-file</code> value with one of the actual <code>.gguf</code> quantization files listed in the repository.)</p>
<p>If <code>libcurl</code> hasn't been installed, you can usually download the models directly from <a href="https://huggingface.co/">HuggingFace</a> and
store them in the <code>models</code> directory under the main <a href="https://github.com/ggerganov/llama.cpp">LLaMA.cpp</a> checkout.</p>
<h3>Running the Model in Inference Mode</h3>
<div class="footnote">
<hr />
<ol>
<li id="fn:DOCKER">
<p><a href="https://github.com/ggerganov/llama.cpp/blob/master/docs/docker.md">LLaMA.cpp with Docker</a>&#160;<a class="footnote-backref" href="#fnref:DOCKER" title="Jump back to footnote 1 in the text">&#8617;</a></p>
</li>
</ol>
</div>
</article>
<div class="col-3">
<h4>Navigation</h4>
29 changes: 26 additions & 3 deletions exploring-llms/training-llms.md
@@ -11,16 +11,39 @@ To train or fine-tune a ChatGPT model, the OpenAI API provides the following endpoints:

- Create an Embedding
- Select a model
-

**NOTE:** Fine-tuning a ChatGPT model is significantly more costly than simply using OpenAI's
API for model inference.
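
One possible version of this flow, sketched with the OpenAI Python SDK (the training
file name and the base model are placeholder assumptions):

```python
# Minimal sketch of fine-tuning via the OpenAI Python SDK.
# "training-data.jsonl" and the base model name are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload a JSONL file of chat-formatted training examples
training_file = client.files.create(
    file=open("training-data.jsonl", "rb"),
    purpose="fine-tune",
)

# Start a fine-tuning job against the selected base model
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",
)
print(job.id, job.status)
```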

## Local Claude

## Fine-tuning LLMs with Llama.cpp
The [LLaMA.cpp][LLAMA.CCP] project allows you to run and fine-tune
LLaMA models on your local computer. [LLaMA.cpp][LLAMA.CCP] provides
lower-level access to these open-source LLMs. There is
also a [Python SDK](https://github.com/abetlen/llama-cpp-python) for integrating with
[edge-ai](https://github.com/folio-labs/edge-ai). [LLaMA.cpp][LLAMA.CCP] also provides an
OpenAI API compatible server that allows us to integrate with [DSPy](https://dspy-docs.vercel.app/), as sketched below.
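
Once the bundled `llama-server` binary is serving a model locally, any OpenAI-compatible
client can talk to it. A minimal sketch with the `openai` Python package (the port, API
key, and model name are assumptions):

```python
# Minimal sketch: querying a local llama.cpp server through its
# OpenAI-compatible API. Assumes `llama-server -m models/model.gguf`
# is already running on the default port 8080.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",  # llama.cpp does not validate the key
)

response = client.chat.completions.create(
    model="local-model",  # the local server does not enforce model names
    messages=[{"role": "user", "content": "What is FOLIO?"}],
)
print(response.choices[0].message.content)
```

Because the server speaks the OpenAI wire format, clients such as DSPy can be pointed at
it without code changes.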

## Exercise
[LLaMA.cpp][LLAMA.CCP] can also be run with Docker[^DOCKER] on your computer if you
don't want to or can't compile the C++ source code yourself.

### Downloading a LLaMA-based Model
[LLaMA.cpp][LLAMA.CCP] uses the [GGUF](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md)
format for model inference and training. Look for GGUF models on [HuggingFace][HUGFACE],
and if you compiled [LLaMA.cpp][LLAMA.CCP] with `libcurl` support, you can use the `llama-cli` command-line
client to download one:

`./llama-cli --hf-repo lmstudio-community/Reflection-Llama-3.1-70B-GGUF --hf-file Reflection-Llama-3.1-70B-Q4_K_M.gguf`

(Substitute the `--hf-file` value with one of the actual `.gguf` quantization files listed in the repository.)

If `libcurl` hasn't been installed, you can usually download the models directly from [HuggingFace][HUGFACE] and
store them in the `models` directory under the main [LLaMA.cpp][LLAMA.CCP] checkout.
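
If you prefer Python, a minimal sketch of the same download with the `huggingface_hub`
package (the quantization filename is an assumption; check the repository's file list):

```python
# Minimal sketch: fetching a GGUF file with huggingface_hub.
# The filename is an assumed quantization; check the repo's file list.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="lmstudio-community/Reflection-Llama-3.1-70B-GGUF",
    filename="Reflection-Llama-3.1-70B-Q4_K_M.gguf",
    local_dir="models",
)
print(f"Model saved to {path}")
```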

### Running the Model in Inference Mode
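
A minimal sketch of local inference over a downloaded GGUF model with the
[Python SDK](https://github.com/abetlen/llama-cpp-python) (the model path, context
size, and prompt are placeholder assumptions):

```python
# Minimal sketch: local inference over a GGUF model with llama-cpp-python.
# The model path and prompt are placeholders.
from llama_cpp import Llama

llm = Llama(model_path="models/your-model.gguf", n_ctx=2048)

output = llm(
    "Q: What is the FOLIO library services platform? A:",
    max_tokens=128,
    stop=["Q:"],  # stop before the model asks itself a new question
)
print(output["choices"][0]["text"])
```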

[HUGFACE]: https://huggingface.co/
[LLAMA]: https://ai.meta.com/
[LLAMA.CCP]: https://github.com/ggerganov/llama.cpp
[OPENAI]: https://openai.com/

[^DOCKER]: [LLaMA.cpp with Docker](https://github.com/ggerganov/llama.cpp/blob/master/docs/docker.md)
