This is (approximately) the code I used for a talk I gave at the San Antonio AITX Meetup in August 2024, "Build Your Own RAG from Common Household Objects."
Here the "common household objects" are
- Ollama
- Sentence Transformers
- a dataset (I used the page listing the Ig Nobel Prize winners)
An LLM can generate text like
```python
import ollama

result = ollama.generate(
    # This model is reasonably fast even on my crappy laptop
    'phi3:mini-128k',
    'What is the best deep learning framework?',
)
print(result['response'])
```
But it only knows whatever information it was trained on. If you want it to use other information (or only specific information), you could fine-tune it with that information, but that's a lot of work. Or you could just feed it that information as part of the prompt:
```python
best_framework = "joelnet"

result = ollama.generate(
    # This model is reasonably fast even on my crappy laptop
    'phi3:mini-128k',
    'What is the best deep learning framework? '
    f'(hint: the answer is {best_framework})',
)
print(result['response'])
```
The idea behind Retrieval-Augmented Generation ("RAG") is that we use some kind of information retrieval (here, embedding similarity) to find documents related to a query, shove them into a prompt, and then get the LLM to craft an answer.
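In code, that flow looks roughly like this. (This is a sketch, not the repo's exact code; `retrieve_documents` is a hypothetical helper standing in for the embedding-similarity retrieval described below.)

```python
import ollama

def answer(question, retrieve_documents):
    # retrieve_documents is a hypothetical stand-in for whatever retrieval
    # you use -- here, embedding similarity over your own documents
    docs = retrieve_documents(question)

    # Shove the retrieved documents into the prompt as context
    context = "\n\n".join(docs)
    prompt = (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

    result = ollama.generate('phi3:mini-128k', prompt)
    return result['response']
```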
Ollama allows you to run an LLM on your local machine. This is more convenient than using (say) the OpenAI API, especially if (say) you work somewhere where you are not allowed to use the OpenAI API.
If you have a powerful MacBook or a GPU, llama3.1 is a very good model. If you have a wimpier computer, phi3:mini-128k is ... good for its size?
Here we use sentence-transformers to create embeddings for our documents and query. The all-MiniLM-L6-v2 model is not ideal for this, but it's fast and good enough. Use your own!
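As a rough sketch (the document list here is made up, and the repo's actual retrieval code may differ), the embedding-based retrieval looks something like this:

```python
from sentence_transformers import SentenceTransformer
import numpy as np

# Small, fast embedding model -- fine for a demo
model = SentenceTransformer('all-MiniLM-L6-v2')

# In the real thing these would be chunks of the Ig Nobel Prize page
documents = [
    "A prize went to research on treating sleep apnea by playing the didgeridoo.",
    "Another prize went to research on something else entirely.",
]

# Embed every document once, normalized so a dot product is cosine similarity
doc_embeddings = model.encode(documents, normalize_embeddings=True)

def retrieve_documents(query, k=2):
    # Embed the query the same way and score it against every document
    query_embedding = model.encode(query, normalize_embeddings=True)
    scores = doc_embeddings @ query_embedding
    # Return the k highest-scoring documents
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]
```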
- Make sure you have Ollama running and have pulled whatever model you're going to use.
- Make sure you have a recent Python and create a virtualenv in your favorite way.
- `pip install` the requirements.
- Run `python main.py`.
And then you get a little interactive session:
```
Ask a question: what sleep apnea treatment won a prize

The group that won a prize for their contribution in treating sleep apnea was the team consisting of Milo A. Puhan, Alex Suarez, Christian Lo Cascio, Alfred Zahn, Markus Heitz, and Otto Braendli. They were recognized jointly with Fernanda Ito, Enrico Bernard, Rodrigo Aneros (Tyron Anero), for their research on the effectiveness of using a didgeridoo as treatment against obstructive sleep apnea syndrome. The prize they received was from France and UK's Medicine Prize [FRANCE, UK]. They delivered their acceptance speech via recorded video during the award ceremony that took place in December 2016 at Acta Chiropterologica journal publishing house
```
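The interactive part is just a loop; with the hypothetical helpers from the sketches above, it could be as simple as:

```python
while True:
    question = input("Ask a question: ")
    print(answer(question, retrieve_documents))
```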
Anyway, it is not the world's best RAG, but it's also only 50 lines of code, and the goal is mainly to show you how a RAG works. You could make it better if you wanted.