A rudimentary retrieval-and-generation chatbot.
What is RAG?
RAG (Retrieval-Augmented Generation) is an architecture that combines the strengths of retrieval-based and generative approaches to produce answers. It first retrieves relevant documents or data from a larger dataset based on the input query. It then uses this retrieved information to augment the prompt of a generative language model, such as GPT, so the model can produce more accurate, informed, and contextually relevant answers. This hybrid approach leverages both retrieval and generation, improving performance on tasks like question answering, content generation, and information synthesis.
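In code, that boils down to two steps: pick the most relevant piece of text, then hand it to a language model together with the question. Below is a minimal, illustrative sketch (not the script further down); the `generate` argument is a placeholder for any function that turns a prompt string into an answer, such as a wrapper around an LLM API call:

```python
def answer_with_rag(question, documents, generate):
    # Step 1: retrieval -- score each document by naive keyword overlap with
    # the question (real systems use embeddings or a search index instead).
    words = question.lower().split()
    best = max(documents, key=lambda d: sum(w in d.lower() for w in words))

    # Step 2: generation -- augment the prompt with the retrieved text and
    # let the language model write the final answer.
    prompt = f"{best}\n\nQuestion: {question}\nAnswer:"
    return generate(prompt)
```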
What the hell?
Think of RAG like asking for travel advice. First, you gather stories and tips (retrieval) from friends who've visited the places you're interested in. Then, you use all those stories and tips to plan your own perfect trip (generation). RAG does something similar: it first finds info related to your question, then mixes that info to craft a tailored answer.

```python
import openai
from sentence_transformers import SentenceTransformer, util
import numpy as np

# Loading your OpenAI API key from an environment variable or secure source
# is a better practice than hard-coding it in the script.
openai.api_key = 'API KEY HERE, KEEP THIS A SECRET'
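# Sketch (not part of the original script): prefer an environment variable if
# one is set, falling back to the literal key above.
import os
openai.api_key = os.getenv("OPENAI_API_KEY", openai.api_key)
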
def read_file_in_chunks(file_path, chunk_size=1000):
    """
    Generator to read a file in chunks of text.
    """
    with open(file_path, 'r', encoding='utf-8') as file:
        while True:
            chunk = file.read(chunk_size)
            if not chunk:
                break
            yield chunk

def find_most_relevant_chunk(question, chunks):
    """
    Find the most relevant chunk using sentence embeddings for semantic search.
    """
    model = SentenceTransformer('all-MiniLM-L6-v2')
    question_embedding = model.encode(question, convert_to_tensor=True)
    max_similarity = -np.inf
    relevant_chunk = None
    for chunk in chunks:
        chunk_embedding = model.encode(chunk, convert_to_tensor=True)
        similarity = util.pytorch_cos_sim(question_embedding, chunk_embedding)
        if similarity > max_similarity:
            max_similarity = similarity
            relevant_chunk = chunk
    return relevant_chunk

def ask_openai(question, context):
    """
    Ask a question to the OpenAI API with the provided context.
    Note: this uses the legacy Completions endpoint (openai<1.0 Python library).
    """
    try:
        response = openai.Completion.create(
            engine="davinci-002",  # Update this to the latest or most suitable engine
            prompt=f"{context}\n\nQuestion: {question}\nAnswer:",
            temperature=0.5,
            max_tokens=300,
            top_p=1.0,
            frequency_penalty=0.0,
            presence_penalty=0.0,
            stop=["\n"]
        )
        return response.choices[0].text.strip()
    except Exception as e:
        return str(e)

# Example usage
file_path = 'PATH TO DATA FILE GOES HERE' # Ensure this path is correct
chunks = list(read_file_in_chunks(file_path))
while True:
    # Prompt the user to enter a question
    question = input("Please enter your question (or type 'exit' to quit): ")
    if question.lower() == 'exit':
        break
    relevant_chunk = find_most_relevant_chunk(question, chunks)
    if relevant_chunk:
        answer = ask_openai(question, relevant_chunk)
        # Print the answer in green and reset the color after
        print(f"\033[92mAnswer:\n{answer}\033[0m")
    else:
        print("\033[92mCould not find a relevant section in the text for your question.\033[0m")
```
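
One thing worth noting about the script: `find_most_relevant_chunk` reloads the SentenceTransformer model and re-encodes every chunk on every question. A possible refinement, sketched below under the assumption that the `chunks` list built in the script above is already in memory, is to encode the chunks once and reuse those embeddings for each query:

```python
from sentence_transformers import SentenceTransformer, util

# Sketch (not part of the original script): load the model and encode all
# chunks a single time, reusing the `chunks` list built above.
model = SentenceTransformer('all-MiniLM-L6-v2')
chunk_embeddings = model.encode(chunks, convert_to_tensor=True)

def find_most_relevant_chunk_cached(question):
    # Only the question needs to be encoded per query.
    question_embedding = model.encode(question, convert_to_tensor=True)
    scores = util.pytorch_cos_sim(question_embedding, chunk_embeddings)[0]
    return chunks[int(scores.argmax())]
```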