How to Improve RAG Responses with User Feedback - Generative Feedback Loop

 


 

A Generative Feedback Loop is a technique where we save LLM-generated responses back into the vector database. However, sometimes there is no relevant information in our knowledge base, and the LLM generates irrelevant or incorrect information, which can lead to completely wrong answers being stored.

To address this, we modified the technique: the generated answer is improved manually before being saved back into the vector database. This helps chatbots and RAG (Retrieval-Augmented Generation) systems stay up to date and provide accurate, relevant information.

  • Whenever the chatbot gives an irrelevant response, we edit that response and save the improved answer back into our vector database.
  • The next time a similar question is asked, the chatbot pulls the most recently added response from the vector database and gives a better answer.
  • This process not only improves how the chatbot responds, but also lets us directly add new question-and-answer pairs to our vector database.

The Problem We Faced

In one of our previous projects, our assistant was generating responses based on irrelevant information, providing incorrect details and numbers drawn from the LLM’s own knowledge. This information was not present in our vector database. As a result, we needed two things:

First, a way to directly add missing information into our vector database; and second, a method to correct the responses the chatbot generated when it provided false information.

How We Solved It

We implemented a way for users to improve the answer whenever the assistant generates incorrect information.


Users can edit the generated answer and submit the corrected version


We then take the improved answer and save it back into the vector database.
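Before upserting, the corrected answer needs to be paired with the question it answers, so that similar future questions retrieve the correction. A minimal sketch of this step (the helper name and Q/A text format here are illustrative, not the exact production code):

```typescript
// Hypothetical helper: pair the user's question with the improved answer
// so the correction is retrievable when a similar question is asked again.
const toCorrectionText = (question: string, improvedAnswer: string): string =>
  `Q: ${question.trim()}\nA: ${improvedAnswer.trim()}`
```

The resulting string is then chunked, embedded, and upserted like any other document.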


  • We ensured that only the most recent and relevant data was used by adding a Unix timestamp to each record’s metadata, which let us track the latest version of the information.
  • We used this process to make sure that the chatbot always had the most current and accurate answers.
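On the retrieval side, the timestamp lets us prefer the newest version when several stored answers match a query. A sketch with hypothetical types (the real Pinecone match shape differs slightly):

```typescript
// Hypothetical shape of a retrieved match carrying our timestamp metadata
interface Match {
  id: string
  score: number
  metadata: { text: string; timestamp: number }
}

// Among the retrieved matches, keep the most recently saved one
const pickLatest = (matches: Match[]): Match | undefined =>
  matches.reduce<Match | undefined>(
    (latest, m) =>
      !latest || m.metadata.timestamp > latest.metadata.timestamp ? m : latest,
    undefined
  )
```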

We have also developed a direct method for adding missing information. In this system, the user can directly add both the question and answer into the vector database.
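For this direct-add path, each question/answer pair becomes one vector record; embedding the question means that similar future questions retrieve the stored answer. A sketch (the record shape and id scheme here are illustrative assumptions, not the exact production code):

```typescript
// Hypothetical record shape for a user-supplied question/answer pair
interface QARecord {
  id: string
  values: number[] // embedding of the question
  metadata: { question: string; answer: string; timestamp: number }
}

// Build the record to upsert; the embedding is computed elsewhere
// (e.g. with OpenAIEmbeddings) and passed in
const buildQARecord = (
  question: string,
  answer: string,
  embedding: number[],
  timestamp: number = Math.floor(Date.now() / 1000)
): QARecord => ({
  id: `qa-${timestamp}-${question.length}`, // illustrative id scheme
  values: embedding,
  metadata: { question, answer, timestamp },
})
```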

How It Works

We chunk the text, create embeddings for each chunk, and save them into Pinecone.


const chunks = await chunkText(text) // Making chunks
const embeddings = await createEmbeddings(chunks) // creating Embeddings
await uploadToVectorDB(chunks, embeddings, index) // Saving into Vector Database

All Together


// Imports (paths may differ across langchain versions)
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter'
import { Document } from 'langchain/document'
import { OpenAIEmbeddings } from '@langchain/openai'
import { v4 as uuidv4 } from 'uuid'

// Making chunks
const chunkText = async (text: string): Promise<string[]> => {
  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 768,
    chunkOverlap: 76,
  })
  const chunks = await splitter.splitDocuments([new Document({ pageContent: text })])
  // Use a global regex: replace('\n', ' ') would only replace the first newline
  return chunks.map(chunk => chunk.pageContent.replace(/\n/g, ' '))
}

// creating Embeddings
const createEmbeddings = async (chunks: string[]): Promise<number[][]> => {
  const embeddings = new OpenAIEmbeddings()
  return await embeddings.embedDocuments(chunks)
}

// Saving into Vector Database
const uploadToVectorDB = async (
  chunks: string[],
  embeddings: number[][],
  index: any // Replace 'any' with your Pinecone index type
): Promise<void> => {
  const vectors = chunks.map((chunk, idx) => ({
    id: uuidv4(),
    values: embeddings[idx],
    metadata: { text: chunk }
  }))
  
  // Upload in batches of 100
  for (let i = 0; i < vectors.length; i += 100) {
    const batch = vectors.slice(i, i + 100)
    await index.upsert(batch)
  }
}

// Usage example:
async function saveToVectorDB(text: string, index: any) {
  try {
    const chunks = await chunkText(text)
    const embeddings = await createEmbeddings(chunks)
    await uploadToVectorDB(chunks, embeddings, index)
    console.log('Successfully uploaded to vector database')
  } catch (error) {
    console.error('Error processing text:', error)
    throw error
  }
}

By modifying the Generative Feedback Loop, we were able to improve the chatbot’s responses and add missing information directly to our vector database.

Overall, this approach gives us more control over how the chatbot learns and adapts, while also ensuring that our knowledge base remains accurate, up-to-date, and comprehensive.

