
What is Retrieval Augmented Generation (RAG)?
Retrieval Augmented Generation (RAG) (original paper, Lewis et al.) combines a retrieval model with a generative model for knowledge-intensive tasks. It improves generative AI applications by supplying up-to-date, domain-specific information from external data sources at response time, which reduces the risk of hallucinations and improves accuracy. Because no model has to be trained, a RAG system can be built in a cost- and data-efficient way, without model-training expertise, while keeping these advantages.
Quickstart
To build a RAG application, you first index your source documents with an embedding model of your choice. LlamaIndex provides libraries to load and transform documents. You then create a VectorStoreIndex over the resulting document objects, which computes their vector embeddings and stores them in a vector store. LlamaIndex supports numerous vector stores; see the LlamaIndex documentation for the complete list.
At query time, you retrieve the relevant information from the vector store, combine it with the original query, and pass the result to an LLM to produce the final output.
This quickstart shows how to incorporate a new article into a RAG application using the Neosantara API and LlamaIndex, so that a generative model can respond with the correct information. First, install the llama-index package with pip. See the installation documentation for other ways to install.
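The steps above can be sketched as follows. This is a minimal sketch rather than an official integration: it assumes the Neosantara API is OpenAI-compatible (which the `llama-index-llms-openai-like` and `llama-index-embeddings-openai-like` packages named under Troubleshooting suggest), and the environment variable names, the `data` directory, and the helper names `load_settings` and `build_query_engine` are placeholders chosen for illustration. Take the base URL and model identifiers from the API documentation.

```python
# Requires: pip install llama-index llama-index-llms-openai-like llama-index-embeddings-openai-like
import os

# Assumed (hypothetical) environment variable names; adjust to your setup.
REQUIRED = {
    "api_key": "NEOSANTARA_API_KEY",
    "api_base": "NEOSANTARA_API_BASE",        # OpenAI-compatible base URL
    "llm_model": "NEOSANTARA_LLM_MODEL",      # chat model identifier
    "embed_model": "NEOSANTARA_EMBED_MODEL",  # embedding model identifier
}


def load_settings(env=os.environ):
    """Collect connection settings, failing early if any variable is missing."""
    missing = [var for var in REQUIRED.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    cfg = {key: env[var] for key, var in REQUIRED.items()}
    cfg["data_dir"] = env.get("RAG_DATA_DIR", "data")  # folder holding the new article
    return cfg


def build_query_engine(cfg):
    """Index the documents in cfg["data_dir"] and return a query engine over them."""
    # Imported lazily so load_settings() works before the packages are installed.
    from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
    from llama_index.embeddings.openai_like import OpenAILikeEmbedding
    from llama_index.llms.openai_like import OpenAILike

    # Route both generation and embedding calls through the same endpoint.
    Settings.llm = OpenAILike(
        model=cfg["llm_model"],
        api_base=cfg["api_base"],
        api_key=cfg["api_key"],
        is_chat_model=True,
    )
    Settings.embed_model = OpenAILikeEmbedding(
        model_name=cfg["embed_model"],
        api_base=cfg["api_base"],
        api_key=cfg["api_key"],
    )

    # Load, embed, and index the documents, then expose retrieval plus generation.
    documents = SimpleDirectoryReader(cfg["data_dir"]).load_data()
    index = VectorStoreIndex.from_documents(documents)
    return index.as_query_engine()
```

With the variables set and the article saved under `data`, calling `build_query_engine(load_settings()).query("What does the new article say?")` returns a response grounded in that article. By default `VectorStoreIndex` keeps the embeddings in an in-memory store, which is enough for a quickstart; a persistent vector store can be swapped in later.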
Additional Configuration Options
You can customize the RAG system further by:
- Adjusting retrieval parameters
- Configuring LLM parameters
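Both kinds of adjustment can be sketched as below. `similarity_top_k` and `response_mode` are options of LlamaIndex's `as_query_engine`, and `temperature` and `max_tokens` are constructor options of `OpenAILike`. The helper names and the default values are illustrative assumptions, not recommended settings.

```python
def make_query_engine(index, top_k=5, response_mode="compact"):
    """Build a query engine that retrieves the top_k most similar chunks."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    # A larger top_k gives the LLM more context at the cost of longer prompts.
    return index.as_query_engine(
        similarity_top_k=top_k,
        response_mode=response_mode,
    )


def make_llm(model, api_base, api_key, temperature=0.1, max_tokens=512):
    """Create the LLM with explicit generation parameters."""
    # Imported lazily so this module loads even before the package is installed.
    from llama_index.llms.openai_like import OpenAILike

    return OpenAILike(
        model=model,
        api_base=api_base,
        api_key=api_key,
        is_chat_model=True,
        temperature=temperature,  # lower values make answers more deterministic
        max_tokens=max_tokens,    # upper bound on the length of each answer
    )
```

Assigning the result of `make_llm(...)` to `Settings.llm` before building the index applies the LLM parameters globally, while `make_query_engine(index, top_k=3)` changes retrieval for a single engine.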
Troubleshooting
Common issues and solutions:
- API Key Error: Ensure your Neosantara API key is correctly set in the environment variable
- Document Loading Error: Check that the document directory exists and contains readable files
- Import Error: Make sure all required packages are installed with
pip install llama-index llama-index-llms-openai-like llama-index-embeddings-openai-like
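The three checks above can be run in one pass before starting the application. The sketch below is a hypothetical helper, not part of LlamaIndex or the Neosantara API: the environment variable name `NEOSANTARA_API_KEY` and the `data` directory are assumptions, and the module names are the import paths that correspond to the packages in the pip command above.

```python
import importlib.util
import os
from pathlib import Path

# Import paths for the packages listed in the pip command.
REQUIRED_MODULES = (
    "llama_index.core",
    "llama_index.llms.openai_like",
    "llama_index.embeddings.openai_like",
)


def module_available(name):
    """Return True if the module can be located on the current Python path."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # find_spec raises when a parent package of a dotted name is missing.
        return False


def preflight(data_dir="data", key_var="NEOSANTARA_API_KEY",
              env=os.environ, modules=REQUIRED_MODULES):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if not env.get(key_var):
        problems.append(f"API key: environment variable {key_var} is not set")
    path = Path(data_dir)
    if not path.is_dir():
        problems.append(f"Documents: directory {path} does not exist")
    elif not any(p.is_file() for p in path.rglob("*")):
        problems.append(f"Documents: directory {path} contains no files")
    missing = [m for m in modules if not module_available(m)]
    if missing:
        problems.append("Imports: cannot find " + ", ".join(missing))
    return problems
```

Printing each entry of `preflight()` before building the index points to the failing item directly. The check only confirms that the key is present and that files exist, not that the key is valid or that the files are in a readable format.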