RAG - retrieval-augmented generation

Presentation - Empowering GenAI with RAG

RAG is an AI framework for retrieving facts from an external knowledge base to ground large language models (LLMs) on the most accurate, up-to-date information and to give users insight into LLMs' generative process.

RAG combines retrieval and generation to enhance the capabilities of LLMs.
In RAG, the model first retrieves relevant information from a knowledge base or other external sources.
The retrieved information is then used together with the model's internal knowledge to generate coherent, contextually relevant responses.
This lets LLMs produce higher-quality, more context-aware outputs than generation from model parameters alone.
Essentially, RAG lets LLMs leverage external knowledge for improved performance on a range of natural language processing tasks.
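The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not a production design: the knowledge base is a made-up list of strings, retrieval uses bag-of-words cosine similarity in place of real embeddings, and the names `retrieve` and `build_prompt` are hypothetical. The final LLM call is left out; the sketch stops at the grounded prompt that would be sent to the model.

```python
import math
import re
from collections import Counter

# Toy knowledge base (illustrative placeholder documents).
KNOWLEDGE_BASE = [
    "RAG retrieves facts from an external knowledge base.",
    "Fine-tuning updates the weights of a language model.",
    "Vector databases store embeddings for similarity search.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query, docs, k=2):
    """Retrieval step: return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]

def build_prompt(query, passages):
    """Augmentation step: put the retrieved passages in front of the question."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the context below and cite passage numbers.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

question = "What does RAG retrieve from a knowledge base?"
prompt = build_prompt(question, retrieve(question, KNOWLEDGE_BASE))
# The generation step would pass `prompt` to an LLM of your choice.
print(prompt)
```

Numbering the passages in the prompt is one simple way to let the model cite its sources, which is how RAG can give users insight into where an answer came from.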

Why is Retrieval-Augmented Generation important?

You can think of the LLM as an over-enthusiastic new employee who refuses to stay informed about current events but will always answer every question with absolute confidence.
Unfortunately, such an attitude can negatively impact user trust and is not something you want your chatbots to emulate!
RAG is one approach to solving some of these challenges. It redirects the LLM to retrieve relevant information from authoritative, pre-determined knowledge sources.
Organizations have greater control over the generated text output, and users gain insight into how the LLM generates the response.

Codes

Advanced

rag-architecture

RAG from Scratch

Stop Saying RAG Is Dead – Hamel’s Blog

Advanced RAG Techniques

Query Expansion (with multiple queries)
- GitHub - pdichone/advanced-rag-techniques
- Downsides
  - Produces lots of results
  - Generated queries might not always be relevant or useful
  - Results are not always relevant or useful
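Query expansion with multiple queries can be sketched as follows. This is a toy illustration under stated assumptions: in a real system the extra queries would be generated by an LLM, whereas here they are hard-coded; `search` is a hypothetical word-overlap retriever standing in for a vector search; and the vote-based merge with a `top_k` cap is just one possible way to keep the "lots of results" downside in check, not the method used in the linked repository.

```python
import re

# Toy corpus (illustrative placeholder documents).
DOCS = [
    "Solar panels convert sunlight into electricity.",
    "Wind turbines generate power from moving air.",
    "Photovoltaic cells are the building blocks of solar panels.",
    "Coal plants burn fossil fuel to produce electricity.",
]

def tokens(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def search(query, docs, k=2):
    """Return indices of up to k docs that share words with the query."""
    q = tokens(query)
    scored = [(len(q & tokens(d)), i) for i, d in enumerate(docs)]
    scored = [(s, i) for s, i in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]

def expanded_search(queries, docs, k_per_query=2, top_k=2):
    """Run every query, merge the hits, and keep the most-voted documents."""
    votes = {}
    for q in queries:
        for i in search(q, docs, k_per_query):
            votes[i] = votes.get(i, 0) + 1  # one vote per query that found it
    ranked = sorted(votes, key=lambda i: (-votes[i], i))
    return [docs[i] for i in ranked[:top_k]]

queries = [
    "how do solar panels work",            # original user query
    "photovoltaic cells electricity",      # hypothetical expansion
    "sunlight to electricity conversion",  # hypothetical expansion
]
results = expanded_search(queries, DOCS)
print(results)
```

In this example the third query also pulls in the coal document, which is the kind of irrelevant hit the downsides above describe; counting votes across queries and capping the merged list drops it again.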

Advanced RAG Techniques: Unlocking the Next Level | by Tarun Singh | Medium

RIG - Retrieval Interleaved Generation - DataGemma through RIG and RAG - by Bugra Akyildiz

Contextual Retrieval

Contextual Retrieval (introduced by Anthropic) addresses a common issue in traditional Retrieval-Augmented Generation (RAG) systems: individual text chunks often lack enough context for accurate retrieval and understanding.

Contextual Retrieval enhances each chunk by adding specific, explanatory context before embedding or indexing it. This preserves the relationship between the chunk and its broader document, significantly improving the system's ability to retrieve and use the most relevant information.
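A minimal sketch of the idea, under loud assumptions: the explanatory context would normally be written by an LLM that sees the whole document, while here the hypothetical `make_context` helper just names the source document; the two report chunks are invented; and word overlap stands in for embedding similarity. The point is only to show the context being prepended before indexing.

```python
import re
from dataclasses import dataclass

@dataclass
class Chunk:
    doc_title: str  # title of the document the chunk came from
    text: str       # raw chunk text

def make_context(chunk):
    """Hypothetical stand-in for an LLM-written, chunk-specific context."""
    return f"This chunk is from the document '{chunk.doc_title}'."

def contextualize(chunk):
    """Prepend the explanatory context to the chunk before indexing it."""
    return f"{make_context(chunk)} {chunk.text}"

def tokens(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(query, text):
    """Toy relevance score: number of words shared with the query."""
    return len(tokens(query) & tokens(text))

def best_match(query, texts):
    """Index of the highest-scoring text (first one wins on ties)."""
    scores = [score(query, t) for t in texts]
    return scores.index(max(scores))

# Invented example chunks that are nearly indistinguishable on their own.
chunks = [
    Chunk("Globex Inc Q2 2023 report", "Revenue fell 1% over the previous quarter."),
    Chunk("ACME Corp Q2 2023 report", "Revenue grew 3% over the previous quarter."),
]
query = "ACME revenue growth Q2 2023"

plain = [c.text for c in chunks]              # what a plain index would store
indexed = [contextualize(c) for c in chunks]  # what contextual retrieval stores

print([score(query, t) for t in plain])       # raw chunks tie on this query
print(indexed[best_match(query, indexed)])    # contextualized index finds ACME
```

Without the added context, neither chunk says which company or quarter it refers to, so the query cannot tell them apart; with it, the ACME chunk ranks first.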

Better Context for your RAG with Contextual Retrieval | MLExpert - Get Things Done with AI Bootcamp

GraphRAG

Graph RAG

Cache-Augmented Generation (CAG)

RAG vs. CAG: Solving Knowledge Gaps in AI Models - YouTube

GitHub - hhhuang/CAG: Cache-Augmented Generation: A Simple, Efficient Alternative to RAG

Tools

NoCode Tools

RAGFlow - RAGFlow is a RAG engine for deep document understanding! It lets you build enterprise-grade RAG workflows on complex docs with well-founded citations. Supports multimodal data understanding, web search, deep research, etc. 100% open-source with 59k+ stars!
- GitHub - infiniflow/ragflow: RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
xpander - xpander is a framework-agnostic backend for agents that manages memory, tools, multi-user states, events, guardrails, etc. While it is not a core no-code tool, you can build, test, and deploy Agents by primarily using the UI. Compatible with LlamaIndex, CrewAI, etc.
- https://github.com/xpander-ai/xpander.ai
Transformer Lab - Transformer Lab is an app to experiment with LLMs:
- Train, fine-tune, or chat.
- One-click LLM download (DeepSeek, Gemma, etc.)
- Drag-n-drop UI for RAG.
- Built-in logging, and more.
- 100% open-source and local!
- GitHub - transformerlab/transformerlab-app: Open Source Application for Advanced LLM + Diffusion Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.
Llama Factory - LLaMA-Factory lets you train and fine-tune open-source LLMs and VLMs without writing any code. Supports 100+ models, multimodal fine-tuning, PPO, DPO, experiment tracking, and much more! 100% open-source with 50k stars!
- GitHub - hiyouga/LLaMA-Factory: Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Langflow - Langflow is a drag-and-drop visual tool to build AI agents. It lets you build and deploy AI-powered agents and workflows. Supports all major LLMs, vector DBs, etc. 100% open-source with 82k+ stars!
- GitHub - langflow-ai/langflow: Langflow is a powerful tool for building and deploying AI-powered agents and workflows.
AutoAgent - AutoAgent is a zero-code framework that lets you build and deploy Agents using natural language. It comes with:
- Universal LLM support
- Native self-managing Vector DB
- Function-calling and ReAct interaction modes.
- 100% open-source with 5k stars!
- GitHub - HKUDS/AutoAgent: "AutoAgent: Fully-Automated and Zero-Code LLM Agent Framework"
GitHub - truefoundry/cognita: RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
