tags : Machine Learning, Deploying ML applications (applied ML), Embeddings, Open Source LLMs (Transformers)
Back to Basics for RAG w/ Jo Bergum - YouTube
Beyond the Basics of Retrieval for Augmenting Generation (w/ Ben Clavié) - YouTube
Systematically improving RAG applications - YouTube
What?
- It’s a work from MetaAI, “very specific way of training a generative model with a retriever together”
- All the popular definition of RAG(using Embeddings) is just one of many ways to do RAG.
Other ways to do RAG involve
- Multi-Hop retrieval
- Resolve some dependency before fetching the first info
- Late interaction
- You prepare dense vectors for the
data
andquery
- But it’s hard to create a single vector for a long document
- Late interaction says, instead of a single vector, we represent a document as a matrix. So we represent the document as a set of its tokens and not as the document itself.
- You prepare dense vectors for the
Concepts
Chunking
Types of RAG
Original RAG
very specific way of training a generative model with a retriever together
Embeddings search RAG
Common usecase
GraphRAG
Steps
-
Step 0: Chunking
This step is not graph rag specific but if we want to do graphRAG kind of search, we’d still want to do an initial semantic breakdown into more manageable chunks. This can be done based on based on whatever angle you want to do it, you can also do semantic chunking.
Example semantic chunking prompt:
template_en = """ { "instruction": "\nPlease understand the content of the text in the input field, recognize the structure and components of the text, and determine the segmentation points according to the semantic theme, dividing it into several non-overlapping sections. If the article has recognizable structural information such as chapters, please divide it according to the top-level structure.\nPlease return the results according to the schema definition, including summaries and starting points of the sections. The format must be a JSON string. Please follow the examples given in the example field.", "schema": { "Section Summary": "A brief summary of the section text", "Section Starting Point": "The starting point of the section in the original text, limited to about 20 characters. This segmentation point will be used to split the original text, so it must be found in the original text!" }, "input": "$input", "example": [ { "input": "Jay Chou (Jay Chou), born on January 18, 1979, in Xinbei City, Taiwan Province, originally from Yongchun County, Fujian Province, is a Mandopop male singer, musician, actor, director, screenwriter, and a graduate of Tamkang Senior High School.\nIn 2000, recommended by Yang Junrong, Jay Chou started singing his own compositions.", "output": [ { "Section Summary": "Personal Introduction", "Section Starting Point": "Jay Chou (Jay Chou), born on January 18" }, { "Section Summary": "Career Start", "Section Starting Point": "\nIn 2000, recommended by Yang Junrong" } ] }, { "input": "Hangzhou Flexible Employment Personnel Housing Provident Fund Management Measures (Trial)\nTo expand the benefits of the housing provident fund system and support flexible employment personnel to solve housing problems, according to the State Council's 'Housing Provident Fund Management Regulations', 'Zhejiang Province Housing Provident Fund Regulations' and the relevant provisions and requirements of the Ministry of Housing and Urban-Rural Development and the Zhejiang Provincial Department of Housing and Urban-Rural Development on flexible employment personnel participating in the housing provident fund system, combined with the actual situation in Hangzhou, this method is formulated.\n1. This method applies to the voluntary deposit, use, and management of the housing provident fund for flexible employment personnel within the administrative region of this city.\n2. The flexible employment personnel referred to in this method are those who are within the administrative region of this city, aged 16 and above, and males under 60 and females under 55, with full civil capacity, and employed in a flexible manner such as part-time, self-employed, or in new forms of employment.\n3. Flexible employment personnel applying to deposit the housing provident fund should apply to the Hangzhou Housing Provident Fund Management Center (hereinafter referred to as the Provident Fund Center) for deposit registration procedures and set up personal accounts.", "output": [ { "Section Summary": "Background and Basis for Formulating the Management Measures", "Section Starting Point": "To expand the benefits of the housing provident fund system" }, { "Section Summary": "Scope of Application of the Management Measures", "Section Starting Point": "1. This method applies to the voluntary deposit" }, { "Section Summary": "Definition of Flexible Employment Personnel", "Section Starting Point": "2. The flexible employment personnel referred to in this method" }, { "Section Summary": "Procedures for Flexible Employment Personnel to Register for Deposit", "Section Starting Point": "3. Flexible employment personnel applying to deposit the housing provident fund" } ] } ] } """
-
Step 1: Graph Creation / Info Extractions
- There are 2 paths to KG generation today and both are problematic in their own ways
-
- Natural Language Processing (NLP) (model that is trained on an ontology that works with your data.)
-
- LLM
-
- atm, Most GraphRAG implementations rely on
LLM entity extraction
to build the graph. (subject-predicate-object
triplets) is the entity we want to extract.
-
Example info extraction prompt(for graph generation)
template_en: str = """ { "instruction": "You are an expert in knowledge graph extraction. Based on the schema defined by constraints, extract all entities and their attributes from the input. For attributes not explicitly mentioned in the input, return NAN. Output the results in standard JSON format as a list.", "schema": $schema, "example": [ { "input": "Thyroid nodules refer to lumps within the thyroid gland that can move up and down with swallowing, and they are a common clinical condition that can be caused by various etiologies. Clinically, many thyroid diseases, such as thyroid degeneration, inflammation, autoimmune conditions, and neoplasms, can present as nodules. Thyroid nodules can occur singly or in multiple forms; multiple nodules have a higher incidence than single nodules, but single nodules have a higher likelihood of being thyroid cancer. Patients typically have the option to register for consultation in general surgery, thyroid surgery, endocrinology, or head and neck surgery. Some patients can feel the nodules in the front of their neck. In most cases, thyroid nodules are asymptomatic, and thyroid function is normal. The probability of thyroid nodules progressing to other thyroid diseases is only about 1%. Some individuals may experience neck pain, a foreign body sensation in the throat, or a feeling of pressure. When spontaneous intracystic bleeding occurs in a thyroid nodule, the pain can be more intense. Treatment options generally include radioactive iodine therapy, Lugol's solution (a compound iodine oral solution), or antithyroid medications to suppress thyroid hormone secretion. Currently, commonly used antithyroid drugs are thiourea compounds, including propylthiouracil (PTU) and methylthiouracil (MTU) from the thiouracil class, and methimazole and carbimazole from the imidazole class.", "schema": { "Disease": { "properties": { "complication": "Disease", "commonSymptom": "Symptom", "applicableMedicine": "Medicine", "department": "HospitalDepartment", "diseaseSite": "HumanBodyPart" } },"Medicine": { "properties": { } } } "output": [ { "entity": "Thyroid Nodule", "category": "Disease", "properties": { "complication": "Thyroid Cancer", "commonSymptom": ["Neck Pain", "Foreign Body Sensation in the Throat", "Feeling of Pressure"], "applicableMedicine": ["Lugol's Solution (Compound Iodine Oral Solution)", "Propylthiouracil (PTU)", "Methylthiouracil (MTU)", "Methimazole", "Carbimazole"],\n "department": ["General Surgery", "Thyroid Surgery", "Endocrinology", "Head and Neck Surgery"],\n "diseaseSite": "Thyroid"\n }\n },\n {\n "entity": "Lugol's Solution (Compound Iodine Oral Solution)", "category": "Medicine" }, { "entity": "Propylthiouracil (PTU)", "category": "Medicine" }, { "entity": "Methylthiouracil (MTU)", "category": "Medicine" }, { "entity": "Methimazole", "category": "Medicine" }, { "entity": "Carbimazole", "category": "Medicine" } ], "input": "$input" } """
- There are 2 paths to KG generation today and both are problematic in their own ways
-
TODO Step 1.5: Deciding the store
- In a vector store, “John manages Mary” and “Mary reports to John” might have similar embeddings, but the directional relationship isn’t explicitly captured. So we might need a graph store (eg. neo4j)
- Graph databases excel at multi-hop queries and relationship traversal
-
Step 2: Index Building
- See “Font Loading” in Open Source LLMs (Transformers)
- Sometime fontloading/index building is enough
-
Step 3: Search
Examples/Tools
- https://github.com/superlinear-ai/raglite (dependency free)
- https://github.com/HKUDS/LightRAG
- https://github.com/pingcap/autoflow
- https://github.com/infiniflow/ragflow
- https://github.com/OpenSPG/KAG
- https://microsoft.github.io/graphrag/
- https://github.com/getzep/graphiti 🌟
Links
Vector DB/Semantic Search
See Embeddings
- puffer something??
- https://github.com/wizenheimer/tinkerbird
- another framework: https://github.com/neuml/txtai
Multimodal RAG
Dump
- Introducing AutoRAG: fully managed Retrieval-Augmented Generation on Cloudflare
- smol RAG toolkit: https://github.com/superlinear-ai/raglite (dependency free)
- Handing multimodal data