feat: implement KM-RAG methodology artifacts and core architectural standards with supporting query and service updates
@@ -0,0 +1,28 @@
# Core Concepts of KM-RAG (Knowledge-Map RAG)
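Knowledge-Map RAG (KM-RAG) replaces mechanical, fixed-size chunking with structured knowledge engineering: documents are split into typed Knowledge Units, and the relationships between those units are modeled explicitly in a graph.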
## 1. From Chunks to Knowledge Units (KU)
Instead of arbitrary character-count splits, content is partitioned into **Knowledge Units** that preserve structural meaning:
- **Unit Types**: `Section`, `Table`, `Definition`, `ProcedureStep`, `PolicyRule`.
- **Properties**: Stable ID, Version, Canonical Text, Rendered Context, Provenance (source, page, path).
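
As a minimal sketch, the unit types and properties listed above could be modeled as immutable records. The class and field names (`KnowledgeUnit`, `Provenance`, `citation`) and the sample values are assumptions made for illustration, as is the reading of "Canonical Text" as the normalized text used for indexing and "Rendered Context" as the text shown to the LLM.

```python
from dataclasses import dataclass
from enum import Enum


class UnitType(Enum):
    SECTION = "Section"
    TABLE = "Table"
    DEFINITION = "Definition"
    PROCEDURE_STEP = "ProcedureStep"
    POLICY_RULE = "PolicyRule"


@dataclass(frozen=True)
class Provenance:
    source: str  # assumed: file name or URI of the source document
    page: int    # page number within the source
    path: str    # assumed: structural path such as "Glossary > Knowledge Unit"


@dataclass(frozen=True)
class KnowledgeUnit:
    unit_id: str           # stable ID, intended to survive re-ingestion
    version: int
    unit_type: UnitType
    canonical_text: str    # assumed: normalized text used for indexing
    rendered_context: str  # assumed: text handed to the LLM
    provenance: Provenance

    def citation(self) -> str:
        """Format a short citation string from the provenance fields."""
        p = self.provenance
        return f"[{self.unit_id}@v{self.version}] {p.source}, p.{p.page}, {p.path}"


# Hypothetical sample unit; none of these values come from a real corpus.
unit = KnowledgeUnit(
    unit_id="ku-0001",
    version=1,
    unit_type=UnitType.DEFINITION,
    canonical_text="A Knowledge Unit is a typed, versioned piece of a document.",
    rendered_context="Glossary > Knowledge Unit: a typed, versioned piece of a document.",
    provenance=Provenance(source="handbook.pdf", page=3, path="Glossary > Knowledge Unit"),
)
print(unit.citation())  # [ku-0001@v1] handbook.pdf, p.3, Glossary > Knowledge Unit
```

Freezing the dataclasses is one way to make a given ID and version pair always refer to the same text; a changed unit would be stored as a new version rather than edited in place.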
## 2. The Knowledge Map (Graph)
Relationships between Knowledge Units are explicitly modeled to enhance retrieval and context assembly:
- `HAS_UNIT`: Document contains Unit.
- `NEXT` / `PREVIOUS`: Sequential flow between units.
- `DEFINES`: Unit defines a specific entity or term.
- `REFERENCES`: Unit refers to another unit.
- `EXCEPTION_OF`: Unit describes an exception to a rule in another unit.
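
The relationship types above can be illustrated with a small in-memory adjacency structure. This is only a sketch: the `KnowledgeMap` class, its methods, and the sample IDs are hypothetical, and a real deployment would more likely sit on a graph database.

```python
from collections import defaultdict
from enum import Enum


class EdgeType(Enum):
    HAS_UNIT = "HAS_UNIT"
    NEXT = "NEXT"
    PREVIOUS = "PREVIOUS"
    DEFINES = "DEFINES"
    REFERENCES = "REFERENCES"
    EXCEPTION_OF = "EXCEPTION_OF"


class KnowledgeMap:
    """Hypothetical typed, directed graph over document and unit IDs."""

    def __init__(self) -> None:
        # source id -> edge type -> list of target ids
        self._out = defaultdict(lambda: defaultdict(list))

    def add_edge(self, source: str, edge_type: EdgeType, target: str) -> None:
        self._out[source][edge_type].append(target)

    def link_sequence(self, unit_ids: list[str]) -> None:
        """Add NEXT / PREVIOUS edges between consecutive units."""
        for a, b in zip(unit_ids, unit_ids[1:]):
            self.add_edge(a, EdgeType.NEXT, b)
            self.add_edge(b, EdgeType.PREVIOUS, a)

    def neighbors(self, source: str, edge_type: EdgeType) -> list[str]:
        # .get() avoids creating empty entries for unknown ids
        return list(self._out.get(source, {}).get(edge_type, []))


km = KnowledgeMap()
for uid in ("ku-rule", "ku-exception"):
    km.add_edge("doc-1", EdgeType.HAS_UNIT, uid)
km.link_sequence(["ku-rule", "ku-exception"])
km.add_edge("ku-exception", EdgeType.EXCEPTION_OF, "ku-rule")

print(km.neighbors("ku-exception", EdgeType.EXCEPTION_OF))  # ['ku-rule']
```

Keeping the edge type as part of the lookup key lets a retrieval step ask targeted questions, such as "which rule is this an exception to", instead of walking every neighbor.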
## 3. Retrieval Strategy: "Plan over Similarity"
Retrieval is not a single `top-k` similarity lookup but a multi-stage process:
1. **Candidate Generation**: Hybrid search (Vector + Keyword) to find potential matches.
2. **Graph Expansion**: Pulling related units (e.g., "Get the section this table belongs to" or "Get the definition of term X used here").
3. **Reranking**: Using a Cross-Encoder to precisely score the expanded candidates.
4. **Context Assembly**: Building a grounded context with explicit citations.
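
The four stages can be sketched end to end in plain Python. Everything here is a toy stand-in chosen so the example runs without external services: the `UNITS` and `EDGES` data are invented, `vector_score` is a bag-of-words cosine in place of real embeddings, and `rerank` reuses keyword overlap where a Cross-Encoder would go.

```python
import math
from collections import Counter

# Hypothetical corpus: unit id -> canonical text
UNITS = {
    "ku-sec": "Section 2 Travel policy covers flights and hotels",
    "ku-tab": "Table 3 hotel rate limits per city",
    "ku-def": "Definition: rate limit means the maximum reimbursable nightly cost",
    "ku-misc": "Office parking rules",
}
# Hypothetical graph edges: unit id -> related unit ids
EDGES = {"ku-tab": ["ku-sec", "ku-def"]}


def tokens(text):
    return [t.strip(".,:?").lower() for t in text.split()]


def keyword_score(query, text):
    q, d = set(tokens(query)), set(tokens(text))
    return len(q & d) / len(q) if q else 0.0


def vector_score(query, text):
    # Stand-in for embedding similarity: cosine over bag-of-words counts.
    a, b = Counter(tokens(query)), Counter(tokens(text))
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def generate_candidates(query, k=1):
    """Stage 1: hybrid score (equal weights are an arbitrary choice)."""
    scored = {
        uid: 0.5 * keyword_score(query, text) + 0.5 * vector_score(query, text)
        for uid, text in UNITS.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:k]


def expand(candidates):
    """Stage 2: add units linked to each candidate in the graph."""
    expanded = list(candidates)
    for uid in candidates:
        for related in EDGES.get(uid, []):
            if related not in expanded:
                expanded.append(related)
    return expanded


def rerank(query, unit_ids):
    """Stage 3: stand-in scorer; a Cross-Encoder would score (query, text) pairs here."""
    return sorted(unit_ids, key=lambda uid: keyword_score(query, UNITS[uid]), reverse=True)


def assemble(unit_ids):
    """Stage 4: prefix every unit with its id so answers can cite it."""
    return "\n".join(f"[{uid}] {UNITS[uid]}" for uid in unit_ids)


def retrieve(query, k=1):
    return assemble(rerank(query, expand(generate_candidates(query, k))))


print(retrieve("hotel rate limits"))
```

With this toy data, the section unit shares no token with the query, so similarity search alone would never return it; it reaches the context only through graph expansion from the table.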
## 4. Governance and Provenance
- **Audit Trail**: Every answer must be traceable back to specific Knowledge Units with valid provenance.
- **Permission-Aware**: Retrieval filters must enforce ACLs at the unit/graph level before the LLM sees the data.
- **Continuous Evaluation**: Faithfulness (groundedness) and answer relevance must be monitored continuously, using evaluation tools such as RAGAS or TruLens.
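
The first two requirements can be sketched as a filter that runs before any prompt is built, followed by a context builder that also emits an audit record. The `Unit` fields, the group-based ACL model, and the sample data are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    unit_id: str
    text: str
    allowed_groups: frozenset  # hypothetical ACL: groups allowed to read this unit
    source: str                # provenance: source document
    page: int                  # provenance: page number


def filter_by_acl(units, user_groups):
    """Keep only units the user may read; call this before building the prompt."""
    groups = set(user_groups)
    return [u for u in units if u.allowed_groups & groups]


def build_context(units):
    """Return the prompt context and an audit record of every unit it contains."""
    context = "\n".join(f"[{u.unit_id}] {u.text}" for u in units)
    audit = [{"unit_id": u.unit_id, "source": u.source, "page": u.page} for u in units]
    return context, audit


# Hypothetical sample data
units = [
    Unit("ku-1", "Standard per-diem rates.", frozenset({"staff", "finance"}), "policy.pdf", 4),
    Unit("ku-2", "Executive compensation bands.", frozenset({"finance"}), "comp.pdf", 2),
]

visible = filter_by_acl(units, ["staff"])
context, audit = build_context(visible)
print(context)  # [ku-1] Standard per-diem rates.
print(audit)
```

Because the filter is applied to the unit list rather than to the generated answer, restricted text never enters the prompt, and the audit record lists exactly the units the model was shown.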