mjasin/Nexus.Reader

mjasin afdfc31d1a feat: implement KM-RAG methodology artifacts and core architectural standards with supporting query and service updates

2026-05-03 16:12:07 +02:00

Core Concepts of KM-RAG (Knowledge-Map RAG)

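Knowledge-Map RAG (KM-RAG) replaces "mechanical chunking" with "structured knowledge engineering": documents are split along their own structure, and the relationships between the resulting pieces are modeled explicitly.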

1. From Chunks to Knowledge Units (KU)

Instead of arbitrary fixed-length character splits, knowledge is partitioned into Knowledge Units that preserve structural meaning:

Unit Types: Section, Table, Definition, ProcedureStep, PolicyRule.
Properties: Stable ID, Version, Canonical Text, Rendered Context, Provenance (source, page, path).
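
The unit types and properties above could be modeled roughly as follows. This is a minimal sketch, not the project's actual schema: the class and field names (`KnowledgeUnit`, `Provenance`, `unit_id`, and so on) are hypothetical, and the reading of "Canonical Text" as the unit's own text versus "Rendered Context" as that text with its surrounding headings is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class UnitType(Enum):
    SECTION = "Section"
    TABLE = "Table"
    DEFINITION = "Definition"
    PROCEDURE_STEP = "ProcedureStep"
    POLICY_RULE = "PolicyRule"


@dataclass(frozen=True)
class Provenance:
    source: str  # file name or URI of the source document
    page: int    # page number within the source
    path: str    # structural path to the unit inside the document


@dataclass(frozen=True)
class KnowledgeUnit:
    unit_id: str           # stable ID, unchanged across re-ingestion
    version: int
    unit_type: UnitType
    canonical_text: str    # assumed: the unit's own normalized text
    rendered_context: str  # assumed: text plus surrounding headings
    provenance: Provenance


# Illustrative values only; none of these come from a real corpus.
unit = KnowledgeUnit(
    unit_id="returns-policy#rule-3",
    version=2,
    unit_type=UnitType.POLICY_RULE,
    canonical_text="Refunds are issued within 14 days.",
    rendered_context="Returns Policy > Refunds: Refunds are issued within 14 days.",
    provenance=Provenance(source="returns.pdf", page=4, path="Returns Policy/Refunds"),
)
```

Making the dataclasses frozen is a design choice in this sketch: a changed unit gets a new version rather than being mutated in place, which keeps citations to older versions valid.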

2. The Knowledge Map (Graph)

Relationships between Knowledge Units are explicitly modeled to enhance retrieval and context assembly:

HAS_UNIT: Document contains Unit.
NEXT / PREVIOUS: Sequential flow between units.
DEFINES: Unit defines a specific entity or term.
REFERENCES: Unit refers to another unit.
EXCEPTION_OF: Unit describes an exception to a rule in another unit.
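
The relationship types above can be sketched as a small typed adjacency structure. This is an illustrative in-memory version under stated assumptions: the `KnowledgeMap` class, its method names, and the unit IDs are hypothetical, and a real deployment would more likely sit on a graph database.

```python
from collections import defaultdict
from enum import Enum


class EdgeType(Enum):
    HAS_UNIT = "HAS_UNIT"
    NEXT = "NEXT"
    PREVIOUS = "PREVIOUS"
    DEFINES = "DEFINES"
    REFERENCES = "REFERENCES"
    EXCEPTION_OF = "EXCEPTION_OF"


class KnowledgeMap:
    def __init__(self):
        # source id -> edge type -> list of target ids
        self._out = defaultdict(lambda: defaultdict(list))

    def add_edge(self, source, edge_type, target):
        self._out[source][edge_type].append(target)

    def add_sequence(self, unit_ids):
        # Link consecutive units in both directions.
        for a, b in zip(unit_ids, unit_ids[1:]):
            self.add_edge(a, EdgeType.NEXT, b)
            self.add_edge(b, EdgeType.PREVIOUS, a)

    def neighbors(self, source, edge_type):
        # .get avoids creating empty entries for unknown ids.
        return list(self._out.get(source, {}).get(edge_type, []))


km = KnowledgeMap()
km.add_edge("doc:returns", EdgeType.HAS_UNIT, "rule-3")
km.add_edge("doc:returns", EdgeType.HAS_UNIT, "rule-4")
km.add_sequence(["rule-3", "rule-4"])
km.add_edge("rule-4", EdgeType.EXCEPTION_OF, "rule-3")

print(km.neighbors("rule-4", EdgeType.EXCEPTION_OF))
```

Keeping edges typed lets retrieval ask targeted questions ("which rule is this an exception of?") instead of walking every neighbor.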

3. Retrieval Strategy: "Plan over Similarity"

Retrieval is not just top-k similarity but a multi-stage process:

Candidate Generation: Hybrid search (Vector + Keyword) to find potential matches.
Graph Expansion: Pulling related units (e.g., "Get the section this table belongs to" or "Get the definition of term X used here").
Reranking: Using a Cross-Encoder to precisely score the expanded candidates.
Context Assembly: Building a grounded context with explicit citations.
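
The four stages above can be sketched end to end on toy data. Several things here are assumptions rather than the method described in this document: the two hit lists are merged with reciprocal rank fusion (one common way to combine vector and keyword results), graph expansion is limited to one hop, and `overlap_score` is a toy word-overlap stand-in where a real system would call a cross-encoder model. All function names and unit IDs are hypothetical.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: merge several ranked id lists into one."""
    scores = {}
    for ranking in rankings:
        for rank, unit_id in enumerate(ranking, start=1):
            scores[unit_id] = scores.get(unit_id, 0.0) + 1.0 / (k + rank)
    # Ties are broken by id so the output is deterministic.
    return sorted(scores, key=lambda u: (-scores[u], u))


def expand(candidates, edges):
    """Append one-hop neighbors of each candidate, keeping first-seen order."""
    seen = list(candidates)
    for unit_id in candidates:
        for neighbor in edges.get(unit_id, []):
            if neighbor not in seen:
                seen.append(neighbor)
    return seen


def overlap_score(query, text):
    # Toy stand-in for a cross-encoder: count shared lowercase words.
    return len(set(query.lower().split()) & set(text.lower().split()))


def rerank(query, unit_ids, texts, score_fn, top_n):
    ranked = sorted(unit_ids, key=lambda u: (-score_fn(query, texts[u]), u))
    return ranked[:top_n]


def assemble_context(unit_ids, texts):
    # Prefix each unit with its id so the answer can cite it.
    return "\n".join(f"[{u}] {texts[u]}" for u in unit_ids)


texts = {
    "sec-2": "Refund policy overview",
    "tbl-1": "Refund windows by product category",
    "def-1": "Refund means repayment of the purchase price",
    "step-5": "Ship the parcel to the warehouse",
}
vector_hits = ["tbl-1", "step-5"]       # pretend vector search output
keyword_hits = ["tbl-1", "sec-2"]       # pretend keyword search output
edges = {"tbl-1": ["sec-2", "def-1"]}   # table -> parent section, definition

query = "refund windows"
fused = rrf_fuse([vector_hits, keyword_hits])
expanded = expand(fused, edges)
top = rerank(query, expanded, texts, overlap_score, top_n=3)
context = assemble_context(top, texts)
print(context)
```

Note that `def-1` was found by neither search and only enters through graph expansion, which is the point of planning over the map rather than relying on similarity alone.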

4. Governance and Provenance

Audit Trail: Every answer must be traceable back to specific Knowledge Units with valid provenance.
Permission-Aware: Retrieval filters must enforce ACLs at the unit/graph level before the LLM sees the data.
Continuous Evaluation: Monitoring "Faithfulness" (groundedness) and "Answer Relevance" using tools like RAGAS or TruLens.
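
The permission-aware requirement can be illustrated with a small filter applied before any text is placed in the prompt. This is a sketch under assumptions: ACLs are modeled as sets of group names per unit, units without an ACL entry are denied by default, and the names `filter_by_acl`, `unit_acl`, and the group labels are hypothetical.

```python
def filter_by_acl(unit_ids, unit_acl, user_groups):
    """Keep only units whose ACL shares at least one group with the user.

    Units with no ACL entry are dropped (deny by default).
    """
    groups = set(user_groups)
    return [u for u in unit_ids if groups & unit_acl.get(u, set())]


unit_acl = {
    "rule-3": {"staff", "legal"},
    "rule-4": {"legal"},
}
candidates = ["rule-3", "rule-4", "rule-9"]  # rule-9 has no ACL entry

visible = filter_by_acl(candidates, unit_acl, ["staff"])
print(visible)  # ['rule-3']
```

One plausible placement is after graph expansion, so that units pulled in through relationships are checked as well as the direct search hits.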