# Implementation Patterns for KM-RAG in .NET

This guide outlines how to implement KM-RAG patterns in C# and .NET, building on existing infrastructure such as EF Core and `Microsoft.Extensions.AI`.

## 1. Defining Knowledge Units

Represent units as strongly-typed entities so that metadata and relationships are captured explicitly.

```csharp
using System.Collections.Generic;
using Pgvector; // Vector type from the Pgvector package

public enum KnowledgeUnitType { Section, Table, Definition, Step, Rule }

public class KnowledgeUnit
{
    public string Id { get; set; } // Stable Hash(Source, Content, Version)
    public string TenantId { get; set; } // Used to scope the retrieval queries below
    public string SourceId { get; set; }
    public string Version { get; set; }
    public KnowledgeUnitType Type { get; set; }
    public string Content { get; set; }
    public string MetadataJson { get; set; } // page, section_path, etc.
    public Vector? Embedding { get; set; }

    // Graph Relationships
    public List<KnowledgeUnitLink> OutgoingLinks { get; set; } = new();
}

public class KnowledgeUnitLink
{
    public string SourceUnitId { get; set; } // Unit the link starts from
    public string TargetUnitId { get; set; }
    public string RelationType { get; set; } // "Next", "Defines", "References"
}
```

## 2. Multi-Stage Retrieval

Replace a single `Take(Limit)` query with a multi-stage pipeline.

### Step A: Hybrid Candidate Generation

Combine `pgvector` cosine similarity with full-text search if available. The query below shows only the vector half.

```csharp
// Assumes _embeddingGenerator is an IEmbeddingGenerator<string, Embedding<float>>
var embedding = await _embeddingGenerator.GenerateAsync(queryText);
var queryVector = new Vector(embedding.Vector); // Convert to the pgvector type

var candidates = await _dbContext.KnowledgeUnits
    .Where(u => u.TenantId == tenantId)
    .OrderBy(u => u.Embedding!.CosineDistance(queryVector))
    .Take(20) // Get more candidates for reranking
    .Select(u => new { u.Id, u.Content, u.Type })
    .ToListAsync();
```

### Step B: Graph Expansion

Retrieve related units so the model sees each candidate in its full context.

```csharp
// Example: Get "Contextual Neighbors"
var candidateIds = candidates.Select(c => c.Id).ToList();

var expandedIds = await _dbContext.KnowledgeUnitLinks
    .Where(l => candidateIds.Contains(l.SourceUnitId) && l.RelationType == "ParentSection")
    .Select(l => l.TargetUnitId)
    .Distinct()
    .ToListAsync();

var contextUnits = await _dbContext.KnowledgeUnits
    .Where(u => expandedIds.Contains(u.Id))
    .ToListAsync();
```

## 3. Reranking and Citations

Use a model to score the relevance of the expanded context, and instruct the LLM to cite the units it draws on.

```csharp
// System Prompt for Grounded Generation
var systemPrompt = @"
You are a precision assistant. Answer ONLY using the provided Knowledge Units.
If the information is missing, state 'Information not found in knowledge map'.
Each answer segment MUST include a citation in format [UnitId].
";

// Response Structure (using System.Text.Json or Structured Outputs)
public class RagResponse
{
    public string Answer { get; set; }
    public List<string> Citations { get; set; } // Cited UnitId values
}
```

## 4. Ingestion Workflow

Instead of `string.Split`, use structural parsers:

1. **Parse**: Extract sections/tables (e.g., using `Unstructured` or custom logic).
2. **Normalize**: Assign stable IDs based on content hash + source metadata.
3. **Embed**: Generate vectors for the canonical text of each unit.
4. **Relate**: Build links (e.g., `prev` -> `curr` -> `next`).
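
The Normalize and Relate steps above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the helper names `StableId` and `BuildNextLinks`, the choice of SHA-256, the newline-separated hash payload, and the sample source name `manual.pdf` are all assumptions made for the example. Links are modelled as plain tuples so the snippet runs on its own without EF Core.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: derive a stable ID from source, version, and content.
// Assumption: SHA-256 over a newline-joined payload; any stable hash would do.
static string StableId(string sourceId, string version, string content)
{
    var payload = $"{sourceId}\n{version}\n{content.Trim()}";
    var hash = SHA256.HashData(Encoding.UTF8.GetBytes(payload));
    return Convert.ToHexString(hash).ToLowerInvariant();
}

// Hypothetical helper: link each unit to its successor in document order.
static List<(string SourceUnitId, string TargetUnitId, string RelationType)> BuildNextLinks(
    IReadOnlyList<string> orderedIds)
{
    var links = new List<(string SourceUnitId, string TargetUnitId, string RelationType)>();
    for (int i = 0; i + 1 < orderedIds.Count; i++)
    {
        links.Add((orderedIds[i], orderedIds[i + 1], "Next"));
    }
    return links;
}

// Sample units standing in for the output of a structural parser.
var sections = new[]
{
    "Install the SDK.",
    "Configure the connection string.",
    "Run migrations."
};

var ids = sections.Select(s => StableId("manual.pdf", "v1", s)).ToList();
var links = BuildNextLinks(ids);

Console.WriteLine($"{ids.Count} units, {links.Count} links"); // prints "3 units, 2 links"
```

Because the ID depends only on source, version, and trimmed content, re-ingesting an unchanged document yields the same IDs, so existing links and citations remain valid. A changed section gets a new ID, which makes stale references easy to detect.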