feat(search/rag): implement NexusSearchBox, dynamic Qdrant collection auto-provisioning, batch vector ingestion, mobile Serilog logging, and resolve 401 auth handler error (#51)
Resolves #52 This Pull Request introduces the **NexusSearchBox** search feature with premium unified styling, implements a robust **dynamic Qdrant collection auto-provisioning and batch-vector ingestion pipeline**, integrates a unified **Serilog logging infrastructure** for the Blazor Hybrid environment (MAUI), and resolves the **401 Unauthorized API header propagation error** inside mobile builds. ### 🚀 Key Implementations #### 1. Premium `NexusSearchBox` & Semantic Search UI * **NexusSearchBox Component:** Created an elegant search-as-you-type search box with smooth key navigation, quick-clearing, and seamless dynamic styling. * **Unified Aesthetics:** Refactored the search box isolated styling to align perfectly with the dashboard's design system using glassmorphism, `--nexus-neon` token gradients, and smooth pulse/fade animations. * **Semantic Search Integration:** Integrated semantic search query dispatching (`SearchLibrarySemanticallyQuery`) and wired up navigation seamlessly through the updated `ReaderNavigationService`. * **Tests Hardening:** Added/adapted query assertions in `QueryTests.cs` to guarantee safe parameterization and error boundary mapping. #### 2. Qdrant Collection Provisioning & Vector Ingestion * **Dynamic Auto-Provisioning:** Implemented dynamic checking and lazy-creation of the `knowledge_units` collection using 768 dimensions and Cosine distance. * **High-Performance Ingestion:** Optimized `ProcessKnowledgeUnitsAsync` with high-performance batch embedding generation using `_embeddingGenerator` and deterministic MD5 GUIDs for stable, duplicate-free upsertion. * **Database Cache Clear Sync:** Integrated Qdrant collection deletion in `ClearCacheAsync` to ensure absolute consistency between the PostgreSQL database cache and vector database indices. #### 3. Cross-Platform MAUI Logging (Serilog Infrastructure) * **Serilog Integration:** Configured cross-platform Serilog routing in `SerilogConfiguration.cs`, streaming diagnostic logs safely across native platforms and the Blazor Webview container. * **Interop Bridge:** Built `BlazorLoggingBridge.cs` to capture web console messages and pipe them directly to the native host logger. * **Demo Interface:** Added an interactive `SerilogDemo.razor` sandbox under Pages. #### 4. Resolving 401 Load Errors (Authentication Handler Flow) * **Authentication Header Handler:** Implemented the `MobileAuthenticationHeaderHandler` to correctly extract, validate, and inject bearer JWT tokens into outbound API requests. * **Configuration-based API Host:** Structured standard API URI routing to use clean configuration bindings in `appsettings.json`. --- ### 🧪 Verification & Build Status * Run `dotnet build` from the solution root: Successfully compiled the full multi-targeted solution (`Liczba błędów: 0`). * All unit and integration tests successfully executed and verified (`dotnet test`). --------- Co-authored-by: Marek Jasiński <jasins.marek@gmail.com> Co-authored-by: Marek Jaisński <jasins.marek@gmail.com> Reviewed-on: #51 Co-authored-by: Antigravity <antigravity@google.com> Co-committed-by: Antigravity <antigravity@google.com>
This commit was merged in pull request #51.
This commit is contained in:
@@ -4,9 +4,10 @@ public static class PromptRegistry
|
||||
{
|
||||
public const string KnowledgeExtractionSystemPrompt =
|
||||
"You are an expert educator. Analyze the provided text to extract key concepts, generate relevant quizzes, and construct a knowledge graph. " +
|
||||
"CRITICAL: Restrict 'concept.label' to a maximum of 3 words (e.g., 'Dependency Injection' instead of full sentences). " +
|
||||
"**LANGUAGE CRITICAL**: Detect the language of the provided text. You MUST generate all human-readable fields ('title', 'description', 'question', 'options', 'label') in the EXACT SAME LANGUAGE as the source text. Do NOT translate them to English unless the source text is in English. " +
|
||||
"CRITICAL: Restrict 'concept.label' to a maximum of 3 words (e.g., 'Dependency Injection' or its exact foreign equivalent, never full sentences). " +
|
||||
"CRITICAL: Extract a MAXIMUM of 15 key concepts/plot points from the text. " +
|
||||
"CRITICAL: Code blocks (e.g., markdown code snippets) must be excluded from the relationship graph, or summarized as a single node (e.g., 'Code Example'). Do NOT create nodes for variables, functions, namespaces, or individual lines of code. " +
|
||||
"CRITICAL: Code blocks (e.g., markdown code snippets) must be excluded from the relationship graph, or summarized as a single node with the label 'Code Example' translated to the detected language. Do NOT create nodes for variables, functions, namespaces, or individual lines of code. " +
|
||||
"CRITICAL: Return ONLY a minified JSON object. Do NOT include markdown formatting like ```json or ```. Do NOT include explanations. " +
|
||||
"Schema: { " +
|
||||
"\"concepts\": [ { \"title\": \"string\", \"description\": \"string\" } ], " +
|
||||
@@ -15,28 +16,66 @@ public static class PromptRegistry
|
||||
"}.";
|
||||
|
||||
public const string GraphExtractionPrompt =
|
||||
"You are an expert at information architecture. Extract key concepts and paragraph mappings from the text to build a unified knowledge graph. " +
|
||||
"The input text consists of several paragraphs, each starting with its unique block ID in the format '[ID: seg-X]'. " +
|
||||
"Extract two types of nodes: " +
|
||||
"1. Concept Nodes (group: 'concept'): Extract the main technical concepts discussed (e.g., ID: 'dependency-injection', label: 'Dependency Injection'). Max 10 concepts. Labels must be at most 3 words. " +
|
||||
"2. Block Nodes (group: 'current'): For each paragraph in the input, create a node representing that paragraph where 'id' is the exact block ID (e.g., 'seg-1'), and 'label' is a brief summary of that paragraph's content (max 3 words). " +
|
||||
"CRITICAL: If a paragraph is a code block, represent it as a single block node with label 'Code Example' (group: 'current'). Do NOT extract low-level code elements (like variables, classes, methods, or namespaces) as separate concept nodes. " +
|
||||
"CRITICAL: Connect related concept nodes together, and connect each concept node to the block nodes ('seg-X') where it is discussed. " +
|
||||
"Limit connections to a MAXIMUM of 15 most relevant links. " +
|
||||
"Return ONLY minified JSON. Schema: { \"graph\": { \"nodes\": [ { \"id\": \"string\", \"label\": \"string\", \"group\": \"concept|current\" } ], \"links\": [ { \"source\": \"string\", \"target\": \"string\", \"value\": 1 } ] } }";
|
||||
|
||||
"You are a strict Minimalist Information Architect. Your sole job is to build a high-level, sparse linear backbone for a textbook chapter. " +
|
||||
"**LANGUAGE CRITICAL**: Detect the language of the provided text. The 'label', 'summary', and 'key_terms' fields MUST be in the EXACT SAME LANGUAGE as the source text. " +
|
||||
"The input text consists of sections starting with block IDs (e.g., '[ID: seg-4]'). " +
|
||||
"CRITICAL TOPOLOGY RULES (ZERO TOLERANCE FOR CLUTTER): " +
|
||||
"1. HARD NODE LIMIT: You are strictly forbidden from extracting more than 4 to 5 nodes IN TOTAL for the entire text. If there are more sections, select ONLY the 4-5 absolute most critical, high-level structural pillars. " +
|
||||
"2. NO CONCEPT CLOUDS: Do NOT create nodes for individual technologies, files, terms, or phrases (e.g., 'Kestrel', 'appsettings.json', 'DI', 'Blazor Server' must NEVER be nodes). They must ONLY exist as text strings inside the 'key_terms' array of a major node. " +
|
||||
"3. LINEAR SPINE PATTERN: Nodes must form a clear, clean path or simple tree representing the chronological reading journey (e.g., Node 1 -> Node 2 -> Node 3). Do NOT create complex web loops or interconnect every node. Limit total links in the entire JSON to maximum 4 or 5 links. " +
|
||||
"4. NODE DATA STRUCTURE: " +
|
||||
" - 'id': must be the exact block ID (e.g., 'seg-16'). " +
|
||||
" - 'label': clear technical title (Max 3 words, e.g., 'Blazor Hosting Models'). " +
|
||||
" - 'group': strictly either 'bridge' (if it compares legacy vs modern) or 'concept' (for standalone core pillars). " +
|
||||
" - 'summary': exact 2-sentence distillation for the Contextual Panel. " +
|
||||
" - 'key_terms': array of max 5 short strings representing the micro-concepts hidden inside this section. " +
|
||||
"System keys configuration: All JSON keys ('nodes', 'links', 'id', 'label', 'group', 'summary', 'key_terms', 'source', 'target', 'type') must remain strictly in English. " +
|
||||
"Return ONLY minified JSON. Schema: " +
|
||||
"{ " +
|
||||
" \"graph\": { " +
|
||||
" \"nodes\": [ " +
|
||||
" { \"id\": \"seg-X\", \"label\": \"string\", \"group\": \"concept|bridge\", \"summary\": \"string\", \"key_terms\": [ \"string\" ] } " +
|
||||
" ], " +
|
||||
" \"links\": [ " +
|
||||
" { \"source\": \"seg-X\", \"target\": \"seg-Y\", \"type\": \"maps_to|contains\" } " +
|
||||
" ] " +
|
||||
" } " +
|
||||
"}";
|
||||
|
||||
public const string SummaryAndQuizPrompt =
|
||||
"You are an expert educator. Provide a concise summary of the text and generate a challenging quiz (3-5 questions). " +
|
||||
"**LANGUAGE CRITICAL**: Detect the language of the provided text. The generated 'summary', 'question', and 'options' MUST be in the EXACT SAME LANGUAGE as the source text. " +
|
||||
"Return ONLY minified JSON. Schema: { \"summary\": \"string\", \"quizzes\": [ { \"question\": \"string\", \"options\": [ \"string\" ], \"correct_index\": 0 } ] }";
|
||||
|
||||
public const string KM_ExtractionPrompt =
|
||||
"You are an expert at Knowledge Engineering. Segment the provided text into discrete Knowledge Units. " +
|
||||
"**LANGUAGE CRITICAL**: Detect the language of the provided text. The 'content' field MUST be in the EXACT SAME LANGUAGE as the source text. " +
|
||||
"Identify 'units' (sections, tables, definitions, rules) and 'links' (how they relate). " +
|
||||
"CRITICAL: Units must be granular. " +
|
||||
"CRITICAL: Code blocks must be summarized under the parent unit or represented as a single 'Code Example' unit. Do NOT segment code blocks into granular low-level code details (e.g., classes, variables, parameters). " +
|
||||
"CRITICAL: Code blocks must be summarized under the parent unit or represented as a single 'Code Example' unit (translate the name to the detected language). Do NOT segment code blocks into granular low-level code details (e.g., classes, variables, parameters). " +
|
||||
"CRITICAL SYSTEM VALUES: The fields 'type' (strictly: 'Section', 'Table', 'Definition', or 'Rule') and 'relation' (strictly: 'Next', 'Defines', 'Contains', or 'References') are system keys and MUST remain in English as specified. " +
|
||||
"Schema: { " +
|
||||
"\"units\": [ { \"id\": \"string\", \"type\": \"Section|Table|Definition|Rule\", \"content\": \"string\", \"metadata\": { \"page\": 0 } } ], " +
|
||||
"\"links\": [ { \"source\": \"string\", \"target\": \"string\", \"relation\": \"Next|Defines|Contains|References\" } ] " +
|
||||
"}.";
|
||||
|
||||
public const string GroundedRAGSystemPrompt = """
|
||||
You are an advanced, extremely precise Fact-Checking AI assistant. Your task is to answer the user's question using ONLY the provided context blocks.
|
||||
|
||||
Strict Grounding Rules:
|
||||
1. Rely EXCLUSIVELY on the provided context. Do NOT use any pre-existing external knowledge, facts, or assumptions.
|
||||
2. If the context does not contain the answer, you must set the "answer" property in the JSON object exactly to: 'I cannot answer this based on the provided book context.' and the "citations" array must be empty.
|
||||
3. For every statement or claim you make in your answer, you must cite the specific source IDs (e.g., source chunk ID or hash) from the context.
|
||||
4. You must format your response ONLY as a JSON object matching the following structure:
|
||||
{
|
||||
"answer": "The answer text goes here, referencing [Source ID] as citations.",
|
||||
"citations": [
|
||||
{
|
||||
"citationId": "The exact source ID cited (e.g., chunk hash/ID)",
|
||||
"snippet": "The precise sentence or phrase from the context that supports this statement.",
|
||||
"sourceBook": "The book title or 'Unknown'"
|
||||
}
|
||||
]
|
||||
}
|
||||
""";
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user