# 🤖 LLM Agent Implementation Backlog: AI Semantic Integration
**Project Context:** .NET 10, EF Core (SQLite), `Microsoft.Extensions.AI`.
**Core Goal:** Integrate Gemini 1.5 Flash with a persistent Semantic Cache to minimize API costs and latency.
---
## 🏗️ Phase 1: Persistence & Domain Layer
**Objective:** Define the storage schema to prevent redundant AI calls.
### Task 1.1: Create `SemanticKnowledgeCache` Entity
* **Target Folder:** `Core/Entities` or `Infrastructure/Persistence/Entities`.
* **Requirements:**
  * Create a class `SemanticKnowledgeCache`.
  * **Properties:**
    * `string ContentHash` (Key, fixed length 64).
    * `string JsonData` (Required, stores the serialized AI output).
    * `string ModelId` (Default: `"gemini-1.5-flash"`).
    * `string PromptVersion` (Default: `"1.0"`).
    * `DateTime CreatedAt` (UTC).
* **LLM Instructions:** "Generate an EF Core entity for `SemanticKnowledgeCache`. Ensure `ContentHash` has a unique index for fast lookups."
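A possible reference sketch for the agent. The entity properties follow the backlog; the `AppDbContext` name and the fluent configuration are illustrative assumptions:

```csharp
using System;
using Microsoft.EntityFrameworkCore;

public class SemanticKnowledgeCache
{
    public string ContentHash { get; set; } = default!;   // SHA-256 hex, 64 chars
    public string JsonData { get; set; } = default!;      // serialized AI output
    public string ModelId { get; set; } = "gemini-1.5-flash";
    public string PromptVersion { get; set; } = "1.0";
    public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
}

public class AppDbContext : DbContext   // assumed context name
{
    public DbSet<SemanticKnowledgeCache> Cache => Set<SemanticKnowledgeCache>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        var entity = modelBuilder.Entity<SemanticKnowledgeCache>();
        // The hash itself is the primary key, which already gives a unique index.
        entity.HasKey(e => e.ContentHash);
        entity.Property(e => e.ContentHash).HasMaxLength(64).IsFixedLength();
        entity.Property(e => e.JsonData).IsRequired();
    }
}
```

Making `ContentHash` the primary key satisfies the unique-index requirement without a second index on the same column.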
### Task 1.2: Implement Hashing Utility
* **Target Folder:** `Core/Helpers` or `Infrastructure/Security`.
* **Requirements:**
  * Create a `ContentHasher` class.
  * Method `string ComputeHash(string input)`.
  * **Logic:** Normalize input (trim, lower-case) -> compute SHA-256 -> return hex string.
* **LLM Instructions:** "Create a thread-safe utility to generate SHA-256 hashes from strings. Ensure it handles nulls and whitespace consistently."
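A minimal sketch of the utility. `SHA256.HashData` is a static, stateless API, so the class is thread-safe without locking:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ContentHasher
{
    // Thread-safe: SHA256.HashData keeps no shared state between calls.
    public static string ComputeHash(string? input)
    {
        // Normalize so "  Hello " and "hello" map to the same cache row.
        string normalized = (input ?? string.Empty).Trim().ToLowerInvariant();
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(normalized));
        return Convert.ToHexString(hash).ToLowerInvariant(); // 64-char hex string
    }
}
```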
---
## 🧠 Phase 2: AI Client & Contract Definition
**Objective:** Set up the communication bridge with Google Gemini API.
### Task 2.1: Define Data Transfer Objects (DTOs)
* **Target Folder:** `Core/DTOs/AI`.
* **Requirements:**
  * Define a `KnowledgePacket` record containing `List<KeyConcept>` and `List<QuizQuestion>`.
  * Use `[JsonPropertyName]` attributes for strict JSON mapping.
* **LLM Instructions:** "Define immutable records for the AI response schema. Ensure they match the expected JSON structure from the system prompt."
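One possible shape for these records. Only the `KnowledgePacket` container is fixed by this backlog; the fields on `KeyConcept` and `QuizQuestion` are illustrative assumptions:

```csharp
using System.Collections.Generic;
using System.Text.Json.Serialization;

// Field names below are assumptions; the JSON property names for the
// container ("concepts", "quizzes") match the system prompt in Task 3.2.
public sealed record KeyConcept(
    [property: JsonPropertyName("term")] string Term,
    [property: JsonPropertyName("definition")] string Definition);

public sealed record QuizQuestion(
    [property: JsonPropertyName("question")] string Question,
    [property: JsonPropertyName("options")] List<string> Options,
    [property: JsonPropertyName("answerIndex")] int AnswerIndex);

public sealed record KnowledgePacket(
    [property: JsonPropertyName("concepts")] List<KeyConcept> Concepts,
    [property: JsonPropertyName("quizzes")] List<QuizQuestion> Quizzes);
```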
### Task 2.2: Infrastructure AI Client Setup
* **Target:** `Program.cs` / dependency injection.
* **Requirements:**
  * Install `Microsoft.Extensions.AI` and `Microsoft.Extensions.AI.Google`.
  * Register `IChatClient` using `GoogleChatClient`.
  * Inject the `ApiKey` from `IConfiguration`.
* **LLM Instructions:** "Register the GoogleChatClient in the DI container. Use the .NET 10 `AddChatClient` extension pattern."
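A hedged sketch of the registration. `GoogleChatClient` and its constructor are assumed from the package named above, and the `AddChatClient` overload shapes have shifted across `Microsoft.Extensions.AI` preview versions, so the agent should verify against the installed package:

```csharp
var builder = WebApplication.CreateBuilder(args);

// Configuration key "Gemini:ApiKey" is an assumed name.
string apiKey = builder.Configuration["Gemini:ApiKey"]
    ?? throw new InvalidOperationException("Gemini:ApiKey is not configured.");

// GoogleChatClient's constructor signature is illustrative.
builder.Services.AddChatClient(services =>
    new GoogleChatClient(modelId: "gemini-1.5-flash", apiKey: apiKey));
```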
---
## ⚙️ Phase 3: Service Orchestration (The "Smart" Logic)
**Objective:** Implement the caching proxy logic.
### Task 3.1: Create `KnowledgeService` Implementation
* **Target Folder:** `Application/Services`.
* **Logic Flow:**
  1. `hash = ContentHasher.ComputeHash(inputText)`.
  2. `cached = await dbContext.Cache.FirstOrDefaultAsync(h => h.ContentHash == hash)`.
  3. If `cached` exists AND its `PromptVersion` matches -> deserialize and return.
  4. Else -> call `IChatClient.CompleteAsync<KnowledgePacket>(...)`.
  5. Save the result to the DB under the hash -> return.
* **LLM Instructions:** "Implement a service that acts as a proxy between the UI and the Gemini API. It must prioritize SQLite cache hits over API calls."
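The flow above could be sketched as follows. `AppDbContext` is the assumed context name from Task 1.1, and `CompleteAsync<T>` follows the signature named in the flow (the agent should confirm it against the installed `Microsoft.Extensions.AI` version); error handling is omitted:

```csharp
using System.Text.Json;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.AI;

public sealed class KnowledgeService(AppDbContext db, IChatClient chat)
{
    private const string PromptVersion = "1.0";

    public async Task<KnowledgePacket> AnalyzeAsync(string inputText, CancellationToken ct = default)
    {
        string hash = ContentHasher.ComputeHash(inputText);

        // 1-3. Cache hit on matching hash and prompt version: skip the API entirely.
        var cached = await db.Cache.FirstOrDefaultAsync(c => c.ContentHash == hash, ct);
        if (cached is not null && cached.PromptVersion == PromptVersion)
            return JsonSerializer.Deserialize<KnowledgePacket>(cached.JsonData)!;

        // 4. Cache miss: call Gemini for a structured KnowledgePacket.
        var response = await chat.CompleteAsync<KnowledgePacket>(inputText, cancellationToken: ct);
        var packet = response.Result;

        // 5. Persist under the hash so the next identical input is free.
        db.Cache.Add(new SemanticKnowledgeCache
        {
            ContentHash = hash,
            JsonData = JsonSerializer.Serialize(packet),
            PromptVersion = PromptVersion,
        });
        await db.SaveChangesAsync(ct);
        return packet;
    }
}
```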
### Task 3.2: System Prompt Engineering
* **Requirements:**
  * Create a `PromptRegistry` class.
  * **System Message:** "You are an educational assistant. Analyze the text and output ONLY valid minified JSON. Schema: { 'concepts': [], 'quizzes': [] }. Do not include markdown formatting like \`\`\`json."
* **LLM Instructions:** "Craft a high-precision system prompt for Gemini 1.5 Flash to ensure it returns parseable JSON without unnecessary tokens."
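A minimal sketch of the registry: a single versioned home for prompts, so the `PromptVersion` stored on each cache row can be compared against the prompt actually in use:

```csharp
public static class PromptRegistry
{
    // Bump this whenever the prompt text changes, invalidating old cache rows.
    public const string Version = "1.0";

    public const string AnalyzeSystemMessage =
        "You are an educational assistant. Analyze the text and output ONLY valid " +
        "minified JSON. Schema: { \"concepts\": [], \"quizzes\": [] }. " +
        "Do not include markdown formatting like ```json.";
}
```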
---
## 🛡️ Phase 4: Resilience & Optimization
**Objective:** Handle API limits and monitor performance.
### Task 4.1: Resilience Pipeline (Polly)
* **Requirements:**
  * Implement an HTTP retry policy specifically for `429 Too Many Requests`.
  * Use exponential backoff with jitter.
* **LLM Instructions:** "Add a resilience pipeline to the AI client using Polly. Handle rate-limiting gracefully to stay within the Gemini Free Tier limits."
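A sketch using Polly v8's `ResiliencePipeline`; the attempt count and base delay are illustrative defaults, not tuned values:

```csharp
using System.Net;
using Polly;
using Polly.Retry;

var pipeline = new ResiliencePipelineBuilder<HttpResponseMessage>()
    .AddRetry(new RetryStrategyOptions<HttpResponseMessage>
    {
        // Retry only on 429 Too Many Requests.
        ShouldHandle = new PredicateBuilder<HttpResponseMessage>()
            .HandleResult(r => r.StatusCode == HttpStatusCode.TooManyRequests),
        MaxRetryAttempts = 4,
        BackoffType = DelayBackoffType.Exponential,
        UseJitter = true,                  // spreads retries to avoid synchronized bursts
        Delay = TimeSpan.FromSeconds(1),   // base delay before exponential growth
    })
    .Build();
```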
### Task 4.2: Request Pre-processing (Token Saving)
* **Logic:**
  * Check the input string length.
  * If `length > threshold`, truncate or throw an error to prevent a massive token spend.
* **LLM Instructions:** "Add a guard clause to the KnowledgeService to validate input size before calling the API. Log the estimated token count."
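An illustrative guard clause. Both the character threshold and the 4-characters-per-token heuristic are assumptions, not Gemini-documented figures:

```csharp
using System;
using Microsoft.Extensions.Logging;

public static class InputGuard
{
    private const int MaxChars = 20_000; // hypothetical threshold

    public static string EnsureWithinBudget(string input, ILogger logger)
    {
        // Rough heuristic: ~4 characters per token for English prose.
        int estimatedTokens = input.Length / 4;
        logger.LogInformation("Estimated token count: {Tokens}", estimatedTokens);

        if (input.Length > MaxChars)
            throw new ArgumentException(
                $"Input exceeds {MaxChars} characters; refusing the API call to cap token spend.");

        return input;
    }
}
```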