# 🤖 LLM Agent Implementation Backlog: AI Semantic Integration
**Project Context:** .NET 10, EF Core (SQLite), `Microsoft.Extensions.AI`.
**Core Goal:** Integrate Gemini 1.5 Flash with a persistent Semantic Cache to minimize API costs and latency.
---
## 🏗️ Phase 1: Persistence & Domain Layer
**Objective:** Define the storage schema to prevent redundant AI calls.
### Task 1.1: Create `SemanticKnowledgeCache` Entity
* **Target Folder:** `Core/Entities` or `Infrastructure/Persistence/Entities`.
* **Requirements:**
  * Create a class `SemanticKnowledgeCache`.
  * **Properties:**
    * `string ContentHash` (Key, fixed length 64).
    * `string JsonData` (Required, stores the serialized AI output).
    * `string ModelId` (Default: `"gemini-1.5-flash"`).
    * `string PromptVersion` (Default: `"1.0"`).
    * `DateTime CreatedAt` (UTC).
* **LLM Instructions:** "Generate an EF Core entity for `SemanticKnowledgeCache`. Ensure `ContentHash` has a unique index for fast lookups."
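A possible reference sketch for the agent. The entity properties follow the backlog; the `AppDbContext` name and the fluent configuration are illustrative assumptions:

```csharp
using System;
using Microsoft.EntityFrameworkCore;

public class SemanticKnowledgeCache
{
    public string ContentHash { get; set; } = default!;   // SHA-256 hex, 64 chars
    public string JsonData { get; set; } = default!;      // serialized AI output
    public string ModelId { get; set; } = "gemini-1.5-flash";
    public string PromptVersion { get; set; } = "1.0";
    public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
}

public class AppDbContext : DbContext   // assumed context name
{
    public DbSet<SemanticKnowledgeCache> Cache => Set<SemanticKnowledgeCache>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        var entity = modelBuilder.Entity<SemanticKnowledgeCache>();
        // The hash itself is the primary key, which already gives a unique index.
        entity.HasKey(e => e.ContentHash);
        entity.Property(e => e.ContentHash).HasMaxLength(64).IsFixedLength();
        entity.Property(e => e.JsonData).IsRequired();
    }
}
```

Making `ContentHash` the primary key satisfies the unique-index requirement without a second index on the same column.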
### Task 1.2: Implement Hashing Utility
* **Target Folder:** `Core/Helpers` or `Infrastructure/Security`.
* **Requirements:**
  * Create a `ContentHasher` class.
  * Method `string ComputeHash(string input)`.
  * **Logic:** Normalize input (trim, lower-case) -> compute SHA-256 -> return hex string.
* **LLM Instructions:** "Create a thread-safe utility to generate SHA-256 hashes from strings. Ensure it handles nulls and whitespace consistently."
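A minimal sketch of the utility. `SHA256.HashData` is a static, stateless API, so the class is thread-safe without locking:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ContentHasher
{
    // Thread-safe: SHA256.HashData keeps no shared state between calls.
    public static string ComputeHash(string? input)
    {
        // Normalize so "  Hello " and "hello" map to the same cache row.
        string normalized = (input ?? string.Empty).Trim().ToLowerInvariant();
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(normalized));
        return Convert.ToHexString(hash).ToLowerInvariant(); // 64-char hex string
    }
}
```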
---
## 🧠 Phase 2: AI Client & Contract Definition
**Objective:** Set up the communication bridge with Google Gemini API.
### Task 2.1: Define Data Transfer Objects (DTOs)
* **Target Folder:** `Core/DTOs/AI`.
* **Requirements:**
  * Define a `KnowledgePacket` record containing `List<KeyConcept>` and `List<QuizQuestion>`.
  * Use `[JsonPropertyName]` attributes for strict JSON mapping.
* **LLM Instructions:** "Define immutable records for the AI response schema. Ensure they match the expected JSON structure from the system prompt."
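One possible shape for these records. Only the `KnowledgePacket` container is fixed by this backlog; the fields on `KeyConcept` and `QuizQuestion` are illustrative assumptions:

```csharp
using System.Collections.Generic;
using System.Text.Json.Serialization;

// Field names below are assumptions; the JSON property names for the
// container ("concepts", "quizzes") match the system prompt in Task 3.2.
public sealed record KeyConcept(
    [property: JsonPropertyName("term")] string Term,
    [property: JsonPropertyName("definition")] string Definition);

public sealed record QuizQuestion(
    [property: JsonPropertyName("question")] string Question,
    [property: JsonPropertyName("options")] List<string> Options,
    [property: JsonPropertyName("answerIndex")] int AnswerIndex);

public sealed record KnowledgePacket(
    [property: JsonPropertyName("concepts")] List<KeyConcept> Concepts,
    [property: JsonPropertyName("quizzes")] List<QuizQuestion> Quizzes);
```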
### Task 2.2: Infrastructure AI Client Setup
* **Target:** `Program.cs` / dependency injection.
* **Requirements:**
  * Install `Microsoft.Extensions.AI` and `Microsoft.Extensions.AI.Google`.
  * Register `IChatClient` using `GoogleChatClient`.
  * Inject the `ApiKey` from `IConfiguration`.
* **LLM Instructions:** "Register the GoogleChatClient in the DI container. Use the .NET 10 `AddChatClient` extension pattern."
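A hedged sketch of the registration. `GoogleChatClient` and its constructor are assumed from the package named above, and the `AddChatClient` overload shapes have shifted across `Microsoft.Extensions.AI` preview versions, so the agent should verify against the installed package:

```csharp
var builder = WebApplication.CreateBuilder(args);

// Configuration key "Gemini:ApiKey" is an assumed name.
string apiKey = builder.Configuration["Gemini:ApiKey"]
    ?? throw new InvalidOperationException("Gemini:ApiKey is not configured.");

// GoogleChatClient's constructor signature is illustrative.
builder.Services.AddChatClient(services =>
    new GoogleChatClient(modelId: "gemini-1.5-flash", apiKey: apiKey));
```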
---
## ⚙️ Phase 3: Service Orchestration (The "Smart" Logic)
**Objective:** Implement the caching proxy logic.
### Task 3.1: Create `KnowledgeService` Implementation
* **Target Folder:** `Application/Services`.
* **Logic Flow:**
  1. `hash = ContentHasher.ComputeHash(inputText)`.
  2. `cached = await dbContext.Cache.FirstOrDefaultAsync(h => h.ContentHash == hash)`.
  3. If `cached` exists AND its `PromptVersion` matches -> deserialize and return.
  4. Else -> call `IChatClient.CompleteAsync<KnowledgePacket>(...)`.
  5. Save the result to the DB under the hash -> return.
* **LLM Instructions:** "Implement a service that acts as a proxy between the UI and the Gemini API. It must prioritize SQLite cache hits over API calls."
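The flow above could be sketched as follows. `AppDbContext` is the assumed context name from Task 1.1, and `CompleteAsync<T>` follows the signature named in the flow (the agent should confirm it against the installed `Microsoft.Extensions.AI` version); error handling is omitted:

```csharp
using System.Text.Json;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.AI;

public sealed class KnowledgeService(AppDbContext db, IChatClient chat)
{
    private const string PromptVersion = "1.0";

    public async Task<KnowledgePacket> AnalyzeAsync(string inputText, CancellationToken ct = default)
    {
        string hash = ContentHasher.ComputeHash(inputText);

        // 1-3. Cache hit on matching hash and prompt version: skip the API entirely.
        var cached = await db.Cache.FirstOrDefaultAsync(c => c.ContentHash == hash, ct);
        if (cached is not null && cached.PromptVersion == PromptVersion)
            return JsonSerializer.Deserialize<KnowledgePacket>(cached.JsonData)!;

        // 4. Cache miss: call Gemini for a structured KnowledgePacket.
        var response = await chat.CompleteAsync<KnowledgePacket>(inputText, cancellationToken: ct);
        var packet = response.Result;

        // 5. Persist under the hash so the next identical input is free.
        db.Cache.Add(new SemanticKnowledgeCache
        {
            ContentHash = hash,
            JsonData = JsonSerializer.Serialize(packet),
            PromptVersion = PromptVersion,
        });
        await db.SaveChangesAsync(ct);
        return packet;
    }
}
```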
### Task 3.2: System Prompt Engineering
* **Requirements:**
  * Create a `PromptRegistry` class.
  * **System Message:** "You are an educational assistant. Analyze the text and output ONLY valid minified JSON. Schema: { 'concepts': [], 'quizzes': [] }. Do not include markdown formatting like \`\`\`json."
* **LLM Instructions:** "Craft a high-precision system prompt for Gemini 1.5 Flash to ensure it returns parseable JSON without unnecessary tokens."
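A minimal sketch of the registry: a single versioned home for prompts, so the `PromptVersion` stored on each cache row can be compared against the prompt actually in use:

```csharp
public static class PromptRegistry
{
    // Bump this whenever the prompt text changes, invalidating old cache rows.
    public const string Version = "1.0";

    public const string AnalyzeSystemMessage =
        "You are an educational assistant. Analyze the text and output ONLY valid " +
        "minified JSON. Schema: { \"concepts\": [], \"quizzes\": [] }. " +
        "Do not include markdown formatting like ```json.";
}
```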
---
## 🛡️ Phase 4: Resilience & Optimization
**Objective:** Handle API limits and monitor performance.
### Task 4.1: Resilience Pipeline (Polly)
* **Requirements:**
  * Implement an HTTP retry policy specifically for `429 Too Many Requests`.
  * Use exponential backoff with jitter.
* **LLM Instructions:** "Add a resilience pipeline to the AI client using Polly. Handle rate-limiting gracefully to stay within the Gemini Free Tier limits."
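A sketch using Polly v8's `ResiliencePipeline`; the attempt count and base delay are illustrative defaults, not tuned values:

```csharp
using System.Net;
using Polly;
using Polly.Retry;

var pipeline = new ResiliencePipelineBuilder<HttpResponseMessage>()
    .AddRetry(new RetryStrategyOptions<HttpResponseMessage>
    {
        // Retry only on 429 Too Many Requests.
        ShouldHandle = new PredicateBuilder<HttpResponseMessage>()
            .HandleResult(r => r.StatusCode == HttpStatusCode.TooManyRequests),
        MaxRetryAttempts = 4,
        BackoffType = DelayBackoffType.Exponential,
        UseJitter = true,                  // spreads retries to avoid synchronized bursts
        Delay = TimeSpan.FromSeconds(1),   // base delay before exponential growth
    })
    .Build();
```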
### Task 4.2: Request Pre-processing (Token Saving)
* **Logic:**
  * Check the input string length.
  * If `length > threshold`, truncate or throw an error to prevent a massive token spend.
* **LLM Instructions:** "Add a guard clause to the KnowledgeService to validate input size before calling the API. Log the estimated token count."
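An illustrative guard clause. Both the character threshold and the 4-characters-per-token heuristic are assumptions, not Gemini-documented figures:

```csharp
using System;
using Microsoft.Extensions.Logging;

public static class InputGuard
{
    private const int MaxChars = 20_000; // hypothetical threshold

    public static string EnsureWithinBudget(string input, ILogger logger)
    {
        // Rough heuristic: ~4 characters per token for English prose.
        int estimatedTokens = input.Length / 4;
        logger.LogInformation("Estimated token count: {Tokens}", estimatedTokens);

        if (input.Length > MaxChars)
            throw new ArgumentException(
                $"Input exceeds {MaxChars} characters; refusing the API call to cap token spend.");

        return input;
    }
}
```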