fix: preserve and render EPUB images via dynamic server endpoint #65
Reference in New Issue
Block a user
Delete Branch "bugfix/issue-64-epub-images"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Fixes #64
Summary of Changes
IEpubReader&EpubReaderService: AddedGetEpubResourceAsyncto handle binary data extraction of static assets (like images) from the EPUB archive.WasmEpubServiceto retrieve static resources from the server using the API client.ExtractParagraphsandSanitizeParagraphto treat<img>tags as first-class citizens, preserving theirsrcattributes and excluding them from sanitization stripping.ResolveRelativePath) and rewrote imagesrcattributes to use the dynamic endpoint/api/epub/{ebookId}/resource?path=..../api/epub/{ebookId:guid}/resourceminimal API endpoint inProgram.csthat maps requests directly toGetEpubResourceAsyncand returns files with the correct MIME type.EpubReaderServiceTests.csto verify all image extraction, path resolution, and sanitization/rewriting rules. All tests pass successfully.Code Review — PR #65: EPUB Image Rendering via Dynamic Server Endpoint
This PR cleanly solves a real user-facing problem (broken images in EPUB rendering) and introduces a well-structured binary resource API. The
ResolveRelativePathpath-traversal normalizer, theAllFiles.Localfallback loop, and the dualEpubLocalByteContentFileRef/EpubLocalTextContentFileRefbranch are all solid. Good unit-test coverage for the core logic.However, there are several issues that need to be addressed before merge.
🔴 Blocking Issues
1. Path traversal vulnerability on
/api/epub/{ebookId}/resourceThe
pathquery parameter is accepted raw, URL-decoded, and passed directly intoGetEpubResourceAsync. While the lookup is scoped toAllFiles.Localwithin the EPUB archive (which mitigates most risk), thedecodedPathis never validated. A specially crafted request like?path=../../appsettings.jsonmight not match any file in the EPUB, but it discloses that the server processes arbitrary paths. More importantly, theSanitizeParagraphwhitelist whitelists<img>tags but does not strip thesrcattribute of rewritten<img>tags, so a manipulated EPUB can inject an arbitrary URL as an image src. The endpoint must validate that the decoded path does not contain..segments before attempting any lookup.2.
SanitizeParagraphstrips<img>tag attributes exceptsrcThe updated sanitizer whitelist (
|img) preserves<img>tags from stripping, but the attribute-stripping regex on line 300 only strips attributes fromb|i|strong|em|...— it does NOT strip attributes from<img>. This means raw<img>attributes likeonerror="...",onload="...", orstyle="..."pass through the sanitizer untouched, potentially enabling XSS if an EPUB contains a crafted HTML file.3.
Regexinstantiated per call inRewriteImageUrlsnew Regex(...)is called on every chapter load. This constructs and compiles the regex on every invocation. Use astatic readonlyfield (withRegexOptions.Compiled) to avoid this repeated allocation.🟡 Design / Architecture Concerns
4.
WasmEpubReader.GetEpubResourceAsyncis a dead-code pathThe WASM client implements
GetEpubResourceAsyncbut the rewritten imagesrcattributes point to/api/epub/{id}/resource?path=...which will be served by the browser as a direct HTTP request — not viaHttpClient. The WASM implementation ofIEpubReader.GetEpubResourceAsyncwill never be called in the browser context. The method can remain for interface compliance, but this should be documented clearly to avoid confusion.5.
Login.razorinjectsIConfiguration— Client/Server boundary violationIConfigurationis injected directly intoLogin.razor. On the WASM client (whereLogin.razorrenders after hydration),IConfigurationis backed byappsettings.jsonwhich is publicly accessible and shipped with the WASM bundle. Feature flags likeFeatures:AllowRegistrationmust not be treated as security gates on the client — they can only influence UI presentation. The actual guard belongs on the server route handler. This is documented in the blazor-hybrid-bridge skill.6.
Register.razorusesNavigationManager.NavigateToinOnInitializedCalling
NavigationManager.NavigateToduringOnInitializedin a Blazor component (particularly when running in SSR pre-render mode) can causeNavigationExceptionon the server side. The correct pattern is to call it insideOnAfterRenderAsync(firstRender: true)or use[Authorize]+ a server-side policy.🟢 Minor / Suggestions
7. Missing
Cache-Controlheader on resource endpoint/api/epub/{ebookId}/resourcereturns binary assets (images, fonts) without any cache headers. These are immutable for a given EPUB file, so returningCache-Control: public, max-age=86400(or even usingETag) would eliminate redundant roundtrips on every chapter navigation.8.
GetEpubResourceAsyncopens the entire EPUB twice per chapter loadWhen rendering a chapter with N images,
EpubReader.OpenBookAsync(fullPath)is called N times — once per image request. Consider adding a short-lived memory cache keyed on(ebookId, resourcePath)to deduplicate requests within the same reading session.9.
SerilogDemo.razor—eval()replacement is correct but incompleteThe
TriggerJsExceptionrefactor correctly replaceseval()with a non-existent function call. However, thecatchblock swallows theJSExceptionand logs it viaILogger— which means the "exception is captured" story is only partially true (it is captured in .NET, not surfaced in the browser console). Aconsole.error()call should be added via JSRuntime to complete the diagnostic loop.10.
IReaderStateServicethread safetyReaderStateServiceproperties (CurrentScrollPercentage,CurrentBlockId, etc.) are updated from[JSInvokable]methods which can be called from the JS event loop concurrently with Blazor's render cycle. WithoutvolatileorInterlocked, there's a benign but real race. At minimum, add a comment acknowledging this and noting that Blazor'sInvokeAsync(StateHasChanged)call is the synchronization boundary.Hi @mjasin,
I have fully resolved the feedback from PR #65 code review. Here is a summary of the implementation:
🔴 Blocking Issues Resolved:
/api/epub/{ebookId}/resourceinProgram.cs. Any paths containing directory traversal (..), colons (:), or starting with/or\\are blocked instantly with400 BadRequest.EpubReaderService.GetEpubResourceAsyncservice method to prevent path escaping.EpubReaderService.SanitizeParagraphto perform comprehensive attribute stripping on<img>tags. It now strips all attributes (e.g.onerror,onload,style,class) and exclusively whitelists and parses validsrcandaltattributes to prevent any XSS vector.EpubReaderService.cs(includingRewriteImageUrls,ExtractParagraphs, andSanitizeParagraph) to usestatic readonlycompiled Regex fields, improving performance and eliminating per-call memory allocations.🟡 Design & Architecture Concerns Addressed:
Features:AllowRegistrationandFeatures:AllowPasswordResetconfigurations from the rawIConfigurationservice.FeatureSettingsoptions model and registered it as a singleton in the DI containers for the Server (Program.cs), Web.Client (Program.cs), and MAUI (MauiProgram.cs).IConfigurationinjection inLogin.razorandRegister.razorwithFeatureSettings, avoiding potential leaks of sensitive configuration settings to the client WASM bundle.🟢 Minor Concerns & Suggestions:
Cache-Control: public, max-age=86400header response to/api/epub/{ebookId}/resourceinProgram.csto eliminate redundant roundtrips on chapter navigation.SerilogDemo.razorto log the captured JS exception back to the browser developer tools using aconsole.errorJSInterop call.ReaderStateServiceoutlining the lock-guarded property setters and the BlazorInvokeAsync(StateHasChanged)synchronization boundary for asynchronous JS updates.🧪 Automated Test Coverage:
EpubReaderServiceTests.cs:SanitizeParagraph_StripsUnsafeAttributesFromImgTags: Verifies that onlysrcandaltare preserved while malicious attributes likeonerrororonloadare stripped.RewriteImageUrls_BlocksJavaScriptScheme: Assures that script schemes inside image sources are immediately blocked.GetEpubResourceAsync_RejectsInvalidResourcePaths: Programmatically verifies that traversal strings like../../appsettings.jsonare rejected.dotnet testreturned 100% success!).