Bug: Images within EPUB books not displaying in the reader canvas #64
Bug Description
Images inside EPUB books are currently not rendering within the book reader page.
Technical Analysis
Sanitization & Extraction:
- `EpubReaderService.ExtractParagraphs` uses a regex that only captures specific tags (`p`, `h[1-6]`, `ul`, `ol`, `blockquote`, `pre`, `hr`). It does not capture `<img>` tags at the root level, nor nested ones that are not properly enclosed in one of these containers.
- `EpubReaderService.SanitizeParagraph` has a strict whitelist regex: `clean = Regex.Replace(clean, @"<(?!/?(b|i|strong|em|h[1-6]|p|ul|ol|li|blockquote|pre|code|br|hr)\b)[^>]+>", "", RegexOptions.IgnoreCase);`. This strips out any tag not in the whitelist, which deletes `<img>` tags entirely.
- `SanitizeParagraph` then runs `clean = Regex.Replace(clean, @"<(b|i|strong|em|h[1-6]|p|ul|ol|li|blockquote|pre|code|br|hr)\b[^>]*>", "<$1>", RegexOptions.IgnoreCase);`, which removes all attributes (like `src`, `alt`, etc.) from the tags that remain.
Image Serving Endpoint:
Even if `<img>` tags with their `src` attribute are preserved, the browser cannot resolve relative EPUB zip paths (e.g. `../images/pic1.png`). We need a server endpoint (e.g. `/api/epub/{ebookId}/resource?path={path}`) that can read the requested resource file dynamically from the EPUB archive.
URL Rewriting:
In `EpubReaderService`, when parsing the HTML content of a chapter, we must rewrite the `src` attribute of `<img>` tags from their relative paths inside the EPUB to our web-accessible resource endpoint.
Proposed Solution
- Add `GetEpubResourceAsync` to `IEpubReader` and `EpubReaderService` to retrieve binary resource files (images) from an EPUB.
- Add an endpoint `/api/epub/{ebookId:guid}/resource` in `Program.cs` that returns the image bytes with the correct MIME type.
- Update `EpubReaderService.ExtractParagraphs` to match `<img>` elements if they appear at root level (or ensure the regex is flexible enough).
- Update `EpubReaderService.SanitizeParagraph` to preserve `<img>` tags along with their `src` attributes, and rewrite them to reference `/api/epub/{ebookId}/resource?path={resolvedPath}`.
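As a rough illustration of the sanitizer and URL-rewriting changes, here is a minimal sketch. It is not the project's actual code: the class `ImgSanitizerSketch`, the methods `Sanitize` and `ResolvePath`, and the `chapterPath` parameter are hypothetical names. It assumes the chapter's path inside the archive is known, that paths use forward slashes, and that `src` values are quoted. Images with an external or `data:` source are simply dropped, which is a design assumption rather than something this issue specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sketch; names and signatures are assumptions, not the real service.
public static class ImgSanitizerSketch
{
    // Same whitelist as in the issue, with img added so the tag survives this pass.
    private static readonly Regex DisallowedTag = new Regex(
        @"<(?!/?(b|i|strong|em|h[1-6]|p|ul|ol|li|blockquote|pre|code|br|hr|img)\b)[^>]+>",
        RegexOptions.IgnoreCase);

    // Attribute stripping stays unchanged; img is deliberately not in this list.
    private static readonly Regex AllowedOpenTag = new Regex(
        @"<(b|i|strong|em|h[1-6]|p|ul|ol|li|blockquote|pre|code|br|hr)\b[^>]*>",
        RegexOptions.IgnoreCase);

    private static readonly Regex ImgTag = new Regex(@"<img\b[^>]*>", RegexOptions.IgnoreCase);

    // Matches a quoted src attribute, but not data-src or similar.
    private static readonly Regex SrcAttr = new Regex(
        @"(?<![\w-])src\s*=\s*(?:""([^""]*)""|'([^']*)')", RegexOptions.IgnoreCase);

    // Resolves a relative src against the directory of the chapter file.
    public static string ResolvePath(string chapterPath, string src)
    {
        var segments = new List<string>(chapterPath.Split('/'));
        segments.RemoveAt(segments.Count - 1); // drop the chapter file name
        foreach (var part in src.Split('/'))
        {
            if (part == "" || part == ".") continue;
            if (part == "..")
            {
                if (segments.Count > 0) segments.RemoveAt(segments.Count - 1);
            }
            else
            {
                segments.Add(part);
            }
        }
        return string.Join("/", segments);
    }

    public static string Sanitize(string html, Guid ebookId, string chapterPath)
    {
        string clean = DisallowedTag.Replace(html, "");
        clean = AllowedOpenTag.Replace(clean, "<$1>");

        // Rebuild each img with only a rewritten src; every other attribute is dropped.
        clean = ImgTag.Replace(clean, m =>
        {
            Match src = SrcAttr.Match(m.Value);
            if (!src.Success) return "";
            string value = src.Groups[1].Success ? src.Groups[1].Value : src.Groups[2].Value;
            if (value.Contains(":")) return ""; // external or data: source, dropped (assumption)
            string resolved = ResolvePath(chapterPath, value);
            return "<img src=\"/api/epub/" + ebookId + "/resource?path="
                   + Uri.EscapeDataString(resolved) + "\">";
        });
        return clean;
    }
}
```

With a chapter at `OEBPS/text/ch1.xhtml`, a `src` of `../images/pic1.png` resolves to `OEBPS/images/pic1.png`, which is then URL-encoded into the `path` query parameter. The sketch does not decode HTML entities or percent-encoding in `src`, and it ignores `alt`; both would need handling in a real implementation.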
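The resource lookup behind the proposed endpoint might be sketched as follows. This is a synchronous illustration only, using `System.IO.Compression.ZipArchive`; the class `EpubResourceSketch`, the method `ReadResource`, and the extension-to-MIME table are assumptions and not the `GetEpubResourceAsync` signature, which this issue does not spell out. Restricting lookups to known image extensions is an added precaution, not a stated requirement.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical sketch; the real service would likely be async and open the EPUB by ebookId.
public static class EpubResourceSketch
{
    // Assumed set of image types; extend as needed.
    private static readonly Dictionary<string, string> MimeTypes =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            [".png"] = "image/png",
            [".jpg"] = "image/jpeg",
            [".jpeg"] = "image/jpeg",
            [".gif"] = "image/gif",
            [".svg"] = "image/svg+xml",
            [".webp"] = "image/webp",
        };

    // Returns the entry bytes and MIME type, or null if the path is unknown or not an image.
    public static (byte[] Bytes, string ContentType)? ReadResource(Stream epub, string path)
    {
        if (!MimeTypes.TryGetValue(Path.GetExtension(path), out string contentType))
            return null;

        // An EPUB is a zip archive; entry names use forward slashes.
        using var archive = new ZipArchive(epub, ZipArchiveMode.Read, leaveOpen: true);
        ZipArchiveEntry entry = archive.GetEntry(path);
        if (entry == null) return null;

        using Stream entryStream = entry.Open();
        using var buffer = new MemoryStream();
        entryStream.CopyTo(buffer);
        return (buffer.ToArray(), contentType);
    }
}
```

In `Program.cs`, a minimal API handler for `/api/epub/{ebookId:guid}/resource` could call something like this and return `Results.File(bytes, contentType)` on success or `Results.NotFound()` otherwise. Because `GetEntry` only matches exact entry names, a `path` containing `..` will not escape the archive, but it will also not match unless it was resolved beforehand.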