OpenReader Preloads Audio, Makes Every Document a Private Audiobook

OpenReader v3.0.0 is a self-hosted web application that reads EPUB, PDF, TXT, Markdown, and DOCX files aloud while highlighting each word in sync. It functions as a private document reader and audiobook generator, handling everything from layout-aware text extraction to sentence-level speech alignment. This new version layers preloading, persistent caching, and admin controls on top of an already mature open-source stack.
Developer Richardr1126 has maintained the project for over a year, steadily growing it to 300+ GitHub stars under the previous name OpenReader-WebUI. The v3.0.0 release focuses on making self-hosting smoother and more practical: audio is synthesized before playback reaches it, and admins can manage TTS providers without touching config files. The tool was built to give people full ownership of their texts, voices, and playback without depending on paid cloud services.
Preloaded audio and runtime admin controls
- TTS audio preloads across upcoming pages.
- Persistent caching on embedded or S3 storage.
- New admin panel for multiple TTS providers.
- Site-wide feature flags without redeploying.
- Layout-aware PDF parsing with geometry highlights.
- Word-level alignment via ONNX Whisper.
- Segment-based read-along for five document formats.
- Audiobook export in m4b or mp3.
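The preloading and caching items in the list above can be pictured with a small sketch. This is not OpenReader's actual code: the class name `PreloadingAudioCache`, the `lookahead` and `capacity` parameters, and the `fake_tts` stand-in are all hypothetical, and a real reader would presumably synthesize in the background and persist audio to disk or S3 rather than hold it in memory. The sketch only shows the idea of fetching a few pages ahead of the listener and reusing what has already been synthesized.

```python
from collections import OrderedDict
from typing import Callable


class PreloadingAudioCache:
    """Hypothetical sketch: keep audio for the current page and a few pages ahead."""

    def __init__(self, synthesize: Callable[[int], bytes],
                 lookahead: int = 2, capacity: int = 8) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._synthesize = synthesize   # stand-in for a call to a TTS backend
        self._lookahead = lookahead     # how many upcoming pages to prepare
        self._capacity = capacity       # maximum number of pages kept in memory
        self._cache: "OrderedDict[int, bytes]" = OrderedDict()

    def _ensure(self, page: int) -> bytes:
        """Return cached audio for a page, synthesizing it only on a miss."""
        if page in self._cache:
            self._cache.move_to_end(page)        # mark as recently used
        else:
            self._cache[page] = self._synthesize(page)
            while len(self._cache) > self._capacity:
                self._cache.popitem(last=False)  # evict least recently used
        return self._cache[page]

    def get(self, page: int, last_page: int) -> bytes:
        """Return audio for `page`, then prepare the next `lookahead` pages."""
        audio = self._ensure(page)
        for ahead in range(page + 1, min(page + self._lookahead, last_page) + 1):
            self._ensure(ahead)
        return audio


# Usage with a fake TTS function that records which pages were synthesized.
calls = []


def fake_tts(page: int) -> bytes:
    calls.append(page)
    return f"audio-{page}".encode()


cache = PreloadingAudioCache(fake_tts, lookahead=2)
cache.get(0, last_page=9)  # synthesizes pages 0, 1, and 2
cache.get(1, last_page=9)  # pages 1 and 2 are cached; only page 3 is synthesized
```

In this sketch the second call triggers one new synthesis instead of three, which is the practical benefit of preloading: by the time the listener turns the page, the audio already exists.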
This reader is for anyone who handles sensitive or local documents and wants to listen instead of read. It works with self-hosted TTS backends like Kokoro-FastAPI, so audio stays entirely on your own hardware. Privacy-conscious professionals, local AI hobbyists, and small agencies can use it to turn reports, books, or notes into spoken content with accurate highlighting and chaptered audiobook exports.
Developer invites community input
The project has been live in various forms for over a year, and the developer actively tracks feature requests and bug reports on GitHub. The Reddit announcement flags no major known limitations, but the architecture assumes you are comfortable running Docker containers or deploying to platforms like Vercel. Future improvements will likely depend on community suggestions, as the repository is open to pull requests.
“TTS now preloads audio across multiple pages ahead of where you are.” — Source: Reddit