Quidlibet
Revision as of 13:24, 13 April 2026
| Infobox | |
|---|---|
| name | Quidlibet |
| url | https://quidlibet.yusupov.cloud |
| developer | Michel Vuijlsteke |
| released | 2025 |
| genre | AI-generated fictional book library |
| language | Python |
| framework | Flask 3.0 |
| license | Proprietary |
Quidlibet (Latin for "anything whatsoever") is a web application hosted at quidlibet.yusupov.cloud that generates and catalogues entirely fictional books. Each book is a complete literary artefact: a plausible title, a named author with a biographical sketch and portrait, a Markdown-formatted synopsis, an AI-generated cover in genre-appropriate style, a publication date and page count, and between four and eight reader reviews with individually calibrated star ratings. The site presents itself as a browsable library catalogue under the heading "Possible Books," inviting visitors to enter a title and optionally an author and genre for "a book that you'd like to read but that doesn't actually exist." In addition to on-demand generation, an automated pipeline publishes up to twelve new books per day on a cron schedule.
Technology stack
The application is built on Flask 3.0 with Flask-SQLAlchemy as its ORM and Flask-Login for authentication.[1] It uses SQLite as its database and is deployed behind Nginx on a Linux VPS. Additional dependencies include Pillow for image processing, the Python markdown library for rich-text rendering, Beautiful Soup for web scraping, python-dotenv for configuration, and the OpenAI Python client for all language-model and image-generation calls. The front end uses Bootstrap 5.3 with a custom CSS layer, Google Fonts (Roboto Serif, Roboto Slab, Roboto), and Bootstrap Icons. The site is installable as a progressive web application via a Web App Manifest and a minimal service worker.
Data model
Books
Each book record carries a title, a foreign key to an Author, a foreign key to a Genre (with a legacy genre-name string for backward compatibility), a Markdown synopsis, a num_pages integer, an ISO 8601 publication_date, a cover_image path, an avg_rating and rating_count (derived from reviews), and the ip_address and timestamp of the generation request. Books are addressed by URL slug in the form /book/{id}-{title-slug}, where the leading integer guarantees uniqueness and the slug provides readability.
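The URL scheme can be illustrated with a minimal sketch. The helper names (slugify, book_path, book_id_from_path) and the exact slug rules are assumptions for illustration, not taken from the project's source; the sketch only shows why the leading integer is enough to resolve a book even when the slug is stale.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Lower-case ASCII slug: runs of other characters become single hyphens."""
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def book_path(book_id: int, title: str) -> str:
    """Build a path of the form /book/{id}-{title-slug}."""
    return f"/book/{book_id}-{slugify(title)}"


def book_id_from_path(path: str) -> int:
    """Recover the id from the leading integer; the slug itself is ignored."""
    match = re.match(r"^/book/(\d+)(?:-|$)", path)
    if match is None:
        raise ValueError(f"not a book path: {path!r}")
    return int(match.group(1))
```

Because only the integer is used for lookup, a renamed book would remain reachable under its old slug in this sketch.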
Authors
Authors are stored with a name in "Lastname, Firstname" format (rendered as "Firstname Lastname" in the interface via a Jinja display-name filter), a Markdown-formatted bio, and an optional author_image path. Author names are matched case-insensitively using Python's difflib.SequenceMatcher with an 80% similarity threshold, so that minor variations do not create duplicate records.
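A display-name filter of this kind could look roughly like the following. The function name display_name and its handling of names without a comma are assumptions; the site's actual filter is not documented here.

```python
def display_name(stored: str) -> str:
    """Turn 'Lastname, Firstname' into 'Firstname Lastname'.

    Names without a comma, or with nothing after it, are returned unchanged
    apart from surrounding whitespace.
    """
    last, sep, first = stored.partition(",")
    if not sep or not first.strip():
        return stored.strip()
    return f"{first.strip()} {last.strip()}"
```

In a Flask application such a function can be exposed to templates with app.add_template_filter(display_name, "display_name") and then used as {{ author.name | display_name }}; the filter name here is illustrative.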
Genres
The Genre model supports hierarchical categorisation via a self-referential parent_id foreign key. Each genre has a name, an optional description, a hex color code for UI tags, an icon class, a display_order, and an is_active flag.
Reviews
Each review belongs to a book and carries a reviewer_name, a review_date (in-universe), Markdown review_text, and a stars rating between one and five. Reviews are cascade-deleted when their parent book is removed.
Supporting models
A GenerationJob tracks the asynchronous lifecycle of each generation request (pending → processing → completed or failed), with a progress_message field polled by the front end. A GenerationLog provides an audit trail of every generation attempt — successful or otherwise — including the model used, the API surface called, and any error message. A GenerationQueueItem implements a database-backed seed queue as an alternative to the file-based daily queue.
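The job lifecycle described above amounts to a small state machine. The sketch below models it with a plain dataclass rather than the application's SQLAlchemy model; the advance and as_poll_payload methods are hypothetical names, and only the status values and the progress_message field come from the description.

```python
from dataclasses import dataclass
from enum import Enum


class JobStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed moves: pending -> processing -> completed or failed.
TRANSITIONS = {
    JobStatus.PENDING: {JobStatus.PROCESSING},
    JobStatus.PROCESSING: {JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


@dataclass
class GenerationJob:
    title: str
    status: JobStatus = JobStatus.PENDING
    progress_message: str = ""

    def advance(self, new_status: JobStatus, message: str = "") -> None:
        """Move to new_status, refusing transitions the lifecycle forbids."""
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(
                f"illegal transition {self.status.value} -> {new_status.value}"
            )
        self.status = new_status
        self.progress_message = message

    def as_poll_payload(self) -> dict:
        """Shape of the data a polling status page might receive."""
        return {
            "status": self.status.value,
            "progress_message": self.progress_message,
        }
```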
Book generation pipeline
Book generation is performed in the main Flask application via a multi-step pipeline of OpenAI API calls. The default text model is GPT-5; the image model is gpt-image-1.
Title and metadata
When a visitor submits a title (and optionally an author and genre), the application creates a GenerationJob and redirects to a status page that polls for progress. While the visitor waits, the status page cycles at random intervals through faux-library-science progress messages such as "Reticulating splines…," "Reconciling author names against authority files…," "De-duplicating near-identical editions and printings…," and "Calibrating star ratings to review sentiment…". Behind the scenes, the pipeline sends a structured prompt to the text model asking it to return a JSON object with the book's synopsis, page count, publication date, author name (invented if not supplied), and an author biography. The prompt instructs the model to write the synopsis in Markdown with bold and italic formatting, to keep it to at most two paragraphs of no more than 120 words each, and to anchor it with concrete names, places, and objects rather than vague abstractions.
Author reuse and continuity
Before generating a new author, the pipeline queries the database for existing authors whose names are at least 80% similar (by sequence-matching across four name-format permutations). If a match is found — which happens for roughly a quarter of all generated books — the pipeline reuses the existing author record. It supplies the model with a context dossier listing the author's existing books, genres, and publication dates, and instructs it to revise the biography so that it naturally covers the author's expanded body of work. A hard constraint requires that any new publication date differ from the author's existing dates by at least six months. Biographies are post-processed by a helper that detects and corrects any instance of the "Lastname, Firstname" storage format that may have leaked into the prose.
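The matching step can be sketched with the standard library alone. The text does not say which four permutations are compared; this sketch assumes they are the "Firstname Lastname" and "Lastname, Firstname" forms of both the candidate and the stored name (two by two). The function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional


def name_forms(name: str) -> set:
    """Lower-cased name in both 'first last' and 'last, first' order."""
    name = " ".join(name.split()).lower()
    if "," in name:
        last, first = (part.strip() for part in name.split(",", 1))
    else:
        first, _, last = name.rpartition(" ")
    if not first or not last:
        return {name}  # single-word names have only one form
    return {f"{first} {last}", f"{last}, {first}"}


def similarity(a: str, b: str) -> float:
    """Best SequenceMatcher ratio over every pairing of name forms."""
    return max(
        SequenceMatcher(None, x, y).ratio()
        for x in name_forms(a)
        for y in name_forms(b)
    )


def find_existing_author(
    candidate: str, stored_names: Iterable[str], threshold: float = 0.8
) -> Optional[str]:
    """Return the closest stored name if it meets the threshold, else None."""
    best = max(stored_names, key=lambda s: similarity(candidate, s), default=None)
    if best is not None and similarity(candidate, best) >= threshold:
        return best
    return None
```

Comparing every stored author on each request is linear in the number of authors, which is workable for a small catalogue but would need an index or a pre-filter at larger scale.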
Synopses and formatting
Author biographies are written in Markdown: the author's name is bold on first mention and for notable awards, book titles are italicised, and paragraphs are separated by blank lines. Synopses follow similar conventions. All Markdown content is rendered through a Jinja filter backed by the Python markdown library (with extensions for line breaks, fenced code, and tables). On grid pages where biographies appear as truncated previews, the Markdown is first rendered to HTML, then stripped of tags, so that formatting tokens do not leak into the plaintext snippet.
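The tag-stripping stage of the preview can be approximated as follows. The sketch starts from HTML that has already been rendered from Markdown and uses only the standard library's html.parser; the function name plaintext_preview, the trailing ellipsis, and the default limit are assumptions.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and discards every tag."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def plaintext_preview(rendered_html: str, limit: int = 100) -> str:
    """Strip tags from rendered HTML, collapse whitespace, and truncate."""
    parser = _TextExtractor()
    parser.feed(rendered_html)
    parser.close()
    text = " ".join("".join(parser.chunks).split())
    if len(text) <= limit:
        return text
    return text[:limit].rstrip() + "…"
```

Rendering first and stripping afterwards means that asterisks and underscores never reach the snippet, which would not be guaranteed by truncating the raw Markdown.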
Review generation and rating profiles
Reviews are generated in a separate API call. Before prompting, the pipeline constructs a deterministic rating profile by seeding a random number generator with the SHA-256 hash of the book's title, synopsis, and publication date. The seeded generator selects one of four rating clusters: low (18% probability, mean 2.2–3.0), mid (34%, mean 3.2–3.9), high (26%, mean 4.0–4.6), or polarised (22%, mean 3.0–3.8 with high spread). It then derives a target mean, standard deviation, skew direction, and a rant count (zero to two extended reviews, each rated either one to two stars or five stars).
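A deterministic profile of this kind can be sketched as below. The cluster probabilities and mean ranges are those given above; the way the three fields are joined before hashing, the function name, and the omission of standard deviation and skew (whose ranges are not stated) are simplifications of this sketch.

```python
import hashlib
import random

# (cluster name, probability, lowest mean, highest mean)
CLUSTERS = [
    ("low", 0.18, 2.2, 3.0),
    ("mid", 0.34, 3.2, 3.9),
    ("high", 0.26, 4.0, 4.6),
    ("polarised", 0.22, 3.0, 3.8),
]


def rating_profile(title: str, synopsis: str, publication_date: str) -> dict:
    """Derive a repeatable rating profile from the book's own metadata."""
    # Assumed encoding: fields joined with a unit separator, then UTF-8.
    payload = "\x1f".join([title, synopsis, publication_date]).encode("utf-8")
    seed = int(hashlib.sha256(payload).hexdigest(), 16)
    rng = random.Random(seed)

    name, _, low, high = rng.choices(
        CLUSTERS, weights=[cluster[1] for cluster in CLUSTERS]
    )[0]
    return {
        "cluster": name,
        "target_mean": round(rng.uniform(low, high), 2),
        "rant_count": rng.randint(0, 2),
    }
```

Because the seed depends only on the book's metadata, regenerating reviews for the same book would reproduce the same target profile in this sketch.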
The model is asked to produce four to eight reviews, each assigned a distinct stylistic palette from a rotating set of eight: crisp capsule (one or two punchy sentences), craft critique (prose, structure, pacing), character study (interiority, dialogue, motives), worldbuilding lens (setting, atmosphere, rules), theme tracer (motifs and subtext), comparative take (comparisons to two non-celebrity authors), sceptic's ledger (bullet-point pros and cons), and librarian angle (audience notes). Each review also rotates through a primary focus — characters, plot, prose, world, themes, or audience — so that no two reviews in a set read the same way. Review dates must fall between the book's publication date and the present day. The prompt bans stock phrases ("page-turner," "unputdownable") and requires diverse, plausible reviewer names.
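One way to assign palettes and focuses is sketched below. The palette and focus names come from the description; the selection logic (a random sample without replacement for palettes, a rotating offset for focuses) and the function name are assumptions.

```python
import random

PALETTES = [
    "crisp capsule", "craft critique", "character study", "worldbuilding lens",
    "theme tracer", "comparative take", "sceptic's ledger", "librarian angle",
]
FOCUSES = ["characters", "plot", "prose", "world", "themes", "audience"]


def assign_styles(n_reviews: int, rng: random.Random) -> list:
    """Give each review a distinct palette and a rotating primary focus."""
    if not 4 <= n_reviews <= len(PALETTES):
        raise ValueError("expected between four and eight reviews")
    palettes = rng.sample(PALETTES, n_reviews)  # sampled without replacement
    start = rng.randrange(len(FOCUSES))
    return [
        (palettes[i], FOCUSES[(start + i) % len(FOCUSES)])
        for i in range(n_reviews)
    ]
```

With more than six reviews a focus necessarily repeats, but each palette and focus pairing remains unique because the palettes never repeat.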
After creation, the pipeline calculates the book's average rating and review count directly from the database and stores them on the book record.
Cover generation
Cover images are generated via OpenAI's gpt-image-1 model at 1024 × 1536 pixels (2:3 portrait aspect). The prompt is tailored to the book's genre through a weighted style-selection system: cookbooks receive an eightfold bias toward photographic covers; graphic novels toward vector/graphic styles; science fiction, fantasy, and horror toward illustration; biography and history toward photography; and romance toward an even mix. The prompt specifies the exact title and author name as cover typography and bans watermarks. Generated images are decoded from base64, converted to optimised progressive JPEG via Pillow (quality 92), and saved as book_{id}.jpg. The pipeline allows two attempts; if both fail, the entire book-generation transaction is rolled back so that no coverless book can enter the database.
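The weighted style selection might be structured as follows. Only the eightfold cookbook bias is stated above; the multiplier of 3.0 used for the other genres, the three style labels, and the function names are placeholders chosen for this sketch.

```python
import random

STYLES = ("photographic", "illustration", "vector")

ASSUMED_BIAS = 3.0  # placeholder: the text gives no figure for these genres

GENRE_BIAS = {
    "cookbooks": {"photographic": 8.0},  # eightfold bias, as described
    "graphic novels": {"vector": ASSUMED_BIAS},
    "science fiction": {"illustration": ASSUMED_BIAS},
    "fantasy": {"illustration": ASSUMED_BIAS},
    "horror": {"illustration": ASSUMED_BIAS},
    "biography": {"photographic": ASSUMED_BIAS},
    "history": {"photographic": ASSUMED_BIAS},
    "romance": {},  # even mix
}


def style_weights(genre: str) -> dict:
    """Weight of each cover style for a genre; unlisted genres get an even mix."""
    bias = GENRE_BIAS.get(genre.lower(), {})
    return {style: bias.get(style, 1.0) for style in STYLES}


def pick_cover_style(genre: str, rng: random.Random) -> str:
    """Draw one style according to the genre's weights."""
    weights = style_weights(genre)
    return rng.choices(list(weights), weights=list(weights.values()))[0]
```

Under these weights a cookbook would receive a photographic cover in eight out of ten draws on average, while still occasionally getting an illustrated or vector one.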
Author-photo generation
After a successful cover, the pipeline checks whether the author already has a portrait. If not, it sends a prompt to gpt-image-1 for a photorealistic head-and-shoulders portrait at 1024 × 1024 pixels. The result is centre-cropped to a 512 × 512 square JPEG and saved as author_{id}.jpg. Unlike cover generation, photo generation is non-blocking: failures are logged but do not abort the job. In the interface, authors without photos receive a CSS-generated avatar showing their initials.
Error handling and transaction safety
If any exception occurs during book creation, review insertion, or image generation, the pipeline issues an immediate db.session.rollback() to discard all uncommitted data. The GenerationJob and GenerationLog records — committed in an earlier transaction — survive, so the failure is auditable. The job is marked as failed with a descriptive error message. In templates, all cover <img> tags include an onerror handler that replaces a broken image with a CSS gradient placeholder bearing the book's title, as a defence against stale database references to missing files.
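The all-or-nothing behaviour can be demonstrated with the standard library's sqlite3 module standing in for the application's SQLAlchemy session. The table layout, the create_book and CoverError names, and the make_cover callback are hypothetical; the two-attempt limit follows the cover-generation rule described earlier.

```python
import sqlite3


class CoverError(RuntimeError):
    """Raised when a cover image cannot be produced."""


def create_book(conn: sqlite3.Connection, title: str, make_cover, attempts: int = 2) -> int:
    """Insert a book and its cover path atomically; return the new id.

    make_cover(book_id) returns a file path or raises CoverError.
    """
    try:
        cur = conn.execute("INSERT INTO book (title) VALUES (?)", (title,))
        book_id = cur.lastrowid

        last_error = None
        for _ in range(attempts):
            try:
                cover_path = make_cover(book_id)
                break
            except CoverError as exc:
                last_error = exc
        else:
            # Every attempt failed: abort so no coverless book is stored.
            raise CoverError(f"cover failed after {attempts} attempts") from last_error

        conn.execute(
            "UPDATE book SET cover_image = ? WHERE id = ?", (cover_path, book_id)
        )
        conn.commit()
        return book_id
    except Exception:
        conn.rollback()  # discard the uncommitted INSERT
        raise
```

An audit record written and committed before create_book is called would survive the rollback, which mirrors how the job and log records remain available after a failure.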
Daily batch generation
A standalone script, fast_generator.py, generates twelve book seeds (title, author, genre) in a single GPT-5 API call lasting roughly twelve seconds. The batch pipeline incorporates several layers of anti-repetition machinery.
Goodreads title inspiration
Before prompting, the script scrapes two to three randomly chosen genre pages on Goodreads (from a pool of eight: historical fiction, history, fantasy, science fiction, romance, cookbooks, general fiction, and horror). It extracts book titles from div.bookBox img[alt] elements, cleans them of series numbers, author suffixes, and emoji, and presents a sample as "inspiration titles — for tone and flavour only, not to be reproduced." If Goodreads is unreachable, the script falls back to a curated JSON file of titles harvested from the Internet Archive and Open Library.
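The cleaning step might resemble the sketch below. The regular expressions are guesses at what "series numbers, author suffixes, and emoji" look like in scraped alt text (for example a parenthesised "(Series, #2)" and a trailing "by Author Name"); they are not the script's actual patterns.

```python
import re

# Parenthesised series marker such as "(The Kingkiller Chronicle, #1)".
_SERIES = re.compile(r"\s*\([^()]*#\s*\d+[^()]*\)")
# Trailing "by Author Name" suffix (assumed format).
_BY_AUTHOR = re.compile(r"\s+by\s+[^()]+$", re.IGNORECASE)
# Common emoji and pictograph ranges plus the variation selector.
_EMOJI = re.compile("[\U0001F000-\U0001FAFF\u2600-\u27BF\uFE0F]")


def clean_title(raw: str) -> str:
    """Strip emoji, series markers, and author suffixes from a scraped title."""
    title = _EMOJI.sub("", raw)
    title = _SERIES.sub("", title)
    title = _BY_AUTHOR.sub("", title)
    return " ".join(title.split())
```

The author-suffix rule is deliberately crude and would also truncate a genuine title containing " by ", so a production version would need a stricter pattern or a known-author check.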
Anti-sameness system
To prevent the model from gravitating toward a narrow band of title shapes, the script enforces structural diversity through five mechanisms:
- Title-class rotation. Ten structural classes — concrete object, place or institution, named person, event or incident, documentary phrase, odd juxtaposition, fragment or question, subtitle-led, idiomatic phrase, and one-word punch — are allocated evenly across the batch so that each slot has a designated shape.
- Avoidance guidance. The script analyses a rolling memory of up to 2,000 previously generated titles (persisted in a JSON file) plus all titles already in the database. It identifies the twenty most overused content words, the most common title frames (e.g., "The Last ___," "Beyond the ___"), the most frequent opening words, and the most frequent closing nouns, and encodes these as explicit "do not use" directives in the prompt.
- Batch validator. After generation, each title in the batch is scored for repetitiveness: collisions on first word, end noun, or structural frame within the batch incur penalties of 0.5–1.5 points; matches against historical overuse patterns add further penalties; and two-word titles composed entirely of generic mood words incur a 2.0-point penalty. Titles whose cumulative penalty exceeds 3.0 are rejected and regenerated.
- Title memory. Accepted titles are appended to the rolling memory file (capped at 2,000 entries, oldest dropped first) and checked against the database to prevent exact duplicates.
- Database deduplication. Each final title is compared case-insensitively to every existing book in the database; duplicates are silently skipped.
The output is written to daily_books.json, a flat array of title/author/genre objects with generation timestamps.
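The batch validator from the list above can be sketched as a scoring function. The 2.0 penalty and the 3.0 rejection limit are as described; the individual collision penalties (0.5, 1.0, 1.5), the 0.5 per overused word, the mood-word list, and the definition of a frame as "everything but the last word" are assumptions of this sketch.

```python
import re
from collections import Counter

ARTICLES = {"the", "a", "an"}
# Hypothetical sample of "generic mood words".
GENERIC_MOOD_WORDS = {"silent", "broken", "shadows", "echoes", "whispers", "darkness"}


def tokens(title: str) -> list:
    return re.findall(r"[a-z0-9']+", title.lower())


def first_word(title: str) -> str:
    """First word, skipping a leading article."""
    words = tokens(title)
    if len(words) > 1 and words[0] in ARTICLES:
        words = words[1:]
    return words[0] if words else ""


def end_word(title: str) -> str:
    words = tokens(title)
    return words[-1] if words else ""


def frame(title: str) -> str:
    """Structural frame such as 'the last ___'; empty for short titles."""
    words = tokens(title)
    return " ".join(words[:-1]) + " ___" if len(words) >= 3 else ""


def score_batch(titles, overused_words=frozenset()) -> dict:
    """Map each title to its cumulative repetitiveness penalty."""
    firsts = Counter(first_word(t) for t in titles)
    ends = Counter(end_word(t) for t in titles)
    frames = Counter(frame(t) for t in titles if frame(t))
    scores = {}
    for title in titles:
        words = tokens(title)
        penalty = 0.0
        if firsts[first_word(title)] > 1:
            penalty += 0.5  # shared opening word (assumed value)
        if ends[end_word(title)] > 1:
            penalty += 1.0  # shared closing word (assumed value)
        if frame(title) and frames[frame(title)] > 1:
            penalty += 1.5  # shared structural frame (assumed value)
        penalty += 0.5 * sum(word in overused_words for word in words)
        if len(words) == 2 and all(w in GENERIC_MOOD_WORDS for w in words):
            penalty += 2.0
        scores[title] = penalty
    return scores


def rejected(titles, overused_words=frozenset(), limit: float = 3.0) -> list:
    """Titles whose penalty exceeds the limit and should be regenerated."""
    scores = score_batch(titles, overused_words)
    return [title for title in titles if scores[title] > limit]
```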
Hourly processing
A companion script, simple_hourly_processor.py, is run by cron and generates at most one book per run. It selects one book from the daily JSON file, indexed by a hash of the current date and hour, and triggers its full generation (metadata, reviews, cover, author photo) by calling the application's /cron/hourly_book_full endpoint. If all twelve daily seeds have already been processed, the run is skipped. In production, a typical crontab runs the batch generator daily at midnight and the hourly processor every other hour, yielding up to twelve new books per day.
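Seed selection by date and hour could work along these lines. The hash function, the key format, and the choice to index only into the seeds not yet processed are assumptions; the description states only that the index is derived from a hash of the current date and hour.

```python
import hashlib
from datetime import datetime
from typing import Optional


def seed_index(now: datetime, n_seeds: int) -> int:
    """Stable index in [0, n_seeds) for the given date and hour."""
    key = now.strftime("%Y-%m-%d-%H")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_seeds


def pick_seed(seeds: list, processed_titles: set, now: datetime) -> Optional[dict]:
    """Choose one unprocessed seed for this hour, or None to skip the run."""
    pending = [seed for seed in seeds if seed["title"] not in processed_titles]
    if not pending:
        return None
    return pending[seed_index(now, len(pending))]
```

Deriving the index from the clock rather than from a counter means a run that is retried within the same hour picks the same seed, provided the set of processed titles has not changed.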
Cron security
Cron endpoints are protected by a localhost check: the request's client address must be 127.0.0.1, ::1, or localhost. For deployments where the cron job calls the public URL rather than the loopback address, two overrides are available: an X-API-Key header (or api_key query parameter) checked against an environment variable, or a global DISABLE_CRON_LOCALHOST_CHECK flag.
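The access rules can be expressed as a single predicate. The environment variable name CRON_API_KEY, the accepted truthy values for the flag, and the function signature are assumptions; the header, query parameter, and flag names are those given above.

```python
import hmac
import os
from typing import Mapping, Optional

LOOPBACK = {"127.0.0.1", "::1", "localhost"}


def cron_request_allowed(
    remote_addr: Optional[str],
    headers: Mapping[str, str],
    query: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Decide whether a request may reach a cron endpoint."""
    env = os.environ if env is None else env

    # Global override: skip the localhost check entirely.
    if env.get("DISABLE_CRON_LOCALHOST_CHECK", "").lower() in {"1", "true", "yes"}:
        return True
    if remote_addr in LOOPBACK:
        return True

    # Otherwise require a key matching the configured secret.
    expected = env.get("CRON_API_KEY", "")  # hypothetical variable name
    supplied = headers.get("X-API-Key") or query.get("api_key") or ""
    return bool(expected) and hmac.compare_digest(
        supplied.encode("utf-8"), expected.encode("utf-8")
    )
```

A key passed as a query parameter tends to end up in access logs, so the header form is generally the safer of the two; behind a reverse proxy such as Nginx the client address seen by the application also depends on how forwarded headers are handled.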
Public interface
Home page
The home page displays a hero section with the title "Possible Books," a rotating tagline (changed every thirty minutes, seeded by the current UTC half-hour), and a generation form with an autocomplete datalist of existing authors. Below the form, a grid of the twelve most recent books (with cover images or gradient fallbacks) appears alongside the twenty-five most recent reviews, sorted by review date.
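A tagline that changes every thirty minutes without any stored state can be derived from the clock, as in this sketch. The function names and the use of random.Random for the choice are assumptions; the taglines themselves would be supplied by the caller.

```python
import random
from datetime import datetime


def half_hour_slot(now: datetime) -> int:
    """Number of whole 30-minute periods since the Unix epoch.

    Expects a timezone-aware datetime, for example datetime.now(timezone.utc).
    """
    return int(now.timestamp() // 1800)


def current_tagline(now: datetime, taglines: list) -> str:
    """Pick a tagline that stays fixed within each UTC half-hour."""
    return random.Random(half_hour_slot(now)).choice(taglines)
```

Every visitor and every worker process computes the same tagline during a given half-hour, so nothing needs to be cached or written to the database.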
Book pages
Each book page shows the cover image (or fallback), title, author byline linked to the author's page, genre tag, page count, publication date, and average star rating. The synopsis and author biography are rendered from Markdown. Below, reviews are listed with reviewer name, star display (filled and hollow star characters), review date, and Markdown-formatted review text. Administrators see additional controls: edit book, regenerate cover, delete book, regenerate reviews, and recalculate rating.
Author pages
The author page displays the portrait (or a CSS initials placeholder), the full Markdown biography, and a grid of the author's books. The authors index lists all authors sorted alphabetically by surname, with a plaintext preview of each biography (Markdown rendered, tags stripped, truncated to 100 characters).
Archive
The archive page presents a paginated, filterable grid of all books. A sidebar offers filters by publication-year range (dynamically bucketed), minimum average rating (one to five stars), and genre (checkboxes, grouped by popularity). Filters are applied via form auto-submission on change; an explicit "clear filters" link resets the view.
Genres
The genres page lists all active genres alphabetically with book counts.
Theme
The site defaults to a dark colour scheme (#111315 background, #e6e6e6 text) and supports a light mode toggled via a moon-icon button in the masthead, persisted in localStorage. The toggle respects the operating system's prefers-color-scheme setting as a default. Typography uses Roboto Serif for display headings and Roboto for body text and UI elements.
Progressive web application
The site ships a Web App Manifest declaring standalone display mode, dark background and theme colours, and maskable icons at 192 and 512 pixels. A service worker registers at the root scope but operates in passthrough mode — all fetch events are returned unmodified — to satisfy PWA installability requirements without risking stale cached pages in a database-backed application that publishes new content hourly.
References
- [1] requirements.txt in the project repository lists Flask 3.0.3, Flask-SQLAlchemy 3.1.1, Flask-Login 0.6.3, and Flask-Caching 2.3.0.