What we set out to build
Problem
A launcher can find games long before it can trust them. Scanners return messy offerings: duplicate entries, renamed folders, launcher-specific IDs, and “almost-a-game” junk directories. If we don’t define identity early, everything downstream becomes haunted—IGDB binds to the wrong title, play sessions attach to the wrong record, and the Backlog Board becomes a mirage.
ARCH-NOTES
Promise
The Codex must be local-first and stable: once a game is known, it should remain the same record even if the scan changes its shape. We want a truth we can upsert into SQLite every time without drifting.
DB-SCHEMA
What shipped today
✅ Declared the core “game record” as a durable SQLite row in games, anchored by id as the primary key and reinforced with identity/context fields like launcher_type, steam_app_id, executable_path, and install_dir.
DB-SCHEMA
✅ Codified display vs. identity names:display_name is what the player sees; normalized_name is what the system uses to match, dedupe, and reason about the same title across scans.
DB-SCHEMA
✅ Strengthened the scan → normalize → upsert contract so rescans don’t create phantom duplicates. The scanner’s job is to return DetectedGame candidates; the database’s job is to merge them into the Codex safely.
ARCH-NOTES
✅ Kept “metadata” separate from “identity” by continuing the 1:1 relationship: base identity in games, enrichment in game_igdb_metadata. The record exists even when IGDB data is missing or sealed.
DB-SCHEMA
Behind the curtain
The choice: identity is not a single field—it’s a bundle of truth
In a perfect world, every launcher hands us a single universal identifier. In the real world, we have partial sigils:
- Steam can give us
steam_app_id(strong identity when present). - Standalone installs often only give us paths (
executable_path,install_dir) and a name guess. - Epic and others land somewhere in between, depending on what local artifacts we can legally read.
So the decision was: don’t pretend identity is one magic string. We model identity as:
- A stable DB
id(what the Codex truly trusts). - Launcher context (
launcher_type, optionalsteam_app_id). - Launch truth (
executable_path,launch_args). - Install truth (
install_dir,is_installed,install_source). - Time truth (
last_detected_at,updated_at).
DB-SCHEMA
That bundle gives us resilience. If the display name changes (or gets corrected), the record remains the same. If the scanner gets smarter later (it will), we can update the row without rewriting history.
The cut: we did not invent a universal hash (yet)
It’s tempting to create a “super fingerprint” hash across paths + names + launcher hints. But that can backfire:
- Paths change (moved libraries, reinstalls).
- Names vary (localized titles, punctuation, edition suffixes).
- Hashes become brittle, and brittle identity creates duplicates.
So we’re postponing any “perfect fingerprint” until we’ve seen enough real libraries to justify it. For now, the Codex stays humble: stable id, careful upserts, and explicit fields that we can reason about.
ARCH-NOTES
Rough edges / dragons
⚠️ Dedupe is a living beast. If a user has the same game installed via multiple sources (Steam + standalone, etc.), we must decide whether that’s one Codex entry or two. The schema can represent either, but the product decision needs a clear rule (and probably a UI affordance).
DB-SCHEMA
⚠️ Normalization drift. normalized_name is powerful, but if we change normalization rules later, we need to ensure old records don’t become harder to match. This argues for keeping normalization logic stable, versioned, or at least carefully migrated.
DB-SCHEMA
⚠️ Board ordering depends on consistent identity. The Backlog Board’s sort_index only stays meaningful if a game remains the same record across rescans. If identity splits, ordering “teleports.” That’s why we’re treating the game record definition as foundational, not cosmetic.
BACKLOG-BOARD-SPEC
Next on the path
🔜 Scanner cleanup pass: keep reducing false positives and junk “games,” so identity pressure stays low and the Codex remains clean.
TASK-LIST
🔜 Clarify multi-source rules: decide what happens when the same title appears from multiple launchers—merge, separate, or “linked variants.”
🔜 Chronicle polish: once identity is steady, the Chronicle experience can become richer without fear of attaching lore to the wrong entry.
TASK-LIST
🔜 IGDB validation sweep: identity and matching are inseparable—verify that our record model consistently binds to correct IGDB metadata across a representative library.
TASK-LIST
— Filed into the Chronicle.
