Drop fuzzy difflib scoring: MusicBrainz resolves track->album release-group MBID, Lidarr album/lookup?term=mbid:<id> returns the exact album. Live-verified against the user's Lidarr. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Smarter Lidarr Matching — Design
Date: 2026-06-08
Status: Approved
Context & Goal
Live testing of the REST API exposed a real weakness: musicfetch's
lidarr_search trusts Lidarr's universal /api/v1/search ordering, which is
fuzzy and unranked. A query of Daft Punk - Discovery ranked a novelty remix
("Daft Punk's Discovery but it's in the SM64 Soundfont" by Pignickel) #1, and
the real Discovery by Daft Punk wasn't even top-5. The interactive CLI picker
lets a human work around this; the API's noninteractive top-pick cannot and
grabs garbage.
The real input shape is Shazam-style Artist - Track. Lidarr only grabs
albums, never single tracks, so we must resolve a track to the album that
contains it, then pick the best-matching Lidarr album.
Goal: make lidarr_search return a best-first list of Lidarr
hits so the noninteractive API picks the correct album, and the CLI picker shows
good matches first. Resolve Artist - Track → album via MusicBrainz.
Decisions (confirmed with user)
- Fix in the shared musicfetch.lidarr_search (not an API-only layer): both the
  CLI picker and the API noninteractive pick benefit; no duplicated logic.
  Signature unchanged: lidarr_search(query, limit) -> list[Hit] (drop-in).
- Resolve track → album via MusicBrainz (the same upstream Lidarr uses).
  Lidarr's own track indexing is too weak. One extra HTTP call, no API key.
- Track-first semantics (Artist - Track): the right side is treated as a track
  to resolve to its album. (YouTube path already handles exact tracks; this
  makes Lidarr the accurate album/discography source.)
- No fuzzy scoring. Live-verified that Lidarr's album/lookup accepts a direct
  MusicBrainz id: term=mbid:<release-group-mbid> (also term=lidarr:<mbid>)
  returns exactly one album. So we resolve the album's MBID via MusicBrainz and
  ask Lidarr for that exact MBID: no difflib, no ranking heuristics. The only
  selection is deterministic release-group type-filtering inside the
  MusicBrainz step (prefer studio Album over single/comp/live).
- YouTube fallback preserved exactly as today (see below).
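The exact-MBID lookup from the last decision can be sketched as a URL builder. This is a minimal illustration, not the tool's actual code: the helper name album_lookup_url and the localhost base URL are hypothetical, and authentication (the Lidarr API key) is deliberately left out.

```python
from urllib.parse import urlencode


def album_lookup_url(base_url: str, rg_mbid: str) -> str:
    """Build the album/lookup URL for one MusicBrainz release-group MBID.

    Hypothetical helper: base_url is a placeholder for the Lidarr instance.
    """
    # urlencode percent-encodes the ":" in "mbid:<id>"; the server decodes it.
    query = urlencode({"term": f"mbid:{rg_mbid}"})
    return f"{base_url.rstrip('/')}/api/v1/album/lookup?{query}"


# The MBID is the one quoted in the manual live check of this document.
url = album_lookup_url(
    "http://localhost:8686", "48117b90-a16e-34ca-a514-19c702df1158"
)
print(url)
```

Because the term carries an identifier rather than free text, the response needs no ranking on the client side: it is either the one matching album or nothing.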
Architecture
All changes live in the musicfetch binary (single file). New/changed units:
musicfetch
├── _split_query(query) -> (left, right|None) # split on first " - "
├── musicbrainz_best_album(artist, track) -> dict|None
│ # MB recording search -> best release-group {album_title, artist, year, rg_mbid}
├── _lidarr_album_candidates(term) / _lidarr_artist_candidates(term) -> list[Hit]
├── _universal_search(query, limit) -> list[Hit] # /api/v1/search last resort
└── lidarr_search(query, limit) -> list[Hit] # REWRITTEN: MBID-exact + fallbacks
Data flow
- Artist - Track query:
  a. musicbrainz_best_album(artist, track) → {album_title, artist, year, rg_mbid}.
  b. Lidarr GET /api/v1/album/lookup?term=mbid:<rg_mbid> → 0 or 1 exact album → Hit.
  c. Enrich Hit.year from MB when the Lidarr hit lacks one. Return it.
- Single-term query (no " - "): _fallback_lookup, an artist-first concatenation
  of /artist/lookup + /album/lookup for the raw term (a bare term is most often
  an artist). No scoring; the interactive picker / noninteractive top-pick
  consume the order.
- Fallbacks (never regress): if MusicBrainz misses or the exact MBID lookup
  returns nothing, use _fallback_lookup(query) (album-first there, since a dash
  query named an album/track). If /album/lookup and /artist/lookup both yield
  nothing, fall back to the existing /api/v1/search. lidarr_search returns []
  only when everything fails or the key is missing.
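The tiers above can be sketched as a single function with the network steps injected as callables. Everything here is illustrative: lidarr_search_sketch, the dict-based Hit stand-in, and the album_first flag passed to the fallback are assumptions, and the API-key check and error handling are omitted for brevity.

```python
from typing import Callable, Dict, List, Optional

Hit = Dict[str, object]  # stand-in for the real Hit type (assumption)


def lidarr_search_sketch(
    query: str,
    limit: int,
    *,
    resolve_album: Callable[[str, str], Optional[dict]],  # MusicBrainz step
    lookup_mbid: Callable[[str], List[Hit]],              # album/lookup?term=mbid:
    fallback_lookup: Callable[[str, bool], List[Hit]],    # artist/album lookup
    universal_search: Callable[[str, int], List[Hit]],    # /api/v1/search
) -> List[Hit]:
    left, sep, right = query.partition(" - ")
    is_dash_query = bool(sep and left.strip() and right.strip())

    # Tier 1: Artist - Track -> release-group MBID -> exact Lidarr album.
    if is_dash_query:
        album = resolve_album(left.strip(), right.strip())
        if album:
            hits = lookup_mbid(album["rg_mbid"])
            if hits:
                hit = dict(hits[0])
                if not hit.get("year"):  # enrich year from MB when missing
                    hit["year"] = album.get("year")
                return [hit]

    # Tier 2: plain lookups; album-first for dash queries, artist-first otherwise.
    hits = fallback_lookup(query, is_dash_query)
    if hits:
        return hits[:limit]

    # Tier 3: the existing universal search as a last resort.
    return universal_search(query, limit)[:limit]


# Usage with stubbed tiers (no network): the exact hit wins and gains a year.
result = lidarr_search_sketch(
    "Some Artist - Some Track",
    5,
    resolve_album=lambda a, t: {"album_title": "Some Album", "artist": a,
                                "year": 1999, "rg_mbid": "rg-1"},
    lookup_mbid=lambda mbid: [{"title": "Some Album", "year": None}],
    fallback_lookup=lambda q, album_first: [],
    universal_search=lambda q, n: [],
)
print(result)
```

Injecting the tiers keeps the ordering logic testable without a live Lidarr or MusicBrainz, which matches the mock-based unit tests described later.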
MusicBrainz client details
- Endpoint: https://musicbrainz.org/ws/2/recording?query=<lucene>&fmt=json&limit=10
  where lucene = artist:"<artist>" AND recording:"<track>".
- Headers: User-Agent: musicfetch/2.0 (https://github.com/…) (MB requires a
  descriptive UA). Timeout ~8s. Rate-limit: at most ~1 request/sec (a
  process-level min-interval guard; this tool makes one call per fetch so it's
  effectively a courtesy delay).
- Release-group selection from the returned recordings' releases: prefer
  primary-type == "Album" with no secondary-types (excludes Compilation, Live,
  Single, Soundtrack); among those choose the earliest first-release-date.
  Fall back to any release-group if none qualify. Return
  {album_title, artist, year, rg_mbid} or None.
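The release-group selection rule could look roughly like the sketch below. The function name pick_release_group is hypothetical, and the JSON layout it reads (recordings → releases → release-group, with artist-credit names) is an assumption modelled on the field names used in this section rather than a verified copy of the MusicBrainz response. Two further assumptions: the release's own date is used when first-release-date is absent, and the "any release-group" fallback also takes the earliest one.

```python
from typing import Optional


def pick_release_group(mb_json: object) -> Optional[dict]:
    """Pick the earliest studio Album; fall back to any release-group."""
    if not isinstance(mb_json, dict):
        return None  # empty or garbled payload counts as no match
    candidates = []
    for rec in mb_json.get("recordings") or []:
        credits = rec.get("artist-credit") or []
        artist = credits[0].get("name") if credits else None
        for rel in rec.get("releases") or []:
            rg = rel.get("release-group") or {}
            if not rg.get("id"):
                continue
            # Assumption: use the release date if the group has no date field.
            date = rg.get("first-release-date") or rel.get("date") or ""
            candidates.append({
                "album_title": rg.get("title") or rel.get("title"),
                "artist": artist,
                "year": int(date[:4]) if date[:4].isdigit() else None,
                "rg_mbid": rg["id"],
                # Studio album: primary type Album and no secondary types.
                "_studio": rg.get("primary-type") == "Album"
                and not rg.get("secondary-types"),
                # ISO dates sort lexicographically; undated entries sort last.
                "_date": date or "9999",
            })
    if not candidates:
        return None
    studio = [c for c in candidates if c["_studio"]]
    best = min(studio or candidates, key=lambda c: c["_date"])
    # Drop the private sort keys before returning.
    return {k: v for k, v in best.items() if not k.startswith("_")}
```

The selection is fully deterministic: the same MusicBrainz payload always yields the same release-group, which is what lets the Lidarr step be an exact lookup instead of a ranking problem.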
YouTube Fallback (unchanged, documented)
This feature does not alter fallback behavior:
- source=auto (default): build_combined_hits includes YouTube hits. If Lidarr
  times out or returns no results, lidarr_search returns [] and the top YouTube
  hit is picked. If a Lidarr album is picked but has no indexer release,
  actions.perform_fetch falls through to the top YouTube hit.
- source=lidarr: lidarr-only by design, with no YouTube fallback (the explicit
  "force Lidarr" switch). Unchanged.
Error Handling
- All MB and Lidarr HTTP calls are wrapped; exceptions/timeouts are caught and
  degrade to the next fallback tier. lidarr_search never raises.
- Empty/garbled MB JSON → treated as no match.
- Existing DEBUG logging extended to show MB query and chosen release-group.
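One way to get the "degrade to the next tier, never raise" behavior is a small wrapper around each tier. This is a sketch under assumptions: try_tier, first_nonempty, and the "musicfetch" logger name are hypothetical and not taken from the existing code.

```python
import logging
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")
log = logging.getLogger("musicfetch")  # logger name is an assumption


def try_tier(name: str, fn: Callable[[], T]) -> Optional[T]:
    """Run one lookup tier; log at DEBUG and return None on any failure."""
    try:
        return fn()
    except Exception as exc:  # deliberately broad: a failed tier must not propagate
        log.debug("%s failed: %s", name, exc)
        return None


def first_nonempty(tiers: Iterable[Tuple[str, Callable[[], list]]]) -> List:
    """Return the first non-empty tier result, or [] when every tier fails."""
    for name, fn in tiers:
        result = try_tier(name, fn)
        if result:
            return result
    return []


def broken_tier() -> list:
    raise TimeoutError("simulated timeout")


# The failing tier is skipped and the next one supplies the result.
print(first_nonempty([("musicbrainz", broken_tier), ("fallback", lambda: ["hit"])]))
```

Catching at the tier boundary keeps each HTTP helper simple: it can raise freely, and the caller decides that a failure simply means "try the next source".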
Testing
Unit tests (mock requests, no live network):
- musicbrainz_best_album: from canned MB JSON, picks studio Album over a single
  and a compilation; picks earliest among Albums; falls back to any
  release-group when no studio album exists; returns None on empty/exception.
- _split_query: "A - B" → ("A","B"); no dash → ("A", None); splits only on the
  first " - ".
- lidarr_search: Artist - Track resolves via MB then does an mbid: exact lookup
  returning the real album (year enriched from MB); MB miss → fallback lookup;
  fallback empty → /api/v1/search; no key → [].
Manual live check (end of implementation): with the API pointed at the user's
Lidarr (10.2.1.16:8686), lidarr_search("Daft Punk - Harder Better Faster Stronger") resolves to Discovery by Daft Punk (the exact MBID
48117b90-a16e-34ca-a514-19c702df1158), not a single/compilation/novelty.
Out of Scope (YAGNI)
Caching MB responses, multi-track/album disambiguation UI, fuzzy similarity
scoring (eliminated by the mbid: exact lookup), MB cover-art lookup.