# Smarter Lidarr Matching Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Make `musicfetch.lidarr_search` resolve a Shazam-style `Artist - Track` to the correct album by asking MusicBrainz for the studio album's release-group MBID, then doing an **exact** Lidarr lookup (`album/lookup?term=mbid:`), so the noninteractive API picks the real album (Daft Punk *Discovery*) instead of junk (Pignickel novelty), with **no fuzzy ranking system**.

**Architecture:** All changes are in the single-file `musicfetch` binary (the shared search used by both the CLI picker and the REST API). New helpers `_split_query` and `musicbrainz_best_album`, plus a rewritten `lidarr_search` with small lookup helpers and tiered fallbacks. Tests import the binary as a module via the existing `server.mf` loader (which registers it in `sys.modules` as `musicfetch_core`).

**Tech Stack:** Python 3.10+, stdlib `time`, `requests` (already a dep), pytest with `monkeypatch`. No new dependencies. Live-validated against MusicBrainz + the user's Lidarr 3.1.0: `album/lookup?term=mbid:48117b90-a16e-34ca-a514-19c702df1158` returns exactly `Discovery — Daft Punk`.

---

## Context for the implementer

`musicfetch` is an executable Python file (no `.py` ext) at the repo root. Relevant existing pieces:

- `Hit` dataclass: fields `source, kind, title, artist, album, year, thumbnail, payload`.
- `_album_to_hit(album)` → `Hit(source="lidarr", kind="album", ..., payload={"album": album})`. The raw Lidarr album dict carries `foreignAlbumId` (MusicBrainz release-group MBID) and `releaseDate`.
- `_artist_to_hit(artist)` → `Hit(source="lidarr", kind="artist", ...)`.
- `lidarr_get(path, params=None, timeout=15)` → GET helper, raises on HTTP error.
- `API_KEY`, `dbg(...)`, `err(...)`, module-level `requests`, `from requests.exceptions import RequestException, Timeout`.
- Current `lidarr_search(query, limit)` at lines ~129-162 trusts `/api/v1/search` ordering then falls back to `/album/lookup` + `/artist/lookup`. **This is what we replace.**

**Why MusicBrainz is still required:** Lidarr has no track-search endpoint; `album/lookup` only matches albums/artists. Shazam gives `Artist - Track`, and the track name won't match the album title in Lidarr. MusicBrainz recording search maps track → album, and gives us the release-group MBID that Lidarr's `mbid:` lookup resolves exactly. No scoring needed.

**Don't break callers:** `lidarr_search(query, limit) -> list[Hit]` signature stays identical. `build_combined_hits` and the API depend on it returning `[]` on failure (so the YouTube fallback works).

**Tests access the binary like this** (top of each new test module):

```python
import server.mf  # noqa: F401 - loads musicfetch and registers musicfetch_core in sys.modules
import musicfetch_core as mf
```

Set `mf.API_KEY` via `monkeypatch.setattr(mf, "API_KEY", "testkey")` where needed.

**One import to add** to the top imports block of `musicfetch` (Task 2): `import time`.
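
For orientation, here is a minimal sketch of how an extensionless executable can be loaded as a module and registered in `sys.modules`, which is the behaviour the plan attributes to `server.mf`. The real loader is not shown in this plan, so the function name `load_extensionless`, the demo file, and the module name `musicfetch_core_demo` are all hypothetical stand-ins.

```python
import importlib.machinery
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_extensionless(path: str, module_name: str):
    """Load a Python source file without a .py suffix and register it in sys.modules.

    Hypothetical helper; an assumption about what a loader like server.mf might do.
    """
    loader = importlib.machinery.SourceFileLoader(module_name, path)
    spec = importlib.util.spec_from_loader(module_name, loader)
    module = importlib.util.module_from_spec(spec)
    # Register before executing so a later `import module_name` finds it.
    sys.modules[module_name] = module
    loader.exec_module(module)
    return module


# Demo with a throwaway file standing in for the real `musicfetch` binary.
with tempfile.TemporaryDirectory() as tmp:
    fake_binary = Path(tmp) / "musicfetch_demo"
    fake_binary.write_text('API_KEY = ""\n\n\ndef ping():\n    return "pong"\n')
    demo = load_extensionless(str(fake_binary), "musicfetch_core_demo")

# Once registered, the module is importable by name like any other.
import musicfetch_core_demo as mf_demo  # noqa: E402
```

Because the tests patch attributes on the registered module object (`mf.API_KEY`, `mf.lidarr_get`), each test module has to import the same `sys.modules` entry rather than load the file a second time.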
---

### Task 1: Query splitter `_split_query`

**Files:**
- Modify: `musicfetch` (add `_split_query` just above `lidarr_search`)
- Test: `tests/test_lidarr_match.py`

- [ ] **Step 1: Write the failing test**

Create `tests/test_lidarr_match.py`:

```python
import server.mf  # noqa: F401 - loads musicfetch, registers musicfetch_core in sys.modules
import musicfetch_core as mf


def test_split_query_with_dash():
    assert mf._split_query("Daft Punk - Discovery") == ("Daft Punk", "Discovery")


def test_split_query_no_dash():
    assert mf._split_query("Daft Punk") == ("Daft Punk", None)


def test_split_query_splits_on_first_dash_only():
    assert mf._split_query("A - B - C") == ("A", "B - C")


def test_split_query_strips_whitespace():
    assert mf._split_query(" Daft Punk - Discovery ") == ("Daft Punk", "Discovery")
```

- [ ] **Step 2: Run test to verify it fails**

Run: `pytest tests/test_lidarr_match.py -v`
Expected: FAIL with `AttributeError: module 'musicfetch_core' has no attribute '_split_query'`

- [ ] **Step 3: Add the implementation**

In `musicfetch`, immediately above `def lidarr_search(` (this assumes `Optional` is already imported from `typing`; add that import if it is not):

```python
def _split_query(query: str) -> tuple[str, Optional[str]]:
    """Split a Shazam-style 'Artist - Track' on the first ' - '.
    Returns (artist, track) or (term, None) when there is no separator."""
    if " - " in query:
        left, right = query.split(" - ", 1)
        return left.strip(), right.strip()
    return query.strip(), None
```

- [ ] **Step 4: Run test to verify it passes**

Run: `pytest tests/test_lidarr_match.py -v`
Expected: PASS (4 passed)

- [ ] **Step 5: Commit**

```bash
git add musicfetch tests/test_lidarr_match.py
git commit -m "feat(lidarr): add Artist - Track query splitter"
```

---

### Task 2: MusicBrainz track→album resolver

**Files:**
- Modify: `musicfetch` (add `import time` to top imports; add MB constants + `_mb_rate_limit`, `_mb_artist_credit`, `musicbrainz_best_album` above `lidarr_search`)
- Test: `tests/test_musicbrainz.py`

The release-group selection prefers studio albums (`primary-type == "Album"` with no `secondary-types`) and chooses the earliest dated one. Single/Compilation/Live release groups are used only when no studio album exists. Verified live: for "Daft Punk / Harder Better Faster Stronger" MB returns a Single, Compilations, Live albums, and the studio **Discovery** (mbid `48117b90-a16e-34ca-a514-19c702df1158`).

- [ ] **Step 1: Write the failing test**

Create `tests/test_musicbrainz.py`:

```python
import server.mf  # noqa: F401
import musicfetch_core as mf


class _FakeResp:
    def __init__(self, payload):
        self._payload = payload

    def raise_for_status(self):
        pass

    def json(self):
        return self._payload


# Trimmed real-shaped MB recording response.
MB_PAYLOAD = {
    "recordings": [
        {
            "artist-credit": [{"name": "Daft Punk"}],
            "releases": [
                {"date": "2001", "release-group": {
                    "id": "single-mbid", "title": "Harder, Better, Faster, Stronger",
                    "primary-type": "Single", "secondary-types": []}},
                {"date": "2002", "release-group": {
                    "id": "comp-mbid", "title": "Musique, Vol. 1",
                    "primary-type": "Album", "secondary-types": ["Compilation"]}},
                {"date": "2001", "release-group": {
                    "id": "48117b90-a16e-34ca-a514-19c702df1158", "title": "Discovery",
                    "primary-type": "Album", "secondary-types": []}},
            ],
        }
    ]
}


def test_picks_studio_album_over_single_and_comp(monkeypatch):
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp(MB_PAYLOAD))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    out = mf.musicbrainz_best_album("Daft Punk", "Harder Better Faster Stronger")
    assert out["album_title"] == "Discovery"
    assert out["artist"] == "Daft Punk"
    assert out["year"] == "2001"
    assert out["rg_mbid"] == "48117b90-a16e-34ca-a514-19c702df1158"


def test_returns_none_on_empty(monkeypatch):
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp({"recordings": []}))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    assert mf.musicbrainz_best_album("Nobody", "Nothing") is None


def test_returns_none_on_exception(monkeypatch):
    def boom(*a, **k):
        raise mf.requests.exceptions.RequestException("network down")

    monkeypatch.setattr(mf.requests, "get", boom)
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    assert mf.musicbrainz_best_album("Daft Punk", "Discovery") is None


def test_falls_back_to_any_releasegroup_when_no_studio(monkeypatch):
    payload = {"recordings": [{"artist-credit": [{"name": "X"}], "releases": [
        {"date": "2010", "release-group": {
            "id": "live1", "title": "Live Thing",
            "primary-type": "Album", "secondary-types": ["Live"]}},
    ]}]}
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp(payload))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    out = mf.musicbrainz_best_album("X", "Y")
    assert out["album_title"] == "Live Thing"


def test_first_artist_credit_only(monkeypatch):
    payload = {"recordings": [{
        "artist-credit": [{"name": "SLVMLORD"}, {"name": "Travis Bradley"}],
        "releases": [{"date": "2025", "release-group": {
            "id": "x", "title": "Album X",
            "primary-type": "Album", "secondary-types": []}}],
    }]}
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp(payload))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    out = mf.musicbrainz_best_album("SLVMLORD", "Under My Skin")
    assert out["artist"] == "SLVMLORD"
```

- [ ] **Step 2: Run test to verify it fails**

Run: `pytest tests/test_musicbrainz.py -v`
Expected: FAIL with `AttributeError: ... 'musicbrainz_best_album'`

- [ ] **Step 3: Add the implementation**

Add `import time` to the top imports block of `musicfetch` (with `import json`, `import os`, etc.). Then add above `lidarr_search`:

```python
MUSICBRAINZ_URL = "https://musicbrainz.org/ws/2"
MB_HEADERS = {"User-Agent": "musicfetch/2.0 (https://github.com/; personal music fetcher)"}
_mb_last_call = 0.0


def _mb_rate_limit():
    """Courtesy ~1 req/sec to MusicBrainz."""
    global _mb_last_call
    elapsed = time.time() - _mb_last_call
    if elapsed < 1.0:
        time.sleep(1.0 - elapsed)
    _mb_last_call = time.time()


def _mb_artist_credit(credit) -> str:
    """First credited artist name only (ignore featured/secondary)."""
    if credit and isinstance(credit, list) and isinstance(credit[0], dict):
        return credit[0].get("name") or (credit[0].get("artist") or {}).get("name", "")
    return ""


def musicbrainz_best_album(artist: str, track: str, timeout: int = 8) -> Optional[dict]:
    """Resolve 'artist - track' to its best studio album via MusicBrainz.
    Returns {album_title, artist, year, rg_mbid} or None.
    Never raises."""
    query = f'artist:"{artist}" AND recording:"{track}"'
    try:
        _mb_rate_limit()
        resp = requests.get(
            f"{MUSICBRAINZ_URL}/recording",
            params={"query": query, "fmt": "json", "limit": 10},
            headers=MB_HEADERS,
            timeout=timeout,
        )
        resp.raise_for_status()
        data = resp.json()
    except Exception as e:  # noqa: BLE001 - degrade to fallback on any failure
        dbg(f"MusicBrainz lookup failed: {e}")
        return None

    # candidate = (is_studio, date_sortkey, title, artist, year, mbid)
    candidates = []
    for rec in data.get("recordings", []):
        rec_artist = _mb_artist_credit(rec.get("artist-credit"))
        for rel in rec.get("releases", []):
            rg = rel.get("release-group") or {}
            title = rg.get("title") or rel.get("title") or ""
            if not title:
                continue
            mbid = rg.get("id") or ""
            primary = rg.get("primary-type") or ""
            secondary = rg.get("secondary-types") or []
            date = rel.get("date") or rg.get("first-release-date") or ""
            is_studio = primary == "Album" and not secondary
            candidates.append((is_studio, date or "9999", title, rec_artist, date[:4], mbid))

    if not candidates:
        return None
    pool = [c for c in candidates if c[0]] or candidates
    pool.sort(key=lambda c: c[1])  # earliest date first
    _, _, title, art, year, mbid = pool[0]
    dbg(f"MusicBrainz resolved '{artist} - {track}' -> '{title}' ({year}) mbid={mbid}")
    return {"album_title": title, "artist": art or artist, "year": year, "rg_mbid": mbid}
```

- [ ] **Step 4: Run test to verify it passes**

Run: `pytest tests/test_musicbrainz.py -v`
Expected: PASS (5 passed)

- [ ] **Step 5: Commit**

```bash
git add musicfetch tests/test_musicbrainz.py
git commit -m "feat(lidarr): MusicBrainz track-to-album resolver"
```

---

### Task 3: Rewrite `lidarr_search` for MBID-exact lookup

**Files:**
- Modify: `musicfetch` (replace `lidarr_search`; add `_lidarr_album_candidates`, `_lidarr_artist_candidates`, `_fallback_lookup`, `_universal_search`)
- Test: `tests/test_lidarr_search.py`

- [ ] **Step 1: Write the failing test**

Create `tests/test_lidarr_search.py`:

```python
import server.mf  # noqa: F401
import musicfetch_core as mf

DISCOVERY_MBID = "48117b90-a16e-34ca-a514-19c702df1158"
DISCOVERY_ALBUM = {
    "title": "Discovery",
    "artist": {"artistName": "Daft Punk"},
    "releaseDate": "2001-01-01",
    "foreignAlbumId": DISCOVERY_MBID,
}


def test_artist_track_uses_mbid_exact_lookup(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")
    monkeypatch.setattr(
        mf, "musicbrainz_best_album",
        lambda artist, track: {"album_title": "Discovery", "artist": "Daft Punk",
                               "year": "2001", "rg_mbid": DISCOVERY_MBID},
    )
    seen = {}

    def fake_get(path, params=None, timeout=15):
        seen["term"] = (params or {}).get("term")
        if path == "/api/v1/album/lookup" and seen["term"] == f"mbid:{DISCOVERY_MBID}":
            return [DISCOVERY_ALBUM]
        return []

    monkeypatch.setattr(mf, "lidarr_get", fake_get)
    hits = mf.lidarr_search("Daft Punk - Harder Better Faster Stronger", 10)
    assert seen["term"] == f"mbid:{DISCOVERY_MBID}"  # exact MBID lookup, not fuzzy
    assert hits[0].album == "Discovery"
    assert hits[0].artist == "Daft Punk"
    assert hits[0].payload["album"]["foreignAlbumId"] == DISCOVERY_MBID


def test_year_enriched_from_musicbrainz(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")
    monkeypatch.setattr(
        mf, "musicbrainz_best_album",
        lambda artist, track: {"album_title": "Discovery", "artist": "Daft Punk",
                               "year": "2001", "rg_mbid": DISCOVERY_MBID},
    )
    no_year = [{"title": "Discovery", "artist": {"artistName": "Daft Punk"},
                "releaseDate": "", "foreignAlbumId": DISCOVERY_MBID}]
    monkeypatch.setattr(
        mf, "lidarr_get",
        lambda path, params=None, timeout=15: no_year if path == "/api/v1/album/lookup" else [],
    )
    hits = mf.lidarr_search("Daft Punk - Discovery", 10)
    assert hits[0].year == "2001"


def test_no_api_key_returns_empty(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "")
    assert mf.lidarr_search("Daft Punk - Discovery", 10) == []


def test_mb_miss_falls_back_to_lookup(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")
    monkeypatch.setattr(mf, "musicbrainz_best_album", lambda artist, track: None)
    monkeypatch.setattr(
        mf, "lidarr_get",
        lambda path, params=None, timeout=15: [DISCOVERY_ALBUM] if path == "/api/v1/album/lookup" else [],
    )
    hits = mf.lidarr_search("Daft Punk - Discovery", 10)
    assert hits[0].album == "Discovery"


def test_single_term_is_artist_first(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")

    def fake_get(path, params=None, timeout=15):
        if path == "/api/v1/artist/lookup":
            return [{"artistName": "Daft Punk"}]
        if path == "/api/v1/album/lookup":
            return [DISCOVERY_ALBUM]
        return []

    monkeypatch.setattr(mf, "lidarr_get", fake_get)
    hits = mf.lidarr_search("Daft Punk", 10)
    assert hits[0].kind == "artist"  # bare term -> artist first
    assert hits[0].artist == "Daft Punk"


def test_last_resort_universal_search(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")
    monkeypatch.setattr(mf, "musicbrainz_best_album", lambda artist, track: None)

    def fake_get(path, params=None, timeout=15):
        if path == "/api/v1/search":
            return [{"album": DISCOVERY_ALBUM}]
        return []  # album/lookup + artist/lookup empty

    monkeypatch.setattr(mf, "lidarr_get", fake_get)
    hits = mf.lidarr_search("Daft Punk - Discovery", 10)
    assert hits and hits[0].album == "Discovery"
```

- [ ] **Step 2: Run test to verify it fails**

Run: `pytest tests/test_lidarr_search.py -v`
Expected: FAIL (current `lidarr_search` ignores MB / `mbid:` lookup)

- [ ] **Step 3: Replace `lidarr_search` and add helpers**

In `musicfetch`, replace the entire existing `lidarr_search` function (originally at lines ~129-162; the Task 1 and Task 2 additions above it will have shifted it down) with the following, adding the helpers below it:

```python
def lidarr_search(query: str, limit: int) -> list[Hit]:
    """Return Lidarr hits, best match first.

    Resolves 'Artist - Track' to an album's MusicBrainz release-group MBID, then
    does an exact Lidarr lookup (term=mbid:), with no fuzzy ranking.
    Falls back so it never raises and returns [] only on total failure / missing key."""
    if not API_KEY:
        err("LIDARR_API_KEY not set — skipping Lidarr search.")
        return []

    artist, right = _split_query(query)
    if right:
        mb = musicbrainz_best_album(artist, right)
        if mb and mb["rg_mbid"]:
            hits = _lidarr_album_candidates(f"mbid:{mb['rg_mbid']}")
            for h in hits:
                if not h.year and mb["year"]:
                    h.year = mb["year"]
            if hits:
                return hits[:limit]
        # MusicBrainz miss / no exact album → plain lookup (album-first: a dash
        # query named an album/track).
        return _fallback_lookup(query, limit, artist_first=False)

    # Bare term is most often an artist.
    return _fallback_lookup(query, limit, artist_first=True)


def _lidarr_album_candidates(term: str) -> list[Hit]:
    try:
        return [_album_to_hit(a) for a in lidarr_get("/api/v1/album/lookup", params={"term": term})]
    except RequestException as e:
        dbg(f"album/lookup failed: {e}")
        return []


def _lidarr_artist_candidates(term: str) -> list[Hit]:
    try:
        return [_artist_to_hit(a) for a in lidarr_get("/api/v1/artist/lookup", params={"term": term})]
    except RequestException as e:
        dbg(f"artist/lookup failed: {e}")
        return []


def _fallback_lookup(query: str, limit: int, artist_first: bool) -> list[Hit]:
    """Plain album + artist lookups (no scoring); /search as last resort."""
    albums = _lidarr_album_candidates(query)
    artists = _lidarr_artist_candidates(query)
    hits = (artists + albums) if artist_first else (albums + artists)
    if hits:
        return hits[:limit]
    return _universal_search(query, limit)


def _universal_search(query: str, limit: int) -> list[Hit]:
    """Last resort: Lidarr's fuzzy /search (unranked)."""
    hits: list[Hit] = []
    try:
        for item in lidarr_get("/api/v1/search", params={"term": query}):
            if item.get("album"):
                hits.append(_album_to_hit(item["album"]))
            elif item.get("artist"):
                hits.append(_artist_to_hit(item["artist"]))
    except RequestException as e:
        dbg(f"/api/v1/search failed: {e}")
    return hits[:limit]
```

- [ ] **Step 4: Run tests to verify they pass**

Run: `pytest tests/test_lidarr_search.py -v`
Expected: PASS (6 passed)

- [ ] **Step 5: Run the full suite**

Run: `pytest -q`
Expected: all green (prior 27 + new split/musicbrainz/lidarr-search tests), and `python3 -m py_compile musicfetch` clean.

- [ ] **Step 6: Commit**

```bash
git add musicfetch tests/test_lidarr_search.py
git commit -m "feat(lidarr): exact MBID album lookup via MusicBrainz resolution"
```

---

### Task 4: Live verification against the user's Lidarr

**Files:** none (manual verification by the controller, not a subagent).

- [ ] **Step 1: Read-only check: `lidarr_search` resolves the real album**

No mutation; confirms the MB → `mbid:` exact lookup end-to-end:

```bash
cd /home/zhering/Documents/musicfetch
env LIDARR_URL=http://10.2.1.16:8686 LIDARR_API_KEY=49cf02acb4c7436b842df2150056d468 \
python3 -c "import server.mf, musicfetch_core as mf; \
hits=mf.lidarr_search('Daft Punk - Harder Better Faster Stronger', 5); \
print([(h.artist, h.album, h.payload['album'].get('foreignAlbumId')) for h in hits[:3]])"
```

Expected: first hit `('Daft Punk', 'Discovery', '48117b90-a16e-34ca-a514-19c702df1158')`.

- [ ] **Step 2: Spot-check a second track** (different artist), e.g.:

```bash
env LIDARR_URL=http://10.2.1.16:8686 LIDARR_API_KEY=49cf02acb4c7436b842df2150056d468 \
python3 -c "import server.mf, musicfetch_core as mf; \
print([(h.artist,h.album) for h in mf.lidarr_search('Tame Impala - The Less I Know The Better',3)])"
```

Expected: top hit is the album containing that track (e.g. *Currents*), not a single/compilation.

- [ ] **Step 3: (Optional, mutating) full /fetch**, only with user approval, since it adds the artist+album to their Lidarr.

Start the API (`env MUSICFETCH_API_KEY=… LIDARR_URL=http://10.2.1.16:8686 LIDARR_API_KEY=… MUSICFETCH_ROOT=/media/music python3 -m uvicorn server.app:app --port 6769`), `POST /fetch?q=...&source=lidarr`, observe job + Lidarr UI, then clean up any added test artist via `DELETE /api/v1/artist/?deleteFiles=false`.
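
The read-only checks above compare the first hit's `foreignAlbumId` against a known release-group MBID by eye. Below is a small sketch of the same comparison as a reusable predicate. `top_hit_matches` and `stub_search` are hypothetical names, and the stub stands in for `mf.lidarr_search` so the snippet runs without a Lidarr instance.

```python
from types import SimpleNamespace

DISCOVERY_MBID = "48117b90-a16e-34ca-a514-19c702df1158"


def top_hit_matches(search, query, expected_mbid, limit=5):
    """True when the first hit's raw album payload carries the expected MBID.

    `search` is any callable shaped like lidarr_search(query, limit).
    Hypothetical helper, not part of the plan's code.
    """
    hits = search(query, limit)
    if not hits:
        return False
    payload = getattr(hits[0], "payload", None) or {}
    album = payload.get("album") or {}
    return album.get("foreignAlbumId") == expected_mbid


def stub_search(query, limit):
    """Stand-in for mf.lidarr_search that returns one Hit-like object."""
    hit = SimpleNamespace(
        artist="Daft Punk",
        album="Discovery",
        payload={"album": {"foreignAlbumId": DISCOVERY_MBID}},
    )
    return [hit][:limit]


ok = top_hit_matches(stub_search, "Daft Punk - Harder Better Faster Stronger", DISCOVERY_MBID)
empty = top_hit_matches(lambda q, n: [], "anything", DISCOVERY_MBID)
```

Against the live instance, the same predicate could be called with `mf.lidarr_search` in place of the stub. An empty result then reads as a failed check rather than an `IndexError`.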
---

## Self-Review

**Spec coverage:**
- Shared `lidarr_search` rewrite, same signature → Task 3. ✅
- MusicBrainz resolver w/ studio release-group selection + first-artist credit → Task 2. ✅
- `mbid:` exact Lidarr lookup (no fuzzy scoring) → Task 3. ✅
- Query split → Task 1. ✅
- Fallback tiers (MB miss → `_fallback_lookup` → `/api/v1/search`; returns [] on total failure / no key) → Task 3 (`test_mb_miss_falls_back_to_lookup`, `test_last_resort_universal_search`, `test_no_api_key_returns_empty`). ✅
- Year enrichment from MB → Task 3 (`test_year_enriched_from_musicbrainz`). ✅
- YouTube-fallback preserved (signature unchanged; `[]` on failure) → guaranteed + `test_no_api_key_returns_empty`. ✅
- Single-term artist-first ordering → Task 3 (`test_single_term_is_artist_first`). ✅
- Out-of-scope (difflib scoring removed; metadata/quality-profile hardening raised separately) intentionally excluded.

**Placeholder scan:** None. All code and test bodies are complete; real MBID/JSON baked in.

**Type consistency:** `lidarr_search(query, limit) -> list[Hit]` unchanged. `musicbrainz_best_album` returns `{album_title, artist, year, rg_mbid}`, with keys identical across Task 2 (definition), Task 3 (consumes `mb["rg_mbid"]`, `mb["year"]`) and the tests. `_split_query -> (str, Optional[str])` consistent. `_lidarr_album_candidates`/`_lidarr_artist_candidates`/`_fallback_lookup(query, limit, artist_first)`/`_universal_search(query, limit)` signatures consistent between Task 3 definition and call sites. `_album_to_hit` payload `{"album": {...}}` with `foreignAlbumId` matches the assertions in Task 3.
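
The studio release-group selection listed under spec coverage can also be illustrated in isolation. This is a simplified sketch, not the plan's implementation: `pick_release_group` and the flat dict keys (`primary_type`, `secondary_types`, `date`) are hypothetical, whereas `musicbrainz_best_album` works on the nested MusicBrainz JSON directly.

```python
def pick_release_group(groups):
    """Prefer studio albums (primary type Album, no secondary types), earliest first.

    Falls back to every candidate when no studio album exists, and returns
    None for an empty list. Undated entries sort last via the "9999" key.
    """
    studio = [
        g for g in groups
        if g["primary_type"] == "Album" and not g["secondary_types"]
    ]
    pool = studio or groups
    if not pool:
        return None
    return min(pool, key=lambda g: g["date"] or "9999")


# Hand-built candidates mirroring the shapes used in the Task 2 tests.
GROUPS = [
    {"title": "Harder, Better, Faster, Stronger", "primary_type": "Single",
     "secondary_types": [], "date": "2001"},
    {"title": "Musique, Vol. 1", "primary_type": "Album",
     "secondary_types": ["Compilation"], "date": "2002"},
    {"title": "Discovery", "primary_type": "Album",
     "secondary_types": [], "date": "2001"},
]
LIVE_ONLY = [
    {"title": "Live Thing", "primary_type": "Album",
     "secondary_types": ["Live"], "date": "2010"},
]

best = pick_release_group(GROUPS)
fallback = pick_release_group(LIVE_ONLY)
```

`min` returns the first of several equally dated candidates, which matches the stable sort followed by `pool[0]` used in Task 2.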