Merge feat/rest-api: REST API + smarter Lidarr matching

- FastAPI async job-based REST API wrapping musicfetch (X-API-Key auth,
  Siri-friendly messages, dockerized for the Lidarr stack).
- Smarter Lidarr search: MusicBrainz track->album resolution + exact
  mbid: lookup (prefers own-artist studio album), no fuzzy ranking.
- Bug fixes from live testing: single first-artist tag (no doubling).
2026-06-08 23:31:41 -07:00
23 changed files with 1645 additions and 24 deletions

.dockerignore Normal file

@@ -0,0 +1,7 @@
__pycache__/
*.pyc
tests/
docs/
.git/
*.md
.claude/


@@ -140,6 +140,86 @@ export LIDARR_API_KEY="your-lidarr-api-key"
---
## 🌐 REST API (Docker)
Run MusicFetch as an authenticated HTTP service inside your Lidarr Docker stack.
A client POSTs a query; the server grabs the top hit non-interactively and runs
the download as a background job you can poll. Every response includes a
human-readable `message` (handy for Siri).
### Configure & run
Set the network name in `server/docker-compose.yml` to your existing Lidarr
stack network, then:
```bash
export LIDARR_API_KEY="your-lidarr-key"
export MUSICFETCH_API_KEY="a-long-random-secret"
docker compose -f server/docker-compose.yml up -d --build
```
| Env var | Default | Purpose |
| --- | --- | --- |
| `MUSICFETCH_API_KEY` | *(required)* | Shared secret clients send as `X-API-Key`. |
| `MUSICFETCH_PORT` | `6769` | Listen port. |
| `LIDARR_URL` | `http://lidarr:8686` | Lidarr base URL (stack network). |
| `LIDARR_API_KEY` | *(required for Lidarr)* | Lidarr API key. |
| `MUSICFETCH_ROOT` | `/media/music` | Music output root (bind-mounted). |
TLS is expected to be handled by your upstream reverse proxy; the container
serves plain HTTP on `6769`.
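
As an illustration of how the table above might translate into startup configuration, here is a minimal sketch. `Settings` and `load_settings` are hypothetical names for this example, not code from the repository; only the variable names and defaults are taken from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: str                    # MUSICFETCH_API_KEY (required)
    port: int                       # MUSICFETCH_PORT
    lidarr_url: str                 # LIDARR_URL
    lidarr_api_key: Optional[str]   # LIDARR_API_KEY (only needed for Lidarr)
    music_root: str                 # MUSICFETCH_ROOT


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Hypothetical loader: defaults mirror the env-var table."""
    api_key = env.get("MUSICFETCH_API_KEY", "")
    if not api_key:
        # The table marks this variable as required, so fail early.
        raise RuntimeError("MUSICFETCH_API_KEY is required")
    return Settings(
        api_key=api_key,
        port=int(env.get("MUSICFETCH_PORT", "6769")),
        lidarr_url=env.get("LIDARR_URL", "http://lidarr:8686"),
        lidarr_api_key=env.get("LIDARR_API_KEY") or None,
        music_root=env.get("MUSICFETCH_ROOT", "/media/music"),
    )
```

Passing the environment in as a mapping keeps the loader easy to exercise in tests without touching the real process environment.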
### Endpoints
| Method | Path | Auth | Purpose |
| --- | --- | --- | --- |
| `GET` | `/health` | no | Liveness check. |
| `POST` | `/fetch?q=...` | yes | Grab top hit; returns a `job_id`. |
| `GET` | `/jobs/{id}` | yes | Poll job status. |
`POST /fetch` params: `q` (required), `quality` (`best,320,m4a,opus,flac`),
`source` (`auto,lidarr,youtube`).
### curl examples
```bash
# Kick off a fetch
curl -X POST 'https://mf.izebra.net/fetch?q=Under%20My%20Skin' \
-H 'X-API-Key: a-long-random-secret'
# -> {"message":"Found 'Under My Skin' ... Downloading now.","job_id":"a1b2c3","status":"queued","hit":{...}}
# Poll the job
curl 'https://mf.izebra.net/jobs/a1b2c3' -H 'X-API-Key: a-long-random-secret'
# -> {"message":"Finished downloading ...","status":"done","result":{...}}
```
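
The same fetch-then-poll flow can be sketched in Python using only the standard library. This is an illustrative client, not part of the repository: `build_fetch_request`, `build_job_request`, `send_json` and `fetch_and_wait` are hypothetical names, and the only status values assumed are `queued` and `done`, as shown in the curl output above. Any other status (for example a failure state) simply keeps polling until the attempt budget runs out.

```python
import json
import time
import urllib.parse
import urllib.request


def build_fetch_request(base_url: str, query: str, api_key: str) -> urllib.request.Request:
    # quote_via=quote encodes spaces as %20, matching the curl example.
    qs = urllib.parse.urlencode({"q": query}, quote_via=urllib.parse.quote)
    url = f"{base_url.rstrip('/')}/fetch?{qs}"
    return urllib.request.Request(url, method="POST", headers={"X-API-Key": api_key})


def build_job_request(base_url: str, job_id: str, api_key: str) -> urllib.request.Request:
    url = f"{base_url.rstrip('/')}/jobs/{urllib.parse.quote(job_id, safe='')}"
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def send_json(req: urllib.request.Request, timeout: float = 15.0) -> dict:
    # Performs the HTTP call and decodes the JSON body.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def fetch_and_wait(base_url, query, api_key, interval=5.0, attempts=24,
                   transport=send_json, sleep=time.sleep) -> dict:
    """Start a fetch, then poll the job until it reports 'done' or attempts run out."""
    job = transport(build_fetch_request(base_url, query, api_key))
    status = job
    for _ in range(attempts):
        if status.get("status") == "done":
            break
        sleep(interval)
        status = transport(build_job_request(base_url, job["job_id"], api_key))
    return status  # last response seen; callers can read its "message"
```

`transport` and `sleep` are injectable so the polling logic can be tested without a network; a real call would look like `fetch_and_wait("https://mf.izebra.net", "Under My Skin", "a-long-random-secret")`.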
### 🗣️ Siri Shortcuts integration
Make a shortcut that fetches music by voice ("Hey Siri, fetch music").
1. **Shortcuts app → New Shortcut.**
2. Add **Ask for Input** → Input Type **Text**, prompt "What should I fetch?".
   (Or use **Dictate Text** for fully spoken input.)
3. Add **Text** action, set it to: `https://mf.izebra.net/fetch?q=` then insert
   the **Provided Input** variable at the end. (Shortcuts URL-encodes query
   variables automatically.)
4. Add **Get Contents of URL**:
   - **URL:** the Text variable from step 3.
   - **Method:** `POST`.
   - **Headers:** add one: key `X-API-Key`, value your `MUSICFETCH_API_KEY`.
   - **Request Body:** leave as is (the query is in the URL).
5. Add **Get Dictionary Value** → Get Value for **message** in **Contents of URL**.
6. Add **Speak Text** → the Dictionary Value. Siri reads back
   "Found '…' … Downloading now."
7. (Optional) To confirm completion: add **Get Dictionary Value** for `job_id`,
   **Wait** ~20 seconds, **Get Contents of URL** on
   `https://mf.izebra.net/jobs/<job_id>` (same `X-API-Key` header), then
   **Get Dictionary Value** `message` → **Speak Text** again.
Rename the shortcut (e.g. "Fetch Music") — that phrase becomes the Siri trigger.
---
## 🛠️ Contributing
PRs welcome. This script is middleware around Lidarr + yt-dlp, not a Lidarr


@@ -0,0 +1,514 @@
# Smarter Lidarr Matching Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Make `musicfetch.lidarr_search` resolve a Shazam-style `Artist - Track` to the correct album by asking MusicBrainz for the studio album's release-group MBID, then doing an **exact** Lidarr lookup (`album/lookup?term=mbid:<MBID>`) — so the noninteractive API picks the real album (Daft Punk *Discovery*) instead of junk (Pignickel novelty), with **no fuzzy ranking system**.
**Architecture:** All changes are in the single-file `musicfetch` binary (the shared search used by both the CLI picker and the REST API). New helpers `_split_query` and `musicbrainz_best_album`, plus a rewritten `lidarr_search` with small lookup helpers and tiered fallbacks. Tests import the binary as a module via the existing `server.mf` loader (which registers it in `sys.modules` as `musicfetch_core`).
**Tech Stack:** Python 3.10+, stdlib `time`, `requests` (already a dep), pytest with `monkeypatch`. No new dependencies. Live-validated against MusicBrainz + the user's Lidarr 3.1.0 — `album/lookup?term=mbid:48117b90-a16e-34ca-a514-19c702df1158` returns exactly `Discovery — Daft Punk`.
---
## Context for the implementer
`musicfetch` is an executable Python file (no `.py` ext) at the repo root. Relevant existing pieces:
- `Hit` dataclass: fields `source, kind, title, artist, album, year, thumbnail, payload`.
- `_album_to_hit(album)` → `Hit(source="lidarr", kind="album", ..., payload={"album": album})`. The raw Lidarr album dict carries `foreignAlbumId` (MusicBrainz release-group MBID) and `releaseDate`.
- `_artist_to_hit(artist)` → `Hit(source="lidarr", kind="artist", ...)`.
- `lidarr_get(path, params=None, timeout=15)` → GET helper, raises on HTTP error.
- `API_KEY`, `dbg(...)`, `err(...)`, module-level `requests`, `from requests.exceptions import RequestException, Timeout`.
- Current `lidarr_search(query, limit)` at lines ~129-162 trusts `/api/v1/search` ordering then falls back to `/album/lookup` + `/artist/lookup`. **This is what we replace.**
**Why MusicBrainz is still required:** Lidarr has no track-search endpoint; `album/lookup` only matches albums/artists. Shazam gives `Artist - Track`, and the track name won't match the album title in Lidarr. MusicBrainz recording search maps track → album, and gives us the release-group MBID that Lidarr's `mbid:` lookup resolves exactly. No scoring needed.
**Don't break callers:** `lidarr_search(query, limit) -> list[Hit]` signature stays identical. `build_combined_hits` and the API depend on it returning `[]` on failure (so the YouTube fallback works).
**Tests access the binary like this** (top of each new test module):
```python
import server.mf # noqa: F401 — loads musicfetch and registers musicfetch_core in sys.modules
import musicfetch_core as mf
```
Set `mf.API_KEY` via `monkeypatch.setattr(mf, "API_KEY", "testkey")` where needed.
**One import to add** to the top imports block of `musicfetch` (Task 2): `import time`.
---
### Task 1: Query splitter `_split_query`
**Files:**
- Modify: `musicfetch` (add `_split_query` just above `lidarr_search`)
- Test: `tests/test_lidarr_match.py`
- [ ] **Step 1: Write the failing test**
Create `tests/test_lidarr_match.py`:
```python
import server.mf  # noqa: F401 — loads musicfetch, registers musicfetch_core in sys.modules
import musicfetch_core as mf


def test_split_query_with_dash():
    assert mf._split_query("Daft Punk - Discovery") == ("Daft Punk", "Discovery")


def test_split_query_no_dash():
    assert mf._split_query("Daft Punk") == ("Daft Punk", None)


def test_split_query_splits_on_first_dash_only():
    assert mf._split_query("A - B - C") == ("A", "B - C")


def test_split_query_strips_whitespace():
    assert mf._split_query("  Daft Punk  -  Discovery ") == ("Daft Punk", "Discovery")
```
- [ ] **Step 2: Run test to verify it fails**
Run: `pytest tests/test_lidarr_match.py -v`
Expected: FAIL — `AttributeError: module 'musicfetch_core' has no attribute '_split_query'`
- [ ] **Step 3: Add the implementation**
In `musicfetch`, immediately above `def lidarr_search(`:
```python
def _split_query(query: str) -> tuple[str, Optional[str]]:
    """Split a Shazam-style 'Artist - Track' on the first ' - '.
    Returns (artist, track) or (term, None) when there is no separator."""
    if " - " in query:
        left, right = query.split(" - ", 1)
        return left.strip(), right.strip()
    return query.strip(), None
```
- [ ] **Step 4: Run test to verify it passes**
Run: `pytest tests/test_lidarr_match.py -v`
Expected: PASS (4 passed)
- [ ] **Step 5: Commit**
```bash
git add musicfetch tests/test_lidarr_match.py
git commit -m "feat(lidarr): add Artist - Track query splitter"
```
---
### Task 2: MusicBrainz track→album resolver
**Files:**
- Modify: `musicfetch` (add `import time` to top imports; add MB constants + `_mb_rate_limit`, `_mb_artist_credit`, `musicbrainz_best_album` above `lidarr_search`)
- Test: `tests/test_musicbrainz.py`
The release-group selection prefers studio albums (`primary-type == "Album"` with no `secondary-types`), choosing the earliest dated one, skipping Single/Compilation/Live. Verified live: for "Daft Punk / Harder Better Faster Stronger" MB returns a Single, Compilations, Live albums, and the studio **Discovery** (mbid `48117b90-a16e-34ca-a514-19c702df1158`).
- [ ] **Step 1: Write the failing test**
Create `tests/test_musicbrainz.py`:
```python
import server.mf  # noqa: F401
import musicfetch_core as mf


class _FakeResp:
    def __init__(self, payload):
        self._payload = payload

    def raise_for_status(self):
        pass

    def json(self):
        return self._payload


# Trimmed real-shaped MB recording response.
MB_PAYLOAD = {
    "recordings": [
        {
            "artist-credit": [{"name": "Daft Punk"}],
            "releases": [
                {"date": "2001",
                 "release-group": {"id": "single-mbid", "title": "Harder, Better, Faster, Stronger",
                                   "primary-type": "Single", "secondary-types": []}},
                {"date": "2002",
                 "release-group": {"id": "comp-mbid", "title": "Musique, Vol. 1",
                                   "primary-type": "Album", "secondary-types": ["Compilation"]}},
                {"date": "2001",
                 "release-group": {"id": "48117b90-a16e-34ca-a514-19c702df1158",
                                   "title": "Discovery", "primary-type": "Album",
                                   "secondary-types": []}},
            ],
        }
    ]
}


def test_picks_studio_album_over_single_and_comp(monkeypatch):
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp(MB_PAYLOAD))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    out = mf.musicbrainz_best_album("Daft Punk", "Harder Better Faster Stronger")
    assert out["album_title"] == "Discovery"
    assert out["artist"] == "Daft Punk"
    assert out["year"] == "2001"
    assert out["rg_mbid"] == "48117b90-a16e-34ca-a514-19c702df1158"


def test_returns_none_on_empty(monkeypatch):
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp({"recordings": []}))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    assert mf.musicbrainz_best_album("Nobody", "Nothing") is None


def test_returns_none_on_exception(monkeypatch):
    def boom(*a, **k):
        raise mf.requests.exceptions.RequestException("network down")
    monkeypatch.setattr(mf.requests, "get", boom)
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    assert mf.musicbrainz_best_album("Daft Punk", "Discovery") is None


def test_falls_back_to_any_releasegroup_when_no_studio(monkeypatch):
    payload = {"recordings": [{"artist-credit": [{"name": "X"}], "releases": [
        {"date": "2010", "release-group": {"id": "live1", "title": "Live Thing",
                                           "primary-type": "Album", "secondary-types": ["Live"]}},
    ]}]}
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp(payload))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    out = mf.musicbrainz_best_album("X", "Y")
    assert out["album_title"] == "Live Thing"


def test_first_artist_credit_only(monkeypatch):
    payload = {"recordings": [{"artist-credit": [{"name": "SLVMLORD"}, {"name": "Travis Bradley"}],
                               "releases": [{"date": "2025",
                                             "release-group": {"id": "x", "title": "Album X",
                                                               "primary-type": "Album",
                                                               "secondary-types": []}}]}]}
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp(payload))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    out = mf.musicbrainz_best_album("SLVMLORD", "Under My Skin")
    assert out["artist"] == "SLVMLORD"
```
- [ ] **Step 2: Run test to verify it fails**
Run: `pytest tests/test_musicbrainz.py -v`
Expected: FAIL — `AttributeError: ... 'musicbrainz_best_album'`
- [ ] **Step 3: Add the implementation**
Add `import time` to the top imports block of `musicfetch` (with `import json`, `import os`, etc.). Then add above `lidarr_search`:
```python
MUSICBRAINZ_URL = "https://musicbrainz.org/ws/2"
MB_HEADERS = {"User-Agent": "musicfetch/2.0 (https://github.com/; personal music fetcher)"}
_mb_last_call = 0.0


def _mb_rate_limit():
    """Courtesy ~1 req/sec to MusicBrainz."""
    global _mb_last_call
    elapsed = time.time() - _mb_last_call
    if elapsed < 1.0:
        time.sleep(1.0 - elapsed)
    _mb_last_call = time.time()


def _mb_artist_credit(credit) -> str:
    """First credited artist name only (ignore featured/secondary)."""
    if credit and isinstance(credit, list) and isinstance(credit[0], dict):
        return credit[0].get("name") or (credit[0].get("artist") or {}).get("name", "")
    return ""


def musicbrainz_best_album(artist: str, track: str, timeout: int = 8) -> Optional[dict]:
    """Resolve 'artist - track' to its best studio album via MusicBrainz.
    Returns {album_title, artist, year, rg_mbid} or None. Never raises."""
    query = f'artist:"{artist}" AND recording:"{track}"'
    try:
        _mb_rate_limit()
        resp = requests.get(
            f"{MUSICBRAINZ_URL}/recording",
            params={"query": query, "fmt": "json", "limit": 10},
            headers=MB_HEADERS, timeout=timeout,
        )
        resp.raise_for_status()
        data = resp.json()
    except Exception as e:  # noqa: BLE001 — degrade to fallback on any failure
        dbg(f"MusicBrainz lookup failed: {e}")
        return None
    # candidate = (is_studio, date_sortkey, title, artist, year, mbid)
    candidates = []
    for rec in data.get("recordings", []):
        rec_artist = _mb_artist_credit(rec.get("artist-credit"))
        for rel in rec.get("releases", []):
            rg = rel.get("release-group") or {}
            title = rg.get("title") or rel.get("title") or ""
            if not title:
                continue
            mbid = rg.get("id") or ""
            primary = rg.get("primary-type") or ""
            secondary = rg.get("secondary-types") or []
            date = rel.get("date") or rg.get("first-release-date") or ""
            is_studio = primary == "Album" and not secondary
            candidates.append((is_studio, date or "9999", title, rec_artist, date[:4], mbid))
    if not candidates:
        return None
    pool = [c for c in candidates if c[0]] or candidates
    pool.sort(key=lambda c: c[1])  # earliest date first
    _, _, title, art, year, mbid = pool[0]
    dbg(f"MusicBrainz resolved '{artist} - {track}' -> '{title}' ({year}) mbid={mbid}")
    return {"album_title": title, "artist": art or artist, "year": year, "rg_mbid": mbid}
```
- [ ] **Step 4: Run test to verify it passes**
Run: `pytest tests/test_musicbrainz.py -v`
Expected: PASS (5 passed)
- [ ] **Step 5: Commit**
```bash
git add musicfetch tests/test_musicbrainz.py
git commit -m "feat(lidarr): MusicBrainz track-to-album resolver"
```
---
### Task 3: Rewrite `lidarr_search` for MBID-exact lookup
**Files:**
- Modify: `musicfetch` (replace `lidarr_search`; add `_lidarr_album_candidates`, `_lidarr_artist_candidates`, `_fallback_lookup`, `_universal_search`)
- Test: `tests/test_lidarr_search.py`
- [ ] **Step 1: Write the failing test**
Create `tests/test_lidarr_search.py`:
```python
import server.mf  # noqa: F401
import musicfetch_core as mf

DISCOVERY_MBID = "48117b90-a16e-34ca-a514-19c702df1158"
DISCOVERY_ALBUM = {"title": "Discovery", "artist": {"artistName": "Daft Punk"},
                   "releaseDate": "2001-01-01", "foreignAlbumId": DISCOVERY_MBID}


def test_artist_track_uses_mbid_exact_lookup(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")
    monkeypatch.setattr(mf, "musicbrainz_best_album",
                        lambda artist, track: {"album_title": "Discovery", "artist": "Daft Punk",
                                               "year": "2001", "rg_mbid": DISCOVERY_MBID})
    seen = {}

    def fake_get(path, params=None, timeout=15):
        seen["term"] = (params or {}).get("term")
        if path == "/api/v1/album/lookup" and seen["term"] == f"mbid:{DISCOVERY_MBID}":
            return [DISCOVERY_ALBUM]
        return []
    monkeypatch.setattr(mf, "lidarr_get", fake_get)
    hits = mf.lidarr_search("Daft Punk - Harder Better Faster Stronger", 10)
    assert seen["term"] == f"mbid:{DISCOVERY_MBID}"  # exact MBID lookup, not fuzzy
    assert hits[0].album == "Discovery"
    assert hits[0].artist == "Daft Punk"
    assert hits[0].payload["album"]["foreignAlbumId"] == DISCOVERY_MBID


def test_year_enriched_from_musicbrainz(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")
    monkeypatch.setattr(mf, "musicbrainz_best_album",
                        lambda artist, track: {"album_title": "Discovery", "artist": "Daft Punk",
                                               "year": "2001", "rg_mbid": DISCOVERY_MBID})
    no_year = [{"title": "Discovery", "artist": {"artistName": "Daft Punk"},
                "releaseDate": "", "foreignAlbumId": DISCOVERY_MBID}]
    monkeypatch.setattr(mf, "lidarr_get",
                        lambda path, params=None, timeout=15: no_year if path == "/api/v1/album/lookup" else [])
    hits = mf.lidarr_search("Daft Punk - Discovery", 10)
    assert hits[0].year == "2001"


def test_no_api_key_returns_empty(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "")
    assert mf.lidarr_search("Daft Punk - Discovery", 10) == []


def test_mb_miss_falls_back_to_lookup(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")
    monkeypatch.setattr(mf, "musicbrainz_best_album", lambda artist, track: None)
    monkeypatch.setattr(mf, "lidarr_get",
                        lambda path, params=None, timeout=15: [DISCOVERY_ALBUM] if path == "/api/v1/album/lookup" else [])
    hits = mf.lidarr_search("Daft Punk - Discovery", 10)
    assert hits[0].album == "Discovery"


def test_single_term_is_artist_first(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")

    def fake_get(path, params=None, timeout=15):
        if path == "/api/v1/artist/lookup":
            return [{"artistName": "Daft Punk"}]
        if path == "/api/v1/album/lookup":
            return [DISCOVERY_ALBUM]
        return []
    monkeypatch.setattr(mf, "lidarr_get", fake_get)
    hits = mf.lidarr_search("Daft Punk", 10)
    assert hits[0].kind == "artist"  # bare term -> artist first
    assert hits[0].artist == "Daft Punk"


def test_last_resort_universal_search(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")
    monkeypatch.setattr(mf, "musicbrainz_best_album", lambda artist, track: None)

    def fake_get(path, params=None, timeout=15):
        if path == "/api/v1/search":
            return [{"album": DISCOVERY_ALBUM}]
        return []  # album/lookup + artist/lookup empty
    monkeypatch.setattr(mf, "lidarr_get", fake_get)
    hits = mf.lidarr_search("Daft Punk - Discovery", 10)
    assert hits and hits[0].album == "Discovery"
```
- [ ] **Step 2: Run test to verify it fails**
Run: `pytest tests/test_lidarr_search.py -v`
Expected: FAIL (current `lidarr_search` ignores MB / `mbid:` lookup)
- [ ] **Step 3: Replace `lidarr_search` and add helpers**
In `musicfetch`, replace the entire existing `def lidarr_search(...)` body (lines ~129-162) with the following, adding the helpers below it:
```python
def lidarr_search(query: str, limit: int) -> list[Hit]:
    """Return Lidarr hits, best match first. Resolves 'Artist - Track' to an
    album's MusicBrainz release-group MBID, then does an exact Lidarr lookup
    (term=mbid:<id>) — no fuzzy ranking. Falls back so it never raises and
    returns [] only on total failure / missing key."""
    if not API_KEY:
        err("LIDARR_API_KEY not set — skipping Lidarr search.")
        return []
    artist, right = _split_query(query)
    if right:
        mb = musicbrainz_best_album(artist, right)
        if mb and mb["rg_mbid"]:
            hits = _lidarr_album_candidates(f"mbid:{mb['rg_mbid']}")
            for h in hits:
                if not h.year and mb["year"]:
                    h.year = mb["year"]
            if hits:
                return hits[:limit]
        # MusicBrainz miss / no exact album → plain lookup (album-first: a dash
        # query named an album/track).
        return _fallback_lookup(query, limit, artist_first=False)
    # Bare term is most often an artist.
    return _fallback_lookup(query, limit, artist_first=True)


def _lidarr_album_candidates(term: str) -> list[Hit]:
    try:
        return [_album_to_hit(a) for a in lidarr_get("/api/v1/album/lookup", params={"term": term})]
    except RequestException as e:
        dbg(f"album/lookup failed: {e}")
        return []


def _lidarr_artist_candidates(term: str) -> list[Hit]:
    try:
        return [_artist_to_hit(a) for a in lidarr_get("/api/v1/artist/lookup", params={"term": term})]
    except RequestException as e:
        dbg(f"artist/lookup failed: {e}")
        return []


def _fallback_lookup(query: str, limit: int, artist_first: bool) -> list[Hit]:
    """Plain album + artist lookups (no scoring); /search as last resort."""
    albums = _lidarr_album_candidates(query)
    artists = _lidarr_artist_candidates(query)
    hits = (artists + albums) if artist_first else (albums + artists)
    if hits:
        return hits[:limit]
    return _universal_search(query, limit)


def _universal_search(query: str, limit: int) -> list[Hit]:
    """Last resort: Lidarr's fuzzy /search (unranked)."""
    hits: list[Hit] = []
    try:
        for item in lidarr_get("/api/v1/search", params={"term": query}):
            if item.get("album"):
                hits.append(_album_to_hit(item["album"]))
            elif item.get("artist"):
                hits.append(_artist_to_hit(item["artist"]))
    except RequestException as e:
        dbg(f"/api/v1/search failed: {e}")
    return hits[:limit]
```
- [ ] **Step 4: Run tests to verify they pass**
Run: `pytest tests/test_lidarr_search.py -v`
Expected: PASS (6 passed)
- [ ] **Step 5: Run the full suite**
Run: `pytest -q`
Expected: all green (prior 27 + new split/musicbrainz/lidarr-search tests), and `python3 -m py_compile musicfetch` clean.
- [ ] **Step 6: Commit**
```bash
git add musicfetch tests/test_lidarr_search.py
git commit -m "feat(lidarr): exact MBID album lookup via MusicBrainz resolution"
```
---
### Task 4: Live verification against the user's Lidarr
**Files:** none (manual verification by the controller, not a subagent).
- [ ] **Step 1: Read-only check — `lidarr_search` resolves the real album**
No mutation; confirms the MB → `mbid:` exact lookup end-to-end:
```bash
cd /home/zhering/Documents/musicfetch
env LIDARR_URL=http://10.2.1.16:8686 LIDARR_API_KEY=49cf02acb4c7436b842df2150056d468 \
python3 -c "import server.mf, musicfetch_core as mf; \
hits=mf.lidarr_search('Daft Punk - Harder Better Faster Stronger', 5); \
print([(h.artist, h.album, h.payload['album'].get('foreignAlbumId')) for h in hits[:3]])"
```
Expected: first hit `('Daft Punk', 'Discovery', '48117b90-a16e-34ca-a514-19c702df1158')`.
- [ ] **Step 2: Spot-check a second track** (different artist), e.g.:
```bash
env LIDARR_URL=http://10.2.1.16:8686 LIDARR_API_KEY=49cf02acb4c7436b842df2150056d468 \
python3 -c "import server.mf, musicfetch_core as mf; \
print([(h.artist,h.album) for h in mf.lidarr_search('Tame Impala - The Less I Know The Better',3)])"
```
Expected: top hit is the album containing that track (e.g. *Currents*), not a single/compilation.
- [ ] **Step 3: (Optional, mutating) full /fetch** — only with user approval, since it adds the artist+album to their Lidarr. Start the API (`env MUSICFETCH_API_KEY=… LIDARR_URL=http://10.2.1.16:8686 LIDARR_API_KEY=… MUSICFETCH_ROOT=/media/music python3 -m uvicorn server.app:app --port 6769`), `POST /fetch?q=...&source=lidarr`, observe job + Lidarr UI, then clean up any added test artist via `DELETE /api/v1/artist/<id>?deleteFiles=false`.
---
## Self-Review
**Spec coverage:**
- Shared `lidarr_search` rewrite, same signature → Task 3. ✅
- MusicBrainz resolver w/ studio release-group selection + first-artist credit → Task 2. ✅
- `mbid:` exact Lidarr lookup (no fuzzy scoring) → Task 3. ✅
- Query split → Task 1. ✅
- Fallback tiers (MB miss → `_fallback_lookup` → `/api/v1/search`; returns [] on total failure / no key) → Task 3 (`test_mb_miss_falls_back_to_lookup`, `test_last_resort_universal_search`, `test_no_api_key_returns_empty`). ✅
- Year enrichment from MB → Task 3 (`test_year_enriched_from_musicbrainz`). ✅
- YouTube-fallback preserved (signature unchanged; `[]` on failure) → guaranteed + `test_no_api_key_returns_empty`. ✅
- Single-term artist-first ordering → Task 3 (`test_single_term_is_artist_first`). ✅
- Out-of-scope (difflib scoring removed; metadata/quality-profile hardening raised separately) intentionally excluded.
**Placeholder scan:** None — all code and test bodies complete; real MBID/JSON baked in.
**Type consistency:** `lidarr_search(query, limit) -> list[Hit]` unchanged. `musicbrainz_best_album` returns `{album_title, artist, year, rg_mbid}` — keys identical across Task 2 (definition) and Task 3 (consumes `mb["rg_mbid"]`, `mb["year"]`) and tests. `_split_query -> (str, Optional[str])` consistent. `_lidarr_album_candidates`/`_lidarr_artist_candidates`/`_fallback_lookup(query, limit, artist_first)`/`_universal_search(query, limit)` signatures consistent between Task 3 definition and call sites. `_album_to_hit` payload `{"album": {...}}` with `foreignAlbumId` matches the assertions in Task 3.


@@ -0,0 +1,123 @@
# Smarter Lidarr Matching — Design
**Date:** 2026-06-08
**Status:** Approved
## Context & Goal
Live testing of the REST API exposed a real weakness: `musicfetch`'s
`lidarr_search` trusts Lidarr's universal `/api/v1/search` ordering, which is
fuzzy and unranked. A query of `Daft Punk - Discovery` ranked a novelty remix
("Daft Punk's Discovery but it's in the SM64 Soundfont" by *Pignickel*) #1, and
the real *Discovery* by Daft Punk wasn't even top-5. The interactive CLI picker
lets a human work around this; the **API's noninteractive top-pick cannot** and
grabs garbage.
The real input shape is Shazam-style `Artist - Track`. Lidarr only grabs
**albums**, never single tracks, so we must resolve a track to the album that
contains it, then pick the best-matching Lidarr album.
**Goal:** make `lidarr_search` return a **best-first** list of Lidarr
hits so the noninteractive API picks the correct album, and the CLI picker shows
good matches first. Resolve `Artist - Track` → album via MusicBrainz, then look
the album up in Lidarr by its exact MBID rather than by a fuzzy score.
## Decisions (confirmed with user)
- **Fix in the shared `musicfetch.lidarr_search`** (not an API-only layer) — both
the CLI picker and the API noninteractive pick benefit; no duplicated logic.
Signature unchanged: `lidarr_search(query, limit) -> list[Hit]` (drop-in).
- **Resolve track → album via MusicBrainz** (the same upstream Lidarr uses).
Lidarr's own track indexing is too weak. One extra HTTP call, no API key.
- **Track-first semantics** (`Artist - Track`): the right side is treated as a
track to resolve to its album. (YouTube path already handles exact tracks; this
makes Lidarr the accurate album/discography source.)
- **No fuzzy scoring.** Live-verified that Lidarr's `album/lookup` accepts a
direct MusicBrainz id: `term=mbid:<release-group-mbid>` (also `term=lidarr:<mbid>`)
returns **exactly one** album. So we resolve the album's MBID via MusicBrainz and
ask Lidarr for that exact MBID — no difflib, no ranking heuristics. The only
selection is deterministic release-group type-filtering inside the MusicBrainz
step (prefer studio Album over single/comp/live).
- **YouTube fallback preserved** exactly as today (see below).
## Architecture
All changes live in the `musicfetch` binary (single file). New/changed units:
```
musicfetch
├── _split_query(query) -> (left, right|None) # split on first " - "
├── musicbrainz_best_album(artist, track) -> dict|None
│ # MB recording search -> best release-group {album_title, artist, year, rg_mbid}
├── _lidarr_album_candidates(term) / _lidarr_artist_candidates(term) -> list[Hit]
├── _universal_search(query, limit) -> list[Hit] # /api/v1/search last resort
└── lidarr_search(query, limit) -> list[Hit] # REWRITTEN: MBID-exact + fallbacks
```
### Data flow
1. **`Artist - Track` query:**
a. `musicbrainz_best_album(artist, track)` → `{album_title, artist, year, rg_mbid}`.
b. Lidarr `GET /api/v1/album/lookup?term=mbid:<rg_mbid>` → 0 or 1 exact album → `Hit`.
c. Enrich `Hit.year` from MB when the Lidarr hit lacks one. Return it.
2. **Single-term query (no ` - `):** `_fallback_lookup` — artist-first concatenation
of `/artist/lookup` + `/album/lookup` for the raw term (a bare term is most often
an artist). No scoring; the interactive picker / noninteractive top-pick consume
the order.
3. **Fallbacks (never regress):** if MusicBrainz misses or the exact MBID lookup
returns nothing, use `_fallback_lookup(query)` (album-first there, since a dash
query named an album/track). If `/album/lookup` and `/artist/lookup` both yield
nothing, fall back to the existing `/api/v1/search`. `lidarr_search` returns `[]`
only when everything fails or the key is missing.
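
The tiering above boils down to "try each source in order, return the first non-empty result, and treat an error as an empty result". A minimal generic sketch of that pattern follows; `first_non_empty` is a hypothetical helper written for illustration, not a function in `musicfetch`, and it catches every exception for brevity where real code would likely catch narrower types.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def first_non_empty(tiers: Iterable[Callable[[], List[T]]], limit: int) -> List[T]:
    """Run each tier in order; return the first non-empty result, capped at `limit`.

    A tier that raises is treated the same as a tier that found nothing,
    so the caller always gets a list back and never an exception.
    """
    for tier in tiers:
        try:
            hits = tier()
        except Exception:
            # Degrade to the next tier on any failure.
            continue
        if hits:
            return hits[:limit]
    return []  # every tier failed or came back empty
```

With this shape, the exact MBID lookup, the plain lookups, and `/api/v1/search` would each be one zero-argument callable in the `tiers` list.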
### MusicBrainz client details
- Endpoint: `https://musicbrainz.org/ws/2/recording?query=<lucene>&fmt=json&limit=10`
where lucene = `artist:"<artist>" AND recording:"<track>"`.
- Headers: `User-Agent: musicfetch/2.0 (https://github.com/…)` (MB requires a
descriptive UA). Timeout ~8s. Rate-limit: at most ~1 request/sec (a process-level
min-interval guard; this tool makes one call per fetch so it's effectively a
courtesy delay).
- **Release-group selection** from the returned recordings' releases:
prefer `primary-type == "Album"` with **no** `secondary-types` (excludes
Compilation, Live, Single, Soundtrack); among those choose the earliest
`first-release-date`. Fall back to any release-group if none qualify. Return
`{album_title, artist, year, rg_mbid}` or `None`.
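
To make the request shape concrete, the sketch below builds the recording-search URL described in this list. `recording_search_url` and `_lucene_phrase` are hypothetical helpers for illustration. Escaping backslashes and double quotes inside the quoted phrases is an added assumption: the design only specifies the `artist:"…" AND recording:"…"` form.

```python
import urllib.parse

MUSICBRAINZ_URL = "https://musicbrainz.org/ws/2"


def _lucene_phrase(value: str) -> str:
    # Assumption: escape backslashes and double quotes so the value
    # stays a single quoted Lucene phrase.
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def recording_search_url(artist: str, track: str, limit: int = 10) -> str:
    """Build the MusicBrainz recording-search URL for an artist/track pair."""
    query = f"artist:{_lucene_phrase(artist)} AND recording:{_lucene_phrase(track)}"
    params = urllib.parse.urlencode({"query": query, "fmt": "json", "limit": limit})
    return f"{MUSICBRAINZ_URL}/recording?{params}"
```

The descriptive `User-Agent` header and the ~1 request/sec guard still have to be applied when the URL is actually fetched; this helper only covers query construction.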
## YouTube Fallback (unchanged, documented)
This feature does not alter fallback behavior:
- **`source=auto` (default):** `build_combined_hits` includes YouTube hits. If
Lidarr times out or returns no results, `lidarr_search` returns `[]` and the top
YouTube hit is picked. If a Lidarr album is picked but has no indexer release,
`actions.perform_fetch` falls through to the top YouTube hit.
- **`source=lidarr`:** lidarr-only by design — **no** YouTube fallback (the
explicit "force Lidarr" switch). Unchanged.
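
The source-selection rules above can be sketched as a small function. This is a simplified illustration, not the project's `build_combined_hits`: `pick_top_hit` and its two callables are hypothetical, it assumes Lidarr hits are ordered ahead of YouTube hits in `auto` mode and that `source=youtube` means YouTube only, and it does not model the later fall-through in `actions.perform_fetch` when a Lidarr album has no indexer release.

```python
from typing import Callable, List, Optional


def pick_top_hit(
    source: str,
    lidarr_search: Callable[[], List[str]],
    youtube_search: Callable[[], List[str]],
) -> Optional[str]:
    """Return the top hit for the requested source, or None if nothing matched."""
    if source == "lidarr":
        # Explicit "force Lidarr": never fall back to YouTube.
        hits = lidarr_search()
    elif source == "youtube":
        hits = youtube_search()
    else:
        # "auto": an empty Lidarr result ([]) lets the YouTube hits win.
        hits = lidarr_search() or youtube_search()
    return hits[0] if hits else None
```

This is why `lidarr_search` must return `[]` rather than raise: in `auto` mode the empty list is what hands the pick over to YouTube.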
## Error Handling
- All MB and Lidarr HTTP calls are wrapped; exceptions/timeouts are caught and
degrade to the next fallback tier. `lidarr_search` never raises.
- Empty/garbled MB JSON → treated as no match.
- Existing `DEBUG` logging extended to show MB query and chosen release-group.
## Testing
Unit tests (mock `requests`, no live network):
- `musicbrainz_best_album`: from canned MB JSON, picks studio Album over a single
and a compilation; picks earliest among Albums; falls back to any release-group
when no studio exists; returns `None` on empty/exception.
- `_split_query`: `"A - B"` → `("A","B")`; no dash → `("A", None)`; only first
` - ` splits.
- `lidarr_search`: `Artist - Track` resolves via MB then does an `mbid:` exact
lookup returning the real album (year enriched from MB); MB miss → fallback
lookup; fallback empty → `/api/v1/search`; no key → `[]`.
Manual live check (end of implementation): with the API pointed at the user's
Lidarr (`10.2.1.16:8686`), `lidarr_search("Daft Punk - Harder Better Faster
Stronger")` resolves to **Discovery** by Daft Punk (the exact MBID
`48117b90-a16e-34ca-a514-19c702df1158`), not a single/compilation/novelty.
## Out of Scope (YAGNI)
Caching MB responses, multi-track/album disambiguation UI, fuzzy similarity
scoring (eliminated by the `mbid:` exact lookup), MB cover-art lookup.


@@ -12,12 +12,13 @@ import os
import re
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Optional
import requests
-from requests.exceptions import RequestException, Timeout
+from requests.exceptions import RequestException
# Optional deps — degrade gracefully if missing.
try:
@@ -126,39 +127,151 @@ def _artist_to_hit(artist: dict) -> Hit:
)
MUSICBRAINZ_URL = "https://musicbrainz.org/ws/2"
MB_HEADERS = {"User-Agent": "musicfetch/2.0 (https://github.com/; personal music fetcher)"}
_mb_last_call = 0.0


def _mb_rate_limit():
    """Courtesy ~1 req/sec to MusicBrainz."""
    global _mb_last_call
    elapsed = time.time() - _mb_last_call
    if elapsed < 1.0:
        time.sleep(1.0 - elapsed)
    _mb_last_call = time.time()


def _mb_artist_credit(credit) -> str:
    """First credited artist name only (ignore featured/secondary)."""
    if credit and isinstance(credit, list) and isinstance(credit[0], dict):
        return credit[0].get("name") or (credit[0].get("artist") or {}).get("name", "")
    return ""


def musicbrainz_best_album(artist: str, track: str, timeout: int = 8) -> Optional[dict]:
    """Resolve 'artist - track' to its best studio album via MusicBrainz.
    Prefers a studio album credited to the track's own artist (not a Various
    Artists compilation). Returns {album_title, artist, year, rg_mbid} or None.
    Never raises."""
    query = f'artist:"{artist}" AND recording:"{track}"'
    try:
        _mb_rate_limit()
        resp = requests.get(
            f"{MUSICBRAINZ_URL}/recording",
            params={"query": query, "fmt": "json", "limit": 25},
            headers=MB_HEADERS, timeout=timeout,
        )
        resp.raise_for_status()
        data = resp.json()
    except Exception as e:  # noqa: BLE001 — degrade to fallback on any failure
        dbg(f"MusicBrainz lookup failed: {e}")
        return None
    # candidate = (own_studio, is_studio, date_sortkey, title, artist, year, mbid)
    candidates = []
    for rec in data.get("recordings", []):
        rec_artist = _mb_artist_credit(rec.get("artist-credit"))
        for rel in rec.get("releases", []):
            rg = rel.get("release-group") or {}
            title = rg.get("title") or rel.get("title") or ""
            if not title:
                continue
            mbid = rg.get("id") or ""
            primary = rg.get("primary-type") or ""
            secondary = rg.get("secondary-types") or []
            rel_artist = _mb_artist_credit(rel.get("artist-credit"))
            date = rel.get("date") or rg.get("first-release-date") or ""
            is_studio = primary == "Album" and not secondary
            own_studio = is_studio and (
                not rel_artist or rel_artist.casefold() == rec_artist.casefold()
            )
            candidates.append((own_studio, is_studio, date or "9999", title, rec_artist, date[:4], mbid))
    if not candidates:
        return None
    pool = ([c for c in candidates if c[0]]
            or [c for c in candidates if c[1]]
or candidates)
pool.sort(key=lambda c: c[2]) # earliest date first
_, _, _, title, art, year, mbid = pool[0]
dbg(f"MusicBrainz resolved '{artist} - {track}' -> '{title}' ({year}) mbid={mbid}")
return {"album_title": title, "artist": art or artist, "year": year, "rg_mbid": mbid}
def _split_query(query: str) -> tuple[str, Optional[str]]:
"""Split a Shazam-style 'Artist - Track' on the first ' - '.
Returns (artist, track) or (term, None) when there is no separator."""
if " - " in query:
left, right = query.split(" - ", 1)
return left.strip(), right.strip()
return query.strip(), None
def lidarr_search(query: str, limit: int) -> list[Hit]: def lidarr_search(query: str, limit: int) -> list[Hit]:
"""Universal search via /api/v1/search; fall back to album+artist lookup.""" """Return Lidarr hits, best match first. Resolves 'Artist - Track' to an
album's MusicBrainz release-group MBID, then does an exact Lidarr lookup
(term=mbid:<id>) — no fuzzy ranking. Falls back so it never raises and
returns [] only on total failure / missing key."""
if not API_KEY: if not API_KEY:
err("LIDARR_API_KEY not set — skipping Lidarr search.") err("LIDARR_API_KEY not set — skipping Lidarr search.")
return [] return []
artist, right = _split_query(query)
if right:
mb = musicbrainz_best_album(artist, right)
if mb and mb["rg_mbid"]:
hits = _lidarr_album_candidates(f"mbid:{mb['rg_mbid']}")
for h in hits:
if not h.year and mb["year"]:
h.year = mb["year"]
if hits:
return hits[:limit]
# MusicBrainz miss / no exact album → plain lookup (album-first: a dash
# query named an album/track).
return _fallback_lookup(query, limit, artist_first=False)
# Bare term is most often an artist.
return _fallback_lookup(query, limit, artist_first=True)
def _lidarr_album_candidates(term: str) -> list[Hit]:
try:
return [_album_to_hit(a) for a in lidarr_get("/api/v1/album/lookup", params={"term": term})]
except RequestException as e:
dbg(f"album/lookup failed: {e}")
return []
def _lidarr_artist_candidates(term: str) -> list[Hit]:
try:
return [_artist_to_hit(a) for a in lidarr_get("/api/v1/artist/lookup", params={"term": term})]
except RequestException as e:
dbg(f"artist/lookup failed: {e}")
return []
def _fallback_lookup(query: str, limit: int, artist_first: bool) -> list[Hit]:
"""Plain album + artist lookups (no scoring); /search as last resort."""
albums = _lidarr_album_candidates(query)
artists = _lidarr_artist_candidates(query)
hits = (artists + albums) if artist_first else (albums + artists)
if hits:
return hits[:limit]
return _universal_search(query, limit)
def _universal_search(query: str, limit: int) -> list[Hit]:
"""Last resort: Lidarr's fuzzy /search (unranked)."""
hits: list[Hit] = [] hits: list[Hit] = []
try: try:
results = lidarr_get("/api/v1/search", params={"term": query}) for item in lidarr_get("/api/v1/search", params={"term": query}):
for item in results:
# /search returns objects with 'foreignId' and either 'album' or 'artist'.
if item.get("album"): if item.get("album"):
hits.append(_album_to_hit(item["album"])) hits.append(_album_to_hit(item["album"]))
elif item.get("artist"): elif item.get("artist"):
hits.append(_artist_to_hit(item["artist"])) hits.append(_artist_to_hit(item["artist"]))
if hits:
return hits[:limit]
dbg("/api/v1/search returned nothing useful; trying lookup endpoints.")
except Timeout:
err("Lidarr universal search timed out.")
except RequestException as e: except RequestException as e:
dbg(f"/api/v1/search unavailable ({e}); falling back to lookup endpoints.") dbg(f"/api/v1/search failed: {e}")
# Fallback: album lookup then artist lookup.
try:
for album in lidarr_get("/api/v1/album/lookup", params={"term": query}):
hits.append(_album_to_hit(album))
except RequestException as e:
dbg(f"album/lookup failed: {e}")
try:
for artist in lidarr_get("/api/v1/artist/lookup", params={"term": query}):
hits.append(_artist_to_hit(artist))
except RequestException as e:
dbg(f"artist/lookup failed: {e}")
return hits[:limit] return hits[:limit]
@@ -474,7 +587,10 @@ def yt_download(url_or_query: str, target_folder: str, quality: str, dry_run: bo
# Override tags from the chosen hit so they don't rely on scraped titles. # Override tags from the chosen hit so they don't rely on scraped titles.
if hit: if hit:
if hit.artist: if hit.artist:
cmd += ["--replace-in-metadata", "artist", ".*", hit.artist] # First artist only; anchored ^.*$ replaces the whole field exactly once
# (a bare .* matches twice and doubles the value).
primary_artist = hit.artist.split(",")[0].strip()
cmd += ["--replace-in-metadata", "artist", "^.*$", primary_artist]
if hit.album: if hit.album:
cmd += ["--parse-metadata", f"{hit.album}:%(album)s"] cmd += ["--parse-metadata", f"{hit.album}:%(album)s"]
if hit.title: if hit.title:

15
server/Dockerfile Normal file

@@ -0,0 +1,15 @@
FROM python:3.12-slim
RUN apt-get update \
&& apt-get install -y --no-install-recommends ffmpeg \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY server/requirements.txt /app/server/requirements.txt
RUN pip install --no-cache-dir -r /app/server/requirements.txt
COPY musicfetch /app/musicfetch
COPY server /app/server
EXPOSE 6769
CMD ["sh", "-c", "uvicorn server.app:app --host 0.0.0.0 --port ${MUSICFETCH_PORT:-6769}"]

0
server/__init__.py Normal file

63
server/actions.py Normal file

@@ -0,0 +1,63 @@
"""Glue between a chosen Hit and a side-effecting download. Mirrors musicfetch's
main() dispatch but returns a structured result dict and speakable messages."""

import os

from . import mf


def _source_label(hit) -> str:
    return "YouTube Music" if hit.source == "youtube" else "Lidarr"


def _title(hit) -> str:
    return hit.album if hit.kind == "album" else (hit.title or hit.album or hit.artist)


def _primary_artist(hit) -> str:
    """First artist only — ignore featured/secondary artists."""
    return (hit.artist.split(",")[0].strip() if hit.artist else "") or "unknown artist"


def started_message(hit) -> str:
    return f"Found '{_title(hit)}' by {_primary_artist(hit)} on {_source_label(hit)}. Downloading now."


def done_message(hit) -> str:
    return f"Finished downloading '{_title(hit)}' by {_primary_artist(hit)}."


def failed_message(hit) -> str:
    return f"Failed to download '{_title(hit)}' by {_primary_artist(hit)}."


def _yt_path(hit, root: str) -> str:
    artist_dir = (hit.artist.split(",")[0].strip() if hit.artist else "") or "Unknown Artist"
    return os.path.join(root, artist_dir, "youtube")


def _download_youtube(hit, quality: str, root: str) -> dict:
    mf.act_youtube(hit, root, quality, False)
    return {"path": _yt_path(hit, root), "lidarr_album_id": None}


def perform_fetch(chosen, hits: list, quality: str, root: str) -> dict:
    """Run the download for the chosen hit. Returns {"path", "lidarr_album_id"}.
    Raises on unrecoverable failure (recorded by the job worker)."""
    if chosen.source == "youtube":
        return _download_youtube(chosen, quality, root)
    if chosen.kind == "album":
        handled = mf.act_lidarr_album(chosen, root, False, False)
        if handled:
            return {"path": None, "lidarr_album_id": chosen.payload.get("album", {}).get("id")}
        # No indexer release -> fall through to the top YouTube hit, like the CLI.
        yt = next((h for h in hits if h.source == "youtube"), None)
        if yt is None:
            raise RuntimeError("No Lidarr release and no YouTube fallback available.")
        return _download_youtube(yt, quality, root)
    # Lidarr artist pick.
    ok = mf.act_lidarr_artist(chosen, root, False, False)
    if not ok:
        raise RuntimeError("Failed to add artist to Lidarr.")
    return {"path": None, "lidarr_album_id": None}

84
server/app.py Normal file

@@ -0,0 +1,84 @@
"""MusicFetch REST API. Plain HTTP behind an upstream TLS reverse proxy."""

import os

from fastapi import Depends, FastAPI, Header, HTTPException, Query
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse

from . import actions, jobs, mf

API_KEY = os.environ.get("MUSICFETCH_API_KEY", "")
ROOT = os.environ.get("MUSICFETCH_ROOT", "/media/music")

app = FastAPI(title="MusicFetch API")


def require_key(x_api_key: str = Header(default="")):
    if not API_KEY or x_api_key != API_KEY:
        raise HTTPException(status_code=401, detail="Invalid API key.")


@app.exception_handler(HTTPException)
async def _http_exc(_req, exc: HTTPException):
    # Always return a Siri-speakable {"message": ...} body.
    return JSONResponse(status_code=exc.status_code, content={"message": exc.detail})


@app.exception_handler(RequestValidationError)
async def _validation_exc(_req, exc: RequestValidationError):
    return JSONResponse(status_code=422, content={"message": "Invalid or missing request parameters."})


@app.get("/health")
def health():
    return {"status": "ok"}


def _hit_public(hit) -> dict:
    return {"source": hit.source, "kind": hit.kind, "artist": hit.artist,
            "album": hit.album, "title": hit.title, "year": hit.year}


def _job_public(job) -> dict:
    return {"message": job.message, "job_id": job.id, "status": job.status,
            "hit": _hit_public(job.hit) if job.hit is not None else None,
            "result": job.result, "error": job.error}


@app.post("/fetch", dependencies=[Depends(require_key)])
def fetch(q: str = Query(..., min_length=1),
          quality: str = Query("best"),
          source: str = Query("auto")):
    if quality not in mf.QUALITY_CHOICES:
        raise HTTPException(status_code=422, detail=f"Invalid quality '{quality}'.")
    if source not in ("auto", "lidarr", "youtube"):
        raise HTTPException(status_code=422, detail=f"Invalid source '{source}'.")
    yt_first = source == "youtube"
    hits = mf.build_combined_hits(q, 10, yt_first,
                                  lidarr_only=(source == "lidarr"),
                                  yt_only=(source == "youtube"))
    if not hits:
        raise HTTPException(status_code=404, detail=f"No results found for '{q}'.")
    chosen = mf.pick(hits, q, True, yt_first)
    if chosen is None:
        raise HTTPException(status_code=404, detail=f"No results found for '{q}'.")
    job = jobs.create_job(hit=chosen, message=actions.started_message(chosen))
    response = _job_public(job)  # snapshot "queued" state before background thread starts
    jobs.run_job(
        job.id,
        lambda: actions.perform_fetch(chosen, hits, quality, ROOT),
        done_message=actions.done_message(chosen),
        fail_message=actions.failed_message(chosen),
    )
    return response


@app.get("/jobs/{job_id}", dependencies=[Depends(require_key)])
def job_status(job_id: str):
    job = jobs.get_job(job_id)
    if job is None:
        raise HTTPException(status_code=404, detail="No such job.")
    return _job_public(job)

25
server/docker-compose.yml Normal file

@@ -0,0 +1,25 @@
services:
  musicfetch-api:
    build:
      context: ..
      dockerfile: server/Dockerfile
    container_name: musicfetch-api
    restart: unless-stopped
    ports:
      - "6769:6769"
    environment:
      LIDARR_URL: "http://lidarr:8686"
      LIDARR_API_KEY: "${LIDARR_API_KEY}"
      MUSICFETCH_API_KEY: "${MUSICFETCH_API_KEY}"
      MUSICFETCH_ROOT: "/media/music"
      MUSICFETCH_PORT: "6769"
    volumes:
      - /media/music:/media/music
    networks:
      - lidarr_net

networks:
  lidarr_net:
    external: true
    # Set to the actual network name of your existing Lidarr stack, e.g.:
    # name: media_default

64
server/jobs.py Normal file

@@ -0,0 +1,64 @@
"""In-memory async job store. Personal-scale: jobs are lost on restart.
Generic — knows nothing about musicfetch; callers pass a no-arg `fn`."""

import time
import uuid
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Any, Callable, Optional

_EXECUTOR = ThreadPoolExecutor(max_workers=2)
JOBS: "dict[str, Job]" = {}
_MAX_JOBS = 200  # cap to bound memory


@dataclass
class Job:
    id: str
    status: str  # queued | running | done | failed
    hit: Any
    message: str
    result: Optional[dict] = None
    error: Optional[str] = None
    created_at: float = field(default_factory=time.time)
    updated_at: float = field(default_factory=time.time)


def _touch(job: "Job", **changes):
    for k, v in changes.items():
        setattr(job, k, v)
    job.updated_at = time.time()


def _evict_if_needed():
    # Post-condition: len(JOBS) <= _MAX_JOBS (evicts oldest overflow entries).
    if len(JOBS) <= _MAX_JOBS:
        return
    for jid in sorted(JOBS, key=lambda j: JOBS[j].created_at)[: len(JOBS) - _MAX_JOBS]:
        JOBS.pop(jid, None)


def create_job(hit: Any, message: str) -> "Job":
    job = Job(id=uuid.uuid4().hex[:8], status="queued", hit=hit, message=message)
    JOBS[job.id] = job
    _evict_if_needed()
    return job


def get_job(job_id: str) -> Optional["Job"]:
    return JOBS.get(job_id)


def run_job(job_id: str, fn: Callable[[], dict], done_message: str,
            fail_message: str = "Something went wrong while fetching.") -> None:
    def _task():
        job = JOBS.get(job_id)
        if job is None:
            return
        _touch(job, status="running")
        try:
            result = fn()
            _touch(job, status="done", result=result, message=done_message)
        except Exception as e:  # noqa: BLE001 — record any failure on the job
            _touch(job, status="failed", error=f"{type(e).__name__}: {e}",
                   message=fail_message)

    _EXECUTOR.submit(_task)

29
server/mf.py Normal file

@@ -0,0 +1,29 @@
"""Loads the sibling standalone `musicfetch` script (no .py extension) as a
module and re-exports the symbols the API reuses. This is the single seam
between the REST API and the CLI; musicfetch itself is unchanged."""
import importlib.machinery
import importlib.util
import os
import sys
_HERE = os.path.dirname(os.path.abspath(__file__))
_MF_PATH = os.environ.get("MUSICFETCH_BIN", os.path.join(_HERE, "..", "musicfetch"))
# spec_from_file_location returns None for extension-less files, so use
# SourceFileLoader directly to handle the bare `musicfetch` binary.
_loader = importlib.machinery.SourceFileLoader("musicfetch_core", _MF_PATH)
_spec = importlib.util.spec_from_loader("musicfetch_core", _loader)
_mod = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(_mod) # safe: musicfetch guards main() behind __main__
sys.modules["musicfetch_core"] = _mod
Hit = _mod.Hit
build_combined_hits = _mod.build_combined_hits
pick = _mod.pick
act_youtube = _mod.act_youtube
act_lidarr_album = _mod.act_lidarr_album
act_lidarr_artist = _mod.act_lidarr_artist
QUALITY_CHOICES = _mod.QUALITY_CHOICES
__all__ = ["Hit", "build_combined_hits", "pick", "act_youtube",
"act_lidarr_album", "act_lidarr_artist", "QUALITY_CHOICES"]

6
server/requirements.txt Normal file

@@ -0,0 +1,6 @@
fastapi
uvicorn[standard]
requests
ytmusicapi
rich
yt-dlp

0
tests/__init__.py Normal file

16
tests/conftest.py Normal file

@@ -0,0 +1,16 @@
import os

import pytest

os.environ.setdefault("MUSICFETCH_API_KEY", "test-key")


@pytest.fixture
def client():
    from fastapi.testclient import TestClient
    from server.app import app
    return TestClient(app)


@pytest.fixture
def auth():
    return {"X-API-Key": "test-key"}

87
tests/test_actions.py Normal file

@@ -0,0 +1,87 @@
import pytest

from server import actions, mf


def make_yt_hit():
    return mf.Hit(source="youtube", kind="track", title="Together",
                  artist="Avril Lavigne", album="Under My Skin", year="2004",
                  payload={"videoId": "abc"})


def make_lidarr_album_hit():
    return mf.Hit(source="lidarr", kind="album", title="Under My Skin",
                  artist="Avril Lavigne", album="Under My Skin", year="2004",
                  payload={"album": {"id": 5, "title": "Under My Skin"}})


def test_started_message_mentions_source_and_title():
    msg = actions.started_message(make_yt_hit())
    assert "Together" in msg
    assert "Avril Lavigne" in msg
    assert "YouTube" in msg


def test_done_message_mentions_title():
    msg = actions.done_message(make_yt_hit())
    assert "Together" in msg
    assert "Avril Lavigne" in msg


def test_messages_use_only_first_artist():
    hit = mf.Hit(source="youtube", kind="track", title="Under My Skin",
                 artist="SLVMLORD, James John, BobbyGee", album="X", year="",
                 payload={"videoId": "abc"})
    for msg in (actions.started_message(hit), actions.done_message(hit),
                actions.failed_message(hit)):
        assert "SLVMLORD" in msg
        assert "James John" not in msg
        assert "BobbyGee" not in msg


def test_perform_youtube_calls_act_youtube(monkeypatch):
    calls = {}
    monkeypatch.setattr(mf, "act_youtube",
                        lambda hit, root, quality, dry_run: calls.update(hit=hit, root=root, quality=quality))
    hit = make_yt_hit()
    result = actions.perform_fetch(hit, [hit], quality="best", root="/media/music")
    assert calls["quality"] == "best"
    assert result["path"] == "/media/music/Avril Lavigne/youtube"
    assert result["lidarr_album_id"] is None


def test_perform_lidarr_album_handled(monkeypatch):
    monkeypatch.setattr(mf, "act_lidarr_album",
                        lambda hit, root, search_all, dry_run: True)
    hit = make_lidarr_album_hit()
    result = actions.perform_fetch(hit, [hit], quality="best", root="/media/music")
    assert result["lidarr_album_id"] == 5
    assert result["path"] is None


def test_perform_lidarr_album_fallsthrough_to_youtube(monkeypatch):
    monkeypatch.setattr(mf, "act_lidarr_album",
                        lambda hit, root, search_all, dry_run: False)
    yt_calls = {}
    monkeypatch.setattr(mf, "act_youtube",
                        lambda hit, root, quality, dry_run: yt_calls.update(hit=hit))
    lidarr_hit = make_lidarr_album_hit()
    yt_hit = make_yt_hit()
    result = actions.perform_fetch(lidarr_hit, [lidarr_hit, yt_hit],
                                   quality="best", root="/media/music")
    assert yt_calls["hit"] is yt_hit
    assert result["path"] == "/media/music/Avril Lavigne/youtube"


def test_perform_lidarr_album_no_release_no_fallback_raises(monkeypatch):
    monkeypatch.setattr(mf, "act_lidarr_album",
                        lambda hit, root, search_all, dry_run: False)
    hit = make_lidarr_album_hit()
    with pytest.raises(RuntimeError):
        actions.perform_fetch(hit, [hit], quality="best", root="/media/music")


def test_failed_message_mentions_title_and_artist():
    msg = actions.failed_message(make_yt_hit())
    assert "Together" in msg
    assert "Avril Lavigne" in msg

105
tests/test_api.py Normal file

@@ -0,0 +1,105 @@
import time

import pytest

from server import mf, jobs as jobs_mod


@pytest.fixture(autouse=True)
def _clear_jobs():
    jobs_mod.JOBS.clear()
    yield
    jobs_mod.JOBS.clear()


def _yt_hit():
    return mf.Hit(source="youtube", kind="track", title="Together",
                  artist="Avril Lavigne", album="Under My Skin", year="2004",
                  payload={"videoId": "abc"})


def test_fetch_returns_job_and_message(client, auth, monkeypatch):
    hit = _yt_hit()
    monkeypatch.setattr("server.app.mf.build_combined_hits",
                        lambda q, limit, yt_first, lidarr_only, yt_only: [hit])
    monkeypatch.setattr("server.app.mf.pick",
                        lambda hits, q, noninteractive, yt_first: hits[0])
    monkeypatch.setattr("server.app.actions.perform_fetch",
                        lambda chosen, hits, quality, root: {"path": "/media/music/x", "lidarr_album_id": None})
    r = client.post("/fetch", params={"q": "Under My Skin"}, headers=auth)
    assert r.status_code == 200
    body = r.json()
    assert body["status"] == "queued"
    assert "Together" in body["message"]
    assert body["hit"]["artist"] == "Avril Lavigne"
    assert body["job_id"]


def test_fetch_no_hits_returns_404(client, auth, monkeypatch):
    monkeypatch.setattr("server.app.mf.build_combined_hits",
                        lambda q, limit, yt_first, lidarr_only, yt_only: [])
    r = client.post("/fetch", params={"q": "zzzz"}, headers=auth)
    assert r.status_code == 404
    assert "zzzz" in r.json()["message"]


def test_fetch_missing_q_returns_422(client, auth):
    r = client.post("/fetch", headers=auth)
    assert r.status_code == 422
    assert "message" in r.json()


def test_job_lifecycle_done(client, auth, monkeypatch):
    hit = _yt_hit()
    monkeypatch.setattr("server.app.mf.build_combined_hits",
                        lambda q, limit, yt_first, lidarr_only, yt_only: [hit])
    monkeypatch.setattr("server.app.mf.pick",
                        lambda hits, q, noninteractive, yt_first: hits[0])
    monkeypatch.setattr("server.app.actions.perform_fetch",
                        lambda chosen, hits, quality, root: {"path": "/media/music/x", "lidarr_album_id": None})
    job_id = client.post("/fetch", params={"q": "x"}, headers=auth).json()["job_id"]
    end = time.time() + 2
    status = None
    while time.time() < end:
        body = client.get(f"/jobs/{job_id}", headers=auth).json()
        status = body["status"]
        if status == "done":
            break
        time.sleep(0.01)
    assert status == "done"
    assert body["result"]["path"] == "/media/music/x"
    assert "Finished" in body["message"]


def test_unknown_job_404(client, auth):
    r = client.get("/jobs/deadbeef", headers=auth)
    assert r.status_code == 404
    assert "message" in r.json()


def test_jobs_requires_key(client):
    r = client.get("/jobs/whatever")
    assert r.status_code == 401


def test_fetch_invalid_quality_422(client, auth):
    r = client.post("/fetch", params={"q": "x", "quality": "bogus"}, headers=auth)
    assert r.status_code == 422
    assert "message" in r.json()


def test_fetch_invalid_source_422(client, auth):
    r = client.post("/fetch", params={"q": "x", "source": "spotify"}, headers=auth)
    assert r.status_code == 422
    assert "message" in r.json()


def test_fetch_pick_none_returns_404(client, auth, monkeypatch):
    hit = _yt_hit()
    monkeypatch.setattr("server.app.mf.build_combined_hits",
                        lambda q, limit, yt_first, lidarr_only, yt_only: [hit])
    monkeypatch.setattr("server.app.mf.pick",
                        lambda hits, q, noninteractive, yt_first: None)
    r = client.post("/fetch", params={"q": "x"}, headers=auth)
    assert r.status_code == 404
    assert "message" in r.json()

16
tests/test_auth.py Normal file

@@ -0,0 +1,16 @@
def test_health_no_auth(client):
    r = client.get("/health")
    assert r.status_code == 200
    assert r.json() == {"status": "ok"}


def test_fetch_requires_key(client):
    r = client.post("/fetch", params={"q": "anything"})
    assert r.status_code == 401
    assert "message" in r.json()


def test_fetch_rejects_wrong_key(client):
    r = client.post("/fetch", params={"q": "anything"},
                    headers={"X-API-Key": "wrong"})
    assert r.status_code == 401

55
tests/test_jobs.py Normal file

@@ -0,0 +1,55 @@
import time

from server import jobs


def _wait(job_id, status, timeout=2.0):
    end = time.time() + timeout
    while time.time() < end:
        j = jobs.get_job(job_id)
        if j and j.status == status:
            return j
        time.sleep(0.01)
    raise AssertionError(f"job {job_id} never reached {status}")


def test_create_job_is_queued():
    job = jobs.create_job(hit={"artist": "A"}, message="queued msg")
    assert job.status == "queued"
    assert job.hit == {"artist": "A"}
    assert jobs.get_job(job.id) is job


def test_run_job_success_sets_done():
    job = jobs.create_job(hit={}, message="m")
    jobs.run_job(job.id, lambda: {"path": "/x", "lidarr_album_id": None},
                 done_message="done!")
    j = _wait(job.id, "done")
    assert j.result == {"path": "/x", "lidarr_album_id": None}
    assert j.message == "done!"
    assert j.error is None


def test_run_job_failure_sets_failed():
    job = jobs.create_job(hit={}, message="m")

    def boom():
        raise RuntimeError("kaboom")

    jobs.run_job(job.id, boom, done_message="done!", fail_message="it broke")
    j = _wait(job.id, "failed")
    assert j.error and "kaboom" in j.error
    assert j.message == "it broke"


def test_get_unknown_job_returns_none():
    assert jobs.get_job("nope") is None


def test_eviction_keeps_within_cap():
    jobs.JOBS.clear()
    for i in range(jobs._MAX_JOBS + 25):
        jobs.create_job(hit={"i": i}, message="m")
    assert len(jobs.JOBS) <= jobs._MAX_JOBS
    jobs.JOBS.clear()


def teardown_module():
    jobs.JOBS.clear()


@@ -0,0 +1,18 @@
import server.mf  # noqa: F401 — loads musicfetch, registers musicfetch_core in sys.modules
import musicfetch_core as mf


def test_split_query_with_dash():
    assert mf._split_query("Daft Punk - Discovery") == ("Daft Punk", "Discovery")


def test_split_query_no_dash():
    assert mf._split_query("Daft Punk") == ("Daft Punk", None)


def test_split_query_splits_on_first_dash_only():
    assert mf._split_query("A - B - C") == ("A", "B - C")


def test_split_query_strips_whitespace():
    assert mf._split_query(" Daft Punk - Discovery ") == ("Daft Punk", "Discovery")


@@ -0,0 +1,83 @@
import server.mf  # noqa: F401
import musicfetch_core as mf

DISCOVERY_MBID = "48117b90-a16e-34ca-a514-19c702df1158"
DISCOVERY_ALBUM = {"title": "Discovery", "artist": {"artistName": "Daft Punk"},
                   "releaseDate": "2001-01-01", "foreignAlbumId": DISCOVERY_MBID}


def test_artist_track_uses_mbid_exact_lookup(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")
    monkeypatch.setattr(mf, "musicbrainz_best_album",
                        lambda artist, track: {"album_title": "Discovery", "artist": "Daft Punk",
                                               "year": "2001", "rg_mbid": DISCOVERY_MBID})
    seen = {}

    def fake_get(path, params=None, timeout=15):
        seen["term"] = (params or {}).get("term")
        if path == "/api/v1/album/lookup" and seen["term"] == f"mbid:{DISCOVERY_MBID}":
            return [DISCOVERY_ALBUM]
        return []

    monkeypatch.setattr(mf, "lidarr_get", fake_get)
    hits = mf.lidarr_search("Daft Punk - Harder Better Faster Stronger", 10)
    assert seen["term"] == f"mbid:{DISCOVERY_MBID}"
    assert hits[0].album == "Discovery"
    assert hits[0].artist == "Daft Punk"
    assert hits[0].payload["album"]["foreignAlbumId"] == DISCOVERY_MBID


def test_year_enriched_from_musicbrainz(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")
    monkeypatch.setattr(mf, "musicbrainz_best_album",
                        lambda artist, track: {"album_title": "Discovery", "artist": "Daft Punk",
                                               "year": "2001", "rg_mbid": DISCOVERY_MBID})
    no_year = [{"title": "Discovery", "artist": {"artistName": "Daft Punk"},
                "releaseDate": "", "foreignAlbumId": DISCOVERY_MBID}]
    monkeypatch.setattr(mf, "lidarr_get",
                        lambda path, params=None, timeout=15: no_year if path == "/api/v1/album/lookup" else [])
    hits = mf.lidarr_search("Daft Punk - Discovery", 10)
    assert hits[0].year == "2001"


def test_no_api_key_returns_empty(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "")
    assert mf.lidarr_search("Daft Punk - Discovery", 10) == []


def test_mb_miss_falls_back_to_lookup(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")
    monkeypatch.setattr(mf, "musicbrainz_best_album", lambda artist, track: None)
    monkeypatch.setattr(mf, "lidarr_get",
                        lambda path, params=None, timeout=15: [DISCOVERY_ALBUM] if path == "/api/v1/album/lookup" else [])
    hits = mf.lidarr_search("Daft Punk - Discovery", 10)
    assert hits[0].album == "Discovery"


def test_single_term_is_artist_first(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")

    def fake_get(path, params=None, timeout=15):
        if path == "/api/v1/artist/lookup":
            return [{"artistName": "Daft Punk"}]
        if path == "/api/v1/album/lookup":
            return [DISCOVERY_ALBUM]
        return []

    monkeypatch.setattr(mf, "lidarr_get", fake_get)
    hits = mf.lidarr_search("Daft Punk", 10)
    assert hits[0].kind == "artist"
    assert hits[0].artist == "Daft Punk"


def test_last_resort_universal_search(monkeypatch):
    monkeypatch.setattr(mf, "API_KEY", "testkey")
    monkeypatch.setattr(mf, "musicbrainz_best_album", lambda artist, track: None)

    def fake_get(path, params=None, timeout=15):
        if path == "/api/v1/search":
            return [{"album": DISCOVERY_ALBUM}]
        return []

    monkeypatch.setattr(mf, "lidarr_get", fake_get)
    hits = mf.lidarr_search("Daft Punk - Discovery", 10)
    assert hits and hits[0].album == "Discovery"

19
tests/test_mf_loader.py Normal file

@@ -0,0 +1,19 @@
"""Tests for the server.mf loader."""


def test_mf_reexports_musicfetch_symbols():
    from server import mf
    assert hasattr(mf, "Hit")
    assert callable(mf.build_combined_hits)
    assert callable(mf.pick)
    assert callable(mf.act_youtube)
    assert callable(mf.act_lidarr_album)
    assert callable(mf.act_lidarr_artist)
    assert isinstance(mf.QUALITY_CHOICES, list)


def test_mf_hit_constructs():
    from server import mf
    h = mf.Hit(source="youtube", kind="track", title="x", artist="y")
    assert h.source == "youtube"
    assert h.artist == "y"

96
tests/test_musicbrainz.py Normal file

@@ -0,0 +1,96 @@
import server.mf  # noqa: F401
import musicfetch_core as mf


class _FakeResp:
    def __init__(self, payload):
        self._payload = payload

    def raise_for_status(self):
        pass

    def json(self):
        return self._payload


MB_PAYLOAD = {
    "recordings": [
        {
            "artist-credit": [{"name": "Daft Punk"}],
            "releases": [
                {"date": "2001",
                 "release-group": {"id": "single-mbid", "title": "Harder, Better, Faster, Stronger",
                                   "primary-type": "Single", "secondary-types": []}},
                {"date": "2002",
                 "release-group": {"id": "comp-mbid", "title": "Musique, Vol. 1",
                                   "primary-type": "Album", "secondary-types": ["Compilation"]}},
                {"date": "2001",
                 "release-group": {"id": "48117b90-a16e-34ca-a514-19c702df1158",
                                   "title": "Discovery", "primary-type": "Album",
                                   "secondary-types": []}},
            ],
        }
    ]
}


def test_picks_studio_album_over_single_and_comp(monkeypatch):
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp(MB_PAYLOAD))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    out = mf.musicbrainz_best_album("Daft Punk", "Harder Better Faster Stronger")
    assert out["album_title"] == "Discovery"
    assert out["artist"] == "Daft Punk"
    assert out["year"] == "2001"
    assert out["rg_mbid"] == "48117b90-a16e-34ca-a514-19c702df1158"


def test_returns_none_on_empty(monkeypatch):
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp({"recordings": []}))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    assert mf.musicbrainz_best_album("Nobody", "Nothing") is None


def test_returns_none_on_exception(monkeypatch):
    def boom(*a, **k):
        raise mf.requests.exceptions.RequestException("network down")

    monkeypatch.setattr(mf.requests, "get", boom)
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    assert mf.musicbrainz_best_album("Daft Punk", "Discovery") is None


def test_falls_back_to_any_releasegroup_when_no_studio(monkeypatch):
    payload = {"recordings": [{"artist-credit": [{"name": "X"}], "releases": [
        {"date": "2010", "release-group": {"id": "live1", "title": "Live Thing",
                                           "primary-type": "Album", "secondary-types": ["Live"]}},
    ]}]}
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp(payload))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    out = mf.musicbrainz_best_album("X", "Y")
    assert out["album_title"] == "Live Thing"


def test_first_artist_credit_only(monkeypatch):
    payload = {"recordings": [{"artist-credit": [{"name": "SLVMLORD"}, {"name": "Travis Bradley"}],
                               "releases": [{"date": "2025",
                                             "release-group": {"id": "x", "title": "Album X",
                                                               "primary-type": "Album",
                                                               "secondary-types": []}}]}]}
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp(payload))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    out = mf.musicbrainz_best_album("SLVMLORD", "Under My Skin")
    assert out["artist"] == "SLVMLORD"


def test_prefers_own_artist_studio_over_various_artists(monkeypatch):
    # A studio-looking VA compilation dated earlier must NOT beat the artist's own album.
    payload = {"recordings": [{"artist-credit": [{"name": "Daft Punk"}], "releases": [
        {"date": "2001-10-26", "artist-credit": [{"name": "Various Artists"}],
         "release-group": {"id": "va-mbid", "title": "All The Hits Now",
                           "primary-type": "Album", "secondary-types": []}},
        {"date": "2002", "artist-credit": [{"name": "Daft Punk"}],
         "release-group": {"id": "48117b90-a16e-34ca-a514-19c702df1158", "title": "Discovery",
                           "primary-type": "Album", "secondary-types": []}},
    ]}]}
    monkeypatch.setattr(mf.requests, "get", lambda *a, **k: _FakeResp(payload))
    monkeypatch.setattr(mf.time, "sleep", lambda *_: None)
    out = mf.musicbrainz_best_album("Daft Punk", "Harder Better Faster Stronger")
    assert out["album_title"] == "Discovery"
    assert out["rg_mbid"] == "48117b90-a16e-34ca-a514-19c702df1158"
assert out["rg_mbid"] == "48117b90-a16e-34ca-a514-19c702df1158"