zebra 140bfef7c9 feat: yt-dlp cookie support + surface real failure reason; default workers 4

Bulk --repair on unauthenticated YouTube trips the bot-check (HTTP 429 "Sign
in to confirm you're not a bot"), after which every call fails until the IP
flag clears. Add cookie support so authenticated requests bypass it:

- --cookies FILE / --cookies-from-browser BROWSER (and $YTDLP_COOKIES /
  $YTDLP_COOKIES_FROM_BROWSER for the API container), threaded into every
  yt-dlp invocation (search, probe, download, repair metadata fetch).
- run_yt_dlp_get_metadata now logs yt-dlp's last stderr line (the actual 429 /
  bot-check / network reason) instead of a bare exit code.
- Default --repair workers lowered 8 -> 4 (4 is safe without cookies; raise it when cookies are supplied).
- compose: optional YTDLP_COOKIES env + commented cookies mount.
- README: how to obtain cookies (Chrome/Firefox, browser-read vs cookies.txt
  export); gitignore cookies.txt.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-13 11:25:39 -07:00

README.md

🎵 MusicFetch

MusicFetch is a smart command-line utility that finds music by searching Lidarr (your music collection manager) and YouTube Music at the same time, shows you the top hits in an interactive picker, and downloads/queues whatever you choose. It accepts:

A free-form query: an artist, an album, a track title, or combos like "Artist - Title" or "Artist - Album" (e.g. "ODESZA - Bloom", "Daft Punk", "Discovery").
A URL (e.g. "https://music.youtube.com/watch?v=..." or a regular YouTube URL).

Lidarr is tried first by default. If you pick a Lidarr album but no indexer release is available, MusicFetch automatically falls through to the top YouTube hit. YouTube downloads prefer YouTube Music URLs so album art and tags are correct.
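
The Lidarr-first fallthrough described above can be sketched roughly as follows. This is a minimal illustration, not MusicFetch's actual code: `fetch_with_fallback` and both callables are hypothetical stand-ins, and the sketch assumes the Lidarr step returns `None` when no indexer release is available.

```python
from typing import Callable, Optional


def fetch_with_fallback(
    album: str,
    grab_from_lidarr: Callable[[str], Optional[str]],
    download_top_youtube_hit: Callable[[str], str],
) -> str:
    """Try Lidarr first; fall through to YouTube when no release is found.

    Both callables are hypothetical: grab_from_lidarr returns a release
    identifier, or None when no indexer release is available.
    """
    release = grab_from_lidarr(album)
    if release is not None:
        return f"lidarr:{release}"
    # No indexer release: fall through to the top YouTube hit.
    return f"youtube:{download_top_youtube_hit(album)}"


# Usage with stub backends: Lidarr finds nothing, so YouTube is used.
print(fetch_with_fallback("Bloom", lambda a: None, lambda a: "top-hit"))
```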

🚀 Features

One unified picker showing the top hits from Lidarr and YouTube together, with matching keywords bolded.
Lidarr-first flow: pick an album → adds artist+album (monitored) → interactive indexer search → falls through to YouTube only if no release is found.
Accurate YouTube metadata via ytmusicapi (real artist / album / year / album art), with yt-dlp scraping as a fallback.
Explicit tag overrides on download so files are tagged from the chosen hit, not from scraped titles.
Non-interactive, YouTube-first, dry-run, quality, limit, and source-restriction flags.

📦 Dependencies & Installation

🐍 Python

Python 3.10+
requests
ytmusicapi (recommended — accurate YouTube Music metadata)
rich (recommended — nicer picker table + bold keyword matching)

pip install requests ytmusicapi rich

Note: if you hit ModuleNotFoundError: No module named 'idna' from requests, repair it with:
pip install --force-reinstall idna requests

ytmusicapi and rich are optional — without them MusicFetch falls back to yt-dlp search scraping and a plain ANSI picker.
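
One common way to implement this kind of optional-dependency fallback is to probe for the package before choosing a backend. The sketch below is an assumption about how such a check could look; `has_module` and `pick_backends` are hypothetical names, not functions from MusicFetch.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True when an optional top-level package is importable."""
    return importlib.util.find_spec(name) is not None


def pick_backends() -> dict:
    """Choose search and picker backends from what is installed."""
    return {
        # Prefer ytmusicapi metadata; otherwise fall back to yt-dlp scraping.
        "search": "ytmusicapi" if has_module("ytmusicapi") else "yt-dlp",
        # Prefer the rich table picker; otherwise a plain ANSI picker.
        "picker": "rich" if has_module("rich") else "ansi",
    }


print(pick_backends())
```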

📼 External Tools

yt-dlp (audio download/extraction) and ffmpeg (required by yt-dlp's own -x audio extraction and by embedding; unrelated to MusicFetch's -x/--exclude flag).

pip install -U yt-dlp
sudo apt install ffmpeg      # or your distro's equivalent

⚙️ Configuration

Set via environment variables:

| Variable | Default | Purpose |
| --- | --- | --- |
| `LIDARR_API_KEY` | (required for Lidarr) | Lidarr API key. |
| `LIDARR_URL` | `http://localhost:8686` | Lidarr base URL. |
| `MUSICFETCH_ROOT` | `/media/music` | Default output root folder. |

export LIDARR_API_KEY="your-lidarr-api-key"
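
For illustration, the three variables and their documented defaults could be read like this. `Config` and `load_config` are hypothetical names introduced for the sketch; only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    lidarr_api_key: Optional[str]  # None means Lidarr cannot be used
    lidarr_url: str
    root: str


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Read settings from the environment, applying the documented defaults."""
    return Config(
        lidarr_api_key=env.get("LIDARR_API_KEY") or None,
        # Strip a trailing slash so paths can be appended safely (assumption).
        lidarr_url=env.get("LIDARR_URL", "http://localhost:8686").rstrip("/"),
        root=env.get("MUSICFETCH_ROOT", "/media/music"),
    )


print(load_config({}))
```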

🧑‍💻 Usage

./musicfetch [OPTIONS] QUERY...

Options

| Flag | Description |
| --- | --- |
| `-n`, `--noninteractive` | Auto-pick the top hit (no prompt). |
| `-s`, `--ytsearch` | Search/select YouTube first instead of Lidarr first. |
| `-d`, `--dry-run` | Show every action without executing it. |
| `-q`, `--quality {best,320,m4a,opus,flac}` | Audio quality/format (default `best`). |
| `--limit N` | Hits per source (default 10). |
| `--lidarr-only` | Skip YouTube. |
| `--yt-only` | Skip Lidarr. |
| `-o`, `--root PATH` | Output root folder (default `/media/music`). |
| `--search-all` | Search all albums when adding an artist to Lidarr. |
| `--repair` | Re-tag existing downloads under `--root` from source metadata (see below). |
| `--workers N` | Parallel metadata fetches during `--repair` (default 4). |
| `--cookies FILE` | yt-dlp `cookies.txt` for authenticated YouTube (avoids bot-check / rate limits). |
| `--cookies-from-browser BROWSER` | Load YouTube cookies from a local browser (e.g. `firefox`). |
| `--retag-from-path` | Offline: re-tag artist/title from folder + filename (see below). |
| `-x`, `--exclude NAME` | Folder under `--root` to skip during `--repair`/`--retag-from-path` (repeatable). |
| `--debug` | Verbose output. |
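
A subset of these flags could be declared with `argparse` along the following lines. This is a sketch under assumptions, not the script's real parser: `build_parser` is a hypothetical name, and treating `--lidarr-only` and `--yt-only` as mutually exclusive is a guess the source does not confirm.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare a subset of the documented options (illustrative only)."""
    p = argparse.ArgumentParser(prog="musicfetch")
    p.add_argument("query", nargs="*")
    p.add_argument("-n", "--noninteractive", action="store_true")
    p.add_argument("-d", "--dry-run", action="store_true")
    p.add_argument("-q", "--quality", default="best",
                   choices=["best", "320", "m4a", "opus", "flac"])
    p.add_argument("--limit", type=int, default=10)
    p.add_argument("--repair", action="store_true")
    p.add_argument("--workers", type=int, default=4)
    # Repeatable flag: each -x appends one folder name to the list.
    p.add_argument("-x", "--exclude", action="append", default=[], metavar="NAME")
    # Assumption: the two source-restriction flags cannot be combined.
    source = p.add_mutually_exclusive_group()
    source.add_argument("--lidarr-only", action="store_true")
    source.add_argument("--yt-only", action="store_true")
    return p


args = build_parser().parse_args(["--repair", "-x", "Unsorted", "-x", "playlists"])
print(args.exclude, args.workers)
```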

Examples

# Interactive: combined Lidarr + YouTube picker
./musicfetch "ODESZA - Bloom"

# Just an artist / just an album / just a title all work
./musicfetch "Daft Punk"
./musicfetch "Discovery"

# YouTube first, auto-pick top hit
./musicfetch -s -n "Daft Punk - Harder Better Faster Stronger"

# Dry run — see what would happen, change nothing
./musicfetch -d "ODESZA - Bloom"

# YouTube only, lossless preferred
./musicfetch --yt-only -q flac "Bonobo - Kerala"

# Download by URL (single track or playlist/set/album, any yt-dlp site)
./musicfetch "https://music.youtube.com/watch?v=xxxxxxxxxxx"
./musicfetch "https://soundcloud.com/artist/sets/my-mix"

🔧 Repair existing tags

--repair walks <root>/<artist>/<source>/ (the youtube/soundcloud/… download folders — Lidarr album folders are skipped), re-fetches authoritative metadata for each file using the [id] in its filename, and fixes tags. Useful when downloads landed with missing album or wrong year.
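
As a rough illustration of the "[id] in its filename" lookup, the helper below pulls the last bracketed token out of a filename. It assumes files are named `<title> [<id>].<ext>`; `source_id` is a hypothetical name and the real script may parse filenames differently.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed filename shape: "<title> [<id>].<ext>"; match the trailing [..] group.
_ID_RE = re.compile(r"\[([^\[\]]+)\]$")


def source_id(path: str) -> Optional[str]:
    """Return the bracketed id at the end of the file stem, or None."""
    match = _ID_RE.search(Path(path).stem)
    return match.group(1) if match else None


print(source_id("/media/music/Bonobo/youtube/Kerala [S0Q4gqBUs7c].opus"))
```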

It is deliberately conservative. It overwrites album and year (the usual breakage) and fills in artist/title only when they are missing or hold a known-bogus placeholder (NA, Unknown Album, Unknown Artist, left behind by older buggy tagging). It never overwrites a genuine existing artist/title with a channel name or decorated video title. A bogus NA [<id>].<ext> filename is renamed to the recovered title, and a literal NA album with no source album is normalised to Unknown Album.
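
The merge rule can be expressed as a small pure function. The sketch below is one reading of the paragraph above, with hypothetical names (`repaired_tags`, `BOGUS`); it covers tag values only and leaves out the file rename.

```python
from typing import Dict, Optional

# Placeholders treated as "no real value" (from the list above, plus empty).
BOGUS = {"", "NA", "Unknown Album", "Unknown Artist"}

Tags = Dict[str, Optional[str]]


def repaired_tags(existing: Tags, source: Tags) -> Tags:
    """Merge source metadata into existing tags, conservatively."""
    tags = dict(existing)
    # Album and year are overwritten whenever the source provides them.
    for key in ("album", "year"):
        if source.get(key):
            tags[key] = source[key]
    # Artist and title are filled only when missing or a bogus placeholder.
    for key in ("artist", "title"):
        current = (existing.get(key) or "").strip()
        if current in BOGUS and source.get(key):
            tags[key] = source[key]
    # A literal "NA" album with no source album becomes "Unknown Album".
    if (tags.get("album") or "").strip() == "NA":
        tags["album"] = "Unknown Album"
    return tags


print(repaired_tags(
    {"artist": "Bonobo", "title": "Kerala", "album": "NA"},
    {"artist": "BonoboVEVO", "title": "Bonobo - Kerala (Official Video)",
     "album": "Migration", "year": "2017"},
))
```

In this example the genuine artist and title survive, while album and year are replaced from the source.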

Each file costs one yt-dlp network round-trip, so repair runs them in a thread pool; --workers N (default 4) caps concurrency. Progress prints every 100 files. Requires mutagen (an optional yt-dlp dependency, usually already present). CLI-only: not exposed via the REST API.
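
A bounded thread pool with periodic progress output might look like the following. It is a sketch only: `fetch_all` and the `fetch` callable are hypothetical, and recording a failed file as `None` is an assumption about error handling.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Optional


def fetch_all(
    paths: Iterable[str],
    fetch: Callable[[str], dict],
    workers: int = 4,
    progress_every: int = 100,
) -> Dict[str, Optional[dict]]:
    """Run fetch(path) for every path with at most `workers` threads."""
    results: Dict[str, Optional[dict]] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fetch, p): p for p in paths}
        for done, future in enumerate(as_completed(futures), start=1):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # one bad file must not stop the run
                results[path] = None
                print(f"failed: {path}: {exc}")
            if done % progress_every == 0:
                print(f"{done}/{len(futures)} files processed")
    return results


# Usage with a stub fetcher standing in for the per-file yt-dlp call.
out = fetch_all([f"file{i}" for i in range(250)], lambda p: {"id": p})
print(len(out))
```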

Cookies (important for bulk repair). Unauthenticated YouTube requests get throttled fast — a large --repair (or even a --dry-run, which still fetches) will trip "Sign in to confirm you're not a bot" (HTTP 429) and every subsequent call fails until the IP-level flag clears. Pass authenticated cookies to avoid it:

./musicfetch --repair --cookies /path/cookies.txt -o /media/music        # exported cookies.txt
./musicfetch --repair --cookies-from-browser firefox -o /media/music     # or read from a browser

With cookies you can raise --workers; without them keep it low (≤4) and expect occasional throttling. Cookies also apply to normal fetches/downloads. The same can be set for the API container via $YTDLP_COOKIES / $YTDLP_COOKIES_FROM_BROWSER. If you do get flagged, stop: retrying extends the block. Wait roughly 30-60 minutes after a plain 429, or longer after a bot-check.
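
To show how cookie settings might be threaded into each yt-dlp call, here is a minimal sketch. The yt-dlp options `--cookies`, `--cookies-from-browser`, and `--dump-single-json` are real; the helper names and the precedence (CLI value over environment variable, file over browser) are assumptions. `failure_reason` mirrors the idea of reporting yt-dlp's last stderr line instead of a bare exit code.

```python
import os
from typing import List, Mapping, Optional


def cookie_args(
    cookies_file: Optional[str] = None,
    from_browser: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Build yt-dlp cookie options; CLI values win over env (assumption)."""
    if cookies_file:
        return ["--cookies", cookies_file]
    if from_browser:
        return ["--cookies-from-browser", from_browser]
    if env.get("YTDLP_COOKIES"):
        return ["--cookies", env["YTDLP_COOKIES"]]
    if env.get("YTDLP_COOKIES_FROM_BROWSER"):
        return ["--cookies-from-browser", env["YTDLP_COOKIES_FROM_BROWSER"]]
    return []


def metadata_command(url: str, **cookie_kwargs) -> List[str]:
    """Command line for a metadata-only yt-dlp call, cookies included."""
    return ["yt-dlp", *cookie_args(**cookie_kwargs), "--dump-single-json", url]


def failure_reason(stderr: str) -> str:
    """Return the last non-empty stderr line, usually the real error."""
    lines = [ln.strip() for ln in stderr.splitlines() if ln.strip()]
    return lines[-1] if lines else "unknown error"


print(metadata_command("https://music.youtube.com/watch?v=xxxxxxxxxxx",
                       from_browser="firefox", env={}))
```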

Getting YouTube cookies

⚠️ Use a throwaway / secondary Google account, not your main one — bulk automated requests can get the account flagged. You must be logged in to YouTube in the browser first.

Option A — read straight from the browser (simplest, host CLI only). --cookies-from-browser reads the browser's own cookie store, so there's nothing to export:

./musicfetch --repair --cookies-from-browser firefox -o /media/music
./musicfetch --repair --cookies-from-browser chrome  -o /media/music

Firefox: works while open; just be logged in to YouTube.
Chrome / Chromium / Brave / Edge: must be fully quit when you run this (Chrome locks its cookie DB, and newer versions encrypt it — close the browser entirely first). On Linux a running Chrome will usually fail with a "could not copy cookie database / locked" error.
Specify a profile if not the default, e.g. --cookies-from-browser "chrome:Profile 1".

This only works where the browser lives (your host), not inside the Docker container.

Option B — export a cookies.txt (works anywhere, incl. the container/server). Use a Netscape-format cookie exporter, then point --cookies / $YTDLP_COOKIES at the file:

1. Install a cookies exporter extension:
   - Firefox: "cookies.txt" (a.k.a. Export Cookies).
   - Chrome: "Get cookies.txt LOCALLY" (pick a LOCALLY-running one; avoid extensions that upload your cookies anywhere).
2. Log in to https://www.youtube.com, click the extension, Export → save cookies.txt.

3. Use it:

./musicfetch --repair --cookies ~/cookies.txt -o /media/music

For the API container, mount it and set the env var (see server/docker-compose.yml):

environment:
  YTDLP_COOKIES: "/cookies.txt"
volumes:
  - /host/path/cookies.txt:/cookies.txt:ro

Cookies expire — if YouTube starts rejecting them, re-export. Treat cookies.txt like a password (it is your logged-in session); keep it out of git (.gitignore it).

# Preview what would change (writes nothing)
./musicfetch --repair -d

# Apply fixes under a specific root
./musicfetch --repair -o /media/music

--retag-from-path is an offline companion: it derives artist and title purely from the folder name + filename (stripping (Official Video) / (Lyrics)-style decorations, and treating an Artist - Title filename correctly), with no network. Use it to undo bad tags — e.g. titles/artists clobbered by an earlier --repair on music videos. It overwrites artist/title and leaves album/year alone.

./musicfetch --retag-from-path -d            # preview
./musicfetch --retag-from-path -o /media/music

# Skip folders (e.g. hand-curated playlists you don't want re-tagged)
./musicfetch --repair -x Unsorted -x playlists
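
The offline --retag-from-path derivation described above can be approximated as below. This is a sketch under assumptions: the path layout is `<root>/<artist>/<source>/<file>`, the decoration patterns are a small illustrative set, and an `Artist - Title` filename is assumed to take priority over the folder name. `tags_from_path` is a hypothetical name.

```python
import re
from pathlib import Path
from typing import Tuple

# Trailing " [<id>]" token in the file stem.
_ID = re.compile(r"\s*\[[^\[\]]+\]$")
# A few common decorations, e.g. "(Official Video)" or "(Lyrics)".
_DECOR = re.compile(
    r"\s*[\(\[](?:official(?:\s+(?:music|lyric))?\s+(?:video|audio)"
    r"|lyrics?|audio|visuali[sz]er)[\)\]]",
    re.IGNORECASE,
)


def tags_from_path(path: str) -> Tuple[str, str]:
    """Derive (artist, title) from folder name and filename, no network."""
    p = Path(path)
    artist = p.parent.parent.name  # <root>/<artist>/<source>/<file>
    title = _DECOR.sub("", _ID.sub("", p.stem)).strip()
    if " - " in title:
        # "Artist - Title" filename: split once on the first separator.
        left, right = title.split(" - ", 1)
        artist, title = left.strip(), right.strip()
    return artist, title


print(tags_from_path(
    "/media/music/Bonobo/youtube/Bonobo - Kerala (Official Video) [S0Q4gqBUs7c].opus"
))
```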

📁 Output Structure

<root>/
├── Artist Name/
│   ├── Album Name/    (managed by Lidarr)
│   ├── youtube/       (YouTube / YouTube Music downloads)
│   ├── soundcloud/    (SoundCloud downloads)
│   └── <source>/      (one folder per yt-dlp source)
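
Given this layout, a repair or re-tag pass has to visit only the per-source folders and skip Lidarr album folders and excluded names. The walker below is a sketch: `SOURCE_DIRS`, `AUDIO_EXTS`, and `repair_candidates` are hypothetical, and how the real script recognises source folders is not stated in this README.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Iterator

# Assumptions: which folder names are yt-dlp sources, which files are audio.
SOURCE_DIRS = {"youtube", "soundcloud"}
AUDIO_EXTS = {".mp3", ".m4a", ".opus", ".flac"}


def repair_candidates(root: str, exclude: Iterable[str] = ()) -> Iterator[Path]:
    """Yield audio files under <root>/<artist>/<source>/, honouring excludes."""
    skip = set(exclude)
    for artist_dir in sorted(Path(root).iterdir()):
        if not artist_dir.is_dir() or artist_dir.name in skip:
            continue
        for source_dir in sorted(artist_dir.iterdir()):
            # Anything that is not a known source folder (e.g. a Lidarr
            # album folder) is skipped.
            if source_dir.is_dir() and source_dir.name in SOURCE_DIRS:
                for f in sorted(source_dir.iterdir()):
                    if f.suffix.lower() in AUDIO_EXTS:
                        yield f


# Usage: build a throwaway tree and list what a pass would visit.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("Bonobo/youtube/Kerala [id1].opus",
                "Bonobo/Migration/01 Kerala.flac",
                "Unsorted/youtube/x [id2].mp3"):
        f = Path(tmp, rel)
        f.parent.mkdir(parents=True, exist_ok=True)
        f.touch()
    found = [p.relative_to(tmp).as_posix()
             for p in repair_candidates(tmp, exclude=["Unsorted"])]
    print(found)
```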

❓ Troubleshooting

No Lidarr hits / "LIDARR_API_KEY not set": export your key and confirm LIDARR_URL is reachable.
Wrong album art from YouTube: install ytmusicapi so MusicFetch can resolve proper YouTube Music URLs and metadata.
yt-dlp errors: update with pip install -U yt-dlp (yt-dlp -U only self-updates standalone binaries, not pip installs); ensure ffmpeg is installed for extraction/embedding.
idna import error: pip install --force-reinstall idna requests.
Permission denied writing files: ensure the output root exists and is writable (-o/--root or MUSICFETCH_ROOT).

🌐 REST API (Docker)

Run MusicFetch as an authenticated HTTP service inside your Lidarr Docker stack. A client POSTs a query; the server grabs the top hit non-interactively and runs the download as a background job you can poll. Every response includes a human-readable message (handy for Siri).

Configure & run

Set the network name in server/docker-compose.yml to your existing Lidarr stack network, then:

export LIDARR_API_KEY="your-lidarr-key"
export MUSICFETCH_API_KEY="a-long-random-secret"
docker compose -f server/docker-compose.yml up -d --build

| Env var | Default | Purpose |
| --- | --- | --- |
| `MUSICFETCH_API_KEY` | (required) | Shared secret clients send as `X-API-Key`. |
| `MUSICFETCH_PORT` | `6769` | Listen port. |
| `LIDARR_URL` | `http://lidarr:8686` | Lidarr base URL (stack network). |
| `LIDARR_API_KEY` | (required for Lidarr) | Lidarr API key. |
| `MUSICFETCH_ROOT` | `/media/music` | Music output root (bind-mounted). |

TLS is expected to be handled by your upstream reverse proxy; the container serves plain HTTP on 6769.
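
A shared-secret check on the X-API-Key header is typically done with a constant-time comparison. The sketch below illustrates that idea only; `is_authorized` is a hypothetical helper, the server's real check is not shown in this README, and a plain dict stands in for the framework's header object.

```python
import hmac
from typing import Mapping


def is_authorized(headers: Mapping[str, str], expected_key: str) -> bool:
    """Compare the supplied X-API-Key with the configured secret."""
    supplied = headers.get("X-API-Key", "")
    # Refuse everything when no secret is configured; compare_digest avoids
    # leaking how many leading characters matched.
    return bool(expected_key) and hmac.compare_digest(
        supplied.encode(), expected_key.encode()
    )


print(is_authorized({"X-API-Key": "a-long-random-secret"}, "a-long-random-secret"))
```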

Endpoints

| Method | Path | Auth | Purpose |
| --- | --- | --- | --- |
| `GET` | `/health` | no | Liveness check. |
| `POST` | `/fetch?q=...` | yes | Grab top hit; returns a `job_id`. |
| `GET` | `/jobs/{id}` | yes | Poll job status. |

POST /fetch params: q (required), quality (best,320,m4a,opus,flac), source (auto,lidarr,youtube).

curl examples

# Kick off a fetch
curl -X POST 'https://mf.izebra.net/fetch?q=Under%20My%20Skin' \
  -H 'X-API-Key: a-long-random-secret'
# -> {"message":"Found 'Under My Skin' ... Downloading now.","job_id":"a1b2c3","status":"queued","hit":{...}}

# Poll the job
curl 'https://mf.izebra.net/jobs/a1b2c3' -H 'X-API-Key: a-long-random-secret'
# -> {"message":"Finished downloading ...","status":"done","result":{...}}
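
The same two calls can be scripted from Python with only the standard library. This is a client sketch under assumptions: `fetch_url` and `fetch_and_wait` are hypothetical helpers, the only statuses confirmed by the examples above are "queued" and "done" (so "running" is a guess), and the polling interval is arbitrary.

```python
import json
import time
import urllib.parse
import urllib.request

PENDING = {"queued", "running"}  # "running" is an assumed status name


def fetch_url(base: str, query: str, quality: str = "best", source: str = "auto") -> str:
    """Build the POST /fetch URL with percent-encoded parameters."""
    params = urllib.parse.urlencode(
        {"q": query, "quality": quality, "source": source},
        quote_via=urllib.parse.quote,
    )
    return f"{base.rstrip('/')}/fetch?{params}"


def _call(method: str, url: str, api_key: str) -> dict:
    """Send one authenticated request and decode the JSON response."""
    req = urllib.request.Request(url, method=method, headers={"X-API-Key": api_key})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def fetch_and_wait(base, api_key, query, poll_seconds=5.0, max_polls=60, call=_call):
    """Start a fetch job, then poll /jobs/{id} until it leaves PENDING."""
    job = call("POST", fetch_url(base, query), api_key)
    for _ in range(max_polls):
        status = call("GET", f"{base.rstrip('/')}/jobs/{job['job_id']}", api_key)
        if status.get("status") not in PENDING:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job['job_id']} still pending after {max_polls} polls")


# Example (performs real network calls, so it is left commented out):
# result = fetch_and_wait("https://mf.izebra.net", "a-long-random-secret", "Under My Skin")
# print(result["message"])
```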

🗣️ Siri Shortcuts integration

Make a shortcut that fetches music by voice ("Hey Siri, fetch music").

1. Shortcuts app → New Shortcut.
2. Add Ask for Input → Input Type Text, prompt "What should I fetch?". (Or use Dictate Text for fully spoken input.)
3. Add Text action, set it to: https://mf.izebra.net/fetch?q= then insert the Provided Input variable at the end. (Shortcuts URL-encodes query variables automatically.)
4. Add Get Contents of URL:
   - URL: the Text variable from step 3.
   - Method: POST.
   - Headers: add one with key X-API-Key and value your MUSICFETCH_API_KEY.
   - Request Body: leave as is (the query is in the URL).
5. Add Get Dictionary Value → Get Value for message in Contents of URL.
6. Add Speak Text → the Dictionary Value. Siri reads back "Found '…' … Downloading now."
7. (Optional) To confirm completion: add Get Dictionary Value for job_id, Wait ~20 seconds, Get Contents of URL on https://mf.izebra.net/jobs/<job_id> (same X-API-Key header), then Get Dictionary Value message → Speak Text again.

Rename the shortcut (e.g. "Fetch Music") — that phrase becomes the Siri trigger.

🛠️ Contributing

PRs welcome. This script is middleware around Lidarr + yt-dlp, not a Lidarr replacement. Keep it a single bash-friendly executable.

📜 License

GPL v3.0