🎵 MusicFetch

MusicFetch is a smart command-line utility that finds music by searching Lidarr (your music collection manager) and YouTube Music at the same time, shows you the top hits in an interactive picker, and downloads/queues whatever you choose. It accepts:

  • A free-form query: an artist, an album, a track title, or combos like "Artist - Title" or "Artist - Album" (e.g. "ODESZA - Bloom", "Daft Punk", "Discovery").
  • A URL (e.g. "https://music.youtube.com/watch?v=..." or a regular YouTube URL).

Lidarr is tried first by default. If you pick a Lidarr album but no indexer release is available, MusicFetch automatically falls through to the top YouTube hit. YouTube downloads prefer YouTube Music URLs so album art and tags are correct.
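The fall-through rule above can be sketched as a small decision function. This is only an illustration, not MusicFetch's actual code: `choose_action`, the hit dictionaries, and the `has_release` flag are hypothetical names introduced here.

```python
from typing import Optional, Tuple


def choose_action(pick: dict, has_release: bool,
                  top_youtube_hit: Optional[dict]) -> Tuple[str, Optional[dict]]:
    """Decide where a picked hit should be fetched from.

    `pick` is the hit the user chose; its "source" key is assumed to be
    either "lidarr" or "youtube" (hypothetical shape).
    """
    if pick["source"] != "lidarr":
        # A YouTube pick is downloaded directly.
        return ("youtube", pick)
    if has_release:
        # Lidarr found an indexer release, so Lidarr handles it.
        return ("lidarr", pick)
    if top_youtube_hit is not None:
        # No release available: fall through to the top YouTube hit.
        return ("youtube", top_youtube_hit)
    return ("none", None)
```

Keeping the decision separate from the network calls makes the fallback easy to exercise without a running Lidarr instance.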


🚀 Features

  • One unified picker showing the top hits from Lidarr and YouTube together, with matching keywords bolded.
  • Lidarr-first flow: pick an album → adds artist+album (monitored) → interactive indexer search → falls through to YouTube only if no release is found.
  • Accurate YouTube metadata via ytmusicapi (real artist / album / year / album art), with yt-dlp scraping as a fallback.
  • Explicit tag overrides on download so files are tagged from the chosen hit, not from scraped titles.
  • Non-interactive, YouTube-first, dry-run, quality, limit, and source-restriction flags.

📦 Dependencies & Installation

🐍 Python

  • Python 3.10+
  • requests
  • ytmusicapi (recommended: accurate YouTube Music metadata)
  • rich (recommended: nicer picker table + bold keyword matching)

pip install requests ytmusicapi rich

Note: if you hit ModuleNotFoundError: No module named 'idna' from requests, repair it with:

pip install --force-reinstall idna requests

ytmusicapi and rich are optional — without them MusicFetch falls back to yt-dlp search scraping and a plain ANSI picker.
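One common way to implement this kind of optional-dependency fallback is to probe for the module before choosing a backend. The sketch below assumes nothing about MusicFetch's internals; `has_module`, `pick_backends`, and the backend labels are hypothetical.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_backends() -> dict:
    # Backend labels are illustrative; the real script may name them differently.
    return {
        "search": "ytmusicapi" if has_module("ytmusicapi") else "yt-dlp",
        "picker": "rich" if has_module("rich") else "ansi",
    }


if __name__ == "__main__":
    print(pick_backends())
```

Running the file directly prints which backends would be selected in the current environment.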

📼 External Tools

  • yt-dlp (audio download/extraction) and ffmpeg (used by yt-dlp's own -x audio extraction and for embedding; unrelated to MusicFetch's -x/--exclude flag).

pip install -U yt-dlp
sudo apt install ffmpeg      # or your distro's equivalent

⚙️ Configuration

Set via environment variables:

| Variable | Default | Purpose |
|---|---|---|
| LIDARR_API_KEY | (required for Lidarr) | Lidarr API key. |
| LIDARR_URL | http://localhost:8686 | Lidarr base URL. |
| MUSICFETCH_ROOT | /media/music | Default output root folder. |

export LIDARR_API_KEY="your-lidarr-api-key"
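For reference, reading the variables above with their documented defaults might look roughly like this. `Config` and `load_config` are hypothetical names, and stripping a trailing slash from the URL is an assumption made here for convenience.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    lidarr_api_key: Optional[str]  # None means Lidarr cannot be used
    lidarr_url: str
    root: str


def load_config(env: Optional[Mapping[str, str]] = None) -> Config:
    """Build a Config from environment variables, applying the documented defaults."""
    env = os.environ if env is None else env
    return Config(
        lidarr_api_key=env.get("LIDARR_API_KEY") or None,
        # Trailing-slash normalisation is an assumption, not documented behaviour.
        lidarr_url=env.get("LIDARR_URL", "http://localhost:8686").rstrip("/"),
        root=env.get("MUSICFETCH_ROOT", "/media/music"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check in isolation.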

🧑‍💻 Usage

./musicfetch [OPTIONS] QUERY...

Options

| Flag | Description |
|---|---|
| -n, --noninteractive | Auto-pick the top hit (no prompt). |
| -s, --ytsearch | Search/select YouTube first instead of Lidarr first. |
| -d, --dry-run | Show every action without executing it. |
| -q, --quality {best,320,m4a,opus,flac} | Audio quality/format (default best). |
| --limit N | Hits per source (default 10). |
| --lidarr-only | Skip YouTube. |
| --yt-only | Skip Lidarr. |
| -o, --root PATH | Output root folder (default /media/music). |
| --search-all | Search all albums when adding an artist to Lidarr. |
| --repair | Re-tag existing downloads under --root from source metadata (see below). |
| --workers N | Parallel metadata fetches during --repair (default 4). |
| --cookies FILE | yt-dlp cookies.txt for authenticated YouTube (avoids bot-check / rate limits). |
| --cookies-from-browser BROWSER | Load YouTube cookies from a local browser (e.g. firefox). |
| --retag-from-path | Offline: re-tag artist/title from folder + filename (see below). |
| -x, --exclude NAME | Folder under --root to skip during --repair/--retag-from-path (repeatable). |
| --debug | Verbose output. |

Examples

# Interactive: combined Lidarr + YouTube picker
./musicfetch "ODESZA - Bloom"

# Just an artist / just an album / just a title all work
./musicfetch "Daft Punk"
./musicfetch "Discovery"

# YouTube first, auto-pick top hit
./musicfetch -s -n "Daft Punk - Harder Better Faster Stronger"

# Dry run — see what would happen, change nothing
./musicfetch -d "ODESZA - Bloom"

# YouTube only, lossless preferred
./musicfetch --yt-only -q flac "Bonobo - Kerala"

# Download by URL (single track or playlist/set/album, any yt-dlp site)
./musicfetch "https://music.youtube.com/watch?v=xxxxxxxxxxx"
./musicfetch "https://soundcloud.com/artist/sets/my-mix"

🔧 Repair existing tags

--repair walks <root>/<artist>/<source>/ (the youtube/soundcloud/… download folders — Lidarr album folders are skipped), re-fetches authoritative metadata for each file using the [id] in its filename, and fixes tags. Useful when downloads landed with missing album or wrong year.
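As a rough illustration of the "[id] in its filename" step, the sketch below pulls a trailing bracketed id out of a file name. It assumes files are named like `<title> [<id>].<ext>`; the exact pattern MusicFetch uses is not stated here, and `source_id` is a hypothetical helper.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming: "<title> [<id>].<ext>"; the id is the last bracketed group.
ID_RE = re.compile(r"\[([^\[\]]+)\]$")


def source_id(path: str) -> Optional[str]:
    """Return the bracketed id at the end of the file stem, or None if absent."""
    match = ID_RE.search(Path(path).stem)
    return match.group(1) if match else None
```

Files without a trailing `[id]` return `None`, which a repair pass could simply skip.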

It is deliberately conservative: it overwrites album and year (the usual breakage), and fills in artist/title when they are missing or a known-bogus placeholder (NA, Unknown Album, Unknown Artist — left behind by older buggy tagging) — but it never overwrites a genuine existing artist/title with a channel name or decorated video title. A bogus NA [<id>].<ext> filename is renamed to the recovered title, and a literal NA album with no source album is normalised to Unknown Album.
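The merge rules described above can be expressed as a small pure function. This is a sketch under stated assumptions, not the project's implementation: `merge_tags`, `is_bogus`, the tag dictionaries, and the case-insensitive placeholder comparison are all introduced here for illustration.

```python
# Placeholders treated as "not a real value" (compared case-insensitively here,
# which is an assumption).
PLACEHOLDERS = {"", "na", "unknown album", "unknown artist"}


def is_bogus(value) -> bool:
    """True for missing values and known placeholder strings."""
    return value is None or str(value).strip().lower() in PLACEHOLDERS


def merge_tags(existing: dict, source: dict) -> dict:
    """Combine existing file tags with freshly fetched source metadata."""
    merged = dict(existing)
    # Album and year: the source wins whenever it provides a value.
    for key in ("album", "year"):
        if source.get(key):
            merged[key] = source[key]
    # Artist and title: only fill gaps or placeholders, never replace real values.
    for key in ("artist", "title"):
        if is_bogus(existing.get(key)) and source.get(key):
            merged[key] = source[key]
    # A literal "NA" album with no source album is normalised.
    if merged.get("album") == "NA" and not source.get("album"):
        merged["album"] = "Unknown Album"
    return merged
```

With this shape, a genuine artist such as "Bonobo" survives even when the source reports a channel name, while album and year are still corrected.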

Each file needs its own yt-dlp network round-trip, so repair runs them in a thread pool; --workers N (default 4) caps concurrency. Progress prints every 100 files. Requires mutagen (an optional yt-dlp dependency, usually already present). This mode is CLI-only; it is not exposed via the REST API.
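A bounded thread pool with periodic progress output, as described above, could be structured along these lines. `repair_all` and its parameters are hypothetical; `fetch_metadata` stands in for whatever performs the per-file yt-dlp call.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def repair_all(paths, fetch_metadata, workers=4, report_every=100, log=print):
    """Fetch metadata for each path concurrently.

    Returns (results, failures): two dicts keyed by path. A failure in one
    file is recorded and does not stop the rest of the run.
    """
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fetch_metadata, p): p for p in paths}
        for done, future in enumerate(as_completed(futures), start=1):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # keep going; report the reason later
                failures[path] = str(exc)
            if done % report_every == 0:
                log(f"{done}/{len(futures)} files processed")
    return results, failures
```

Because the work is network-bound, threads are sufficient here; `workers` directly limits how many requests are in flight at once.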

Cookies (important for bulk repair). Unauthenticated YouTube requests get throttled fast — a large --repair (or even a --dry-run, which still fetches) will trip "Sign in to confirm you're not a bot" (HTTP 429) and every subsequent call fails until the IP-level flag clears. Pass authenticated cookies to avoid it:

./musicfetch --repair --cookies /path/cookies.txt -o /media/music        # exported cookies.txt
./musicfetch --repair --cookies-from-browser firefox -o /media/music     # or read from a browser

With cookies you can raise --workers; without them keep it low (≤4) and expect occasional throttling. Cookies also apply to normal fetches/downloads. The same can be set for the API container via $YTDLP_COOKIES / $YTDLP_COOKIES_FROM_BROWSER. If you do get flagged, stop: retrying extends the block. Wait roughly 30-60 minutes for a plain 429, and longer if the bot-check message appears.
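To show how the cookie options might be threaded into a yt-dlp command line, here is a minimal sketch. The yt-dlp flags used (`--cookies`, `--cookies-from-browser`, `--dump-single-json`, `--skip-download`) are real; `cookie_args`, `metadata_command`, and the precedence order (explicit flag over environment, file over browser) are assumptions for illustration.

```python
import os


def cookie_args(cookies=None, from_browser=None, env=None):
    """Return the yt-dlp cookie arguments, or an empty list if none are set."""
    env = os.environ if env is None else env
    # Assumed precedence: explicit argument first, then the environment variable.
    cookies = cookies or env.get("YTDLP_COOKIES")
    from_browser = from_browser or env.get("YTDLP_COOKIES_FROM_BROWSER")
    if cookies:
        return ["--cookies", cookies]
    if from_browser:
        return ["--cookies-from-browser", from_browser]
    return []


def metadata_command(url, **cookie_options):
    """Build an argv list that asks yt-dlp for metadata only (no download)."""
    return ["yt-dlp", *cookie_args(**cookie_options),
            "--dump-single-json", "--skip-download", url]
```

The resulting list can be handed to `subprocess.run` as-is, which avoids shell quoting problems with paths that contain spaces.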

Getting YouTube cookies

⚠️ Use a throwaway / secondary Google account, not your main one — bulk automated requests can get the account flagged. You must be logged in to YouTube in the browser first.

Option A — read straight from the browser (simplest, host CLI only). --cookies-from-browser reads the browser's own cookie store, so there's nothing to export:

./musicfetch --repair --cookies-from-browser firefox -o /media/music
./musicfetch --repair --cookies-from-browser chrome  -o /media/music

  • Firefox: works while open; just be logged in to YouTube.
  • Chrome / Chromium / Brave / Edge: must be fully quit when you run this (Chrome locks its cookie DB, and newer versions encrypt it — close the browser entirely first). On Linux a running Chrome will usually fail with a "could not copy cookie database / locked" error.
  • Specify a profile if not the default, e.g. --cookies-from-browser "chrome:Profile 1".

This only works where the browser lives (your host), not inside the Docker container.

Option B — export a cookies.txt (works anywhere, incl. the container/server). Use a Netscape-format cookie exporter, then point --cookies / $YTDLP_COOKIES at the file:

  1. Install a cookies exporter extension:

    • Firefox: "cookies.txt" (a.k.a. Export Cookies).
    • Chrome: "Get cookies.txt LOCALLY" (pick a LOCALLY-running one — avoid extensions that upload your cookies anywhere).
  2. Log in to https://www.youtube.com, click the extension, Export → save cookies.txt.

  3. Use it:

    ./musicfetch --repair --cookies ~/cookies.txt -o /media/music
    

    For the API container, mount it and set the env var (see server/docker-compose.yml):

    environment:
      YTDLP_COOKIES: "/cookies.txt"
    volumes:
      - /host/path/cookies.txt:/cookies.txt:ro
    

Cookies expire — if YouTube starts rejecting them, re-export. Treat cookies.txt like a password (it is your logged-in session); keep it out of git (.gitignore it).

# Preview what would change (writes nothing)
./musicfetch --repair -d

# Apply fixes under a specific root
./musicfetch --repair -o /media/music

--retag-from-path is an offline companion: it derives artist and title purely from the folder name + filename (stripping (Official Video) / (Lyrics)-style decorations, and treating an Artist - Title filename correctly), with no network. Use it to undo bad tags — e.g. titles/artists clobbered by an earlier --repair on music videos. It overwrites artist/title and leaves album/year alone.

./musicfetch --retag-from-path -d            # preview
./musicfetch --retag-from-path -o /media/music
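The path-based derivation can be approximated as follows. This is a sketch, not the script's actual logic: `tags_from_path`, the decoration keyword list, the `<root>/<artist>/<source>/<file>` layout, and the rule that an "Artist - Title" filename takes priority over the folder name are all assumptions made for this example.

```python
import re
from pathlib import Path

# Bracketed decorations such as "(Official Video)" or "(Lyrics)"; keyword list is illustrative.
DECORATION_RE = re.compile(
    r"\s*[\(\[][^\)\]]*\b(?:official|lyrics?|video|audio|visuali[sz]er)\b[^\)\]]*[\)\]]",
    re.IGNORECASE,
)
# Trailing "[id]" left by the downloader.
TRAILING_ID_RE = re.compile(r"\s*\[[^\[\]]+\]$")


def tags_from_path(path: str) -> dict:
    """Derive artist/title from <root>/<artist>/<source>/<file> with no network."""
    p = Path(path)
    folder_artist = p.parent.parent.name
    name = TRAILING_ID_RE.sub("", p.stem)
    name = DECORATION_RE.sub("", name).strip()
    if " - " in name:
        # Assumed: an "Artist - Title" filename wins over the folder name.
        artist, title = (part.strip() for part in name.split(" - ", 1))
    else:
        artist, title = folder_artist, name
    return {"artist": artist, "title": title}
```

Parentheticals that carry real information, such as "(feat. Someone)", contain none of the keywords and are left in the title.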

# Skip folders (e.g. hand-curated playlists you don't want re-tagged)
./musicfetch --repair -x Unsorted -x playlists

📁 Output Structure

<root>/
├── Artist Name/
│   ├── Album Name/    (managed by Lidarr)
│   ├── youtube/       (YouTube / YouTube Music downloads)
│   ├── soundcloud/    (SoundCloud downloads)
│   └── <source>/      (one folder per yt-dlp source)

Troubleshooting

  • No Lidarr hits / "LIDARR_API_KEY not set": export your key and confirm LIDARR_URL is reachable.
  • Wrong album art from YouTube: install ytmusicapi so MusicFetch can resolve proper YouTube Music URLs and metadata.
  • yt-dlp errors: update with yt-dlp -U; ensure ffmpeg is installed for extraction/embedding.
  • idna import error: pip install --force-reinstall idna requests.
  • Permission denied writing files: ensure the output root exists and is writable (-o/--root or MUSICFETCH_ROOT).

🌐 REST API (Docker)

Run MusicFetch as an authenticated HTTP service inside your Lidarr Docker stack. A client POSTs a query; the server grabs the top hit non-interactively and runs the download as a background job you can poll. Every response includes a human-readable message (handy for Siri).

Configure & run

Set the network name in server/docker-compose.yml to your existing Lidarr stack network, then:

export LIDARR_API_KEY="your-lidarr-key"
export MUSICFETCH_API_KEY="a-long-random-secret"
docker compose -f server/docker-compose.yml up -d --build

| Env var | Default | Purpose |
|---|---|---|
| MUSICFETCH_API_KEY | (required) | Shared secret clients send as X-API-Key. |
| MUSICFETCH_PORT | 6769 | Listen port. |
| LIDARR_URL | http://lidarr:8686 | Lidarr base URL (stack network). |
| LIDARR_API_KEY | (required for Lidarr) | Lidarr API key. |
| MUSICFETCH_ROOT | /media/music | Music output root (bind-mounted). |

TLS is expected to be handled by your upstream reverse proxy; the container serves plain HTTP on 6769.

Endpoints

| Method | Path | Auth | Purpose |
|---|---|---|---|
| GET | /health | no | Liveness check. |
| POST | /fetch?q=... | yes | Grab top hit; returns a job_id. |
| GET | /jobs/{id} | yes | Poll job status. |

POST /fetch params: q (required), quality (best,320,m4a,opus,flac), source (auto,lidarr,youtube).

curl examples

# Kick off a fetch
curl -X POST 'https://mf.izebra.net/fetch?q=Under%20My%20Skin' \
  -H 'X-API-Key: a-long-random-secret'
# -> {"message":"Found 'Under My Skin' ... Downloading now.","job_id":"a1b2c3","status":"queued","hit":{...}}

# Poll the job
curl 'https://mf.izebra.net/jobs/a1b2c3' -H 'X-API-Key: a-long-random-secret'
# -> {"message":"Finished downloading ...","status":"done","result":{...}}
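The same fetch-then-poll flow can be scripted with the Python standard library. This is a sketch only: `build_request`, `call`, and `fetch_and_wait` are hypothetical helpers, and while "queued" and "done" appear in the responses above, treating "running" as an in-progress status is an assumption.

```python
import json
import time
import urllib.parse
import urllib.request

# "queued" is shown in the examples above; "running" is an assumed intermediate state.
PENDING = {"queued", "running"}


def build_request(base_url, path, api_key, params=None, method="GET"):
    """Create an authenticated request; query values are percent-encoded."""
    url = base_url.rstrip("/") + path
    if params:
        url += "?" + urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
    return urllib.request.Request(url, method=method, headers={"X-API-Key": api_key})


def call(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def fetch_and_wait(base_url, api_key, query, poll_seconds=5, max_polls=60):
    """POST /fetch, then poll /jobs/{id} until the job leaves the pending states."""
    job = call(build_request(base_url, "/fetch", api_key, {"q": query}, method="POST"))
    for _ in range(max_polls):
        status = call(build_request(base_url, f"/jobs/{job['job_id']}", api_key))
        if status.get("status") not in PENDING:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job['job_id']} still pending after {max_polls} polls")
```

For example, `fetch_and_wait("https://mf.izebra.net", "a-long-random-secret", "Under My Skin")` would return the final job dictionary, whose `message` field is the human-readable summary.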

🗣️ Siri Shortcuts integration

Make a shortcut that fetches music by voice ("Hey Siri, fetch music").

  1. Shortcuts app → New Shortcut.
  2. Add Ask for Input → Input Type Text, prompt "What should I fetch?". (Or use Dictate Text for fully spoken input.)
  3. Add Text action, set it to: https://mf.izebra.net/fetch?q= then insert the Provided Input variable at the end. (Shortcuts URL-encodes query variables automatically.)
  4. Add Get Contents of URL:
    • URL: the Text variable from step 3.
    • Method: POST.
    • Headers: add one — key X-API-Key, value your MUSICFETCH_API_KEY.
    • Request Body: leave as is (the query is in the URL).
  5. Add Get Dictionary Value → Get Value for message in Contents of URL.
  6. Add Speak Text → the Dictionary Value. Siri reads back "Found '…' … Downloading now."
  7. (Optional) To confirm completion: add Get Dictionary Value for job_id, Wait ~20 seconds, Get Contents of URL on https://mf.izebra.net/jobs/<job_id> (same X-API-Key header), then Get Dictionary Value for message → Speak Text again.

Rename the shortcut (e.g. "Fetch Music") — that phrase becomes the Siri trigger.


🛠️ Contributing

PRs welcome. This script is middleware around Lidarr + yt-dlp, not a Lidarr replacement. Keep it a single bash-friendly executable.


📜 License

GPL-3.0
