zebra 140bfef7c9 feat: yt-dlp cookie support + surface real failure reason; default workers 4
Bulk --repair on unauthenticated YouTube trips the bot-check (HTTP 429 "Sign
in to confirm you're not a bot"), after which every call fails until the IP
flag clears. Add cookie support so authenticated requests bypass it:

- --cookies FILE / --cookies-from-browser BROWSER (and $YTDLP_COOKIES /
  $YTDLP_COOKIES_FROM_BROWSER for the API container), threaded into every
  yt-dlp invocation (search, probe, download, repair metadata fetch).
- run_yt_dlp_get_metadata now logs yt-dlp's last stderr line (the actual 429 /
  bot-check / network reason) instead of a bare exit code.
- Default --repair workers lowered 8 -> 4 (safe without cookies; raise with).
- compose: optional YTDLP_COOKIES env + commented cookies mount.
- README: how to obtain cookies (Chrome/Firefox, browser-read vs cookies.txt
  export); gitignore cookies.txt.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 11:25:39 -07:00


# 🎵 MusicFetch
**MusicFetch** is a smart command-line utility that finds music by searching
**Lidarr** (your music collection manager) and **YouTube Music** at the same
time, shows you the top hits in an interactive picker, and downloads/queues
whatever you choose. It accepts:
- A **free-form query**: an artist, an album, a track title, or combos like
`"Artist - Title"` or `"Artist - Album"` (e.g. `"ODESZA - Bloom"`, `"Daft Punk"`, `"Discovery"`).
- A **URL** (e.g. `"https://music.youtube.com/watch?v=..."` or a regular YouTube URL).
Lidarr is tried first by default. If you pick a Lidarr album but **no indexer
release is available**, MusicFetch automatically falls through to the top
YouTube hit. YouTube downloads prefer **YouTube Music URLs** so album art and
tags are correct.
---
## 🚀 Features
- One unified picker showing the top hits from **Lidarr and YouTube together**, with matching keywords **bolded**.
- Lidarr-first flow: pick an album → adds artist+album (monitored) → interactive indexer search → falls through to YouTube only if no release is found.
- Accurate YouTube metadata via `ytmusicapi` (real artist / album / year / album art), with `yt-dlp` scraping as a fallback.
- Explicit tag overrides on download so files are tagged from the chosen hit, not from scraped titles.
- Non-interactive, YouTube-first, dry-run, quality, limit, and source-restriction flags.
---
## 📦 Dependencies & Installation
### 🐍 Python
- Python 3.10+
- `requests`
- `ytmusicapi` (recommended — accurate YouTube Music metadata)
- `rich` (recommended — nicer picker table + bold keyword matching)
```bash
pip install requests ytmusicapi rich
```
> **Note:** if you hit `ModuleNotFoundError: No module named 'idna'` from
> `requests`, repair it with:
> ```bash
> pip install --force-reinstall idna requests
> ```
`ytmusicapi` and `rich` are optional — without them MusicFetch falls back to
`yt-dlp` search scraping and a plain ANSI picker.
### 📼 External Tools
- `yt-dlp` (audio download/extraction) and `ffmpeg` (required for yt-dlp's own `-x` audio extraction and for embedding; this is unrelated to MusicFetch's `-x`/`--exclude` flag).
```bash
pip install -U yt-dlp
sudo apt install ffmpeg # or your distro's equivalent
```
---
## ⚙️ Configuration
Set via environment variables:
| Variable | Default | Purpose |
|-------------------|-------------------------|----------------------------------|
| `LIDARR_API_KEY` | *(required for Lidarr)* | Lidarr API key. |
| `LIDARR_URL` | `http://localhost:8686` | Lidarr base URL. |
| `MUSICFETCH_ROOT` | `/media/music` | Default output root folder. |
```bash
export LIDARR_API_KEY="your-lidarr-api-key"
```
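As a rough illustration of how the defaults in the table resolve, here is a minimal Python sketch. The `Config` class and `load_config` function are hypothetical names used only for this example, not MusicFetch's actual internals.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    lidarr_api_key: Optional[str]  # None means Lidarr lookups cannot run
    lidarr_url: str
    root: str


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Resolve settings from the environment, using the documented defaults."""
    return Config(
        lidarr_api_key=env.get("LIDARR_API_KEY") or None,
        # Trailing slash is stripped so paths can be appended safely (an assumption).
        lidarr_url=env.get("LIDARR_URL", "http://localhost:8686").rstrip("/"),
        root=env.get("MUSICFETCH_ROOT", "/media/music"),
    )
```

With nothing exported, `load_config({})` yields the table's defaults and a missing API key.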
---
## 🧑‍💻 Usage
```bash
./musicfetch [OPTIONS] QUERY...
```
### Options
| Flag | Description |
|------|-------------|
| `-n`, `--noninteractive` | Auto-pick the top hit (no prompt). |
| `-s`, `--ytsearch` | Search/select YouTube first instead of Lidarr first. |
| `-d`, `--dry-run` | Show every action without executing it. |
| `-q`, `--quality {best,320,m4a,opus,flac}` | Audio quality/format (default `best`). |
| `--limit N` | Hits per source (default 10). |
| `--lidarr-only` | Skip YouTube. |
| `--yt-only` | Skip Lidarr. |
| `-o`, `--root PATH` | Output root folder (default `/media/music`). |
| `--search-all` | Search all albums when adding an artist to Lidarr. |
| `--repair` | Re-tag existing downloads under `--root` from source metadata (see below). |
| `--workers N` | Parallel metadata fetches during `--repair` (default 4). |
| `--cookies FILE` | yt-dlp `cookies.txt` for authenticated YouTube (avoids bot-check / rate limits). |
| `--cookies-from-browser BROWSER` | Load YouTube cookies from a local browser (e.g. `firefox`). |
| `--retag-from-path` | Offline: re-tag artist/title from folder + filename (see below). |
| `-x`, `--exclude NAME` | Folder under `--root` to skip during `--repair`/`--retag-from-path` (repeatable). |
| `--debug` | Verbose output. |
### Examples
```bash
# Interactive: combined Lidarr + YouTube picker
./musicfetch "ODESZA - Bloom"
# Just an artist / just an album / just a title all work
./musicfetch "Daft Punk"
./musicfetch "Discovery"
# YouTube first, auto-pick top hit
./musicfetch -s -n "Daft Punk - Harder Better Faster Stronger"
# Dry run — see what would happen, change nothing
./musicfetch -d "ODESZA - Bloom"
# YouTube only, lossless preferred
./musicfetch --yt-only -q flac "Bonobo - Kerala"
# Download by URL (single track or playlist/set/album, any yt-dlp site)
./musicfetch "https://music.youtube.com/watch?v=xxxxxxxxxxx"
./musicfetch "https://soundcloud.com/artist/sets/my-mix"
```
### 🔧 Repair existing tags
`--repair` walks `<root>/<artist>/<source>/` (the `youtube`/`soundcloud`/… download
folders — Lidarr album folders are skipped), re-fetches authoritative metadata for each
file using the `[id]` in its filename, and fixes tags. Useful when downloads landed with
missing album or wrong year.
It is deliberately **conservative**: it overwrites **album** and **year** (the usual
breakage), and fills in **artist**/**title** when they are missing *or* a known-bogus
placeholder (`NA`, `Unknown Album`, `Unknown Artist` — left behind by older buggy tagging) —
but it never overwrites a genuine existing artist/title with a channel name or decorated video
title. A bogus `NA [<id>].<ext>` filename is renamed to the recovered title, and a literal
`NA` album with no source album is normalised to `Unknown Album`.
Each file is its own yt-dlp network round-trip, so repair runs them in a thread pool;
`--workers N` (default 4) caps concurrency. Progress prints every 100 files. Requires
`mutagen` (an optional yt-dlp dependency; run `pip install mutagen` if it is missing). CLI-only: it is not exposed via the REST API.
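The repair loop described above can be sketched roughly as follows. This is an illustration only: `extract_id`, `fetch_all`, and the `fetch` callable are hypothetical names, and the filename pattern is assumed to be `Title [<id>].<ext>` as in the `NA [<id>].<ext>` example.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any, Callable, Dict, Iterable, Optional

# Matches a trailing "[<id>]" at the end of the filename stem,
# e.g. the stem of "Title [abc123].opus".
ID_PATTERN = re.compile(r"\[([^\[\]]+)\]$")


def extract_id(path: Path) -> Optional[str]:
    """Return the bracketed source id from a filename, or None if absent."""
    match = ID_PATTERN.search(path.stem)
    return match.group(1) if match else None


def fetch_all(paths: Iterable[Path], fetch: Callable[[str], Any],
              workers: int = 4) -> Dict[Path, Any]:
    """Call fetch(id) for every file that has an id, at most `workers` at a time."""
    with_ids = [(p, extract_id(p)) for p in paths]
    with_ids = [(p, i) for p, i in with_ids if i is not None]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with with_ids.
        results = pool.map(lambda item: fetch(item[1]), with_ids)
        return {p: meta for (p, _), meta in zip(with_ids, results)}
```

Files without a bracketed id are simply skipped here; whether the real tool skips or reports them is not specified above.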
**Cookies (important for bulk repair).** Unauthenticated YouTube requests get throttled fast —
a large `--repair` (or even a `--dry-run`, which still fetches) will trip *"Sign in to confirm
you're not a bot"* (HTTP 429) and every subsequent call fails until the IP-level flag clears.
Pass authenticated cookies to avoid it:
```bash
./musicfetch --repair --cookies /path/cookies.txt -o /media/music # exported cookies.txt
./musicfetch --repair --cookies-from-browser firefox -o /media/music # or read from a browser
```
With cookies you can raise `--workers`; without them keep it low (≤4) and expect occasional
throttling. Cookies also apply to normal fetches/downloads. The same can be set for the API
container via `$YTDLP_COOKIES` / `$YTDLP_COOKIES_FROM_BROWSER`. If you do get flagged, **stop**:
retrying extends the block. Wait ~30-60 min for a 429, or longer for a bot-check.
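To make the flag/env interplay concrete, here is a hedged Python sketch of how cookie settings might be turned into yt-dlp arguments. `--cookies`, `--cookies-from-browser`, and `--dump-json` are real yt-dlp options; the function names and the precedence (explicit flags before environment variables, file before browser) are assumptions for illustration.

```python
import os
from typing import List, Mapping, Optional


def cookie_args(cookies_file: Optional[str] = None,
                browser: Optional[str] = None,
                env: Mapping[str, str] = os.environ) -> List[str]:
    """Build the yt-dlp cookie arguments, or an empty list if none are set."""
    candidates = (
        (cookies_file, browser),  # explicit CLI values first (assumed precedence)
        (env.get("YTDLP_COOKIES"), env.get("YTDLP_COOKIES_FROM_BROWSER")),
    )
    for file_value, browser_value in candidates:
        if file_value:
            return ["--cookies", file_value]
        if browser_value:
            return ["--cookies-from-browser", browser_value]
    return []


def metadata_command(url: str, **cookie_options) -> List[str]:
    """Compose a yt-dlp metadata invocation with cookies applied."""
    return ["yt-dlp", *cookie_args(**cookie_options), "--dump-json", url]
```

Building the argument list in one place is one way to make sure every invocation (search, probe, download, repair) carries the same cookies.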
#### Getting YouTube cookies
> ⚠️ Use a **throwaway / secondary Google account**, not your main one — bulk automated
> requests can get the account flagged. You must be **logged in to YouTube** in the browser
> first.
**Option A — read straight from the browser (simplest, host CLI only).**
`--cookies-from-browser` reads the browser's own cookie store, so there's nothing to export:
```bash
./musicfetch --repair --cookies-from-browser firefox -o /media/music
./musicfetch --repair --cookies-from-browser chrome -o /media/music
```
- **Firefox:** works while open; just be logged in to YouTube.
- **Chrome / Chromium / Brave / Edge:** must be **fully quit** when you run this (Chrome locks
its cookie DB, and newer versions encrypt it — close the browser entirely first). On Linux a
running Chrome will usually fail with a "could not copy cookie database / locked" error.
- Specify a profile if not the default, e.g. `--cookies-from-browser "chrome:Profile 1"`.
This only works where the browser lives (your host), **not** inside the Docker container.
**Option B — export a `cookies.txt` (works anywhere, incl. the container/server).**
Use a Netscape-format cookie exporter, then point `--cookies` / `$YTDLP_COOKIES` at the file:
1. Install a cookies exporter extension:
   - Firefox: *"cookies.txt"* (a.k.a. *Export Cookies*).
   - Chrome: *"Get cookies.txt LOCALLY"* (pick one that runs **LOCALLY**; avoid extensions that
     upload your cookies anywhere).
2. Log in to <https://www.youtube.com>, click the extension, **Export** → save `cookies.txt`.
3. Use it:
   ```bash
   ./musicfetch --repair --cookies ~/cookies.txt -o /media/music
   ```
For the API container, mount it and set the env var (see `server/docker-compose.yml`):
```yaml
environment:
  YTDLP_COOKIES: "/cookies.txt"
volumes:
  - /host/path/cookies.txt:/cookies.txt:ro
```
Cookies expire — if YouTube starts rejecting them, re-export. Treat `cookies.txt` like a
password (it *is* your logged-in session); keep it out of git (`.gitignore` it).
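If you want to check an exported file before a long run, something like the following Python sketch may help. It is not part of MusicFetch; `live_cookie_count` is a hypothetical helper built on the standard library's `http.cookiejar.MozillaCookieJar`, which reads Netscape-format files.

```python
import time
from http.cookiejar import MozillaCookieJar
from typing import Optional


def live_cookie_count(path: str, domain_suffix: str = "youtube.com",
                      now: Optional[float] = None) -> int:
    """Count unexpired cookies for a domain in a Netscape-format cookies.txt."""
    jar = MozillaCookieJar()
    # load() raises http.cookiejar.LoadError when the file is not Netscape format.
    jar.load(path, ignore_discard=True, ignore_expires=True)
    now = time.time() if now is None else now
    return sum(
        1
        for cookie in jar
        if cookie.domain.endswith(domain_suffix)
        # Assumption: an expiry of None or 0 marks a session cookie, counted as live.
        and (not cookie.expires or cookie.expires > now)
    )
```

A count of zero suggests the export is stale or was taken while logged out. A non-zero count does not guarantee that YouTube still accepts the session.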
```bash
# Preview what would change (writes nothing)
./musicfetch --repair -d
# Apply fixes under a specific root
./musicfetch --repair -o /media/music
```
**`--retag-from-path`** is an offline companion: it derives **artist** and **title** purely
from the folder name + filename (stripping `(Official Video)` / `(Lyrics)`-style decorations,
and treating an `Artist - Title` filename correctly), with no network. Use it to undo bad
tags — e.g. titles/artists clobbered by an earlier `--repair` on music videos. It overwrites
artist/title and leaves album/year alone.
```bash
./musicfetch --retag-from-path -d # preview
./musicfetch --retag-from-path -o /media/music
# Skip folders (e.g. hand-curated playlists you don't want re-tagged)
./musicfetch --repair -x Unsorted -x playlists
```
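The path-based derivation can be pictured with this small Python sketch. It is an approximation under stated assumptions, not the tool's real code: `tags_from_path` is a hypothetical name, the decoration list is deliberately short, the layout is assumed to be `<root>/<artist>/<source>/<file>`, and an `Artist - Title` filename is assumed to take priority over the folder name.

```python
import re
from pathlib import Path
from typing import Tuple

# A few common decorations; the real list is presumably longer.
DECORATION = re.compile(
    r"\s*[\(\[](?:official\s+(?:music\s+)?video|official\s+audio|lyrics?(?:\s+video)?)[\)\]]",
    re.IGNORECASE,
)
# Trailing "[<id>]" that downloads carry before the extension.
TRAILING_ID = re.compile(r"\s*\[[^\[\]]+\]$")


def tags_from_path(path: Path) -> Tuple[str, str]:
    """Derive (artist, title) from folder and filename, with no network access."""
    folder_artist = path.parent.parent.name  # <root>/<artist>/<source>/<file>
    stem = TRAILING_ID.sub("", path.stem)
    stem = DECORATION.sub("", stem).strip()
    artist, separator, title = stem.partition(" - ")
    if separator and artist.strip() and title.strip():
        return artist.strip(), title.strip()
    return folder_artist, stem
```

For example, `Daft Punk/youtube/Daft Punk - Around the World (Official Video) [abc123].opus` would yield `("Daft Punk", "Around the World")` under these assumptions.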
### 📁 Output Structure
```text
<root>/
├── Artist Name/
│   ├── Album Name/     (managed by Lidarr)
│   ├── youtube/        (YouTube / YouTube Music downloads)
│   ├── soundcloud/     (SoundCloud downloads)
│   └── <source>/       (one folder per yt-dlp source)
```
---
## ❓ Troubleshooting
- **No Lidarr hits / "LIDARR_API_KEY not set":** export your key and confirm `LIDARR_URL` is reachable.
- **Wrong album art from YouTube:** install `ytmusicapi` so MusicFetch can resolve proper YouTube Music URLs and metadata.
- **`yt-dlp` errors:** update with `yt-dlp -U`; ensure `ffmpeg` is installed for extraction/embedding.
- **`idna` import error:** `pip install --force-reinstall idna requests`.
- **Permission denied writing files:** ensure the output root exists and is writable (`-o`/`--root` or `MUSICFETCH_ROOT`).
---
## 🌐 REST API (Docker)
Run MusicFetch as an authenticated HTTP service inside your Lidarr Docker stack.
A client POSTs a query; the server grabs the top hit non-interactively and runs
the download as a background job you can poll. Every response includes a
human-readable `message` (handy for Siri).
### Configure & run
Set the network name in `server/docker-compose.yml` to your existing Lidarr
stack network, then:
```bash
export LIDARR_API_KEY="your-lidarr-key"
export MUSICFETCH_API_KEY="a-long-random-secret"
docker compose -f server/docker-compose.yml up -d --build
```
| Env var | Default | Purpose |
| --- | --- | --- |
| `MUSICFETCH_API_KEY` | *(required)* | Shared secret clients send as `X-API-Key`. |
| `MUSICFETCH_PORT` | `6769` | Listen port. |
| `LIDARR_URL` | `http://lidarr:8686` | Lidarr base URL (stack network). |
| `LIDARR_API_KEY` | *(required for Lidarr)* | Lidarr API key. |
| `MUSICFETCH_ROOT` | `/media/music` | Music output root (bind-mounted). |
TLS is expected to be handled by your upstream reverse proxy; the container
serves plain HTTP on `6769`.
### Endpoints
| Method | Path | Auth | Purpose |
| --- | --- | --- | --- |
| `GET` | `/health` | no | Liveness check. |
| `POST` | `/fetch?q=...` | yes | Grab top hit; returns a `job_id`. |
| `GET` | `/jobs/{id}` | yes | Poll job status. |
`POST /fetch` params: `q` (required), `quality` (`best,320,m4a,opus,flac`),
`source` (`auto,lidarr,youtube`).
### curl examples
```bash
# Kick off a fetch
curl -X POST 'https://mf.izebra.net/fetch?q=Under%20My%20Skin' \
-H 'X-API-Key: a-long-random-secret'
# -> {"message":"Found 'Under My Skin' ... Downloading now.","job_id":"a1b2c3","status":"queued","hit":{...}}
# Poll the job
curl 'https://mf.izebra.net/jobs/a1b2c3' -H 'X-API-Key: a-long-random-secret'
# -> {"message":"Finished downloading ...","status":"done","result":{...}}
```
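The same flow can be scripted. Below is a minimal Python client sketch using only the standard library. The helper names are hypothetical, and the only statuses shown above are `queued` and `done`, so any other terminal status (for example a failure state) is an unknown that you would pass in via `terminal`.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, FrozenSet, Optional


def fetch_url(base: str, query: str, quality: Optional[str] = None,
              source: Optional[str] = None) -> str:
    """Build the POST /fetch URL, percent-encoding the query."""
    params = {"q": query}
    if quality:
        params["quality"] = quality
    if source:
        params["source"] = source
    encoded = urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
    return f"{base.rstrip('/')}/fetch?{encoded}"


def call(method: str, url: str, api_key: str) -> Dict[str, Any]:
    """Send one authenticated request and decode the JSON reply."""
    request = urllib.request.Request(url, method=method, headers={"X-API-Key": api_key})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(base: str, job_id: str, api_key: str,
                 get: Callable[[str, str, str], Dict[str, Any]] = call,
                 interval: float = 5.0, attempts: int = 60,
                 terminal: FrozenSet[str] = frozenset({"done"})) -> Dict[str, Any]:
    """Poll GET /jobs/{id} until a terminal status or the attempt cap is hit."""
    url = f"{base.rstrip('/')}/jobs/{job_id}"
    for _ in range(attempts):
        job = get("GET", url, api_key)
        if job.get("status") in terminal:
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {attempts} polls")
```

Typical use would be `job = call("POST", fetch_url(base, "Under My Skin"), key)` followed by `wait_for_job(base, job["job_id"], key)`, where `base` is your own server URL.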
### 🗣️ Siri Shortcuts integration
Make a shortcut that fetches music by voice ("Hey Siri, fetch music").
1. **Shortcuts app → New Shortcut.**
2. Add **Ask for Input** → Input Type **Text**, prompt "What should I fetch?".
   (Or use **Dictate Text** for fully spoken input.)
3. Add **Text** action, set it to: `https://mf.izebra.net/fetch?q=` then insert
   the **Provided Input** variable at the end. (Shortcuts URL-encodes query
   variables automatically.)
4. Add **Get Contents of URL**:
   - **URL:** the Text variable from step 3.
   - **Method:** `POST`.
   - **Headers:** add one: key `X-API-Key`, value your `MUSICFETCH_API_KEY`.
   - **Request Body:** leave as is (the query is in the URL).
5. Add **Get Dictionary Value** → Get Value for **message** in **Contents of URL**.
6. Add **Speak Text** → the Dictionary Value. Siri reads back
   "Found '…' … Downloading now."
7. (Optional) To confirm completion: add **Get Dictionary Value** for `job_id`,
   **Wait** ~20 seconds, **Get Contents of URL** on
   `https://mf.izebra.net/jobs/<job_id>` (same `X-API-Key` header), then
   **Get Dictionary Value** `message` → **Speak Text** again.
Rename the shortcut (e.g. "Fetch Music") — that phrase becomes the Siri trigger.
---
## 🛠️ Contributing
PRs welcome. This script is middleware around Lidarr + yt-dlp, not a Lidarr
replacement. Keep it a single bash-friendly executable.
---
## 📜 License
GPL v3.0