# 🎵 MusicFetch

**MusicFetch** is a smart command-line utility that finds music by searching
**Lidarr** (your music collection manager) and **YouTube Music** at the same
time, shows you the top hits in an interactive picker, and downloads/queues
whatever you choose. It accepts:

- A **free-form query**: an artist, an album, a track title, or combos like
  `"Artist - Title"` or `"Artist - Album"` (e.g. `"ODESZA - Bloom"`, `"Daft Punk"`, `"Discovery"`).
- A **URL** (e.g. `"https://music.youtube.com/watch?v=..."` or a regular YouTube URL).
- **Any streaming link** (Spotify, Apple Music, Tidal, Deezer, …): resolved to
  metadata via [Odesli/song.link](https://odesli.co), then searched on Lidarr
  first and downloaded from the matching YouTube track if no Lidarr release is
  available. YouTube and SoundCloud links download directly.
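
A minimal sketch of how the three input kinds above could be told apart. The
host lists and the names `classify_input` and `split_query` are illustrative
assumptions, not MusicFetch's actual code.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

# Hypothetical host lists for illustration; the real routing may differ.
DIRECT_HOSTS = ("youtube.com", "youtu.be", "soundcloud.com")
STREAMING_HOSTS = ("spotify.com", "music.apple.com", "tidal.com", "deezer.com")


def _host_matches(host: str, domains: Tuple[str, ...]) -> bool:
    """True if host equals one of the domains or is a subdomain of it."""
    return any(host == d or host.endswith("." + d) for d in domains)


def classify_input(text: str) -> str:
    """Return 'direct', 'streaming', 'url', or 'query' for a CLI argument."""
    parsed = urlparse(text.strip())
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return "query"  # free-form artist / album / title text
    host = parsed.hostname.lower()
    if _host_matches(host, DIRECT_HOSTS):
        return "direct"  # downloaded as-is
    if _host_matches(host, STREAMING_HOSTS):
        return "streaming"  # would be resolved to metadata first
    return "url"  # any other link


def split_query(query: str) -> Tuple[str, Optional[str]]:
    """Split 'Artist - Title' on the first ' - '; a bare term has no second part."""
    left, sep, right = query.partition(" - ")
    if sep:
        return left.strip(), right.strip()
    return query.strip(), None
```

Splitting on the first `" - "` keeps hyphenated names such as `Jay-Z` intact,
because a hyphen without surrounding spaces is not treated as a separator.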

Lidarr is tried first by default. If you pick a Lidarr album but **no indexer
release is available**, MusicFetch automatically falls through to the top
YouTube hit. YouTube downloads prefer **YouTube Music URLs** so album art and
tags are correct.
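
The fall-through described above can be sketched as follows. The function name
and the two callbacks (`grab_release`, `download`) are hypothetical stand-ins
for the Lidarr indexer search and the yt-dlp download.

```python
from typing import Callable, List, Optional


def fetch_lidarr_first(
    lidarr_hit: Optional[dict],
    youtube_hits: List[dict],
    grab_release: Callable[[dict], bool],
    download: Callable[[dict], None],
) -> str:
    """Try Lidarr first; fall through to the top YouTube hit if no release is grabbed.

    Returns which source handled the request: 'lidarr', 'youtube', or 'none'.
    """
    # grab_release is assumed to return True only when an indexer release was found.
    if lidarr_hit is not None and grab_release(lidarr_hit):
        return "lidarr"
    # No Lidarr hit, or no release available: use the first YouTube result.
    if youtube_hits:
        download(youtube_hits[0])
        return "youtube"
    return "none"
```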

---

## 🚀 Features

- One unified picker showing the top hits from **Lidarr and YouTube together**, with matching keywords **bolded**.
- Lidarr-first flow: pick an album → adds artist+album (monitored) → interactive indexer search → falls through to YouTube only if no release is found.
- Accurate YouTube metadata via `ytmusicapi` (real artist / album / year / album art), with `yt-dlp` scraping as a fallback.
- Explicit tag overrides on download so files are tagged from the chosen hit, not from scraped titles.
- Non-interactive, YouTube-first, dry-run, quality, limit, and source-restriction flags.

---

## 📦 Dependencies & Installation

### 🐍 Python

- Python 3.10+
- `requests`
- `ytmusicapi` (recommended — accurate YouTube Music metadata)
- `rich` (recommended — nicer picker table + bold keyword matching)

```bash
pip install requests ytmusicapi rich
```

> **Note:** if you hit `ModuleNotFoundError: No module named 'idna'` from
> `requests`, repair it with:
> ```bash
> pip install --force-reinstall idna requests
> ```

`ytmusicapi` and `rich` are optional — without them MusicFetch falls back to
`yt-dlp` search scraping and a plain ANSI picker.

### 📼 External Tools

- `yt-dlp` (audio download/extraction) and `ffmpeg` (needed by yt-dlp's own `-x` audio extraction and for embedding; this is unrelated to MusicFetch's `-x`/`--exclude` flag).

```bash
pip install -U yt-dlp
sudo apt install ffmpeg      # or your distro's equivalent
```

---

## ⚙️ Configuration

Set via environment variables:

| Variable          | Default                 | Purpose                          |
|-------------------|-------------------------|----------------------------------|
| `LIDARR_API_KEY`  | *(required for Lidarr)* | Lidarr API key.                  |
| `LIDARR_URL`      | `http://localhost:8686` | Lidarr base URL.                 |
| `MUSICFETCH_ROOT` | `/media/music`          | Default output root folder.      |

```bash
export LIDARR_API_KEY="your-lidarr-api-key"
```
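
For reference, reading these variables with the defaults from the table could
look like the sketch below. The `Config` class and `load_config` function are
hypothetical names used for illustration only.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    lidarr_url: str
    lidarr_api_key: Optional[str]  # None means Lidarr cannot be queried
    root: str


def load_config(env: Optional[Mapping[str, str]] = None) -> Config:
    """Build a Config from environment variables, applying the documented defaults."""
    env = os.environ if env is None else env
    return Config(
        # Strip a trailing slash so API paths can be appended safely.
        lidarr_url=env.get("LIDARR_URL", "http://localhost:8686").rstrip("/"),
        # Treat an empty string the same as an unset key.
        lidarr_api_key=env.get("LIDARR_API_KEY") or None,
        root=env.get("MUSICFETCH_ROOT", "/media/music"),
    )
```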

---

## 🧑‍💻 Usage

```bash
./musicfetch [OPTIONS] QUERY...
```

### Options

| Flag | Description |
|------|-------------|
| `-n`, `--noninteractive` | Auto-pick the top hit (no prompt). |
| `-s`, `--ytsearch` | Search/select YouTube first instead of Lidarr first. |
| `-d`, `--dry-run` | Show every action without executing it. |
| `-q`, `--quality {best,320,m4a,opus,flac}` | Audio quality/format (default `best`). |
| `--limit N` | Hits per source (default 10). |
| `--lidarr-only` | Skip YouTube. |
| `--yt-only` | Skip Lidarr. |
| `-o`, `--root PATH` | Output root folder (default `$MUSICFETCH_ROOT`, else `/media/music`). |
| `--search-all` | Search all albums when adding an artist to Lidarr. |
| `--repair` | Re-tag existing downloads under `--root` from source metadata (see below). |
| `--workers N` | Parallel metadata fetches during `--repair` (default 4). |
| `--cookies FILE` | yt-dlp `cookies.txt` for authenticated YouTube (avoids bot-check / rate limits). |
| `--cookies-from-browser BROWSER` | Load YouTube cookies from a local browser (e.g. `firefox`). |
| `--retag-from-path` | Offline: re-tag artist/title from folder + filename (see below). |
| `-x`, `--exclude NAME` | Folder under `--root` to skip during `--repair`/`--retag-from-path` (repeatable). |
| `--debug` | Verbose output. |

### Examples

```bash
# Interactive: combined Lidarr + YouTube picker
./musicfetch "ODESZA - Bloom"

# Just an artist / just an album / just a title all work
./musicfetch "Daft Punk"
./musicfetch "Discovery"

# YouTube first, auto-pick top hit
./musicfetch -s -n "Daft Punk - Harder Better Faster Stronger"

# Dry run — see what would happen, change nothing
./musicfetch -d "ODESZA - Bloom"

# YouTube only, lossless preferred
./musicfetch --yt-only -q flac "Bonobo - Kerala"

# Download by URL (single track or playlist/set/album, any yt-dlp site)
./musicfetch "https://music.youtube.com/watch?v=xxxxxxxxxxx"
./musicfetch "https://soundcloud.com/artist/sets/my-mix"
```

### 🔧 Repair existing tags

`--repair` walks `<root>/<artist>/<source>/` (the `youtube`/`soundcloud`/… download
folders; Lidarr album folders are skipped), re-fetches authoritative metadata for each
file using the `[id]` in its filename, and fixes the tags. Useful when downloads landed
with a missing album or the wrong year.
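
As an illustration of the `[id]` lookup, the sketch below pulls the trailing
bracketed id out of a filename such as `Title [id].ext`. The function name
`source_id` and the exact pattern are assumptions.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# A bracketed token without spaces at the very end of the file stem.
_ID_RE = re.compile(r"\[([^\[\]\s]+)\]$")


def source_id(path: str) -> Optional[str]:
    """Return the id from 'Title [id].ext', or None if the name carries no id."""
    stem = PurePosixPath(path).stem.strip()
    match = _ID_RE.search(stem)
    return match.group(1) if match else None
```

Files without a recoverable id return `None`, so a caller can skip them rather
than guess.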

It is deliberately **conservative**:

- **album** and **year** are overwritten (the usual breakage).
- **artist** and **title** are filled in only when they are missing *or* a known-bogus
  placeholder (`NA`, `Unknown Album`, `Unknown Artist`, left behind by older buggy tagging).
  A genuine existing artist/title is never overwritten with a channel name or a decorated
  video title.
- A bogus `NA [<id>].<ext>` filename is renamed to the recovered title.
- A literal `NA` album with no source album is normalised to `Unknown Album`.
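
The conservative policy can be expressed as a small merge function. This is a
sketch under the rules stated above; `merge_tags`, `is_bogus`, and the plain
`dict` tag representation are hypothetical.

```python
from typing import Optional

# Placeholders left behind by older tagging, as listed above.
BOGUS = {"NA", "Unknown Album", "Unknown Artist"}


def is_bogus(value: Optional[str]) -> bool:
    """True for a missing, empty, or known-placeholder tag value."""
    return value is None or not value.strip() or value.strip() in BOGUS


def merge_tags(existing: dict, source: dict) -> dict:
    """Merge source metadata into existing tags without clobbering good values."""
    merged = dict(existing)
    # Album and year: take the source value whenever the source has one.
    for key in ("album", "year"):
        if source.get(key):
            merged[key] = source[key]
    # Artist and title: fill only when the existing value is missing or bogus.
    for key in ("artist", "title"):
        if is_bogus(existing.get(key)) and source.get(key):
            merged[key] = source[key]
    # A literal NA album with nothing better available becomes Unknown Album.
    if (merged.get("album") or "").strip() == "NA":
        merged["album"] = "Unknown Album"
    return merged
```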

Each file is its own yt-dlp network round-trip, so repair runs them in a thread pool;
`--workers N` (default 4) caps concurrency. Progress prints every 100 files. Requires
`mutagen` (a yt-dlp dependency, usually already present). CLI-only — not exposed via the REST API.
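
A rough sketch of that concurrency pattern: a bounded thread pool with a
progress line every 100 completed files. `repair_all` and the `repair_one`
callback are hypothetical names; the real per-file work (metadata fetch plus
tag write) is left to the callback.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def repair_all(
    files: Iterable[str],
    repair_one: Callable[[str], object],
    workers: int = 4,
    progress_every: int = 100,
    report: Callable[[str], None] = print,
) -> Dict[str, object]:
    """Run repair_one over files with at most `workers` threads.

    Returns a mapping of path to result; a failed file maps to its exception
    so one bad file does not abort the whole run.
    """
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(repair_one, path): path for path in files}
        for done, future in enumerate(as_completed(futures), start=1):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # record the failure and keep going
                results[path] = exc
            if done % progress_every == 0:
                report(f"{done}/{len(futures)} files processed")
    return results
```

Because the work is network-bound, threads are sufficient here; `workers` only
limits how many requests are in flight at once.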

**Cookies (important for bulk repair).** Unauthenticated YouTube requests get throttled fast.
A large `--repair` (or even a `--dry-run`, which still fetches metadata) will trip HTTP 429 or
the *"Sign in to confirm you're not a bot"* check, and every subsequent call fails until the
IP-level flag clears. Pass authenticated cookies to avoid it:

```bash
./musicfetch --repair --cookies /path/cookies.txt -o /media/music        # exported cookies.txt
./musicfetch --repair --cookies-from-browser firefox -o /media/music     # or read from a browser
```

With cookies you can raise `--workers`; without them keep it low (≤4) and expect occasional
throttling. Cookies also apply to normal fetches/downloads. The same can be set for the API
container via `$YTDLP_COOKIES` / `$YTDLP_COOKIES_FROM_BROWSER`. If you do get flagged, **stop**:
retrying extends the block. Wait ~30-60 min after a 429, or longer after a bot-check.

#### Getting YouTube cookies

> ⚠️ Use a **throwaway / secondary Google account**, not your main one — bulk automated
> requests can get the account flagged. You must be **logged in to YouTube** in the browser
> first.

**Option A — read straight from the browser (simplest, host CLI only).**
`--cookies-from-browser` reads the browser's own cookie store, so there's nothing to export:

```bash
./musicfetch --repair --cookies-from-browser firefox -o /media/music
./musicfetch --repair --cookies-from-browser chrome  -o /media/music
```

- **Firefox:** works while open; just be logged in to YouTube.
- **Chrome / Chromium / Brave / Edge:** must be **fully quit** when you run this (Chrome locks
  its cookie DB, and newer versions encrypt it — close the browser entirely first). On Linux a
  running Chrome will usually fail with a "could not copy cookie database / locked" error.
- Specify a profile if not the default, e.g. `--cookies-from-browser "chrome:Profile 1"`.

This only works where the browser lives (your host), **not** inside the Docker container.

**Option B — export a `cookies.txt` (works anywhere, incl. the container/server).**
Use a Netscape-format cookie exporter, then point `--cookies` / `$YTDLP_COOKIES` at the file:

1. Install a cookies exporter extension:
   - Firefox: *"cookies.txt"* (a.k.a. *Export Cookies*).
   - Chrome: *"Get cookies.txt LOCALLY"* (pick a **LOCALLY**-running one — avoid extensions that
     upload your cookies anywhere).
2. Log in to <https://www.youtube.com>, click the extension, **Export** → save `cookies.txt`.
3. Use it:

   ```bash
   ./musicfetch --repair --cookies ~/cookies.txt -o /media/music
   ```

   For the API container, mount it and set the env var (see `server/docker-compose.yml`):

   ```yaml
   environment:
     YTDLP_COOKIES: "/cookies.txt"
   volumes:
     - /host/path/cookies.txt:/cookies.txt:ro
   ```

Cookies expire — if YouTube starts rejecting them, re-export. Treat `cookies.txt` like a
password (it *is* your logged-in session); keep it out of git (`.gitignore` it).
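
Before a long run it can be worth checking that an exported file really is in
Netscape format and contains YouTube cookies. The sketch below uses Python's
standard `http.cookiejar.MozillaCookieJar`; the helper name
`youtube_cookie_names` is hypothetical and not part of MusicFetch.

```python
import http.cookiejar
from typing import List


def youtube_cookie_names(path: str) -> List[str]:
    """Return the sorted names of youtube.com cookies found in a cookies.txt file.

    Raises http.cookiejar.LoadError if the file is not in Netscape format.
    """
    jar = http.cookiejar.MozillaCookieJar()
    # Keep session and expired cookies so the check reflects the file contents.
    jar.load(path, ignore_discard=True, ignore_expires=True)
    return sorted(
        cookie.name
        for cookie in jar
        if cookie.domain.lstrip(".").endswith("youtube.com")
    )
```

An empty list suggests the export was taken while logged out or from the wrong
site.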

Preview a repair before applying it:

```bash
# Preview what would change (writes nothing)
./musicfetch --repair -d

# Apply fixes under a specific root
./musicfetch --repair -o /media/music
```

**`--retag-from-path`** is an offline companion: it derives **artist** and **title** purely
from the folder name + filename (stripping `(Official Video)` / `(Lyrics)`-style decorations,
and treating an `Artist - Title` filename correctly), with no network. Use it to undo bad
tags — e.g. titles/artists clobbered by an earlier `--repair` on music videos. It overwrites
artist/title and leaves album/year alone.

```bash
./musicfetch --retag-from-path -d            # preview
./musicfetch --retag-from-path -o /media/music

# Skip folders (e.g. hand-curated playlists you don't want re-tagged)
./musicfetch --repair -x Unsorted -x playlists
```
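
The path-based derivation used by `--retag-from-path` might look roughly like
the sketch below. The decoration list, the handling of `Artist - Title` (artist
taken from the filename), and the name `tags_from_path` are all assumptions made
for illustration.

```python
import re
from pathlib import PurePosixPath

# Hypothetical set of "(Official Video)" / "(Lyrics)"-style decorations.
_DECORATION_RE = re.compile(
    r"\s*[\(\[](?:official\s+(?:music\s+)?video|official\s+audio|"
    r"lyrics?(?:\s+video)?|audio)[\)\]]",
    re.IGNORECASE,
)
# Trailing " [id]" appended to downloaded filenames.
_ID_SUFFIX_RE = re.compile(r"\s*\[[^\[\]\s]+\]$")


def tags_from_path(path: str) -> dict:
    """Derive artist and title from <root>/<artist>/<source>/<file>, offline."""
    p = PurePosixPath(path)
    folder_artist = p.parent.parent.name  # the <artist> folder
    stem = _ID_SUFFIX_RE.sub("", p.stem)  # drop the trailing id
    stem = _DECORATION_RE.sub("", stem).strip()  # drop decorations
    left, sep, right = stem.partition(" - ")
    if sep and left.strip() and right.strip():
        # An "Artist - Title" filename carries its own artist.
        return {"artist": left.strip(), "title": right.strip()}
    return {"artist": folder_artist, "title": stem}
```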

### 📁 Output Structure

```text
<root>/
├── Artist Name/
│   ├── Album Name/    (managed by Lidarr)
│   ├── youtube/       (YouTube / YouTube Music downloads)
│   ├── soundcloud/    (SoundCloud downloads)
│   └── <source>/      (one folder per yt-dlp source)
```

---

## ❓ Troubleshooting

- **No Lidarr hits / "LIDARR_API_KEY not set":** export your key and confirm `LIDARR_URL` is reachable.
- **Wrong album art from YouTube:** install `ytmusicapi` so MusicFetch can resolve proper YouTube Music URLs and metadata.
- **`yt-dlp` errors:** update with `yt-dlp -U`; ensure `ffmpeg` is installed for extraction/embedding.
- **`idna` import error:** `pip install --force-reinstall idna requests`.
- **Permission denied writing files:** ensure the output root exists and is writable (`-o`/`--root` or `MUSICFETCH_ROOT`).

---

## 🌐 REST API (Docker)

Run MusicFetch as an authenticated HTTP service inside your Lidarr Docker stack.
A client POSTs a query; the server grabs the top hit non-interactively and runs
the download as a background job you can poll. Every response includes a
human-readable `message` (handy for Siri).

### Configure & run

Set the network name in `server/docker-compose.yml` to your existing Lidarr
stack network, then:

```bash
export LIDARR_API_KEY="your-lidarr-key"
export MUSICFETCH_API_KEY="a-long-random-secret"
docker compose -f server/docker-compose.yml up -d --build
```

| Env var | Default | Purpose |
| --- | --- | --- |
| `MUSICFETCH_API_KEY` | *(required)* | Shared secret clients send as `X-API-Key`. |
| `MUSICFETCH_PORT` | `6769` | Listen port. |
| `LIDARR_URL` | `http://lidarr:8686` | Lidarr base URL (stack network). |
| `LIDARR_API_KEY` | *(required for Lidarr)* | Lidarr API key. |
| `MUSICFETCH_ROOT` | `/media/music` | Music output root (bind-mounted). |

TLS is expected to be handled by your upstream reverse proxy; the container
serves plain HTTP on `6769`.

### Endpoints

| Method | Path | Auth | Purpose |
| --- | --- | --- | --- |
| `GET` | `/health` | no | Liveness check. |
| `POST` | `/fetch?q=...` | yes | Grab top hit; returns a `job_id`. |
| `GET` | `/jobs/{id}` | yes | Poll job status. |

`POST /fetch` params: `q` (required), `quality` (`best,320,m4a,opus,flac`),
`source` (`auto,lidarr,youtube`).

### curl examples

```bash
# Kick off a fetch
curl -X POST 'https://mf.izebra.net/fetch?q=Under%20My%20Skin' \
  -H 'X-API-Key: a-long-random-secret'
# -> {"message":"Found 'Under My Skin' ... Downloading now.","job_id":"a1b2c3","status":"queued","hit":{...}}

# Poll the job
curl 'https://mf.izebra.net/jobs/a1b2c3' -H 'X-API-Key: a-long-random-secret'
# -> {"message":"Finished downloading ...","status":"done","result":{...}}
```
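
The same two calls from Python, using only the standard library. This is a
sketch: the helper names are hypothetical, and the `"running"` status is an
assumed intermediate state (only `queued` and `done` appear in the examples
above).

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Optional

PENDING = ("queued", "running")  # "running" is an assumption


def build_request(base_url: str, path: str, api_key: str,
                  method: str = "GET", params: Optional[dict] = None):
    """Build an authenticated request; query values are percent-encoded."""
    url = base_url.rstrip("/") + path
    if params:
        url += "?" + urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
    return urllib.request.Request(url, method=method, headers={"X-API-Key": api_key})


def call(req, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def fetch_and_wait(base_url: str, api_key: str, query: str,
                   poll_seconds: float = 5.0, max_polls: int = 60,
                   send: Callable = call) -> dict:
    """POST /fetch, then poll /jobs/{id} until the job leaves a pending state."""
    job = send(build_request(base_url, "/fetch", api_key, "POST", {"q": query}))
    job_path = "/jobs/" + urllib.parse.quote(str(job["job_id"]), safe="")
    for _ in range(max_polls):
        status = send(build_request(base_url, job_path, api_key))
        if status.get("status") not in PENDING:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job['job_id']} still pending after {max_polls} polls")
```

The `send` parameter exists so the polling logic can be exercised without a
running server.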

### 🗣️ Siri Shortcuts integration

Make a shortcut that fetches music by voice ("Hey Siri, fetch music").

1. **Shortcuts app → New Shortcut.**
2. Add **Ask for Input** → Input Type **Text**, prompt "What should I fetch?".
   (Or use **Dictate Text** for fully spoken input.)
3. Add **Text** action, set it to: `https://mf.izebra.net/fetch?q=` then insert
   the **Provided Input** variable at the end. (Shortcuts URL-encodes query
   variables automatically.)
4. Add **Get Contents of URL**:
   - **URL:** the Text variable from step 3.
   - **Method:** `POST`.
   - **Headers:** add one — key `X-API-Key`, value your `MUSICFETCH_API_KEY`.
   - **Request Body:** leave as is (the query is in the URL).
5. Add **Get Dictionary Value** → Get Value for **message** in **Contents of URL**.
6. Add **Speak Text** → the Dictionary Value. Siri reads back
   "Found '…' … Downloading now."
7. (Optional) To confirm completion: add **Get Dictionary Value** for `job_id`,
   **Wait** ~20 seconds, **Get Contents of URL** on
   `https://mf.izebra.net/jobs/<job_id>` (same `X-API-Key` header), then
   **Get Dictionary Value** `message` → **Speak Text** again.

Rename the shortcut (e.g. "Fetch Music") — that phrase becomes the Siri trigger.

---

## 🛠️ Contributing

PRs welcome. This script is middleware around Lidarr + yt-dlp, not a Lidarr
replacement. Keep it a single bash-friendly executable.

---

## 📜 License

GNU GPL v3.0