Add comprehensive README and fix speaker entity in CLAUDE.md
Full documentation covering architecture, deployment, API endpoints, speaker entity mapping, pipeline stages, and how recommendations improve over time. Fixed stale speaker entity reference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
175
README.md
@@ -1,3 +1,176 @@
# haunt-fm

Personal music recommendation service — captures listening history, discovers similar tracks via Last.fm, embeds audio with CLAP, generates playlists
Personal music recommendation engine that captures listening history from Music Assistant, discovers similar music via Last.fm, computes audio embeddings with CLAP, and generates playlists mixing known favorites with new discoveries — played back on house speakers via Apple Music.

## How It Works

```
You play music on any speaker
→ HA automation logs the listen event
→ Last.fm discovers similar tracks (~50 per listened track)
→ iTunes Search API finds 30-second audio previews
→ CLAP model computes 512-dim audio embeddings
→ pgvector stores and indexes embeddings (HNSW cosine similarity)
→ Taste profile = weighted average of listened-track embeddings
→ Recommendations = closest unheard tracks by cosine similarity
→ Playlist mixes known favorites + new discoveries
→ Music Assistant plays it on speakers via Apple Music
```
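
The last steps of the flow above (taste profile, then ranking of unheard tracks) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the service's actual code: the function names are hypothetical, toy 3-dim vectors stand in for the 512-dim CLAP embeddings, and the real service delegates the similarity search to pgvector.

```python
import math


def weighted_average(vectors, weights):
    """Return the weight-normalized mean of equal-length vectors."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(vectors[0])
    return [
        sum(w * v[i] for v, w in zip(vectors, weights)) / total
        for i in range(dim)
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(profile, candidates, heard, limit=50):
    """Rank unheard candidates by cosine similarity to the taste profile."""
    scored = [
        (track, cosine_similarity(profile, embedding))
        for track, embedding in candidates.items()
        if track not in heard
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy data: track "a" was played more, so it gets a larger weight.
listened = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0]}
profile = weighted_average(list(listened.values()), weights=[3.0, 1.0])
candidates = {
    "a": [1.0, 0.0, 0.0],  # already heard, so it is excluded
    "c": [0.9, 0.1, 0.0],
    "d": [0.0, 0.0, 1.0],
}
ranked = recommend(profile, candidates, heard=set(listened))
print([track for track, _ in ranked])
```

Track `c` ranks ahead of `d` because it points in nearly the same direction as the profile, while `d` is orthogonal to it.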

## Deployment

Runs on the NAS as two Docker containers:

| Container | Image | Port | Purpose |
|-----------|-------|------|---------|
| `haunt-fm` | Custom build | 8321 → 8000 | FastAPI app + embedding worker |
| `haunt-fm-db` | `pgvector/pgvector:pg17` | internal | PostgreSQL + pgvector |

```bash
# Deploy / rebuild
cd /volume1/homes/antialias/projects/haunt-fm
git pull && docker-compose up -d --build haunt-fm

# Run migrations
docker exec haunt-fm alembic upgrade head
```

**Access:**
- Status page: https://recommend.haunt.house
- Health check: http://192.168.86.51:8321/health
- API status: http://192.168.86.51:8321/api/status
- Source: https://git.dev.abaci.one/antialias/haunt-fm

## API Endpoints

| Method | Path | Purpose |
|--------|------|---------|
| GET | `/health` | Health check (DB connectivity) |
| GET | `/api/status` | Full pipeline status JSON |
| GET | `/` | HTML status dashboard |
| POST | `/api/history/webhook` | Log a listen event (from HA automation) |
| POST | `/api/admin/discover` | Expand listening history via Last.fm |
| POST | `/api/admin/build-taste-profile` | Rebuild taste profile from embeddings |
| GET | `/api/recommendations?limit=50` | Get ranked recommendations |
| POST | `/api/playlists/generate` | Generate and optionally play a playlist |

## Usage

### Generate and play a playlist

```bash
curl -X POST http://192.168.86.51:8321/api/playlists/generate \
  -H "Content-Type: application/json" \
  -d '{
    "total_tracks": 20,
    "known_pct": 30,
    "speaker_entity": "media_player.living_room_speaker_2",
    "auto_play": true
  }'
```

**Parameters:**
- `total_tracks` — number of tracks in the playlist (default 20)
- `known_pct` — percentage of the playlist drawn from known-liked tracks; the remainder are new discoveries (default 30)
- `speaker_entity` — Music Assistant entity ID (must be one of the `_2` suffix entities)
- `auto_play` — `true` to immediately play on the speaker
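
To make the `known_pct` arithmetic concrete, here is a small sketch of how a playlist of `total_tracks` could be split into known and new tracks. The function name and the rounding policy are assumptions for illustration; the service may round differently.

```python
def split_playlist(total_tracks=20, known_pct=30):
    """Return (known_count, new_count) for a playlist.

    Assumption: the known share is rounded to the nearest whole track
    and the rest of the playlist is filled with new discoveries.
    """
    if total_tracks < 0:
        raise ValueError("total_tracks must be non-negative")
    if not 0 <= known_pct <= 100:
        raise ValueError("known_pct must be between 0 and 100")
    known = round(total_tracks * known_pct / 100)
    return known, total_tracks - known


known, new = split_playlist()  # the documented defaults
print(known, new)
```

With the defaults this gives 6 known-liked tracks and 14 new discoveries.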

### Speaker entities

The `speaker_entity` **must** be a Music Assistant entity (the ones with the `_2` suffix) so that text search can resolve through Apple Music. Raw Cast entities cannot resolve search queries.

| Speaker | Entity ID |
|---------|-----------|
| Living Room speaker | `media_player.living_room_speaker_2` |
| Dining Room speaker | `media_player.dining_room_speaker_2` |
| basement mini | `media_player.basement_mini_2` |
| Kitchen stereo | `media_player.kitchen_stereo_2` |
| Study speaker | `media_player.study_speaker_2` |
| Butler's Pantry speaker | `media_player.butlers_pantry_speaker_2` |
| Master bathroom speaker | `media_player.master_bathroom_speaker_2` |
| Kids Room speaker | `media_player.kids_room_speaker_2` |
| Guest bedroom speaker 2 | `media_player.guest_bedroom_speaker_2_2` |
| Garage Wifi | `media_player.garage_wifi_2` |
| Whole House | `media_player.whole_house_2` |
| downstairs | `media_player.downstairs_2` |
| upstairs | `media_player.upstairs_2` |
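
A client can guard against passing a raw Cast entity by checking the ID against the table above before building the request body. The sketch below is a hypothetical client-side helper, not part of the service. It uses an allowlist rather than a suffix check, since a speaker whose own name ends in a digit (such as "Guest bedroom speaker 2") makes the suffix alone ambiguous.

```python
import json

# Entity IDs copied from the speaker table above.
MUSIC_ASSISTANT_ENTITIES = frozenset({
    "media_player.living_room_speaker_2",
    "media_player.dining_room_speaker_2",
    "media_player.basement_mini_2",
    "media_player.kitchen_stereo_2",
    "media_player.study_speaker_2",
    "media_player.butlers_pantry_speaker_2",
    "media_player.master_bathroom_speaker_2",
    "media_player.kids_room_speaker_2",
    "media_player.guest_bedroom_speaker_2_2",
    "media_player.garage_wifi_2",
    "media_player.whole_house_2",
    "media_player.downstairs_2",
    "media_player.upstairs_2",
})


def build_playlist_request(speaker_entity, total_tracks=20, known_pct=30,
                           auto_play=True):
    """Return the JSON body for POST /api/playlists/generate.

    Raises ValueError if the entity is not in the allowlist above.
    """
    if speaker_entity not in MUSIC_ASSISTANT_ENTITIES:
        raise ValueError(
            f"{speaker_entity!r} is not a known Music Assistant entity"
        )
    return json.dumps({
        "total_tracks": total_tracks,
        "known_pct": known_pct,
        "speaker_entity": speaker_entity,
        "auto_play": auto_play,
    })


body = build_playlist_request("media_player.kitchen_stereo_2")
print(body)
```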

### Other operations

```bash
# Log a listen event manually
curl -X POST http://192.168.86.51:8321/api/history/webhook \
  -H "Content-Type: application/json" \
  -d '{"title":"Paranoid Android","artist":"Radiohead","album":"OK Computer"}'

# Run Last.fm discovery (expand candidate pool)
curl -X POST http://192.168.86.51:8321/api/admin/discover \
  -H "Content-Type: application/json" \
  -d '{"limit": 50}'

# Rebuild taste profile
curl -X POST http://192.168.86.51:8321/api/admin/build-taste-profile

# Get recommendations (without playing)
# The URL is quoted so shells such as zsh do not treat "?" as a glob character
curl "http://192.168.86.51:8321/api/recommendations?limit=20"
```
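
The same webhook call can be made from Python with only the standard library. The sketch below builds the request without sending it, so it can be inspected first. The helper name is hypothetical, and the base URL is the address used in the curl examples.

```python
import json
import urllib.request

BASE_URL = "http://192.168.86.51:8321"  # address from the curl examples


def listen_event_request(title, artist, album):
    """Build (but do not send) a POST to /api/history/webhook."""
    body = json.dumps(
        {"title": title, "artist": artist, "album": album}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/history/webhook",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = listen_event_request("Paranoid Android", "Radiohead", "OK Computer")
print(req.get_method(), req.full_url)

# To actually send it (requires network access to the service):
# with urllib.request.urlopen(req, timeout=10) as response:
#     print(response.status)
```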

## Pipeline Stages

1. **Listening History** — HA automation POSTs to webhook when music plays on any Music Assistant speaker. Deduplicates events within 60 seconds.
2. **Discovery** — Last.fm `track.getSimilar` expands each listened track to ~50 candidates.
3. **Preview Lookup** — iTunes Search API finds 30-second AAC preview URLs (rate-limited ~20 req/min).
4. **Embedding** — Background worker downloads previews, runs CLAP model (`laion/larger_clap_music`), stores 512-dim vectors in pgvector with HNSW index.
5. **Taste Profile** — Weighted average of listened-track embeddings (play count * recency decay).
6. **Recommendations** — pgvector cosine similarity against taste profile, excluding known tracks.
7. **Playlist** — Mix known-liked + new recommendations, interleave, play via Music Assistant.
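
Two of the stages above, the 60-second deduplication in stage 1 and the interleaving in stage 7, can be sketched as follows. Both functions are hypothetical illustrations: the README does not say how events are keyed or how tracks are interleaved, so the (title, artist) key and the even-spacing strategy are assumptions.

```python
from datetime import datetime, timedelta

DEDUP_WINDOW = timedelta(seconds=60)  # window stated in stage 1


def should_log(last_seen, key, now):
    """Return True and record `now` unless `key` was logged within the window."""
    previous = last_seen.get(key)
    if previous is not None and now - previous < DEDUP_WINDOW:
        return False
    last_seen[key] = now
    return True


def interleave(known, new):
    """Spread `known` tracks evenly among `new` ones, preserving order."""
    total = len(known) + len(new)
    result, k, n = [], 0, 0
    for position in range(total):
        # Emit a known track when known is at or behind its proportional
        # share of the playlist so far, or when no new tracks remain.
        behind = k * total <= position * len(known)
        if k < len(known) and (n >= len(new) or behind):
            result.append(known[k])
            k += 1
        else:
            result.append(new[n])
            n += 1
    return result


seen = {}
key = ("Paranoid Android", "Radiohead")  # assumed dedup key
t0 = datetime(2024, 1, 1, 12, 0, 0)
print(should_log(seen, key, t0))                          # first event
print(should_log(seen, key, t0 + timedelta(seconds=30)))  # inside window
print(interleave(["k1", "k2"], ["n1", "n2", "n3", "n4"]))
```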

## Improving Recommendations Over Time

Recommendations improve as the system accumulates more data:

- **Listen to music** — every track played on any speaker is logged automatically
- **Run discovery periodically** — `POST /api/admin/discover` to expand the candidate pool via Last.fm
- **Rebuild taste profile** — `POST /api/admin/build-taste-profile` after significant new listening activity
- **Embedding worker runs continuously** — new candidates are automatically downloaded and embedded

The taste profile is a weighted average of all listened-track embeddings, so a more diverse listening history yields more nuanced recommendations.
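
The per-track weight in that average is described as play count times a recency decay. The README does not give the decay function, so the sketch below assumes an exponential decay with a hypothetical 30-day half-life, purely to show the shape of the calculation.

```python
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0  # hypothetical value, not taken from the service


def listen_weight(play_count, last_played, now, half_life_days=HALF_LIFE_DAYS):
    """Weight = play_count * 0.5 ** (age_in_days / half_life_days)."""
    age_days = max((now - last_played).total_seconds() / 86400.0, 0.0)
    return play_count * 0.5 ** (age_days / half_life_days)


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
month_ago = datetime(2025, 1, 1, tzinfo=timezone.utc)

# Four plays that are one half-life old count as much as two fresh plays.
print(listen_weight(4, month_ago, now))
print(listen_weight(2, now, now))
```

Under this assumption, tracks played often but long ago fade gradually instead of dominating the profile forever.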

## Tech Stack

| Component | Choice |
|-----------|--------|
| App framework | FastAPI + SQLAlchemy async + Alembic |
| Database | PostgreSQL 17 + pgvector (HNSW cosine similarity) |
| Embedding model | CLAP `laion/larger_clap_music` (512-dim, PyTorch CPU) |
| Audio previews | iTunes Search API (free, no auth, 30s AAC) |
| Discovery | Last.fm `track.getSimilar` API |
| Playback | Music Assistant via Home Assistant REST API |
| Music catalog | Apple Music (via Music Assistant) |
| Reverse proxy | Traefik (`recommend.haunt.house`) |

## Environment Variables

All prefixed with `HAUNTFM_`. Key ones:

| Variable | Purpose |
|----------|---------|
| `HAUNTFM_DATABASE_URL` | PostgreSQL connection string |
| `HAUNTFM_LASTFM_API_KEY` | Last.fm API key for discovery |
| `HAUNTFM_HA_URL` | Home Assistant URL |
| `HAUNTFM_HA_TOKEN` | Home Assistant long-lived access token |
| `HAUNTFM_EMBEDDING_WORKER_ENABLED` | Enable/disable background embedding worker |
| `HAUNTFM_EMBEDDING_BATCH_SIZE` | Tracks per batch (default 10) |
| `HAUNTFM_EMBEDDING_INTERVAL_SECONDS` | Seconds between batch checks (default 30) |
| `HAUNTFM_MODEL_CACHE_DIR` | CLAP model cache directory |
| `HAUNTFM_AUDIO_CACHE_DIR` | Downloaded preview cache directory |

See `.env.example` for the full list.
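
As a rough picture of how the prefixed variables map onto typed settings, here is a standard-library sketch. The `Settings` class and `load_settings` function are hypothetical, and the service may use a different loader. The batch size and interval defaults come from the table above; the worker-enabled default of `true` is an assumption.

```python
import os
from dataclasses import dataclass

PREFIX = "HAUNTFM_"


@dataclass(frozen=True)
class Settings:
    database_url: str
    lastfm_api_key: str
    ha_url: str
    ha_token: str
    embedding_worker_enabled: bool
    embedding_batch_size: int
    embedding_interval_seconds: int


def _flag(value):
    """Interpret common truthy strings as True."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None):
    """Read HAUNTFM_* variables from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env

    def get(name, default=None):
        value = env.get(PREFIX + name, default)
        if value is None:
            raise KeyError(f"missing required variable {PREFIX}{name}")
        return value

    return Settings(
        database_url=get("DATABASE_URL"),
        lastfm_api_key=get("LASTFM_API_KEY"),
        ha_url=get("HA_URL"),
        ha_token=get("HA_TOKEN"),
        # Assumed default; the README does not state one.
        embedding_worker_enabled=_flag(get("EMBEDDING_WORKER_ENABLED", "true")),
        embedding_batch_size=int(get("EMBEDDING_BATCH_SIZE", "10")),
        embedding_interval_seconds=int(get("EMBEDDING_INTERVAL_SECONDS", "30")),
    )


# Placeholder values for illustration only.
example_env = {
    "HAUNTFM_DATABASE_URL": "postgresql://user:pass@haunt-fm-db/hauntfm",
    "HAUNTFM_LASTFM_API_KEY": "example-key",
    "HAUNTFM_HA_URL": "http://homeassistant.local:8123",
    "HAUNTFM_HA_TOKEN": "example-token",
    "HAUNTFM_EMBEDDING_BATCH_SIZE": "25",
}
settings = load_settings(example_env)
print(settings.embedding_batch_size, settings.embedding_interval_seconds)
```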

## Integrations

- **Home Assistant** — automation `haunt_fm_log_music_play` captures listening history; REST API used for speaker playback
- **Music Assistant** — resolves text search queries to Apple Music tracks, streams to Cast speakers
- **OpenClaw** — has a skill doc (`skills/haunt-fm/SKILL.md`) so you can request playlists via Telegram/iMessage
- **Traefik** — routes `recommend.haunt.house` to the service
- **Porkbun DNS** — CNAME for `recommend.haunt.house`