1001Tracklists Integration

CrateDigger integrates with 1001Tracklists to match your recordings against known tracklists, embed chapter markers, and populate rich per-track and per-set metadata. 1001Tracklists is a community website that logs exactly what tracks DJs played during sets, including timestamps. It is the source of everything in CrateDigger that goes beyond what a filename and embedded video tags can provide.

Do I need an account?

Short answer: no, but the feature gap is significant.

| Capability | Without account | With account |
| --- | --- | --- |
| Filename and embedded metadata parsing | Yes | Yes |
| Alias resolution (artists.json, places.json entries) | Yes | Yes |
| Organize into library tree | Yes | Yes |
| Cover art (embedded or sampled video frame) | Yes | Yes |
| Posters (per-video and folder) | Yes | Yes |
| Artist artwork from fanart.tv and TheAudioDB | Yes | Yes |
| NFO metadata | Yes (from filename and embedded tags) | Yes (richer, from 1001Tracklists) |
| MKV file-level tags (ARTIST, TITLE, DATE_RELEASED) | Yes (from filename) | Yes (with canonicalized DJ names) |
| Chapter markers per track | No | Yes |
| Per-chapter track metadata (title, label, genre, artist MBIDs) | No | Yes |
| Album-level multi-artist tags | No | Yes |
| Stage, venue, festival, and event taxonomy | Partial (aliases only) | Full |
| DJ artwork from 1001Tracklists | No | Yes |
| Canonical DJ name (casing, learned aliases) | Partial | Full |

Without an account, CrateDigger still produces a usable library. Metadata comes from parsing your filenames using your places.json entries and artists.json rules, reading any embedded MKV tags, looking up artist artwork from fanart.tv via a resolved MusicBrainz ID, and extracting cover art from embedded attachments or sampled video frames. You get organized folders, reasonable posters, and NFO files. What you lose is everything tied to the authoritative tracklist: chapter markers, per-track metadata (title, label, genre, artist MBIDs), canonical DJ naming beyond your manual aliases, and the stage and venue context tags.

With an account, every row in the table is filled in. The identify command matches your recording against 1001Tracklists and embeds per-chapter metadata plus album-level event context directly into the MKV.

Account setup

You need a free account at 1001Tracklists. Configure your credentials in one of two ways.

Config file

Add your email and password under the [tracklists] section in your config.toml (~/CrateDigger/config.toml on Linux and macOS, Documents\CrateDigger\config.toml on Windows):

[tracklists]
email = "your@email.com"
password = "your-password"

Environment variables

Linux and macOS (shell):

export TRACKLISTS_EMAIL="your@email.com"
export TRACKLISTS_PASSWORD="your-password"

Windows (PowerShell):

$env:TRACKLISTS_EMAIL = "your@email.com"
$env:TRACKLISTS_PASSWORD = "your-password"

Environment variables override config file values. If your email is not configured at all, CrateDigger prompts for both credentials interactively before starting. If your email is set but your password is missing, it exits with an error.
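The precedence rules above can be sketched roughly as follows. This is an illustrative sketch, not CrateDigger's actual code: the function name resolve_credentials, the MissingPasswordError class, and the prompt callback are hypothetical. Only the variable names TRACKLISTS_EMAIL and TRACKLISTS_PASSWORD and the [tracklists] keys come from this page.

```python
from typing import Callable, Mapping, Tuple


class MissingPasswordError(Exception):
    """Hypothetical error: an email is configured but no password is available."""


def resolve_credentials(
    config: Mapping[str, str],
    env: Mapping[str, str],
    prompt: Callable[[str], str],
) -> Tuple[str, str]:
    """Resolve (email, password) from the [tracklists] config section and the environment."""
    # Environment variables take precedence over config file values.
    email = env.get("TRACKLISTS_EMAIL") or config.get("email")
    password = env.get("TRACKLISTS_PASSWORD") or config.get("password")

    if not email:
        # No email configured anywhere: ask for both credentials interactively.
        return prompt("Email: "), prompt("Password: ")
    if not password:
        # Email is set but the password is missing: stop with an error.
        raise MissingPasswordError("email is set but password is missing")
    return email, password
```

In real use, env would be os.environ and prompt something like input or getpass.getpass. Passing them in as parameters keeps the sketch easy to exercise without touching the real environment.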

What CrateDigger extracts from a tracklist

For each identified tracklist, CrateDigger captures:

  • Chapter markers: one per track with timestamp and title, written directly into the MKV as Matroska chapters. See per-chapter tags.
  • Per-track metadata: artist, title, label, and genre per chapter, with pipe-aligned multi-artist support. See per-chapter tags.
  • Album-level event context: tracklist URL, title, ID, date, and source taxonomy (festival, venue, conference, event promoter, country, stage). See collection-level tags.
  • DJ list and album-artist MBIDs: canonical DJ names, 1001Tracklists slugs, and aligned MusicBrainz IDs for multi-value album-artist credits. See album-level artist tags.
  • DJ artwork URL: the DJ photo from the tracklist page, used as a background source in the poster pipeline.

How venue and location data affects routing

When 1001Tracklists has a linked festival for a set, CrateDigger uses that as the primary routing target. When no festival is present, it works down a chain:

  1. Festival (linked on the tracklist page)
  2. Venue (a linked venue page, if present)
  3. Location (plain-text location from the page heading, if present)
  4. Artist (last resort when none of the above is available)
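
As a rough illustration of that fallback order, the chain can be read as "first populated field wins". The TracklistPlace container and routing_target function below are hypothetical names used only for this sketch. They do not describe CrateDigger's internal data model.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TracklistPlace:
    # Hypothetical container for the fields the routing chain inspects.
    artist: str
    festival: Optional[str] = None
    venue: Optional[str] = None
    location: Optional[str] = None


def routing_target(info: TracklistPlace) -> Tuple[str, str]:
    """Return (kind, name) for the first populated level in the chain."""
    for kind in ("festival", "venue", "location"):
        name = getattr(info, kind)
        if name:
            return kind, name
    # Last resort: nothing location-like is available, so route by artist.
    return "artist", info.artist
```

A set with both a festival and a venue routes by festival, and a set with only a plain-text location routes by that location.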

The venue and location data that 1001Tracklists provides feeds directly into your places.json registry. If the venue name matches a canonical name or alias in places.json, CrateDigger uses the canonical name for folder and file naming. If it does not match, it uses the raw text from the tracklist page.

Adding a places.json entry for a venue is all it takes to bring it into the same alias-resolution pipeline as festivals. A set recorded at a club with no places.json entry uses the raw venue name or falls through to the artist folder. A set with no linked venue and no plain-text location always routes by artist. This is the intended behavior when 1001Tracklists has no location data to provide.
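
The match-or-fall-back behavior can be pictured with a small sketch. It assumes a simplified registry shape (canonical name mapped to a list of aliases) and case-insensitive comparison. Both are assumptions made for illustration, and the real places.json format and matching rules are documented on the Places page.

```python
from typing import Dict, List


def canonical_place(raw_name: str, registry: Dict[str, List[str]]) -> str:
    """Return the canonical name for raw_name, or raw_name itself if unmatched.

    The registry shape (canonical -> aliases) is a simplification for this sketch.
    """
    wanted = raw_name.strip().casefold()
    for canonical, aliases in registry.items():
        candidates = [canonical, *aliases]
        if any(wanted == name.strip().casefold() for name in candidates):
            return canonical
    # No canonical name or alias matched: keep the raw text from the tracklist page.
    return raw_name
```

With a hypothetical registry such as {"Example Club": ["The Example", "Example Club Amsterdam"]}, the raw venue "the example" resolves to "Example Club", while an unlisted venue name is returned unchanged.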

See Places for the full registry format and matching rules.

What identification looks like

Interactive selection is the default mode. identify prints a ranked list of results and prompts you to pick the correct match or skip. See identify: step-by-step for a sample transcript and auto-mode behavior.

Rate limiting and caching

CrateDigger paces its 1001Tracklists traffic in two ways:

  • Between files: after one file finishes, CrateDigger waits so that at least tracklists.delay_seconds (default 5s) has passed since that file's processing began. Time already spent on the previous file counts, so if interactive selection or fetching took longer than delay_seconds the next file starts immediately. You can override the value per run with --delay.
  • Between requests inside one file: a short fixed 0.5-second pause between consecutive 1001Tracklists requests (the tracklist page fetch, per-source lookups, and per-DJ profile fetches triggered by one selection) so the cascade that follows an interactive pick does not hit the site as a single burst.
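
The between-files rule amounts to "wait only for whatever part of the delay has not already elapsed". A minimal sketch under that reading follows. The names remaining_delay and process_files are hypothetical, and the injectable clock and sleep parameters exist only to make the sketch testable.

```python
import time
from typing import Callable, Sequence


def remaining_delay(started_at: float, now: float, delay_seconds: float = 5.0) -> float:
    """Seconds still to wait so that delay_seconds have passed since started_at."""
    elapsed = now - started_at
    # Time already spent on the file counts toward the delay.
    return max(0.0, delay_seconds - elapsed)


def process_files(
    files: Sequence[str],
    handle: Callable[[str], None],
    delay_seconds: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Process files in order, pacing the start of each next file."""
    for index, path in enumerate(files):
        started_at = clock()
        handle(path)
        if index < len(files) - 1:
            # Zero when handling already took longer than delay_seconds.
            sleep(remaining_delay(started_at, clock(), delay_seconds))
```

If a file took 2 seconds with the default delay, the next file starts after a further 3 seconds. If it took 12 seconds, the next file starts immediately.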

If 1001Tracklists returns a rate-limit response, CrateDigger waits 30 seconds and retries. After the retry limit, it stops with a message asking you to solve a captcha on the 1001Tracklists website.
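
A wait-and-retry loop of this kind might look like the sketch below. The 30-second wait comes from this page, but the retry limit of 3, the exception classes, and the fetch_with_retry name are assumptions for illustration, since the actual limit is not stated here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical signal that the site returned a rate-limit response."""


class CaptchaRequired(Exception):
    """Hypothetical error raised once the retry limit is exhausted."""


def fetch_with_retry(
    fetch: Callable[[], T],
    max_retries: int = 3,  # assumed value; the real limit is not documented here
    wait_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch, waiting wait_seconds after each rate-limit response."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_retries:
                raise CaptchaRequired(
                    "Rate limit persists; solve the captcha on the "
                    "1001Tracklists website, then run again."
                )
            sleep(wait_seconds)
    raise AssertionError("unreachable")
```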

Two local caches make subsequent runs faster:

  • DJ cache: canonical DJ names and aliases learned from tracklist pages. Used to standardize name casing in tags and file names.
  • Source cache: festival, venue, radio, and conference names. Used to classify tracklists by source type.

Each cache entry's actual lifetime jitters by ±20% around the configured base (see Configuration: cache TTL), so a bulk first-run fill does not cause all entries to expire at the same time.
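
One simple way to get that kind of spread is to scale the base TTL by a random factor between 0.8 and 1.2. The sketch below assumes a uniform distribution, which this page does not specify, and jittered_ttl is a hypothetical name.

```python
import random
from typing import Optional


def jittered_ttl(
    base_seconds: float,
    spread: float = 0.20,
    rng: Optional[random.Random] = None,
) -> float:
    """Return base_seconds scaled by a random factor in [1 - spread, 1 + spread]."""
    rng = rng or random.Random()
    # Uniform jitter is an assumption; only the +/-20% range comes from the docs.
    return base_seconds * rng.uniform(1.0 - spread, 1.0 + spread)
```

For a base of 10 days, each entry would then expire somewhere between 8 and 12 days after it was written, so entries created in the same bulk run drift apart.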

Troubleshooting

A "Scraping canary" WARNING appears in the output

You may see a line like this:

WARNING  Scraping canary: tracklist page missing selectors ['tlpItem row', 'cue_seconds input'] at https://www.1001tracklists.com/tracklist/abc123/some-set/

This means 1001tracklists.com has changed its page structure so that the parser no longer finds the data it expects. Enrichment for that page type is likely to be incomplete until CrateDigger is updated to match the new layout. Affected fields may include genres, event date, stage name, or DJ artwork, depending on which selectors are missing.

What to do:

  1. Report the issue at github.com/Rouzax/CrateDigger/issues. Paste the full WARNING line so the selector names and URL are visible.
  2. Expect that files processed after this point may be missing some enrichment fields until a patched version is released.

What you do not need to worry about: the scraper still tries to extract whatever it can, so chapter markers and core tracklist data may still be present even when the canary fires. One WARNING is emitted per unique breakage per run, regardless of how many files are processed, so a single warning does not mean every file in the batch is broken.

See also