Organizing a Large Music Library in 2021

I have a large music library of MP3 files, around 3 TB, and finally got around to organizing it. I had a few requirements:

  • I wanted the file tags standardized, since the files came from a variety of sources.
  • I wanted the files organized as FIRSTLETTER/ARTIST/ALBUM/song.mp3

For this, I ended up with a two-step solution:

Step 1: Tag Everything with Beets

Beets is a very powerful metadata tagging and organizational system. It’s easiest to launch it via a Docker container, along with the PHP container used later.

First, start with the config.yaml for beets:

# config.yaml

plugins: fetchart embedart convert scrub replaygain lastgenre chroma web bucket
directory: /music
library: /config/musiclibrary.blb
art_filename: albumart
threaded: yes
original_date: no
per_disc_numbering: no

ui:
    color: yes

convert:
    auto: no
    ffmpeg: /usr/bin/ffmpeg
    opts: -ab 320k -ac 2 -ar 48000
    max_bitrate: 320
    threads: 1
    
paths:
    default: $albumartist/$album%aunique{}/$track - $title
    #default: $artist/$album%aunique{}/$track - $title
    singleton: %bucket{$artist}/$artist/$album%aunique{}/$track - $title
    comp: Various Artists/$album%aunique{}/$track - $title
    albumtype_soundtrack: Soundtracks/$album/$track $title 
        
import:
    write: yes
    copy: no
    move: yes
    resume: yes
    incremental: yes
    quiet_fallback: skip
    timid: no
    log: /config/beet.log

lastgenre:
    auto: yes
    source: album

embedart:
    auto: yes

fetchart:
    auto: yes
    
replaygain:
    auto: no

scrub:
    auto: yes

replace:
    '^\.': _
    '[\x00-\x1f]': _
    '[<>:"\?\*\|]': _
    '[\xE8-\xEB]': e
    '[\xEC-\xEF]': i
    '[\xE2-\xE6]': a
    '[\xF2-\xF6]': o
    '[\xF8]': o
    '\.$': _
    '\s+$': ''

web:
    host: 0.0.0.0
    port: 8337

match:
    strong_rec_thresh: 0.20

bucket:
   bucket_alpha: ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
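
To make the singleton path template and the replace rules above more concrete, here is a rough Python sketch of what they do to a track's tags. It is only an approximation of beets' behavior, not its actual code: the function names are made up for illustration, %aunique{} is ignored, %bucket is approximated as the uppercased first character of the artist, and the track number is assumed to be zero-padded to two digits.

```python
import re

# Rules copied from the `replace:` section of config.yaml, in the same order.
REPLACE_RULES = [
    (r'^\.', '_'),
    (r'[\x00-\x1f]', '_'),
    (r'[<>:"\?\*\|]', '_'),
    (r'[\xE8-\xEB]', 'e'),
    (r'[\xEC-\xEF]', 'i'),
    (r'[\xE2-\xE6]', 'a'),
    (r'[\xF2-\xF6]', 'o'),
    (r'[\xF8]', 'o'),
    (r'\.$', '_'),
    (r'\s+$', ''),
]


def sanitize_component(name: str) -> str:
    """Apply every replace rule, in order, to one path component."""
    for pattern, replacement in REPLACE_RULES:
        name = re.sub(pattern, replacement, name)
    return name


def singleton_path(artist: str, album: str, track: int, title: str) -> str:
    """Approximate `%bucket{$artist}/$artist/$album/$track - $title`."""
    bucket = artist[:1].upper()  # assumption: bucket is the first letter
    components = [bucket, artist, album, f"{track:02d} - {title}"]
    return "/".join(sanitize_component(c) for c in components) + ".mp3"


print(singleton_path("Motörhead", "Ace of Spades", 1, "Ace of Spades"))
# M/Motorhead/Ace of Spades/01 - Ace of Spades.mp3
```

Note that the rules run in order, so a name like "R.E.M." ends up as "R.E.M_" because of the trailing-dot rule.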

Then change the folder paths as needed and launch the containers:

# docker-compose.yaml
---
version: "2.1"
services:
  beets:
    image: ghcr.io/linuxserver/beets
    container_name: beets
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/Los_Angeles
    volumes:
      - /c/Users/Me/Documents/beets:/config
      - /l/Music/destination:/music
      - /l/Music:/downloads
    ports:
      - 8337:8337
    restart: unless-stopped
    
  app-php73:
    build:
      context: .
      dockerfile: Dockerfile.php73
    container_name: app-php73
    volumes:
      - /c/Users/Me/Documents/beets:/config
      - /l/Music/destination:/music
      - /l/Music:/downloads
      - ./:/docker/
    # cap and privileged needed for slowlog
    cap_add:
      - SYS_PTRACE
    privileged: true
    environment:
      - VIRTUAL_HOST=.app.boilerplate.docker
      - VIRTUAL_PORT=80
      - POSTFIX_RELAYHOST=[mail]:1025
      - "PS1=[app-php73 $$(whoami):$$(pwd)] $$ "

With the corresponding Dockerfile.php73:

# Dockerfile.php73

FROM webdevops/php-apache-dev:centos-7-php7

ENV PROVISION_CONTEXT="development"

# Deploy scripts/configurations

RUN docker-service-disable postfix

# Misc
RUN yum install -y joe

# Update PHP
RUN yum -y install http://rpms.remirepo.net/enterprise/remi-release-7.rpm \
    && yum remove -y php70w-pecl-imagick \
    && yum install -y \
        php73 \
        php73-php-fpm \
        php73-php-pecl-xdebug \
        php73-php-mysqlnd \
        php73-php-mbstring \
        php73-php-intl \
        php73-php-gd \
        php73-php-opcache \
        php73-php-pecl-imagick \
        php73-php-xml \
        php73-php-pecl-redis5 \
        php73-php-pecl-zip \
        php73-php-soap \
    && rm -f /usr/bin/php \
    && ln -s /usr/bin/php73 /usr/bin/php \
    && mkdir -p /usr/local/php/7.3.10/bin \
    && ln -s /usr/bin/php /usr/local/php/7.3.10/bin/php \
    && rm /usr/local/bin/php-fpm \
    && ln -s /opt/remi/php73/root/usr/sbin/php-fpm /usr/local/bin/php-fpm
	
RUN yum -y localinstall --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-7.noarch.rpm \
    && yum -y install ffmpeg

# Configure volume/workdir
WORKDIR /app/

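After launching these, open a shell in the beets container and retag everything in your source directory. Here -s imports each file as a singleton (so the singleton path template applies), -q skips the interactive prompts, and -vv increases log verbosity: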

beet -vv import -sq /downloads

A few things to note:

  • Beets will only move a file if its tags have changed – I think this is by design. We will move the remaining files in the next step.
  • While you can download the entire MusicBrainz database (30+ GB), it’s not worth it. With a free MusicBrainz account you can get access to the API with a rate limit of one request per second, which is a lot more than you can use with beets in practice.

Step 2: Move the Files

After beets has retagged everything and moved some of the files (maybe 30% in my experience), you can run this mp3move.php script to move the remainder. It uses ffmpeg directly, as I found it more capable of extracting ID3 tags than pure-PHP solutions. It also shells out to Linux directory scanning rather than iterating in PHP, since the latter is known to be buggy on Windows/Docker when you’re dealing with large directories.
