
Rebuilding Dezrann public corpora

Corpora are described through .json files (see metadata). As of Q1 2025, most corpora are built on the Dezrann host from these files through the tools/dezrann-corpus.py script located in the dezrann-corpus repository. Several corpora also require preliminary steps to prepare data and/or metadata from upstream archives.

The --build option of tools/dezrann-corpus.py is a shorthand for:

Prerequisites

To rebuild these corpora, you need an account with admin access on a Dezrann host, by either:

You can also rebuild corpora on a specific host using --host myhost on any dezrann-corpus command.
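As an illustration, several corpora could be rebuilt on one host with a small loop. This is only a sketch: the function name `build_on_host` and the `RUN` variable are hypothetical, `myhost` is a placeholder, and placing `--host` after the corpus name is an assumption about the argument order.

```shell
# Hypothetical helper: rebuild several corpora on a given host.
# Usage: build_on_host HOST CORPUS [CORPUS...]
build_on_host() {
    host="$1"
    shift
    for corpus in "$@"; do
        # By default the commands are only printed (dry run).
        # Set RUN to an empty string (RUN= build_on_host ...) to execute them.
        ${RUN-echo} python tools/dezrann-corpus.py --build "$corpus" --host "$host"
    done
}
```

Called as `build_on_host myhost bach-fugues mozart-string-quartets` from dezrann-corpus, it prints one build command per corpus, so the commands can be reviewed before they are actually run.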

Corpus

bach-fugues » The Well-Tempered Clavier, Book I

From dezrann-corpus:

python tools/dezrann-corpus.py --build bach-fugues

mozart-piano-sonatas » Mozart Piano Sonatas

From dezrann-corpus:

python tools/dezrann-corpus.py --build mozart-piano-sonatas

Alternatively, after downloading the long-term archive from https://doi.org/10.57745/OHRWPC:

python tools/dezrann-corpus.py --build --from-archive mozart-piano-sonatas.zip

mozart-string-quartets » Mozart String Quartets

From dezrann-corpus:

python tools/dezrann-corpus.py --build mozart-string-quartets

classical-symphonies » Classical and Early-Romantic Symphonies

From dezrann-corpus:

python tools/dezrann-corpus.py --build classical-symphonies
python tools/dezrann-corpus.py --build mozart-symphonies
python tools/dezrann-corpus.py --build haydn-symphonies
python tools/dezrann-corpus.py --build beethoven-symphonies

⚠️ Because the scores are very large and the audio files are long, this build requires a large amount of memory on the Dezrann backend.

Then, on the server (the corpus actually contains three subcorpora; this step still needs to be automated):

mv sandbox/classical-symphonies/ .
mv sandbox/beethoven-symphonies/ classical-symphonies/
mv sandbox/mozart-symphonies/ classical-symphonies/
mv sandbox/haydn-symphonies/ classical-symphonies/
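Until this step is automated, the four moves above could be wrapped in a small shell function. This is only a sketch: the function name `merge_symphonies` and its argument are hypothetical, and it assumes the `sandbox/` layout used in the commands above.

```shell
# Hypothetical helper: move classical-symphonies out of the sandbox,
# then move the three subcorpora inside it.
# Usage: merge_symphonies [DIR]   (DIR contains sandbox/, default: current directory)
merge_symphonies() {
    corpus_root="${1:-.}"
    (
        # Work in a subshell so the caller's working directory is unchanged.
        cd "$corpus_root" || exit 1
        # Move the parent corpus first, so the subcorpora have a destination.
        mv sandbox/classical-symphonies . || exit 1
        for sub in beethoven-symphonies mozart-symphonies haydn-symphonies; do
            mv "sandbox/$sub" classical-symphonies/ || exit 1
        done
    )
}
```

The function stops at the first failed move and returns a non-zero status, so a missing subcorpus does not go unnoticed.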

schubert-winterreise » Winterreise (Winter Journey)

Follow the instructions at https://gitlab.com/algomus.fr/dezrann/schubert-winterreise to prepare all data.

Clone this repository as a sibling directory of dezrann-corpus.

Then, from dezrann-corpus:

python tools/dezrann-corpus.py --build schubert-winterreise

openscore-lieder » 19th Century Lieder from female composers

(Optional) To regenerate metadata from upstream, clone https://github.com/OpenScore/Lieder (⚠️ wait for PR) as a sibling directory of dezrann-corpus. Then, from Lieder:

cd data
python to_dezrann.py
cp openscore-lieder-dezrann.json ../../dezrann-corpus/metadata/openscore-lieder.json

Then, from dezrann-corpus:

python tools/dezrann-corpus.py --build openscore-lieder

weimar-jazz » Weimar Jazz Database

Follow the instructions at https://gitlab.com/algomus.fr/dezrann-corpus/corpus/weimar-jazz-database/jazz/README.md to prepare all data.

Then, from dezrann-corpus:

python tools/dezrann-corpus.py --build weimar-jazz

supra » SUPRA

From dezrann-corpus, to prepare all data:

cd corpus/supra/src
yarn install
yarn start

(⚠️ TO BE CONFIRMED) Then, copy the data directly to your server:

scp -r pieces myhost:corpus/sandbox/supra
python tools/dezrann-corpus.py --build supra

slovenian-folk-song-ballads » Slovenian Folk Song Ballads

From dezrann-corpus:

python tools/dezrann-corpus.py --build slovenian-folk-song-ballads

Alternatively, after downloading the long-term archive from https://doi.org/10.57745/SINZFK:

python tools/dezrann-corpus.py --build --from-archive slovenian-folk-song-ballads.zip

erkomaishvili » Traditional Georgian Sacred Music sung by Artem Erkomaishvili

(Optional) To regenerate metadata from upstream, from dezrann-corpus:

cd corpus/erkomaishvili/scripts
yarn install
yarn run corpus
python3 clean-scores.py
cp ../corpus/erkomaishvili-pieces.json ../../../metadata

Then, from dezrann-corpus:

python tools/dezrann-corpus.py --build erkomaishvili