» Dezrann documentation
This page provides the status of the Dezrann corpora, including statistics on sources (scores, measure maps, audio/video content, and synchronization). There is a particular emphasis on quality ratings for corpora and pieces. Integrating multi-modal data from various sources is challenging. Rather than claiming that the integration is perfect, the main objective here is to accurately describe what is working, what is not, and to document any issues encountered. There are also links to open issues on the dezrann-corpus GitLab. While we prefer to have issues resolved, it is far better to have open issues describing problems than to ignore them! You are welcome to report issues and/or contribute to fixing them.
2025-02-08 12:40
🌏 Public corpora
| Corpus | Corpus quality | Metadata quality (corpus, pieces) | Rebuilt (date, size) | Pieces [scores on server] | Scores (count, quality) | Musical-time quality | Analyses (count, quality) | Recordings (count, quality) [audio on server] | Videos | Synchronizations (count, quality) | | Open issues |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| bach-fugues | 4.0 | 4.0 3.0 | 2025-01-08 362M | 24 [24] | 24 2.9 | 3.0 | 24 4.8 | 24 4.0 [48] | 32 | 48 2.9 | ✅ | 8 |
| mozart-piano-sonatas | 5.0 | 4.0 4.0 | 2025-01-24 356M | 72 [53] | 54 3.0 | 4.9 | 63 5.0 | 56 4.0 [54] | 1 | 56 2.6 | ✅ | 8 |
| mozart-string-quartets | 3.0 | 4.0 2.0 | 2025-01-08 482M | 109 [72] | 83 2.9 | 0.0 | 83 1.9 | 84 0.7 [21] | | 83 0.3 | ✅ | 4 |
| classical-symphonies | 2.0 | 4.0 5.0 | 2025-01-09 1.2G | 48 | 24 2.4 | 1.2 | 24 3.5 | 22 1.5 | | 6 0.6 | ✅ | 9 |
| schubert-winterreise | 3.0 | 4.0 | 2024-03-11 361M | 24 | 24 | | 24 | 48 4.0 | | 48 4.0 | ✅ | 13 |
| openscore-lieder | 3.0 | 4.0 3.0 | 2025-01-30 283M | 174 [170] | 174 4.0 | 3.0 | 53 3.0 | 32 4.0 [32] | | 27 3.0 | ✅ | 5 |
| weimar-jazz | 2.0 | 4.0 | 2025-01-29 1.5G | 456 [333] | 456 3.0 | 2.0 | 456 3.0 | 4.0 [228] | 329 | 329 | ✅ | 9 |
| supra | 3.0 | 3.0 | 2024-03-11 3.1G | 456 | | 0.0 | | 3.0 [456] | | 3.0 | ✅ | 7 |
| slovenian-folk-song-ballads | 5.0 | 4.0 4.0 | 2024-12-19 114M | 404 [404] | 404 3.0 | 4.0 | 404 4.0 | 404 5.0 [23] | | 404 3.0 | ✅ | 5 |
| erkomaishvili | 3.0 | 4.0 | 2025-01-31 341M | 101 [101] | 101 | | | 404 4.0 [404] | | 404 4.0 | ✅ | 7 |
bach-fugues » The Well-Tempered Clavier, Book I
- https://www.dezrann.net/explore/bach-fugues
- Bach’s Well-Tempered Clavier has been extensively studied, and systematic analyses of Bach’s fugues have been published by Prout (1910), Tovey (1924), Keller (1965), and Bruhn. Giraud et al. (2015) published an annotation dataset detailing the 24 fugues of the first book, together with algorithms for fugue analysis. The Dezrann corpus contains these 24 annotated fugues, with scores synchronized to open recordings by Kimiko Ishizaka as well as performances recorded by the Netherlands Bach Society.
- ✅ The corpus is in good condition, with only minor issues. For the majority of pieces, the scores are well presented, include analyses, and are synchronized with two videos.
- Content
- 24 pieces
- 24 scores, 24 analyses, 24 recordings, 32 videos, 48 synchronizations
- On server: 48 audio, 48 wave, 24 score, 24 3d
- Quality: corpus: 4.0 [1], corpus:metadata: 4.0 [1], annotation: 0→5 (avg 4.8) [24], audio: 4.0 [24], audio:synchro: 0→4 (avg 2.9) [24], metadata: 3.0 [24], musical-time: 3.0 [24], score: 0→3 (avg 2.9) [24]
- Metadata (68 KB)
- License: ? (scores), CC-BY-3.0 (audio), ODbL-1.0 (annotations)
- Maintainers
- Issues: 8 opened, 9 closed
- Rebuilt: 2025-01-08 362M
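The "Quality" lines on this page follow a regular pattern that can be read by a script. The sketch below is one possible way to parse them, not part of Dezrann itself. It assumes, from the values shown on this page, that `0→5 (avg 4.8) [24]` means a minimum, a maximum, an average, and the number of rated items, and that `?` or `??` means no rating is available. The names `Rating` and `parse_quality` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Rating(NamedTuple):
    low: Optional[float]    # lowest rating, None if unknown
    high: Optional[float]   # highest rating, None if unknown
    avg: Optional[float]    # average rating, None if unknown
    count: Optional[int]    # number of rated items, None if not given


_NUM = r"\d+(?:\.\d+)?"
# One entry looks like "score: 0→3 (avg 2.9) [24]", "audio: 4.0 [24]" or "score: ??".
_ENTRY = re.compile(
    rf"(?P<key>[\w:-]+?):\s*"
    rf"(?:\?+|(?P<low>{_NUM})(?:→(?P<high>{_NUM})\s*\(avg\s+(?P<avg>{_NUM})\))?)"
    rf"(?:\s*\[(?P<count>\d+)\])?"
)


def parse_quality(line: str) -> dict[str, Rating]:
    """Parse a 'Quality: ...' line into a mapping from rating name to Rating."""
    body = line.split("Quality:", 1)[-1]
    ratings: dict[str, Rating] = {}
    for entry in body.split(","):
        m = _ENTRY.fullmatch(entry.strip())
        if m is None:
            continue  # skip anything that does not follow the assumed pattern
        count = int(m["count"]) if m["count"] is not None else None
        if m["low"] is None:
            # "?" or "??": no rating available
            ratings[m["key"]] = Rating(None, None, None, count)
        elif m["high"] is None:
            # single value: treat it as low, high and average at once
            value = float(m["low"])
            ratings[m["key"]] = Rating(value, value, value, count)
        else:
            ratings[m["key"]] = Rating(
                float(m["low"]), float(m["high"]), float(m["avg"]), count
            )
    return ratings


line = "- Quality: corpus: 4.0 [1], annotation: 0→5 (avg 4.8) [24], score: 0→3 (avg 2.9) [24]"
print(parse_quality(line)["annotation"])
# Rating(low=0.0, high=5.0, avg=4.8, count=24)
```

Treating a single value as its own minimum, maximum, and average keeps every rating comparable, which makes it easy to sort corpora by their weakest dimension.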
mozart-piano-sonatas » Mozart Piano Sonatas
- https://www.dezrann.net/explore/mozart-piano-sonatas
- The corpus consists of complete scores of all 18 sonatas with form, harmony, and cadence annotations (Hentschel et al., 2021). Sonatas 1 (K279), 2 (K280) and 5 (K283) also have texture annotations (Couturier et al., 2022). Some movements also have synchronized audio. The corpus uses measure maps (Gotham et al., 2023) to improve annotation interoperability.
- ✅ The corpus is in good condition, with only minor issues. It is fully reproducible from a long-term archive. For the majority of pieces, the scores are well presented and include analyses. Some of them have synchronized audio.
- Content
- 72 pieces
- 54 scores, 54 measure maps, 63 analyses, 56 recordings, 1 video, 56 synchronizations
- On server: 54 audio, 54 wave, 53 score, 53 3d
- Quality: corpus: 5.0 [1], corpus:metadata: 4.0 [1], annotation: 5.0 [54], audio: 4.0 [54], audio:synchro: 0→3 (avg 2.6) [54], metadata: 4.0 [54], musical-time: 1→5 (avg 4.9) [53], score: 3.0 [54], warning: ? [0]
- Metadata (128 KB)
- License: CC-BY-NC-SA-4.0 (scores), ODbL (annotations), CC0-1.0, CC-BY-NC-SA-3.0 (specific recordings)
- Maintainers
- Issues: 8 opened, 3 closed
- Rebuilt: 2025-01-24 356M
mozart-string-quartets » Mozart String Quartets
- https://www.dezrann.net/explore/mozart-string-quartets
- The corpus currently shows 72 of the 86 movements. Cadence, key, and form annotations are provided for some of these movements (mainly first movements, in sonata form), as published in (Allegraud et al., 2019) and (Feisthauer, 2021).
- ✅ The corpus is in good condition, with only minor issues. However, score quality is uneven across the corpus, and some pieces are not yet synchronized.
- Content
- 109 pieces
- 83 scores, 83 analyses, 84 recordings, 83 synchronizations
- On server: 72 score, 72 3d, 21 audio, 21 wave
- Quality: corpus: 3.0 [1], corpus:metadata: 4.0 [1], annotation: ??, audio: 0→3 (avg 0.7) [83], audio:synchro: 0→4 (avg 0.3) [83], metadata: 2.0 [83], musical-time: 0.0 [83], score: ??
- Metadata (205 KB)
- License: ? (scores), ODbL (annotations), CC-BY-NC-ND-3.0 (audio)
- Maintainers
- Issues: 4 opened, 1 closed
- Rebuilt: 2025-01-08 482M
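The "Content" and "On server" lines can be compared to spot pieces whose sources are declared but not yet served. The following is a minimal sketch under stated assumptions: it supposes that "scores" in the content line and "score" in the "On server" line count the same kind of object, which this page does not state explicitly. The helper names `parse_counts` and `score_gap` are hypothetical.

```python
import re


def parse_counts(line: str) -> dict[str, int]:
    """Turn '83 scores, 84 recordings' or 'On server: 72 score, 21 audio' into {label: count}."""
    body = line.split(":", 1)[-1]  # drop an optional 'On server:' prefix
    return {label.strip(): int(n) for n, label in re.findall(r"(\d+)\s+([^,]+)", body)}


def score_gap(content_line: str, server_line: str) -> int:
    """Number of declared scores that are not counted as served (assumed correspondence)."""
    declared = parse_counts(content_line).get("scores", 0)
    served = parse_counts(server_line).get("score", 0)
    return declared - served


# Values copied from the mozart-string-quartets section above.
content = "83 scores, 83 analyses, 84 recordings, 83 synchronizations"
server = "On server: 72 score, 72 3d, 21 audio, 21 wave"
print(score_gap(content, server))  # 11
```

A non-zero gap does not by itself indicate an error; it only points at a corpus where the declared sources and the served files are worth checking against the open issues.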
classical-symphonies » 🚧 Classical and Early-Romantic Symphonies
- https://www.dezrann.net/explore/classical-symphonies
- The corpus includes first movements of 24 symphonies composed between 1779 and 1824: the last six Haydn Symphonies (99–104), three Mozart Symphonies (38–40), and all nine Beethoven Symphonies. These movements are analyzed with textural annotations by (Le et al., 2022). Audio recordings by the Bamberger Symphoniker (Mozart) and by the Royal Philharmonic Orchestra (Haydn, Beethoven, 1960-61) will be added soon.
- ⚠️ The experience with this corpus is not smooth due to numerous performance issues when displaying the scores. Moreover, many pieces are not synchronized with the audio recordings. However, the annotation data of the corpus is in good condition.
- Content
- 48 pieces
- 24 scores, 24 analyses, 22 recordings, 6 synchronizations
- Quality: corpus: 2.0 [1], corpus:metadata: 4.0 [1], annotation: 3→5 (avg 3.5) [24], audio: ??, audio:synchro: 0→3 (avg 0.6) [24], metadata: 4→5 (avg 5.0) [24], musical-time: ??, score: ??
- Metadata (53 KB)
- License: CC-BY-NC-SA-4.0 (scores), ODbL (annotations), CC0-1.0, CC-BY-NC-SA-3.0 (specific recordings)
- Maintainers
- Issues: 9 opened, 1 closed
- Rebuilt: 2025-01-09 1.2G
schubert-winterreise » Winterreise (Winter Journey)
- https://www.dezrann.net/explore/schubert-winterreise
- The Schubert Winterreise Dataset (SWD, Weiß 2021) contains, for all of the 24 lieder, scores, harmonic and formal analyses, as well as synchronized recordings. The free recordings with Gerhard Hüsch and Hanns-Udo Müller (1933) and Randall Scarlata and Jeremy Denk (2006) are available through Dezrann.
- The corpus is in good condition. Most scores are well presented, with two synchronized recordings and analyses. ⚠️ However, some issues remain on specific pieces.
- Content
- 24 pieces
- 24 scores, 24 measure maps, 24 analyses, 48 recordings, 48 synchronizations
- Quality: corpus: 3.0 [1], corpus:metadata: 4.0 [1], audio: 4.0 [24], audio:synchro: 4.0 [24]
- Metadata (43 KB)
- License: CC-BY-3.0 (scores, annotations), PDM-1.0, CC-BY-NC-ND-3.0 (audio)
- Maintainers
- Issues: 13 opened, 3 closed
- Rebuilt: 2024-03-11 361M
openscore-lieder » 19th Century Lieder from female composers
- https://www.dezrann.net/explore/openscore-lieder
- The OpenScore Lieder corpus consists of over 1,300 songs from the long nineteenth century. The collection is available to play online at musescore.com and is also available for download. For more on the score collection see (Gotham and Jonas 2021) or this magazine piece. This Dezrann collection presents a subset of the scores by women composers, including harmonic analyses published on the ‘When in Rome’ meta-corpus reported in Gotham et al. 2023a. The corpus uses measure maps (Gotham et al., 2023b) to improve annotation interoperability.
- ✅ The corpus is in good condition, with only minor issues. The scores are well presented, and some of them include analyses and/or synchronized recordings. Future work includes adding more analyses and open recordings.
- Content
- 174 pieces
- 174 scores, 174 measure maps, 53 analyses, 32 recordings, 27 synchronizations
- On server: 170 score, 170 3d, 32 audio, 32 wave
- Quality: corpus: 3.0 [1], corpus:metadata: 4.0 [1], metadata: 3.0 [174], musical-time: 3.0 [174], score: 4.0 [174], annotation: 3.0 [53], audio: 4.0 [32], audio:synchro: 3.0 [27]
- Metadata (177 KB)
- Issues: 5 opened, 3 closed
- Rebuilt: 2025-01-30 283M
weimar-jazz » 🚧 Weimar Jazz Database
- https://www.dezrann.net/explore/weimar-jazz
- Started at the University of Music in Weimar, the Jazzomat project studied the jazz repertoire, in particular by transcribing and analyzing 400+ solos and aligning them to recordings. The Dezrann corpus contains 330+ of these high-quality jazz transcriptions, with chord, section, and form annotations, of which 200+ have synchronized audio.
- 🚧 Work on this corpus is still ongoing to improve the integration into Dezrann. The rendering of scores could be improved with a better extraction from WJD internal data. The synchronization is sometimes off.
- Content
- 456 pieces
- 456 scores, 456 measure maps, 456 analyses, 329 videos, 329 synchronizations
- On server: 333 score, 333 3d, 228 audio, 228 wave
- Quality: corpus: 2.0 [1], corpus:metadata: 4.0 [1], annotation: 3.0 [456], audio: 4.0 [456], musical-time: 2.0 [456], score: 3.0 [456]
- Metadata (623 KB)
- Issues: 9 opened, 3 closed
- Rebuilt: 2025-01-29 1.5G
supra » SUPRA
- https://www.dezrann.net/explore/supra
- The Stanford University Piano Roll Archive is a research portal for some rolls digitized from the Stanford Libraries’ collection of 15,000+ piano and organ rolls. SUPRA contains 456 Welte T-100 piano rolls from the years 1905-1928 with rendered ‘expressive’ audio, taking into account dynamics and tempo information. The Dezrann corpus shows these 456 piano rolls aligned to the audio files.
- ✅ The corpus is in good condition, with only minor issues. Most piano rolls are well presented, with synchronized audio. Future work includes adding scores and analyses for some pieces.
- Content
- 456 pieces
- ⚠️ No sources ?
- On server: 456 audio, 456 wave
- Quality: corpus: 3.0 [1], corpus:metadata: 3.0 [1], musical-time: 0.0 [456], audio: 3.0 [456], audio:synchro: 3.0 [456]
- Metadata (314 KB)
- Issues: 7 opened, 4 closed
- Rebuilt: 2024-03-11 3.1G
slovenian-folk-song-ballads » Slovenian Folk Song Ballads
- https://www.dezrann.net/explore/slovenian-folk-song-ballads
- The collection Slovenske ljudske pripovedne pesmi (Slovenian folk narrative songs) contains transcribed field material collected by Slovenian ethnologists, folklorists, ethnomusicologists, and various collaborators of the Glasbenonarodopisni inštitut ZRC SAZU between 1819 and 1995. Thematically classified under family narrative songs, it comprises 404 monophonic transcriptions of folk songs and includes the opening verse of the lyrics, extensive metadata, and a musical analysis covering the contours, harmony, and structure of the songs (melodies and lyrics) (see Borsan et al., 2023). It also includes a limited number (23) of available recordings. Editorial board of the collection: Vanessa Nina Borsan, current members of the Glasbenonarodopisni inštitut ZRC SAZU (Mojca Kovačič, Marjeta Pisk), and the Algomus research group.
- ✅ The corpus is in good condition, with only minor issues. It is fully reproducible from a long-term archive. For the majority of pieces, the scores are well presented and include analyses. A few of them have synchronized recordings.
- Content
- 404 pieces
- 404 scores, 404 measure maps, 404 analyses, 404 recordings, 404 synchronizations
- On server: 404 score, 404 3d, 23 audio, 23 wave
- Quality: corpus: 5.0 [1], corpus:metadata: 4.0 [1], annotation: 4.0 [404], audio: 5.0 [404], audio:synchro: 3.0 [404], metadata: 4.0 [404], musical-time: 4.0 [404], score: 3.0 [404]
- Metadata (717 KB)
- Issues: 5 opened, 5 closed
- Rebuilt: 2024-12-19 114M
erkomaishvili » Traditional Georgian Sacred Music sung by Artem Erkomaishvili
- https://www.dezrann.net/explore/erkomaishvili
- The Erkomaishvili dataset consists of historic tape recordings of three-voice Georgian religious songs performed in 1966 by the master chanter Artem Erkomaishvili. Successive overdubbing recordings were made for each song: top voice, then top and second voice, then the three voices together. These recordings have been digitized, curated, and analyzed by (Rosenzweig, 2020) for computational musicology research. The dataset includes audio material, scores based on the transcriptions by (Shugliashvili, 2014), synchronizations, and F0 annotations. The Dezrann corpus contains all 101 songs of the Erkomaishvili dataset, with scores synchronized with the audio files.
- ✅ The corpus is in good condition, with only minor issues. For the majority of pieces, the scores are well presented, with four synchronized audio recordings.
- Content
- 101 pieces
- 101 scores, 404 recordings, 404 synchronizations
- On server: 404 audio, 404 wave, 101 score, 101 3d
- Quality: corpus: 3.0 [1], corpus:metadata: 4.0 [1], audio: 4.0 [101], audio:synchro: 4.0 [101]
- Metadata (210 KB)
- Issues: 7 opened, 7 closed
- Rebuilt: 2025-01-31 341M