A metadata/corpus.json file gathers all data/metadata information on pieces into a corpus, that is: corpus (corpus metadata), opus (piece metadata), and sources.
You are welcome to discuss, or to propose a merge request (MR), to improve the definition and the handling of such metadata.
{
"corpus": {
// Corpus metadata [mandatory: at least id and title]
"id": "bach-fugues",
"title": "Das wohltemperierte Klavier, Buch I",
(...)
},
"pieces": {
"bwv846": {
// Piece metadata [mandatory, see below]
"opus": { ... },
// Sources (scores, audio, analyses, ...)
// [mandatory: at least one score/audio source]
"sources": [ ... ],
// Piece settings
"settings": { ... }
},
"bwv847": {
...
}
},
"settings": {
// Optional
"access": "public"
},
"template": {
// Optional
...
}
}
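The mandatory parts of this layout can be checked with a few lines of Python. This is only a minimal sketch, not the validation Dezrann itself performs: the function name `check_corpus_structure` and the file names in the example are hypothetical, and the input is strict JSON (the `//` comments shown above would have to be removed before parsing with the standard `json` module). It checks only that at least one source is present, not what kind of source it is.

```python
import json

def check_corpus_structure(data: dict) -> list[str]:
    """Return a list of problems found in a parsed corpus.json (empty if none)."""
    problems = []
    corpus = data.get("corpus", {})
    # Corpus metadata: at least id and title are mandatory
    for key in ("id", "title"):
        if not corpus.get(key):
            problems.append(f"corpus: missing mandatory field '{key}'")
    for piece_id, piece in data.get("pieces", {}).items():
        # Piece metadata is mandatory
        if "opus" not in piece:
            problems.append(f"{piece_id}: missing 'opus' metadata")
        # At least one source has to be provided
        if not piece.get("sources"):
            problems.append(f"{piece_id}: no source provided")
    return problems

# Hypothetical draft corpus (strict JSON, without comments)
example = json.loads("""
{
  "corpus": {"id": "bach-fugues", "title": "Das wohltemperierte Klavier, Buch I"},
  "pieces": {
    "bwv846": {"opus": {"id": "bach/bwv846"}, "sources": [{"score": "bwv846.mei"}]},
    "bwv847": {"opus": {"id": "bach/bwv847"}, "sources": []}
  }
}
""")
print(check_corpus_structure(example))  # ['bwv847: no source provided']
```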
Files should be referred to by a URL (preferably to some git repository) or a stable external identifier. They can also be referred to by a local path (relative to the corpus.json file).
There can be optional settings specific to Dezrann or any other application.
The metadata/corpus.json file can be split into several files, for example when the corpus section is hand-curated whereas the pieces are produced by a script.
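How the split files are recombined is not specified here. As an illustration only, one plausible approach is to merge the top-level sections of each part; the helper `merge_corpus_parts` below is hypothetical and is not part of Dezrann.

```python
def merge_corpus_parts(*parts: dict) -> dict:
    """Merge several partial corpus dictionaries, section by section.

    Assumption: sections that are dictionaries in both parts (such as
    "pieces") are merged key by key, later parts taking precedence.
    Any other value is simply overwritten by the later part.
    """
    merged: dict = {}
    for part in parts:
        for section, content in part.items():
            if isinstance(content, dict) and isinstance(merged.get(section), dict):
                merged[section] = {**merged[section], **content}
            else:
                merged[section] = content
    return merged

# Hand-curated corpus section, and pieces produced by a script
corpus_part = {"corpus": {"id": "openscore-lieder", "title": "OpenScore Lieder"}}
pieces_part = {"pieces": {"piece-1": {"opus": {}, "sources": []}}}
full = merge_corpus_parts(corpus_part, pieces_part)
print(sorted(full))  # ['corpus', 'pieces']
```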
The /metadata directory contains several examples, such as:
minimal.json (minimal example, two pieces, scores, audios, analyses)
openscore-lieder-corpus.json (corpus) and openscore-lieder.json (pieces, generated from data in another git repository)
bach-fugues.json (with some templating yielding bach-fugues.full.json)
mozart-piano-sonatas.json (with some templating yielding mozart-piano-sonatas.full.json)
supra.json and weimar-jazz.json (only corpus, the upload of the pieces is directly handled by custom scripts)
For most baroque/classical/romantic music, the score is the reference, and there may be recordings. For jazz/pop, a recording is the reference, and there may be a score, as in several examples in edmus.json.
corpus Metadata
This section describes the corpus as a whole. It is used both on http://dezrann.net/corpora and on corpus pages such as http://dezrann.net/explore/mozart-piano-sonatas.
To draft a corpus, only id and title are mandatory. The other fields will have to be carefully written/reviewed/translated to prepare the release of a public corpus.
id: Mandatory, unique identifier, such as "bach-fugues", "corelli-trio-sonatas" or "schubert-winterreise".
🏳️ shorttitle: Title with < 25 characters, to be displayed on http://dezrann.net/corpora and in other places with reduced space
🏳️ motto: Short text (< 120 characters, nominal sentence) advertising the corpus, including the number of works
image: URL of one image (recommended size: XXXpx XXXpx) to illustrate the corpus
🏳️ title: Mandatory, title of the corpus
🏳️ text: 3-6 lines presenting the corpus, with a historical / musicological perspective (and not referring to Dezrann). Links (in markdown) to Wikipedia or other sites are welcome.
🏳️ availability: (for the general public) 1-4 lines detailing what is actually in Dezrann, stating which content is there (scores, autographs, analyses, synchronized audio), both qualitatively and quantitatively.
status: (for the technical audience) short message summarizing the main issues with this corpus, which will be displayed on https://doc.dezrann.net/status
🏳️ contributors.*: see metadata.md#the-contributors-bloc. Before a public release, the corpus should contain a contributors.maintainer field.
showcase: a list of 1-4 ids of pieces in the corpus with a particularly high quality (availability of sources, quality of those). They will be randomly displayed/showcased in some places such as http://dezrann.net/corpora
Optional external references ref*: see below
quality:corpus and quality:corpus:metadata: See quality.md. If you just completed a draft of corpus.json, start by setting both of these fields to 1.
genre
opus: catalogue numbers or opus numbers
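The length and size limits listed above lend themselves to an automated check before a public release. The sketch below is one possible reading of those limits; the function `check_corpus_fields` is hypothetical, and so are the sample values.

```python
def check_corpus_fields(corpus: dict, piece_ids: set[str]) -> list[str]:
    """Check the size limits of the corpus section (a sketch, not Dezrann code)."""
    problems = []
    # shorttitle: fewer than 25 characters
    if len(corpus.get("shorttitle", "")) >= 25:
        problems.append("shorttitle should have fewer than 25 characters")
    # motto: fewer than 120 characters
    if len(corpus.get("motto", "")) >= 120:
        problems.append("motto should have fewer than 120 characters")
    # showcase: 1 to 4 ids of pieces that belong to the corpus
    showcase = corpus.get("showcase")
    if showcase is not None:
        if not 1 <= len(showcase) <= 4:
            problems.append("showcase should list 1 to 4 pieces")
        for piece_id in showcase:
            if piece_id not in piece_ids:
                problems.append(f"showcase: unknown piece '{piece_id}'")
    return problems

corpus = {
    "id": "bach-fugues",
    "title": "Das wohltemperierte Klavier, Buch I",
    "shorttitle": "Bach fugues",
    "showcase": ["bwv846", "bwv999"],
}
print(check_corpus_fields(corpus, {"bwv846", "bwv847"}))
# ["showcase: unknown piece 'bwv999'"]
```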
piece Metadata and Data
The piece data may be either static and fully written by hand, or produced by a script from some other data, and/or maintained with templates (see templating.md).
sources Data
At least one source has to be provided.
"sources": [
// Scores (.MEI (preferred), .musicxml, .mscz, .krn), see scores.md
{
"score": "http://gitlab.com/bla/bli/02.krn",
"license": "CC-BY-SA-4.0",
"contributors": {
"encoder": "Jane Foo",
"editor": "KernScores"
}
},
{
"score": "http://bla.net/sonata-12.mscz",
"measure-map": "http://bla.net/sonata-12.mm.json"
},
{
"score": "vivaldi-04.mei"
},
// Scores from Neuma pipeline
{
"score:neuma": "all:collabscore:saintsaens-ref:C080_0"
},
{
"score:gallica": "ark:/12148/bpt6k1162028r"
},
// Audio/video sources
{
"audio": "http://gitlab.com/bla/bli/bla-07.mp3",
"source": "http://a-wonderful-open-music-project.org/",
"license": "CC0-1.0"
},
{
"video": "http://a-wonderful-open-music-project.org/my-video-07.mp4",
"source": "http://a-wonderful-open-music-project.org/",
"license": "CC0-1.0",
"contributors": {
"performer": "Clara Dee"
},
"info": "Studio recording"
},
{
"video:yt": "df6DFfs",
"contributors": {
"performer": "Clara Dee"
},
"info": "Live recording at the Schnupz Concert Hall"
},
{
"audio:yt": "df6DFfs"
},
{
"audio:yt": "df6edfs",
"synchro": "http://gitlab.com/bla/bli/synchro.json"
},
// Analyses that will be displayed in Dezrann
{
"analysis": "https://gitlab.com/foo/bar/haydn-symph099-mvt1.dez",
"contributors": {
"analyst": "Dinh-Viet-Toan Le, Francesco Maccarini"
},
"ref:doi": "10.1145/3543882.3543884"
},
// Special sources, with provided image(s) and position files
// Any scan, with positions
{
"images": [ "http://bli/scan-07-page1.jpg", "http://bli/scan-07-page2.jpg" ],
"positions": "positions-07.json"
},
{
"image": "http://bli/scan-07.jpg",
"positions": "positions-07.json"
},
// Scan of piano rolls (to be better specified)
{
"audio": "bla/bli07.mp3",
"image": "bli/roll-07.jpg",
"positions": "positions-07.json"
},
{
"grid": "| D A | Bm D/F# | G D | Em7 A7"
}
]
For each source, one (and only one) of these fields has to be provided:
score (URL or file)
score:neuma or score:gallica
video (URL or file)
video:yt
audio (URL or file), including videos that are… not real videos
audio:yt
analysis (URL or file in .dez format)
These fields are optional:
contributors (dictionary, see below)
name (short name used to refer to this source and distinguish it from others. For example, for a score, it may correspond to contributors.encoder. For an audio, it is usually the same as contributors.performer or contributors.artist. But it can also be contributors.editor when relevant, for example to acknowledge an open-data project)
info (short string, < 100 characters)
license (SPDX identifier)
source (URL)
album (string)
(Jazz: Concert place / Recording Label, to be detailed)
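The "one and only one" rule can be expressed as a small check. This is a sketch under assumptions: the function `main_field` is hypothetical, and it covers only the fields listed above, not the special sources (images, image, grid) shown in the example, whose rules are not specified here.

```python
# Main fields listed in this section; special sources are not covered
MAIN_FIELDS = (
    "score", "score:neuma", "score:gallica",
    "video", "video:yt",
    "audio", "audio:yt",
    "analysis",
)

def main_field(source: dict) -> str:
    """Return the single main field of a source, or raise ValueError."""
    found = [field for field in MAIN_FIELDS if field in source]
    if len(found) != 1:
        raise ValueError(f"expected exactly one main field, found {found}")
    return found[0]

print(main_field({"score": "vivaldi-04.mei"}))  # score
print(main_field({
    "audio:yt": "df6edfs",
    "synchro": "http://gitlab.com/bla/bli/synchro.json",
}))  # audio:yt
```

A source that combines two main fields, such as both score and audio, raises a ValueError under this reading.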
opus Piece Metadata
opus: Basic information
(when applicable) corpus or collection
corpus for generic names such as "Mozart piano sonatas" (they do not bring more information than what is in piece:title and composer)
collection for data giving more information ("London symphonies", "Le quattro stagioni"). The two fields may be different.
(mandatory) id: Unique identifier, such as "bach/bwv847". Should include some opus information and be consistent with other identifiers used in Dezrann. (Note that there is also an id field outside opus; it will be removed at some point.)
opus: Titles
(mandatory) 🏳️ title or piece:title: We follow the Open Opus style guide, except that we do not put the opus information (see opus below).
(when applicable) 🏳️ nickname or piece:nickname
(when applicable) movement:num and movement:title (in this case, we do not use title and nickname but rather piece:title and piece:nickname)
(mandatory) opus: catalogue number or opus number, such as "opus": "K.551". There are sometimes numbers that are outside catalogue/opus numbers and that may be encoded as other fields, such as "symphony:num": "41" for Mozart’s Jupiter.
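The interplay between title, piece:title, and the movement fields can be sketched as a check. This is one interpretation of the rules above, not Dezrann's own validation: the function `check_opus` is hypothetical, and the id and title values in the example are illustrative only.

```python
def check_opus(opus: dict) -> list[str]:
    """Check the mandatory opus fields (a sketch under the stated assumptions)."""
    problems = []
    for key in ("id", "opus"):
        if key not in opus:
            problems.append(f"missing mandatory field '{key}'")
    is_movement = "movement:num" in opus or "movement:title" in opus
    if is_movement:
        # Movements use piece:title / piece:nickname instead of title / nickname
        if "piece:title" not in opus:
            problems.append("a movement should define 'piece:title'")
        for key in ("title", "nickname"):
            if key in opus:
                problems.append(f"a movement should use 'piece:{key}', not '{key}'")
    elif "title" not in opus and "piece:title" not in opus:
        problems.append("missing mandatory field 'title' or 'piece:title'")
    return problems

# Illustrative values only
movement = {
    "id": "mozart/k551-1",
    "opus": "K.551",
    "symphony:num": "41",
    "piece:title": "Symphony no. 41",
    "piece:nickname": "Jupiter",
    "movement:num": "1",
}
print(check_opus(movement))  # []
```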
opus: Other information
genre
key: key/tonality of this file (movement, not piece), such as D minor or Bb major
meter: such as 4/4 or 3/4, 6/8. More complex meters can be given through measure maps in the score source.
corpus, opus, and one of the sources
year: year, or range of years
ref: External references
{
"ref": "https://github.com/DCMLab/mozart_piano_sonatas/tree/main/scores",
"ref:gallica": "ark:/12148/btv1b55002567w",
"ref:neuma": "all:collabscore:saintsaens-ref:C080_0",
"ref:doi": "10.1145/3543882.3543884",
"ref:wikipedia": "Symphony_No._35_(Mozart)",
"ref:musicbrainz": "611a55ef-cfb4-3bbf-9d65-7d4af5506093",
"ref:musicbrainz:recording": "ed80fa40-4871-4669-9509-18bcfee420fd",
"ref:imslp": "19_Sonatas_for_the_Piano_(Mozart,_Wolfgang_Amadeus)",
"ref:kernscores": "mozart/sonata/sonata04-1.krn",
"ref:rism": "990062490",
"ref:wjd": 70
}
ref is any URL. The other ref: fields are identifiers on some sites/databases, as defined in RefsBase.ts. When possible, put a reference to the particular movement. But it is also acceptable to put an external reference to a piece or even to the collection.
It is strongly recommended to include as much information as possible. Specifically, for scores, it is advisable to include ref:rism to trace the primary source used by the encoder(s).
contributors bloc
🏳️ One of the three following fields is usually defined:
(for opus)
composer
artist (in pop music, when the music is frequently related to the artist rather than the composer)
performer (for transcription of jazz solos)
(for an audio source)
performer
🏳️ The following fields can also be used:
lyricist
arranger
editor (edition, publisher)
encoder (digital encoding of an existing edition)
transcriber (transcription of solos)
analyst (supervision of analyses)
annotator (annotation, following precise analytical guidelines)
maintainer (long-term maintenance of the corpus or the piece)
In either corpus, piece, or source metadata, the 🏳️ fields can be localized, such as:
"collection": "Le quattro stagioni",
"collection:en": "The four seasons",
"collection:fr": "Les quatre saisons",
(...)
"ref:wikipedia:fr": "Les_Quatre_Saisons",
Do not put :xx for the title in the original language.
Note that even contributor names such as composers may be known with some variations across languages.
"contributors": {
"composer": "Johann Sebastian Bach",
"composer:fr": "Jean-Sébastien Bach"
}
Do not localize numeric or formalized fields such as key, opus, or genre.
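A consumer of these metadata can look up a localized field and fall back to the original-language value when no translation exists. The helper `localized` below is a hypothetical sketch of that lookup, not Dezrann's implementation.

```python
def localized(meta: dict, field: str, lang: str):
    """Return meta[field:lang] if present, else the unsuffixed original value."""
    return meta.get(f"{field}:{lang}", meta.get(field))

meta = {
    "collection": "Le quattro stagioni",
    "collection:en": "The four seasons",
    "collection:fr": "Les quatre saisons",
}
print(localized(meta, "collection", "fr"))  # Les quatre saisons
print(localized(meta, "collection", "de"))  # Le quattro stagioni
```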