A corpus on Dezrann should contain score(s), set(s) of analyses/annotations, and synchronized audio(s) (or at least two of the three), together with appropriate metadata. As far as possible, these files should be available under open-data licenses, as described on Open Science and Licenses. We try to add corpora in a reproducible way, as is done for the public corpora available on the platform. The data you use or create (scores, annotations, audio, metadata) have to be available in a git repository or through a stable URL.
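The "at least two of the three" rule can be sketched as a small check. This is only an illustration: the function name and its arguments are hypothetical and are not part of the Dezrann tools.

```python
def has_enough_sources(scores, analyses, audios):
    """Return True when at least two of the three kinds of sources are non-empty.

    Each argument is a list of file names or URLs (hypothetical inputs,
    chosen for this sketch only).
    """
    kinds_present = sum(1 for kind in (scores, analyses, audios) if kind)
    return kinds_present >= 2
```

For instance, a corpus with scores and analyses but no audio passes the check, while a corpus with scores only does not.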
As a corpus curator/maintainer, your responsibility is mostly to prepare, update, and maintain a corpus description file, metadata/xxxxxx.json, giving all the information and pointing to the sources. This involves the following steps.
(Sources data) [...] .dez format representing labels, or can be done later, once the scores are on Dezrann;
(Metadata) [...] metadata/xxxxxx.json combining corpus and piece metadata, as described on Specifying corpus and piece data and metadata.
When the metadata/xxxxxx.json file is ready:
Either contact us with the corpus description file and/or open a merge request (MR) adding this file to the /metadata directory. As of Q1 2025, uploading a corpus to Dezrann still involves manual steps on the server, so this is the preferred route. Better tools are scheduled for Q3 2025.
And/or, following the examples on how to rebuild public corpora, build the corpus with the tools/dezrann-corpus.py script on a local or public Dezrann installation. Note that the corpus has to be built and checked on a local installation of Dezrann (or on the test server; contact us to get an account) before it is uploaded to the public production server.
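Before sending the corpus description file or opening a merge request, a quick local sanity check can catch simple mistakes. The following is a minimal sketch, not part of the Dezrann tools: the function name is hypothetical, and it assumes that the presentation fields named in this documentation (text, motto, availability, status) are top-level keys of the JSON file, which may not match the actual layout.

```python
import json

# Field names taken from this documentation. Treating them as top-level
# keys of the description file is an assumption made for this sketch.
PRESENTATION_FIELDS = ("text", "motto", "availability", "status")


def missing_presentation_fields(description_json):
    """Parse a corpus description (JSON text) and list absent or empty fields."""
    # json.loads raises json.JSONDecodeError (a ValueError) on invalid JSON.
    description = json.loads(description_json)
    if not isinstance(description, dict):
        raise ValueError("the corpus description should be a JSON object")
    return [name for name in PRESENTATION_FIELDS if not description.get(name)]
```

Called on the content of a description file, the function returns the names that still need to be filled in; an empty list means all four fields are present and non-empty.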
Once the corpus is on Dezrann, in the sandbox:
[...] metadata/xxxxxx.json file, and to rebuild the corpus;
(Communication, maintenance) [...] (text, motto, availability, status), with a few lines presenting the corpus;
[...] tools/archive.py [...] a long-term archive (to be documented) and upload it to an institutional repository;