Dezrann corpus/developer documentation
A corpus on Dezrann should contain score(s), set(s) of analyses/annotations, synchronized audio(s) (or at least two of the three), and appropriate metadata. As far as possible, these files should be available under open-data licenses, as described on Open Science and Licenses. We try to add each corpus in a reproducible way, as was done for the public corpora available on the platform. The data you use or create (scores, annotations, audios, metadata) have to be available in a git repository or through a stable URL.
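Since every source has to be reachable through a git repository or a stable URL, it can help to list the URLs a JSON metadata file points to before submitting it. The sketch below makes no assumption about the Dezrann metadata schema: it walks any parsed JSON document and collects the string values that parse as http(s) URLs. The function names `iter_strings` and `collect_urls` are hypothetical and are not part of the Dezrann tools.

```python
import json
from urllib.parse import urlparse


def iter_strings(node):
    """Yield every string value found anywhere in a parsed JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def collect_urls(document):
    """Return the http(s) URLs found in the document, in order, without duplicates."""
    urls = []
    for text in iter_strings(document):
        try:
            parsed = urlparse(text)
        except ValueError:
            # Not a parseable URL (for example a malformed bracketed host): skip it.
            continue
        if parsed.scheme in ("http", "https") and parsed.netloc and text not in urls:
            urls.append(text)
    return urls


# Example usage (assumes the file exists locally):
# with open("metadata/my-corpus.json", encoding="utf-8") as handle:
#     for url in collect_urls(json.load(handle)):
#         print(url)
```

This only lists the URLs; whether each one is actually stable (a tagged git revision, an institutional repository, and so on) still has to be judged by the curator.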
As a corpus curator/maintainer, your responsibility is mostly to
prepare, update, and maintain a corpus description file such as
metadata/my-corpus.json giving all the information and
pointing to some sources. This involves the following steps.
First tests, before adding a new corpus. Add a single piece
(Sources data)
.dez
format representing labels, or can be done later, once the scores
are on Dezrann;
(Metadata)
metadata/my-corpus.json combining corpus and
piece metadata, as described on Specifying
corpus and piece data and metadata.
The metadata/template-one-piece.json template
is a fairly short example with one piece and four sources (score, audio,
YT video, analysis). These sources are currently unrelated to each other; pick what you want
for your piece.
When the metadata/my-corpus.json file is ready:
Contact
us with the corpus description file and/or open a merge request (MR) adding this file
to the /metadata directory. As of Q2 2025, uploading a
corpus on Dezrann involves manual steps on the server,
so this is the preferred step. Better tools are scheduled for
Q4 2025.
New (Q3 2025)! And/or test your corpus directly on the Dezrann test server with:
curl -sS --request POST --url https://test-ws.dezrann.net/corpus --header 'Content-Type: multipart/form-data' --form metadata=@my-corpus.json
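For scripted uploads, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not the project's own tooling: the helper names `build_multipart` and `upload_corpus` are hypothetical, the `application/json` content type of the file part is an assumption (curl may label the part differently), and any authentication the test server may require is not handled.

```python
import json
import uuid
import urllib.request
from pathlib import Path

# URL and form field name ("metadata") are taken from the curl command above.
TEST_SERVER_URL = "https://test-ws.dezrann.net/corpus"


def build_multipart(field: str, filename: str, payload: bytes, boundary: str) -> bytes:
    """Encode a single file as a multipart/form-data body."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/json\r\n"  # assumption, see the note above
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail


def upload_corpus(path: str, url: str = TEST_SERVER_URL) -> int:
    """POST the corpus description file and return the HTTP status code."""
    payload = Path(path).read_bytes()
    json.loads(payload)  # fail early if the file is not well-formed JSON
    boundary = uuid.uuid4().hex
    body = build_multipart("metadata", Path(path).name, payload, boundary)
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Example usage (performs a real network request):
# print(upload_corpus("my-corpus.json"))
```

Parsing the file with `json.loads` before sending it catches syntax errors locally, so a malformed file never reaches the server.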
tools/dezrann-corpus.py script on a local or on a public
Dezrann installation. Note that the corpus has to be built and checked
on a local installation of Dezrann (or on the test server; contact us to
get an account) before it is uploaded to the production public server.
Once the corpus is on Dezrann, in the sandbox (for example at https://test.dezrann.net/~/incoming-sandbox/piece-yourname)
metadata/my-corpus.json
file, and to rebuild the corpus;
(Communication, maintenance)
text, motto, availability,
status), with a few lines presenting the corpus;
tools/archive.py a long-term archive (to be documented) and
upload it to an institutional repository;