DCML conversion including anacruses #32

napulen · 2021-03-26T15:49:56Z

The measure numbering in this piece seems off in the annotation. This is an anacrusic piece.

When-in-Rome/Corpus/Quartets/Beethoven,_Ludwig_van/Op18_No4/4/analysis.txt

m1 c: i

malcolmsailor · 2022-01-07T14:28:08Z

I am also running up against this issue in a number of the Beethoven quartet analyses.

Sometimes they use m0 for pickups:

./Op018_No2/2/analysis.txt
./Op135/2/analysis.txt
./Op018_No2/3/analysis.txt
./Op130/1/analysis.txt
./Op131/2/analysis.txt
./Op131/4/analysis.txt
./Op131/1/analysis.txt
./Op095/4/analysis.txt

But at other times, no, like the example noted by @napulen above. In fact, the alignment between chord annotations and score seems to be off in all of the following pieces. I haven't manually inspected them yet but I will shortly.

Beethoven,_Ludwig_van/Op059_No3/1
Beethoven,_Ludwig_van/Op018_No6/3
Beethoven,_Ludwig_van/Op074/4
Beethoven,_Ludwig_van/Op074/3
Beethoven,_Ludwig_van/Op131/3
Beethoven,_Ludwig_van/Op130/2
Beethoven,_Ludwig_van/Op018_No4/1
Beethoven,_Ludwig_van/Op018_No4/4
Beethoven,_Ludwig_van/Op018_No4/3
Beethoven,_Ludwig_van/Op127/2
Beethoven,_Ludwig_van/Op018_No3/4
Beethoven,_Ludwig_van/Op018_No3/3
Beethoven,_Ludwig_van/Op018_No2/3
Beethoven,_Ludwig_van/Op018_No2/2
Beethoven,_Ludwig_van/Op018_No5/4
Beethoven,_Ludwig_van/Op018_No5/3

I can write a little script to decrement the measure numbers by 1 in the relevant files. But is there any reason not to do that? (I know there are cases like Op. 57 no. 1, iii, where the piece actually begins with a full measure but the music [and the analysis] only begin later in the measure.)

malcolmsailor · 2022-01-07T15:08:55Z

Never mind: Op. 57 no. 1 iii does not begin with a rest. I misremembered; it does in fact begin with a pickup measure.

malcolmsailor · 2022-01-07T15:22:22Z

OK I just flipped through the score of the Beethoven Quartets and double-checked all the analyses for movements that begin with pickups. It appears that these are the movements that begin with "m1" rather than "m0". Many (but not all) of these movements are also missing a beat annotation for the first roman numeral, like the case @napulen singled out above (this seems to be more common in the early quartets); I have put the beat annotation that should be there after the file directory (regardless of whether it is there or not). NB I found a few errors here as well.

Beethoven,_Ludwig_van/Op018_No3/3 3
Beethoven,_Ludwig_van/Op018_No3/4 2.33
Beethoven,_Ludwig_van/Op018_No4/1 4
Beethoven,_Ludwig_van/Op018_No4/3 3
Beethoven,_Ludwig_van/Op018_No4/4 2.5
Beethoven,_Ludwig_van/Op018_No5/2 3
Beethoven,_Ludwig_van/Op018_No5/3 2.5
Beethoven,_Ludwig_van/Op018_No5/4 2.25 # Begins on wrong beat
Beethoven,_Ludwig_van/Op018_No6/3 2.5
Beethoven,_Ludwig_van/Op057_No1/3 2
Beethoven,_Ludwig_van/Op074/3 2.5
Beethoven,_Ludwig_van/Op074/4 2
Beethoven,_Ludwig_van/Op127/2 3.67 # Begins on wrong beat
Beethoven,_Ludwig_van/Op132/3 3 # Begins on wrong beat

I'm happy to make a pull request that:

- decrements the measure numbers in these files
- adds an initial beat annotation to all the files
- corrects the errors
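
The first step, decrementing the measure numbers, could be sketched roughly as follows. This is only an illustration under stated assumptions: it treats any line beginning with "m" plus a digit as a measure line, shifts every measure token on that line (including copy ranges such as "m5-6 = m1-2"), and leaves all other lines alone. The function name shift_measure_numbers is hypothetical, and it does not add the missing beat annotations.

```python
import re

# A measure token such as "m1", or a range such as "m5-8".
MEASURE_TOKEN = re.compile(r"\bm(\d+)(-(\d+))?")


def shift_measure_numbers(rntxt: str, offset: int = -1) -> str:
    """Shift measure numbers on RomanText measure lines by `offset`.

    Assumption: only lines that start with "m<digit>" carry measure
    numbers; metadata lines (Composer:, Time Signature:, ...) are skipped.
    """

    def shift(match):
        start = int(match.group(1)) + offset
        if match.group(3) is None:
            return f"m{start}"
        return f"m{start}-{int(match.group(3)) + offset}"

    shifted = []
    for line in rntxt.splitlines():
        if re.match(r"m\d", line):
            line = MEASURE_TOKEN.sub(shift, line)
        shifted.append(line)
    return "\n".join(shifted)


example = "Composer: X\nm1 c: i\nm2 V\nm5-6 = m1-2"
print(shift_measure_numbers(example))
```

Any real run over the corpus would still need a manual check of the files flagged above as beginning on the wrong beat.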

MarkGotham · 2022-01-07T15:56:11Z

Hi @malcolmsailor,

Many thanks for this and Happy New Year! Let's 'resolve' to advance some of these issues in 2022!

Please do contribute fixes for issues like these, but consider doing so as part of a wider update. I've held off new work on this corpus conversion partly because DCML are actively working on their own update of the ABC original (both corpus and syntax).

I think some context may be useful here. I'll be as candid as I can in speaking about myself / WiR, though in discussing others' work in progress, I'm obviously not in a position to speak with authority.

ABC (from DCML):

There were issues with the ABC corpus as it was initially released including things like negative beat numbers in the tsv files.
They have since made a lot of changes, both to the corpus and to the underlying syntax standard it's based on (see recent commits / release history etc).
That work is ongoing, including (I think) work on the analyses, the syntax, and even the score encoding.

The analyses here:

As the metadata suggests, the analyses here are based on an initial conversion of ABC, though they have been adjusted for several reasons to:
- make conversion possible (the two syntaxes don't perfectly line up),
- fix those fundamental issues (e.g. negative beat numbers), and
- adjust the musical reading in some places.
I think Dmitri is planning to release his own set of analyses that are based on further adjustment of these files. That would be the moment to
- add them (analysis_DT.txt perhaps?)
- re-convert the DCML set (analysis_DCML.txt), and
- probably simply delete the current set which is useful for the moment, but doesn't truly represent DCML then, nor now, nor DT.

So, in short, I think what's needed here is:

- an updated converter of the DCML standard (see my initial offering here and feel free to have at it). Note that this converter takes the tabular representation as its input. That could be useful for flexibility in accommodating other tabular representations (like the BPS and many other representations for machine learning), though an updated version of this converter may opt to read directly from DCML's regex.
- preferably also some work on music21 (both Roman numerals and Roman text) to enable more direct conversion between these formats (though this leads directly to the heart of a large, open question for our field: how to represent harmony?).
- re-conversion of the DCML analyses with the updated converter.
- (if / when relevant:) replacing the current analyses with Dmitri's curated set.

I hope that both helps clarify the situation and also accurately reflects everyone's plans! As I'm sure is abundantly clear, producing a meta-corpus of all the harmonic analyses out there in all their different formats is a work in progress! Less WiR and more WIP!

napulen · 2022-01-07T21:11:47Z

Maybe asking DCML folks about these specific issues and their status on the newer version of the syntax/encodings would also be helpful.

I know @johentsch has been involved in the new syntax efforts. Pinging here just to take advantage of the already useful discussion.

johentsch · 2022-01-11T08:57:03Z

@malcolmsailor, we've had the scores corrected, they can be found on the ABC/v2 branch.

The branch also follows the general folder structure that the other published DCML corpora will/do have and the same TSV format (created by the MuseScore parser ms3). Therefore, developing a new converter might be worth the effort and I would be happy to assist.

malcolmsailor · 2022-01-31T15:07:44Z

Mark, what does an updated converter need precisely?

Looking at one of the files, it seems that the order of the columns has changed in the new format, so that needs to be accounted for. And then the way that measures and onsets are represented has changed, so that needs to be taken into account. Is there anything else that needs to be changed?

My inclination is to take most items directly from the regex (which is now in the "label" column), since that will be easy to update if the DCML standard changes again in the future. To do this we can just iterate over the named groups in the regex with groupdict().

On the other hand there are a few columns like "globalkey", "localkey", and "pedal" that can't be read directly from the regex. Since those are already present in the TSV, it seems most straightforward to just take those from the corresponding columns. Perhaps I am missing something?
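
A minimal sketch of that split between regex-derived and column-derived fields might look like the following. The pattern here is a deliberately simplified stand-in, not the actual DCML regex, and the group names (numeral, form, figbass, relativeroot) as well as the function name parse_label are assumptions made for illustration only.

```python
import re

# Simplified stand-in for the DCML label regex (NOT the real one).
LABEL_RE = re.compile(
    r"^(?P<numeral>[b#]*[IViv]+)"
    r"(?P<form>[o+%M])?"
    r"(?P<figbass>7|65|43|2|64|6)?"
    r"(?:/(?P<relativeroot>[b#]*[IViv]+))?$"
)

# Columns that cannot be recovered from the label itself.
ROW_ONLY_COLUMNS = ("globalkey", "localkey", "pedal")


def parse_label(label, row):
    """Merge the regex's named groups with columns taken from the TSV row."""
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"unparseable label: {label!r}")
    # groupdict() yields every named group; drop the ones that did not match.
    features = {k: v for k, v in match.groupdict().items() if v is not None}
    for column in ROW_ONLY_COLUMNS:
        if column in row:
            features[column] = row[column]
    return features


print(parse_label("#viio6/ii", {"globalkey": "c", "localkey": "i", "pedal": ""}))
```

Because the loop runs over whatever named groups the pattern defines, swapping in a newer version of the regex would not require changes to the parsing function itself.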

MarkGotham · 2022-02-04T18:31:04Z

Thanks Malcolm, all.

Sounds good! And yes, the converter update might well be a small job.

Wrt the column numbering, as you say, we could just update those numbers, or move over to reading from the regex. Alternatively, it's also perhaps worth considering a slightly flexible hybrid that reads the TSV and expects certain column names but doesn't specify the column order in advance. E.g. reading in the tsv file, then converting that data to a list of dicts with the original column names as keys:

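def data_by_heading(table, headings_row=0):
    """Convert a table (list of rows) into a list of dicts keyed by column heading."""
    headings = table[headings_row]
    rows = []
    # Every row after the headings row is treated as data.
    for entry in table[headings_row+1:]:
        data = {}
        for idx, col in enumerate(entry):
            # Look up the heading by position, so column order is not fixed in advance.
            data[headings[idx]] = col
        rows.append(data)
    return rows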
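
For comparison, the standard library's csv.DictReader produces the same kind of list of dicts directly from a tab-separated file. The following is a small sketch of that alternative, not the approach taken in the converter; the function name read_tsv_rows and the sample column values are made up for illustration.

```python
import csv
import io


def read_tsv_rows(handle):
    """Read a TSV file object into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))


# In-memory sample standing in for a real .tsv file.
sample = io.StringIO("mc\tmn\tlabel\n1\t0\tV\n2\t1\tI\n")
rows = read_tsv_rows(sample)
print(rows[0]["label"])
```

With a real file, the handle would come from open(path, newline="", encoding="utf-8").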

Zooming out, it would be good for the converter to aim to capture everything that's directly shared between DCML and Romantext. It can't do much more than that. You mention pedals, for instance - that is supported in both, so that could be a new feature to add.

malcolmsailor · 2022-02-07T16:33:50Z

OK I took a first stab at an updated converter, which is in a fork I made of music21:
https://github.com/malcolmsailor/music21/blob/master/music21/romanText/tsvConverter.py

I used Mark's idea of reading the column names dynamically from each TSV file, since I think that could be readily extended to other tabular formats. It seemed simplest to make the mapping from indices to attribute names an attribute of the TsvHandler class, and then to make makeTabChord a method (and indeed a private method) of the TsvHandler class, since it needs this mapping.

However, this means that conversion going the other way is going to need to know what the canonical list of column names is. I suppose we can just take these from any of the .tsv files in the ABC corpus. But I didn't do this yet.

A number of columns have been renamed because they have new names in the tsv files:

- global_key -> globalkey
- local_key -> localkey
- measure -> mc
- beat -> mc_onset

In each case, I added a property attribute to the TabChord class with the old name, so code that interfaces with TabChord doesn't need to be rewritten. I also renamed 'combinedChord' to 'chord'; that name was not otherwise being used and it simplifies reading/writing headers to use the same name.
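
The backward-compatible renaming described above can be sketched with property aliases. TabChordSketch and the _alias helper below are hypothetical stand-ins and are not the music21 TabChord class; only the old and new attribute names come from the list above.

```python
def _alias(new_name):
    """Build a property that forwards reads and writes to `new_name`."""
    return property(
        lambda self: getattr(self, new_name),
        lambda self, value: setattr(self, new_name, value),
    )


class TabChordSketch:
    """Hypothetical row object exposing both old and new column names."""

    # Old name on the left, new TSV column name on the right.
    global_key = _alias("globalkey")
    local_key = _alias("localkey")
    measure = _alias("mc")
    beat = _alias("mc_onset")

    def __init__(self, globalkey=None, localkey=None, mc=None, mc_onset=None):
        self.globalkey = globalkey
        self.localkey = localkey
        self.mc = mc
        self.mc_onset = mc_onset


chord_row = TabChordSketch(globalkey="c", mc=3)
chord_row.measure = 4  # writing through the old name updates the new attribute
print(chord_row.global_key, chord_row.mc)
```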

Besides mc (measure count?) there is also mn (measure number?). I'm not sure what the difference between these is---in the file I examined, they were always the same. @johentsch perhaps you can elucidate?

There are also some columns that have been deleted and don't appear to have equivalents. Namely, 'totbeat', 'altchord', 'no', 'op', 'mov', 'length'. For the most part I don't think these are important. I rewrote the code that used totbeat so as not to depend on this attribute. The one column I am wondering about is length---was it important that we were previously able to do thisEntry.quarterLength = self.length? Do we need to add logic to deduce the length of each chord? Or can music21 do that elsewhere?

malcolmsailor · 2022-02-07T16:42:48Z

I forgot to say: it passes all the tests in the tsvConverter.py file now. I don't know how much coverage those tests have, or if there are relevant tests elsewhere that should be run.

MarkGotham · 2022-02-08T16:03:20Z

Great, thanks @malcolmsailor !

A couple of quick comments for now, in haste.

Measures (to save @johentsch the trouble!):

measure count runs 1, 2, 3, sequentially for every measure object;
measure number follows score conventions: e.g.
- anacrusis = 0,
- split measures (e.g. for repeats) don't each get a distinct number,
- possibility of 1st /2nd time e.g. measure 18a
i.e. a highly relevant part of the current issue wrt anacruses and measure counting.

music21 tests: yes, test on the file and if you need to check the interaction with other parts of that library then run python3 test/multiprocessTest.py from the base. (Not likely to be too important in this case). Various tests also run when you submit the PR. I've alerted that community to this work in progress (romanText/tsvConverter.py update work in progress / coming soon cuthbertLab/music21#1214), so we may get some suggestions from there too.
Reverse (m21>DCML) conversion. Yes perhaps with the column headings as they stand for now, perhaps with user-option to redefine (attributeXcolumn: int = 4).
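
To make the difference between measure count and measure number concrete, here is a rough sketch that derives both from a list of measure descriptions. The two boolean flags (anacrusis, continues_previous) and the function name are hypothetical, and first/second-time endings such as 18a are deliberately not handled.

```python
def assign_measure_numbers(measures):
    """Return (mc, mn) pairs for a list of measure descriptions.

    Each entry is a dict with two optional, hypothetical flags:
    'anacrusis' (a pickup bar; only checked on the first measure) and
    'continues_previous' (the second part of a split measure).
    """
    pairs = []
    mn = 0
    for mc, measure in enumerate(measures, start=1):
        if mc == 1:
            # A pickup bar is numbered 0; a full first bar is numbered 1.
            mn = 0 if measure.get("anacrusis") else 1
        elif not measure.get("continues_previous"):
            # Only a genuinely new bar advances the measure number.
            mn += 1
        pairs.append((mc, mn))
    return pairs


piece = [{"anacrusis": True}, {}, {}, {"continues_previous": True}]
print(assign_measure_numbers(piece))
```

In this sketch the count always increases by one per measure object, while the number starts at 0 for a pickup and repeats across the two halves of a split measure.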

Thanks again!

malcolmsailor · 2022-02-08T16:10:41Z

So which do we want to use, measure count or measure number? (Forgive me I am not up to speed on the intricacies here.) Or provide a kwarg to select one or the other?

MarkGotham · 2022-02-08T16:13:16Z

> So which do we want to use, measure count or measure number? (Forgive me I am not up to speed on the intricacies here.) Or provide a kwarg to select one or the other?

Number. (Count may still be useful in analysis-score alignment)

malcolmsailor · 2022-02-08T16:15:38Z

OK presently doing the opposite so I'll change that.

Is there an easy way to store arbitrary attributes in the music21 stream, like measure count in this case? So it could possibly be used downstream.

jacobtylerwalls · 2022-02-08T17:22:17Z

> Is there an easy way to store arbitrary attributes in the music21 stream, like measure count in this case?

I think .editorial is the suggested way to store arbitrary key-value pairs.

In [1]: from music21 import stream

In [2]: s = stream.Stream()

In [3]: s.editorial.arbitrary = 1234

malcolmsailor · 2022-02-09T14:25:00Z

So I updated the code to simply store all columns in the input .tsv that are not in HEADERS in the editorial attribute of each roman numeral. Perhaps this is overkill but it seemed the simplest approach.

The only thing that it seems is now preventing conversion back to tsv from producing an appropriate result is that 'localkey' is a pitch name, rather than a roman numeral. Is there existing code somewhere in music21 that, given two pitch names (with case indicating major/minor), will express the latter as a roman numeral relative to the former?

MarkGotham · 2022-02-09T16:37:06Z

Maybe getScaleDegreeAndAccidentalFromPitch?

MarkGotham · 2022-02-09T16:37:25Z

from music21 import key, pitch

cMaj = key.Key('C')
cMaj.getScaleDegreeAndAccidentalFromPitch(pitch.Pitch('E'))

jacobtylerwalls · 2022-02-09T16:55:40Z

A bit involved, but ...

>>> from music21 import chord, key, roman
>>> p1 = 'C'
>>> p2 = 'e'
>>> p2_rn = roman.RomanNumeral('i' if p2.lower() == p2 else 'I', keyOrScale=key.Key(p2))
>>> p2_rn.pitches
(<music21.pitch.Pitch E4>, <music21.pitch.Pitch G4>, <music21.pitch.Pitch B4>)
>>> r = roman.romanNumeralFromChord(chord.Chord(p2_rn.pitches), keyObj=key.Key(p1))
>>> r
<music21.roman.RomanNumeral iii in C major>

malcolmsailor · 2022-02-09T18:50:49Z

Alright, I cleaned up all my TODOs. (There is still one TODO in the tsvConverter.py that I believe originates with you, Mark.)

However, conversion to DCML isn't necessarily perfectly functional because 'vii' seems to be interpreted differently by music21 and DCML. So '#viio6/ii' in the DCML becomes 'viio6/ii' when written back out with M21toTSV, which then becomes 'bvii6/ii' when read in again. If there's an easy fix for this, great, but otherwise, I don't think I'm inclined to spend too much time on it as (as far as I know) no one is actually using the conversion in this direction.

MarkGotham · 2022-08-11T11:55:09Z

Closing this as the discussion is now on cuthbertLab/music21#1267 and #42

MarkGotham changed the title ~~Anacrusic piece Op18 No4 - IV~~ DCML conversion including anacruses Jan 7, 2022

MarkGotham mentioned this issue Feb 8, 2022

romanText/tsvConverter.py update work in progress / coming soon cuthbertLab/music21#1214

Closed

malcolmsailor mentioned this issue Aug 3, 2022

Unexpected value of romanNumeral after romanNumeralFromChord cuthbertLab/music21#1349

Open

MarkGotham closed this as completed Aug 11, 2022
