DCML conversion including anacruses #32

Closed
napulen opened this issue Mar 26, 2021 · 21 comments

@napulen
Contributor

napulen commented Mar 26, 2021

The measure numbering in this piece seems off in the annotation. This is an anacrusic piece.

@malcolmsailor
Contributor

I am also running up against this issue in a number of the Beethoven quartet analyses.

Sometimes they use m0 for pickups:

./Op018_No2/2/analysis.txt
./Op135/2/analysis.txt
./Op018_No2/3/analysis.txt
./Op130/1/analysis.txt
./Op131/2/analysis.txt
./Op131/4/analysis.txt
./Op131/1/analysis.txt
./Op095/4/analysis.txt

But at other times, no, like the example noted by @napulen above. In fact, the alignment between chord annotations and score seems to be off in all of the following pieces. I haven't manually inspected them yet but I will shortly.

Beethoven,_Ludwig_van/Op059_No3/1
Beethoven,_Ludwig_van/Op018_No6/3
Beethoven,_Ludwig_van/Op074/4
Beethoven,_Ludwig_van/Op074/3
Beethoven,_Ludwig_van/Op131/3
Beethoven,_Ludwig_van/Op130/2
Beethoven,_Ludwig_van/Op018_No4/1
Beethoven,_Ludwig_van/Op018_No4/4
Beethoven,_Ludwig_van/Op018_No4/3
Beethoven,_Ludwig_van/Op127/2
Beethoven,_Ludwig_van/Op018_No3/4
Beethoven,_Ludwig_van/Op018_No3/3
Beethoven,_Ludwig_van/Op018_No2/3
Beethoven,_Ludwig_van/Op018_No2/2
Beethoven,_Ludwig_van/Op018_No5/4
Beethoven,_Ludwig_van/Op018_No5/3

I can write a little script to decrement the measure numbers by 1 in the relevant files. But is there any reason not to do that? (I know there are cases like Op. 57 no. 1, iii, where the piece actually begins with a full measure but the music [and the analysis] only begin later in the measure.)
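
A minimal sketch of such a script, assuming the analyses are RomanText files in which measure lines start with a token like `m1`, `m5-8`, or `m18a`. The name `decrement_measures` and the regex are illustrative assumptions, not code from this repository, and the sketch assumes the file currently starts at `m1`.

```python
import re

# Matches a measure token such as "m1", "m18a", or a range such as "m5-8".
MEASURE_TOKEN = re.compile(r"\bm(\d+)([a-z]?)(?:-(\d+)([a-z]?))?")


def decrement_measures(text, by=1):
    """Shift every measure token on measure lines down by `by`."""

    def shift(match):
        start, start_suffix, end, end_suffix = match.groups()
        shifted = f"m{int(start) - by}{start_suffix}"
        if end is not None:
            shifted += f"-{int(end) - by}{end_suffix}"
        return shifted

    lines = []
    for line in text.splitlines(keepends=True):
        # Only touch measure lines; metadata lines such as
        # "Time Signature: 3/4" are passed through unchanged.
        if re.match(r"m\d", line):
            line = MEASURE_TOKEN.sub(shift, line)
        lines.append(line)
    return "".join(lines)
```

Repeat lines such as `m3-4 = m1-2` are shifted on both sides of the `=`, so copied measures keep pointing at the same music.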

@malcolmsailor
Contributor

Never mind: Op. 57 no. 1 iii does not begin with a rest. I misremembered; it does in fact begin with a pickup measure.

@malcolmsailor
Contributor

OK, I just flipped through the score of the Beethoven quartets and double-checked all the analyses for movements that begin with pickups. The movements listed below are the ones whose analyses begin with "m1" rather than "m0". Many (but not all) of these movements are also missing a beat annotation for the first Roman numeral, like the case @napulen singled out above (this seems to be more common in the early quartets); after each file directory I have put the beat annotation that should be there (regardless of whether it is there or not). NB: I found a few errors here as well.

Beethoven,_Ludwig_van/Op018_No3/3 3
Beethoven,_Ludwig_van/Op018_No3/4 2.33
Beethoven,_Ludwig_van/Op018_No4/1 4
Beethoven,_Ludwig_van/Op018_No4/3 3
Beethoven,_Ludwig_van/Op018_No4/4 2.5
Beethoven,_Ludwig_van/Op018_No5/2 3
Beethoven,_Ludwig_van/Op018_No5/3 2.5
Beethoven,_Ludwig_van/Op018_No5/4 2.25 # Begins on wrong beat
Beethoven,_Ludwig_van/Op018_No6/3 2.5
Beethoven,_Ludwig_van/Op057_No1/3 2
Beethoven,_Ludwig_van/Op074/3 2.5
Beethoven,_Ludwig_van/Op074/4 2
Beethoven,_Ludwig_van/Op127/2 3.67 # Begins on wrong beat
Beethoven,_Ludwig_van/Op132/3 3 # Begins on wrong beat

I'm happy to make a pull request that:

  • decrements the measure numbers in these files
  • adds an initial beat annotation to all the files
  • corrects the errors

@MarkGotham
Owner

Hi @malcolmsailor,

Many thanks for this and Happy New Year! Let's 'resolve' to advance some of these issues in 2022!

Please do contribute fixes for issues like these, but consider doing so as part of a wider update. I've held off new work on this corpus conversion partly because DCML are actively working on their own update of the ABC original (both corpus and syntax).

I think some context may be useful here. I'll be as candid as I can in speaking about myself / WiR, though in discussing others' work in progress, I'm obviously not in a position to speak with authority.

ABC (from DCML):

  • There were issues with the ABC corpus as it was initially released including things like negative beat numbers in the tsv files.
  • They have since made a lot of changes, both to the corpus and to the underlying syntax standard it's based on (see recent commits / release history etc).
  • That work is ongoing, including (I think) work on the analyses, the syntax, and even the score encoding.

The analyses here:

  • As the metadata suggests, the analyses here are based on an initial conversion of ABC, though they have been adjusted for several reasons to:
    • make conversion possible (the two syntaxes don't perfectly line up),
    • fix those fundamental issues (e.g. negative beat numbers), and
    • adjust the musical reading in some places.
  • I think Dmitri is planning to release his own set of analyses that are based on further adjustment of these files. That would be the moment to
    • add them (analysis_DT.txt perhaps?)
    • re-convert the DCML set (analysis_DCML.txt), and
    • probably simply delete the current set which is useful for the moment, but doesn't truly represent DCML then, nor now, nor DT.

So, in short, I think what's needed here is:

  • an updated converter of the DCML standard (see my initial offering here and feel free to have at it). Note that this converter takes the tabular representation as its input. That could be useful for flexibility in accommodating other tabular representations (like the BPS and many other representations for machine learning), though an updated version of this converter may opt to read directly from DCML's regex.
  • preferably also some work on music21 (both Roman numerals and Roman text) to enable more direct conversion between these formats (though this leads directly to the heart of a large, open question for our field: how to represent harmony?).
  • re-conversion of the DCML analyses with the updated converter.
  • (if / when relevant:) replacing the current analyses with Dmitri's curated set.

I hope that both helps clarify the situation and also accurately reflects everyone's plans! As I'm sure is abundantly clear, producing a meta-corpus of all the harmonic analyses out there in all their different formats is a work in progress! Less WiR and more WIP!

@MarkGotham changed the title from "Anacrusic piece Op18 No4 - IV" to "DCML conversion including anacruses" on Jan 7, 2022
@napulen
Contributor Author

napulen commented Jan 7, 2022

Maybe asking DCML folks about these specific issues and their status on the newer version of the syntax/encodings would also be helpful.

I know @johentsch has been involved in the new syntax efforts. Pinging here just to take advantage of the already useful discussion.

@johentsch

@malcolmsailor, we've had the scores corrected; they can be found on the ABC/v2 branch.

The branch also follows the general folder structure that the other published DCML corpora will/do have and the same TSV format (created by the MuseScore parser ms3). Therefore, developing a new converter might be worth the effort and I would be happy to assist.

@malcolmsailor
Contributor

Mark, what does an updated converter need precisely?

Looking at one of the files, it seems that the order of the columns has changed in the new format, so that needs to be accounted for. And then the way that measures and onsets are represented has changed, so that needs to be taken into account. Is there anything else that needs to be changed?

My inclination is to take most items directly from the regex (which is now in the "label" column), since that will be easy to update if the DCML standard changes again in the future. To do this we can just iterate over the named groups in the regex with groupdict().

On the other hand there are a few columns like "globalkey", "localkey", and "pedal" that can't be read directly from the regex. Since those are already present in the TSV, it seems most straightforward to just take those from the corresponding columns. Perhaps I am missing something?
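
A rough sketch of that hybrid, reading label features with `groupdict()` and the remaining items from their columns. The regex below is a hypothetical, heavily simplified stand-in for the DCML regex (it covers only a numeral, chord form, figured bass, and relative root), and `parse_row` is an illustrative name, not part of the converter.

```python
import re

NUMERAL = r"[#b]*(?:VII|VI|V|IV|III|II|I|vii|vi|v|iv|iii|ii|i)"

# Hypothetical, simplified stand-in for the DCML label regex.
LABEL_REGEX = re.compile(
    rf"^(?P<numeral>{NUMERAL})"
    r"(?P<form>[o%+M]?)"
    r"(?P<figbass>7|65|43|2|64|6)?"
    rf"(?:/(?P<relativeroot>{NUMERAL}))?$"
)

# Items that are taken from their own TSV columns rather than from the label.
COLUMN_ONLY = ("globalkey", "localkey", "pedal")


def parse_row(row):
    """Combine regex-derived features with values read from TSV columns."""
    match = LABEL_REGEX.match(row["label"])
    if match is None:
        raise ValueError(f"Unparseable label: {row['label']!r}")
    # Iterate over the named groups, keeping only those that matched something.
    features = {name: value for name, value in match.groupdict().items() if value}
    for column in COLUMN_ONLY:
        features[column] = row.get(column, "")
    return features
```

Because the feature names come from the regex's named groups, swapping in a newer version of the real regex would not require changing `parse_row`.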

@MarkGotham
Owner

Thanks Malcolm, all.

Sounds good! And yes, the converter update might well be a small job.

Wrt the column numbering, as you say, we could just update those numbers, or move over to reading from the regex. Alternatively, it's perhaps also worth considering a slightly flexible hybrid that reads the TSV and expects certain column names but doesn't specify the column order in advance. E.g. reading in the tsv file, then converting that data to a list of dicts with the original column names as keys:

def data_by_heading(table, headings_row=0):
    """Convert a table (a list of rows) into a list of dicts keyed by column heading."""
    headings = table[headings_row]
    rows = []
    for entry in table[headings_row + 1:]:
        # Pair each cell with its heading, so column order no longer matters.
        rows.append(dict(zip(headings, entry)))
    return rows
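
For comparison, Python's standard `csv.DictReader` gives the same list-of-dicts shape straight from a tab-separated file. This is only a sketch: `read_tsv_rows` is an illustrative name, and the sample columns are made up for the example.

```python
import csv
import io


def read_tsv_rows(handle):
    """Return one dict per data row, keyed by the headings in the first row."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Tiny in-memory sample standing in for a real .tsv file.
sample = io.StringIO("mc\tmn\tlabel\n1\t0\tV\n2\t1\tI\n")
rows = read_tsv_rows(sample)
print(rows[0]["label"])  # prints: V
```

All values arrive as strings, so numeric columns would still need converting afterwards.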

Zooming out, it would be good for the converter to aim to capture everything that's directly shared between DCML and Romantext. It can't do much more than that. You mention pedals, for instance - that is supported in both, so that could be a new feature to add.

@malcolmsailor
Contributor

OK I took a first stab at an updated converter, which is in a fork I made of music21:
https://github.com/malcolmsailor/music21/blob/master/music21/romanText/tsvConverter.py

I used Mark's idea of reading the column names dynamically from each TSV file, since I think that could be readily extended to other tabular formats. It seemed simplest to make the mapping from indices to attribute names an attribute of the TsvHandler class, and then to make makeTabChord a method (and indeed a private method) of the TsvHandler class, since it needs this mapping.

However, this means that conversion going the other way is going to need to know what the canonical list of column names is. I suppose we can just take these from any of the .tsv files in the ABC corpus. But I didn't do this yet.

A number of columns have been renamed because they have new names in the tsv files:

  • global_key -> globalkey
  • local_key -> localkey
  • measure -> mc
  • beat -> mc_onset

In each case, I added a property attribute to the TabChord class with the old name, so code that interfaces with TabChord doesn't need to be rewritten. I also renamed 'combinedChord' to 'chord'; that name was not otherwise being used and it simplifies reading/writing headers to use the same name.
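
The backwards-compatible renaming can be sketched roughly as follows. `TabChordSketch` and `_alias` are hypothetical names used only for illustration; this is not the actual TabChord code in the fork.

```python
def _alias(new_name):
    """Build a property that forwards reads and writes to `new_name`."""

    def getter(self):
        return getattr(self, new_name)

    def setter(self, value):
        setattr(self, new_name, value)

    return property(getter, setter)


class TabChordSketch:
    """Hypothetical stand-in for TabChord, showing only the renamed columns."""

    def __init__(self, globalkey=None, localkey=None, mc=None, mc_onset=None):
        self.globalkey = globalkey
        self.localkey = localkey
        self.mc = mc
        self.mc_onset = mc_onset

    # Old attribute names kept as aliases of the new column names.
    global_key = _alias("globalkey")
    local_key = _alias("localkey")
    measure = _alias("mc")
    beat = _alias("mc_onset")
```

Code written against the old names, such as `chord.measure = 4`, then reads and writes the new attribute without modification.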

Besides mc (measure count?) there is also mn (measure number?). I'm not sure what the difference between these is---in the file I examined, they were always the same. @johentsch perhaps you can elucidate?

There are also some columns that have been deleted and don't appear to have equivalents. Namely, 'totbeat', 'altchord', 'no', 'op', 'mov', 'length'. For the most part I don't think these are important. I rewrote the code that used totbeat so as not to depend on this attribute. The one column I am wondering about is length---was it important that we were previously able to do thisEntry.quarterLength = self.length? Do we need to add logic to deduce the length of each chord? Or can music21 do that elsewhere?

@malcolmsailor
Contributor

I forgot to say: it passes all the tests in the tsvConverter.py file now. I don't know how much coverage those tests have, or if there are relevant tests elsewhere that should be run.

@MarkGotham
Owner

Great, thanks @malcolmsailor !

A couple of quick comments for now, in haste.

  1. Measures (to save @johentsch the trouble!):
  • measure count runs 1, 2, 3, sequentially for every measure object;
  • measure number follows score conventions: e.g.
    • anacrusis = 0,
    • split measures (e.g. for repeats) don't each get a distinct number,
    • possibility of 1st / 2nd time, e.g. measure 18a
  • i.e. a highly relevant part of the current issue wrt anacruses and measure counting.
  2. music21 tests: yes, test on the file, and if you need to check the interaction with other parts of that library then run python3 test/multiprocessTest.py from the base. (Not likely to be too important in this case.) Various tests also run when you submit the PR. I've alerted that community to this work in progress (romanText/tsvConverter.py update work in progress / coming soon cuthbertLab/music21#1214), so we may get some suggestions from there too.
  3. Reverse (m21>DCML) conversion. Yes, perhaps with the column headings as they stand for now, perhaps with a user option to redefine (attributeXcolumn: int = 4).
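
The measure count / measure number distinction in point 1 can be illustrated with a small sketch. The flags `anacrusis` and `continues_previous` are hypothetical inputs invented for this example; this is not how ms3 derives the `mc` and `mn` columns, and first/second endings (e.g. 18a) are left out.

```python
def number_measures(measures):
    """Return (mc, mn) pairs for a list of measure descriptions.

    Each measure is a dict that may carry two hypothetical flags:
    'anacrusis' (only meaningful on the first measure) and
    'continues_previous' (the second part of a split measure).
    """
    pairs = []
    mn = 0
    for mc, measure in enumerate(measures, start=1):
        if mc == 1 and measure.get("anacrusis"):
            mn = 0  # a pickup measure is numbered 0
        elif measure.get("continues_previous"):
            pass  # a split measure shares the number of its first part
        else:
            mn += 1
        pairs.append((mc, mn))
    return pairs
```

For a pickup followed by two full measures, a split measure's second part, and one more full measure, the count runs 1 to 5 while the numbers run 0, 1, 2, 2, 3.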

Thanks again!

@malcolmsailor
Contributor

So which do we want to use, measure count or measure number? (Forgive me I am not up to speed on the intricacies here.) Or provide a kwarg to select one or the other?

@MarkGotham
Owner

So which do we want to use, measure count or measure number? (Forgive me I am not up to speed on the intricacies here.) Or provide a kwarg to select one or the other?

Number. (Count may still be useful in analysis-score alignment)

@malcolmsailor
Copy link
Contributor

OK presently doing the opposite so I'll change that.

Is there an easy way to store arbitrary attributes in the music21 stream, like measure count in this case? So it could possibly be used downstream.

@jacobtylerwalls

Is there an easy way to store arbitrary attributes in the music21 stream, like measure count in this case?

I think .editorial is the suggested way to store arbitrary key-value pairs.

In [1]: from music21 import stream

In [2]: s = stream.Stream()

In [3]: s.editorial.arbitrary = 1234

@malcolmsailor
Contributor

malcolmsailor commented Feb 9, 2022

So I updated the code to simply store all columns in the input .tsv that are not in HEADERS in the editorial attribute of each roman numeral. Perhaps this is overkill but it seemed the simplest approach.
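
In outline, the split between known and extra columns looks something like the sketch below. The contents of `HEADERS` here are an assumption made for the example (the real list lives in tsvConverter.py), and `split_row` is an illustrative name.

```python
# Assumed contents, for illustration only; see tsvConverter.py for the real list.
HEADERS = ("chord", "globalkey", "localkey", "mn", "mc_onset")


def split_row(row, headers=HEADERS):
    """Separate the columns the converter understands from everything else."""
    known = {name: value for name, value in row.items() if name in headers}
    extra = {name: value for name, value in row.items() if name not in headers}
    return known, extra
```

Each key in `extra` could then be stored on the Roman numeral's `.editorial`, following the suggestion above, so that values such as the measure count stay available downstream.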

The only thing that now seems to prevent conversion back to tsv from producing an appropriate result is that 'localkey' is a pitch name rather than a Roman numeral. Is there existing code somewhere in music21 that, given two pitch names (with case indicating major/minor), will express the latter as a Roman numeral relative to the former?

@MarkGotham
Owner

Maybe getScaleDegreeAndAccidentalFromPitch?

@MarkGotham
Owner

MarkGotham commented Feb 9, 2022

from music21 import key, pitch

cMaj = key.Key('C')
cMaj.getScaleDegreeAndAccidentalFromPitch(pitch.Pitch('E'))

@jacobtylerwalls

A bit involved, but ...

>>> from music21 import chord, key, roman
>>> p1 = 'C'
>>> p2 = 'e'
>>> p2_rn = roman.RomanNumeral('i' if p2.lower() == p2 else 'I', keyOrScale=key.Key(p2))
>>> p2_rn.pitches
(<music21.pitch.Pitch E4>, <music21.pitch.Pitch G4>, <music21.pitch.Pitch B4>)
>>> r = roman.romanNumeralFromChord(chord.Chord(p2_rn.pitches), keyObj=key.Key(p1))
>>> r
<music21.roman.RomanNumeral iii in C major>

@malcolmsailor
Contributor

Alright, I cleaned up all my TODOs. (There is still one TODO in the tsvConverter.py that I believe originates with you, Mark.)

However, conversion to DCML isn't necessarily perfectly functional, because 'vii' seems to be interpreted differently by music21 and DCML. So '#viio6/ii' in the DCML becomes 'viio6/ii' when written back out with M21toTSV, which then becomes 'bvii6/ii' when read in again. If there's an easy fix for this, great, but otherwise I'm not inclined to spend too much time on it, since (as far as I know) no one is actually using the conversion in this direction.

@MarkGotham
Owner

Closing this as the discussion is now on cuthbertLab/music21#1267 and #42
