Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add utility for removing duplicated objects (e.g. keys, clefs) #1454

Merged
merged 29 commits into from
Mar 17, 2023

Conversation

MarkGotham
Copy link
Contributor

No description provided.

@MarkGotham
Copy link
Contributor Author

This is a first implementation of what we discussed in #1392.
Call it a draft that's working well enough now to share the logic and invite comments.

Questions:

  • Right breadth of generalisation? Currently defaults to meter.TimeSignature, key.KeySignature, clef.Clef but can be any list of classes. Raise exception in some cases?
  • Comparison (line 137) based on str(thisState) == str(currentState). Is this the best way? List of classes doesn't work and obviously nothing specific to any of the classes listed would be generalisable across classes.

@MarkGotham MarkGotham marked this pull request as draft October 7, 2022 12:10
@coveralls
Copy link

coveralls commented Oct 7, 2022

Coverage Status

Coverage: 93.053% (+0.0009%) from 93.052% when pulling fc0348a on MarkGotham:duplicates into a8ba6e4 on cuthbertLab:master.

@mscuthbert
Copy link
Member

Can you rename the PR for what it does -- when it gets merged and I need to do the News items, good PR titles are super essential.

@mscuthbert
Copy link
Member

I wouldn't compare the string state, just use equality (x == y) and if there's an issue with what compares as == in a class, we can fix that there. String reps are just for humans not computers. Other than that, looks like it's on the right track.

You're going to need to cast to list early or -- better -- make a list of elements to remove and their active sites and then afterwards remove them in bulk -- otherwise the stream needs to be resorted n times for n^2 log n operations instead of n log n operations. If you remove while iterating there are bugs (like popping from a list you're iterating over).

@MarkGotham MarkGotham changed the title First implementation of #1392 Remove duplicated time signatures, key signatures, clefs (cf. #1392) Oct 7, 2022
@MarkGotham
Copy link
Contributor Author

MarkGotham commented Oct 7, 2022

I wouldn't compare the string state, just use equality (x == y) and if there's an issue with what compares as == in a class, we can fix that there. String reps are just for humans not computers. Other than that, looks like it's on the right track.

Thanks. Yes, I resorted to string because when I compared the objects directly before it failed to recognise the equality. I assumed because there was a difference in something like the ID, but it seems that:

>>> k = key.KeySignature(2)
>>> k
<music21.key.KeySignature of 2 sharps>
>>> j = key.KeySignature(2)
>>> j
<music21.key.KeySignature of 2 sharps>
>>> k == j
True

So I guess that's fine? If so, the problem must have been something else.
I'll implement that soon.

@MarkGotham
Copy link
Contributor Author

Also wry mypy, I'm not totally clear on best practice in the stream. specific imports here.
It seems like there's a range of approaches among other files within the stream/ directory.
Thanks for any tips and preferences.

@jacobtylerwalls
Copy link
Member

jacobtylerwalls commented Oct 8, 2022

I'm not totally clear on best practice in the stream. specific imports here.

Looks like you're only using stream in type annotations, which aren't evaluated at runtime. Given that, you will need to import it, but you can guard it like this so it's only imported for type-checking:

import typing as t
... other imports...
if t.TYPE_CHECKING:
    from music21 import stream

@MarkGotham
Copy link
Contributor Author

import typing as t
... other imports...
if t.TYPE_CHECKING:
    from music21 import stream

Thanks @jacobtylerwalls for the tip! Caveat is that apparently ...
from __future__ import annotations
... has to come first.
(Big thanks to python for the recent improvements to error messaging!)

@MarkGotham
Copy link
Contributor Author

Thanks for the notes @mscuthbert and @jacobtylerwalls

... The plot thickens ...

Returning to the good practice x==y comparison check, I'm getting:

  • the wrong result for time signature duplicate (3/4 ... i.e., the two 3/4s are still treated as a non-match ... hence the resorting to strings discussed above);
  • the right result for keeping the actual time signature changes (2/4, perhaps not so significant given the error on the 3/4);
  • the right result on all the other classes (key signature and clef).

As it's the time signature, I wonder if the issue is something to do with makeMeasures, but the derivations attributes seem fine?

I can imagine various work arounds (e.g., with vars() or similar) but can't immediately see anything that actually fixes the issues here.

Full comprison fail:

Expected:
    {0.0} <music21.stream.Measure 1 offset=0.0>
        {0.0} <music21.clef.TrebleClef>
        {0.0} <music21.key.KeySignature of 6 sharps>
        {0.0} <music21.meter.TimeSignature 3/4>
        {0.0} <music21.note.Note C>
        {1.0} <music21.note.Note C>
        {2.0} <music21.note.Note D>
    {3.0} <music21.stream.Measure 2 offset=3.0>
        {0.0} <music21.note.Note E>
        {1.0} <music21.note.Note F>
        {2.0} <music21.note.Note G>
    {6.0} <music21.stream.Measure 3 offset=6.0>
        {0.0} <music21.clef.BassClef>
        {0.0} <music21.key.KeySignature of 5 flats>
        {0.0} <music21.meter.TimeSignature 2/4>
        {0.0} <music21.note.Note A>
        {1.0} <music21.note.Note B>
    {8.0} <music21.stream.Measure 4 offset=8.0>
        {0.0} <music21.note.Note C>
        {1.0} <music21.bar.Barline type=final>
Got:
    {0.0} <music21.stream.Measure 1 offset=0.0>
        {0.0} <music21.clef.TrebleClef>
        {0.0} <music21.key.KeySignature of 6 sharps>
        {0.0} <music21.meter.TimeSignature 3/4>
        {0.0} <music21.note.Note C>
        {1.0} <music21.note.Note C>
        {2.0} <music21.note.Note D>
    {3.0} <music21.stream.Measure 2 offset=3.0>
        {0.0} <music21.meter.TimeSignature 3/4>
        {0.0} <music21.note.Note E>
        {1.0} <music21.note.Note F>
        {2.0} <music21.note.Note G>
    {6.0} <music21.stream.Measure 3 offset=6.0>
        {0.0} <music21.clef.BassClef>
        {0.0} <music21.key.KeySignature of 5 flats>
        {0.0} <music21.meter.TimeSignature 2/4>
        {0.0} <music21.note.Note A>
        {1.0} <music21.note.Note B>
    {8.0} <music21.stream.Measure 4 offset=8.0>
        {0.0} <music21.note.Note C>
        {1.0} <music21.bar.Barline type=final>

@jacobtylerwalls
Copy link
Member

Hey Mark. The time signature comparison issue is familiar because I remember it from #823. Until we implement __eq__ for TimeSignatures, this PR could use ratioEqual, which would be better than str.

Or, what I hear Myke saying, is that we could implement __eq__ as just calling ratioEqual, and then we can remove the workaround here:

multiMeter: bool = isMultiAttribute(TimeSignature, comparison='ratioEqual')

I suggest doing that in a separate PR first to get it in faster. There are some examples in the code base for implementing __eq__ (see KeySignature).

@jacobtylerwalls
Copy link
Member

The potential downside of making __eq__ a synonym for ratioEqual is that it will make flavors 5/8 divided differently suddenly equal. I think I'm in favor of not implementing __eq__ and just using ratioEqual in this PR.

@MarkGotham
Copy link
Contributor Author

Thanks @jacobtylerwalls for providing this rich and helpful context.

Seems like a great fix to me, specifically:

  • __eq__ for assessing equality among time signatures
  • Then we can stick with the class-neutral logic here (probably a main reason for having __eq__ right?)

@mscuthbert
Copy link
Member

I think that I'm going to need a whole equality hierarchy for Music21Object, so that classes can reasonably assume that super().__eq__(other) must be True for equality.

Will make new PR for that.

Notably, this means removing time signatures for now while they're problematic.
See cuthbertLab#1457
Simplifies nested comparisons and doesn't run for
- len 0 ... i.e. not used,
- len 1 ... i.e. doesn't change
@MarkGotham MarkGotham changed the title Remove duplicated time signatures, key signatures, clefs (cf. #1392) Remove duplicated key signatures, clefs (cf. #1392) Oct 11, 2022
@MarkGotham
Copy link
Contributor Author

Ok, here's a clean up that clearly separates out the different issues here and (hopefully) wraps this PR.

It looks like ...

  1. time signature equality is the only remaining issue here, and is not the scope of this PR.
  2. wider work on __eq__ is in the pipeline.

... So those two get their own PRs ...

  1. __eq__ comparison on time signatures #1457 for time signatures is (is a work in progress, either on its own or as part of ...
  2. ... v9 -- equality hierarchy #1458, which is clearly a bigger job for v9.

Separating that off, this PR now ...

  • reduces scope (for now) to two classes that work
  • delivers a usable implementation for what it set out to do
  • remains compatible with those (future, external) changes by being based on the == comparison.

I'd suggest we

  • merge this now
  • continue work on the other PRs
  • add more classes here when they're reliable

So then the only remaining fail is on lint with:

music21/stream/tools.py:29: [W0102(dangerous-default-value), removeDuplicates] Dangerous default value [] as argument

Not sure how to improve that, especially as there are only two lists and:

  • classesToRemove is not [] by default and
  • listOfObjectsToRemove does start as [] but isn't among the arguments.

@mscuthbert
Copy link
Member

I'd suggest we

  • merge this now

No. That's not how this project works.

@MarkGotham
Copy link
Contributor Author

No. That's not how this project works.

Thanks for the quick reply @mscuthbert. Of course it's up to you ... but I don't see why not in this case.

It may be useful to pursue this a little further as an example case for merge criteria, and to see whether any clarifications or changes are needed to the stated requirements.

At a minimum, the discussion would help me continue to serve music21 as best I can. It may well benefit others, both current and (perhaps especially) potential contributors. If this is more appropriate for the group or some other context, then feel free to move there.

As I hope will be clear, I offer this in the most constructive spirit possible with the aim of making the development and use of music21 as positive and rewarding as it can be.

Criteria for inclusion

I know and understand the developer reference guide for contributions and the newer version here.

I'm also aware that that's something of a work in progress, with the later file being new, updates being added as we speak, and integrating of those two docs presumably being on the cards?

For instance, the expectation to suggest any new contribution and get it approved in advance before developing a PR is in the latter and not the former.

In any case, as far as I can see, the conditions stated there have been met in this case.

Possible reasons against this merge:

  1. The proposals for __eq__ to emerge out of this might move the goal posts for the code here?
    • Is there likely to be any change to the behaviour of __eq__ on key signatures or clefs? Unlikely, right?
    • Even if there is, surely that would be a breaking change and out of scope here?
  2. It should include time signatures.
    • Ideally yes, but as we've discovered, they're buggy for some unclear reason that goes way beyond the scope here and might require v9 to correct.
    • Clearly there's no question of including them while they don't work, but what's the argument against the narrower scope that excludes them until they're ready?
  3. Adding time signatures (and/or other classes) later might be problematic, e.g., it might change how best to set this out.
    • Surely not, given the class-neutral logic?

Perhaps I'm missing something. Thanks for clarifying if so.

Why do I care? What's the rush? Why does it matter to music21?

Obviously we must avoid wasting volunteer contribution time. A few days ago we discussed a near-merge PR that was dormant for only a few weeks before being superseded by a parallel PR on the same topic.

Whatever the relative importance of that specific PR, it is significant to think that the process for coordination is not reliable at the work in progress phase. Once a PR is merged, then clearly it has to be taken into consideration for new contributions, but prior to that point, we're susceptible to loosing coordination.

Work on v9 will take some time and contributions that have to wait for it are particularly susceptible to this duplicated work, let alone the possibility that contributors will moving on and forget about it, or at least loose clarity on the details.

Ok, that's all!

Please excuse the long message! I hope it is useful, e.g., by channelling this discussion into more detailed contributor advice, whether or not that involves a change in practice, or simply a clarification.

Thanks for your thoughts, here or elsewhere, and here's to a great future for music21 with a wide and happy contributor base.

@MarkGotham MarkGotham marked this pull request as ready for review February 1, 2023 11:30
@MarkGotham
Copy link
Contributor Author

Mypy remains ...

Need type annotation for "removalDict" (hint: "removalDict: Dict[<type>, <type>] = ...") [var-annotated]

Any tips @jacobtylerwalls ?

Maybe something like removalDict: Dict[x, y] = dict() ... where x and y are the active site and class to remove's object classes respectively. So in y's case, meter.TimeSignature | key.KeySignature | clef.Clef?

Keen to be a good citizen ... but not clear to me that that's helpful / worth it here.

Thanks for any advice.

@jacobtylerwalls
Copy link
Member

I think Music21Object is specific enough; we don't need to guess what the user's removeClasses are. This diff quiets mypy!

diff --git a/music21/stream/tools.py b/music21/stream/tools.py
index d26719acf..20d6a5bff 100644
--- a/music21/stream/tools.py
+++ b/music21/stream/tools.py
@@ -10,6 +10,7 @@
 # ------------------------------------------------------------------------------
 
 from __future__ import annotations
+import typing as t
 
 from music21 import clef
 from music21 import environment
@@ -17,6 +18,10 @@ from music21 import key
 from music21 import meter
 from music21 import stream
 
+if t.TYPE_CHECKING:
+    from music21.base import Music21Object
+
+
 environLocal = environment.Environment('stream.tools')
 
 
@@ -156,7 +161,7 @@ def removeDuplicates(thisStream: stream.Stream,
 
     supportedClasses = (meter.TimeSignature, key.KeySignature, clef.Clef)
 
-    removalDict = {}
+    removalDict: dict[stream.Stream, list[Music21Object]] = {}
 
     if not inPlace:
         thisStream = thisStream.coreCopyAsDerivation('removeDuplicates')

Copy link
Member

@jacobtylerwalls jacobtylerwalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is a cool tool!

@MarkGotham
Copy link
Contributor Author

Thanks again for the review/s @jacobtylerwalls .
Are you going to merge or wait for @mscuthbert's input?
Myke, I hope this satisfies what you envisioned here.

@jacobtylerwalls jacobtylerwalls changed the title Remove duplicated key signatures, clefs (cf. #1392) Add utility for removing duplicated objects (e.g. keys, clefs) Mar 17, 2023
@jacobtylerwalls jacobtylerwalls merged commit fe306c2 into cuthbertLab:master Mar 17, 2023
@jacobtylerwalls
Copy link
Member

Thanks, Mark!

@mscuthbert
Copy link
Member

Thanks Mark and Jacob.

One thing I just caught now -- should probably have a little patch that if run on an Opus it calls it on each of the .scores (which then will call on each of the .parts). Not a big deal.

mxordn added a commit to mxordn/music21 that referenced this pull request Mar 21, 2023
Add utility for removing duplicated objects (e.g. keys, clefs) (cuthbertLab#1454)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rntxt repeats duplicating time signatures (from comment on #1387)
4 participants