match on music brainz id when name match fails ? #811

Closed
DutchJaFO opened this issue Aug 6, 2022 · 9 comments
Labels
bug Something isn't working enhancement Improvements of existing functionality

Comments

@DutchJaFO

DutchJaFO commented Aug 6, 2022

What version of Music Assistant has the issue?

2022.8.1

The problem

I found two duplicate artist entries in my library :
= Salt-N-Pepa
= "Weird Al" Yankovic

How to reproduce

Both names looked identical, but looking more closely I noticed that "Weird Al" Yankovic had two variants :
= One with 'standard' double quotation marks showing the Spotify tracks/albums : "Weird Al" Yankovic
= One with fancy double quotation marks showing the local file system tracks : “Weird Al” Yankovic

The local filesystem files had been updated with info from Music Brainz Picard with UTF-16 encoding.
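The two spellings differ in the code points used for the quotation marks, so an exact string comparison treats them as different artists regardless of how the tags are encoded on disk. A minimal Python check (the helper name `describe` is made up for this illustration) makes the difference visible:

```python
import unicodedata

straight = '"Weird Al" Yankovic'          # ASCII quotation marks (U+0022)
curly = "\u201cWeird Al\u201d Yankovic"   # typographic quotes (U+201C / U+201D)


def describe(text):
    """Return (code point, Unicode name) for each character that is not a letter, digit or space."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch))
        for ch in text
        if not ch.isalnum() and not ch.isspace()
    ]


print(straight == curly)   # the strings are not equal
print(describe(straight))
print(describe(curly))
```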

The Salt-N-Pepa situation was fixed by

  1. adding the plug-in named "Unicode Hyphen" to MusicBrainz Picard
  2. re-scanning the affected albums/tracks
  3. the updates showed a change in artist name
  4. saving the updated tracks

I did try a force resync in MA to fix the situation, but that didn't appear to solve it.
As such I had to go nuclear (disable MA integration, remove the musicassistant.db file from system, enable MA integration).
This did

Relevant log output

No response

Additional information

Music Brainz Picard was set to use ID3 v2.3 UTF-16

Spotify may be using the UTF-8 character set, which could explain why multiple versions of a track from various sources appear identical to the user in the MA library.

Possibly related :
Track names on Spotify don't appear to match the ones in MusicBrainz either.
Salt-N-Pepa's debut album "A Salt with a deadly Pepa" has a track listed in MusicBrainz as "Shake Your Thang" while Spotify lists it as "Shake Your Thang (It's Your Thing)", so the two don't match on track name.

I suspect that this mismatch in track/album/artist names between spotify (streaming service) and music brainz data may also cause several other tracks, albums and artists not to be matched in MA library.

Suggestion :
If Spotify (or any other service) has the MusicBrainz IDs available, this should make matching artists, albums and tracks easier.
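The suggestion could be sketched as follows. This is only an illustration under stated assumptions: the `Artist` class, its fields and `same_artist` are hypothetical and not MA's actual data model, and the ID strings are placeholders rather than real MusicBrainz IDs. It also only helps where a provider actually supplies the ID.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Artist:
    """Hypothetical record; not Music Assistant's real model."""
    name: str
    musicbrainz_id: Optional[str] = None  # None when the provider supplies no ID


def same_artist(a: Artist, b: Artist) -> bool:
    """Compare by MusicBrainz ID when both sides have one, otherwise by exact name."""
    if a.musicbrainz_id and b.musicbrainz_id:
        return a.musicbrainz_id == b.musicbrainz_id
    return a.name == b.name


# Placeholder IDs for illustration only.
local = Artist("\u201cWeird Al\u201d Yankovic", "placeholder-id-1")
with_id = Artist('"Weird Al" Yankovic', "placeholder-id-1")
without_id = Artist('"Weird Al" Yankovic')

print(same_artist(local, with_id))     # IDs agree, so the name difference is ignored
print(same_artist(local, without_id))  # no ID on one side, falls back to the name
```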

What version of Home Assistant Core are you running

2022.8.1

What type of installation are you running?

Home Assistant OS

On what type of hardware are you running?

Generic x86-64 (e.g. Intel NUC)

@DutchJaFO
Author

Similar, but different issue :
The album "Beautiful Garbage" on spotify is listed as "beautifulgarbage" on Music Brainz Picard.
I ran into that issue when I couldn't find the album when updating tags with Picard.
Apparently the 'official' title for that album is indeed without the space ...

@OzGav
Contributor

OzGav commented Aug 7, 2022

Spotify doesn't supply MB ID so the exact name match is required. Please confirm that the mismatches are only occurring between local media and Spotify?

Clearly it is going to be difficult, if not impossible, to match when Spotify doesn't send the proper artist, track or album name. I have previously spoken to Marcel about matching on a percentage of characters, but he tried that in the past and it resulted in too many false positives.
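To illustrate why a percentage-based match is risky, here is a rough sketch using Python's standard `difflib`. It is not how MA did or does its matching, and the "Greatest Hits" titles are invented for the example; only the "Shake Your Thang" pair comes from this issue.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1, ignoring letter case."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


# Same track, titled differently by MusicBrainz and Spotify.
same_track = similarity("Shake Your Thang", "Shake Your Thang (It's Your Thing)")

# Two different (invented) albums whose titles are nearly identical.
different_albums = similarity("Greatest Hits", "Greatest Hits II")

print(f"same track, different titles:     {same_track:.2f}")
print(f"different albums, similar titles: {different_albums:.2f}")
```

The pair of different albums scores higher than the pair that really is the same track, so any threshold low enough to accept the latter would also accept the former.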

If Marcel is happy to adjust the sort order to ignore case (as per another issue #755) then that will help a little in cases where names are the same apart from letter case.

Any other ideas from others welcome!

@OzGav OzGav added enhancement Improvements of existing functionality and removed triage labels Aug 7, 2022
@DutchJaFO
Author

DutchJaFO commented Aug 7, 2022

I consider it a mismatch when a file on my system is not matched to the equivalent artist/album/track on Spotify.

yep ... all mismatches I've seen so far were local vs spotify.

I'd suggest these options :

  1. Recommend UTF-8 character encoding when tagging local files. This needs no fix in MA, but it may make life easier, because UTF-8 is likely to be the standard encoding used by streaming services, which would reduce the workload for MA. At the very least, recommend avoiding ISO 8859-1 unless specific streaming services output it.

  2. Convert text to UTF-8 if it is in UTF-16 or ISO 8859-1 before making a match within the MA database. I am assuming that most streaming services only provide UTF-8 output, although some may have options for alternate character encodings.

  3. Prefer case-insensitive matching, and maybe make it culture-invariant too (this may be new). This would allow MA to ignore diacritical variants of characters and would further improve automatic matching.
    Example : E would match to È, É, Ë, è, ê, etc. (and vice versa)

  4. allow manual matching by user (this would require additional dialogs).
    The last option would help users solve situations when multiple streaming services are in use, when Plex (and similar alternatives to filesystem) are added or when there are reasons why fixing the tags in the local filesystem output isn't possible.
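Suggestions 2 and 3 above can be sketched in Python with the standard library. This is a hedged illustration, not MA's code; the function name `fold` is hypothetical.

```python
import unicodedata


def fold(text: str) -> str:
    """Casefold and drop combining marks so that 'É' and 'e' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Suggestion 2: once decoded, text from UTF-16 and ISO 8859-1 tags is the same str.
from_utf16 = "È".encode("utf-16").decode("utf-16")
from_latin1 = "È".encode("iso-8859-1").decode("iso-8859-1")
print(from_utf16 == from_latin1)

# Suggestion 3: case- and diacritic-insensitive comparison.
print({fold(ch) for ch in "EÈÉËèê"})
```

Note that this folding does not unify typographic quotes or hyphens with their ASCII counterparts; those are distinct characters without a compatibility decomposition and would need a separate mapping.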

@OzGav
Contributor

OzGav commented Aug 7, 2022

Some good ideas there. Let's see what @marcelveldt thinks. Note that the tags added by Picard are considered a source of truth so I am pretty sure the least preferred option is "fixing the tags" (unless fixing them is tagging with Picard :-) )

@DutchJaFO
Author

Yes and no.
I'd argue that the user is the only one who gets to decide what his version of the truth actually is.
Especially for files that are on his system.

If he wants to match his tags with those of his favourite provider then that should be an option.
Everything else requires MA to maintain a database of sorts that matches all of the content available in Spotify(*) to MusicBrainz data ...

(*) or any other music service ... Plex does something similar by providing users with several sources of meta-data to enrich their media.

@OzGav
Contributor

OzGav commented Aug 7, 2022

I understand your point. At this time MA attempts to do all the matching work for you to save the user time and effort. However, due to limitations in the metadata supplied by the various music providers, it is difficult to achieve a 100% success rate without introducing false positives. Your idea of being able to link manually is a good one, but it will be a feature enhancement. As for the UTF encoding, Marcel accepted that as a bug on Discord, so I will add that label as well.

@OzGav OzGav added the bug Something isn't working label Aug 7, 2022
@DutchJaFO
Author

Manual matching is definitely a 'nice to have' feature for the future that will make life easier, especially for users who don't have a (local) filesystem source that can be updated by re-tagging a few files.

@marcelveldt
Member

Implemented a fix in version 2022.8.2. Can you confirm that it works?

@OzGav
Contributor

OzGav commented Aug 21, 2022

Hi @DutchJaFO. The case-insensitive sorting has been implemented. Marcel also believes he has implemented your suggestion #3 above, and UTF encoding shouldn't matter because MA only compares ASCII characters now. Manual matching is something that will be looked at once the auto logic is perfected.

On that basis we will close this issue soon unless you have further problems. Thanks!

P.S. Do refresh your database after updating
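For readers wondering what "only compares ASCII characters" might look like, here is a minimal sketch. It is an assumption about the general technique, not the actual 2022.8.2 implementation; `ascii_key` is a hypothetical helper.

```python
import unicodedata


def ascii_key(text: str) -> str:
    """Build a comparison key from ASCII letters and digits only (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    return "".join(ch for ch in decomposed if ch.isascii() and ch.isalnum())


# The cases reported in this issue: quote style, hyphen style, and spacing.
print(ascii_key('"Weird Al" Yankovic') == ascii_key("\u201cWeird Al\u201d Yankovic"))
print(ascii_key("Salt-N-Pepa") == ascii_key("Salt\u2010N\u2010Pepa"))
print(ascii_key("Beautiful Garbage") == ascii_key("beautifulgarbage"))
```

One trade-off of such a key is that names written entirely in non-Latin scripts reduce to an empty string, so a real implementation would need a fallback for them.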

@OzGav OzGav closed this as completed Aug 25, 2022