Match existing songs by spotify URL instead of name #1641
How does this account for songs that are on Spotify multiple times, e.g. in different albums? They have different Spotify URLs but we wouldn't want to download them multiple times... |
I'm not sure what you mean. Could you give an example spotdl command (chain)? Usually these songs appear only once within a playlist or album, so when downloading a playlist they are also downloaded only once. |
The tracks below are all the same song, same artist, same content, but are released under different EPs/singles/albums. This is a common occurrence. Your PR will attempt to download the song 3 times, resulting in 2 YT-DLP errors (unintended behaviour, as the program wastes resources trying to download it multiple times): https://open.spotify.com/track/1JFX2Sj2ySyU6nIL7X3LWN?si=253198e8cc594a63 |
I assume you are talking about downloading a playlist containing only these 3 songs: Yes, this crashes. However, this is an old spotdl problem which occurs when songs with exactly the same name are right next to each other in a playlist. (It doesn't matter whether it's exactly the same song with the same ID or not; only the resulting name matters.) I realize now that my PR title is confusing, maybe just bad, since it contains "instead of" while the patch still performs both checks. It stems from the conceptual idea I had at the beginning to make the song ID the "primary key" of a song. Errors like the YT-DLP one are the only reason I've kept the filename check in: to preserve current functionality. |
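A minimal sketch of the "both checks" idea from the comment above: match on the Spotify URL first, then fall back to the resulting filename. The `ExistingFile` structure and the `already_downloaded` function are hypothetical names made up for illustration and are not spotDL's actual code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ExistingFile:
    """A file already on disk (hypothetical structure, not spotDL's own)."""
    filename: str
    spotify_url: Optional[str]  # None if no Spotify URL could be read from the tag


def already_downloaded(song_url: str, target_filename: str,
                       existing: List[ExistingFile]) -> bool:
    """Return True if the song should be treated as already present."""
    for f in existing:
        # Primary check: the Spotify URL acts like a primary key.
        if f.spotify_url is not None and f.spotify_url == song_url:
            return True
        # Fallback check: another file already uses the resulting name.
        if f.filename == target_filename:
            return True
    return False
```

With this shape, a song that was renamed on Spotify is still recognised through its URL, while two different tracks that resolve to the same filename are still caught by the name check.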
Functional overview
It might be confusing to grasp what I tried to change and how, without going thoroughly through all the code. So I thought I'd create an overview.
Metadata of songs:
Changed comment tag format from
Download option:
Get all files (with the chosen suffix, like .mp3) from the working directory and gather their spotify URL from the comment tag, if possible.
Behaviour for overwrite options:
Sync option
Get all previously downloaded songs from the JSON save-file.
Behaviour for overwrite options:
All three options are irrelevant for deleting songs and therefore now irrelevant for sync. Simply pass the used option to the downloader.
Meta option
Instead of using a song's name to search for its metadata, try using its spotify URL to match the song. |
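The sync step described above could look roughly like the following sketch. It assumes, purely for illustration, that the JSON save-file is a list of objects with a `url` key; the real save-file layout may differ, and `songs_to_delete` is a hypothetical helper name.

```python
import json
from typing import Dict, List


def songs_to_delete(save_file_text: str, current_urls: List[str]) -> List[str]:
    """Return URLs that were downloaded earlier but are no longer wanted.

    Assumption: the save-file is a JSON list of objects with a "url" key.
    """
    previous: List[Dict[str, str]] = json.loads(save_file_text)
    wanted = set(current_urls)
    # Compare by Spotify URL only, so a renamed song is not treated as removed.
    return [song["url"] for song in previous if song["url"] not in wanted]
```

Because the comparison uses only URLs, the overwrite option plays no part in deciding what to delete, which matches the overview's point that it can simply be passed through to the downloader.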
Please merge with the upstream dev branch. If you can't merge it for some reason, open a new PR or let me know.
Those 3 are not the same song. The last is the normal version, and the other two are the acoustic version and a remix. Should spotDL not download these versions separately (being different versions of the same song), instead of rejecting them because they all have the same name? |
You're right. And I also think that spotdl indeed should consider different song versions. That, and similar issues, is what my patch is about :-) My patch won't solve your problem yet, but it's the first step. |
I managed to merge with the master branch. Is it really that important to merge with dev? I'm new to GitHub. When I created my fork, I forked the master branch, and therefore the dev branch doesn't show up in my forked repo. I didn't find a way to merge with the dev branch other than using the GitHub conflict editor, which seems overly complicated at first glance. Apart from that, I still have two TODOs in there which I can't fix on my own. One is described in detail; the other is because of a bug where (as far as I can see) the "output" variable is misused. |
Nevermind, I managed to merge into Now all that's left are the two TODOs I can't fix myself. If you have any questions let me know. |
I have the same issue. I suggest adding the track ID as metadata and adding an option to scan storage locations and create an exclude list to be used as the --archive target. |
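As a rough illustration of the track-ID suggestion, the sketch below pulls the ID out of a Spotify track URL and collects IDs into an exclude set. Both function names are hypothetical, and how such a set would be fed to spotDL is left open.

```python
from typing import Iterable, Optional, Set
from urllib.parse import urlparse


def track_id_from_url(url: str) -> Optional[str]:
    """Extract the track ID from a Spotify track URL, ignoring query parameters."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) >= 2 and segments[-2] == "track":
        return segments[-1]
    return None


def build_exclude_set(urls: Iterable[str]) -> Set[str]:
    """Collect the track IDs of already stored songs (hypothetical helper)."""
    ids = (track_id_from_url(u) for u in urls)
    return {i for i in ids if i is not None}
```

Using the ID rather than the full URL means that share parameters such as `?si=...` do not make two links to the same track look different.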
Will merge this for v4.1.0 |
Hi Jakub, great to hear from you. |
v4.1.0 will come after v4.0.6. I have some major API changes in mind, which is why the next release won't be a minor one. I will resolve all the conflicts that arise, so you won't have to worry about this, and I will fix all the TODOs in the code. |
Oh, that's awfully nice of you. If there's anything I can do or if there's anything you'd like to know, just ask. |
350f3f5 to e21c433
Is this supposed to be implemented in dev already then? Because I'm on dev and I'm not getting the spotify urls in my comments in addition to the regular yt urls. |
Yes it is, but there have been quite some changes to my code. |
Match existing songs by spotify URL instead of name
Description
The download option checks whether a song exists by looking for the current spotify name. Because spotify song names change quite often, this leads to the same song being downloaded twice (with a different name).
Similarly, the songs deleted by sync are also found by matching current filenames with the song's current name on spotify, causing similar effects.
This commit uses the spotify URL instead of the filename to ensure uniqueness of song files. It does so by changing the comment tag, which previously contained the YouTubeURL, to the format YouTubeURL|SpotifyURL, and using this spotify URL to match songs for equality.
Related Issue
#1578
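The comment-tag format from the description can be sketched as below. This only shows building and parsing the `YouTubeURL|SpotifyURL` string; the function names are hypothetical, and reading or writing the tag in an actual audio file is not covered.

```python
from typing import Optional, Tuple

SEPARATOR = "|"


def build_comment(youtube_url: str, spotify_url: str) -> str:
    """Join both URLs into the new comment format."""
    return f"{youtube_url}{SEPARATOR}{spotify_url}"


def parse_comment(comment: str) -> Tuple[str, Optional[str]]:
    """Split a comment tag into (YouTube URL, Spotify URL or None).

    Old-style tags that hold only the YouTube URL yield None for the Spotify part.
    """
    youtube_url, sep, spotify_url = comment.partition(SEPARATOR)
    return youtube_url, (spotify_url if sep and spotify_url else None)
```

Returning `None` for old-style tags lets a caller fall back to another check for files that were downloaded before the format changed.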
Motivation and Context
How Has This Been Tested?
Tested on (almost) clean Debian.
--format and --overwrite {metadata,skip,force} parameters in various combinations.
meta for all filetypes and overwrite options
Todo
This is as far as I could get without help. Nonetheless, this is not a finished commit:
Types of Changes
Checklist