AniDB vs Anilist - add support for Movies and wo/o naming differences #126
Comments
I got my hands on the AniDB titles .xml.gz file and did some top-level counting. I discarded all lines from the XML except those with lang="x-jat" and type="main". I was left with 593 titles:
The numbers don't add up because Eiga + o or Gekijouban + o sometimes occur in the same title. I did this to run more data checks and to confirm the logic won't be harmful. I spotted some odd cases, please read on.
The wo->o rule
The overwhelming majority of examples would be perfect if o became wo. Some oddities:
Gekijouban rule
Some medium disappointment here: I have to go back on my initial assumption. Here are examples where the Gekijouban-less title would match a TV show of the same name:
Funny outlier:
Eiga rule
Not as many as in the Gekijouban case, but I can find similar issues. Here are examples where the Eiga-less title would match a TV show of the same name:
Other oddities:
Summary
Wo-ing the titles seems safe and desired. While all previous examples from my own library would match the correct Anilist title (after de-gekijoubaning or de-eigaing), there seem to be too many cases where it would cause problems. Instead, I think it's safer to attempt the following treatment:
I attach the file with the cleaned titles I used for the above research: https://gist.github.com/karpik123/760774de1a0a90156567d794a704e71a
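For anyone who wants to repeat the counting described at the top of this comment, here is a minimal Python sketch. It assumes the decompressed dump has one title element per line and filters on plain substrings, as described above. The sample lines, the titles in them and the helper names `main_xjat_titles` and `count_patterns` are made up for illustration and are not taken from the real dump.

```python
import re

# Illustrative stand-in for lines of the decompressed titles dump.
# Titles and attribute layout are assumptions made for this sketch.
SAMPLE_LINES = [
    '<title xml:lang="x-jat" type="main">Gekijouban Example Monogatari</title>',
    '<title xml:lang="en" type="official">Example Story the Movie</title>',
    '<title xml:lang="x-jat" type="main">Eiga Example o Sagashite</title>',
    '<title xml:lang="x-jat" type="syn">Example Synonym</title>',
    '<title xml:lang="x-jat" type="main">Sekai o Sukuu Example</title>',
]

# Captures the text between the opening and closing title tags.
TITLE_TEXT = re.compile(r">([^<]*)<")


def main_xjat_titles(lines):
    """Keep only lines with lang="x-jat" and type="main"; return the title text."""
    titles = []
    for line in lines:
        if 'lang="x-jat"' in line and 'type="main"' in line:
            match = TITLE_TEXT.search(line)
            if match:
                titles.append(match.group(1))
    return titles


def count_patterns(titles):
    """Count titles containing each pattern; one title may count more than once."""
    counts = {"gekijouban": 0, "eiga": 0, "o": 0}
    for title in titles:
        if "Gekijouban" in title.split():
            counts["gekijouban"] += 1
        if "Eiga" in title.split():
            counts["eiga"] += 1
        if " o " in title:  # standalone "o" surrounded by spaces
            counts["o"] += 1
    return counts


titles = main_xjat_titles(SAMPLE_LINES)
print(len(titles), count_patterns(titles))
```

On the sample, three titles survive the filter but the per-pattern counts sum to four, because one title has both Eiga and o. That is the same reason the real numbers don't add up. For the real file, the lines could be read with `gzip.open(path, "rt", encoding="utf-8")` instead of the sample list.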
I went through my library and synced everything. I use x-jat names from AniDB and I noticed two naming patterns that should be straightforward to cover, saving a lot of work on custom mappings.
First Pattern - 'Movie'
PlexAniSync could recognise this word and make an extra attempt to match the title after removing `Gekijouban<space>` from the string. Another similar example is 'Eiga':
Second Pattern - wo vs o
AniDB almost universally writes this particle as `o`, while Anilist uses `wo` in titles. I don't know Japanese well enough to understand why... PlexAniSync could catch `<space>o<space>` in the string and make an extra attempt to match the title after converting `o` into `wo`. Note that the top example from the table even has a double `o`.
While some titles might genuinely use `o` in the title, I don't expect them to match a completely different title even if PlexAniSync converts an innocent `o` into `wo`.
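To make the two proposed fallbacks concrete, here is a rough Python sketch of how the extra candidate titles could be generated. It is not PlexAniSync's actual code: the function name `candidate_titles`, the ordering of the candidates and the example titles are all assumptions for illustration. The variants without Gekijouban/Eiga are placed last because the follow-up comment in this thread found that such titles can collide with a TV show of the same name.

```python
import re

# "Gekijouban " or "Eiga " as a whole word followed by a space.
MOVIE_WORD = re.compile(r"\b(?:Gekijouban|Eiga) ")

# A standalone "o" with a space on each side. Lookarounds leave the spaces
# unconsumed, so a double "o" in one title is converted in a single pass.
STANDALONE_O = re.compile(r"(?<= )o(?= )")


def candidate_titles(title):
    """Return the original title followed by fallback variants, in try order."""
    candidates = [title]

    def add(candidate):
        # Skip empty strings and variants identical to one already queued.
        if candidate and candidate not in candidates:
            candidates.append(candidate)

    with_wo = STANDALONE_O.sub("wo", title)
    add(with_wo)                      # wo/o rule only
    add(MOVIE_WORD.sub("", title))    # Gekijouban/Eiga removed
    add(MOVIE_WORD.sub("", with_wo))  # both rules combined
    return candidates


# Hypothetical titles, not taken from AniDB:
print(candidate_titles("Gekijouban Hana o Sagasu Tabi"))
print(candidate_titles("Yume o Ou Hito o Matsu"))
```

A caller would try each candidate against Anilist in order and stop at the first match, so a title that already matches as-is never reaches the fallbacks.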