This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Community lexicons #43

Closed
karashiiro opened this issue Aug 13, 2021 · 45 comments
Labels
help wanted Extra attention is needed

Comments

@karashiiro
Owner

karashiiro commented Aug 13, 2021

Migrated to #60 - please continue there!

I don't use lexicons myself, so I don't have one I'm maintaining, but if anyone else has lexicons they're willing to share I'd appreciate it if they could drop a link so I can provide them to anyone who wants them and doesn't know how to make them themselves. Alternatively, feel free to post them in the #preset-sharing channel in the goat place Discord, and I'll relink them somewhere here.

@karashiiro karashiiro added the help wanted Extra attention is needed label Aug 13, 2021
@karashiiro karashiiro pinned this issue Aug 13, 2021
@johnysandels
Collaborator

johnysandels commented Aug 13, 2021

Main Character names British Voice Lexicon.zip

Works only with British voices. Uses aliases because phonemes aren't currently working with the standard voice setting.
Only corrects the pronunciation of supporting characters' names.

@johnysandels
Collaborator

FFXIVCharacters&LocationsEN.zip
Fixed a mistake where two entries' phonemes were swapped while testing.

I've done all the characters I noticed the most when going through MSQ, as well as mispronounced location names. These use phonemes, so pronunciation should be more consistent across different regions of English.
Works with all English voices (tested for US and GB).

@dedren

dedren commented Sep 13, 2021

Is there an app or something easy to make these lexicons?

@dedren

dedren commented Sep 13, 2021

I am not sure if this should be a separate issue or not, but I tried using the FFXIVCharacters&LocationsEN.zip lexicon (both through Amazon Polly and directly uploaded to the plugin) and it said "Maximum lexicons size has been exceeded".

@karashiiro
Owner Author

Is there an app or something easy to make these lexicons?

https://docs.aws.amazon.com/polly/latest/dg/gs-put-lexicon.html

I don't know of any apps for this, but this article has some lexicons used in its examples that might explain the concept.

I am not sure if this should be a separate issue or not, but I tried using the FFXIVCharacters&LocationsEN.zip lexicon (both through Amazon Polly and directly uploaded to the plugin) and it said "Maximum lexicons size has been exceeded".

I've never heard of this happening, but I assume that means Amazon Polly has some sort of size limit on lexicons. You can try splitting the lexicon in half, maybe? Pulling out half of the lexemes and putting them into a new lexicon file and uploading the resulting two smaller ones.

@dedren

dedren commented Sep 13, 2021

I've never heard of this happening, but I assume that means Amazon Polly has some sort of size limit on lexicons. You can try splitting the lexicon in half, maybe? Pulling out half of the lexemes and putting them into a new lexicon file and uploading the resulting two smaller ones.

Looks like you nailed it. As per Amazon Polly’s site, “Each lexicon can be up to 4,000 characters in size.”
I’ll have to wait until I have time to figure out how to separate them on an actual computer.
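
For anyone who wants to check a lexicon file before uploading it, here is a minimal Python sketch. It assumes the limit is counted in characters of the whole file, as the quoted sentence suggests; whether Polly actually counts characters or bytes is not confirmed here. The names `POLLY_MAX_CHARS`, `lexicon_length`, and `fits_polly` are made up for illustration.

```python
from pathlib import Path

# Assumption: the 4,000 figure quoted above applies to the whole file,
# counted in characters (not bytes).
POLLY_MAX_CHARS = 4000


def lexicon_length(path):
    """Return the number of characters in a lexicon file read as UTF-8."""
    return len(Path(path).read_text(encoding="utf-8"))


def fits_polly(path, limit=POLLY_MAX_CHARS):
    """Return True if the lexicon file is within the assumed limit."""
    return lexicon_length(path) <= limit
```

Calling `fits_polly("MyLexicon.pls")` on a local file (the file name is a placeholder) would show whether it is likely to trigger the "Maximum lexicons size has been exceeded" error.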

@johnysandels
Collaborator

Looks like you nailed it. As per Amazon Polly’s site, “Each lexicon can be up to 4,000 characters in size.”
I’ll have to wait until I have time to figure out how to separate them on an actual computer.

Wow, 4000 characters is quite a small limit. In the future I'll have to make split files for Amazon Polly. For now, though, I removed a lexeme that can't be used with the way the plugin currently works, and the character count is now 3999! If there is any issue, let me know!
FFXIVLexiconPollyEN.zip
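
The splitting idea suggested earlier in the thread (moving lexemes into several smaller lexicon files) could be automated roughly as follows. This is a sketch under assumptions, not how the attached files were produced: it assumes a standard PLS lexicon whose root `<lexicon>` element contains only `<lexeme>` children, and that the limit is measured in characters of the serialized file. `split_lexicon` and `MAX_CHARS` are hypothetical names.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
# Serialize PLS elements without a prefix (xmlns="..." on the root).
ET.register_namespace("", PLS_NS)

XML_DECL = '<?xml version="1.0" encoding="UTF-8"?>\n'
MAX_CHARS = 4000  # assumption: limit counted in characters per file


def _render(root, lexemes):
    """Serialize a copy of the root element holding only `lexemes`."""
    shell = ET.Element(root.tag, dict(root.attrib))
    shell.text = "\n"
    shell.extend(lexemes)
    return XML_DECL + ET.tostring(shell, encoding="unicode")


def split_lexicon(xml_text, max_chars=MAX_CHARS):
    """Greedily pack lexemes into lexicons of at most max_chars each."""
    root = ET.fromstring(xml_text)
    chunks, current = [], []
    for lexeme in root:
        if len(_render(root, [lexeme])) > max_chars:
            raise ValueError("a single lexeme does not fit in the limit")
        # Start a new file when adding this lexeme would overflow.
        if current and len(_render(root, current + [lexeme])) > max_chars:
            chunks.append(_render(root, current))
            current = []
        current.append(lexeme)
    if current:
        chunks.append(_render(root, current))
    return chunks
```

Each returned string can be written to its own file and uploaded separately. The greedy packing keeps the original lexeme order, which makes the resulting files easy to compare against the source lexicon.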

@dedren

dedren commented Sep 13, 2021

Oh damn, thank you so much! Hah, karashiiro saw the future and knew it needed to get under 4000 characters LOL. But seriously, thank you so much, I'll try it in a few, hopefully, after a few morning jobs. Also, do you recommend uploading it to Amazon directly or just uploading it to the addon?

@johnysandels
Collaborator

johnysandels commented Sep 13, 2021

You'll need to upload it through the TextToTalk plugin.

@johnysandels
Collaborator

johnysandels commented Sep 25, 2021

FFXIVCharacters&LocationsEN.zip

  • Fixes Cid's name not being pronounced properly in some regions. Still within the 4000-character limit for Amazon Polly.

@johnysandels
Collaborator

FFXIVCharacters&Locations.zip

  • Fixes pronunciation of Urianger's name for Microsoft David. Hopefully works with other voices (confirmed working for Zira at least). I haven't been able to test others since I reinstalled Windows. Under 4000 characters, btw.

@karashiiro
Owner Author

FFXIVCharacters&Locations.zip

* Fixes pronunciation of Urianger's name for Microsoft David. Hopefully works with other voices (confirmed working for Zira at least). I haven't been able to test others since I reinstalled Windows. Under 4000 characters, btw.

That zip looks empty 👀

@johnysandels
Collaborator

FFXIVCharacters&Locations.zip
Let's pretend you didn't see that 👀

@johnysandels
Collaborator

johnysandels commented Dec 5, 2021

--------Update: Added plurals to the new additions--------

Fixed pronunciation for the word Aetheryte and one of the expansion's location names.
There will be more updates coming as I play through the expansion! Glad to have TTT on the first day I've been able to play the story!

Also, I've split the Polly zip version into two lexicon files to respect the 4000-character limit.

FFXIVCharacters.Locations.zip

FFXIVCharacters.Locations.Polly.zip

@johnysandels
Collaborator

Capitalized Hydaelyn so it will actually work now 😅

FFXIVCharacters.Locations.zip
FFXIVCharacters.Locations.Polly.zip

@ryankhart
Collaborator

@johnysandels What language are you making the Lexicon for?

If it's English, the official pronunciation of Yugiri, according to the voice acting, is You-gear-ee, not You-gid-ee.

But maybe you experience the game in Japanese and they pronounce it differently?

Despite this, I greatly appreciate the work that you've put into this Lexicon. It has made my overall experience that much better than before.

@johnysandels
Collaborator

johnysandels commented Dec 17, 2021

If it's English, the official pronunciation of Yugiri, according to the voice acting, is You-gear-ee, not You-gid-ee.

But maybe you experience the game in Japanese and they pronounce it differently?

Oh, I play in English!
I based the pronunciation on how people native to Doma say her name, since it seems like the more genuine way to pronounce it. It seems like the people who aren't from Doma say it in a more Western way. I specifically tried matching Hien's and Gosetsu's pronunciation.

Possible spoilers for Stormblood in the videos.
Hien:
https://youtu.be/KbEsvkbeuo4?t=67

First example I found from Gosetsu:
https://youtu.be/DGE_X8GfvHI?t=267

@ryankhart
Collaborator

I based the pronunciation on how people native to Doma say her name

Oh, that must be why. I haven't reached Doma yet in Stormblood.

@Trixemyar

Hello there. Let me start by saying a big thank you for all the work you put into making lexicons. I have no knowledge of how this stuff is done, but I really like how people like you help bring the game alive for people like me.
Now to the issue I face: I'm using Polly to help bring TTT to life, but some of the beast tribe names are being read all wrong. I'm pretty early into the game, still in ARR, but the two I noted are "Ixal" and "Amalj'aa". Would it be possible to fix these?
Thanks again for all the hard work you put into this 👍

@ryankhart
Collaborator

ryankhart commented Dec 18, 2021

@Trixemyar If the IPA notation Johnysandels uses intimidates you like it does me, try using aliases instead and use trial and error to trick it into the correct pronunciation. The easiest way I've found to test this is Amazon's page for it here: https://us-west-2.console.aws.amazon.com/polly/home/SynthesizeSpeech

<lexeme>
   <grapheme>Ixals</grapheme>
   <alias>Icksals</alias>
</lexeme>
<lexeme>
   <grapheme>Amalj'aa</grapheme>
   <alias>Amaldja</alias>
</lexeme>

I'll add this to my personal lexicon that I'm compiling, and post it here in a bit. It's far from comprehensive. I just add things as I play and hear odd mispronunciations.

@ryankhart
Collaborator

ryanslexicon.zip

@ryankhart
Collaborator

I'd be willing to make pull requests to this repo if community lexicons were put under source control. I know basic git. And the scope of this community lexicon project is small enough for me to wrap my head around it. I might even be willing to help merge future community contributions posted here by people who don't want to bother with git. I can specify the github username of contributors in the commit notes as well as in XML comments.

@johnysandels
Collaborator

johnysandels commented Dec 18, 2021

I'll have those added later tonight! Just as soon as I get off work 😎

@Trixemyar If the IPA notation Johnysandels uses intimidates you like it does me, try using aliases instead

And true about aliases! I started off using aliases because they were easier to understand, but I switched to phonemes because I noticed that different regions' voices pronounce aliases differently. Phonemes are much more consistent across different regions, so I learned how they work to make a lexicon that is more universally applicable!

Also, phonemes are much simpler than you would think. I use this to get the phonetics of words that are similar to the word that I'm trying to make, then this to test out the pronunciation. I tend to Frankenstein pieces of other words together to build the word!
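
To illustrate the difference between the two entry types, here is a small Python sketch that builds a lexicon containing both alias and phoneme lexemes. It is not the tooling used for the lexicons in this thread. The function name `build_lexicon` is hypothetical, the alias comes from the example earlier in the thread, and the IPA string for "Yugiri" is only a rough rendering of the "You-gear-ee" reading mentioned above, not a vetted transcription.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"
# Serialize PLS elements without a prefix (xmlns="..." on the root).
ET.register_namespace("", PLS_NS)


def build_lexicon(aliases, phonemes, lang="en-US"):
    """Build a PLS lexicon string from two dicts keyed by grapheme.

    aliases:  word -> respelling the voice should read instead
    phonemes: word -> IPA transcription (alphabet="ipa")
    """
    lexicon = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", f"{{{XML_NS}}}lang": lang},
    )
    for kind, entries in (("alias", aliases), ("phoneme", phonemes)):
        for word, value in entries.items():
            lexeme = ET.SubElement(lexicon, f"{{{PLS_NS}}}lexeme")
            ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = word
            ET.SubElement(lexeme, f"{{{PLS_NS}}}{kind}").text = value
    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(lexicon, encoding="unicode")


# Illustrative entries only; the IPA value is an unverified guess.
example = build_lexicon(
    aliases={"Amalj'aa": "Amaldja"},
    phonemes={"Yugiri": "juˈɡɪəri"},
)
```

An alias is read using each voice's own pronunciation rules, which is why results can differ between regional voices, while a phoneme entry spells out the sounds directly.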

@karashiiro
Owner Author

I'd be willing to make pull requests to this repo if community lexicons were put under source control. I know basic git. And the scope of this community lexicon project is small enough for me to wrap my head around it. I might even be willing to help merge future community contributions posted here by people who don't want to bother with git. I can specify the github username of contributors in the commit notes as well as in XML comments.

I think this is a good idea, I can set this up if you’re willing to help maintain it 👀 Having at least some lexicons be maintained in the repo itself should ensure that there are always updated lexicons to use, even if their original authors are unavailable.

@ryankhart
Collaborator

if you’re willing to help maintain it 👀

👀 is right! But I'm always wishing I had something I could contribute to that I have the skills for that nobody else is already doing. And this seems like a good fit for me. I'll let you decide where you want to store the files initially, and I can update it from there over time.

@johnysandels
Collaborator

johnysandels commented Dec 18, 2021

I'll still keep updating here then, since I'm not familiar with Git! I also personally think it would be really cool to either have a lexicon included with TTT that users can choose to enable, or the ability to download lexicons within the plugin, since that would be easier to access for the average user! Or maybe just a link to the place where they could download one and an explanation of why they might want to use one!

@ryankhart
Collaborator

ryankhart commented Dec 18, 2021

Oh, I found the place where the default folder location can be set to a folder that contains various included Lexicon XML files.

InitialDirectory = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments),

And this is the place where the UI appears to be set.

Now, I'm just wondering whether there's a risk that plugin updates will overwrite user-placed lexicons there. I'm not sure how that would work out of the box. I know user preferences are preserved after updates, but ideally updates would replace the provided lexicon XML files and not touch new files added by users.

That's just me thinking aloud, brainstorming. I don't expect a response. I may be able to mess with that code locally to see how that works if I can actually manage to figure out how to compile a relatively large code project (compared to what I'm used to for university assignments).

@johnysandels
Collaborator

johnysandels commented Dec 18, 2021

I have an update ready for Ixal, Amalj'aa, and the other beast tribes, but some plugin bugs need to be ironed out before they will work. It'll need fixes for issues #58 and #48 first!

@Trixemyar

@ryankhart @johnysandels Thank you once again for all the work you put in. It really does mean a lot. Personally, my experience with this game would suck without this addon: because I'm dyslexic, it takes me forever to read all the text in this game, so one of the first things I did was look up a workaround. I'll keep my ears open for any errors I can pick up and report. On behalf of all who have issues reading, I thank you all :D

@karashiiro
Owner Author

Oh, I found the place where the default folder location can be set to a folder that contains various included Lexicon XML files.

(snip)

Now, I'm just wondering whether there's a risk that plugin updates will overwrite user-placed lexicons there. I'm not sure how that would work out of the box. I know user preferences are preserved after updates, but ideally updates would replace the provided lexicon XML files and not touch new files added by users.

That's just me thinking aloud, brainstorming. I don't expect a response. I may be able to mess with that code locally to see how that works if I can actually manage to figure out how to compile a relatively large code project (compared to what I'm used to for university assignments).

This is not the case. What you're looking at in OpenFile is the initial directory that the file dialog is viewing. Without setting this, the file dialog would start you out at the filesystem root, which would be inconvenient for most people, since they'd then need to navigate all the way down to wherever they've saved the lexicon file. Lexicon files aren't touched on plugin updates.

As for compiling the project, make sure you either clone Dalamud and build that as well, or update TextToTalk.csproj with paths to your installed Dalamud libraries. The size of the project shouldn't change anything compared to what you're used to (at least afaik, we didn't do anything special in 142/143), but the external dependency is something to look out for.

The expected directory structure is something like:

/whatever
|-Dalamud
| |-bin
|   |-Debug
|-TextToTalk

@karashiiro
Owner Author

Dalamud might be a bit complicated to compile, actually 🤔 if you're not familiar with submodules and don't have the C++ build tools in Visual Studio installed. It's best to figure all that out for future reference, but if you can't you might just want to update the TextToTalk project file (remember not to commit it if you PR anything).

@karashiiro
Owner Author

I'll still keep updating here then, since I'm not familiar with Git! I also personally think it would be really cool to either have a lexicon included with TTT that users can choose to enable, or the ability to download lexicons within the plugin, since that would be easier to access for the average user! Or maybe just a link to the place where they could download one and an explanation of why they might want to use one!

I can add you as a collaborator and you'll be able to edit the wiki, if you aren't familiar with Git 👀

@johnysandels
Collaborator

I can add you as a collaborator and you'll be able to edit the wiki, if you aren't familiar with Git 👀

Ouh yes pls!

@karashiiro
Owner Author

Sent requests @ryankhart @johnysandels

@karashiiro
Owner Author

karashiiro commented Dec 18, 2021

I have a draft format specified in https://github.com/karashiiro/TextToTalk/tree/main/lexicons. I'm not super sure about it, but I think it'll work?

@karashiiro
Owner Author

karashiiro commented Dec 18, 2021

I'd like to migrate this to Discussions, unless anyone's opposed 👀 we can go off-topic more easily that way 🙂

@karashiiro
Owner Author

Oh, actually, I can convert this, I think.

Repository owner locked and limited conversation to collaborators Dec 18, 2021
@karashiiro karashiiro converted this issue into discussion #62 Dec 18, 2021
@karashiiro karashiiro unpinned this issue Dec 18, 2021
