Armenian letters should be lowercased #328
Hello @NarHakobyan, your PR fits the need; however, it is not passing the CI. I know it's not completely related to your work, but could you fix the errors?
clippy:
Rust FMT: Diff in /home/runner/work/charabia/charabia/charabia/src/normalizer/lowercase.rs:27:
fn should_normalize(&self, token: &Token) -> bool {
// https://en.wikipedia.org/wiki/Letter_case#Capitalisation
- matches!(token.script, Script::Latin | Script::Cyrillic | Script::Greek | Script::Georgian | Script::Armenian)
- && token.lemma.chars().any(char::is_uppercase)
+ matches!(
+ token.script,
+ Script::Latin | Script::Cyrillic | Script::Greek | Script::Georgian | Script::Armenian
+ ) && token.lemma.chars().any(char::is_uppercase)
}
}
Thank you!
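For readers following along, the condition in the diff above depends on the standard library treating Armenian capitals as uppercase and mapping them to their lowercase forms. A minimal std-only sketch of that behavior follows. The helper names `has_uppercase` and `lowercase_lemma` are hypothetical and are not part of charabia, and the capitalized sample word is an assumption chosen for illustration.

```rust
/// Hypothetical helper mirroring the `chars().any(char::is_uppercase)` half
/// of the `should_normalize` condition shown in the diff.
fn has_uppercase(lemma: &str) -> bool {
    lemma.chars().any(char::is_uppercase)
}

/// Hypothetical helper: lowercases a lemma with the standard library's
/// Unicode-aware mapping (this is an illustration, not charabia's code path).
fn lowercase_lemma(lemma: &str) -> String {
    lemma.to_lowercase()
}
```

With these helpers, a word starting with the Armenian capital letter Feh (U+0556) reports an uppercase character, and lowercasing it yields the same word starting with the small letter Feh (U+0586).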
Hi @ManyTheFish, done! I don't know why, but RustRover didn't show any errors on these lines.
Hey @NarHakobyan, charabia/charabia/src/normalizer/lowercase.rs Lines 45 to 98 in d929c01
You just have to add a source token to the tokens() list, then fill normalizer_result() and normalized_tokens() with the expected output.
@ManyTheFish to be honest, I don't know how to do that :D Here is an example text that can be used:
Add a token containing Armenian capital letters in the tokens() function:
fn tokens() -> Vec<Token<'static>> {
vec![Token {
lemma: Owned("PascalCase".to_string()),
char_end: 10,
byte_end: 10,
script: Script::Latin,
..Default::default()
- }]
+ },
+ Token {
+ lemma: Owned("ֆիզիկոսը".to_string()),
+ char_end: 8,
+ byte_end: 16,
+ script: Script::Armenian,
+ ..Default::default()
+ }]
}
Then run the tests:
And fix the outputs in the
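As a sanity check on the offsets in the token above: every Armenian letter occupies two bytes in UTF-8, so an eight-letter word ends at character 8 and byte 16. That matches `char_end: 8` and `byte_end: 16`, whereas the ASCII lemma "PascalCase" has equal character and byte offsets. A small std-only sketch follows. The `offsets` helper is hypothetical and only illustrates how such values can be derived for a lemma that starts at offset 0.

```rust
/// Hypothetical helper: returns (char_end, byte_end) for a lemma that
/// starts at offset 0, i.e. its length in characters and in UTF-8 bytes.
fn offsets(lemma: &str) -> (usize, usize) {
    (lemma.chars().count(), lemma.len())
}
```

Calling `offsets` on a candidate lemma before adding it to the test list is a quick way to avoid hand-counting bytes for non-ASCII scripts.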
@ManyTheFish could you please run the tests?
Fixes #325