Expand components/ucd/tests/category_tests.rs #43
Comments
Hi, I would like to work on this. I'm new to unic but saw this in the |
Hi again, I've been looking into this issue for a couple of hours (after reading up on Unicode). I'm not sure what needs to be done here. Do you want a second test which will check that all the General_Category values are correct? Or did you just want to replace the
Thanks for any help. This project is really interesting to me, so I would like to keep contributing in the future.
use unic_ucd::category::GeneralCategory as GC;
...
#[test]
fn test_bidi_nsm_against_gen_cat() {
    // Every NSM must be a GC=Mark
    for cp in iter_all_chars() {
        if BidiClass::of(cp) == BidiClass::NonspacingMark {
            assert!(GC::of(cp).is_mark()); // changed from is_combining_mark(cp);
        }
    }
    // Every GC!=Mark must not be an NSM
    for cp in iter_all_chars() {
        if !GC::of(cp).is_mark() { // changed from !is_combining_mark(cp);
            assert_ne!(BidiClass::of(cp), BidiClass::NonspacingMark);
        }
    }
}
// This is just to help me learn what's happening
// but I could change this to test all the GC values are correct
#[test]
fn test_gc_val() {
    let mut list: Vec<GC> = Vec::new();
    for cp in iter_all_chars() {
        let category = GC::of(cp);
        if !list.contains(&category) {
            list.push(category);
        }
    }
    // prints all GC values except from 'Unassigned'
    // which is obvious since we loop over all chars
    println!("{:?}", list);
} |
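For the exploratory `test_gc_val` above, a set collects the distinct values without the manual `contains` check. The following is only a self-contained sketch under stated assumptions: `all_chars` is an assumed stand-in for `iter_all_chars`, and `ToyCategory` / `toy_category` are hypothetical, much coarser stand-ins for `GeneralCategory` and `GC::of`, built from `std` predicates so the snippet compiles without unic.

```rust
use std::collections::BTreeSet;

/// Hypothetical, coarse stand-in for `GeneralCategory` (not the real property).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum ToyCategory {
    Control,
    Letter,
    Number,
    Separator,
    Other,
}

/// Hypothetical stand-in for `GC::of`, using only `std` predicates.
fn toy_category(cp: char) -> ToyCategory {
    if cp.is_control() {
        ToyCategory::Control
    } else if cp.is_alphabetic() {
        ToyCategory::Letter
    } else if cp.is_numeric() {
        ToyCategory::Number
    } else if cp.is_whitespace() {
        ToyCategory::Separator
    } else {
        ToyCategory::Other
    }
}

/// Assumed equivalent of `iter_all_chars`: every Unicode scalar value.
/// `char::from_u32` returns `None` for surrogates, so they are skipped.
fn all_chars() -> impl Iterator<Item = char> {
    (0..=0x10FFFF_u32).filter_map(char::from_u32)
}

/// Collects each category that occurs at least once, in sorted order.
fn distinct_categories() -> BTreeSet<ToyCategory> {
    all_chars().map(toy_category).collect()
}
```

With the real crate, the same shape would presumably be `iter_all_chars().map(GC::of).collect()`, provided `GC` implements `Ord` (or `Hash`, with a `HashSet`); that is an assumption about unic, not something confirmed in this thread.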
@calum I believe the idea here is that we want a test to make sure that So, leaving the test which exists alone, something along the lines of:
But that actually brings up a point about allowing |
Thanks, @calum, for working on this, and welcome! :) First,
Now, about non-mark GC values: if you look at the Bidirectional Character Types table, you can see that some of the table rows contain GC sets listed in their General Scope column. For example,
Basically, it's up to you to find all possible such tests from the table, and if some of these tests fail, we can see whether it's a bug or just a kind of over-generalization in the language. We can expand this more when we have the
Does this help? |
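One way to organize "find all possible such tests from the table" is to write each row as data and collect violations instead of panicking on the first one, so a failing row can be inspected before deciding whether it is a bug or an over-generalization. This is a rough sketch, not the project's test: the enums are hypothetical, cut-down stand-ins for `BidiClass` and `GeneralCategory`, `Rule` and `violations` are invented names, and the six-entry sample replaces a real pass over all characters. The `too_narrow_ws_rule` is deliberately wrong, purely to show what a reported violation looks like.

```rust
/// Hypothetical, cut-down stand-in for `BidiClass`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Bidi {
    LeftToRight,
    ParagraphSeparator,
    WhiteSpace,
    NonspacingMark,
}

/// Hypothetical, cut-down stand-in for `GeneralCategory`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Gc {
    UppercaseLetter,
    Control,
    SpaceSeparator,
    LineSeparator,
    ParagraphSeparator,
    NonspacingMark,
}

/// One rule: every character with class `bidi` must have a GC in `allowed`.
struct Rule {
    bidi: Bidi,
    allowed: &'static [Gc],
}

/// Returns every (char, GC) pair that breaks the rule, in input order.
fn violations(rule: &Rule, data: &[(char, Bidi, Gc)]) -> Vec<(char, Gc)> {
    data.iter()
        .filter(|(_, bidi, gc)| *bidi == rule.bidi && !rule.allowed.contains(gc))
        .map(|(cp, _, gc)| (*cp, *gc))
        .collect()
}

/// Tiny hand-picked sample; a real test would look every char up instead.
fn sample() -> Vec<(char, Bidi, Gc)> {
    vec![
        ('A', Bidi::LeftToRight, Gc::UppercaseLetter),
        ('\n', Bidi::ParagraphSeparator, Gc::Control),
        ('\u{2029}', Bidi::ParagraphSeparator, Gc::ParagraphSeparator),
        (' ', Bidi::WhiteSpace, Gc::SpaceSeparator),
        ('\u{2028}', Bidi::WhiteSpace, Gc::LineSeparator),
        ('\u{0300}', Bidi::NonspacingMark, Gc::NonspacingMark),
    ]
}

/// Rule matching the B check discussed in this thread.
fn b_rule() -> Rule {
    Rule { bidi: Bidi::ParagraphSeparator, allowed: &[Gc::Control, Gc::ParagraphSeparator] }
}

/// Deliberately too narrow (illustration only): U+2028 will be reported.
fn too_narrow_ws_rule() -> Rule {
    Rule { bidi: Bidi::WhiteSpace, allowed: &[Gc::SpaceSeparator] }
}
```

A test built on this would assert that `violations(...)` is empty for each rule and print the returned list on failure.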
Thank you both for the in-depth help! I think I have everything I need to finish this off now. I'll aim to put in a pull request by Saturday (hopefully earlier). I really appreciate the help, and the opportunity to contribute. |
I deleted my last question. I think I'm working it out now. So an example of one of my tests looks like this:
/// `Bidi_Class=B := General_Category in { Cc (Control), Zp (ParagraphSeparator) }`
///
/// <http://www.unicode.org/reports/tr9/#NSM>
#[test]
fn test_bidi_b_against_gen_cat() {
    // Every B must be a GC::Control or GC::ParagraphSeparator
    for cp in iter_all_chars() {
        if BidiClass::of(cp) == BidiClass::ParagraphSeparator {
            assert!(
                GC::of(cp) == GC::Control ||
                GC::of(cp) == GC::ParagraphSeparator
            );
        }
    }
} |
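As written, a failing `assert!` in the test above would not say which code point broke the rule. A possible refinement is to build a message that names the code point and the General_Category found. This is a minimal sketch with invented names: `Gc` is a hypothetical three-variant stand-in for `GeneralCategory`, and `describe` and `check_b_scope` are helpers made up for illustration.

```rust
/// Hypothetical, cut-down stand-in for `GeneralCategory`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Gc {
    Control,
    ParagraphSeparator,
    SpaceSeparator,
}

/// Formats a code point in the usual `U+XXXX` notation.
fn describe(cp: char) -> String {
    format!("U+{:04X}", cp as u32)
}

/// Checks one character already known to have Bidi_Class=B.
/// Returns a descriptive error when its GC is outside { Cc, Zp }.
fn check_b_scope(cp: char, gc: Gc) -> Result<(), String> {
    if matches!(gc, Gc::Control | Gc::ParagraphSeparator) {
        Ok(())
    } else {
        Err(format!(
            "{} has Bidi_Class=B but General_Category={:?}",
            describe(cp),
            gc
        ))
    }
}
```

Inside the loop, the same message could be passed straight to the assertion, along the lines of `assert!(cond, "{} has Bidi_Class=B but General_Category={:?}", describe(cp), gc)`.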
#61 was merged. Should this be closed? |
Yep, done here. Thanks, @calum! |
We have a cross-component test in components/ucd/tests/category_tests.rs that checks values of the Bidi_Class property against the General_Category property, based on UAX#9's Table 4. Bidirectional Character Types. Now that we have component/ucd/category, we can expand the test to also cover General_Category values.