Implement split and join for String #2752
Comments
Good ideas!
Nice, thank you for sharing. You mentioned on the I'll give it a shot.
Regarding which normalization form to use, I think either of the two canonical equivalence forms, NFC or NFD, could be used. I don't think we should use the two compatibility equivalence forms (NFKC and NFKD), since I assume we would like to differentiate between, say, a superscript and a subscript letter.
Yeah, for
@turbolent @SupunS You can try out the following on https://dartpad.dev/.

print('Caf\u00E9'.replaceAll(RegExp(r'\u00E9'), 'X')); // CafX
print('Caf\u0065\u0301'.replaceAll(RegExp(r'\u0065\u0301'), 'X')); // CafX
print('Caf\u0065\u0301'.replaceAll(RegExp(r'\u00E9'), 'X')); // Café

The same with Kotlin: https://pl.kotl.in/_F5jCJfXN

fun main() {
    val s = "Caf\u00E9"
    println(s) // Café
    println(s.replace("\u0065\u0301", "X")) // Café
    println(s.replace("\u00E9", "X")) // CafX
}

Seems like only Swift handles this. This makes me wonder why we want to handle equivalence in Cadence. If backwards compatibility is an issue, could we leverage Stable Cadence as a way to make a breaking change?
The same in Rust actually. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=4f656607f30f56e8bd4340f022f36fa4

fn main() {
    let s = String::from("Caf\u{00E9}");
    println!("{}", s); // Café
    println!("{}", s.replace("\u{0065}\u{0301}", "X")); // Café
    println!("{}", s.replace("\u{00E9}", "X")); // CafX
}

let s = String::from("Caf\u{00E9}");
let s2 = String::from("Caf\u{0065}\u{0301}");
println!("{}", s == s2); // false

So I am wondering why we want to treat canonically equivalent characters as the same (as Swift does) instead of comparing them at the byte level (as Go, Rust, Kotlin, and Dart do). Is there any reason for it?
The reason would be because Cadence
Reposting from the Discord discussion: Cadence aims to make programs safer by preventing bugs. One area of bugs is string handling. In the extreme case, e.g. in C, there are no strings at all, and the developer must handle all the nuances of string operations (encodings) themselves. Some languages have some sort of "string" type, but again keep the type simple for performance reasons. Swift, and also Cadence, take it a step further and provide safe string types that operate at the level where operations on strings match what humans expect. I know that means the burden is then on the Cadence implementation ("us"), but that's the price we have to pay.
As long as both input and replacement are normalized, it should be possible to use Go's
I am a noob in this subject, but why don't we normalize at the String level? Can't we just say that Cadence normalizes strings (when you create the string)? Doesn't this solve all remaining problems?
@bluesign For the implementation of The Swift implementation is not that easy to read: https://github.com/apple/swift-corelibs-foundation/blob/9f53cc551e065d73b327a80147895822bc8f89e0/CoreFoundation/String.subproj/CFString.c#L3084
I'm a noob here, too; few can say they're experts in this area. Yes, we already do normalize, that's what I tried to point out above. However, we do not (yet) normalize at creation, but only lazily when needed (e.g. to check equality). We should keep the lazy initialization of the normalized version, so we do not have to pay for it unless necessary, but also store the result by replacing the original (currently the normalization is recomputed each time and thrown away after use). I'll open a PR for it.
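To make the "normalize lazily, then keep the result" idea concrete, here is a rough sketch in Rust. It is not the Cadence implementation: `LazyString`, `toy_normalize`, and `equals` are hypothetical names, and `toy_normalize` is only a stand-in that composes "e" + U+0301 into U+00E9. A real version would call a full Unicode normalizer.

```rust
/// Toy stand-in for a real NFC normalizer (assumption for illustration only):
/// it composes just "e" + U+0301 (combining acute accent) into U+00E9.
fn toy_normalize(s: &str) -> String {
    s.replace("e\u{0301}", "\u{00E9}")
}

/// Hypothetical string value that is normalized lazily, at most once.
struct LazyString {
    value: String,
    normalized: bool,
}

impl LazyString {
    /// Creation does no normalization work.
    fn new(value: &str) -> Self {
        LazyString { value: value.to_string(), normalized: false }
    }

    /// Normalizes on first use and replaces the original value,
    /// so later calls return the stored result without recomputing it.
    fn normalized(&mut self) -> &str {
        if !self.normalized {
            self.value = toy_normalize(&self.value);
            self.normalized = true;
        }
        &self.value
    }

    /// Equality compares the normalized forms of both strings.
    fn equals(&mut self, other: &mut LazyString) -> bool {
        self.normalized() == other.normalized()
    }
}

fn main() {
    let mut a = LazyString::new("Caf\u{00E9}");
    let mut b = LazyString::new("Caf\u{0065}\u{0301}");
    println!("{}", a.equals(&mut b)); // true
}
```

The first equality check pays for normalization; any later check on the same values reuses the stored form.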
Opened #2777. Note that this is against the Stable Cadence feature branch, as it requires a storage migration.
Both split and join are implemented.
Issue to be solved
A few more functions can be implemented for Strings.
Suggested Solution
This issue proposes adding two functions:
split(_ s: String, _ delimiter: String): [String]: Returns a string array after splitting the provided string based on the provided delimiter. Delimiter can be made optional and, can be used as default.
join(_ strs: [String], _ separator: String): String: Returns a string value after concatenating elements of the provided array of string with the given separator. Separator can be made optional and, can be used as default.
Taken from https://github.com/green-goo-dao/flow-utils/blob/main/cadence/contracts/StringUtils.cdc.
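For reference, the intended behaviour of the two functions can be sketched in Rust, which is already used for examples in this thread. This is only an illustration of the semantics, not Cadence code: the free functions `split` and `join` below are hypothetical wrappers around Rust's standard `str::split` and slice `join`, and they match code points exactly, so they do not treat canonically equivalent sequences as equal.

```rust
/// Hypothetical counterpart of the proposed `split`:
/// cuts `s` at every occurrence of `delimiter`.
fn split(s: &str, delimiter: &str) -> Vec<String> {
    s.split(delimiter).map(String::from).collect()
}

/// Hypothetical counterpart of the proposed `join`:
/// concatenates `strs`, placing `separator` between adjacent elements.
fn join(strs: &[String], separator: &str) -> String {
    strs.join(separator)
}

fn main() {
    let parts = split("a,b,c", ",");
    println!("{:?}", parts); // ["a", "b", "c"]
    println!("{}", join(&parts, "-")); // a-b-c
}
```

When the delimiter is non-empty, joining the result of `split` with that same delimiter gives back the original string.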
Will file issues for more functions in the future.
Epic: #1972