Update TransformEvaluator.ts to include decomposeDiacriticalMarks #89

Semperverus
Contributor

@Semperverus Semperverus commented Sep 18, 2024

Adds a very rudimentary decomposeDiacriticalMarks function to the Transform Evaluators.

This works by using the built-in JavaScript normalize() method in 'NFKD' mode to first break a character with diacritical marks into its constituent Unicode components (compatibility decomposition), and then calling .replace() with a regex to strip the diacritical marks from the string. These marks occupy the Unicode range \u0300 through \u036F, as referenced in the official Unicode documentation (or alternatively the Wikipedia page, which is much easier to read).
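
A minimal sketch of the two steps described above, written as a standalone helper. It is an illustration built from the expression quoted in this pull request, not the code that was merged; the sample strings are chosen only for demonstration.

```typescript
// Sketch only: a standalone helper, not the exact code from this pull request.
function decomposeDiacriticalMarks(input: string): string {
    // Step 1: NFKD splits each precomposed character into a base character
    // followed by its combining marks (and applies compatibility mappings).
    const decomposed = input.normalize("NFKD");
    // Step 2: drop every code point in the Combining Diacritical Marks
    // block, U+0300 through U+036F.
    return decomposed.replace(/[\u0300-\u036f]/g, "");
}

console.log(decomposeDiacriticalMarks("Am\u00e9lie")); // Amelie
console.log(decomposeDiacriticalMarks("Dub\u00e7ek")); // Dubcek
```

Strings without diacritical marks pass through unchanged, since normalization leaves plain ASCII alone and the regex finds nothing to remove.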

This code requires testing within the plugin itself, but the function to perform the decomposition has been tested independently.

Official SailPoint developer documentation page for decomposeDiacriticalMarks

@Semperverus
Contributor Author

Semperverus commented Sep 18, 2024

Note: I am not great with JavaScript, and I essentially stole existing functions' design and hopefully put together something acceptable. I mostly needed to wrap the input.normalize('NFKD').replace(/[\u0300-\u036f]/g, "") code into a decomposeDiacriticalMarks method to make it work, as not having this is problematic for work I am doing.

I tested the core of the method on Mozilla's documentation page here:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize by replacing the existing normalize lines (lines 11 and 12) in the example with the code above. It ran and produced the correct output.

If you see any changes that need to be made, please feel free to make them, especially in lines 913 to 925, which contain the following code:

            if (attributes.input !== undefined) {
                input = attributes.input;

                if (typeof input === 'object') {
                    input = await this.evaluateChildTransform(input);
                }

                if (input === undefined) {
                    return;
                }
            }

I have a general understanding of what this does, but I do not yet fully understand everything that might need to go in here. I am assuming that checking whether the input is of type object is a safe bet.
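
For readers unfamiliar with the pattern, here is a rough sketch of what the snippet above appears to do: a string input is used directly, while an object input is treated as a nested transform and evaluated first. The `TransformSpec` type, the stand-in `evaluateChildTransform`, and `resolveInput` are hypothetical names invented for this illustration; the plugin's real class and signatures may differ.

```typescript
// Hypothetical shape of a transform definition, for illustration only.
type TransformSpec = {
    type: string;
    attributes?: { input?: string | TransformSpec; value?: string };
};

// Stand-in for the evaluator's child evaluation. This sketch only
// understands a "static" transform that carries a literal value.
async function evaluateChildTransform(t: TransformSpec): Promise<string | undefined> {
    return t.type === "static" ? t.attributes?.value : undefined;
}

// Mirrors the quoted logic: evaluate nested transforms, pass strings through.
async function resolveInput(
    input: string | TransformSpec | undefined
): Promise<string | undefined> {
    if (typeof input === "object") {
        // A nested transform must be evaluated to obtain the actual string.
        return evaluateChildTransform(input);
    }
    // A plain string (or undefined) is returned as-is; the caller bails out
    // early when the result is undefined.
    return input;
}

resolveInput({ type: "static", attributes: { value: "Am\u00e9lie" } })
    .then((resolved) => console.log(resolved));
```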

I hope this helps!

According to the IdentityNow developer documentation, they are using 'NFKD' mode, which is the "Compatibility Decomposition" mode. NFD will work, but may have edge cases.
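
One such edge case can be sketched with a compatibility character such as the "ﬁ" ligature (U+FB01). NFD leaves it intact, whereas NFKD expands it to the plain letters "f" and "i". The `stripMarks` helper and the sample string are made up for this illustration.

```typescript
// Hypothetical helper that applies the same strip step under either form.
const stripMarks = (s: string, form: "NFD" | "NFKD"): string =>
    s.normalize(form).replace(/[\u0300-\u036f]/g, "");

// U+FB01 LATIN SMALL LIGATURE FI followed by "anc" and U+00E9 (e with acute).
const ligature = "\ufb01anc\u00e9";

// NFD: the acute accent is removed, but the ligature survives (5 characters).
console.log(stripMarks(ligature, "NFD").length);
// NFKD: the ligature is also expanded, giving the 6 characters of "fiance".
console.log(stripMarks(ligature, "NFKD"));
```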
@Semperverus
Contributor Author

Semperverus commented Sep 19, 2024

Update: I went ahead and tested the regex pattern from the official documentation, input.replaceAll("[\\p{InCombiningDiacriticalMarks}]", ""). This syntax appears to be specific to Java's regex flavor and is unavailable in JavaScript, so we will need to use the literal range [\u0300-\u036f] instead.

A more complete list might be [\u0300-\u036F\u1AB0-\u1AFF\u1DC0-\u1DFF\u20D0-\u20FF\uFE20-\uFE2F\u0483-\u0486\u05C7\u0610-\u061A\u0656-\u065F\u0670\u06D6-\u06ED\u0711\u0730-\u073F\u0743-\u074A\u0F18-\u0F19\u0F35\u0F37\u0F72-\u0F73\u0F7A-\u0F81\u0F84\u0e00-\u0eff\uFC5E-\uFC62] (Source), but this may also be overzealous, and I think it removes some characters that IdentityNow's transform does not.
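
For comparison, JavaScript regular expressions with the `u` flag (ES2018 and later) do support Unicode property escapes for general categories and scripts, though not for blocks. The sketch below uses `\p{Mn}` (Nonspacing_Mark) as a broader alternative to a hand-written range list. This is an assumption-laden illustration, not what this pull request implements: `\p{Mn}` covers marks in many scripts, so it may strip characters that IdentityNow's transform keeps. The helper name and sample string are hypothetical.

```typescript
// Assumption: the runtime supports ES2018 Unicode property escapes.
// The RegExp constructor is used so the pattern is only checked at runtime.
const nonspacingMarks = new RegExp("\\p{Mn}", "gu");

// Hypothetical broader variant: removes every nonspacing mark after NFKD.
function stripNonspacingMarks(input: string): string {
    return input.normalize("NFKD").replace(nonspacingMarks, "");
}

// U+20D7 COMBINING RIGHT ARROW ABOVE lies outside U+0300 through U+036F.
const sample = "x\u20d7 caf\u00e9";

// Narrow range: the accent on "e" is removed, the arrow mark is kept.
const narrow = sample.normalize("NFKD").replace(/[\u0300-\u036f]/g, "");
console.log(narrow.includes("\u20d7")); // true

// Broad category: both marks are removed.
console.log(stripNonspacingMarks(sample)); // x cafe
```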

@yannick-beot-sp
Owner

yannick-beot-sp commented Sep 19, 2024

Thank you for your contribution.
Can you update the README and CHANGELOG files to reflect this update? Don't forget to mention yourself as a contributor (there are examples of pull requests and contributions in those files).

Also, what would be great is to:

Added decomposeDiacriticalMarks() method to stringUtils.ts per Yannick's request.

This could be renamed to "removeDiacriticalMarks()" in the future as that is technically its function, but SailPoint's official naming convention dictates "decomposeDiacriticalMarks" so I provided this name for the sake of consistency between systems.
Created the test suite for decomposeDiacriticalMarks() per Yannick's request.

This test includes strings provided by SailPoint's documentation, Mozilla's example, and a long string of diacritical characters.
Updated the Release Notes section of the readme to include mention of the decomposeDiacriticalMarks function that was added by Semperverus
Added notes about changes regarding decomposeDiacriticalMarks
Added self as contributor
Included self as contributor
@Semperverus
Contributor Author

Semperverus commented Sep 19, 2024

Okay, these are all uploaded. I am not sure how I would test this locally. Is this something you would be able to run, @yannick-beot-sp?

@yannick-beot-sp
Owner

I use the "Unit Tests" or "Run current mocha test" task to execute the tests.
So the tests should pass.

@yannick-beot-sp yannick-beot-sp merged commit a99ada4 into yannick-beot-sp:main Sep 19, 2024