--update-user-info and special characters #167

Gor3t3x · 2017-04-21T12:30:52Z

Hi,

The latest bug I found: using --update-user-info with special characters like "ç", "é", "è" behaves strangely.

Those characters are pretty common in French-speaking countries...

2017-04-21 14:19:54 2712 INFO processor - Updating info for user key: federatedID,[email protected], changes: {'firstname': 'Andr\xc3\xa9'}
2017-04-21 14:19:54 2712 INFO processor - Updating info for user key: federatedID,[email protected], changes: {'firstname': 'Aur\xc3\xa9lie'}
2017-04-21 14:19:54 2712 INFO processor - Updating info for user key: federatedID,[email protected], changes: {'firstname': 'Andr\xc3\xa9'}
2017-04-21 14:19:54 2712 INFO processor - Updating info for user key: federatedID,[email protected], changes: {'firstname': 'Andr\xc3\xa9'}

Any idea how to resolve this?

Is this related? http://stackoverflow.com/questions/6956799/working-with-unicode-encoded-strings-from-active-directory-via-python-ldap

EDIT: The problem seems to be present only in the console. In the Adobe dashboard the special characters are encoded correctly, but I still get this warning on each sync with --update-user-info, with the same users to modify:

2017-04-21 16:00:08 3368 INFO processor - ---------- Start Sync Umapi --------------------------------
user_sync-2.0-py2-none-any.whl.58f0d6835e0dec629e1283c1839d8f5ad6f21614\user_sync-2.0-py2-none-any.whl\user_sync\rules.py:844:
UnicodeWarning: Unicode unequal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
if (value != umapi_value):

ianmak · 2017-04-22T01:04:08Z

This appears to be a limitation of Python printing/logging: when you log a dictionary, values containing symbols or non-English characters are shown as escape sequences instead of the characters themselves. I haven't found a quick solution to this; I'll try digging into it a bit more next week.
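To illustrate the logging symptom, here is a minimal sketch, not the tool's actual logging code. `describe_changes` is a hypothetical helper, and the sketch assumes the directory values arrive as UTF-8 byte strings. It decodes each value before building the message, so the log text carries the characters rather than their escaped bytes.

```python
def describe_changes(changes, encoding="utf-8"):
    """Render a dict of attribute changes as readable text for a log message.

    Hypothetical helper: byte-string values are decoded first, so the result
    contains the accented characters rather than escapes such as \\xc3\\xa9.
    """
    parts = []
    for key in sorted(changes):
        value = changes[key]
        if isinstance(value, bytes):
            value = value.decode(encoding)
        parts.append(u"{}={}".format(key, value))
    return u", ".join(parts)


# Same shape as the 'changes' dict in the log lines above (assumed UTF-8 bytes).
changes = {"firstname": b"Andr\xc3\xa9"}

escaped = repr(changes)               # the repr keeps the raw \xc3\xa9 escapes
readable = describe_changes(changes)  # text with the accented character: firstname=André
```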

The Unicode warning you're seeing is a separate problem. I think it's because the directory loads plain string values, whereas umapi works with unicode strings, and the comparison gets confused when it compares strings containing symbols to unicode strings. I'll look into that next week as well.

adobeDan · 2017-05-04T07:45:41Z

The problem here is that we are not explicitly converting the strings fetched from the directory to unicode by decoding them as utf-8, so Python falls back to the default (ascii) encoding assumed for 2.7 strings. The strings that come back from the umapi-client (and the UMAPI server) are always unicode strings, which is why Python is attempting to convert the directory strings to unicode when doing the comparison.
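A minimal sketch of the idea, not the actual change made in rules.py: decode directory values explicitly before comparing them with the unicode values from UMAPI. `to_text` and `attribute_changed` are hypothetical names, and UTF-8 is assumed as the directory encoding.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a text string, decoding byte strings explicitly."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def attribute_changed(directory_value, umapi_value):
    """Compare two attribute values after normalizing both to text."""
    return to_text(directory_value) != to_text(umapi_value)


directory_value = b"Andr\xc3\xa9"  # raw bytes as fetched from the directory (assumed UTF-8)
umapi_value = u"Andr\u00e9"        # unicode string as returned by UMAPI

print(attribute_changed(directory_value, umapi_value))  # prints False
```

With both sides normalized, an unchanged name no longer looks like a change on every sync.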

This should be fairly easy to fix. I will take a look.

adobeDan · 2017-05-04T23:23:19Z

Not such a simple fix after all: it revealed that the UMAPI client couldn't handle unicode strings: adobe-apiplatform/umapi-client.py#41. So first I had to go fix that and release it; now I can use the new version to fix this bug.

adobeDan · 2017-05-05T05:22:28Z

Wow, did this ever turn out to be a rabbit hole! Having allowed non-ascii strings everywhere, I got to find out where they are and are not allowed:

Yes, allowed:

* in people's first and last names
* in adobe group names (both PCs and user groups)

No, not allowed:

* in email addresses
* in federated usernames (since they are the local part of email addresses)

Since non-ascii chars are allowed in adobe group names, that means they can show up in config files, in the directory group mapping! So in addition to allowing non-ascii input from ldap and csv, I had to allow for non-ascii config files as well! So there is a new, optional command-line parameter --config-file-encoding whose first arg specifies the encoding of the config files (default ascii).
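As a rough sketch of what honoring a config file encoding involves (this is not the tool's own loader), the file can be opened with the caller-supplied encoding instead of relying on a default. `read_config_text` and the `adobe_groups` sample line are hypothetical, for illustration only.

```python
import io
import os
import tempfile


def read_config_text(path, encoding="ascii"):
    """Read a config file as text, decoding it with the given encoding."""
    with io.open(path, "r", encoding=encoding) as handle:
        return handle.read()


# Demo: a line containing a non-ASCII group name, saved as UTF-8.
sample = u'adobe_groups: ["\u00c9quipe Cr\u00e9ation"]\n'
fd, config_path = tempfile.mkstemp(suffix=".yml")
with os.fdopen(fd, "wb") as handle:
    handle.write(sample.encode("utf-8"))

text = read_config_text(config_path, encoding="utf-8")  # decodes cleanly

try:
    read_config_text(config_path)  # the ascii default rejects the accented characters
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

os.remove(config_path)
```

Failing loudly under the ascii default is deliberate in this sketch: a wrong encoding surfaces as an error at load time rather than as a mangled group name later.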

Fix #167: allow non-ascii unicode chars in user and group names. Also fix #159 and fix #173, both for the second time :(.

adobeDan · 2017-06-02T05:10:46Z

So this turns out not to be completely fixed, for two reasons:

* csv input should be done in binary mode to handle all encodings properly
* LDAP format strings are unicode, so you can't do string formatting unless you decode what goes into them.
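The second point can be sketched as follows. This is an illustration under assumptions, not the connector's real formatter: `format_attribute` is a hypothetical name, the `{givenName}`/`{sn}` placeholders are example LDAP attributes, and the values are assumed to be UTF-8 byte strings.

```python
def format_attribute(format_string, attributes, encoding="utf-8"):
    """Fill a unicode format string with LDAP attribute values.

    Byte-string values are decoded first, so the result is text throughout
    and never needs to be re-encoded after formatting.
    """
    decoded = {}
    for name, value in attributes.items():
        if isinstance(value, bytes):
            value = value.decode(encoding)
        decoded[name] = value
    return format_string.format(**decoded)


# Attribute values as raw bytes, the way they might come back from a directory.
attributes = {"givenName": b"Aur\xc3\xa9lie", "sn": b"Dupont"}
full_name = format_attribute(u"{givenName} {sn}", attributes)  # text: Aurélie Dupont
```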

* Modularize the CSV handling into an object that's unicode-aware. This not only fixes a file mode bug, and does catching of unicode issues, but it also makes us ready for py3 where the CSV module actually handles unicode strings.
* NOTE: because emails cannot contain non-ascii chars, the stray files don't need encoding on input or output.
* Make the LDAP attribute formatters fully unicode aware. Before they didn't realize that the format strings were themselves unicode, so they were re-encoding the results of formatting.
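For the CSV side, a minimal Python 3 sketch of the approach described above (not the module that was actually committed): read the file in binary mode, decode it with an explicit encoding, and only then hand text to the `csv` module. `read_csv_rows` and the column names are hypothetical.

```python
import csv
import io
import os
import tempfile


def read_csv_rows(path, encoding="utf-8"):
    """Read a CSV file in binary mode and decode it explicitly before parsing."""
    with open(path, "rb") as handle:
        data = handle.read()
    text = data.decode(encoding)  # raises UnicodeDecodeError on a wrong encoding
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Demo: a small user file with an accented first name, saved as UTF-8.
content = u"firstname,lastname,email\r\nAndr\u00e9,Dupont,andre@example.com\r\n"
fd, csv_path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as handle:
    handle.write(content.encode("utf-8"))

rows = read_csv_rows(csv_path, encoding="utf-8")
os.remove(csv_path)
```

Decoding in one explicit step keeps the encoding decision in a single place, which is also where a bad file is easiest to report.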

adobeDan mentioned this issue May 3, 2017

Fix #159 - more secure credential handling #176

Merged

adobeDan self-assigned this May 4, 2017

adobeDan added the bug label May 4, 2017

adobeDan added this to the v2.1 milestone May 4, 2017

adobeDan mentioned this issue May 4, 2017

excluded user counts are wrong if adobe-only-user-action is exclude #173

Closed

adobeDan mentioned this issue May 4, 2017

Allow secure credentials to be retrieved from environment, not disk #159

Closed

adobeDan closed this as completed in d635fde May 5, 2017

adobeDan added a commit that referenced this issue May 5, 2017

Merge pull request #178 from adobe-apiplatform/issue-167

2211f2f

Fix #167: allow non-ascii unicode chars in user and group names. Also fix #159 and fix #173, both for the second time :(.

adobeDan reopened this Jun 2, 2017

adobeDan modified the milestones: v2.1.1, v2.1 Jun 2, 2017

adobeDan closed this as completed in 6d0b988 Jun 6, 2017
