[BUG] Wrong emoji deserialization #6371

Open
Guzhuu opened this issue Dec 4, 2024 · 2 comments
Labels
bug Something isn't working

Comments


Guzhuu commented Dec 4, 2024

Describe the bug
Emojis in an HL7 message are not deserialized correctly when the non-strict parser is used. This causes an error and prevents normal message processing.

To Reproduce
Steps to reproduce the behavior:

  1. Create a channel with the Source outbound message set to HL7, non-strict parser.
  2. Insert an emoji somewhere in the Source outbound message (msg['MSH']['MSH.4']['MSH.4.1'] = "😀").
  3. Create 2 destinations: the first with HL7 inbound, strict parser; the second with HL7 inbound, non-strict parser. Create a transformer in each destination; otherwise the message will not be parsed and no error will occur.
  4. Process a message

Expected behavior
Correct emoji deserialization, for example:

  • Original -> XML escaped (Source Transformed) -> Original (Source Encoded & Destination Raw) -> XML escaped (Destination Transformed)
  • 😀 -> 😀 -> 😀 -> 😀
  • 💄 -> 💄 -> 💄 -> 💄
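
A minimal sketch of the expected round trip, assuming the XML-escaped form is a hexadecimal numeric character reference covering the whole code point. The exact reference format Connect emits is not shown in this report, and `escapeNonAscii` and `unescapeNumericRefs` are hypothetical helpers, not Connect APIs.

```javascript
// Hypothetical helpers for illustration only; not part of Mirth Connect.

// Escape every non-ASCII character as one numeric reference per code point.
function escapeNonAscii(text) {
  let out = "";
  // for...of iterates by code point, so a surrogate pair stays together.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    out += cp > 0x7f ? "&#x" + cp.toString(16).toUpperCase() + ";" : ch;
  }
  return out;
}

// Reverse the escaping above.
function unescapeNumericRefs(xml) {
  return xml.replace(/&#x([0-9A-Fa-f]+);/g, (_, hex) =>
    String.fromCodePoint(parseInt(hex, 16))
  );
}

const original = "😀";
const escaped = escapeNonAscii(original);      // "&#x1F600;"
const restored = unescapeNumericRefs(escaped); // back to the original emoji
console.log(escaped, restored === original);
```

With this kind of escaping, each stage of the chain above can be converted to the next without losing the character.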

Actual behavior
Incorrect emoji deserialization, for example:

  • Original -> XML escaped (Source Transformed) -> Original (Source Encoded & Destination Raw) -> XML escaped (Destination Transformed)
  • 😀 -> 😀 -> 😀 -> ��
  • 💄 -> 💄 -> 💄 -> ��
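
The pair of replacement characters would be consistent with the emoji being escaped one UTF-16 code unit at a time rather than one code point at a time. That is an assumption about the cause, not something confirmed in this report; the sketch below only shows why output of that shape could not be parsed as XML. `escapePerCodeUnit` and `isXmlChar` are hypothetical names.

```javascript
// Hypothetical illustration of a per-code-unit escaper; not Connect code.
function escapePerCodeUnit(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i); // UTF-16 code unit, not a code point
    out += unit > 0x7f ? "&#x" + unit.toString(16).toUpperCase() + ";" : text[i];
  }
  return out;
}

// Char production from the XML 1.0 specification.
function isXmlChar(cp) {
  return cp === 0x9 || cp === 0xa || cp === 0xd ||
    (cp >= 0x20 && cp <= 0xd7ff) ||
    (cp >= 0xe000 && cp <= 0xfffd) ||
    (cp >= 0x10000 && cp <= 0x10ffff);
}

const brokenEscaped = escapePerCodeUnit("😀"); // "&#xD83D;&#xDE00;"
const referenced = [...brokenEscaped.matchAll(/&#x([0-9A-F]+);/g)]
  .map((m) => parseInt(m[1], 16));

// Both references fall in the surrogate range, which XML does not allow.
console.log(brokenEscaped, referenced.map(isXmlChar));
```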

Screenshots
(Images not preserved.) Captions: Source transformer; Source Transformed; Source encoded; Destination strict parser Raw; Destination strict parser Transformed; Destination non-strict parser Raw; Destination non-strict parser Transformed; Destination non-strict parser Error.

Environment (please complete the following information):

  • OS: Linux (RHEL 8.9), Windows 10
  • Java Distribution/Version: OpenJDK 21, Java 8 (201)
  • Connect Version: at least 3.12.0, 4.4.1 and 4.5.2

Workaround(s)
Use the strict parser and change the channel accordingly, which is not always an option.

Guzhuu added the bug label on Dec 4, 2024
Collaborator

pacmano1 commented Dec 4, 2024

Nice write up!

Maybe post a sample message as a file to make this a bit easier to reproduce. And you left out what encoding you are using, e.g. UTF-8?

Author

Guzhuu commented Dec 12, 2024

Hello,

The emoji is inserted into the HL7 message in the source transformer, so any message reproduces the issue (e.g. MSH|^~&|MSH3|😀|B|C|D|E|MDM^T02^MDM_T02|||2.5||||||UNICODE UTF-8).

The encoding in use is UTF-8.
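
For reference, a small sketch of how the emoji in that sample message is represented: one code point, two UTF-16 code units, four UTF-8 bytes. The field split is a plain string split for illustration, not Connect's HL7 parser.

```javascript
const sample =
  "MSH|^~&|MSH3|😀|B|C|D|E|MDM^T02^MDM_T02|||2.5||||||UNICODE UTF-8";

// MSH.1 is the field separator itself, so index 3 after the split is MSH.4.
const msh4 = sample.split("|")[3];

const codePoints = [...msh4].length;                     // Unicode code points
const utf16Units = msh4.length;                          // UTF-16 code units
const utf8Bytes = new TextEncoder().encode(msh4).length; // UTF-8 bytes

console.log(msh4, codePoints, utf16Units, utf8Bytes);
```

Any step that treats the two UTF-16 code units as separate characters would split the emoji, regardless of the channel encoding.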
