Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sip,uri,test: Escape SIP URIs #740

Merged
merged 4 commits into from
Mar 28, 2023
Merged

Conversation

maximilianfridrich
Copy link
Contributor

According to RFC 3261, SIP URIs need to escape UTF-8 characters.

@maximilianfridrich maximilianfridrich marked this pull request as ready for review March 21, 2023 14:20
@maximilianfridrich maximilianfridrich force-pushed the uri_encoding branch 5 times, most recently from d729b33 to ffa8a9c Compare March 23, 2023 08:32
src/uri/uri.c Outdated Show resolved Hide resolved
@maximilianfridrich
Copy link
Contributor Author

Unfortunately, the MSVC compiled retest seems to hang when it runs the UTF-8 related test test_uri_user on Windows. I've been trying to figure out what's going on, but I can't even get BareSIP to compile on my Windows VM. Any help is greatly appreciated.

test/uri.c Outdated
const struct {
const char *enc;
const char *dec;
} uriv[] = {
{"alice%40atlanta.com", "[email protected]"},
{"project%20x", "project x"},
{"*21%23", "*21#"}
{"*21%23", "*21#"},
{"*21%23%C3%A4%E2%82%AC%40%20", "*21#ä€@ "}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try to avoid Unicode characters in the source code.

you can use a binary string instead, i.e. "\x01\x02\x03\x04" etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! So simple, and I was convinced it would solve the Windows problem. I am not sure what the MSVC compiler is doing here...

Do you have any more insight into what could be going wrong on the Windows machines? Or could you provide the script to setup Windows the way it is setup in the GitHub Actions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps some C-string is not null terminated ?

add some debugging, see if the program is stuck in a while loop

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes looks like a out of bounds, this is the error window message, why windows test hangs:

Debug Assertion Failed!

Program: retest.exe File: minkernel\crts\ucrt\src\appcrt\convert\isctype.cpp Line: 36

Expression: c >= -1 && c <= 255

https://stackoverflow.com/questions/39538846/program-crashes

This is the Call stack:

retest.exe!is_unreserved(char c) Line 43	C
 	retest.exe!is_user(char c) Line 85	C
 	retest.exe!comp_escape(re_printf * pf, const pl * pl, bool(*)(char) eh) Line 173	C
 	retest.exe!uri_user_escape(re_printf * pf, const pl * pl) Line 235	C
 	retest.exe!re_vhprintf(const char * fmt, char * ap, int(*)(const char *, unsigned __int64, void *) vph, void * arg) Line 311	
 	retest.exe!mbuf_printf(mbuf * mb, const char * fmt, ...) Line 540	C
 	retest.exe!test_uri_user() Line 213	C
 	retest.exe!test_unit(const char * name, bool verbose) Line 506	C
 	retest.exe!test_reg(const char * name, bool verbose) Line 740	C
 	retest.exe!main(int argc, char * * argv) Line 235	C
``

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

char c is -61

Copy link
Member

@sreimers sreimers Mar 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I think isalnum() is the problem. It behaves differently on windows.

Maybe you can add a range check c >= -1 && c <= 255 before calling isalnum().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior is undefined if the value of ch is not representable as unsigned char and is not equal to [EOF]

https://en.cppreference.com/w/c/string/byte/isalnum

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to comment the same thing, yes, isalnum seems to be the problem. Apparently, Unix-like systems accept negative values without complaining, while Windows doesn't. The man page of isalnum states:

These functions check whether c, which must have the value of an unsigned char or EOF, falls into a certain character class according to the specified locale.

So this should certainly be an unsigned char.

Copy link
Contributor Author

@maximilianfridrich maximilianfridrich Mar 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I casted it because "ä" is represented by 0xC3 0xA4 in UTF-8 which as signed char gives us -61 and -92. So they are valid values, but need to be casted to be correctly interpreted by isalnum.

Edit: Actually, the Notes section of the man page states that precisely this should be done.

Copy link
Contributor Author

@maximilianfridrich maximilianfridrich Mar 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for the help, the Windows tests are finally passing!

@sreimers sreimers changed the title Escape SIP URIs sip,uri,test: Escape SIP URIs Mar 28, 2023
@sreimers sreimers merged commit e03e282 into baresip:main Mar 28, 2023
@maximilianfridrich maximilianfridrich deleted the uri_encoding branch May 15, 2023 07:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants