ISO charset make crash with python #1711

Open
BigBadWouf opened this issue Nov 6, 2024 · 4 comments · May be fixed by #1747

Comments

@BigBadWouf

Hello

I'm running eggdrop 1.10 with Python 3.12.6 on Manjaro Linux 6.1.

When a bind on an event such as pubm, part, or quit receives a message containing ISO-8859 characters such as éèçà, eggdrop crashes.

Simple to reproduce:
Compile eggdrop with the default parameters.
Load the python module.
Add a script with a bind like bind("pubm", "*", "*", onMessage) and a handler like def onMessage(nick, mask, handle, channel, msg=''): pass
Join a channel the bot is in and send a message encoded in a charset such as iso-8859-15 that contains special characters.
Crash.
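
To make the trigger concrete, here is a small standalone Python sketch (not eggdrop code) showing why a client configured for iso-8859-15 sends bytes that a UTF-8 decoder rejects. The helper name `is_valid_utf8` is made up for this illustration.

```python
# Illustration only: the same text encoded two ways.
text = "éèçà"

latin9_bytes = text.encode("iso-8859-15")  # one byte per character
utf8_bytes = text.encode("utf-8")          # two bytes per character


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(latin9_bytes.hex(" "))  # bytes an iso-8859-15 client would send
print(utf8_bytes.hex(" "))    # bytes a UTF-8 client would send
print(is_valid_utf8(latin9_bytes), is_valid_utf8(utf8_bytes))
```

The iso-8859-15 bytes are perfectly legal for IRC, but they do not form valid UTF-8, which is presumably what the python side trips over.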

I can't produce a gdb backtrace right now; I'll add one tonight.

@BigBadWouf
Author

Here is a gdb backtrace of the SEGMENT VIOLATION with python and a simple bind:

gdb.txt

@michaelortmann
Member

Can repeat.

scripts/1711.py:

from eggdrop import bind
def onMessage(nick, mask, handle, channel, msg=''): pass
bind("pubm", "*", "*", onMessage)

Bot:

./eggdrop -t BotA.conf 
.tcl pysource scripts/1711.py

Sending 0xc3 0xa9 0x0a (é followed by a linefeed) works fine at first:
[05:28:06] [@] :testuser!~michael@localhost PRIVMSG #foo :é

But sending 0xc3 0x0a (a lone lead byte followed by a linefeed) crashes the bot immediately:

[05:54:01] [@] :testuser!~michael@localhost PRIVMSG #foo :
                                                          [05:54:01] * Please report problem to https://github.com/eggheads/eggdrop/issues
[05:54:01] * Check doc/BUG-REPORT on how to do so.
[05:54:01] * Last bind (may not be related): evnt:init_server
[05:54:01] * Wrote DEBUG
[05:54:01] * SEGMENT VIOLATION -- CRASHING!
Segmentation fault (core dumped)

gdb:
https://pastebin.com/YtHMFivx

So, in this example, the last 2 bytes eggdrop reads from the server's PRIVMSG are 0xc3 0x0a. eggdrop reads them in mainloop() via sockgets() and replaces the linefeed 0x0a with the null terminator 0x00. Then mainloop() calls dcc[idx].type->activity(idx, buf, i); with buf now ending in 0xc3 0x00.

0xc3 is a valid char that eggdrop does not filter out or modify in any way, but on its own it is not valid UTF-8. 0xc3 is 0b11000011: the two high bits are set and the third is clear, which marks the lead byte of a two-byte sequence. Code that treats it as valid UTF-8 would expect a continuation byte (0b10xxxxxx) to follow, but only 0x00 follows. Unsafe code of that kind could crash.
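
The bit pattern argument can be checked with a short Python sketch. It only mimics the buffer handling described above; `utf8_sequence_length` and `is_continuation` are hypothetical helpers written for this comment, not functions from eggdrop, Tcl, or CPython.

```python
def utf8_sequence_length(lead: int) -> int:
    """Length announced by a UTF-8 lead byte, or 0 if it cannot start a sequence."""
    if lead < 0x80:
        return 1  # plain ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2  # 110xxxxx (0xC0 and 0xC1 would be overlong encodings)
    if 0xE0 <= lead <= 0xEF:
        return 3  # 1110xxxx
    if 0xF0 <= lead <= 0xF4:
        return 4  # 11110xxx, capped at U+10FFFF
    return 0      # continuation byte or a byte that is never valid


def is_continuation(byte: int) -> bool:
    """True if the byte has the form 10xxxxxx."""
    return (byte & 0xC0) == 0x80


# Mimic the described handling: the trailing linefeed becomes a NUL terminator.
buf = bytearray(b"PRIVMSG #foo :\xc3\n")
buf[-1] = 0x00

lead, following = buf[-2], buf[-1]
print(f"lead={lead:#04x} bits={lead:08b} announces {utf8_sequence_length(lead)} bytes")
print(f"following={following:#04x} is a continuation byte: {is_continuation(following)}")
```

A decoder that trusts the announced length without checking the next byte would read past the point where the sequence is already known to be broken.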

I don't think eggdrop cares about UTF-8 (do we parse UTF-8 anywhere?). That means Tcl and/or Python crash on invalid UTF-8, and eggdrop must make sure it doesn't feed invalid UTF-8 into Tcl/Python functions that would stumble over it.

@michaelortmann
Member

btw: gdb is UTF-8 aware and will try to decode the buffer; in this example it prints <incomplete sequence \303> when it sees the 0xc3 0x00 sequence in a char * buf.

@michaelortmann
Member

In check_tcl_bind(), where we call trigger_bind(), we can check whether proc is *python and then check the string in match for valid UTF-8. But do we always need to check? Are there cases where non-UTF-8 would be fine? How can we do this without pulling in too much UTF-8 fiddling code? And what then: "fix" the string, erase chars from the match buffer, or not call the proc at all?
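
To compare the options raised above, here is a hedged Python sketch of what each policy would hand to the callback. It is only a model of the behaviour: the real check would live in eggdrop's C code, and the function name `sanitize_for_python` and the policy names are invented for this illustration. The "latin1" policy is an extra assumption, covering the case where non-UTF-8 input might still be acceptable.

```python
from typing import Optional


def sanitize_for_python(raw: bytes, policy: str = "replace") -> Optional[str]:
    """Return text for the Python callback, or None to skip the call entirely."""
    try:
        return raw.decode("utf-8")  # valid UTF-8 passes through unchanged
    except UnicodeDecodeError:
        pass

    if policy == "skip":     # do not call the proc at all
        return None
    if policy == "erase":    # drop the offending bytes
        return raw.decode("utf-8", errors="ignore")
    if policy == "replace":  # "fix" the string with U+FFFD markers
        return raw.decode("utf-8", errors="replace")
    if policy == "latin1":   # reinterpret every byte as ISO-8859-1
        return raw.decode("latin-1")
    raise ValueError(f"unknown policy: {policy}")


bad = b"hi \xc3"  # the truncated sequence from this issue, without the NUL
for name in ("skip", "erase", "replace", "latin1"):
    print(name, repr(sanitize_for_python(bad, name)))
```

Each policy loses something: "skip" silently drops events, "erase" and "replace" change the message text, and "latin1" never fails but misreads genuine UTF-8 only when the input was already invalid.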
