Speed up identifier detection #89

jserv · 2024-11-16T07:25:10Z

Both 'is_ident1' and 'is_ident2' are now macros instead of function calls, and they are tweaked for ASCII detection in advance with the fallback to table lookup for non-ASCII characters.

fuhsnn · 2024-11-16T15:25:13Z

Thank you for the patches. Honestly performance hasn't been a priority, I consider the lexer less significant in the total pipeline compared to cache-misses from leaking and the overhead of external assembler. I do have some ideas as well, so a branch for performance experiments should make sense.

In terms of performance I want to ensure a common basis to work on: which projects to benchmark and what kind of environment, is it self-hosted, another simple C compiler or full-on gcc/clang optimizations with custom allocator etc. Do you have a preference?

fuhsnn · 2024-11-16T15:33:33Z

On the pull request: I think the short-circuiting of ASCII characters can be done earlier, skipping decode_utf8() altogether.

jserv · 2024-11-16T15:47:06Z

In terms of performance I want to ensure a common basis to work on: which projects to benchmark and what kind of environment, is it self-hosted, another simple C compiler or full-on gcc/clang optimizations with custom allocator etc.

Using uftrace, I identified the performance bottlenecks in the compilation process.

Here's what I found:
First, apply the following changes:

--- a/GNUmakefile
+++ b/GNUmakefile
@@ -1,4 +1,4 @@
-CFLAGS=-std=c99 -g -fno-common -Wall -pedantic -Wno-switch
+CFLAGS=-std=c99 -g -pg -fno-common -Wall -pedantic -Wno-switch
 
 SRCS=$(wildcard *.c)
 OBJS=$(SRCS:.c=.o)

After running uftrace ./slimcc -c -o stage2/parse.o parse.c, the trace showed the following call stack:

Copy[6] is_ident2
[5] read_ident
[4] tokenize
[3] tokenize_file
[2] must_tokenize_file
[1] cc1
[0] main

Based on this analysis, I tweaked the is_ident1 and is_ident2 functions.

jserv · 2024-11-16T16:09:58Z

By the way, I am working with my students to enhance the shecc project, a self-compiling C compiler with some optimizations. Since shecc only implements a subset of C, we are planning to modify slimcc to output compatible C code that shecc can process, rather than having slimcc generate x86-64 assembly directly. This way, slimcc would serve as a full C language frontend for shecc, while allowing shecc to focus on optimization and backend support. To achieve this, we plan to contribute improvements to slimcc's parsing, tokenization, and preprocessing components.

fuhsnn · 2024-11-16T16:59:46Z

I'm thrilled to know that the project has some real world value and am happy to target different backends as long as the IR is stable, but I imagine memory usage will be a major pain point on embedded targets that shecc currently self-hosts on. I thought about re-architecting, but then the code wouldn't be as accessible and one might as well work on other more optimized C frontend like tcc, cproc, tyfkda/xcc, libfirm/cparser etc.

If C++ is allowed in your classroom, I do have an unfinished C++ port sitting around that use RAII for nearly identical code with much better memory consumption, a minimal C++ front-end bootstrappable with C99 is also something I've always wanted to work on.

fuhsnn · 2024-11-16T22:14:26Z

The C++ version is uploaded in partial_c++_port branch, it's less C++ than I remembered and still able to build as C with just a little macro and typedef's.

If that particular style is okay to you, I'd prefer basing performance optimizations on that version, since it feels a bit bike shedding to optimize parsing while compiling larger projects hit OOM on an embedded device.

jserv · 2024-11-22T08:11:02Z

I defer this pull request to @ChAoSUnItY, who is assigned to proceed with the above research prototype under my supervision.

fuhsnn · 2024-11-23T01:52:51Z

I believe identifier detection is of decent speed after efcbb85, feel free to open new issues regarding the C backend.

Btw thanks for mentioning uftrace, logging exact call counts is incredible. Can't believe it's not more well known.

ChAoSUnItY · 2024-11-26T17:42:34Z

I've adopted identifier reading logic from efcbb85 and further refine it with the macro functions introduced previously in this PR, and also applied fool-proofing logic to the function. This way, it improves readability.

jserv · 2024-11-28T00:49:14Z

Let's rebase the latest main branch and squash the commits.

Both 'is_ident1' and 'is_ident2' are now macros instead of function calls, and they are tweaked for ASCII detection in advance with the fallback to table lookup for non-ASCII characters. Also discard unnecessary variable and add safe guard: - Variable is_first can be replaced with boolean expression "p == start" - Add safe guard for ascii character checking to ensure starting identifier character must not be numberic Additionally, replace first ascii character check with macro is_ident2_ascii to keep readability. The later is_ident2 function call is replaced with is_ident2_non_ascii because the expanded macro function will result in multiple decode_utf8 function call, also it's redundant to check if it's an ascii character or not. Co-authored-by: Jim Huang <[email protected]>

fuhsnn · 2024-12-29T19:36:12Z

is_first is indeed redundant, I'll change that.

The rest is overwhelmingly similar to enh/chibicc@9473229

ChAoSUnItY mentioned this pull request Nov 23, 2024

Support of C Backend #102

Open

fuhsnn force-pushed the main branch from a105cc3 to ed5f470 Compare November 24, 2024 11:47

ChAoSUnItY force-pushed the speedup-is-ident branch from 61a6198 to 56bcd32 Compare November 26, 2024 06:29

fuhsnn force-pushed the main branch 5 times, most recently from 7c97de8 to ed6cfec Compare November 27, 2024 09:35

ChAoSUnItY force-pushed the speedup-is-ident branch 2 times, most recently from 644be33 to 4b34032 Compare November 28, 2024 06:00

fuhsnn force-pushed the main branch 10 times, most recently from f752182 to 6dae514 Compare November 29, 2024 04:33

fuhsnn force-pushed the main branch 3 times, most recently from d4283bc to c10050f Compare November 29, 2024 08:02

ChAoSUnItY force-pushed the speedup-is-ident branch 2 times, most recently from 97719ae to 8e22d62 Compare November 30, 2024 06:23

ChAoSUnItY force-pushed the speedup-is-ident branch from 8e22d62 to 0be5505 Compare December 1, 2024 13:24

fuhsnn force-pushed the main branch 7 times, most recently from f093a17 to bfe2374 Compare December 6, 2024 00:44

fuhsnn closed this Dec 29, 2024

fuhsnn mentioned this pull request Dec 29, 2024

Specialize (type) keywords as independent token kinds #106

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed up identifier detection #89

Speed up identifier detection #89

jserv commented Nov 16, 2024

fuhsnn commented Nov 16, 2024

fuhsnn commented Nov 16, 2024

jserv commented Nov 16, 2024

jserv commented Nov 16, 2024

fuhsnn commented Nov 16, 2024

fuhsnn commented Nov 16, 2024 •

edited

Loading

jserv commented Nov 22, 2024

fuhsnn commented Nov 23, 2024

ChAoSUnItY commented Nov 26, 2024

jserv commented Nov 28, 2024

fuhsnn commented Dec 29, 2024

Speed up identifier detection #89

Speed up identifier detection #89

Conversation

jserv commented Nov 16, 2024

fuhsnn commented Nov 16, 2024

fuhsnn commented Nov 16, 2024

jserv commented Nov 16, 2024

jserv commented Nov 16, 2024

fuhsnn commented Nov 16, 2024

fuhsnn commented Nov 16, 2024 • edited Loading

jserv commented Nov 22, 2024

fuhsnn commented Nov 23, 2024

ChAoSUnItY commented Nov 26, 2024

jserv commented Nov 28, 2024

fuhsnn commented Dec 29, 2024

fuhsnn commented Nov 16, 2024 •

edited

Loading