Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Router & Route Performance #56

Closed
gerardtoconnor opened this issue Jun 7, 2017 · 15 comments
Closed

Router & Route Performance #56

gerardtoconnor opened this issue Jun 7, 2017 · 15 comments
Labels
feature request Request to add new functionality
Milestone

Comments

@gerardtoconnor
Copy link
Member

gerardtoconnor commented Jun 7, 2017

@TheAngryByrd @dustinmoris @imetallica
This is a move of the topic over from Task CE issue. Just to summarise

  1. Current Giraffe router uses a process of elimination on a list of path substrings/patterns, iteratively testing everyone till match met which is elegant, simple but not performant.
  2. For performance, routes can be compiled into a tree structure that minimises the number of failed tested paths.
  3. The best tree structures are Tries (character/edge nodes) or Binary/AVL Tree (chunk/token tested higher/lower)
  4. I have initially tried trie as it only checks path char once while Bin/AVL checks same start path, multiple chars, multiple times, so if the trie node traversal can be sped up close to string checking, trie will be faster.
  5. I have built first draft of Trie Router and it reads path, char by char checking for node path and firing off Http Handler functions at appropriate endpoints.
  6. The initial draft (with exception of one route bug) has same perf for short routes, 20% faster on long, 50% on long parse routes, leading to blended 9% improvement on total.
  7. Custom made crawl/stream parsers are an important part of performance boost over standard regex.
  8. The initial draft uses ref obj nodes with dictionary edges, a crude and bulky way to initially test function, now that concept works, I will migrate nodes to structs with compressed sparse array edges in an effort to improve performance as the stack is faster than heap etc.
  9. You can note from perf table below, there is only a slight degradation of performance on Trie over longer routes while old router understandably is noticeably slower.
  10. I think the ultimate goal is that Giraffe router should be a lot faster than MVC such that Giraffe is the most compelling web platform when performance & composability are taken into account.

With all this said I will just re-iterate how Trie router Handler works vs standard route HttpHandler:

// (Node->Node) list -> HttpHandler
let routeTrie (routeFns:(Node->Node) list) = ...
// string HttpHandler -> (Node -> Node)
let routeT (path:string) (fn:HttpHandler) = fun (Node) -> Node
let (==>) (a: HttpHandler->Node->Node) (b:Httphandler) = a b 

let trieApi : HttpHandler =
    routeTrie [
        routeT "/" ==> text "Hello world, from Giraffe!"
        routeT "/test" ==> text "Giraffe test working"
        routeT "/about" ==> text "Giraffe about page!"
        routeTf "/data/%s/weather" (fun v -> sprintf "json (weatherForecasts (%s))" v |> text)
        routeTf "/value/%s" text 
        subRouteT "/auth" ==> AuthHandler >=>  routeTrie [
            routeT "/inbox" ==> text "Auth Inbox"
            routeTf "/parse%slong%istrings%sand%sIntegers" (fun (a,b,c,d) -> sprintf "%s | %i | %s | %s" a b c d |> text)
            routeTf "token/%s" (fun v -> text "following token recieved:" + v)                                    
            subRouteT "/manager" ==> routeTrie [
                routeT "/payroll" ==> text "Manager Payroll"
                routeTf "/team%ssales%f" (fun (t,s) -> sprintf "team %s had sales of %f" t s |> text)
                routeTf "/accesscode/%i" (fun i -> sprintf "manager access close is %i" i |> text)
                subRouteT "/executive" ==> routeTrie [
                    routeT "/finance" ==> text "executive finance"
                    routeTf "/area/%s" (sprintf "executive area %s" >> text)
                    routeTf "/area/%s/district/%s/costcode%i" (fun (a,d,c) -> sprintf "executive area %s district %s costcode %s"  a d c |> text)
                 ]
            ]
        ]
    ]

routeTrie compiles the list of route mapping functions into its root node that then represents the pre-complied route Trie for requests/paths to traverse. The important part is to do as much work as possible in the compile stage such that compute time at runtime matching is minimised.

URLS GiraffeTask GiraffeTaskTrie Task/Trie
/ 2370840 2349487 -1%
/about 2391520 2324699 -3%
/auth/dashboard 2325389 2316402 0%
/auth/helpdesk 2322608 2303590 -1%
/auth/inbox 2317181 2320119 0%
/auth/manager/accesscode/57646878 1792291 2271399 27%
/auth/manager/executive/area/globaloperationcentral 1617520 2301895 42%
/auth/manager/executive/finance 1930551 2311994 20%
/auth/manager/executive/mis 1963573 2320318 18%
/auth/manager/executive/operations 1954403 2337297 20%
/auth/manager/payroll 2211415 2306816 4%
/auth/manager/teambravosales5345545.34544 1365431 2224991 63%
/auth/manager/teamview 2188016 2342409 7%
/auth/manager/timesheets 2199281 2316064 5%
/auth/parsefirstlong987654321stringssecondandthirdIntegers 1503244 2264492 51%
/authtoken/as8d7f098adsf897asdf7a09dfasd7f9as7df987 2088203 2287279 10%
/data/sunnydrycold/weather 2304337 2287102 -1%
/delivery 2322192 2343339 1%
/ourstory 2342938 2370148 1%
/products 2339945 2353792 1%
/test 2377866 2374676 0%
/value/thisisavaluethatisbeingpassed 2295704 2306295 0%
/wheretofindus 2332918 2368049 2%
Grand Total 48857366 53302652 9%

Perf numbers represent total req per second over seperate 3 runs ( /3 for expected perf)

@dustinmoris
Copy link
Member

Awesome work and I like those suggestions. Let me know how I can help you with progressing on this issue !

@dustinmoris dustinmoris added the feature request Request to add new functionality label Jun 7, 2017
@TheAngryByrd
Copy link
Member

I think routing can be optimize without involving IO, we should probably look into http://benchmarkdotnet.org/ or expecto's wrapper for it use https://github.com/haf/expecto#benchmarkdotnet-usage

@gerardtoconnor
Copy link
Member Author

@TheAngryByrd I was only realising the same thing about an hour ago, you're absolutely right, a pattern matching algo can be run & tested outside the server with a dummy HttpHandler class, does not need to be tested inside server every time. I have 2/3 variants of algo/tree I want to test so makes way more sense to do it this way if I am micro-tweaking the performance. Thanks as always !! I'll try use the expecto wrapper (but I've never used before).

@dustinmoris Thanks, will do, it's just a case of testing a few variants of my algo/tree structure now to see which performs best. The problem with using structs for the nodes is that you usually have to recursively construct entire nodes on initialisation which calls for more complex construction algo ... or duplicating work by creating in Obj/Dictionary form first and then mapping. Will keep you posted.

If anyone has any ideas on most performant structures/algo please do share and we can implement & test.

@imetallica
Copy link

@gerardtoconnor great news!

@gerardtoconnor
Copy link
Member Author

gerardtoconnor commented Jun 19, 2017

@TheAngryByrd Sorry meant to say, your last most recent run of trie router on Debian was actually significantly faster than osx, really pronounced on longer routes and ones with parsing (one route is +200% faster then vanilla Giraffe!?!)

This output nicely shows that we can get +26% performance boost on TaskCont ( 17% Task CE, 9% Continuation), and another +39% with basic trie router (+76% from vanilla giraffe). I have built another radix/token tree implementation that should be little faster again (will test when all 3 done) and am now developing a cpu-caching focused one that lays entire map onto a 1-dimensional struct array, minimizing hops and utilises prebuilt index key maps to eliminate lookups.

Giraffe TaskCont TaskContTrie Gir/Cont Gir/Trie Cont/Trie Urls
88102 107562 110822 +22% +26% +3% /
88501 107966 105799 +22% +20% -2% /about
63268 81826 106559 +29% +68% +30% /auth/dashboard
64892 73544 103903 +13% +60% +41% /auth/helpdesk
63812 80834 108368 +27% +70% +34% /auth/inbox
34609 46247 96658 +34% +179% +109% /auth/manager/accesscode/57646878
32088 40954 98299 +28% +206% +140% /auth/manager/executive/area/globaloperationcentral
35423 51014 105696 +44% +198% +107% /auth/manager/executive/finance
37659 50092 103655 +33% +175% +107% /auth/manager/executive/mis
36439 48145 102607 +32% +182% +113% /auth/manager/executive/operations
48084 64472 104357 +34% +117% +62% /auth/manager/payroll
28899 33543 62715 +16% +117% +87% /auth/manager/teambravosales5345545.34544
46385 61851 101565 +33% +119% +64% /auth/manager/teamview
46846 62551 104378 +34% +123% +67% /auth/manager/timesheets
31418 37319 66035 +19% +110% +77% /auth/parsefirstlong987654321stringssecondandthirdIntegers
42309 54735 101076 +29% +139% +85% /authtoken/as8d7f098adsf897asdf7a09dfasd7f9as7df987
60940 75203 98367 +23% +61% +31% /data/sunnydrycold/weather
80472 107941 109810 +34% +36% +2% /delivery
84192 102317 111883 +22% +33% +9% /ourstory
84951 100414 103441 +18% +22% +3% /products
87425 109061 115006 +25% +32% +5% /test
56922 72761 101775 +28% +79% +40% /value/thisisavaluethatisbeingpassed
86496 107979 112136 +25% +30% +4% /wheretofindus
1330132 1678331 2334910 +26% +76% +39% Grand Total

@dustinmoris dustinmoris added this to the Beta milestone Jun 22, 2017
@dustinmoris
Copy link
Member

I am going to close this issue as this one has been superseded by #53 and #69

@gerardtoconnor
Copy link
Member Author

gerardtoconnor commented Jul 22, 2017

I think this is a separate issue/feature request but fine closing as I can integrate into continuation/task PR.

I finished my attempt at faster router but despite being 10x more complex was only 3% faster ... which is disappointing and pushes me to go for radix tree hybrid version that is almost as fast but far simplier to maintain.

@gerardtoconnor
Copy link
Member Author

Can we re-open this until the new router PR is submitted? This is separate to both continuations & task CE as it implements a (radix-style) tree router with hand rolled parsers for max performance.

I would like to reopen as there are a few design questions I would like to ask & performance metrics to share etc?

@dustinmoris dustinmoris reopened this Aug 5, 2017
@dustinmoris dustinmoris modified the milestones: RC1, Beta Aug 12, 2017
@gerardtoconnor
Copy link
Member Author

I have been bit delayed deploying this given I have been trying to re-design the parameter application structure to avoid boxing.

Given the high-frequency of parsing of URLs in web apps with a large slowdown in req/sec proportional to the amount of parsing, there is unneeded overhead in boxing parse values onto the heap, grouped together by a ref Tuple, more GC pressure.

By using partial application instead, (fun a b c ->), we should be able to push parsed value types right onto the stack without boxing, so the calling parse handler can just pop off values, not even touching the heap (string obj still on heap but without boxing & additional ref lookup).

So far method overloading having an issue matching initial curried function signature taken from (fmt:StringFormat<'T,HttpHandler> , fn: 'T ) so trying to figure out how to hack around this.

for routef "foo/%i/bar/%s" (fun i s -> ...), type 'T is int -> string -> HttpHandler matching our parse handler funtion fun i s next ctx -> ...

if I cannot figure a way around boxing after this weekend, will probably just release now with boxing, and put boxing issue on todo list.

@nojaf
Copy link
Contributor

nojaf commented Oct 5, 2017

@gerardtoconnor did you ever find any time to work on this? Is there a way we can help you out?

@gerardtoconnor
Copy link
Member Author

gerardtoconnor commented Oct 5, 2017

I have built 3 "working" routers

  1. Trie (char tree nodes)
  2. Token/Radix (substring token tree nodes)
  3. Flat rolled tree (cpu caching optimized)

The Trie one is slow relative to other two, unfortunately the flat one was only 5% faster then token one to my surprise. I can tweak it to squeeze a little more but it's a lot less then I had hoped given the exponentially more complicated logic & execution.

I have spent/wasted last few weeks trying to fix boxing problem such that SRTP would be able to build partial application parser that pushed parsed values into function stack, avoiding boxing but it's just a bridge too far for now (I will come back to this).

For maintainability & simplicity I figured I'd release the token router first so I have created a branch token-router in my fork and all original routing now in BasicRouter module with token one in TokenRouter.
I then went and copied all HttpHandlerTests into a TokenRouterTests such that HttpHandlerTests uses BasicRouter namespace, and new tests test token router on same tests.

7 tests have failed to my surprise and despite my basic tests all working... so that's where I am now, just need to figure out bug in 7 tests and good to merge. most appear to relate to nested subroutes (how dev lays/dupes routes). I have labelled failing tests with comment //<<FAIL in repo.

I may need to change how trees mapped as with token implementation there is a tree built for each router(choose) whereas with the flat router I managed to suck entire app into one optimised tree.

Either way should be released this week.

NB: I am not including any case-insensitive matching in performance router as its a perf big slowdown and rarely causes a routing issue, users should turn caps lock off. BasicRouter will always be there for those who need it and performance isn't an issue.

@gerardtoconnor
Copy link
Member Author

Actually, the token router is 580 lines vs Array routers 700 lines so not sure either fit into the "maintainability & simplicity" bucket. 😳

Feel free start reviewing process now as not too much will change with bug fixing... or just take my word for it and avoid burning your eyes on my spaghetti code 😏 .

In case anyone is lost as to what the code is doing, its building & then navigating a hybrid radix tree where nodes can have functions attached that dictate how to process/continue the match/parse process.
image
Some weird optimisations are included which may look bizarre but help JIT.

@gerardtoconnor
Copy link
Member Author

There is one test I cannot get to pass

Failed Giraffe.TokenRouterTests.GET "/foo/blah blah/bar" returns "blah blah"
Error Message:
Expected: blah%20blah
Actual: blah blah

I will submit PR and maybe @dustinmoris can perhaps throw a fresh set of eyes on it?

Basically, it expects the written response of parsed path to be url encoded, but given we are using NSubstitute and not the server, not seeing where the encoding is done in the prior version!?

@gerardtoconnor
Copy link
Member Author

New token router live and merged #114 on develop branch so now closing issue

@dustinmoris
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request Request to add new functionality
Projects
None yet
Development

No branches or pull requests

5 participants