Considering alternative coroutine sslocal impls #2452
Comments
For point 2, Go's RE engine is pretty robust. For point 3, Go's GC is really good now and it's relatively easy to write memory-efficient code. But compared to languages without a GC and runtime (Rust/C/C++), there's still a memory penalty. However, a recent discussion on HN makes me more confident in Go compared to Rust. @Mygod I like the go2shadowsocks name! 😂 |
I don't know that article but it seems ancient (it even predates re2). We are talking about 100KB of regex pattern matching domain names so I am not sure if it is as performant as you claimed. Some benchmarks would be nice... |
The point of Cox's article is that Go uses a finite automata-based regex engine, so the performance should be good enough. I just checked and re2 is similar, so I guess you should expect similar performance. |
As mentioned in golang/go#11646, Go uses NFA so its matching time would be at least linear in pattern length for our use case. We want sublinear matching time which could be achieved via DFA, but DFA is harder to implement. |
Ah good to know there's still room for improvement! 😄 |
Out of curiosity, what do you do with ACL that makes it so large? |
GFW lists and China lists. While in theory DFA can take exponential size, almost all (except <5 maybe) of the rules are simply matching domain suffixes so a good DFA implementation should take almost linear size and only exponential in the number of exceptions. Of course, if no such good implementation exists, maybe another way is to handle those domain suffix matching differently using a DFA-esque algorithm (say, using Aho-Corasick with an alphabet over all the possible parts of the domain) and handle the rest using a regexp. |
If it’s almost all suffix matching, wouldn’t a trie-based data structure be more efficient (in both space and time)? |
Yeah. Trie is another DFA-esque algorithm that I was suggesting previously. :) |
Ah sorry I missed that part. Anyway, in this case I don’t think regex is the best solution. |
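To make the trie idea above concrete, here is a minimal sketch, not taken from any of the shadowsocks implementations. It assumes rules fall into two groups: plain domain suffixes, stored in a trie keyed on domain labels and walked from the TLD inward, and a handful of exceptions that stay as regular expressions. The names `DomainSuffixTrie` and `HostMatcher` are hypothetical.

```python
import re


class DomainSuffixTrie:
    """Trie keyed on domain labels, walked from the TLD inward."""

    def __init__(self):
        self.root = {}

    def add(self, suffix):
        node = self.root
        for label in reversed(suffix.lower().strip(".").split(".")):
            node = node.setdefault(label, {})
        node[None] = True  # terminal marker: a rule ends at this label

    def matches(self, hostname):
        node = self.root
        for label in reversed(hostname.lower().strip(".").split(".")):
            node = node.get(label)
            if node is None:
                return False  # no rule shares this suffix
            if None in node:
                return True  # a stored suffix ends here
        return False


class HostMatcher:
    """Suffix trie first, then a (hopefully short) list of regex exceptions."""

    def __init__(self, suffixes, patterns=()):
        self.trie = DomainSuffixTrie()
        for suffix in suffixes:
            self.trie.add(suffix)
        self.fallback = [re.compile(p) for p in patterns]

    def matches(self, hostname):
        if self.trie.matches(hostname):
            return True
        return any(p.search(hostname) for p in self.fallback)
```

A lookup costs time proportional to the number of labels in the hostname, independent of how many suffix rules are loaded. Only the regex fallback scales with the number of exception rules.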
I think we all agree that shadowsocks-rust is the best choice here. It does not look like much work to implement ACL in shadowsocks-rust, at least for ss-local only. @zonyitoo What do you think? |
Actually, after discussion with @riobard, I think the only drawback of using golang is GC. I am really unfamiliar with rust, so I really don't know which one to use now. |
Last time we faced similar issues in overture due to the memory leak in golang's regex routine: #1639 Not sure if this is still a problem now. |
@riobard What about implementing the ACL in ss-go first, and let's compare the performance between implementations. |
Sure. But I haven't looked deep into the ACL implementation in libev yet. |
@zonyitoo The ACL function in ss-libev is very naive. I think you can build new ACL rules to support more features, like forwarding different domains to different upstreams. Some example ACL for shadowsocks-libev can be found here: https://github.com/shadowsocks/shadowsocks-libev/blob/master/acl/ |
Here're the missing features in ss-go and ss-rust for android integration:
BTW, the above logic can also be shared with an iOS implementation. |
I have no idea how the current ACL works and what features it supports, but judging from https://github.com/shadowsocks/shadowsocks-libev/blob/master/acl/ there are two major categories: 1) CIDR-matching, which can be efficiently calculated using a radix tree, and 2) domain-matching, as discussed before, is also mostly suffix matching. Why did we end up with regex in the first place? 😂 |
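As a rough illustration of the CIDR-matching category mentioned above, here is a minimal sketch using a plain binary trie over address bits, the uncompressed cousin of a radix tree. It is an assumption about how such a lookup could be structured, not how any existing ACL code works. `CidrTrie` and `IpSet` are hypothetical names, and the only library used is Python's standard `ipaddress` module.

```python
import ipaddress


class CidrTrie:
    """Binary trie over address bits; use one instance per IP version."""

    def __init__(self):
        self.root = [None, None, False]  # [child0, child1, rule_ends_here]

    def add(self, net):
        bits = int(net.network_address)
        width = net.max_prefixlen
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (width - 1 - i)) & 1
            if node[bit] is None:
                node[bit] = [None, None, False]
            node = node[bit]
        node[2] = True

    def contains(self, addr):
        bits = int(addr)
        width = addr.max_prefixlen
        node = self.root
        for i in range(width):
            if node[2]:
                return True  # a shorter prefix already covers this address
            node = node[(bits >> (width - 1 - i)) & 1]
            if node is None:
                return False
        return node[2]  # full-length (host) rule


class IpSet:
    """Routes IPv4 and IPv6 networks to separate tries."""

    def __init__(self, networks=()):
        self.tries = {4: CidrTrie(), 6: CidrTrie()}
        for text in networks:
            net = ipaddress.ip_network(text, strict=False)
            self.tries[net.version].add(net)

    def __contains__(self, address):
        addr = ipaddress.ip_address(address)
        return self.tries[addr.version].contains(addr)
```

A lookup walks at most 32 bits for IPv4 or 128 bits for IPv6, regardless of how many blocks are loaded. A real radix tree would additionally compress chains of single-child nodes to save memory.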
There is also an ACL implementation in this repo written in Kotlin and C++ (re2), which might be more readable... I would love to write them myself (and learn golang/rust) but unfortunately time is a luxury I don't have, so let me post a write-up explaining things sometime soon. |
ACL
Let's first start with ACL. ACL is used to decide whether a given IPv4/IPv6/hostname should be bypassed. Basically there are two parts of ACL, one is hostname matching and the other is IP matching. The matching behavior (for a standard socks5 request) is described by the following algorithm:
For hostname matching, we can concatenate all regexps using
Why ACL
Here is my assumed reasoning for using ACL. With Chrome, you can configure proxy with extensive rulesets using extensions. Android does not have a good interface for socks5 proxy except for a global
Android will set up iptables routes so that all traffic will go to the tun device by default. To bind shadowsocks traffic to the underlying network (there is a
Complications of doing ACL with socksified VPN
More rigorously, we want to have a local DNS relay doing the following: (currently implemented by
With the DNS server set up, we can do the ACL matching under VPN/transproxy mode:
Outlook
I think with this change we will be able to do a lot more in the future, e.g. policy routing (mentioned above by @madeye), #2087, #2301, etc. |
@Mygod That's… quite some work 😂 |
Yeah, the complicated part is local DNS resolving. If we are only doing ACL in the new impl, then we have not ventured far from just continuing to use the libev impl. :) |
BTW, supporting ACL like libev is quite a simple job. I will try to implement that over the next few weekends. |
Ah .. I see. https://github.com/shadowsocks/shadowsocks-libev/blob/master/acl/gfwlist.acl They are in the |
Privoxy uses very similar rules: https://www.privoxy.org/user-manual/actions-file.html#HOST-PATTERN . I am curious about what they did for improving performance. |
@zonyitoo A best practice is concatenating rules together like
Here is some sample code in shadowsocks-android:
A quick search suggests the regex engine in Rust should be even faster than RE2: https://github.com/rust-lang/regex/blob/master/bench/log/05/re2-vs-rust |
Does Android have a C API for that? I can only find a Java API.
Hmm? How to do that? |
Yes, only Java API. It means we need a callback through JNI. An example can be found here: https://github.com/madeye/BaoLianDeng/blob/master/app/src/main/java/io/github/baoliandeng/core/LocalVpnService.java#L215 In golang, we have a callback in Dial interface: https://golang.org/src/net/dial.go#L97 |
ref: shadowsocks/shadowsocks-android#2452 Reformatted logs
shadowsocks-rust has just finished supporting ACL in TCP relays. |
Thanks guys! Re shadowsocks-android/core/src/main/java/com/github/shadowsocks/bg/VpnService.kt Lines 65 to 92 in 9f15ab5
Re 2-DNS issue: I think all the DNS probes are necessary? (unless I am missing something) In addition to my explanations above, let me recall the part that involves DNS in VPN mode:
Re IP merging: I think it is safe to assume the input does not have duplicated blocks, unless you are using a binary search, since that algorithm will not work correctly if there are duplicates. |
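For the binary-search case mentioned above, one way to sidestep the duplicate problem is to merge the blocks before building the index. The sketch below is an assumption about how that could look, not the behaviour of any existing implementation. It handles a single address family per index, and `build_index` and `contains` are hypothetical names. It relies on `ipaddress.collapse_addresses` from the Python standard library, which removes duplicates, drops blocks contained in larger ones, and merges adjacent siblings.

```python
import bisect
import ipaddress


def build_index(cidrs):
    """Merge same-family CIDR blocks into sorted, disjoint [start, end] ranges."""
    nets = [ipaddress.ip_network(c, strict=False) for c in cidrs]
    merged = list(ipaddress.collapse_addresses(nets))  # sorted and disjoint
    starts = [int(n.network_address) for n in merged]
    ends = [int(n.broadcast_address) for n in merged]
    return starts, ends


def contains(index, address):
    """Binary-search for the last range starting at or before the address."""
    starts, ends = index
    value = int(ipaddress.ip_address(address))
    i = bisect.bisect_right(starts, value) - 1
    return i >= 0 and value <= ends[i]
```

Because the ranges are disjoint after merging, checking the single candidate found by `bisect_right` is enough. Without the merge step, an overlapping or duplicated block could make that single check miss a match.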
Also none of these are set in stone so feel free to propose changes when appropriate. |
I just got shadowsocks-rust successfully integrated into shadowsocks-android. For the next step, @zonyitoo can help implement those two RPCs in shadowsocks-rust.
It's a long term project, so take your time, no rush. |
OK. I can only work on it on weekends. |
I prefer separate filtering modules. An in-memory index is worth considering, basically a balanced tree. |
Yeah we can definitely put ACL part directly in this repo if it is doable. |
https://github.com/shadowsocks/shadowsocks-libev/tree/master/acl |
@madeye #2452 (comment) is what I was talking about in PR. I think this implementation is much easier and does not involve RPC for reverse lookup. Let me know if you want to change it. |
An interesting write-up at Fuchsia: https://fuchsia.googlesource.com/fuchsia/+/refs/heads/master/docs/project/policy/programming_languages.md |
@Mygod After several weeks of developing with Rust, I'm quite sure that it's the best language for systems programming. We're also discussing the adoption of Rust internally. However, one concern is that there's no ISO 26262 certified compiler for Rust so far. That means no safety certification for production software written in Rust. |
I see. I guess that's why you are making time to contribute to the rust repo. 🤣 |
Consider my proposal, advanced routing options can be placed in the "Advanced" menu, or even make the ACL editable. Most SS users only need an easy-to-use configuration: should I proxy this IP / URL? |
Moving discussion from shadowsocks/shadowsocks-org#154.
We are trying to make ACL handling of shadowsocks-android less hacky (for example shadowsocks/shadowsocks-libev#2627), however, things seem challenging.
get_sockaddr), which could block the entire thread if network conditions are bad.
Thoughts are appreciated.
CC @madeye @riobard @zonyitoo