Some privacy related questions #23
I cannot replicate the packets after Dooble exits. Dooble has its own process and the QtWebEngineProcess processes. After all of these processes terminate, whatever packets are being sent with respect to fsf.org are not sent by Dooble. Dooble does not create secret processes.
Why do packets continue to travel after a page is loaded? That is a question about the behavior of the Web engine.
The Wikipedia article is totally incomplete. There was a Dooble 1 and now there's a Dooble 2. The article is not current.
The Accepted / Blocked Domains panel allows you to specify which domains you'd like to block. There is a file in the Data directory which contains a long list of domains. There is also documentation: qrc://documentation/Dooble.html
Dooble doesn't use any 3rd-party extensions. Just Qt. Nothing else. Nothing. Really. The article is also nonsense. Dooble doesn't have a specific purpose. The entire project is a hobby. Numerous people have made and propagated myths and as a result Dooble has many purposes.
OK, OK, some real purposes. It shouldn't depend on anything beyond Qt. It should have documentation. It should be small. I think it does those things.
I understand, but that still does not invalidate the concern, even if it is related to the underlying web engine. One would expect a privacy-respecting application to send packets only when needed. For example:
Thanks for the info. I have read the section in the docs about Accept/Block. It seems there isn't really an option for fine-grained control of 3rd-party resources. It would be good to have that.
You're going to be in a pickle now. You have WebKit and WebEngine. Gecko... and? Lynx. Qt envelops Chrome's engine with some additional methods. Again, Dooble cannot send packets after it has terminated. Explain. Have you inspected netstat -anpt | grep Dooble? What other resources are you interested in?
It's also very nice that you're showing an interest.
I am aware that most browsers use a limited set of engines. Unfortunately, that only means inheriting the issues those very engines bring.
I am interested in finding a truly privacy-respecting browser which does not chatter in the background like Firefox and the Chromium forks. In the last few months I have tested almost every browser, reported quite a few bugs using similar tcpdump tests, and wrote in quite a few forums about it. Yet most of the time the result was superficial answers, meaningless discussions and/or bug reports set to low priority. When I saw Dooble I decided it might be a chance for something better. Here is a test using netstat:
Test 1: https://www.fsf.org/robots.txt
While loading, netstat shows 5 connections in state ESTABLISHED. About 10 seconds after that, tcpdump shows more packets:
Waiting a little more... Now only 4 ESTABLISHED connections. Then 3 of them turn to CLOSE_WAIT (while tcpdump shows some more packets). Waiting... only 1 ESTABLISHED left. Waiting... about 30 seconds. 1 more packet appears:
Then after about 15 seconds another one:
Exiting Dooble. 2 more packets sent. The single ESTABLISHED connection turns to TIME_WAIT and "Program name" in netstat turns to:
The console in which Dooble was started shows:
tcpdump continues:
Only 2 ESTABLISHED left. About 10 seconds later they turn into CLOSE_WAIT:
Waiting...
No more connections in netstat and in tcpdump. I don't know why this is happening, but it practically results in continuous communication with the remote host after the content has been downloaded and there is no need for a connection whatsoever.
Repeating the same tests with lynx: right after loading, 2 active connections quickly turn to TIME_WAIT and "Program name" becomes:
Lovely discussion. Alright, about the engines. Are you expecting that I magically turn into an engine gnome and adopt WebEngine? Because if you are, I'm not becoming a gnome. Qt has provided the engine on FreeBSD, Mac, Windows, and Linux. Probably others too. Without owning the source and making my own specific changes to it, I can only offer as much as I am offered. Reality.
Sockets are interesting and quite useful if you can own them. Dooble is like a parent and some parents have children and some of those children hide crayons. If Dooble cannot access those crayons, well, that's a problem. And Dooble cannot access those crayons. Doesn't know their colors, their brands, doesn't even know they exist. If you can find a relationship between a page and the underlying sockets, Dooble can certainly provide you more control. If not, the crayons will continue doing what they do.
Not at all. I was just answering your question :) I understand it is impossible for a single person to audit millions of lines of code written by others. Surely for the moment Dooble does a much better job than the FF family, which chatters like crazy with Mozilla, Amazon, Akamai, including telemetry, OCSP, etc. I am unaware of which engine would be able to work as lynx works. IIRC, Midori used to behave like lynx when I tested it some time ago. I don't know what it uses but it seems abandoned.
The overall point is: the web is becoming more and more dangerous. All kinds of entities are trying to track you, fingerprint you, exploit you in various ways. So having a decent browser into which in-depth thought has been given is quite important - starting from the lower protocol up to the frontend control of 3rd-party resources etc.
The engines have been made such that the details are hidden from the user. This is a preferred concept of not just Web engines but other technologies. Details frighten people. Options introduce speculation. Red warning labels cause panic. (Sorry.)
There is a method for retrieving the file descriptors of a process in Linux. Once that's achieved, one could discover the sockets if the method provides type details. From there, one could then close the sockets. What happens to the owning objects is unknown.
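A rough sketch of that idea, assuming Linux and assuming the process only inspects its own descriptors via /proc/self/fd. Closing descriptors behind their owners' backs is exactly the unknown mentioned above:

```cpp
// Sketch only: enumerate this process's file descriptors via /proc/self/fd
// and close any that are sockets. The owning Qt/Chromium objects are not
// informed, which is precisely the "unknowns" risk discussed in this thread.
#include <dirent.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>

int main()
{
  DIR *dir = opendir("/proc/self/fd");

  if (!dir)
    return EXIT_FAILURE;

  struct dirent *entry = nullptr;

  while ((entry = readdir(dir)) != nullptr)
    {
      if (entry->d_name[0] == '.')
        continue;

      char path[64];
      char target[256];

      snprintf(path, sizeof(path), "/proc/self/fd/%s", entry->d_name);

      ssize_t n = readlink(path, target, sizeof(target) - 1);

      if (n <= 0)
        continue;

      target[n] = '\0';

      // Socket descriptors read back as "socket:[inode]".
      if (strncmp(target, "socket:[", 8) == 0)
        {
          int fd = atoi(entry->d_name);

          printf("closing socket fd %d (%s)\n", fd, target);
          close(fd); // The object owning this fd still believes it is open.
        }
    }

  closedir(dir);
  return EXIT_SUCCESS;
}
```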
I wasn't implying that red labels should be displayed. Yet putting false green ones (like Mozilla and others do) is far worse.
Is that what lynx does? Or is that something Dooble can do? Also what do you mean by "owning objects"?
Lynx has access to the network objects. In Dooble, that access is restricted, hidden, non-existent, unknown. Then there are the QtWebEngineProcess processes. Those are child processes of the Dooble process and some of them are children of other QtWebEngineProcess parents. Dooble cannot access the sockets of those processes unless it uses /proc or some system function.
So what happens to the owning objects? Well, suppose you have a Chrome object of some Chrome class which has a socket. Now you close that socket without using the object. The object may have a state associated with that socket. It believes that the socket is open. The unknowns.
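If Dooble went the /proc route, the child renderers could at least be enumerated. A sketch, again assuming Linux, and covering direct children only (grandchild QtWebEngineProcess processes would need a recursive walk); the fscanf assumes the renderer's comm field contains no spaces:

```cpp
// Sketch only: identify QtWebEngineProcess children of this (Dooble) process
// by scanning /proc/<pid>/stat for a matching parent PID. Their sockets live
// under /proc/<pid>/fd and can be inspected exactly as in the previous
// sketch, but they cannot be closed from the parent process.
#include <dirent.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>

int main()
{
  pid_t me = getpid(); // In Dooble, the browser's own PID.
  DIR *proc = opendir("/proc");
  struct dirent *entry = nullptr;

  while (proc && (entry = readdir(proc)) != nullptr)
    {
      if (entry->d_name[0] < '0' || entry->d_name[0] > '9')
        continue; // Not a PID directory.

      char path[64];

      snprintf(path, sizeof(path), "/proc/%s/stat", entry->d_name);

      FILE *fp = fopen(path, "r");

      if (!fp)
        continue;

      int pid = 0, ppid = 0;
      char comm[64] = {0};
      char state = 0;

      // Fields of /proc/<pid>/stat: pid (comm) state ppid ...
      // A renderer's comm reads "(QtWebEngineProc)", no spaces.
      if (fscanf(fp, "%d %63s %c %d", &pid, comm, &state, &ppid) == 4 &&
          ppid == me)
        printf("child %d %s -> inspect /proc/%d/fd for sockets\n",
               pid, comm, pid);

      fclose(fp);
    }

  if (proc)
    closedir(proc);

  return EXIT_SUCCESS;
}
```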
The correct solution is to liberate the access or provide a means of setting timeouts or preventing heartbeats. This other solution is ugly, not portable, damaging, and what for?
You'd get a little more security because your application is crashing most of the time.
Are you saying that lynx is crashing all the time? Or that it is not ugly to send TCP packets for several minutes after downloading a file for less than a second? Is that really the best possible?
Neither. I am saying that the ugly solution of intercepting objects from the void in order to satisfy closed sockets is not a solution that is pleasant. If Dooble does this, it satisfies one single curiosity. What does it gain? It certainly will not add stability.
Dooble ----- Qt ----- Chrome ----- ... ----- Sockets
to
Dooble ----- Sockets
Chrome is highly controversial in the FOSS community, especially among people with privacy concerns. Perhaps you know that. Other than that: should we perhaps report the socket thing somewhere upstream?
You don't need to ask me to report things. :P
It may be controversial, but the alternatives are scarce. And there are things you can do. Disable JS, block domains, destroy cookies, report tickets, change your user agent, use Gopher, create a new Web.
I am not a programmer and without your explanation I wouldn't know how the socket allocation works, etc. All I know for the moment is what you explained. So surely I am not the right person to report such intricacies. But if you are interested in bringing Dooble to a level which other browsers are not at, I suppose it would be good if you asked for a fix in Qt (or where appropriate) which would allow better control of sockets. It is not that I am too lazy to do it; I am simply not an expert able to discuss all the programming aspects of it with the right people.
Yes, I know. That's how I found Dooble :)
I have done all of these except the last two. Yet those are patchwork, not security by design. :)
That's fine. Such requests may never find ears. It has been typical to hear from other developers, "Why do you need that?" I would prefer, "We don't know why you need it, but we'll try to offer it."
And we have the answer: because we want the user to control the program, not the program to "spy" on the user and report his activity to the remote host: "I am still online, let me send you some more packets." This is not only a potential privacy problem but also technically inefficient and unnecessary. (It uses more battery too. :P)
Expose the software, make it transparent, work to be open, be portable. These are very nice but when the details cascade from many sources, you become a realist. You form teams, committees, requirements, funding. Funding.
If I knew how and where to report this, I would do it without funding. But I don't. Even if I did, I would be unable to do anything with the result of a potential "fix" done upstream. So I am reporting the privacy issue here, hoping to see a browser which may go further than where most browsers end up. Then it will be worth all the possible funding and community support.
I think I nailed it for Firefox. You should probably provide a similar setting in Dooble:
I would check https://peter.sh/experiments/chromium-command-line-switches/ and determine which options apply to sockets, if any.
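For what it's worth, if such a switch existed it would reach the engine through the QTWEBENGINE_CHROMIUM_FLAGS environment variable, which Qt WebEngine reads before Chromium is initialized. A minimal sketch; --log-net-log is used purely as a placeholder switch, since nothing in the list appears to control the lingering sockets:

```cpp
// Sketch only: Chromium command-line switches reach Qt WebEngine through the
// QTWEBENGINE_CHROMIUM_FLAGS environment variable, which must be set before
// the engine initializes. "--log-net-log" is only an illustrative switch.
#include <QApplication>
#include <QUrl>
#include <QWebEngineView>

int main(int argc, char *argv[])
{
  qputenv("QTWEBENGINE_CHROMIUM_FLAGS", "--log-net-log=/tmp/netlog.json");

  QApplication application(argc, argv);
  QWebEngineView view;

  view.load(QUrl("https://www.fsf.org/robots.txt"));
  view.show();
  return application.exec();
}
```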
Checked and nothing provides that.
That's the problem, see? This sort of option should exist through a command-line or in Qt. Circumventing Chrome is not the solution.
I don't know about that. And I don't know why it should exist as a CLI option. In Firefox it is not a CLI option.
https://doc.qt.io/qt-5.10/qtwebengine-overview.html "The Qt WebEngine Process is a separate executable that is used to render web pages and execute JavaScript. This mitigates security issues and isolates crashes caused by specific content."
I'm not particular about the solution. An interface into the engine is the preferred method for several reasons: it doesn't harm the engine, it remains future-compatible (as long as the options remain supported), it's portable, and it delegates function where function belongs.
From the way you reply it sounds like the browser is essentially the engine, and browser developers who don't work on the engine shouldn't really do much :)
A Web engine is a living interface to living and static documents. An interface into the socket layer is an unknown. If it exists, I'll work with it. If it doesn't exist, what is your proposal? Modify Qt? Modify Chrome? Ask Chrome developers? Ask Google?
Another example: https://browserleaks.com/webrtc and https://bugreports.qt.io/browse/QTBUG-57505. So you see, one of the suggestions is to build Qt with this feature disabled. I'm not going to do this because I have to live. You can also see that the approved action was to introduce a mechanism to disable this at runtime: https://codereview.qt-project.org/#/c/204221/17/src/webenginewidgets/api/qwebenginesettings.cpp
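As an illustration of that runtime mechanism, and assuming Qt 5.11 or later where the attribute added by the linked review is available, this is the kind of knob the engine actually exposes:

```cpp
// Sketch only: QWebEngineSettings::WebRTCPublicInterfacesOnly (Qt >= 5.11)
// keeps WebRTC from advertising local/private IP addresses at runtime.
// Nothing comparable currently exists for the idle sockets in this issue.
#include <QApplication>
#include <QUrl>
#include <QWebEngineSettings>
#include <QWebEngineView>

int main(int argc, char *argv[])
{
  QApplication application(argc, argv);

  QWebEngineSettings::defaultSettings()->setAttribute
    (QWebEngineSettings::WebRTCPublicInterfacesOnly, true);

  QWebEngineView view;

  view.load(QUrl("https://browserleaks.com/webrtc"));
  view.show();
  return application.exec();
}
```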
Portability and ease of releasing Dooble are very important to me. If I cannot release Dooble because I have to build Qt or maintain special exceptions on the various Linux distributions, I'll be very unhappy. With Dooble 2.0, I maintain it and it alone. That's liberating.
I don't have to build Qt, I don't have to release libgcrypt, I don't keep sqlite current, etc.
New ticket.
The page code has URLs to static.fsf.org that contain images and iframes:
A bad thing on the fsf.org site is Piwik/Matomo, a web analytics application with a tracking API:
Yes, the home page contains various resources. I know that. That's why my primary test is with robots.txt.
I know, and I am not sure in what sense you consider it bad. Personally, I maintain a few websites which unfortunately still use Google Analytics. I am in the process of looking for a replacement (in order to prevent leaking of info to the big G) and will probably also use Matomo. Generally, whether tracking is bad or not depends on its purpose. You cannot possibly optimize a website without feedback about actual usage, and optimizing UI/UX is good for the visitors. Otherwise it can be argued that even web server logs should not exist, as they can also be used for tracking (Webalizer and similar). If you have any better ideas I would be interested to discuss them with you, but probably elsewhere, as it is quite off-topic and we should not clutter this repo. BTW, this Sucuri SiteCheck seems a quite superficial test (they also use Google Analytics).
You can block google-analytics.com in Dooble. The site is included in the Data file.
Only if the site uses frontend-based GA. But GA also has an HTTP API which can be utilized server-side, and you won't even know. Yes, it is more limited because it can't access all the info which JS can, but it is still GA and you cannot stop it.
Does that mean that after I visit pbs.org, pbs.org may share the information it's gathered with Google?
Yes. https://developers.google.com/analytics/devguides/collection/protocol/v1/devguide
Remember that it is also possible for a website to host the GA JS code locally and send the POST requests to its own proxy host, which in its turn can communicate with Google. So a filter based on domain name won't be effective.
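To make the server-side path concrete, here is a rough sketch of a Measurement Protocol v1 hit as described in the devguide linked above; the tracking ID and client ID are made-up values. Because the request leaves the website's own server (or its proxy), no domain-based filter running in the visitor's browser ever sees it:

```cpp
// Sketch only: a server-side "pageview" hit per the Measurement Protocol v1
// devguide. The tracking ID (tid) and client ID (cid) are hypothetical.
#include <QCoreApplication>
#include <QNetworkAccessManager>
#include <QNetworkReply>
#include <QNetworkRequest>
#include <QUrl>
#include <QUrlQuery>

int main(int argc, char *argv[])
{
  QCoreApplication application(argc, argv);
  QNetworkAccessManager manager;
  QUrlQuery query;

  query.addQueryItem("v", "1");            // Protocol version.
  query.addQueryItem("tid", "UA-XXXXX-Y"); // Hypothetical property ID.
  query.addQueryItem("cid", "555");        // Hypothetical client ID.
  query.addQueryItem("t", "pageview");     // Hit type.
  query.addQueryItem("dp", "/robots.txt"); // Document path.

  QNetworkRequest request(QUrl("https://www.google-analytics.com/collect"));

  request.setHeader(QNetworkRequest::ContentTypeHeader,
                    "application/x-www-form-urlencoded");

  QNetworkReply *reply = manager.post
    (request, query.toString(QUrl::FullyEncoded).toUtf8());

  QObject::connect(reply, &QNetworkReply::finished,
                   &application, &QCoreApplication::quit);
  return application.exec();
}
```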
Annnnnnnddddddddddddddddddddddddddd... This is not a problem with a browser or Google or a site. Preventing a server from sharing the data that it extracts from you with other entities should be made available as information. Perhaps even optional. Restricting it? Eh. In this example, it's perhaps unethical. Do you have counters on your sites?
I didn't say it was. I was just answering your question. Whether it is ethical or not depends on how the info is used (and now with the GDPR coming that must be transparent).
No need to. Webalizer and similar tools can give the same info.
Socket states TO-DO item.
Disable JavaScript and cookies.
Run
rcnetwork restart;tcpdump -i eth1 ip src host pc and dst host not myrouter and dst host not pc -tq
Start dooble and visit https://fsf.org/robots.txt
tcpdump output:
Page loaded.
Wait, do nothing and just watch tcpdump:
Exit dooble (after waiting for a minute or so).
No additional packets.
Questions:
Why is it that packets keep being sent after dooble has terminated? Does that not result in informing the remote host "The user has closed their browser"?
Why do additional packets keep traveling after the page has loaded?
Wikipedia says that Dooble provides a means to block external content. I suppose (please correct me if I am wrong) this is similar to what browser extensions like uMatrix do (blocking of 3rd party resources). However I can't find this anywhere in settings. Is the Wikipedia article wrong or is that setting hidden somewhere where it is not so obvious?