Need of Internet / dubious connections / spyware #101

Open
IVIaV opened this issue Sep 22, 2024 · 3 comments

Comments

@IVIaV

IVIaV commented Sep 22, 2024

I find it a bit strange that the internet is used all the time. Especially with so many IPs... Can someone explain why?

In the data I could find out that e.g. Google is used (I assume for their language model?)

Anyway, these are all the connections that are there: I find the ones with an “!” strange:

  • US:
  • 9 x huggingface.co ---> for what???
  • 2 x httpbin.org ! ---> what is that?
  • 4 x api.gradio.app ---> Lib?
  • 1 x cdn-lfs-us-1.huggingface.co ---> for what?
  • CN:
  • 2 x ip.taobao.com ! ---> Very suspicious ... Spyware? This is an online shop!
  • IE:
  • 2 x checkip.amazonaws.com ! ---> Why the HECK Amazon?
  • CA:
  • 7 x geolocation.onetrust.com ! ---> Something to detect which country?
  • DE:
  • 2 x github.com ---> for updating?

In addition, the following sites are also contacted when the browser is opened:

  • Cloudflare.com
  • fonts.googleapis.com
  • gstatic.com ---> Why?

Don't take offense, but I'm being very careful here. In this project, various things are still being installed after the setup script has run... all in all, it is too confusing for me to work through on my own. Perhaps the creator can shed some light on this?

grafik

@AnonimusJack

First of all let me applaud you for your research here.
Next, I'll put some of your concerns at ease and raise others.

HuggingFace with LFS is probably used for the models and related things.
The Gradio API is for Gradio telemetry data, collected by Gradio (for what purpose? Most likely bug tracking or something, but who knows~).
httpbin.org is generally used for testing, though a thorough dive into the code could uncover other uses 👀
checkip.amazonaws.com is used by the Gradio telemetry API to get your IP.
geolocation.onetrust.com is probably more telemetry; finding out by whom will require a deep dive.
ip.taobao.com belongs to Alibaba (think of the Chinese AWS); what is hosted there will be interesting and also requires a deep dive into the code.
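
For the Gradio telemetry mentioned above, Gradio reads the environment variable GRADIO_ANALYTICS_ENABLED. The following is a minimal sketch of switching it off from Python, assuming the variable is set before Gradio is imported by the web UI. The helper name is made up for illustration, and whether this silences every Gradio-related request from the list (for example api.gradio.app) is an assumption that should be checked against a fresh capture.

```python
import os


def disable_gradio_analytics() -> None:
    """Ask Gradio not to send usage analytics from this process.

    Hypothetical helper for illustration; only the variable name comes
    from Gradio itself.
    """
    # Must be set before `import gradio` runs anywhere in the process.
    os.environ["GRADIO_ANALYTICS_ENABLED"] = "False"


disable_gradio_analytics()
# import gradio as gr  # import Gradio only after the variable is set
```

The same effect can be had by exporting the variable in the shell that launches the web UI, which avoids touching the project's code.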

I'll check these out and come back with insights.
It seems the OP has been MIA for an entire quarter, so we probably won't receive input from him.
But I'm curious, and before I sacrifice time for my own tool instead of this one I'll make sure it's safe.
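
One way to start such a check is to record which code path triggers each hostname lookup. The sketch below wraps `socket.getaddrinfo` from the standard library and stores the hostname together with a short call stack. The names `lookups`, `logging_getaddrinfo` and `report` are hypothetical, and the approach assumes the libraries involved resolve names through `socket.getaddrinfo` at call time and that the wrapper is installed before the web UI starts.

```python
import socket
import traceback

_original_getaddrinfo = socket.getaddrinfo
lookups = []  # list of (hostname, call stack lines) tuples


def logging_getaddrinfo(host, *args, **kwargs):
    """Record the requested host and who asked for it, then resolve as usual."""
    # Keep the most recent frames and drop this wrapper's own frame.
    stack = traceback.format_stack(limit=6)[:-1]
    lookups.append((host, stack))
    return _original_getaddrinfo(host, *args, **kwargs)


def report() -> None:
    """Print every recorded hostname with the code path that requested it."""
    for host, stack in lookups:
        print(f"lookup: {host}")
        print("".join(stack))


# Install the wrapper; run this before importing the application code.
socket.getaddrinfo = logging_getaddrinfo
```

Calling `report()` after the UI has started lists each hostname next to the file and line that asked for it, which narrows down which dependency to read.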

Thanks again for bringing this to my attention.
Cheers~

@IVIaV
Author

IVIaV commented Sep 23, 2024

Hey, thanks for the information! :)
I found out some more info using Wireshark... But first a few comments and questions on what I know so far:

HuggingFace: But why do I need a connection to Hugging Face all the time if I have already installed all the models? 🤔
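
A possible explanation, stated as an assumption: Hub client libraries typically contact huggingface.co when a model is loaded, even if the files are already cached. The huggingface_hub library honours HF_HUB_OFFLINE and transformers honours TRANSFORMERS_OFFLINE. The sketch below sets both, assuming this project loads its models through those libraries and that all required files are already in the local cache. The helper name is hypothetical.

```python
import os


def force_hf_offline() -> None:
    """Tell the Hugging Face libraries to rely on the local cache only.

    Hypothetical helper; must run before the libraries are imported.
    """
    # huggingface_hub: skip HTTP calls to the Hub and use cached files.
    os.environ["HF_HUB_OFFLINE"] = "1"
    # transformers: the corresponding offline switch.
    os.environ["TRANSFORMERS_OFFLINE"] = "1"


force_hf_offline()
```

If a required file is missing from the cache, loading should then fail with an error instead of downloading it, which also confirms whether everything is really installed.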

GradioAPI: Yeah, the Gradio API is strange... but no proof of misuse...

httpbin.org: No answer, but more information:

httpbin is a popular online service that provides a simple HTTP request & response service for testing and debugging HTTP libraries and clients. Some key things you can do with httpbin
from proxiesapi.com

checkip.amazonaws.com: What does it need my IP for?

ip.taobao.com: I tracked the IP behind it. It seems to be a service from Alibaba.com ... but no clue what for :/


New Information:
I used Wireshark to look at all HTTP connections (really only HTTP, as there is too much traffic on my machine). These hosts are always contacted at the start of the finetune or xtts webui (I'm in Germany, hence Akamai in DE):

  • Akamai Technologies Inc. (DE) --> "OCSP 544 Request"
  • EU Metro Frontend (IR) --> "OCSP 540 Request"
  • Edgecast Inc (US, California) --> reported 3 times, but no evidence of malware... just an "OCSP 544 Request"
  • Amazon.com Inc. (US, Virginia) --> "OCSP 533 Request", probably the IP lookup mentioned by AnonimusJack
  • Google LLC (US, Missouri) --> "403 GET /success.txt?ipv4 HTTP/1.1" and "423 GET /success.txt?ipv6 HTTP/1.1" 🤷
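
A note on reading these entries: OCSP is the Online Certificate Status Protocol, which clients use to check whether a TLS certificate has been revoked. The quoted strings look like Wireshark's Protocol, Length and Info columns pasted together, so the numbers (544, 403, 423) are probably frame lengths in bytes rather than HTTP status codes. That reading is an assumption. The sketch below splits the strings under that assumption; `Summary` and `parse_summary` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Summary(NamedTuple):
    protocol: Optional[str]  # e.g. "OCSP"; None when the column was not pasted
    length: int              # assumed to be the frame length in bytes
    info: str                # remaining text from the Info column


# Optional upper-case protocol name, then a number, then the rest.
PATTERN = re.compile(
    r"^(?:(?P<protocol>[A-Z]+)\s+)?(?P<length>\d+)\s+(?P<info>.+)$"
)


def parse_summary(text: str) -> Summary:
    """Split a pasted capture summary into protocol, length and info."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised summary: {text!r}")
    return Summary(match["protocol"], int(match["length"]), match["info"])


for line in ("OCSP 544 Request", "403 GET /success.txt?ipv4 HTTP/1.1"):
    print(parse_summary(line))
```

Read this way, the Google entry is a plain GET request in a 403-byte frame, not a "403 Forbidden" response.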

@AnonimusJack

AnonimusJack commented Sep 23, 2024

I found something more.
It seems most of the "problematic" API calls come from this library: https://github.com/uliontse/translators

class Region(Tse):
    def __init__(self):
        super().__init__()
        self.get_addr_url = 'https://geolocation.onetrust.com/cookieconsentpub/v1/geo/location'
        self.get_ip_url = 'https://httpbin.org/ip'
        self.ip_api_addr_url = 'http://ip-api.com/json'  # must http.
        self.ip_tb_add_url = 'https://ip.taobao.com/outGetIpInfo'
        self.default_region = os.environ.get('translators_default_region', None)

From translators/server.py line 322.
Seems kinda safe.
I should check the project, maybe it can be turned off if there's no need to detect a language or do some translation for some peace of mind 😁
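
The quoted constructor reads the environment variable `translators_default_region`. A minimal sketch of setting it before the library is imported is shown below. The snippet above only shows the variable being read, so whether a preset value makes the library skip the geolocation and IP lookups is an assumption to verify in `translators/server.py`. The value "EN" and the helper name are also assumptions made for illustration.

```python
import os


def pin_translators_region(region: str = "EN") -> str:
    """Preset the region so the library does not have to detect it.

    Hypothetical helper; "EN" is an assumed value, check the library
    source for the values it actually accepts.
    """
    os.environ["translators_default_region"] = region
    return os.environ["translators_default_region"]


pin_translators_region("EN")
# import translators  # import the library only after the variable is set
```

Re-running the Wireshark capture afterwards would show whether the requests to geolocation.onetrust.com, httpbin.org and ip.taobao.com are gone.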
