Add tests for the HTTP tracker #162
… running Add a communication channel to wait until the new job is running. This is especially useful for testing, because tests need the HTTP server up and running before making requests.
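A minimal sketch of the "wait until the job is running" idea, using only the standard library. The function name `start_job_and_wait` and the toy one-shot TCP server are hypothetical stand-ins, not the tracker's actual job launcher, which may well use an async channel instead.

```rust
use std::io::Write;
use std::net::{SocketAddr, TcpListener};
use std::sync::mpsc;
use std::thread;

/// Starts a toy server on an OS-assigned port and returns only once it is listening.
/// (Hypothetical helper for illustration; not part of the tracker code base.)
pub fn start_job_and_wait() -> (SocketAddr, thread::JoinHandle<()>) {
    let (ready_tx, ready_rx) = mpsc::channel::<SocketAddr>();

    let handle = thread::spawn(move || {
        let listener = TcpListener::bind("127.0.0.1:0").expect("bind failed");
        let addr = listener.local_addr().expect("no local address");
        // Tell the launcher that the socket is bound and ready for connections.
        ready_tx.send(addr).expect("launcher went away");
        // Serve exactly one connection, then stop.
        if let Ok((mut stream, _)) = listener.accept() {
            let _ = stream.write_all(b"ok");
        }
    });

    // Block until the job reports that it is running.
    let addr = ready_rx.recv().expect("job stopped before it was ready");
    (addr, handle)
}
```

A test can call `start_job_and_wait()` and connect to the returned address straight away, without sleeping or retrying.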
If you try to convert a peer Id into a String it returns an empty String. ``` let id = peer::Id(*b"-qB00000000000000000"); assert_eq!(id.to_string(), ""); ```
Hi @WarmBeer, it seems the HTTP tracker is returning an empty Peer Id in the announce response. I've added a test in a commit:
#[test]
fn should_be_converted_into_string() {
    let id = peer::Id(*b"-qB00000000000000000");
    assert_eq!(id.to_string(), "");
}
If you try to convert the peer Id into a
{
"info_hash": "6d8f96b4e4761a04a3193ed729ebb8993ad17f0e",
"seeders": 1,
"completed": 0,
"leechers": 0,
"peers": [
{
"peer_id": {
"id": "2d7142343431302d78514c65535f7828622e6c53",
"client": "qBittorrent"
},
"peer_addr": "2.137.87.41:17548",
"updated": 1674482474001,
"updated_milliseconds_ago": 1674482474001,
"uploaded": 0,
"downloaded": 0,
"left": 0,
"event": "Started"
}
]
}
but I do not know if that is the correct value for the HTTP tracker specification. I have not found the original specification for the HTTP tracker. Reading these two pages:
It's unclear which format the peer ID should have in the I think in the The Do you think we should use the same format: That's the hex representation of the byte array:
By the way, I think the problem is we are using two different functions to convert the ID into an
impl std::fmt::Display for Id {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        let mut buffer = [0u8; 20];
        let bytes_out = binascii::bin2hex(&self.0, &mut buffer).ok();
        match bytes_out {
            Some(bytes) => write!(f, "{}", std::str::from_utf8(bytes).unwrap()),
            None => write!(f, ""),
        }
    }
}
and:
pub fn get_id(&self) -> Option<String> {
    let buff_size = self.0.len() * 2;
    let mut tmp: Vec<u8> = vec![0; buff_size];
    binascii::bin2hex(&self.0, &mut tmp).unwrap();
    std::str::from_utf8(&tmp).ok().map(std::string::ToString::to_string)
}
The first one is used in the HTTP tracker ( I just want to confirm that UPDATE: It's not working on the |
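The two functions differ in the size of the output buffer: `get_id` allocates `self.0.len() * 2` bytes, while the `Display` implementation allocates only 20. Hex encoding needs two characters per input byte, so a 20-byte id needs 40, which is consistent with the empty string observed in the test. The sketch below is one way to sidestep the intermediate buffer entirely, using only the standard library; `Id` here is a stand-in for `peer::Id`, and `sample_id` is a hypothetical helper.

```rust
use std::fmt;

/// Stand-in for the `peer::Id` newtype quoted above (20 raw bytes).
pub struct Id(pub [u8; 20]);

impl fmt::Display for Id {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Two lowercase hex digits per byte: 20 bytes become 40 characters.
        for byte in &self.0 {
            write!(f, "{:02x}", byte)?;
        }
        Ok(())
    }
}

/// Builds the sample id used in the thread: "-qB" followed by ASCII zeros.
/// (Hypothetical helper, only here to keep the example self-contained.)
pub fn sample_id() -> Id {
    let mut bytes = [b'0'; 20];
    bytes[..3].copy_from_slice(b"-qB");
    Id(bytes)
}
```

With this version, `sample_id().to_string()` is a 40-character hex string that starts with `2d7142` (the bytes of `-qB`).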
hey, @WarmBeer, it seems the only mandatory params for the GET Announce request are:
See this commit. The official specification BEP 3 says only I tried with another tracker: and you get the following error if you do not provide, for example, the
I'm writing tests for the current behaviour; once we migrate to Axum, we can change the behaviour if we want. In this case, I would respond with a concrete error like the "tracker.gbitt.info" tracker response. In fact, it's what the specification says you have to do:
Tracker responses are bencoded dictionaries. If a tracker response has a key failure reason, then that maps to a human readable string which explains why the query failed, and no other keys are required. Otherwise, it must have two keys: interval, which maps to the number of seconds the downloader should wait between regular rerequests, and peers. peers maps to a list of dictionaries corresponding to peers, each of which contains the keys peer id, ip, and port, which map to the peer's self-selected ID, IP address or dns name as a string, and port number, respectively. Note that downloaders may rerequest on nonscheduled times if an event happens or they need more peers. |
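As an illustration of the quoted rule, here is a small sketch of how a `failure reason` response could be bencoded by hand. The function names and the error message are hypothetical; the tracker itself would presumably go through a bencode library rather than build the bytes manually.

```rust
/// Bencodes a byte string as `<length>:<bytes>`.
fn bencode_bytes(value: &[u8]) -> Vec<u8> {
    let mut out = value.len().to_string().into_bytes();
    out.push(b':');
    out.extend_from_slice(value);
    out
}

/// Builds a bencoded dictionary whose only key is `failure reason`.
/// (Hypothetical helper; the message text is up to the tracker.)
pub fn failure_response(reason: &str) -> Vec<u8> {
    let mut out = vec![b'd'];
    out.extend(bencode_bytes(b"failure reason"));
    out.extend(bencode_bytes(reason.as_bytes()));
    out.push(b'e');
    out
}
```

For example, `failure_response("missing info_hash")` yields the bytes of `d14:failure reason17:missing info_hashe`.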
Hi @josecelano ,
That is not expected behaviour for the non-compact response (for the compact response it is). It is not a breaking bug however, since the Peer Id seems to be completely useless for clients.
The Vuze Wiki mentions this: From this I would think that the Peer Id needs to be returned as a non-hexed string. So in the "-qB0000" format I guess. |
OK, thank you @WarmBeer! The BitTorrent client is using an Id like this The peer is a string, not a byte array. But in the BEP 23 they say: On the other hand, I have installed this tracker. I've run it locally, and they are using the UTF-8 representation: Request:
Response:
I'm going to fix it to also use that format. |
It will be used to deserialize bytes from HTTP tracker announce compact responses. For example: ``` pub peers: Vec<u8>, ```
We need it to get the address of the HTTP client we use in tests.
Hi @WarmBeer, I've just added new tests for how the tracker assigns IP addresses to peers. When the tracker is running behind a reverse proxy the tracker assigns the IP on the The
/// Get `PeerAddress` from `RemoteAddress` or Forwarded
fn peer_addr((on_reverse_proxy, remote_addr, x_forwarded_for): (bool, Option<SocketAddr>, Option<String>)) -> WebResult<IpAddr> {
    if !on_reverse_proxy && remote_addr.is_none() {
        return Err(reject::custom(Error::AddressNotFound));
    }
    if on_reverse_proxy && x_forwarded_for.is_none() {
        return Err(reject::custom(Error::AddressNotFound));
    }
    if on_reverse_proxy {
        let mut x_forwarded_for_raw = x_forwarded_for.unwrap();
        // remove whitespace chars
        x_forwarded_for_raw.retain(|c| !c.is_whitespace());
        // get all forwarded ip's in a vec
        let x_forwarded_ips: Vec<&str> = x_forwarded_for_raw.split(',').collect();
        // set client ip to last forwarded ip
        let x_forwarded_ip = *x_forwarded_ips.last().unwrap();
        IpAddr::from_str(x_forwarded_ip).map_err(|_| reject::custom(Error::AddressNotFound))
    } else {
        Ok(remote_addr.unwrap().ip())
    }
}
Here you can see the test I've added for that behaviour. Shouldn't it be the leftmost IP address, @WarmBeer? I think the leftmost IP address is the original client address and the rightmost IP is the IP of the last proxy in the chain. https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For Maybe it should not even be the first one: |
Hi @WarmBeer, this is an interesting article shared by @da2ce7: https://adam-p.ca/blog/2022/03/x-forwarded-for/ I think the reason why we are getting the last IP in the header is that we do not trust any of the IPs in the header. We only trust our own proxy, so we use the IP received by our own proxy. Is that the reasoning behind that decision? If that were the case, it means we would never get the real client IP when there are some other proxies in the middle, even if we potentially trust them, right? So basically, the config option If that is OK, I will document it on the test. I thought the intended behaviour was just getting the header's originator client IP. On the other hand, what would be the problem if the client IP is spoofed in the header? In that case, we would simply add a peer with an invalid IP. If we use the last IP, we are going to add a peer that we are sure exists but that might not be the real peer. I suppose adding a random peer is worse than adding a peer that is not a peer but at least has passed a request to the tracker. |
There are new Clippy errors not shown in my local version: https://github.com/torrust/torrust-tracker/actions/runs/4017899158/jobs/6902857248 I'm using: By the way, I've just realized |
This was indeed my reasoning. We can not trust the entire X-Forwarded-For chain, only the last added. This does indeed have the downside that you can not have multiple proxies in between.
So we can either choose to be sure that we receive the actual IP of the peer by selecting the last IP in the chain of the X-Forwarded-For header, with the downside that we can only support one proxy. Or we choose to support multiple proxies by selecting the first IP in the X-Forwarded-For header chain, but with the downside that the IP can be literally anything supplied by the client. We could of course check that the first string set in the X-Forwarded-For chain is an actual IP, but we won't know if this IP address is the actual IP of the client. I would prefer to know the actual IP of a client, so that a client can't just spam announce requests for the same info-hash using a different spoofed IP every time. What do you think? Maybe we should think of a totally different solution. |
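The trade-off between the two ends of the chain can be sketched with a small standard-library function. `Pick` and `forwarded_ip` are hypothetical names for illustration and are not part of the tracker; the addresses in the usage note come from documentation ranges.

```rust
use std::net::IpAddr;

/// Which end of the `X-Forwarded-For` chain to read.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Pick {
    /// The original client as claimed by the request; can be spoofed.
    Leftmost,
    /// The address appended by the proxy closest to the tracker.
    Rightmost,
}

/// Returns the selected entry if it parses as an IP address.
pub fn forwarded_ip(header: &str, pick: Pick) -> Option<IpAddr> {
    let mut entries = header.split(',').map(str::trim);
    let chosen = match pick {
        Pick::Leftmost => entries.next(),
        Pick::Rightmost => entries.last(),
    }?;
    chosen.parse().ok()
}
```

For the header `203.0.113.7, 198.51.100.2, 10.0.0.1`, `Pick::Leftmost` selects `203.0.113.7` (whatever the client or an upstream hop supplied) and `Pick::Rightmost` selects `10.0.0.1` (the peer address seen by the last proxy).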
``` peer::Id(*b"-qB00000000000000000").to_string() ``` was always returning an empty string. It has been changed to return the hex representation of the byte array.
hi @WarmBeer, regarding our previous discussion about the Peer Id format in responses: I've fixed the Now:
it returns the hex string for the byte array. We also use that format when we serialize the peer id into JSON. With that change, the HTTP tracker response will contain that hex string instead of an empty string. I have not implemented what we agreed here yet. We agreed on using the string representation, just mapping each byte to the ASCII char. The reason is that I've changed my mind. The HTTP tracker returns bencoded responses, and bencoded responses allow any arbitrary byte array (not only well-formed UTF-8 strings). I think we should avoid converting the peer id into a string and then bencoding the string. We should just send the encoded 20-byte array. What do you think? On the other hand, maybe the Display trait should print the byte array without making any conversion. But I'm not sure. Maybe it could try to convert it to a string, and if that is not possible, it could use the hex string. I suppose all clients use peer ids that can be shown as strings even if the protocol accepts any 20-byte array for some reason.
Conclusions
I would return the bencoded byte array in responses (since we do not know if some client can be using peer ids with non-valid string chars), and I would If you agree, I can make the changes in this PR. If you think we need to discuss it, I will create a new issue. I would prefer to create a new issue to find out what other trackers and clients are doing. cc @da2ce7 |
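A rough sketch of the two ideas discussed above: sending the peer id as a raw bencoded byte string, and a display form that falls back to hex when the bytes are not valid UTF-8. Both function names are hypothetical, and this is only one possible reading of the proposal.

```rust
/// Bencodes the raw 20-byte peer id as a byte string: `20:` followed by the bytes.
pub fn bencode_peer_id(id: &[u8; 20]) -> Vec<u8> {
    let mut out = b"20:".to_vec();
    out.extend_from_slice(id);
    out
}

/// Human-readable form: the text itself when the bytes are valid UTF-8,
/// otherwise the 40-character hex string.
pub fn display_peer_id(id: &[u8; 20]) -> String {
    match std::str::from_utf8(id) {
        Ok(text) => text.to_owned(),
        Err(_) => id.iter().map(|b| format!("{:02x}", b)).collect(),
    }
}
```

The bencoded form never changes the bytes, so an id containing `0xff` still occupies exactly 20 bytes after the `20:` prefix, while the display form switches to hex for that id.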
@WarmBeer I made a test with https://github.com/webtorrent/bittorrent-tracker
Peer Id with ASCII values (byte<128)
Announce URL: The peer id: Response (bytes with the bencoded response):
Peer Id with non ASCII values (byte>128)
Announce URL: The peer id: Response (bytes with the bencoded response):
It is strange that it returns 21 bytes: Bytes (hex): It's adding an extra byte, |
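One possible explanation, offered purely as an assumption since the thread does not confirm it: if a tracker treats the peer id as text and re-encodes it as UTF-8, every byte of 128 or above becomes two bytes, so a 20-byte id with one such byte would come back as 21 bytes. The hypothetical function below reproduces that effect with the standard library.

```rust
/// Interprets each byte as a Unicode code point (Latin-1 style) and
/// re-encodes the result as UTF-8. Bytes below 128 stay one byte long;
/// bytes of 128 and above expand to two bytes.
/// (Hypothetical illustration; not taken from any tracker's code.)
pub fn reencode_as_utf8(bytes: &[u8]) -> Vec<u8> {
    let text: String = bytes.iter().map(|&b| b as char).collect();
    text.into_bytes()
}
```

For instance, the single byte `0x80` becomes the pair `0xc2 0x80`, while an all-ASCII id keeps its length of 20.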
We will use one mod per type of request and response.
…follow new scrape conventions Deserialization from bencoded bytes for announce and scrape request is done in two phases. First using `serde_bencode::from_bytes` and later with a custom parser. The reason is the `serde_bencode` crate does not allow direct deserialization for the structs we need. The struct resulting from the first deserialization done by `serde_bencode` is the `DeserializedCompact` and the second one just `Compact`. So the prefix `Deserialized` is used when the bytes in the response body are converted into a struct.
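The second phase can be sketched as follows. The struct names come from the commit message, but their fields and the `from_deserialized` method are assumptions made for this example; the first phase (the `serde_bencode` call) is omitted so the block stays standard-library only. The 6-bytes-per-peer layout follows BEP 23 for IPv4 peers.

```rust
use std::net::{Ipv4Addr, SocketAddr, SocketAddrV4};

/// Phase one result: the shape assumed to come out of the bencode deserializer.
pub struct DeserializedCompact {
    pub interval: u32,
    pub peers: Vec<u8>,
}

/// Phase two result: the same response with the peer bytes parsed.
#[derive(Debug, PartialEq)]
pub struct Compact {
    pub interval: u32,
    pub peers: Vec<SocketAddr>,
}

impl Compact {
    /// Custom parser for the compact peer list (hypothetical method name).
    pub fn from_deserialized(raw: DeserializedCompact) -> Result<Compact, String> {
        // Each IPv4 peer is 4 address bytes plus a 2-byte big-endian port.
        if raw.peers.len() % 6 != 0 {
            return Err(format!(
                "peers length {} is not a multiple of 6",
                raw.peers.len()
            ));
        }
        let peers = raw
            .peers
            .chunks_exact(6)
            .map(|c| {
                let ip = Ipv4Addr::new(c[0], c[1], c[2], c[3]);
                let port = u16::from_be_bytes([c[4], c[5]]);
                SocketAddr::V4(SocketAddrV4::new(ip, port))
            })
            .collect();
        Ok(Compact {
            interval: raw.interval,
            peers,
        })
    }
}
```

With this layout, the six bytes `127, 0, 0, 1, 0x1f, 0x90` parse to the peer `127.0.0.1:8080`.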
This is the final list of tests:
|
ACK c46e417 |
Add tests for the HTTP tracker.
announce request.
scrape request.
These subtasks require discussion and a formal definition of the solution we will implement. I will create new issues for them:
-qB000000000000000004.
X-Forwarded-For header when the tracker is behind a proxy. More info here.
UPDATE: 01-02-2023