Commit
[packetbeat] Expire source port mappings. (#41581)
port->pid mappings were only overwritten, never expired. The overwriting mechanism has several issues:

- It only overwrites if it manages to find the new pid, so it misses short-lived processes.
- It only refreshes the mapping of a given port if a packet arriving on _another_ port misses the lookup (otherwise the original port is found and returned). This means that once all ports have been used at least once, the cache is full and never mutated again.

The observable effect is that the user sees wrong process correlations _to_ older/long-lived processes. Imagine the following:

- A long-lived process makes a _short_-lived TCP connection from src_port S.
- Years later, a _short_-lived process makes a TCP connection to somewhere else, but from the same src_port S. It hits the cache, since there is already a mapping for S, so packetbeat incorrectly correlates the new short-lived process connection with the old long-lived process.

Related to a very long SDH, where a more in-depth explanation of the bug can be found, along with a program to reproduce it:

- elastic/sdh-beats#4604 (comment)
- elastic/sdh-beats#4604 (comment)

The solution is to discard mappings that are "old enough", with a hardcoded window of 10 seconds, so as long as the port is not re-used within this window, we are fine. This also makes sure the cache never becomes "immutable", since mappings will invariably get old, forcing a refresh.

It's a very conservative approach, as I don't want to introduce other bugs by redesigning it. Work is under way to change how the cache works on Linux anyway.

While here, I noticed the locking was also wrong: we were doing the lookup unlocked, and then had to relock whenever the mapping needed updating. Change this to grab the lock once and only once; interleaving is bad.