Too many accesses to pci-ids.ucw.cz #28

Closed
gollux opened this issue Mar 3, 2022 · 2 comments

gollux commented Mar 3, 2022

Hello!

This is the maintainer of the PCI ID database (pci-ids.ucw.cz) speaking.

I am currently getting more than 200,000 requests per day from your module, but from only about 300 unique IP addresses per day. That translates to 60 GB of data daily! This suggests your caching is seriously broken or non-existent.

Could you please fix it quickly and notify your users? Otherwise I will have to implement bandwidth limits.

Even better, please consider sending individual queries to the PCI ID database using DNS. The protocol has no good documentation, but you can copy the implementation from pciutils/lib/names-net.c.
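
For readers unfamiliar with the DNS approach, here is a minimal Go sketch of a per-device lookup. It rests on assumptions that should be checked against pciutils/lib/names-net.c before use: that the zone is `pci.id.ucw.cz`, that a device is queried as `<device>.<vendor>.<zone>` in four-digit hex, and that the name comes back in a TXT string prefixed with `i=`. The function names are made up for this example.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// Assumption: zone name and query layout as read from
// pciutils/lib/names-net.c; verify against that file.
const pciIDDomain = "pci.id.ucw.cz"

// deviceQueryName builds the DNS name for one vendor/device pair.
func deviceQueryName(vendor, device uint16) string {
	return fmt.Sprintf("%04x.%04x.%s", device, vendor, pciIDDomain)
}

// parseName returns the name from the first TXT string of the
// assumed form "i=<name>".
func parseName(txts []string) (string, bool) {
	for _, t := range txts {
		if strings.HasPrefix(t, "i=") {
			return strings.TrimPrefix(t, "i="), true
		}
	}
	return "", false
}

// lookupDevice issues a single small TXT query instead of
// downloading the whole database.
func lookupDevice(vendor, device uint16) (string, error) {
	txts, err := net.LookupTXT(deviceQueryName(vendor, device))
	if err != nil {
		return "", err
	}
	name, ok := parseName(txts)
	if !ok {
		return "", fmt.Errorf("no name record for %04x:%04x", vendor, device)
	}
	return name, nil
}

func main() {
	name, err := lookupDevice(0x8086, 0x1237)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(name)
}
```

Each lookup is a single DNS exchange that resolvers can cache, so repeated queries for the same device need not reach the database server at all.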

jaypipes (Owner) commented Mar 3, 2022

Hi @gollux! So sorry that my library has caused issues for you!

The library first looks for the PCI database files in a set of known locations. Only if no PCI database file can be found on the local filesystem, and the network fetch feature has not been disabled, does it attempt to fetch the latest PCI database from pci-ids.ucw.cz.

The network fetch occurs only when the platform is Windows (which, AFAIK, does not keep the PCI database files locally the way Linux does) or when the library runs on a filesystem (perhaps in a container?) that does not have any of the PCI IDs database files mounted into it from the local Linux host.

If you would please email me (jaypipes at gmail dot com) any information you have about those 300 unique IP addresses, I can investigate what might be causing those calls from the library to your pci-ids.ucw.cz site. I suspect it may be some testing or CI platform (Kubernetes or one of the Kubernetes networking drivers) that includes this library in the container image used for testing, where the container image's filesystem has no bind mount to pass the Linux host's pci-ids files into the container.

jaypipes self-assigned this Mar 3, 2022
jaypipes added a commit that referenced this issue Mar 24, 2022
Unfortunately, pcidb was abusing the hosting of the PCIIDS database (see
Issue #28). Looks like CI/CD jobs were executing in containers and the
container filesystem did not contain the linux host PCIIDS database,
causing pcidb to fetch the latest from the pci-ids.ucw.cz website.

This PR disables the network fetch behaviour of pcidb by default, which
should hopefully address the abusive calls to the pci-ids.ucw.cz hosting
service.

I'm working separately with @fromani from Red Hat to identify any CI
jobs that might have triggered the fetch storm...

Signed-off-by: Jay Pipes <[email protected]>
jaypipes added a commit to jaypipes/ghw that referenced this issue Mar 24, 2022
I cut a new release of the pcidb library (v1.0.0) to address a
particular issue (jaypipes/pcidb#28) that was causing pain for the
maintainers of the PCI-IDS database. This PR simply brings in that
latest patched version of pcidb, which includes disabling by default the
network fetch of the PCI-IDS database when no local database file is
found.

Signed-off-by: Jay Pipes <[email protected]>

taigrr commented Mar 26, 2022

I've created PRs on the following repos to help satisfy @gollux's request that we notify users. I'm sure the majority of the traffic is from the Kubernetes repos, but I don't think it will hurt to propagate this change more widely.

yunionio/cloudpods#13788
livepeer/go-livepeer#2340
jm33-m0/emp3r0r#103
portainer/agent#280
ericmaustin/unixtools#1
