RPKI not *all* IPs are checked for ROA? #1592

Closed
bwbroersma opened this issue Dec 11, 2024 · 4 comments · Fixed by #1596

@bwbroersma
Collaborator

See example https://internet.nl/site/amazon.nl/3077099/:

| Web server | IPv6 address | IPv4 address |
|---|---|---|
| amazon.nl | None | 52.95.120.64 |
| ... | - | 52.95.116.117 |
| ... | - | 54.239.33.93 |

For all IP addresses of your web server an RPKI Route Origin Authorisation (ROA) is published.

| Webserver | IP address | RPKI Route Origin Authorization |
|---|---|---|
| amazon.nl | 54.239.33.93 | yes |

All IP addresses of your web server have a Valid validation state. The route announcement of these IP addresses is matched by the published RPKI Route Origin Authorisation (ROA).

| Web server | BGP Route Prefix | BGP Route Origin ASN | RPKI Origin Validation state |
|---|---|---|---|
| amazon.nl | 54.239.33.0/24 | AS16509 | valid |
| ... | 54.239.32.0/21 | AS16509 | valid |
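
For readers unfamiliar with the "Valid" state shown above, here is a minimal sketch of route origin validation in the style of RFC 6811. It is not the Internet.nl implementation; the `Roa` type, the function name, and the ROA values used in the example are assumptions made up for illustration, and only the prefixes and AS16509 come from the report above.

```python
import ipaddress
from typing import Iterable, NamedTuple


class Roa(NamedTuple):
    """Hypothetical minimal ROA record: prefix, maximum length, origin ASN."""
    prefix: str
    max_length: int
    asn: int


def origin_validation_state(route_prefix: str, origin_asn: int, roas: Iterable[Roa]) -> str:
    """Return 'valid', 'invalid' or 'not-found' for one announced route."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # A ROA covers the route when the route prefix lies inside the ROA prefix.
        if roa_net.version != route.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # A covering ROA matches when the ASN agrees and the route is not
        # more specific than the ROA's maximum length. AS0 never matches.
        if roa.asn != 0 and roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    return "invalid" if covered else "not-found"


# Illustrative ROA only, not taken from any real RPKI repository.
example_roas = [Roa("54.239.32.0/21", 24, 16509)]
state = origin_validation_state("54.239.33.0/24", 16509, example_roas)
```

The point for this issue is that such a check is done per IP address and its covering routes, so any address that never reaches the check is silently left out of the verdict.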
@bwbroersma bwbroersma added this to the backlog milestone Dec 11, 2024
@bwbroersma
Collaborator Author

The RPKI code has the correct loop:

for host in hostset:
    for ip in host.routing:

However, at least here:

web_registered = check_registry("web_rpki", web_callback, shared.resolve_a_aaaa)
batch_web_registered = check_registry("batch_web_rpki", batch_web_callback, shared.batch_resolve_a_aaaa)

Both via this code:

@shared_task(
    bind=True,
    soft_time_limit=settings.SHARED_TASK_SOFT_TIME_LIMIT_HIGH,
    time_limit=settings.SHARED_TASK_TIME_LIMIT_HIGH,
    base=SetupUnboundContext,
)
def resolve_a_aaaa(self, qname, *args, **kwargs):
    return do_resolve_a_aaaa(self, qname, *args, **kwargs)


@batch_shared_task(
    bind=True,
    soft_time_limit=settings.BATCH_SHARED_TASK_SOFT_TIME_LIMIT_HIGH,
    time_limit=settings.BATCH_SHARED_TASK_TIME_LIMIT_HIGH,
    base=SetupUnboundContext,
)
def batch_resolve_a_aaaa(self, qname, *args, **kwargs):
    return do_resolve_a_aaaa(self, qname, *args, **kwargs)

Call this function:

def do_resolve_a_aaaa(self, qname, *args, **kwargs):
    """Resolve A and AAAA records and return a single result for each type."""

So one fix may be to replace this with code that returns the complete set of addresses.
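
A minimal sketch of what "returns the complete set" could look like. This is not the project's code: `resolve_all_a_aaaa` and the `lookup` callable are hypothetical names, and `lookup` stands in for whatever resolver (here Unbound) is actually used. The sample addresses are the ones from the amazon.nl example above.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical resolver interface: (qname, rrtype) -> iterable of address strings.
Lookup = Callable[[str, str], Iterable[str]]


def resolve_all_a_aaaa(qname: str, lookup: Lookup) -> Tuple[List[str], List[str]]:
    """Return every A and every AAAA address for qname instead of one per type."""

    def unique(rrtype: str) -> List[str]:
        # dict.fromkeys drops duplicates while keeping the answer order.
        return list(dict.fromkeys(lookup(qname, rrtype)))

    return unique("A"), unique("AAAA")


def fake_lookup(qname: str, rrtype: str) -> List[str]:
    """Stand-in resolver so the sketch runs without network access."""
    records = {
        ("amazon.nl", "A"): ["52.95.120.64", "52.95.116.117", "54.239.33.93"],
    }
    return records.get((qname, rrtype), [])


ipv4, ipv6 = resolve_all_a_aaaa("amazon.nl", fake_lookup)
```

With a resolver shaped like this, the existing `for ip in host.routing` loop would see all three IPv4 addresses rather than only one.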

Another problem seems to be the use of do_resolve_a_aaaa in do_resolve_ns_ips for NS that have multiple A or AAAA records:

yield (rr, do_resolve_a_aaaa(self, rr))
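
The same idea applied to nameservers could be sketched as follows. The names `resolve_ns_ips_all` and `resolve_all` are hypothetical, and yielding one `(nameserver, ip)` pair per address is an assumed design choice rather than the shape used by `do_resolve_ns_ips`. The addresses are documentation ranges.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical helper type: nameserver name -> (all IPv4 addresses, all IPv6 addresses).
ResolveAll = Callable[[str], Tuple[List[str], List[str]]]


def resolve_ns_ips_all(ns_names: Iterable[str], resolve_all: ResolveAll) -> Iterator[Tuple[str, str]]:
    """Yield one (nameserver, ip) pair for every address of every nameserver."""
    for ns in ns_names:
        ipv4, ipv6 = resolve_all(ns)
        for ip in [*ipv4, *ipv6]:
            yield ns, ip


def fake_resolve_all(ns: str) -> Tuple[List[str], List[str]]:
    """Stand-in resolver so the sketch runs without network access."""
    data = {
        "ns1.example.org": (["192.0.2.1", "192.0.2.2"], ["2001:db8::1"]),
        "ns2.example.org": (["198.51.100.1"], []),
    }
    return data.get(ns, ([], []))


pairs = list(resolve_ns_ips_all(["ns1.example.org", "ns2.example.org"], fake_resolve_all))
```

Flattening to pairs keeps the consumer simple: every address of a nameserver with multiple A or AAAA records reaches the RPKI check.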

@bwbroersma bwbroersma added the bug Unexpected or unwanted behaviour of current implementations label Dec 11, 2024
@bwbroersma bwbroersma self-assigned this Dec 11, 2024
bwbroersma added a commit to bwbroersma/Internet.nl that referenced this issue Dec 12, 2024
@bwbroersma bwbroersma mentioned this issue Dec 12, 2024
6 tasks
@baknu
Contributor

baknu commented Dec 12, 2024

This seems to be a bug. The intent of the RPKI subtest was to check all IPs, also those of the servers:

Odd to discover this only now. Do we want to have this fixed in 1.9 (because from that release on RPKI will have score impact)?

@bwbroersma
Collaborator Author

Some useful scripting that uses the https://tranco-list.eu/ list to find examples:

curl -sSfLA '' "https://tranco-list.eu/download_daily/$(curl 'https://tranco-list.eu/latest_list' -sSfA '' -o /dev/null -w "%{redirect_url}\n" | cut -d/ -f5)" | bsdtar -Oxf- | cut -d, -f2 | tr -d '\r' > top-1m.list
head -n1000 top-1m.list | sed 's/$/ NS/g' | xargs dig +noall +answer > top1000ns
awk '{print $5}' top1000ns | sed 's/\.$/ A/g' | sort -u | xargs dig +noall +answer | awk '{l[$1]=l[$1]$5" ";n[$1]++}END{for(d in l){if(n[d]>1)print n[d], d, l[d]}}' | sort -nr > top1000ns-multi-ipv4

E.g. cloudflare.com has both 2 IPv4 and IPv6 addresses for the website and all nameservers. While the multiple IP addresses are displayed in the IPv6 nameserver section, they are missing in the ROA existence check.

mxsasha pushed a commit to bwbroersma/Internet.nl that referenced this issue Jan 7, 2025
@bwbroersma
Collaborator Author

bwbroersma commented Jan 8, 2025

head -n1000 top-1m.list | sed 's/$/ MX/g' | xargs dig +noall +answer > top1000mx
awk '{print $6}' top1000mx | sed 's/\.$/ A/g' | sort -u | xargs dig +noall +answer | awk '{l[$1]=l[$1]$5" ";n[$1]++}END{for(d in l){if(n[d]>1)print n[d], d, l[d]}}' | sort -nr > top1000mx-multi-ipv4

Regarding multiple IPv4s per MX, this happens a lot, e.g.:

  • Some hosted Cisco Secure Email (formerly IronPort) setups have high IPv4 counts, e.g. intuit.com (or mailchimp.com) has two MX records, both with 12 IPv4s
  • yahoo.com has multiple MX records with 8 IPv4s each
  • fastmail.com has two MX records, one with 8 IPv4s and one with 2 IPv4s
  • icloud.com (or me.com) has two MX records, both with 6 IPv4s
  • mimecast customers (e.g. forbes.com, hp.com, zendesk.com, redhat.com or cambridge.org) have two MX records, both with 6 IPv4s
  • google.com (or other Google users) has one MX record with 5 IPv4s
  • aws.com (or other AWS users) has one MX record with 5 IPv4s
  • microsoft.com (or any other *.mail.protection.outlook.com user) has one MX record with 4 IPv4s
  • linkedin.com 3
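
The grouping step of the awk one-liners above can also be sketched in Python, which may be easier to extend. The function name is hypothetical and the sample lines use documentation addresses, not real measurements. Unlike the awk version, this sketch only counts A and AAAA answer lines, so CNAME lines in the `dig` output are skipped.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def multi_address_hosts(dig_answer_lines: Iterable[str]) -> List[Tuple[int, str, List[str]]]:
    """Group `dig +noall +answer` lines by owner name and keep names with >1 address."""
    addresses: Dict[str, List[str]] = defaultdict(list)
    for line in dig_answer_lines:
        fields = line.split()
        # Expected answer layout: name, TTL, class, type, rdata.
        if len(fields) >= 5 and fields[3] in ("A", "AAAA"):
            addresses[fields[0]].append(fields[4])
    multi = [(len(ips), host, ips) for host, ips in addresses.items() if len(ips) > 1]
    # Highest address count first, like `sort -nr` in the shell version.
    return sorted(multi, key=lambda entry: entry[0], reverse=True)


sample_lines = [
    "mx1.example.org. 300 IN A 192.0.2.1",
    "mx1.example.org. 300 IN A 192.0.2.2",
    "mx2.example.org. 300 IN A 198.51.100.1",
    "alias.example.org. 300 IN CNAME mx2.example.org.",
]
result = multi_address_hosts(sample_lines)
```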
