Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FRR: upgrade to upstream stable/8.1 branch #182

Merged
merged 4 commits into from
Nov 27, 2021
Merged

Conversation

c-po
Copy link
Member

@c-po c-po commented Aug 13, 2021

Change Summary

Upgrade from FRRouting 7.5.1 to 8.0

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes)
  • Migration from an old Vyatta component to vyos-1x, please link to related PR inside obsoleted component
  • Other (please describe): Component Upgrade

Related Task(s)

https://phabricator.vyos.net/T3753

Component(s) name

FRR

Proposed changes

  • Upgrade Package
  • Introduce new libyang2 dependency on the package mirror

How to test

  • make test
  • make testc

Checklist:

  • I have read the CONTRIBUTING document
  • I have linked this PR to one or more Phabricator Task(s)
  • My commit headlines contain a valid Task id
  • My change requires a change to the documentation
  • I have updated the documentation accordingly

@c-po c-po force-pushed the current branch 6 times, most recently from 9022ede to 1941149 Compare August 26, 2021 08:02
@c-po c-po force-pushed the frr-8-upgrade branch 2 times, most recently from 4281a66 to 5c85e83 Compare August 27, 2021 18:54
@c-po c-po marked this pull request as ready for review August 27, 2021 18:54
@c-po c-po force-pushed the frr-8-upgrade branch 2 times, most recently from 3eee15e to d67abe2 Compare October 17, 2021 07:51
@sempervictus
Copy link

Does 8.0 resolve the kernel crash introduced in 7.5? If not, we might actually want to go back to 7.2 - that thing is massively annoying and precludes production use when running routed VPN links having to rebuild routing tables on re-init.

@sworleys
Copy link

Does 8.0 resolve the kernel crash introduced in 7.5? If not, we might actually want to go back to 7.2 - that thing is massively annoying and precludes production use when running routed VPN links having to rebuild routing tables on re-init.

That crash is avoidable by disabling nexthop groups as stated in the issue. There is not much we can do FRR side, it's a kernel bug. I still haven't been able to hit it myself so if you have a easy reliable reproducer, please post it in the ticket.

@sempervictus
Copy link

@sworleys: Are you referring to the change of the zebra config file?
Any chance you might be able to point me to the relevant kernel bug? Do we know that its a bug in Linux itself and not a violation of the call semantics by userspace?
For a reproducer, create multiple routed IPSEC links, with several routes per VTI (static routes are fine), then start disrupting links at either side and watching them rebuild. Run in a loop, wait a day, should get a similar stack trace.

@sempervictus
Copy link

As a security person, will also note that libyang adds attack surface - they have a pretty gross amount (and quality) of CVEs against a relatively young and inconsequential lib in the grand scheme of systems-things. If this library is being used to process untrusted data (routing advertisements from peers), it opens up an avenue to compromise of the router infrastructure by peers on the net.

@sworleys
Copy link

sworleys commented Oct 22, 2021

@sworleys: Are you referring to the change of the zebra config file? Any chance you might be able to point me to the relevant kernel bug? Do we know that its a bug in Linux itself and not a violation of the call semantics by userspace? For a reproducer, create multiple routed IPSEC links, with several routes per VTI (static routes are fine), then start disrupting links at either side and watching them rebuild. Run in a loop, wait a day, should get a similar stack trace.

Yes that config line will disable FRR's use of kernel nexthop groups and therefore the bug will not be hit.

It is definitely a kernel bug, I am pretty intimately familiar with the uapi used by kernel nexthop groups and much of the code in the kernel itself.

Thanks for the reproducer, can you post it in the FRR issue itself with clear steps (iproute2 commands, configs, etc.) I don't want to annoy the people following this PR since it's not directly related.

@sworleys
Copy link

As a security person, will also note that libyang adds attack surface - they have a pretty gross amount (and quality) of CVEs against a relatively young and inconsequential lib in the grand scheme of systems-things. If this library is being used to process untrusted data (routing advertisements from peers), it opens up an avenue to compromise of the router infrastructure by peers on the net.

Libyang is only used in FRR to process configs/display operational data. It doesn't communicate over the wire AFAIK, not sure where you are getting that from or why someone would do that.

@sempervictus
Copy link

@sworleys: I am "getting that" from the state of industry: we live in an age of unrestricted informational warfare. Every system online is being attacked, and must be built to withstand attack. I specifically stated "If this library is being used to process untrusted data" ... so if it is not being used for such function, then the vectors of access to it are limited to only trusted inputs which provides for standoff in the security posture relative to any potential (0day) or existing vulnerability.
Every component added to a system adds some (possibly negative value) level of potential attack surface for compromise or otherwise alters the quotient of security posture. If things are changed without review and annotation of the security implications of said change, then the current security posture of the system is unknown relative to the last full posture assessment.
This can be automated to a large degree by having build-products booted, scanned by Nessus/Nexpose/OpenVAS (GSA now i think), and have package sources passed through a SonarQube instance for additional awareness past the published CVEs - the "bad guys" are doing this, and if the VyOS team is not, then they are at an intelligence disadvantage on their home turf relative to the adversary.

@sworleys
Copy link

@sworleys: I am "getting that" from the state of industry: we live in an age of unrestricted informational warfare. Every system online is being attacked, and must be built to withstand attack. I specifically stated "If this library is being used to process untrusted data" ... so if it is not being used for such function, then the vectors of access to it are limited to only trusted inputs which provides for standoff in the security posture relative to any potential (0day) or existing vulnerability. Every component added to a system adds some (possibly negative value) level of potential attack surface for compromise or otherwise alters the quotient of security posture. If things are changed without review and annotation of the security implications of said change, then the current security posture of the system is unknown relative to the last full posture assessment. This can be automated to a large degree by having build-products booted, scanned by Nessus/Nexpose/OpenVAS (GSA now i think), and have package sources passed through a SonarQube instance for additional awareness past the published CVEs - the "bad guys" are doing this, and if the VyOS team is not, then they are at an intelligence disadvantage on their home turf relative to the adversary.

Hopefully you aren't insinuating this, but if you are, I think it's pretty shortsighted of you to think both the FRR team and vyos team don't take security seriously. We have 100s of years between us of writing enterprise grade software and are well aware of how to manage potentially unsafe libraries.

FRR is in a few of 3rd party security scans using fuzzers and Static Analyzers that often find and report bugs before ever being released.

I don't think anyone would complain if you were to setup more automated scans and helped us track down/fix security bugs before they are in the wild. It would be awesome actually. Patches are always welcome.

@sempervictus
Copy link

No insult intended (apologies if any was internalized), this what i do for a living/life - offensive security architect (person who plans the red team effort and executes as part of/leading it) and since sane humans dont tend to think security first, i often find that nobody else will call these things out as they should be. Just because projects have been using unsafe practices, does not mean that its OK. FOSS is endemic in this pattern - not blaming people, trying to fix process.

I would be glad to set up a CI/CD pipeline, but VyOS is a commercial effort which generates revenue from selling support; and i run a commercial consultancy (a couple). Happy to work out a proper scope of work, cost analysis, and even host the entire thing at some small flat fee in our private clouds (we dont charge the egress/ingress/access fees, we do meter it, but consider those practices predatory) in a B2B context. I'm sure that paying customers would appreciate having vulnerability and code quality reports about their infrastructure.
Going back to the hardening PR - i've done a fair deal of work to try and help the project already in this department, and the lack of reciprocal effort does not put me in a mindset to drop another large stack of hours down an otherwise seemingly empty well when there's money to be made from enterprise customers who are often mandated to hold stronger posture (kicking and screaming, but they do commission the work). If the team takes up the hardening PR and gets that into mainline (with the commensurate testing, qualification, and internalization of the functions involved), i'll do it for free and even host it at-cost (pretty much free when you run your own clouds) - more than willing to trade efforts if it makes security a first-order concern in future development for VyOS.

@c-po c-po changed the title FRR: upgrade to upstream stable/8.0 branch FRR: upgrade to upstream stable/8.1 branch Nov 27, 2021
@c-po c-po merged commit e11dbc7 into vyos:current Nov 27, 2021
@c-po c-po deleted the frr-8-upgrade branch November 28, 2021 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants