first shot for pipelining support #503

Merged
merged 69 commits into from
Jan 5, 2024

dmachard
Owner

dmachard commented Dec 10, 2023

@johnhtodd

This PR is a draft to validate the concept of pipelines in configuration.

  • introduce pipeline mode
  • new dnsmessage collector
  • add regex support with flexible matching mode
  • new atags transform to validate the concept
  • support list in matching mode
  • support regex from external source (local file, remote one)
  • modify routes definition to be more generic
  • support routing on loggers and collector in a generic way
  • check pipeline config
  • the ability to delete client IP addresses entirely
  • the ability to bound the otherwise unlimited memory consumption of the Prometheus logger

This codebase is the minimal foundation on which future versions will be built.

New model of config:

pipelines:
  - name: main-input
    dnstap:
      listen-ip: 0.0.0.0
      listen-port: 6000
    routing-policy:
      default: [ filter ]

  - name: filter
    dnsmessage:
      matching:
        include:
          dns.qname: "^.*\\.google\\.com$"
    transforms:
      atags:
        tags: [ "google"]
    routing-policy:
      dropped: [ outputfile ]
      default: [ console ]

  - name: console
    stdout:
      mode: text

  - name: outputfile
    logfile:
      file-path:  "/tmp/dnstap.log"
      max-size: 1000
      max-files: 10
      mode: flat-json

dmachard marked this pull request as draft on December 10, 2023 17:05
@johnhtodd

This seems to be working with no problems - I've put a few tens of millions of queries through it so far. Testing has not been exhaustive, but other than the Prometheus bug (unrelated to this pipeline support) I'd say it's going well.

@johnhtodd

johnhtodd commented Dec 15, 2023

Well, perhaps I was a bit premature in my comment. I tried adding a dnstap output, and that doesn't seem to quite work:

  - name: godnsconnector-next
    dnstap:
      remote-address: 127.0.0.1
      remote-port: 59311

  - name: apple-txt
    dnsmessage:
      matching:
        include:
          dns.qtype: "TXT"
          dns.qname: "^*.apple.com$"
      policy: "drop-unmatched"
    transforms:
      atags:
        tags: [ "TXT:apple" ]
    routes: [ outputfile,godnsconnector-next ]

Adding those lines causes this error:

. . . 
INFO: 2023/12/15 22:26:27.808298 [tag-queries] collector=dnsmessage - enabled
panic: main - routing error: stage godnscollector-next doest not exist

goroutine 1 [running]:
main.InitPipelines(0xc000436840?, 0x13ba7b6?, 0xc000415300, 0x0?)
	/home/jtodd/go-dnscollector/go-dnscollector-pipeline-branch/go-dnscollector/pipeline.go:156 +0x1474
main.main()
	/home/jtodd/go-dnscollector/go-dnscollector-pipeline-branch/go-dnscollector/dnscollector.go:114 +0x508
root@ub20template:/home/jtodd/go-dnscollector/go-dnscollector-pipeline-branch/go-dnscollector#

But, if I simply delete the "godnscollector-next" item from the routes: line, it seems to not blow up but of course I can't route any packets to that pipeline.

. . . 
ERROR: 2023/12/15 22:49:30.187704 [godnsconnector-next] logger=dnstap - dial tcp 127.0.0.1:59311: connect: connection refused
INFO: 2023/12/15 22:49:30.187761 [godnsconnector-next] logger=dnstap - retry to connect in 10 seconds
INFO: 2023/12/15 22:49:32.392661 [dnsdist-from-outside] collector=dnstap#1 - new connection from 41.63.13.231:50174
INFO: 2023/12/15 22:49:32.392719 [dnsdist-from-outside] processor=dnstap#1 - initialization...
INFO: 2023/12/15 22:49:32.393867 [dnsdist-from-outside] processor=dnstap#1 - waiting dns message to process...
INFO: 2023/12/15 22:49:32.395809 [dnsdist-from-outside] collector=dnstap#1 - receiver framestream initialized
INFO: 2023/12/15 22:49:40.188249 [godnsconnector-next] logger=dnstap - connecting to tcp://127.0.0.1:59311
ERROR: 2023/12/15 22:49:40.192153 [godnsconnector-next] logger=dnstap - dial tcp 127.0.0.1:59311: connect: connection refused
INFO: 2023/12/15 22:49:40.192305 [godnsconnector-next] logger=dnstap - retry to connect in 10 seconds

@dmachard
Owner Author

Well, perhaps I was a bit premature in my comment. I tried adding a dnstap output, and that doesn't seem to quite work:

The panic should only occur during routing initialization: if a route refers to a stage that is missing from the configuration, the application stops.
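
The startup check described here can be sketched as follows. This is an assumption-laden illustration, not the code in `pipeline.go`: the function `checkRoutes`, the map layout, and the error wording are all hypothetical. It shows how a one-letter difference between a stage name (`godnsconnector-next`) and a route target (`godnscollector-next`) is enough to fail initialization.

```go
package main

import (
	"fmt"
	"sort"
)

// checkRoutes is a hypothetical validator. The map key is a stage name and
// the value is the list of stage names it routes to. It returns an error for
// the first route target (in sorted stage order) that is not a defined stage.
func checkRoutes(routes map[string][]string) error {
	names := make([]string, 0, len(routes))
	for name := range routes {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order, since map iteration is random

	for _, name := range names {
		for _, target := range routes[name] {
			if _, ok := routes[target]; !ok {
				return fmt.Errorf("routing error: stage %q (referenced by %q) is not defined", target, name)
			}
		}
	}
	return nil
}

func main() {
	routes := map[string][]string{
		"apple-txt":           {"outputfile", "godnscollector-next"}, // target name has a typo
		"outputfile":          nil,
		"godnsconnector-next": nil, // the stage that was actually defined
	}
	if err := checkRoutes(routes); err != nil {
		fmt.Println(err)
	}
}
```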

After copying and pasting your configuration on my end, it's working fine without any panics. Can you confirm if you have this log on startup?

INFO: 2023/12/17 10:48:47.415533 [godnsconnector-next] logger=dnstap - running in background...

I will add more log messages to show the routing initialization process.

@johnhtodd

Thanks - this was a typo on my part during testing. Works as expected!

@dmachard
Owner Author

dmachard commented Dec 17, 2023

The latest commit improves the YAML model: it adds include/exclude sections and generic external file support (it works on all keys).

  - name: apple-txt
    dnsmessage:
      matching:
        include:
          dns.opcode: 0
          dns.length:
            greater-than: 50
          dns.qname:
            file-list: "./testsdata/filtering_keep_domains_regex.txt"
            file-kind: "domain_list"
        exclude:
          dns.qtype: [ "TXT", "MX" ]
          dns.qname:
            - ".*\\.github\\.com$"
            - "^www\\.google\\.com$"
      policy: "drop-unmatched"
    transforms:
      atags:
        tags: [ "TXT:apple", "TXT:google" ]
    routes: [ outputfile, console ]

For me, the concept of pipelines is validated, with very flexible and powerful matching (load testing will still be needed to check CPU and memory usage).

To-do list to finalize the PoC:

  • matching does not yet work when the value is an array

@dmachard
Owner Author

dmachard commented Dec 18, 2023

The new syntax to use:

  - name: apple-txt
    dnsmessage:
      matching:
        include:
          #dns.flags.qr: false
          dns.opcode: 0
          dns.length:
            greater-than: 50
          dns.qname:
            match-source: "file://./testsdata/filtering_keep_domains_regex.txt"
            source-kind: "regexp_list"
          dnstap.operation:
            match-source: "http://127.0.0.1/operation.txt"
            source-kind: "string_list"
        exclude:
          dns.qtype: [ "TXT", "MX" ]
          dns.qname:
            - ".*\\.github\\.com$"
            - "^www\\.google\\.com$"
      drop-policy: "unmatched"
    transforms:
      atags:
        tags: [ "TXT:apple", "TXT:google" ]
    routes: [ outputfile, console ]
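
The `match-source` values above carry their origin in a URL-style prefix (`file://` for a local file, `http://` for a remote one). A minimal sketch of how such a value could be split, assuming a hypothetical helper `splitMatchSource`; the `https://` case is an assumption added for illustration and does not appear in this PR, and the collector's real parsing may differ:

```go
package main

import (
	"fmt"
	"strings"
)

// splitMatchSource is a hypothetical helper. For "file://" sources it returns
// kind "file" and the local path; for HTTP(S) sources it returns kind "http"
// and the unchanged URL.
func splitMatchSource(src string) (kind, location string, err error) {
	switch {
	case strings.HasPrefix(src, "file://"):
		return "file", strings.TrimPrefix(src, "file://"), nil
	case strings.HasPrefix(src, "http://"), strings.HasPrefix(src, "https://"):
		// https:// is an assumption; only http:// is shown in the example config.
		return "http", src, nil
	default:
		return "", "", fmt.Errorf("unsupported match-source %q", src)
	}
}

func main() {
	for _, src := range []string{
		"file://./testsdata/filtering_keep_domains_regex.txt",
		"http://127.0.0.1/operation.txt",
	} {
		kind, location, err := splitMatchSource(src)
		fmt.Println(kind, location, err)
	}
}
```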

@dmachard
Owner Author

dmachard commented Jan 3, 2024

/!\ New syntax (#521) for a better routing definition between stanzas.

The default key configures where to send all matched (accepted) DNS messages.
The dropped key can be used to send unmatched messages to another stanza in the config.

    routing-policy:
      dropped: [ outputfile ]
      default: [ console ]
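
The two keys can be read as a simple dispatch rule. The sketch below is illustrative only: the type `RoutingPolicy` and the method `nextStages` are hypothetical names, not the collector's actual types.

```go
package main

import "fmt"

// RoutingPolicy is a hypothetical mirror of the routing-policy stanza.
type RoutingPolicy struct {
	Default []string // stages receiving matched (accepted) messages
	Dropped []string // stages receiving unmatched messages
}

// nextStages returns the stages a message should be forwarded to,
// depending on whether it matched the stage's rules.
func (p RoutingPolicy) nextStages(matched bool) []string {
	if matched {
		return p.Default
	}
	return p.Dropped
}

func main() {
	policy := RoutingPolicy{
		Default: []string{"console"},
		Dropped: []string{"outputfile"},
	}
	fmt.Println(policy.nextStages(true))  // matched message
	fmt.Println(policy.nextStages(false)) // unmatched message
}
```

With the example policy, a matched message is forwarded to `console` and an unmatched one to `outputfile`.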

dmachard marked this pull request as ready for review on January 5, 2024 19:57
dmachard changed the title from "MVP - first shot for pipelling support" to "first shot for pipelling support" on Jan 5, 2024
dmachard merged commit b2086dd into main on Jan 5, 2024
68 checks passed
dmachard deleted the pipeline_mode branch on January 6, 2024 07:51
dmachard changed the title from "first shot for pipelling support" to "first shot for pipelining support" on Jan 6, 2024