feat(inputs.tacacs): Add tacacs plugin for simple tacacs auth response time monitoring #12747

Merged
merged 20 commits into from
Aug 7, 2023

Conversation

Hr0bar
Copy link
Contributor

@Hr0bar Hr0bar commented Feb 27, 2023

Opening this PR after the discussion in issue #12573, and splitting the tacacs plugin into its own PR as discussed in #12736.

@telegraf-tiger telegraf-tiger bot added feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin plugin/input 1. Request for new input plugins 2. Issues/PRs that are related to input plugins labels Feb 27, 2023
Copy link
Member

@srebhan srebhan left a comment

Thanks for contributing this plugin @Hr0bar! I have some comments, mostly the same as for the radius one. Can you please also add a unit-test?!

plugins/inputs/tacacs/sample.conf Outdated Show resolved Hide resolved
plugins/inputs/tacacs/sample.conf Show resolved Hide resolved
plugins/inputs/tacacs/sample.conf Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
@srebhan srebhan self-assigned this Mar 7, 2023
@Hr0bar
Copy link
Contributor Author

Hr0bar commented Mar 13, 2023

I'll try to look at this over the coming weeks; most of what was done for the radius plugin will likely apply here as well. It helps that the tacacs Go package seems to include a basic test server implementation for unit tests, and https://hub.docker.com/r/dchidell/docker-tacacs looks like it may be usable for integration tests.

@powersj
Copy link
Contributor

powersj commented Apr 4, 2023

Hi @Hr0bar,

Any further progress on this PR recently?

Thanks!

@powersj powersj added the waiting for response waiting for response from contributor label Apr 4, 2023
@Hr0bar
Copy link
Contributor Author

Hr0bar commented Apr 4, 2023 via email

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Apr 4, 2023
@powersj
Copy link
Contributor

powersj commented Apr 4, 2023

Not yet, I shall have more time after April 22

Thanks

getting married, so not much free time because of preparations

Congrats! I know how busy that time can be :)

edit: I am going to mark this as a draft until you do another push and ping us to review again.

@powersj powersj marked this pull request as draft April 4, 2023 17:01
plugins/inputs/tacacs/README.md Outdated Show resolved Hide resolved
plugins/inputs/tacacs/README.md Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Show resolved Hide resolved
@Hr0bar
Copy link
Contributor Author

Hr0bar commented May 2, 2023

Applied most of the comments. However:

I tried using the returned status code, but it was not useful because its meaning depends on the current stage of the tacacs auth sequence. On the initial start packet, return code X may mean something completely different than return code X for the send-username packet, and X may mean something else again on the send-password packet, etc. So without the secondary info about the stage at which the status code was returned, it's useless.
In the radius plugin, the status code always comes from the same place: there is just one request to do the auth, not multiple, so it is nicely descriptive with an exact string representation of the status. That is not the case here; you get an integer code that may mean something different at every auth stage.

Anyway, the plugin is meant to monitor the response time of SUCCESSFUL authentications, so I decided to simply return an error in all other cases. Let me know if that makes sense.
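
A minimal sketch of why a status value is only interpretable together with its stage, assuming the numeric authentication status values listed in RFC 8907 and a login flow in which the START packet carries no username. The names `stage`, `expected`, and `checkReply` are hypothetical; they are not part of the plugin or of the TACACS+ client package it uses.

```go
package main

import "fmt"

// Authentication reply status values as listed in RFC 8907.
// The constant names are illustrative only.
const (
	statusPass    uint8 = 0x01
	statusFail    uint8 = 0x02
	statusGetData uint8 = 0x03
	statusGetUser uint8 = 0x04
	statusGetPass uint8 = 0x05
	statusRestart uint8 = 0x06
	statusError   uint8 = 0x07
	statusFollow  uint8 = 0x21
)

// stage identifies which packet of the login exchange was just sent.
type stage int

const (
	stageStart stage = iota
	stageUsername
	stagePassword
)

// expected maps each stage to the status a successful login should see next
// (assumption: the START packet is sent without a username).
var expected = map[stage]uint8{
	stageStart:    statusGetUser,
	stageUsername: statusGetPass,
	stagePassword: statusPass,
}

// checkReply returns an error naming the stage when the status is not the
// one expected there, so a bare integer is never reported without context.
func checkReply(s stage, status uint8) error {
	if want := expected[s]; status != want {
		return fmt.Errorf("stage %d: got status 0x%02x, want 0x%02x", s, status, want)
	}
	return nil
}

func main() {
	// GETPASS is the expected reply after the username but not after the password.
	fmt.Println(checkReply(stageUsername, statusGetPass))
	fmt.Println(checkReply(stagePassword, statusGetPass))
}
```

The same status value passes at one stage and fails at another, which is why the stage has to travel with the code if it is reported at all.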

Deleted responsetime, kept only responsetime_ms, same as in the radius plugin.

"Maybe also check for other empty required fields?" - mandatory fields are uncommented in the sample conf. If not set, I think these should catch it:

	username, err := t.Username.Get()
	if err != nil {
		return fmt.Errorf("getting username failed: %v", err)
	}

The other non mandatory are now defaults:
return &Tacacs{RemAddr: "127.0.0.1", ResponseTimeout: config.Duration(time.Second * 5)}

	if len(t.Servers) == 0 {
		t.Servers = []string{"127.0.0.1:49"}
	}

I could not move the client initialization to Init(): the client itself contains the secret, so either the secret would be stored in memory unnecessarily without defer releasesecret, or it gets cleaned from memory and the secret is then broken for connections. Maybe there is a solution, but it seems like an unnecessary struggle (I tried once; it can definitely be avoided, I just did not spend time on it. Let me know if it's a must to have it in Init).

@Hr0bar
Copy link
Contributor Author

Hr0bar commented May 2, 2023

BTW the "Git repo dirty. Please run "make docs" and push the updated README. Failing CI." in "ci/circleci: test-go-linux" is likely caused by me changing README.MD -> README.md filename in GIT some time ago in this PR. Ran make docs in the past after that, but still pops up

@Hr0bar Hr0bar marked this pull request as ready for review May 2, 2023 20:31
plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Show resolved Hide resolved
@srebhan
Copy link
Member

srebhan commented May 9, 2023

Hey @Hr0bar,

regarding

BTW the "Git repo dirty. Please run "make docs" and push the updated README. Failing CI." in "ci/circleci: test-go-linux" is likely caused by me changing README.MD -> README.md filename in GIT some time ago in this PR. Ran make docs in the past after that, but still pops up

running make docs should solve the issue. It's likely because the sample.conf content doesn't match the content of the Configuration section in README.md...

plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Show resolved Hide resolved
Copy link
Member

@srebhan srebhan left a comment

@Hr0bar thanks for the update. Just a few minor comments...

plugins/inputs/tacacs/README.md Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs_test.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Show resolved Hide resolved
@Hr0bar
Copy link
Contributor Author

Hr0bar commented Jun 1, 2023 via email

@srebhan
Copy link
Member

srebhan commented Jun 2, 2023

Ok. It's a pity, as we will likely miss v1.27.0, which is planned for June 12th... :-(

@srebhan
Copy link
Member

srebhan commented Jul 4, 2023

@Hr0bar any chance you can work on this PR?

@Hipska Hipska added the waiting for response waiting for response from contributor label Jul 4, 2023
@Hr0bar
Copy link
Contributor Author

Hr0bar commented Jul 10, 2023

I added all the changes, BUT the one about returning the response status on failures likely needs more thought. I added the response_code tag and now return the timeout value as the response time, same as in the other plugin. That works for a wrong password, wrong username, etc., but it doesn't cover, for example, a wrong tacacs secret. In that case client.SendAuthenStart returns a hard error and no response code is returned. I think it should be consistent for the user, so, as I already commented on that suggestion, what about making the response codes completely custom and human-readable for Telegraf purposes, instead of the hard errors there were previously?

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jul 10, 2023
@Hr0bar
Copy link
Contributor Author

Hr0bar commented Jul 10, 2023

So either:

  1. Restore parts of the code from 4387d9f (before today) for cases where a wrong password etc. is used, returning a hard error on everything that is not a success.
  2. Keep the current odd mix where we return the response code in a tag, but it doesn't apply to a wrong secret and possibly other cases, because there tacacs servers don't return any response and there is a hard error on the connection instead.
  3. Implement custom response codes that give the user all the info (including the real response codes if the tacacs server provides them) and also cover a wrong tacacs secret and other cases. Use the timeout value as the response time for those failures (same as in the other plugin).

@srebhan srebhan closed this Jul 12, 2023
@srebhan srebhan reopened this Jul 12, 2023
Copy link
Member

@srebhan srebhan left a comment

Thanks @Hr0bar for the update! The code looks good to me, and the current response-code strategy also seems good. IMO we should error out if sending reports an error. If sending succeeds but the server reports some unexpected state, we pass that information on to the user...

The only comment I have is that currently we do not report the (context/connection) timeout to the user. I think you should make an exception there when checking the error codes and also generate a metric for this case. This way an operator can see that Telegraf cannot reach the server and check what is going on...

plugins/inputs/tacacs/sample.conf Outdated Show resolved Hide resolved
@Hipska Hipska added the waiting for response waiting for response from contributor label Jul 12, 2023
@Hipska Hipska removed the waiting for response waiting for response from contributor label Jul 13, 2023
plugins/inputs/tacacs/tacacs_test.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs_test.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs_test.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs_test.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/README.md Outdated Show resolved Hide resolved
plugins/inputs/tacacs/README.md Outdated Show resolved Hide resolved
plugins/inputs/tacacs/README.md Outdated Show resolved Hide resolved
@Hr0bar
Copy link
Contributor Author

Hr0bar commented Jul 25, 2023

Let me know if the tests are readable enough and up to Telegraf standards, or if additional refactoring is needed.

Copy link
Contributor

@Hipska Hipska left a comment

Yup, looks good. Just a few minor things..

plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs.go Outdated Show resolved Hide resolved
plugins/inputs/tacacs/tacacs_test.go Outdated Show resolved Hide resolved
@Hipska Hipska added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Aug 1, 2023
Copy link
Member

@srebhan srebhan left a comment

Looks good to me. Thanks for contributing this plugin @Hr0bar!

@srebhan srebhan assigned powersj and unassigned srebhan Aug 1, 2023
Copy link
Contributor

@powersj powersj left a comment

One question inline to help me understand the reported value for response time.

	if !errors.Is(err, context.DeadlineExceeded) {
		return fmt.Errorf("error on tacacs authentication continue username request to %s : %w", client.Addr, err)
	}
	fields["responsetime_ms"] = time.Duration(t.ResponseTimeout).Milliseconds()
Copy link
Contributor

I realize this code path means there was a timeout, but why is the response time set to the timeout value and not the real time?

Same question below, where it would be even more important to see how long the cumulative request time took.

Copy link
Contributor Author

I think that is a valid point, especially for users who set the timeout to zero. In other cases, it will be close to the configured timeout value anyway. With a zero timeout configured, the request can theoretically also time out from sources other than the configured context deadline (from some network functions called further down the line). Maybe in addition to the context.DeadlineExceeded error we should also allow errors.Is(err, os.ErrDeadlineExceeded) for these cases.

Copy link
Contributor Author

@Hr0bar Hr0bar Aug 7, 2023

While the tacacs package in use seems to pass the timeout correctly to the network calls deeper down, I could not say deterministically that it will always return only a context.DeadlineExceeded error on timeout.

It seemed os.IsTimeout() would be better here, as it returns true for both context.DeadlineExceeded and os.ErrDeadlineExceeded and possibly other timeout errors, but in testing it was too lenient/broad: it returned true even for immediate dial tcp or DNS errors etc., so I will keep
if !errors.Is(err, context.DeadlineExceeded) && !errors.Is(err, os.ErrDeadlineExceeded)

The response time reported on a timeout is now the real elapsed value.

@telegraf-tiger
Copy link
Contributor

telegraf-tiger bot commented Aug 7, 2023

Copy link
Contributor

@powersj powersj left a comment

Thank you for the effort and work on this!

@powersj powersj merged commit dae1158 into influxdata:master Aug 7, 2023
@github-actions github-actions bot added this to the v1.28.0 milestone Aug 7, 2023
Labels
feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin new plugin plugin/input 1. Request for new input plugins 2. Issues/PRs that are related to input plugins ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
4 participants