Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass the logger channel through as a prefix #20109

Closed
wants to merge 1 commit into from

Conversation

eileenmcnaughton
Copy link
Contributor

@eileenmcnaughton eileenmcnaughton commented Apr 21, 2021

Overview

This builds on the recently added PR #20079
and changes the default behaviour such that if a channel is used, without any other intervention,
it will result in a change to the file name to reflect the channel

Channels are a new feature so this won't affect existing sites but
I think adding the prefix is a thing that people using different channels might
reasonably expect as a helpful default. I kept it such that we ignore anything
beyond a narrow range of characters - we could throw an exception
rather than silently ignore if we prefer

Before

Instantiating a logger with a channel logs to the same file as everything else, if a service is not defined

After

Instantiating a logger with a channel logs to a file prefixed by the channel name , if a service is not defined.

If the service uses letters that are no alphanumeric or '-' or '_' it will be silently ignored for security

Technical Details

@totten I think this adds some incentive to start updating existing calls to have a channel as it will separate them out - there are some existing payment processors that already use a prefix (bypassing the Civi::log() )

Comments

This builds on the recently added PR civicrm#20079
and changes the default behaviour such that if a channel is used, without any other intervention,
it will result in a change to the file name to reflect the channel

Channels are a new feature so this won't affect existing sites but
I think adding the prefix is a thing that people using different channels might
reasonably expect as a helpful default. I kept it such that we ignore anything
beyond a narrow range of characters - we could throw an exception
rather than silently ignore if we prefer
@civibot
Copy link

civibot bot commented Apr 21, 2021

(Standard links)

@civibot civibot bot added the master label Apr 21, 2021
@eileenmcnaughton eileenmcnaughton changed the title Err log Pass the logger channel through as a prefix Apr 21, 2021
@eileenmcnaughton
Copy link
Contributor Author

@totten what are your thoughts on this? If we do merge it I think it would be good to merge to the same release that introduced channels

@@ -39,6 +39,9 @@ public function getLog($channel = 'default') {
$c = \Civi::container();
$svc = "log." . $channel;
$this->channels[$channel] = $c->has($svc) ? $c->get($svc) : $c->get(self::DEFAULT_LOGGER);
if ($channel !== 'default' && method_exists($this->channels[$channel], 'setChannel')) {
$this->channels[$channel]->setChannel($channel);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eileenmcnaughton - OK, I'm personally happy with a policy of splitting channels into separate files by default.

From an r-tech POV, I suspect that some sysadmins/upgraders might prefer an option to keep the unified log (e.g. if they've already linked that into their log-aggregation system). Would be curious if we had some feedback from folks with sophisticated deployments (e.g. @MegaphoneJon @adixon @mlutfy).

Maybe a setting where you could opt-out/opt-in to split-files. Ex:

  // Default - Log every channel to separate file: "log_file_channels" == "/.*/"
  // Alternative - Don't log anything to a separate file: "log_file_channels" == "/^$/"
  // Alternative - Log two specific channels to separate files: "log_file_channels" == "/^(mail|ipn)$/

  if (preg_match(Civi::settings()->get('log_file_channels'), $channel)) {
    // Use separate file
  }
  else {
    // Use traditional file
  }

Implementation-wise, there should be a separate instance of the logger for each channel. To my eye, this looks like it'll re-use one instance, which could cause some crossed wires. Consider:

class Foo {
  private $log;
  public function __construct() { $this->log = Civi::log('foo'); }
  public function fooify() { $this->info("Hello from Foo"); }
}

class Bar {
  private $log;
  public function __construct() { $this->log = Civi::log('bar'); }
  public function barbecueTofu() { $this->info("Hello from Bar"); }
}

$foo = new Foo();
$bar = new Bar();
$foo->fooify();
$bar->barbecueTofu();
// If the two objects reference the same `$log`, then one or the other will route to the wrong place.

Assuming you're keen on the combination of DEFAULT_LOGGER and method_exists(...'setChannel') -- then one could get the desired result with clone, e.g.

if ($c->has($svc) {
  $log = $c->get($svc);
}
else {
  $log =  $c->get(self::DEFAULT_LOGGER);;
  if ($channel !== 'default' && method_exists($this->channels[$channel], 'setChannel') /* && settings encourage split files */) {
    $log = clone $log;
    $log->setChannel($channel);
  }
}
$this->channels[$channel] = $log;

(Also, small note - the old code there was using a default value of empty-string (''). The property protected $channel has a default of NULL. Dunno if that makes a difference here.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth noting that there are no existing configured channels so far - so default behaviour is no change - however there would be a change if we also defined channels

If this needs a setting to maintain then I don't think we should merge that change - it makes more sense to manage it all in an extension - which I'm making good progress on.

The main reason I thought it might be worth merging is the splitting out of log files give some incentive for people to think about what should get logged to which channels

Based on your feedback I'm inclined to close this but I'll leave it for a couple of days just to let people add any thoughts

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposed variant: #20163

totten added a commit to totten/civicrm-core that referenced this pull request Apr 27, 2021
Allow system administrators to define the file-naming convention for log files.

This complements on civicrm#20109, which is a step toward logging more information
to separate channels.  The primary risk of directing channels to separate
channels is that existing log-aggregators will not find them (e.g. they may
expect a consolidated log file).

This mitigate the risk by phasing-in the new filenames -- and allowing opt-ins/opt-outs.

* For new deployments, the default is `CiviCRM.{{CHANNEL}}.{{HASH}}.log`. It should be expected that
  logs will be split into multiple channels/files, and that channels/files may come and go over time.
* For sysadmins that want a single/predictable log, use `CiviCRM.{{HASH}}.log` or `CiviCRM.log`.
* For existing deployments, the default is `CiviCRM.{{LEGACY_CHANNEL}}{{LEGACY_HASH}}log`. This
  will only produce files that match the old naming convention. New channels are consolidated under
  the blank channel-name.

Before
------

Suppose you make these two log statements:

```
cv ev 'Civi::log()->info("default stuff"); Civi::log("ipn")->info("payment stuff");'
```

It will generate these two log files:

```
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/CiviCRM.0123abcd0123abcd99999999.log
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/CiviCRM.ipn.0123abcd0123abcd99999999log
```

After
-----

The behavior depends on on `CIVICRM_LOG_FILE`. Here are the log files produced by different configurations:

```
== Default on new deployments
== define('CIVICRM_LOG_FILE', 'CiviCRM.{{CHANNEL}}.{{HASH}}.log');
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/CiviCRM.ipn.0123abcd0123abcd99999999.log
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/CiviCRM.default.0123abcd0123abcd99999999.log

== Default on old/unconfigured deployments
== define('CIVICRM_LOG_FILE', 'CiviCRM.{{LEGACY_CHANNEL}}{{LEGACY_HASH}}log');
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/CiviCRM.0123abcd0123abcd99999999.log

== Use on combined log for everything
== define('CIVICRM_LOG_FILE', '/tmp/my-log/CiviCRM.log');
/tmp/my-log/CiviCRM.log

== Store logs in the normal `[civicrm.log]` folder, but with deep directories
== define('CIVICRM_LOG_FILE', '[civicrm.log]/{{HASH}}/{{CHANNEL}}/{{YYYY-MM-DD}}.log');
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/0123abcd0123abcd99999999/ipn/2021-04-27.log
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/0123abcd0123abcd99999999/default/2021-04-27.log

== Store logs in an alternate part of the `[civicrm.private]` folder
== define('CIVICRM_LOG_FILE', '[civicrm.private]/logging/{{HASH}}/{{CHANNEL}}/{{YYYY-MM-DD}}.log');
~/bknix/build/dmaster/web/sites/default/files/civicrm/logging/0123abcd0123abcd99999999/ipn/2021-04-27.log
~/bknix/build/dmaster/web/sites/default/files/civicrm/logging/0123abcd0123abcd99999999/default/2021-04-27.log

== Store logs in `/tmp` with long file names
== define('CIVICRM_LOG_FILE', '/tmp/my-log/civicrm-{{HASH}}-{{CHANNEL}}-{{YYYY-MM-DD}}.log');
/tmp/my-log/civicrm-0123abcd0123abcd99999999-default-2021-04-27.log
/tmp/my-log/civicrm-0123abcd0123abcd99999999-ipn-2021-04-27.log
```

You can verify this by running https://gist.github.com/totten/bd0df4478653330cf812bd481cb76ffc
totten added a commit to totten/civicrm-core that referenced this pull request Apr 27, 2021
Allow system administrators to define the file-naming convention for log files.

This complements on civicrm#20109, which is a step toward logging more information
to separate channels.  The primary risk of directing channels to separate
channels is that existing log-aggregators will not find them (e.g. they may
expect a consolidated log file).

This mitigate the risk by phasing-in the new filenames -- and allowing opt-ins/opt-outs.

* For new deployments, the default is `CiviCRM.{{CHANNEL}}.{{HASH}}.log`. It should be expected that
  logs will be split into multiple channels/files, and that channels/files may come and go over time.
* For sysadmins that want a single/predictable log, use `CiviCRM.{{HASH}}.log` or `CiviCRM.log`.
* For existing deployments, the default is `CiviCRM.{{LEGACY_CHANNEL}}{{LEGACY_HASH}}log`. This
  will only produce files that match the old naming convention. New channels are consolidated under
  the blank channel-name.

Before
------

Suppose you make these two log statements:

```
cv ev 'Civi::log()->info("default stuff"); Civi::log("ipn")->info("payment stuff");'
```

It will generate these two log files:

```
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/CiviCRM.0123abcd0123abcd99999999.log
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/CiviCRM.ipn.0123abcd0123abcd99999999log
```

After
-----

The behavior depends on on `CIVICRM_LOG_FILE`. Here are the log files produced by different configurations:

```
== Default on new deployments
== define('CIVICRM_LOG_FILE', 'CiviCRM.{{CHANNEL}}.{{HASH}}.log');
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/CiviCRM.ipn.0123abcd0123abcd99999999.log
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/CiviCRM.default.0123abcd0123abcd99999999.log

== Default on old/unconfigured deployments
== define('CIVICRM_LOG_FILE', 'CiviCRM.{{LEGACY_CHANNEL}}{{LEGACY_HASH}}log');
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/CiviCRM.0123abcd0123abcd99999999.log

== Use on combined log for everything
== define('CIVICRM_LOG_FILE', '/tmp/my-log/CiviCRM.log');
/tmp/my-log/CiviCRM.log

== Store logs in the normal `[civicrm.log]` folder, but with deep directories
== define('CIVICRM_LOG_FILE', '[civicrm.log]/{{HASH}}/{{CHANNEL}}/{{YYYY-MM-DD}}.log');
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/0123abcd0123abcd99999999/ipn/2021-04-27.log
~/bknix/build/dmaster/web/sites/default/files/civicrm/ConfigAndLog/0123abcd0123abcd99999999/default/2021-04-27.log

== Store logs in an alternate part of the `[civicrm.private]` folder
== define('CIVICRM_LOG_FILE', '[civicrm.private]/logging/{{HASH}}/{{CHANNEL}}/{{YYYY-MM-DD}}.log');
~/bknix/build/dmaster/web/sites/default/files/civicrm/logging/0123abcd0123abcd99999999/ipn/2021-04-27.log
~/bknix/build/dmaster/web/sites/default/files/civicrm/logging/0123abcd0123abcd99999999/default/2021-04-27.log

== Store logs in `/tmp` with long file names
== define('CIVICRM_LOG_FILE', '/tmp/my-log/civicrm-{{HASH}}-{{CHANNEL}}-{{YYYY-MM-DD}}.log');
/tmp/my-log/civicrm-0123abcd0123abcd99999999-default-2021-04-27.log
/tmp/my-log/civicrm-0123abcd0123abcd99999999-ipn-2021-04-27.log
```

You can verify this by running https://gist.github.com/totten/bd0df4478653330cf812bd481cb76ffc
@mattwire
Copy link
Contributor

Is there a lab issue to discuss the preferred implementation of this. I'm very much in favour of the ability to split log files per channel etc. but I'm not sure about it being the default. Specifically my concern is:

  1. Split log files can be much more difficult to debug because you have to do work to put them in cronological order:
    eg. an IPN call comes in which triggers some "ipn" channel messages and then creates a new contribution via repeatransaction which triggers the "Contribution X updated successfully" log message through the contribution BAO. Then you might get another ipn log message confirming that the ipn was processed successfully. You can no longer be sure of the order of the messages, especially when they are recorded with the same (seconds) timestamp.
  2. You can miss log messages because you (or your client!) don't check the right "channel" log.

@eileenmcnaughton
Copy link
Contributor Author

There is a gitlab issue.

I think I've gone off the idea of doing this in core since it's now do-able in an extension - so I don't think anything needs doing here now- but I will post a blog once we put the extension into production

@eileenmcnaughton eileenmcnaughton deleted the err_log branch April 27, 2021 19:29
@mattwire
Copy link
Contributor

@eileenmcnaughton Which is the lab issue please?

@eileenmcnaughton
Copy link
Contributor Author

@mattwire not sure - but search logging or whatever & you'l find it - or maybe search 'monolog' since it was from that issue I found Dave's monolog extension

@mlutfy
Copy link
Member

mlutfy commented Apr 28, 2021

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants