Service "pihole-FTL" keeps crashing/stopping; both as HW on RPi 2 Mod B+ & as VM #614

kuerious · 2019-07-10T01:24:37Z

In raising this issue, I confirm the following (please check boxes, eg [X]) Failure to fill the template will close your issue:

I have read and understood the contributors guide.
The issue I am reporting can be replicated
The issue I am reporting isn't a duplicate

How familiar are you with the codebase?:

6....ish

[BUG | ISSUE] Expected Behaviour:
Running latest pull of DietPi, installing "Pi-hole", runs 24hrs/day, 7 days per week, without crashing.

[BUG | ISSUE] Actual Behaviour:
A minimum of 1x/day, sometimes up to 3x/day, the service/process "pihole-FTL" crashes without (apparent) log entries as to why.

[BUG | ISSUE] Steps to reproduce:
Installed latest version of

DietPi (as of writing, see this link); ran all system & security updates
Pi-hole (first ran via DietPi "Software Optimized" installer; 2nd try using "Curl" from website.)
For DoH via Cloudflare, used steps first at original link, then used steps at this link
No further software or optimizations done

Log file output [if available]

Token File here

Device specifics
1st Attempt:
Hardware Type: rPi 2 Model B+, 8GB microSD
OS: DietPi (Stretch + security updates)

2nd Attempt:
Hardware Type: VM w/ 4x core, 4GB RAM, 8GB space allocated
OS: DietPi (Stretch + security updates)

This template was created based on the work of udemy-dl.

The text was updated successfully, but these errors were encountered:

kuerious · 2019-07-10T01:43:56Z

Extra details

Cross-posted at DietPi GitHub as Issue# 2976. HW and VM details above/beyond the above are listed.

AzureMarker · 2019-07-10T03:01:08Z

Do you have a log of the crash, either from journalctl -u pihole-FTL or /var/log/pihole-FTL.log?

kuerious · 2019-07-10T05:06:44Z

And that's another thing ... it never seems to leave an error message in the logs. See here:
https://pastebin.com/Qg0cejwR

This is the closest I've ever come to a log file that at least shows the entries leading-up to the event:
https://i.imgur.com/UdubFGH.png

MichaIng · 2019-07-10T10:01:12Z

@kuerious @Mcat12
Lets keep discussion here, so no double posting required.

One idea is that it is due to DietPi-RAMlog, respectively DietPi-Logclear which removes non-today Pi-hole log entries when the files fill half of available tmpfs size.

After the logs are truncated, pihole-FTL is restarted in a special way that did work and was required some earlier times, but perhaps it does not anymore in some cases: https://github.com/MichaIng/DietPi/blob/master/dietpi/func/dietpi-logclear#L164-L177

The dnsmasq is relic from pre-FTLDNS times, which could be removed, but does not hurt.
I am not sure if this resolvconf restart is required? I remember it was.
Actually what would make sense IMO is to stop pihole-FTL before touching the log files and start if afterwards, instead of having this stop>stop>stop>start>start>start part. But you guys might know better how pihole-FTL handles/expects log files and how to have a safe restart procedure in combination with resolvconf.

I will also run some tests later and try to replicate the issue.

DL6ER · 2019-07-10T10:12:59Z

I am not sure if this resolvconf restart is required? I remember it was.

I don't think it is. I never did it myself.

But you guys might know better how pihole-FTL handles/expects log files and how to have a safe restart procedure in combination with resolvconf.

I don't think you need to restart pihole-FTL here.
We open and close the log file after each logging. I checked git blame and saw that the last changes to this function were made back in April 2017.
It should be (fairly) safe to leave FTL running while truncating the log file (depending on how you do it).
Furthermore, it might be worth injecting a line like

echo "DietPi: Truncated log file due to size." > /var/log/pihole-FTL.log

or something similar to avoid or at least reduce uncertainties if the log is really empty or was flushed at some point. Probably even with a time stamp.

it never seems to leave an error message in the logs

We have signal handlers that should catch and report every crash or termination request. The only thing we can not catch and log is an immediate termination via SIGKILL. This would have to be found using other means.

MichaIng · 2019-07-10T10:30:53Z

@DL6ER

We open and close the log file after each logging. I checked git blame and saw that the last changes to this function were made back in April 2017.

Okay I think stopping it before touching the files still makes sense to avoid the rare change that both touch the logs the same time? Although it was never an issue with other logs where it could basically be the same, if not concurrent accesses are handled kernel-wise. However I think there is a difference between "opening" and "closing" a file, so "occupying" it and doing a single write, AFAIK.

Furthermore, it might be worth injecting a line like

Very good idea, will add this.

@kuerious
There is a syntax in your journalctl check, please try: journalctl -u pihole-FTL

Ah now I see something:

We handle pihole.log the mentioned way but not pihole-FTL.log.
Makes sense to handle/truncate both in case, but I will check when running some tests later.

DL6ER · 2019-07-10T16:55:06Z

to avoid the rare change that both touch the logs the same time?

I do not yet see the real difference than just truncating the file. The worst thing that could happen would be a cropped log line. We do not write to the file when it cannot be opened and when we opened it, you should not be able to delete it: the file might disappear on the file system but our handle stays open and we can write to the file.
And if you just truncate the file by running an echo > command, then I don't see problems here.

In contrast, (re)starting pihole-FTL actually involves a lot of work. Assume a user has millions of domains in his blocking lists and possibly configured a number of regex filters as well, then a restart can need up to (several) minutes in the worst case to complete on really low-end devices (thinking of Pi 1B+ or similar).

We handle pihole.log the mentioned way but not pihole-FTL.log.

Yeah, it's different for pihole.log, however, ~~I think we even disable logging by default on new installs? This is somewhat guessed, though.~~ we really should disable logging by default on new installs!

MichaIng · 2019-07-10T19:30:24Z

pihole logging off if I see right.

Okay so pihole-FTL.log is for the service state and it makes sense then that we do no special handling for this in when RAMlog being enabled (thus hourly log clearing). pihole.log is the one that shows blocked requests and can be seen on web UI, right? Actually having this enabled by default makes quite sense to me to have an overview about how much and which ad requests are blocked. As well we (DietPi) receive reports about certain github.com subdomains being blocked (codeload.github.com and another one I currently don't remember), which then breaks any download from GitHub repos. I couldn't find out until now which blocklist is responsible, nor could I replicate the issue on a fresh Pi-hole install with all by default available blocklists enabled. So far I could not get the users to check the pihole.log which would contain the responsible block list? However adding the affected domain to the whitelist always solved the issue.

Anyway I have to spin up a test install to go through all of this, hopefully with a solution for clearing pihole.log without doing a pihole-FTL restart. One idea was to not have it in RAMlog, but instead in the userdata dir that most users move to external drive (hopefully), then leaving a symlink in /var/log/. Clearing logs can skip symlinks of course as long as their destination is outside the tmpfs.

And if you just truncate the file by running an echo > command, then I don't see problems here.

It's sed -i '/<todays_timestamp>/!d' /var/log/pihole.log so only logs from past days are removed. But that should not make a difference.

AzureMarker · 2019-07-11T02:09:35Z

You can find what is blocking a domain by using pihole -q or the associated function on the web interface (under "Tools -> Query Lists").

The only current usages of pihole.log are the "Tail pihole.log" feature on the web interface, pihole -t (which does the same thing but with added colors), and the debugger (which just saves the last few lines). None of these usages are critical or so often used that they can't be disabled by default.

MichaIng · 2019-07-11T16:45:20Z

Okay I just checked things now:

pihole-FTL.log is just the service/process log, which makes sense to have since it gives much more details for debugging than what journald receives.
- Clearing it on hourly basis as usual, if RAMlog enabled, is totally okay and does not break anything.
pihole.log contains detailed dnsmasq access log. Clearing it does not affect the query log from web UI and after some times testing does not break the service (nor any name resolving) as well. Perhaps this was an issue in the past but is not anymore?
The query logs from web UI are stored in database /etc/pihole/pihole-FTL.db (?)

So indeed pihole.log could/should be disabled without any disadvantage for regular users. I used:

pihole -l off
rm /var/log/pihole.log

Output is a bid confusing:

  [✓] Deleted  queries from database
  [i] Disabling logging...
  [✓] Restarting DNS service
  [✓] Logging has been disabled!

Logs from database (thus web UI) are removed as well, besides the most current four entries (in my case).
But help page states: off Disable and flush the Pi-hole log at /var/log/pihole.log and no word about query/database logs
Would be good to make things clear here and fully separate pihole.log file from database query logs.
Since it seems doubled to me, it could be instead made clear and grant some flexibility by allowing to choose between database logging (default) and file logging (or both).
Another bid confusing thing when having pihole.log disabled: On web UI settings page system tab, it allows to "Enable query logging", while when switching to privacy it shows full logging enabled, which is also proven by generating query log entries. So the switch on settings>system page toggles the pihole.log but not the actual query log from database that is shown on the web UI.

So basically a little cleanup and clear separation of these two logging systems would be helpful, and yeah, disabling pihole.log by default totally makes sense 👍.

For DietPi I will now implement the following after Pi-hole installer has finished:

pihole -l off
rm /var/log/pihole.log

@DL6ER
Could you run the above as well and check if your crashes are resolved then?

Additionally I will remove the special pihole.log handling from our logclear script. Even if enabled, I could not force any error by clearing the file. Furthermore the pihole-FTL restart, when file has grown over limit, seems to break in cases (in combination with resolvconf restart?) and as @Mcat12 mentioned is a large overhead as well.

+ DietPi-Software | Pi-hole: Disable pihole.log by default: pi-hole/FTL#614 (comment) + DietPi-Software | Pi-hole: Add Pi-hole specific Lighttpd include config based on: https://github.com/pi-hole/pi-hole/blob/master/advanced/lighttpd.conf.debian + DietPi-Software | Lighttpd: Do not create lighttpd.conf backup, since we only edit webroot or fix a Buster upgrade bug, which both is simply required and expected by user to have the just chosen software title + Lighttpd up and running. + DietPi-Logclear | Remove special pihole.log handling, which is not required since clearing it does not break the service/process anymore and query logs are holt in database. Also pihole.log is disabled now by default on our Pi-hole installs. + DietPi-Logclear | Remove log file exclusion code fragments, which was never really implemented + DietPi-Logclear | Further coding enhancements + DietPi-Cleaner | Never remove Git, since it is required often for internal updaters, e.g. Pi-hole + DietPi-Cleaner | Fix /var/tmp/dietpi/logs/dietpi-ramlog_store removal and use new DietPi-Set_software apt-cache function to clean APT + DietPi-Cleaner | Minor coding

kuerious · 2019-07-11T20:35:22Z

I, too, have done the above:

root@dns1:~# pihole -l off
[✓] Restarting DNS service
[✓] Flushed /var/log/pihole.log
[✓] Deleted 46466 queries from database
[i] Disabling logging...
[✓] Restarting DNS service
[✓] Logging has been disabled!
root@dns1:~# rm /var/log/pihole.log
root@dns1:~#

It seems to have cleared the queries logged up to this point (in the daily/24-hour period), which I find strange, see here:

I will keep an eye on things and report back anything of interest.

MichaIng · 2019-07-11T20:53:36Z

@kuerious

It seems to have cleared the queries logged up

Jep exactly as I mentioned above, a bid strange it is. However this is just a single time and the query logs (in database) continue as before, as wanted.

kuerious · 2019-07-12T00:07:08Z

Well, it's been a few hours, and the graphs all seem to be happy:

https://i.imgur.com/xEpTEdG.png

No need for a reboot in awhile. Now, mind you, this is the VM I used with the x86_64 DietPi image. I set the specs to be 4x cores, 4GB memory, and 8GB storage. This is geared more towards the new RPi 4, not the lower-end RPi devices.

Should I drop the memory to see if things happen? Or will it not matter?

AzureMarker · 2019-07-12T03:14:43Z

The pihole.log file itself should not be deleted, as it is still used for general dnsmasq logging (just not query logging).

Previously, before FTL v4, we read from the dnsmasq log in order to get the query information. Now that we run dnsmasq as part of FTL, we read the queries directly from dnsmasq and no longer need the log.

MichaIng · 2019-07-12T07:20:46Z

The service recreates the log file on start anyway, so no point in removing it.

Btw also the dir /var/log/pihole is created but not in use with default setup?

@kuerious
Memory size should not play a role. The size of the /var/log tmpfs (50 MiB by default) is relevant. If pihole.log reaches 1/3 of this size, logclear kicks in. But the log file can hardly reach this size in reasonable time with this query logs disabled.

AzureMarker · 2019-07-13T02:29:52Z

/var/log/pihole seems to have been planned as the location of the logs, but is not used by default. @DL6ER, can we remove this from the service script?

DL6ER · 2020-03-08T12:50:54Z

Solved by DietPi 6.26.0

kuerious mentioned this issue Jul 10, 2019

Service "pihole-FTL" keeps crashing/stopping; both as HW on RPi 2 Mod B+ & as VM MichaIng/DietPi#2976

Closed

DL6ER added the Potential bug label Jul 10, 2019

MichaIng mentioned this issue Jul 11, 2019

DietPi-Software | Pi-hole: Disable pihole.log by default and enhance Lighttpd config MichaIng/DietPi#2981

Merged

4 tasks

MichaIng mentioned this issue Sep 16, 2019

Beta v6.26.0 MichaIng/DietPi#3112

Merged

MichaIng mentioned this issue Oct 15, 2019

Release v6.26 MichaIng/DietPi#3162

Merged

DL6ER closed this as completed Mar 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Service "pihole-FTL" keeps crashing/stopping; both as HW on RPi 2 Mod B+ & as VM #614

Service "pihole-FTL" keeps crashing/stopping; both as HW on RPi 2 Mod B+ & as VM #614

kuerious commented Jul 10, 2019 •

edited

Loading

kuerious commented Jul 10, 2019 •

edited

Loading

AzureMarker commented Jul 10, 2019 •

edited

Loading

kuerious commented Jul 10, 2019 •

edited

Loading

MichaIng commented Jul 10, 2019

DL6ER commented Jul 10, 2019 •

edited

Loading

MichaIng commented Jul 10, 2019 •

edited

Loading

DL6ER commented Jul 10, 2019

MichaIng commented Jul 10, 2019 •

edited

Loading

AzureMarker commented Jul 11, 2019 •

edited

Loading

MichaIng commented Jul 11, 2019 •

edited

Loading

kuerious commented Jul 11, 2019

MichaIng commented Jul 11, 2019

kuerious commented Jul 12, 2019

AzureMarker commented Jul 12, 2019

MichaIng commented Jul 12, 2019 •

edited

Loading

AzureMarker commented Jul 13, 2019

DL6ER commented Mar 8, 2020

Service "pihole-FTL" keeps crashing/stopping; both as HW on RPi 2 Mod B+ & as VM #614

Service "pihole-FTL" keeps crashing/stopping; both as HW on RPi 2 Mod B+ & as VM #614

Comments

kuerious commented Jul 10, 2019 • edited Loading

kuerious commented Jul 10, 2019 • edited Loading

AzureMarker commented Jul 10, 2019 • edited Loading

kuerious commented Jul 10, 2019 • edited Loading

MichaIng commented Jul 10, 2019

DL6ER commented Jul 10, 2019 • edited Loading

MichaIng commented Jul 10, 2019 • edited Loading

DL6ER commented Jul 10, 2019

MichaIng commented Jul 10, 2019 • edited Loading

AzureMarker commented Jul 11, 2019 • edited Loading

MichaIng commented Jul 11, 2019 • edited Loading

kuerious commented Jul 11, 2019

MichaIng commented Jul 11, 2019

kuerious commented Jul 12, 2019

AzureMarker commented Jul 12, 2019

MichaIng commented Jul 12, 2019 • edited Loading

AzureMarker commented Jul 13, 2019

DL6ER commented Mar 8, 2020

kuerious commented Jul 10, 2019 •

edited

Loading

kuerious commented Jul 10, 2019 •

edited

Loading

AzureMarker commented Jul 10, 2019 •

edited

Loading

kuerious commented Jul 10, 2019 •

edited

Loading

DL6ER commented Jul 10, 2019 •

edited

Loading

MichaIng commented Jul 10, 2019 •

edited

Loading

MichaIng commented Jul 10, 2019 •

edited

Loading

AzureMarker commented Jul 11, 2019 •

edited

Loading

MichaIng commented Jul 11, 2019 •

edited

Loading

MichaIng commented Jul 12, 2019 •

edited

Loading