Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data upload delay due to hanging thread #18

Closed
ArneTR opened this issue Dec 3, 2023 · 7 comments
Closed

Data upload delay due to hanging thread #18

ArneTR opened this issue Dec 3, 2023 · 7 comments

Comments

@ArneTR
Copy link
Member

ArneTR commented Dec 3, 2023

The hog in it's current patch level constantly displays current data on local.

However I am experiencing missing data when using the Detailed Analytics.

My upload backlog is already at 2000+ and the daemon has been running uniterrupted since then.

Apparently the upload did not succeed and the thread crashed. The other threads where running fine.

The reason to that is currently unknown and I cannot spot this behaviour in my prior logs.

The resolve is to unload/load the daemon.

Screenshot 2023-12-03 at 5 11 05 PM
[DEBUG] 2023-12-03 11:47:46,921 - Program started.
[DEBUG] 2023-12-03 11:47:46,921 - Using db: /Library/Application Support/berlin.green-coding.hog/db.db
[DEBUG] 2023-12-03 11:47:46,922 - Using /etc/hog_settings.ini as settings file.
[DEBUG] 2023-12-03 11:47:46,969 - Setting: {'powermetrics': 5000, 'upload_delta': 300, 'api_url': 'https://api.green-coding.berlin/v1/hog/add', 'web_url': 'https://metrics.green-coding.berlin/hog-details.html?machine_uuid=', 'upload_data': True, 'resolve_coalitions': ['com.googlecode.iterm2', 'com.apple.terminal', 'com.vix.cron']}
[DEBUG] 2023-12-03 11:47:46,969 - Upload thread started
[DEBUG] 2023-12-03 11:47:46,969 - DB checker thread started
[DEBUG] 2023-12-03 11:47:46,969 - Ticker thread started
[INFO] 2023-12-03 11:47:46,970 - Starting powermetrics process: powermetrics --show-all -i 5000 -f plist
[INFO] 2023-12-03 11:47:47,050 - Uploading 10 rows to: https://api.green-coding.berlin/v1/hog/add
[DEBUG] 2023-12-03 11:47:51,938 - Uploaded.
[INFO] 2023-12-03 11:47:52,073 - Uploading 10 rows to: https://api.green-coding.berlin/v1/hog/add
[DEBUG] 2023-12-03 11:51:09,217 - Parsing new input
[DEBUG] 2023-12-03 11:51:09,346 - Data added to the DB
[INFO] 2023-12-03 11:51:09,346 - {'combined_energy': 3774, 'cpu_energy': 3726, 'gpu_energy': 48, 'ane_energy': 0, 'energy_impact': 2863}
[DEBUG] 2023-12-03 11:51:09,540 - DB Check
[DEBUG] 2023-12-03 11:51:09,632 - Power metrics running check
[DEBUG] 2023-12-03 11:51:14,375 - Parsing new input
[DEBUG] 2023-12-03 11:51:14,478 - Data added to the DB
[INFO] 2023-12-03 11:51:14,478 - {'combined_energy': 9657, 'cpu_energy': 9468, 'gpu_energy': 189, 'ane_energy': 0, 'energy_impact': 7556}
[DEBUG] 2023-12-03 11:51:19,481 - Parsing new input
[DEBUG] 2023-12-03 11:51:19,562 - Data added to the DB
[INFO] 2023-12-03 11:51:19,562 - {'combined_energy': 14164, 'cpu_energy': 13613, 'gpu_energy': 551, 'ane_energy': 0, 'energy_impact': 10617}
[DEBUG] 2023-12-03 11:51:24,625 - Parsing new input
[DEBUG] 2023-12-03 11:51:24,695 - Data added to the DB
[INFO] 2023-12-03 11:51:24,696 - {'combined_energy': 19772, 'cpu_energy': 18943, 'gpu_energy': 829, 'ane_energy': 0, 'energy_impact': 14064}
[DEBUG] 2023-12-03 11:51:29,692 - Parsing new input
[DEBUG] 2023-12-03 11:51:29,750 - Data added to the DB
[INFO] 2023-12-03 11:51:29,750 - {'combined_energy': 30182, 'cpu_energy': 28842, 'gpu_energy': 1340, 'ane_energy': 0, 'energy_impact': 21226}
[DEBUG] 2023-12-03 11:51:34,834 - Parsing new input
[DEBUG] 2023-12-03 11:51:34,890 - Data added to the DB
[INFO] 2023-12-03 11:51:34,890 - {'combined_energy': 38847, 'cpu_energy': 36999, 'gpu_energy': 1848, 'ane_energy': 0, 'energy_impact': 27390}

....

[DEBUG] 2023-12-03 12:31:26,931 - DB Check
[DEBUG] 2023-12-03 12:31:26,940 - Parsing new input
[DEBUG] 2023-12-03 12:31:27,015 - Power metrics running check
[DEBUG] 2023-12-03 12:31:27,054 - Data added to the DB
[INFO] 2023-12-03 12:31:27,055 - {'combined_energy': 1030542, 'cpu_energy': 961519, 'gpu_energy': 69023, 'ane_energy': 0, 'energy_impact': 660009}
[DEBUG] 2023-12-03 12:31:32,072 - Parsing new input
[DEBUG] 2023-12-03 12:31:32,124 - Data added to the DB
[INFO] 2023-12-03 12:31:32,125 - {'combined_energy': 1031471, 'cpu_energy': 962408, 'gpu_energy': 69063, 'ane_energy': 0, 'energy_impact': 660649}
[DEBUG] 2023-12-03 12:31:37,272 - Parsing new input
[DEBUG] 2023-12-03 12:31:37,319 - Data added to the DB
[INFO] 2023-12-03 12:31:37,319 - {'combined_energy': 1033139, 'cpu_energy': 964038, 'gpu_energy': 69101, 'ane_energy': 0, 'energy_impact': 661731}
[DEBUG] 2023-12-03 12:31:42,449 - Parsing new input
[DEBUG] 2023-12-03 12:31:42,500 - Data added to the DB
[INFO] 2023-12-03 12:31:42,501 - {'combined_energy': 1033865, 'cpu_energy': 964723, 'gpu_energy': 69142, 'ane_energy': 0, 'energy_impact': 662177}
[DEBUG] 2023-12-03 12:31:47,589 - Parsing new input
[DEBUG] 2023-12-03 12:31:47,647 - Data added to the DB
[INFO] 2023-12-03 12:31:47,647 - {'combined_energy': 1035421, 'cpu_energy': 966239, 'gpu_energy': 69182, 'ane_energy': 0, 'energy_impact': 663228}
[DEBUG] 2023-12-03 12:31:52,641 - Parsing new input
[DEBUG] 2023-12-03 12:31:52,685 - Data added to the DB
[INFO] 2023-12-03 12:31:52,685 - {'combined_energy': 1036295, 'cpu_energy': 967072, 'gpu_energy': 69223, 'ane_energy': 0, 'energy_impact': 663761}
[DEBUG] 2023-12-03 12:31:57,846 - Parsing new input
[DEBUG] 2023-12-03 12:31:57,900 - Data added to the DB
[INFO] 2023-12-03 12:31:57,901 - {'combined_energy': 1037642, 'cpu_energy': 968376, 'gpu_energy': 69266, 'ane_energy': 0, 'energy_impact': 664652}
[DEBUG] 2023-12-03 12:32:03,037 - Parsing new input
[DEBUG] 2023-12-03 12:32:03,082 - Data added to the DB
[INFO] 2023-12-03 12:32:03,082 - {'combined_energy': 1039071, 'cpu_energy': 969759, 'gpu_energy': 69312, 'ane_energy': 0, 'energy_impact': 665573}

....

[DEBUG] 2023-12-03 17:11:54,933 - Parsing new input
[DEBUG] 2023-12-03 17:11:54,992 - Data added to the DB
[INFO] 2023-12-03 17:11:54,992 - {'combined_energy': 4335394, 'cpu_energy': 4021331, 'gpu_energy': 314056, 'ane_energy': 7, 'energy_impact': 2893909}

@ArneTR
Copy link
Member Author

ArneTR commented Dec 3, 2023

I have reproduced the error again by closing the MacBook lid while an upload was running.

The uploader threads hangs in the select forever.

Screenshot 2023-12-03 at 5 44 52 PM

I will now wait if the OS is able to restart the thread at some point.

My immediate suggestion to the problem however would be to also have a "uploader-check" to see if the upload thread got stuck and then force-close all threads and let the OS restart the service.

@ribalba
Copy link
Member

ribalba commented Dec 3, 2023

I can verify this. Looking into it

ribalba added a commit that referenced this issue Dec 5, 2023
@ribalba
Copy link
Member

ribalba commented Dec 5, 2023

This should fix this:
fd7cdc8

@ArneTR please update your power_logger.py under /usr/local/bin/hog/power_logger.py

cd /tmp
wget https://raw.githubusercontent.com/green-coding-berlin/hog/main/power_logger.py 
sudo mv power_logger.py /usr/local/bin/hog/power_logger.py
chmod +x /usr/local/bin/hog/power_logger.py
sudo launchctl unload /Library/LaunchDaemons/berlin.green-coding.hog.plist
sudo launchctl load /Library/LaunchDaemons/berlin.green-coding.hog.plist

Will take some time to upload everything

@ribalba
Copy link
Member

ribalba commented Dec 5, 2023

Keeping open till @ArneTR confirms this fixes the issue

@ArneTR
Copy link
Member Author

ArneTR commented Dec 6, 2023

Implemented. waiting for data to confirm

@ArneTR
Copy link
Member Author

ArneTR commented Dec 6, 2023

Looks good:

INFO] 2023-12-06 07:04:33,267 - Stopping program due to an inconsistent state of the program. Probably the upload blocking
[DEBUG] 2023-12-06 07:04:38,519 - Program started.
[DEBUG] 2023-12-06 07:04:38,519 - Using db: /Library/Application Support/berlin.green-coding.hog/db.db
[DEBUG] 2023-12-06 07:04:38,519 - Using /etc/hog_settings.ini as settings file.
[DEBUG] 2023-12-06 07:04:38,556 - Setting: {'powermetrics': 5000, 'upload_delta': 300, 'api_url': 'https://api.green-coding.berlin/v1/hog/add', 'web_url': 'https://metrics.green-coding.berlin/hog-details.html?machine_uuid=', 'upload_data': True, 'resolve_coalitions': ['com.googlecode.iterm2', 'com.apple.terminal', 'com.vix.cron']}
[DEBUG] 2023-12-06 07:04:38,556 - Upload thread started
[DEBUG] 2023-12-06 07:04:38,557 - DB checker thread started
[DEBUG] 2023-12-06 07:04:38,557 - Ticker thread started
[INFO] 2023-12-06 07:04:38,557 - Starting powermetrics process: powermetrics --show-all -i 5000 -f plist
[INFO] 2023-12-06 07:04:38,653 - Uploading 10 rows to: https://api.green-coding.berlin/v1/hog/add

@ArneTR
Copy link
Member Author

ArneTR commented Dec 6, 2023

This one can be closed then. I will re-open if it happens again.

  1. What came to mind, since we now know it is a latency response / timeout issue with the API.
    Do we take care of duplicate checking?

Scenario: Data is sent by hog. API processes but answer not been sent. Hog closes. No 204 is ever sent back, but data actually processed on API side. Hog sends data again in next cycle.

Do we maybe have duplicates?

If you think this is possible. Can you check DB for duplicates?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants