You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
2023-08-28 20:49:47,132 INFO [manager] launching NextWorker(module='nowcast.workers.collect_weather', args=['06', '2.5km'], host='localhost')
2023-08-28 20:49:47,134 DEBUG [manager] cmd = ['/SalishSeaCast/nowcast-env/bin/python3', '-m', 'nowcast.workers.collect_weather', '/SalishSeaCast/SalishSeaNowcast/config/nowcast.yaml', '06', '2.5km']
2023-08-28 20:49:47,134 DEBUG [collect_weather] received message from manager: (ack) message acknowledged
2023-08-28 20:49:47,135 DEBUG [collect_weather] shutting down
2023-08-28 20:49:47,138 INFO [manager] launching NextWorker(module='nowcast.workers.crop_gribs', args=['06', '--fcst-date', '2023-08-29'], host='localhost')
2023-08-28 20:49:47,139 DEBUG [manager] cmd = ['/SalishSeaCast/nowcast-env/bin/python3', '-m', 'nowcast.workers.crop_gribs', '/SalishSeaCast/SalishSeaNowcast/config/nowcast.yaml', '06', '--fcst-date', '2023-08-29']
2023-08-28 20:49:47,144 DEBUG [manager] listening...
2023-08-28 20:49:51,174 DEBUG [crop_gribs] wrote GRIB file cropped to SalishSeaCast subdomain: /results/forcing/atmospheric/continental2.5/GRIB/20230829/00/048/20230829T00Z_MSC_HRDPS_APCP_Sfc_RLatLon0.0225_PT048H_SSC.grib2
2023-08-28 20:49:51,174 DEBUG [crop_gribs] observer thread files remaining to process: 5
2023-08-28 20:49:52,831 INFO [crop_gribs] running in process 25692
2023-08-28 20:49:52,832 INFO [collect_weather] running in process 25690
2023-08-28 20:49:52,833 INFO [crop_gribs] read config from /SalishSeaCast/SalishSeaNowcast/config/nowcast.yaml
2023-08-28 20:49:52,834 INFO [crop_gribs] publishing log messages to tcp://*:5562
2023-08-28 20:49:52,835 INFO [crop_gribs] connected to skookum.eos.ubc.ca port 5555
2023-08-28 20:49:52,836 INFO [crop_gribs] cropping 2023-08-29 ECCC HRDPS 2.5km continental 06Z GRIB files to SalishSeaCast subdomain
2023-08-28 20:49:52,838 DEBUG [crop_gribs] creating ECCC GRIB file paths list for 20230829 06Z forecast
2023-08-28 20:49:52,838 INFO [collect_weather] read config from /SalishSeaCast/SalishSeaNowcast/config/nowcast.yaml
2023-08-28 20:49:52,839 INFO [collect_weather] publishing log messages to tcp://*:5563
2023-08-28 20:49:52,840 INFO [collect_weather] connected to skookum.eos.ubc.ca port 5555
2023-08-28 20:49:52,850 DEBUG [collect_weather] calculated set of expected file paths for 2.5 km 20230829/06
2023-08-28 20:49:52,851 INFO [crop_gribs] starting to watch for ECCC grib files to crop in /results/forcing/atmospheric/continental2.5/GRIB/20230829/06/
2023-08-28 20:49:52,853 DEBUG [collect_weather] created /results/forcing/atmospheric/continental2.5/GRIB/20230829/
2023-08-28 20:49:52,853 DEBUG [collect_weather] created /results/forcing/atmospheric/continental2.5/GRIB/20230829/06/
2023-08-28 20:49:52,854 INFO [collect_weather] starting to watch for files in /SalishSeaCast/datamart/hrdps-continental/06/
2023-08-28 20:49:52,862 CRITICAL [crop_gribs] unhandled exception:
Note that crop_gribs starts to watch /results/forcing/atmospheric/continental2.5/GRIB/20230829/06/ before collect_weather creates it.
Traceback:
2023-08-28 20:49:52,862 CRITICAL [crop_gribs] unhandled exception:
Traceback (most recent call last):
File "/SalishSeaCast/NEMO_Nowcast/nemo_nowcast/worker.py", line 391, in _do_work
checklist =self.worker_func(
^^^^^^^^^^^^^^^^^
File "/SalishSeaCast/SalishSeaNowcast/nowcast/workers/crop_gribs.py", line 135, in crop_gribs
observer.start()
File "/SalishSeaCast/nowcast-env/lib/python3.11/site-packages/watchdog/observers/api.py", line 261, in start
emitter.start()
File "/SalishSeaCast/nowcast-env/lib/python3.11/site-packages/watchdog/utils/__init__.py", line 92, in startself.on_thread_start()
File "/SalishSeaCast/nowcast-env/lib/python3.11/site-packages/watchdog/observers/inotify.py", line 119, in on_thread_startself._inotify = InotifyBuffer(path, self.watch.is_recursive)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/SalishSeaCast/nowcast-env/lib/python3.11/site-packages/watchdog/observers/inotify_buffer.py", line 37, in __init__self._inotify = Inotify(path, recursive)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/SalishSeaCast/nowcast-env/lib/python3.11/site-packages/watchdog/observers/inotify_c.py", line 181, in __init__self._add_watch(path, event_mask)
File "/SalishSeaCast/nowcast-env/lib/python3.11/site-packages/watchdog/observers/inotify_c.py", line 416, in _add_watch
Inotify._raise_error()
File "/SalishSeaCast/nowcast-env/lib/python3.11/site-packages/watchdog/observers/inotify_c.py", line 432, in _raise_errorraiseOSError(err, os.strerror(err))
FileNotFoundError: [Errno 2] No such file or directory
The text was updated successfully, but these errors were encountered:
Create the directory we're going to watch to handle 2 rare situations that can
cause a FileNotFoundError to be raised (issue #197):
1. collect_weather and crop_gribs are launched concurrently. Occasionally
crop_gribs tries to start watching the directory before collect_weather
creates it.
2. When collect_weather --backfill has to be run to recover from automation
problems we need to start crop_gribs first. So, it needs to create the
directory.
re: issue #197
* Update pkgs & versions used in recent dev env
* Enable Slack notifications for crop_gribs completion
* Change observer thread timeout to 0.5s
Tests in productions showed that timeout of 2s did not appear to reduce the
likelihood of the observer stalling with 1 or a few files uncropped
compared to the original 1s timeout. So, perhaps a shorter timeout will be
better.
* Add retry & timeout for crop_gribs stalled observer
If crop_gribs instance has been running for 8 hours the file system observer
thread may have stalled, or there may be a problem with the HRDPS data
dissemination.
If >10 files remain in the set of files to be cropped, it is probably due to a
data dissemination problem. So, log a critical message that includes
the set of uncropped files so that we can send it to MSC support.
If <=10 files remain, it is possible that the watchdog observer thread just
missed 1 or more files. So, log an info message and retry cropping the set of
uncropped files. If that fails with FileNotFoundError, it is probably due to a
data dissemination problem. So, log a critical message that includes the set of
uncropped files so that we can send it to MSC support.
* Bump cryptography from 41.0.3 to 41.0.4 in /envs (#202)
Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.3 to 41.0.4.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](pyca/cryptography@41.0.3...41.0.4)
---
updated-dependencies:
- dependency-name: cryptography
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Handle directory to watch no yet created
Create the directory we're going to watch to handle 2 rare situations that can
cause a FileNotFoundError to be raised (issue #197):
1. collect_weather and crop_gribs are launched concurrently. Occasionally
crop_gribs tries to start watching the directory before collect_weather
creates it.
2. When collect_weather --backfill has to be run to recover from automation
problems we need to start crop_gribs first. So, it needs to create the
directory.
re: issue #197
* Fix broken NOAA link in docs found by sphinx linkcheck
Log messages just before failure:
Note that
crop_gribs
starts to watch/results/forcing/atmospheric/continental2.5/GRIB/20230829/06/
beforecollect_weather
creates it.Traceback:
The text was updated successfully, but these errors were encountered: