unexpected large data gaps #16
Comments
This is all well known. The expectation that time series of water levels, and especially equidistant time series at locations in the tidal reach, are available without large gaps since 1900 is completely mistaken. Before 1971 the water level data were read off manually from the graphs, and as a rule only the times and heights of high and low water (HW/LW data) were processed. From the early 1930s, water level data with a time step of 3 hours were also processed for some tide gauges.
Thanks for your response. I understand that there can be (large) gaps in time series, and some further investigation on my side indeed shows that a contiguous time series is not to be expected. However, we have a reference dataset that was exported from DONAR a few years ago for the KenmerkendeWaarden project. The code to reproduce the figure below:

import os
import pandas as pd
import hatyan
import matplotlib.pyplot as plt
plt.close("all")
dir_data = r"p:\archivedprojects\11208031-010-kenmerkende-waarden-k\work\data_vanRWS_20220805\wetransfer_waterstandsgegevens_2022-08-05_1306"
file_dia_cadz1 = os.path.join(dir_data, r"WATHTE_10min\CADZD_1.dia")
file_dia_cadz2 = os.path.join(dir_data, r"WATHTE_oud\CADZ_KW.dia")
file_dia_hoek1 = os.path.join(dir_data, r"WATHTE_10min\HOEKVHLD_1.dia")
file_dia_hoek2 = os.path.join(dir_data, r"WATHTE_oud\HOEK_KW.dia")
fig,ax = plt.subplots(figsize=(10,5))
ts_cadz = hatyan.read_dia([file_dia_cadz1,file_dia_cadz2], block_ids="allstation", station="CADZD", allow_duplicates=True)
ts_cadz["values"].plot(ax=ax, label="CADZD")
ts_hoek = hatyan.read_dia([file_dia_hoek1,file_dia_hoek2], block_ids="allstation", station="HOEKVHLD", allow_duplicates=True)
ts_hoek["values"].plot(ax=ax, label="HOEKVHLD")
ax.legend()
ax.grid()
ax.set_xlim(pd.Timestamp("1890-01-01"),pd.Timestamp("2024-01-01"))
fig.tight_layout()

This DONAR export has better data coverage than the DDL-based figure in the issue description, for instance for the period between 1970 and 1990. That period is missing from many of the datasets available on the DDL, as is also visible in the last figure in #39. I therefore suspect that the DDL is not in sync with DONAR for this data.
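To compare coverage between two sources in numbers rather than by eye, one option is to list every gap longer than some threshold. The snippet below is a minimal sketch, not part of hatyan: `find_large_gaps` and its `min_gap` parameter are hypothetical names introduced here, and the data is synthetic so that the example runs without the DONAR files.

```python
import pandas as pd


def find_large_gaps(index, min_gap="365D"):
    """List gaps between consecutive timestamps that exceed min_gap.

    Hypothetical helper for illustration; not a hatyan function.
    """
    index = pd.DatetimeIndex(index).sort_values()
    # Time difference between each timestamp and its successor
    durations = index[1:] - index[:-1]
    mask = durations > pd.Timedelta(min_gap)
    return pd.DataFrame({
        "gap_start": index[:-1][mask],  # last timestamp before the gap
        "gap_end": index[1:][mask],     # first timestamp after the gap
        "duration": durations[mask],
    })


# Synthetic daily series with one gap of roughly 20 years
synthetic_index = pd.date_range("1900-01-01", "1909-12-31", freq="D").append(
    pd.date_range("1930-01-01", "1939-12-31", freq="D")
)
gaps = find_large_gaps(synthetic_index)
print(gaps)
```

Assuming `ts_hoek` from the script above has a `DatetimeIndex` (which the plotting code suggests), `find_large_gaps(ts_hoek.index)` should give the same kind of table for the DONAR export, and the result can be compared against the same call on the DDL-based series.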
My expectation is that for some stations data is available from 1900 onwards without large gaps. However, when looking at HOEKVHLD, we see large gaps in the dataset, sometimes spanning 17 to 20 years without data.
This prints:
Also visible in timeseries (and only from 1986 for CADZD):