
Regarding the missing site info in the datasets #81

Open
erensezener opened this issue Jul 13, 2016 · 5 comments

@erensezener (Contributor) commented Jul 13, 2016

I have been told by @ge00rg and @clauslang that the data in the daily DB comes from only one site. I don't see that this is the case; see the snippet below. But why are there sites like 0.5, 2.5, etc.? Is this expected?

import h5py
import numpy as np

h5 = h5py.File('daily_database.hdf5', 'r')
data = h5['weather_data'][:]
np.unique(data[:, 1])  # column 1 is the site ID
>>> array([ 0. ,  0.5,  1. ,  2.5,  3.5,  4. ])
@ge00rg (Contributor) commented Jul 13, 2016

I don't remember the exact query we made, nor do I know why these are the indices; they should be contiguous integers. Did you test the other DB? Can you try using the query engine on it and see whether you get plausible values for get_data and/or get_val_range?
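
Something along these lines is what I mean (just a sketch: the module name and call signatures below are guesses, so adapt them to whatever the query engine actually exposes):

# Sketch only: 'query_engine' and the signatures of get_data /
# get_val_range are assumptions, not the actual API.
from query_engine import get_data, get_val_range  # hypothetical import

sites = get_data('site')                     # all values of the site column
print('unique sites:', sorted(set(sites)))   # should be small integers
print('site range:', get_val_range('site'))  # plausible min/max?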


@erensezener (Contributor, Author)

The site values in the hourly DB are badly corrupted:

>>> import h5py
>>> import numpy as np
>>> h5 = h5py.File('hourly_database.hdf5', 'r'); data = h5['weather_data'][:]
>>> np.unique(data[:,2])
array([  0.00000000e+00,   1.00000000e+00,   4.00000000e+00,
         2.01606212e+11])

So we have sites 0, 1, and 4, plus what looks like a date (2.01606212e+11 reads like a 20160621... timestamp) in the site column. How did that get there?
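
To track down where that value comes from, something like this lists the offending rows (again a sketch, assuming column 2 is the site column as in the snippet above):

import h5py
import numpy as np

# Sketch: assumes column 2 of 'weather_data' is the site column,
# as in the snippet above.
h5 = h5py.File('hourly_database.hdf5', 'r')
data = h5['weather_data'][:]
site = data[:, 2]
bad = data[site > 1e6]  # anything this large is not a site ID
print(len(bad), 'rows with a timestamp-like value in the site column')
print(bad[:5])  # neighbouring columns may show whether the row is shifted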

@erensezener (Contributor, Author)

Can you try using the query engine on it and see whether you get
plausible values for get_data and/or get_val_range?

The DB should be essentially the same. You can check it yourself.

@erensezener (Contributor, Author)

I am re-running all the scrapers so that each one writes its output to a separate DB. Every scraper author should then review their own data, since each DB will contain only that scraper's output.

@denisalevi (Contributor)

If you want scraper authors to review their DBs, please provide a clear code snippet explaining how to access the data or use the query engine, and specify where each kind of data should be stored.
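
For example, something at roughly this level of detail (a sketch only; the filename and column layout here are assumptions based on the snippets above):

import h5py
import numpy as np

# Rough sketch of the walkthrough needed; 'my_scraper_database.hdf5' is a
# hypothetical per-scraper filename and the column layout is assumed.
h5 = h5py.File('my_scraper_database.hdf5', 'r')
print(list(h5.keys()))        # which datasets the file contains
data = h5['weather_data'][:]
print(data.shape)             # rows x columns
print(np.unique(data[:, 1]))  # e.g. the site column, if column 1 is site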
