[0.10.1] CQ missing data depending on time frame #6273

Closed
ChrisGute opened this issue Apr 8, 2016 · 4 comments

@ChrisGute

I am using CQs to downsample by site to get location averages. This works correctly at first, but I am seeing an odd issue. After the system has been running for some time, queries over long time frames (> 3h) return no current data. When I change the query to cover only the last three hours, the missing data shows up correctly. I checked using the following queries: the data is there for short time ranges but not for long ones. Restarting the influx service had no impact. Note that this only happens on the CQ version of the data; the non-downsampled data does not show the issue.

I was not able to find anything similar in the open bugs or on the google group page.

When running a query with time > now() - 4h, the data is missing in a lot of spots:

SELECT mean("respTime") FROM "site_perf" WHERE time > now() - 4h GROUP BY time(5m)

name: site_perf

time mean
1459277400000000000 1.3061995469197798
1459277700000000000 1.2545889816546227
1459278000000000000 1.2958840825182398
1459278300000000000 1.3005296896113339
1459278600000000000 1.4994842241474975
1459278900000000000 1.3306135521088178
1459279200000000000 1.3286860573264112
1459279500000000000 1.4275837201763675
1459279800000000000 1.4117126874333947
1459280100000000000 1.3656119937434639
1459280400000000000 1.3755893642976111
1459280700000000000 1.399226772671526
1459281000000000000 1.3148504251594089
1459281300000000000 1.3014612960411196
1459281600000000000 1.311433302102438
1459281900000000000 1.347239695118827
1459282200000000000 1.3717685171844447
1459282500000000000 1.5015644301052113
1459282800000000000 1.5850378565473864
1459283100000000000 0
1459283400000000000 0.17374799393926477
1459283700000000000 0
1459284000000000000 0
1459284300000000000 0.096
1459284600000000000 0.014000000000000002
1459284900000000000 0
1459285200000000000 0
1459285500000000000 2.492857142857143
1459285800000000000 0.9516666666666667
1459286100000000000 0
1459286400000000000 2.0700000000000003
1459286700000000000 3.4266666666666667
1459287000000000000 3.5212499999999998
1459287300000000000 0.016
1459287600000000000 0
1459287900000000000 0
1459288200000000000 0
1459288500000000000 0
1459288800000000000 0.27010818592073305
1459289100000000000 0.184
1459289400000000000 0
1459289700000000000 0
1459290000000000000 0
1459290300000000000
1459290600000000000
1459290900000000000
1459291200000000000
1459291500000000000
1459291800000000000

When I run the same query with time > now() - 3h, the missing data from above is filled in:

SELECT mean("respTime") FROM "site_perf" WHERE time > now() - 3h GROUP BY time(5m)

name: site_perf

time mean
1459281000000000000 1.275318617604651
1459281300000000000 1.30146129604112
1459281600000000000 1.311433302102438
1459281900000000000 1.3472396951188272
1459282200000000000 1.3717685171844447
1459282500000000000 1.5015644301052105
1459282800000000000 1.5135420900790697
1459283100000000000 1.4957340015054985
1459283400000000000 1.43372543096075
1459283700000000000 1.4071240175352513
1459284000000000000 1.3783814203235474
1459284300000000000 1.4117470407603143
1459284600000000000 1.4458131225653117
1459284900000000000 1.3450328505375468
1459285200000000000 1.3572948187655949
1459285500000000000 1.3220819130086399
1459285800000000000 1.309561898727543
1459286100000000000 1.3039778806049933
1459286400000000000 1.4032298595633403
1459286700000000000 1.3490015485617857
1459287000000000000 1.3599768612453098
1459287300000000000 1.2967420419837596
1459287600000000000 1.3160391280229236
1459287900000000000 1.3094432629125927
1459288200000000000 1.3673983683312751
1459288500000000000 1.5923603896514558
1459288800000000000 1.6009743512228622
1459289100000000000 1.4809644378331817
1459289400000000000 1.2852435368653183
1459289700000000000 1.3194637956798185
1459290000000000000 1.2904080477000102
1459290300000000000 1.2854642804823777
1459290600000000000 1.283869079503355
1459290900000000000 1.375089631257322
1459291200000000000 1.4307575832844628
1459291500000000000 1.4902668390143015
1459291800000000000
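
One way to pin down exactly which buckets disagree is to diff the two result sets by timestamp. The following is a minimal Python sketch, not part of the original report: the helper names `to_utc` and `suspicious_buckets` and the 0.05 tolerance are assumptions chosen for illustration, and only a handful of rows from the two outputs above are copied in.

```python
from datetime import datetime, timezone

def to_utc(ns):
    """Convert a nanosecond epoch timestamp (as printed above) to a UTC datetime."""
    return datetime.fromtimestamp(ns // 10**9, tz=timezone.utc)

def suspicious_buckets(long_range, short_range, tolerance=0.05):
    """Return timestamps present in both results whose means disagree.

    None stands for a bucket that came back empty. The tolerance is an
    arbitrary illustrative threshold, not a value from the issue.
    """
    flagged = []
    for ts in sorted(set(long_range) & set(short_range)):
        a, b = long_range[ts], short_range[ts]
        if a is None and b is None:
            continue
        if a is None or b is None or abs(a - b) > tolerance:
            flagged.append(ts)
    return flagged

# A few rows copied from the "now() - 4h" output above.
four_hours = {
    1459282500000000000: 1.5015644301052113,
    1459283100000000000: 0.0,
    1459283400000000000: 0.17374799393926477,
    1459290300000000000: None,
}
# The same timestamps from the "now() - 3h" output above.
three_hours = {
    1459282500000000000: 1.5015644301052105,
    1459283100000000000: 1.4957340015054985,
    1459283400000000000: 1.43372543096075,
    1459290300000000000: 1.2854642804823777,
}

for ts in suspicious_buckets(four_hours, three_hours):
    print(to_utc(ts).isoformat(), four_hours[ts], three_hours[ts])
```

Feeding in the full outputs instead of the four sample rows would show where the disagreement starts and whether it lines up with some boundary in time.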

When I look at the data that feeds the CQ, the data is there. Below, the query works for time > now() - 4h:

SELECT mean("respTime") FROM "serverPerf" WHERE time > now() - 4h GROUP BY time(5m)

name: respTime

time mean
1459277400000000000 2.81403007363919
1459277700000000000 2.769030150037585
1459278000000000000 2.875411656063908
1459278300000000000 2.938956235959644
1459278600000000000 3.1438613902584938
1459278900000000000 3.093386894610158
1459279200000000000 3.1460658696141133
1459279500000000000 3.142944734953221
1459279800000000000 3.3945110604472206
1459280100000000000 3.1054964714620636
1459280400000000000 3.080622421048196
1459280700000000000 3.2474383066596735
1459281000000000000 3.0608304033090725
1459281300000000000 2.984799693588042
1459281600000000000 3.0164897841606755
1459281900000000000 3.014098693520831
1459282200000000000 3.1474121588493484
1459282500000000000 3.355392116692471
1459282800000000000 3.373986779031815
1459283100000000000 3.580072823568203
1459283400000000000 3.3698958784260085
1459283700000000000 3.41120978200439
1459284000000000000 3.299251425757843
1459284300000000000 3.392203178113466
1459284600000000000 3.359279251866105
1459284900000000000 3.217150725169671
1459285200000000000 3.383973145858318
1459285500000000000 3.2520532857492794
1459285800000000000 3.262755255409653
1459286100000000000 3.1858383025648087
1459286400000000000 3.303823576169274
1459286700000000000 3.2602606388405437
1459287000000000000 3.3168824234054584
1459287300000000000 3.345516673528062
1459287600000000000 3.3700380197046393
1459287900000000000 3.334486536107518
1459288200000000000 3.4712444197574546
1459288500000000000 3.6871278549174007
1459288800000000000 3.7832157623633167
1459289100000000000 3.5756727853472796
1459289400000000000 3.30877900903283
1459289700000000000 3.4020033327159838
1459290000000000000 3.318592838302125
1459290300000000000 3.3159804856277812
1459290600000000000 3.3512863592321307
1459290900000000000 3.4235407971190663
1459291200000000000 3.4895092049862066
1459291500000000000 3.540926032463371
1459291800000000000 3.601479414551974

CQ:

siteRoleup
CREATE CONTINUOUS QUERY siteRoleup ON serverPerf
RESAMPLE EVERY 30s FOR 5m
BEGIN
  SELECT mean(respTime) AS respTime
  INTO serverPerf.oneyear.site_perf
  FROM serverPerf.oneyear.respTime
  GROUP BY time(1m), serverGroup, site
END
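
To reason about what a single run of this CQ touches, it can help to list the GROUP BY buckets that fall inside the RESAMPLE window. The sketch below is plain Python arithmetic, not InfluxDB code. It assumes a run covers the range from `now - FOR` up to `now` and that buckets are aligned to multiples of the GROUP BY interval since the Unix epoch; the exact boundary handling in 0.10.1 may differ. The function name `resample_buckets` and the example `now` value are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def resample_buckets(now, for_interval, group_by):
    """List start times of the GROUP BY buckets overlapping [now - for_interval, now).

    Assumption: buckets are aligned to multiples of `group_by` since the epoch.
    """
    start = now - for_interval
    # Round the window start down to the nearest bucket boundary.
    first = start - ((start - EPOCH) % group_by)
    buckets = []
    t = first
    while t < now:
        buckets.append(t)
        t += group_by
    return buckets

# Hypothetical run time; FOR 5m and time(1m) are taken from the CQ above.
now = datetime(2016, 3, 29, 20, 25, 30, tzinfo=timezone.utc)
for bucket in resample_buckets(now, timedelta(minutes=5), timedelta(minutes=1)):
    print(bucket.isoformat())
```

Under these assumptions each run only recomputes buckets from the last few minutes, which is worth keeping in mind when comparing against results that reach hours back.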

Influx Info:

name: build

Branch  Build Time                  Commit   Version
HEAD    2016-02-18T20:44:27.807242  df902a4  0.10.1

name: runtime

GOARCH  GOMAXPROCS  GOOS   version
amd64   16          linux  go1.4.3

Prebuilt package

System Info:

Linux 3.2.0-30-generic #48-Ubuntu SMP Fri Aug 24 16:52:48 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Distributor ID: Ubuntu
Description: Ubuntu 12.04.2 LTS
Release: 12.04
Codename: precise

48 GB RAM
2x Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
6-drive spinning-disk software RAID (RAID 10)

Influx conf mostly default.

[data]
enabled = true
dir = "/data/influxdb/data"
engine = "tsm1"
max-wal-size = 26214400
wal-flush-interval = "3m0s"
wal-partition-flush-delay = "2s"
wal-dir = "/data/influxdb/wal"
wal-logging-enabled = true
wal-ready-series-size = 30720
wal-compaction-threshold = 0.5
wal-max-series-size = 1048576
wal-flush-cold-interval = "5s"
wal-partition-size-threshold = 52428800
query-log-enabled = true
cache-max-memory-size = 524288000
cache-snapshot-memory-size = 26214400
cache-snapshot-write-cold-duration = "1h0m0s"
compact-full-write-cold-duration = "24h0m0s"
max-points-per-block = 0
data-logging-enabled = true

Any ideas on how to debug this, or pointers to something dumb I am doing, would be a huge help.

@KiNgMaR

KiNgMaR commented Apr 14, 2016

#6096 #5951 #5415

@joanniclaborde

I'm experiencing the same behaviour: I verified that my data exists in the original measurement and in the downsampled measurement (generated with a CQ and with some manual backfilling). When I query the downsampled measurement with a time range longer than about 3.25 days, big chunks of data are missing from the results.

I'm using 0.10.3.

@ChrisGute
Author

I noticed something interesting while collecting data: as we put more data into the system, the exact same behavior is now happening to non-CQ data.

Does anyone have any idea how I can troubleshoot this?
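
As a starting point for troubleshooting, one could capture the raw JSON returned for each time range and count the empty buckets programmatically rather than by eye. The following is a hedged Python sketch: the response shape (`results` / `series` / `columns` / `values`) is what I would expect from the HTTP query endpoint and should be treated as an assumption for 0.10.x, the helper names are hypothetical, and `sample_body` is hand-written from three rows of the 4h output in the original report rather than captured from a server.

```python
import json

def rows_from_response(body):
    """Flatten a query response body into a list of dicts, one per returned row."""
    rows = []
    for result in json.loads(body).get("results", []):
        for series in result.get("series", []):
            for values in series["values"]:
                row = dict(zip(series["columns"], values))
                row["series"] = series["name"]
                rows.append(row)
    return rows

def empty_buckets(rows, field="mean"):
    """Return the timestamps of rows whose aggregate came back as null."""
    return [row["time"] for row in rows if row.get(field) is None]

# Hand-written body in the assumed response shape (not captured output).
sample_body = json.dumps({
    "results": [{
        "series": [{
            "name": "site_perf",
            "columns": ["time", "mean"],
            "values": [
                [1459290000000000000, 0],
                [1459290300000000000, None],
                [1459290600000000000, None],
            ],
        }]
    }]
})

print(empty_buckets(rows_from_response(sample_body)))
```

Running the same count for several ranges (3h, 4h, and longer) would show whether the number of empty buckets changes abruptly at a particular range.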

@jsternberg
Contributor

I'm sorry this wasn't answered, but 0.10.3 is out of date and not maintained. Please try migrating your databases and upgrading to 1.0. If you still have issues with this after upgrading, comment and I'll reopen the issue. If you need help upgrading, you can ask questions on our mailing list.
