prevent duplicate time, sequence_number on write #695

code0x9 · 2014-06-26T07:04:01Z

duplicate "time" or "sequence_number" column leads to panic. prevent this by filter out invalid request.

test request

/db/<database>/series?u=<user>&p=<pass>
[
  {
    "name" : "hd_used",
    "columns" : ["time", "value", "time", "host", "mount"],
    "points" : [
      [1403761091094, 23.2, 1403761091094, "serverA", "/mnt"]
    ]
  }
]

result

panic: runtime error: index out of range

goroutine 18 [running]:
runtime.panic(0x448dd60, 0x4a601f7)
    /Applications/go/src/pkg/runtime/panic.c:266 +0xb6
datastore.(*Shard).Write(0xc2100d3940, 0xc21031ae86, 0x4, 0xc2102eac48, 0x1, ...)
    /Users/niko/git/influxdb/src/datastore/shard.go:83 +0xc7f
datastore.(*ShardDatastore).Write(0xc21009baf0, 0xc21014b800, 0x0, 0x0)
    /Users/niko/git/influxdb/src/datastore/shard_datastore.go:231 +0xf2
cluster.(*WriteBuffer).write(0xc210106690, 0xc21014b800)
    /Users/niko/git/influxdb/src/cluster/write_buffer.go:88 +0x86
cluster.(*WriteBuffer).handleWrites(0xc210106690)
    /Users/niko/git/influxdb/src/cluster/write_buffer.go:79 +0xb7
created by cluster.NewWriteBuffer
    /Users/niko/git/influxdb/src/cluster/write_buffer.go:43 +0x24f

otoolep · 2014-06-26T15:55:13Z

Seems like a good catch. Is this just 'time', or if any column is duplicated, will this happen?

I'm not sure, since I am new to this code base, but I think this kind of validation should happen higher up the API stack -- in src/api/http/api.go. Sanitize all incoming JSON there, removing all duplicates for example (without actually caring what column names are). Today it may be the 'time' column, tomorrow it will be something else.

What about adding a unit test?

code0x9 · 2014-06-26T18:30:04Z

I think sanitize all JSON leads to performance degradation, And the func ConvertToDataStoreSeries at src/common/serialize_series.go treats "time" and "sequence_number" exceptionally. So if the other column is duplicated, last column wins.

Main objective of pull request above is to prevent panic which leads to broken shard data. You can test by send write request with duplicate "time" field. After that, influxdb panics and you cannot repair old series data.

otoolep · 2014-06-26T19:01:31Z

What I mean is can you add a unit test to your change which sends duplicate 'time' columns, and confirms that the system doesn't blow up? And that the later time wins?

otoolep · 2014-06-26T23:32:10Z

Oh, by "sanitize", I just mean the column field, removing any dupes, since it seems to have a fairly constrained specification. I agree that sanitizing the entire JSON blob doesn't make much sense.

code0x9 · 2014-06-27T01:17:16Z

What about points? When you remove duplicate column, you have to remove obsolete point too because serialize series loops over Columns. (and that means "doesn't make sense" situation)

Think about this situation.

column: ["a", "b", "b", "c"]
points: [1, 2, 2, 3]

If you remove "b" only, column "c" will have value 2, like this.

column: ["a", "b", "c"]
points: [1, 2, 2, 3]

I don't think unit test helps. Also, tests on master is already failing on my macbook.

FAIL: wal_test.go:536: WalSuite.TestSimultaneousReplay

wal_test.go:563:
    c.Assert(len(requests), InRange, 100, 200)
... obtained int = 235
... expected greater than int = 100
... expected less than int = 200

OOPS: 24 passed, 1 FAILED, 1 PANICKED
--- FAIL: Test-8 (3.54 seconds)
FAIL
FAIL    wal 3.571s
make: *** [test] Error 1

otoolep · 2014-06-27T01:40:43Z

Good point about the corresponding data -- I hadn't thought of that. I was only thinking in terms of duplicate column names, no more.

I still don't follow why, with your fix in place, a unit test couldn't be added to prove this issue was no longer an issue. It would prevent a regression in the future. Isn't that useful?

code0x9 · 2014-06-27T05:35:45Z

Special case "time" and "sequence_number" is limited to func ConvertToDataStoreSeries. Anyone who will work on this source will notice that without reading unit test.

And. as I said, test on master is failing already. Not working test is meaningless. Don't you think?

I added tests anyway.

jvshahid · 2014-06-30T18:35:13Z

I merged in the pr but changed the implementation to check for duplicates in the Columns member of the series. This should avoid other confusing problems that might arise with duplicate field names other than time and sequence_number

otoolep · 2014-07-01T01:08:01Z

@jvshahid -- maybe I am missing something, but I don't see how the PR that was merged differs from what was first proposed. Where is the code that checks for dupes in Columns?

jvshahid · 2014-07-01T14:50:27Z

You can see it here https://github.com/influxdb/influxdb/blob/master/src/common/serialize_series.go#L54-L63

code0x9 · 2014-07-01T15:02:45Z

I was so curious about those needless conversations...
Thank you, @jvshahid. You teach me how to distinguish "Owner" and "NOBODY".
I won't make mistake next time.

P.S. Your patch is much better than mine.

jvshahid · 2014-07-01T18:11:25Z

Oh forgot to mention. Do you mind signing the CLA. You can find it on the website, http://influxdb.com/community/cla.html

code0x9 · 2014-07-01T18:16:00Z

I have signed CLA. thank you.

prevent duplicate time, sequence_number on write

8129f80

added unit test.

29cd751

jvshahid closed this in 9508c16 Jun 30, 2014

jvshahid added this to the 0.7.4 milestone Jul 7, 2014

jvshahid modified the milestones: 0.8.0, 0.7.4 Jul 24, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

prevent duplicate time, sequence_number on write #695

prevent duplicate time, sequence_number on write #695

code0x9 commented Jun 26, 2014

otoolep commented Jun 26, 2014

code0x9 commented Jun 26, 2014

otoolep commented Jun 26, 2014

otoolep commented Jun 26, 2014

code0x9 commented Jun 27, 2014

otoolep commented Jun 27, 2014

code0x9 commented Jun 27, 2014

jvshahid commented Jun 30, 2014

otoolep commented Jul 1, 2014

jvshahid commented Jul 1, 2014

code0x9 commented Jul 1, 2014

jvshahid commented Jul 1, 2014

code0x9 commented Jul 1, 2014

prevent duplicate time, sequence_number on write #695

prevent duplicate time, sequence_number on write #695

Conversation

code0x9 commented Jun 26, 2014

test request

result

otoolep commented Jun 26, 2014

code0x9 commented Jun 26, 2014

otoolep commented Jun 26, 2014

otoolep commented Jun 26, 2014

code0x9 commented Jun 27, 2014

otoolep commented Jun 27, 2014

code0x9 commented Jun 27, 2014

jvshahid commented Jun 30, 2014

otoolep commented Jul 1, 2014

jvshahid commented Jul 1, 2014

code0x9 commented Jul 1, 2014

jvshahid commented Jul 1, 2014

code0x9 commented Jul 1, 2014