Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[influxdb] Exception when storing integer values #8798

Closed
wborn opened this issue Oct 18, 2020 · 28 comments · Fixed by #8831
Closed

[influxdb] Exception when storing integer values #8798

wborn opened this issue Oct 18, 2020 · 28 comments · Fixed by #8831
Assignees
Labels
bug An unexpected problem or unintended behavior of an add-on
Milestone

Comments

@wborn
Copy link
Member

wborn commented Oct 18, 2020

The following Exception is logged when integer values are stored by the OH3 influxdb persistence add-on:

[TRACE] [e.influxdb.InfluxDBPersistenceService] - Storing item Outside_Humidity (Type=NumberItem, State=91 %, Label=Humidity, Category=humidity, Groups=[Humidity]) in InfluxDB point org.openhab.persistence.influxdb.internal.InfluxPoint@2012713f
[ERROR] [org.influxdb.impl.BatchProcessor     ] - Batch could not be sent. Data will be lost
org.influxdb.InfluxDBException$FieldTypeConflictException: partial write: field type conflict: input field "value" on measurement "Outside_Humidity" is type integer, already exists as type float dropped=1
	at org.influxdb.InfluxDBException.buildExceptionFromErrorMessage(InfluxDBException.java:144) ~[bundleFile:?]
	at org.influxdb.InfluxDBException.buildExceptionForErrorState(InfluxDBException.java:173) ~[bundleFile:?]
	at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:827) ~[bundleFile:?]
	at org.influxdb.impl.InfluxDBImpl.write(InfluxDBImpl.java:460) ~[bundleFile:?]
	at org.influxdb.impl.OneShotBatchWriter.write(OneShotBatchWriter.java:22) ~[bundleFile:?]
	at org.influxdb.impl.BatchProcessor.write(BatchProcessor.java:340) [bundleFile:?]
	at org.influxdb.impl.BatchProcessor$1.run(BatchProcessor.java:287) [bundleFile:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:834) [?:?]

This value is based on a humidity channel of the openweathermap binding (openweathermap:weather-and-forecast:account:local:current#humidity).

I queried the influxdb (1.8.3) and it looked like the Outside_Humidity series only had integer values. It does seem that the OH3 influxdb persistence now adds an item tag which previously wasn't stored. When I manually update the item to e.g. "91.0 %" it did store the value.

After I removed the series it was possible to store new integer values. However if I would then add a float it would error with:

[TRACE] [e.influxdb.InfluxDBPersistenceService] - Storing item Outside_Humidity (Type=NumberItem, State=92.2 %, Label=Humidity, Category=humidity, Groups=[Humidity]) in InfluxDB point org.openhab.persistence.influxdb.internal.InfluxPoint@730a50c2
[ERROR] [org.influxdb.impl.BatchProcessor     ] - Batch could not be sent. Data will be lost
org.influxdb.InfluxDBException$FieldTypeConflictException: partial write: field type conflict: input field "value" on measurement "Outside_Humidity" is type float, already exists as type integer dropped=1
	at org.influxdb.InfluxDBException.buildExceptionFromErrorMessage(InfluxDBException.java:144) ~[bundleFile:?]
	at org.influxdb.InfluxDBException.buildExceptionForErrorState(InfluxDBException.java:173) ~[bundleFile:?]
	at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:827) ~[bundleFile:?]
	at org.influxdb.impl.InfluxDBImpl.write(InfluxDBImpl.java:460) ~[bundleFile:?]
	at org.influxdb.impl.OneShotBatchWriter.write(OneShotBatchWriter.java:22) ~[bundleFile:?]
	at org.influxdb.impl.BatchProcessor.write(BatchProcessor.java:340) [bundleFile:?]
	at org.influxdb.impl.BatchProcessor$1.run(BatchProcessor.java:287) [bundleFile:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:834) [?:?]

I also saw the add-on is now using a new client library which may be causing this new behavior.

@wborn wborn added the bug An unexpected problem or unintended behavior of an add-on label Oct 18, 2020
@wborn
Copy link
Member Author

wborn commented Oct 18, 2020

This also seems to be an issue for Items with OnOffType state which is stored as 0 and 1 values in the database.

@lujop lujop self-assigned this Oct 19, 2020
@lujop
Copy link
Contributor

lujop commented Oct 19, 2020

Thanks @wborn, one question, when you queried the InfluxDB do you remember if the integer values from the binding were stored as integers or floats in InfluxDB?

@wborn
Copy link
Member Author

wborn commented Oct 19, 2020

When the influxdb add-on was able to store values the number of decimals was as expected.

I'm just testing my migration path and now restored the original influxdb DB that I still use with my OH2 instance. It seems that now the OH3 influxdb add-on is able to store integer values in the existing series. But it could be that it is now working because in the meantime I had to reset the mapdb (#8794) which I have configured with a restoreOnStartup strategy. The mapdb add-on now uses Gson for (de)serialization and I have seen before that when (de)serializing integers with Gson the precision may get lost eclipse-archived/smarthome#3774

@mccaffm
Copy link

mccaffm commented Oct 19, 2020

Thanks @wborn, one question, when you queried the InfluxDB do you remember if the integer values from the binding were stored as integers or floats in InfluxDB?

For my setup its either...

org.influxdb.InfluxDBException$FieldTypeConflictException: partial write: field type conflict: input field "value" on measurement "GarageLUMI" is type float, already exists as type integer dropped=1

org.influxdb.InfluxDBException$FieldTypeConflictException: partial write: field type conflict: input field "value" on measurement "OPIOnOff15_ElectricMeterWatts" is type integer, already exists as type float dropped=1

im pretty sure you can see whats happeneing... if the first value is a whole number its written as an integer, otherwise its a float irrespective of what has been written previously.

@LukasA83
Copy link
Contributor

I also see this issue in OH3. Looking at both source codes, I believe in OH 2 influxdb provider always translated decimaltype into float, independent of the scale. With the rewritten persistence addon, it is considering the scale and either writing integer or float. That results in existing series with float will not accept integer, at least on my influxdb instance.
Currently, I've re-created the respective series in influxdb with integer values.

Maybe we are able to configure this behaviour in the rewritten influx addon to allow backwards compatibility?

@wborn
Copy link
Member Author

wborn commented Oct 19, 2020

I believe in OH 2 influxdb provider always translated decimaltype into float, independent of the scale

No it did not. The same optimization was already used in OH2.

Do you also use mapdb to restore values on startup? If mapdb restores values with a wrong scale it can break the series in influxdb.

@lujop
Copy link
Contributor

lujop commented Oct 19, 2020

@wborn When you say the number of decimals was as expected what do you mean? In your Influx database from OH 2.0 how are the integers stored inside Influx, as floats or integers?
I checked in mine that I also have humidity from weather binding and it's stored as float that makes sense with what @LukasA83 said.

About your comment @LukasA83 it was my initial explication, but looking at code, at first sight, it doesn't seem the intent.
The intention to try to use an Integer when is possible is inherited from old code: Look here at convertBigDecimalToNum method and stateToObject ones.
In fact, it has an explicit comment saying that using integer has better performance.

I really think that the problem was with the old driver that is 4 years old:
influxdata/influxdb-java#153 (comment) and that now the new drivers preserves types and its problematic

If I confirm that I think that the best decision is to use always floats. In fact for me is the best decision although we started from a fresh start because it's very dangerous to try to use integer while OH doesn't have an IntegerType. Because if an addon returns sometimes integers and sometimes floats it will break the persistence, or a case when an addon change the precision in an addon update

@wborn
Copy link
Member Author

wborn commented Oct 19, 2020

When you say the number of decimals was as expected what do you mean? In your Influx database from OH 2.0 how are the integers stored inside Influx, as floats or integers?

That the number of decimals of the state reported by openHAB exactly matches that of the number of decimals stored in the influxdb. So in my case with the openweathermap binding humidity channel, both the old and new (after removing the series) store the values as integers in the influxdb series.

I checked in mine that I also have humidity from weather binding and it's stored as float that makes sense with what @LukasA83 said.

If the binding provides floats on that particular humidity channel that's of course possible while another binding can provide integers on a humidity channel.

I'll have a look at the mapdb restore values on startup behavior as that persistence add-on was also rewritten.

@lujop
Copy link
Contributor

lujop commented Oct 19, 2020

Sure @wborn, but what I wanted to say is that now that we know that InfluxDB has problems using several types for the same field is dangerous to use integers.
Because as the type you receive for a single channel is DecimalType and nor IntegerType you can't be sure that if the first time you receive an integer from a channel you will receive always integers from the same channel, can be?

I will try to debug when I've time because it's not clear to me still how the old addon avoids the different types problem.

Because looking at drivers source code seems good:
https://github.com/influxdata/influxdb-java/blob/influxdb-java-2.2/src/main/java/org/influxdb/dto/Point.java (line 327)

@wborn
Copy link
Member Author

wborn commented Oct 19, 2020

Because as the type you receive for a single channel is DecimalType and nor IntegerType you can't be sure that if the first time you receive an integer from a channel you will receive always integers from the same channel, can be?

Yes it's an accident waiting to happen. I can also check what happens with the OH2 influxdb addon what happens if a float ends up in an integer series.

@LukasA83
Copy link
Contributor

@wborn you are right, I'm using MapDB to restore values on startup.

@lujop
Copy link
Contributor

lujop commented Oct 19, 2020

I've test now sending an integer value from OH 2.0 and it's stored as float in InfluxDB:

# From openhab console
smarthome:send TestNumberItem 5

#From InfluxDB
SELECT value AS "mean_value" FROM "openhab_db"."origin"."TestNumberItem" WHERE time > :dashboardTime: AND time < :upperDashboardTime:
5.0

Can someone confirm if he has some values stored as integers in a database only written by OH 2.0?
Please do a plain select to be sure, without any mean or group by aggregation / filling as the select in that comment, and use table visualization

@lujop lujop added this to the 3.0.0.M2 milestone Oct 19, 2020
@wborn
Copy link
Member Author

wborn commented Oct 19, 2020

Can someone confirm if he has some values stored as integers in a database only written by OH 2.0?

Yes this is from the influxdb I am using currently using with OH 2.5.9:

> select * from Outside_Humidity where time > now()-1d
name: Outside_Humidity
time                value
----                -----
1603061530305000000 93
1603068730603000000 100
1603072330783000000 93
1603075930890000000 100
1603086731377000000 93
1603090331503000000 100
1603097531779000000 93
1603104732059000000 89
1603108332188000000 77
1603111932325000000 73
1603115532490000000 65
1603119132645000000 58
1603122732801000000 66
1603129933128000000 76
1603137133446000000 81

@wborn
Copy link
Member Author

wborn commented Oct 19, 2020

It looks like the mapdb restore seems to work OK after fixing (openhab/openhab-core#1737) and adding/enabling debug logging:

[apdb.internal.MapDbPersistenceService] - Deserialized 'Outside_Humidity' with state '98 %' from '{"name":"Outside_Humidity","state":"org.openhab.core.library.types.QuantityType@@@98 %","timestamp":"Oct 19, 2020, 11:37:56 PM"}'
[tence.internal.PersistenceManagerImpl] - Restored item state from '2020-10-19T23:37:56+02:00[Europe/Amsterdam]' for item 'Outside_Humidity' -> '98 %'
[smarthome.event.ItemStateChangedEvent] - Item 'Outside_Humidity' changed from NULL to 98 %

These floats causing these issues could also be restored by the default rrd4j persistence service, which is nowadays automatically installed. See #8806.

@wborn
Copy link
Member Author

wborn commented Oct 19, 2020

When I only use rrd4j to restore on startup the precision seems to get lost:

Before restart:

openhab> openhab:update Outside_Humidity '95 %'
Update has been sent successfully.
openhab> items list Outside_Humidity
Outside_Humidity (Type=NumberItem, State=95 %, Label=Humidity, Category=humidity)

After restart:

00:01:01.533 [DEBUG] [tence.internal.PersistenceManagerImpl] - Restored item state from '2020-10-20T00:00:00+02:00[Europe/Amsterdam]' for item 'Outside_Humidity' -> '95.0'
00:01:01.533 [INFO ] [smarthome.event.ItemStateChangedEvent] - Item 'Outside_Humidity' changed from NULL to 95.0 %

openhab> items list Outside_Humidity
Outside_Humidity (Type=NumberItem, State=95.0 %, Label=Humidity, Category=humidity)

@lujop
Copy link
Contributor

lujop commented Oct 19, 2020

Then it seems not to be a problem on the persistence add-on per see, isn't it?
But still, I see it too much fragile and it's not clear to me the best solution to implement.

One thing that I would like at least, is an option to use always floats when storing values, and I think that is the safest default for new users. But of course, not good for compatibility.

@wborn
Copy link
Member Author

wborn commented Oct 20, 2020

It could of course always be a setting but that introduces complexity. It would also be strange to start storing on/off values as 0.0 and 1.0. Switching to floats may also break graphs or anything else querying InfluxDB.

I will do some more testing to see if the issues I ran into really were due to the rrd4j restore on startup behavior. If that's the case it would be better to either fix rrd4j or not use it for restoring values on startup. Such issues may also impact the data stored by other persistence add-ons. We shouldn't be creating workarounds in every persistence add-on just because rrd4j is buggy.

@mccaffm
Copy link

mccaffm commented Oct 20, 2020 via email

@wborn
Copy link
Member Author

wborn commented Oct 20, 2020

Can you check if you have rrd4j installed @mccaffm? It is installed by default, restoring items on startup, see #8806.

@mccaffm
Copy link

mccaffm commented Oct 20, 2020 via email

@lujop
Copy link
Contributor

lujop commented Oct 20, 2020

It could of course always be a setting but that introduces complexity. It would also be strange to start storing on/off values as 0.0 and 1.0. Switching to floats may also break graphs or anything else querying InfluxDB.

I will do some more testing to see if the issues I ran into really were due to the rrd4j restore on startup behavior. If that's the case it would be better to either fix rrd4j or not use it for restoring values on startup. Such issues may also impact the data stored by other persistence add-ons. We shouldn't be creating workarounds in every persistence add-on just because rrd4j is buggy.

I agree that it has to be fixed in rrd4j, but for me, the inherent problem has a broader scope.
As we commented storing a DecimalType as an integer for me is not safe, because there isn't any guarantee that later values will be also integers.
I think that current behavior can be enough good to be the default behavior for compatibility reasons. But for new users who want to have it safer, I think that to have an option to store always the same OH Types with the same InfluxDB type is good to have.
It doesn't affect Booleans because they are another type, the problem is using different InfluxDB types for the same OH type that's the case for DecimalType (integer, float).
If someday is an IntegerType then it will be safe to use integers.

@wborn
Copy link
Member Author

wborn commented Oct 20, 2020

I did some more testing and also ran into the issue without having rrd4j installed. So I think it has to do with the new client library. It also seems that storing both integers and floats was never an issue with the library in OH 2.5.x. I was able to store both without any issue:

> select * from Outside_Humidity where time > now()-1d
name: Outside_Humidity
time                value
----                -----
1603198335974000000 77
1603212705936000000 93
1603212858295000000 91.5
1603212866313000000 91
1603212888119000000 99
1603212894625000000 99.2

I also tested with removing the tags but that also did not fix the issue. Also when I remove the optimization in convertBigDecimalToNum to always use the doubleValue the issue is resolved and I still only see integers being stored. So it might be that the library already does these optimizations itself now and we can safely remove this code.

@lujop
Copy link
Contributor

lujop commented Oct 20, 2020

That is good news, althought I can't understand completely yet:

https://docs.influxdata.com/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-field-type-discrepancies-across-shards

The documentation says that it can be two different types in the same shard. It may be creates two shards when the types are different? Or maybe the problems are if different values are in the same batch¿?

I will try to isolate in a simple main and look at drivers code or ask a question on InfluxDB mailing list to be sure about the fix.
Is there any date for milestone 2?

@kaikreuzer
Copy link
Member

Is there any date for milestone 2?

Suggestion would be Nov 1.

@wborn
Copy link
Member Author

wborn commented Oct 20, 2020

The documentation says that it can be two different types in the same shard. It may be creates two shards when the types are different? Or maybe the problems are if different values are in the same batch¿?

Maybe it only applies when using the new client? I used the exact same data and InfluxDB 1.8.3 with OH 2.5 and OH 3. I've replicated all the Docker containers+data used by my production OH 2.5 instance on my desktop to test my migration to OH 3.

@lujop
Copy link
Contributor

lujop commented Oct 20, 2020

I will check tomorrow if I can to see the real cause, I only had time to do a little test but with new client it's easy to reproduce, writing only to points (a double and an integer) in a clean database and I caught the error.

@lujop
Copy link
Contributor

lujop commented Oct 21, 2020

Finally, I found the mystery, the old addon really stores all numbers as double. The difference is not in the driver's version, it's in the deprecated method it uses. Old addon uses Point.field() that is deprecated and stores always numeric values as double (in all versions, you can see here).
And the new addon versions use the non-deprecated method addField that preserves types.

    InfluxDB influxDB = InfluxDBFactory.connect("http://localhost:8086", "telegraf" , null);

    Point point1Old = Point.measurement("test0").field("value",50).build();
    Point point1New = Point.measurement("test0").addField("value",50).build();
    Point point2Old = Point.measurement("test0").field("value",2.5d).build();
    Point point2New = Point.measurement("test0").addField("value",2.5d).build();

    System.out.println(point1Old.lineProtocol());
    System.out.println(point1New.lineProtocol());
    System.out.println(point2Old.lineProtocol());
    System.out.println(point2New.lineProtocol());


    influxDB.write("test","autogen",point1Old);
    influxDB.write("test","autogen",point2Old);
    influxDB.write("test","autogen",point2New);
    try {
        influxDB.write("test", "autogen", point1New);
    } catch (Exception e) {
        System.out.println(e.getMessage());
    }

    QueryResult queryResult = influxDB.query(new Query("select value from test.autogen.test0 where time > now() -1d"));
    System.out.println(queryResult);

Outputs:

test0 value=50.0
test0 value=50i
test0 value=2.5
test0 value=2.5
partial write: field type conflict: input field "value" on measurement "test0" is type integer, already exists as type float dropped=1
QueryResult [results=[Result [series=[Series [name=test0, tags=null, columns=[time, value], values=[[2020-10-21T18:04:29.0007762Z, 50.0], [2020-10-21T18:04:29.0193208Z, 2.5], [2020-10-21T18:04:29.0341714Z, 2.5]]]], error=null]], error=null]

Select in terminal using influx:

SELECT value from test.autogen.test0 where time > now() -1d
time                value
----                -----
1603303469000776200 50
1603303469019320800 2.5
1603303469034171400 2.5

show field keys from test.autogen.test0
fieldKey fieldType
 -------- ---------
 value    float

But select on Chronograf
image

It seems that Chronograf is saying the truth here.

About the Openhab part, as when the value is read is parsed directly in DecimalType constructor passing a string it doesn't matter if it's integer or float.

In conclusion, I will remove the convertBigDecimalToNum that for me is the safest way because Influx not supporting different types in a shard, and it will be retro compatible with the old version because at the end it always used doubles.

@wborn
Copy link
Member Author

wborn commented Oct 21, 2020

Great! Thanks for having a look at it and it indeed explains the mystery. It's nice to know the root cause and that we can fix the issue without having to add any configuration options while staying compatible with existing databases. 👍

lujop added a commit to lujop/openhab-addons that referenced this issue Oct 21, 2020
Fixes openhab#8798
Improve documentation with more explicit information about Influx types used
Implement toSTring to InfluxPoint to allow some trace info to be useful in case it's needed

Signed-off-by: Joan Pujol <[email protected]>
wborn pushed a commit that referenced this issue Oct 22, 2020
…8831)

* Update documentation with changed Influx2 RC port
* Fix problem with non decimal numeric types

Improve documentation with more explicit information about Influx types used
Implement toString to InfluxPoint to allow some trace info to be useful in case it's needed

Fixes #8697
Fixes #8798

Signed-off-by: Joan Pujol <[email protected]>
boehan pushed a commit to boehan/openhab-addons that referenced this issue Apr 12, 2021
…penhab#8831)

* Update documentation with changed Influx2 RC port
* Fix problem with non decimal numeric types

Improve documentation with more explicit information about Influx types used
Implement toString to InfluxPoint to allow some trace info to be useful in case it's needed

Fixes openhab#8697
Fixes openhab#8798

Signed-off-by: Joan Pujol <[email protected]>
marcfischerboschio pushed a commit to bosch-io/openhab-addons that referenced this issue May 5, 2022
…penhab#8831)

* Update documentation with changed Influx2 RC port
* Fix problem with non decimal numeric types

Improve documentation with more explicit information about Influx types used
Implement toString to InfluxPoint to allow some trace info to be useful in case it's needed

Fixes openhab#8697
Fixes openhab#8798

Signed-off-by: Joan Pujol <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An unexpected problem or unintended behavior of an add-on
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants