Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KSQL 5.4.0-beta1 - Decimal support is broken within Struct #3874

Open
robinroos opened this issue Nov 16, 2019 · 14 comments
Open

KSQL 5.4.0-beta1 - Decimal support is broken within Struct #3874

robinroos opened this issue Nov 16, 2019 · 14 comments

Comments

@robinroos
Copy link

I have cloned examples, and updated my local copy of docker-compose.yml from cp-all-in-one to use the 5.4.0-beta1 Confluent Platform images.

I am working with Avro IDL-generated schemas modelling FX Trades. My message Keys and also Values are AVRO "records".

I had issues with DECIMAL support, even outside KSQL, so I added Pi=3.141 to the top-level structure of the Trade message. Pi helped me identify that spring.kafka.properties.specific.avro.reader=true must be set when using SpecificAvro consumers.

KSQL seems to work well with the field Pi - I can see its value, I can SELECT it from the message, etc.

The Trade message also contains an embedded structure, called FXLeg. That structure contains quite a few DECIMALS. Those decimals, the ones inside STRUCT, are not supported by KSQL. Their values are incorrectly rendered (as if the String representation of the byte[] is being displayed, without conversion through the appropriate deserialiser), and I am unable to SELECT any column (regardless of type) from within the STRUCT.

As you will see below the Trade message contains two instances of the STRUCT, called "NearLeg" and "FarLeg", and the messages I am sending have FarLeg=null, just in case that is relevant. Everything below refers to NearLeg which is not null.

#1 Version Info from KSQL CLI - confirms I have 5.4.0-beta1 CLI and Server

CLI v5.4.0-beta1, Server v5.4.0-beta1 located at http://0.0.0.0:8088

#2 Creating the stream from the topic

ksql> create stream trades with (kafka_topic='trade', value_format = 'AVRO');

#3 Describe the stream

ksql> describe trades;
Name                 : TRADES
 Field         | Type                                                                                                                                                   
------------------------
 ROWTIME       | BIGINT           (system)                                                                                                                              
 ROWKEY        | VARCHAR(STRING)  (system)                                                                                                                              
 TRADEREF      | VARCHAR(STRING)                                                                                                                                        
 PI            | DECIMAL                                                                                                                                                
 TRADETYPE     | VARCHAR(STRING)                                                                                                                                        
 BUYSELL       | VARCHAR(STRING)                                                                                                                                        
 COUNTERPARTY  | VARCHAR(STRING)                                                                                                                                        
 PAIR          | VARCHAR(STRING)                                                                                                                                        
 TRADEDATETIME | BIGINT                                                                                                                                                 
 TRADEDATE     | INTEGER                                                                                                                                                
 NEARLEG       | STRUCT<SETTLEDATE INTEGER, BASEAMOUNT DECIMAL, QUOTEDAMOUNT DECIMAL, SPOTPRICE DECIMAL, TRADERPOINTS DECIMAL, SALESPOINTS DECIMAL, ALLINPRICE DECIMAL> 
 FARLEG        | STRUCT<SETTLEDATE INTEGER, BASEAMOUNT DECIMAL, QUOTEDAMOUNT DECIMAL, SPOTPRICE DECIMAL, TRADERPOINTS DECIMAL, SALESPOINTS DECIMAL, ALLINPRICE DECIMAL> 
-------------------------
For runtime statistics and query details run: DESCRIBE EXTENDED <Stream,Table>;

#4 Pi is working - it lives in the top-level of the message

ksql> SELECT PI FROM TRADES;
+---------------------------
|PI                                                                                                                                                                                                                                                        |
+----------------------------
|3.141                                                                                                                                                                                                                                                     |
^CQuery terminated

#5 Selecting NEARLEG shows corruption of DECIMAL values within that STRUCT

ksql> select NEARLEG from TRADES;
+----------------
|NEARLEG                                                                                                                                                                                                                                                   |
+-----------------
|{SETTLEDATE=18213, BASEAMOUNT=AlQL5AA=, QUOTEDAMOUNT=At/RwEA=, SPOTPRICE=SZT5oA==, TRADERPOINTS=AA==, SALESPOINTS=AA==, ALLINPRICE=SZT5oA==}                                                                                                              |
^CQuery terminated

#6 Unable to select a DECIMAL field from the STRUCT

ksql> SELECT NEARLEG -> ALLINPRICE FROM TRADES;
Field 'ALLINPRICE' cannot be resolved.

#7 Unable to select an INTEGER field from the STRUCT

ksql> SELECT NEARLEG->SETTLEDATE FROM TRADES;
Field 'SETTLEDATE' cannot be resolved.

#8 The AVRO message, being repeated back to me by a Consumer (SpringBoot). Just the TradeRef changes with each subsequent message. Scroll to the far right to see that the Decimals within NearLeg are all good.

2019-11-16 10:17:19.359  INFO 3600 --- [ntainer#1-0-C-1] u.c.s.aggregator.service.TradeConsumer   : ### -> Consumed message -> {"TradeRef": "10007", "pi": 3.141, "TradeType": "FXSPOT", "BuySell": "BUY", "Counterparty": "RBS", "pair": "EURUSD", "TradeDateTime": 2019-11-16T10:17:19.353Z, "TradeDate": 2019-11-11, "NearLeg": {"SettleDate": 2019-11-13, "BaseAmount": 1000000.0000, "QuotedAmount": 1234500.0000, "SpotPrice": 1.234500000, "TraderPoints": 0E-9, "SalesPoints": 0E-9, "AllInPrice": 1.234500000}, "FarLeg": null}
@robinroos robinroos added the bug label Nov 16, 2019
@robinroos
Copy link
Author

Attaching JSON Schema from Control Center, for Key and Value of topic Trade.

schema-trade-key-v1.txt
schema-trade-value-v1.txt

@robinroos
Copy link
Author

Attaching Avro IDL (fx.avdl). I trimmed this down to show just Trade (the Value record), CurrencyPairKey (the Key record), and necessary supporting types (BuySell, TradeType and FXLeg).

fx_avdl.txt

@robinroos
Copy link
Author

Docker-compose.yml, originally taken from examples/cp-all-in-one, and updated to use 5.4.0-beta1 Confluent Platform images.

docker-compose_yml.txt

@robinroos
Copy link
Author

I appreciate this is the KSQL repo, but I could not find an enterprise-control-center one. Can you tell me where to raise the following related issues?

#1 Control Center, when launched through the 5.4.0-beta1 image, still reports its version as 3.5.1.

#2 Control Center, which I believe to be 5.4.0-beta1, does not properly render DECIMAL types in the Messages pane for a Topic. This applies equally to decimals in the top-level of a message and decimals within structures.

image

image

image

@robinroos
Copy link
Author

The Docker Images I am using, 5.4.0-beta1 from Docker Hub, are 2 months old. If Confluent Inc could coordinate the publishing of -beta2 images to Docker Hub then I could easily re-test the above, both KSQL and Control Center.

@robinroos
Copy link
Author

I'd like to see the Control Center issue spun off to the appropriate forum in time for this to be addressed before final release of 5.4.0. Can we make that happen?

@apurvam
Copy link
Contributor

apurvam commented Dec 10, 2019

cc @agavra this seems to be a bug with decimals. Can you triage?

@apurvam apurvam added this to the 0.7.0 milestone Dec 10, 2019
@agavra
Copy link
Contributor

agavra commented Dec 11, 2019

Hello @robinroos - I tried to reproduce this locally and was only able to reproduce the problem partially (note the syntax that I'm using for struct creation is not yet committed):

ksql> CREATE STREAM decimals (col0 STRUCT<val DECIMAL(2,1)>) WITH (kafka_topic='decimals', partitions=1, value_format='AVRO');

 Message
----------------
 Stream created
----------------
ksql> INSERT INTO decimals (col0) VALUES ({VAL '2.1'});
ksql> SELECT COL0->val FROM DECIMALS EMIT CHANGES;
+----------------------------------------------------------------------------------------------+
|COL0__VAL                                                                                     |
+----------------------------------------------------------------------------------------------+
|2.1                                                                                           |
^CQuery terminated
ksql> SELECT COL0 FROM DECIMALS EMIT CHANGES;
+----------------------------------------------------------------------------------------------+
|COL0                                                                                          |
+----------------------------------------------------------------------------------------------+
|{VAL=FQ==}                                                                                    |

The first thing to note, is that there isn't corruption of the decimal data - that's just the way our CLI displays it (as byte strings) because that's the way they are stored in AVRO (a logical byte type). I agree that this isn't ideal (it's impossible to read!) but you can see in my example above that when it's a top-level field it displays fine, but when it's in a struct the CLI displays it as bytes.

Secondly, I'm not sure why you can't access the struct using ->. Since you can't select the int field in the NEARLEG struct either, I don't think this has anything to do with decimals. I'm still trying to figure out why this is coming up.

@agavra
Copy link
Contributor

agavra commented Dec 11, 2019

I've opened #4118 to track the display issue.

@robinroos
Copy link
Author

Thanks for #4118.

I suspect that your local changes (in support of DECIMAL within STRUCT) have inadvertently fixed the inability to access even non-decimal fields within a struct which does contain at least one decimal type.

If you wish I can specifically re-test (using -beta1) the selection of an int field from within a struct which DOES NOT contain any decimal types, if that might help. My preference, however, would be to rerun the full test on -beta2 images from Docker Hub as soon as your initial work is stable.

Any news regarding expected -beta2 availability? confluentinc/cp-docker-images#812

@agavra
Copy link
Contributor

agavra commented Dec 12, 2019

I think it's unlikely that my local changes affected struct access patterns (the code I wrote was to add syntax to create structs inline and is pretty isolated). Can you select from a struct that doesn't have a decimal in it?

w.r.t to -beta2 I don't have any timelines for the CP release, but we release ksqlDB docker images you can test against on a somewhat more regular basis. The most current image is ksqldb-server:0.6.0

@agavra agavra removed this from the 0.7.0 milestone Dec 17, 2019
@boxsterman
Copy link

I am facing a very similar - if not the same - issue. Running ksql 5.4.0 and using a toplevel record containing a DECIMAL value PI and a nested struct which also contains a DECIMAL value:

@namespace("ch.seibertec.example.decimaltypetest")
protocol DecimalTypeTest {
  record RecordWithOnlyDecimal {
    decimal(21,10) value;
  }
  record Toplevel {
    RecordWithOnlyDecimal value;
    decimal(21,10) pi;
  }
}

I can access and query both values, but only the top level one is displayed as decimal, the nested one is displayed as bytes. Here the nested decimal field has the value 1.1:

ksql> select * from TMP_test_dec_3 where VALUE->VALUE > 1 emit changes;
[ 1580106243484, "1", {
  "VALUE" : "Ao+mrgA="
}, 3.1400000000 ]

Is there an idea on when this issue could be addressed? Is there anything I could help with?

@agavra
Copy link
Contributor

agavra commented Jan 27, 2020

@boxsterman - the issue you are running into is fixed in master (see this issue: #4118)

@big-andy-coates
Copy link
Contributor

big-andy-coates commented Mar 5, 2020

@agavra / @robinroos is this issue now resolved?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants