This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Added support for decimal256 read/write in parquet #1412

Merged: 9 commits, Mar 3, 2023

TCeason
Contributor

@TCeason TCeason commented Feb 21, 2023

Link to #1411

@codecov

codecov bot commented Feb 21, 2023

Codecov Report

Patch coverage: 96.41% and project coverage change: +0.11 🎉

Comparison is base (d06323a) 83.63% compared to head (0ec7826) 83.75%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1412      +/-   ##
==========================================
+ Coverage   83.63%   83.75%   +0.11%     
==========================================
  Files         374      374              
  Lines       40603    40932     +329     
==========================================
+ Hits        33959    34283     +324     
- Misses       6644     6649       +5     
Impacted Files Coverage Δ
src/io/parquet/read/statistics/mod.rs 86.90% <70.00%> (-0.42%) ⬇️
src/io/parquet/read/indexes/mod.rs 80.93% <83.33%> (+0.21%) ⬆️
src/io/parquet/read/deserialize/simple.rs 84.50% <94.11%> (+1.47%) ⬆️
src/io/parquet/write/mod.rs 88.31% <97.22%> (+1.54%) ⬆️
src/io/parquet/read/indexes/fixed_len_binary.rs 100.00% <100.00%> (ø)
src/io/parquet/read/indexes/primitive.rs 37.76% <100.00%> (+4.24%) ⬆️
src/io/parquet/read/mod.rs 100.00% <100.00%> (ø)
src/io/parquet/read/statistics/fixlen.rs 100.00% <100.00%> (ø)
src/io/parquet/write/fixed_len_bytes.rs 100.00% <100.00%> (ø)
src/io/parquet/write/schema.rs 77.20% <100.00%> (+3.24%) ⬆️
... and 5 more


@sundy-li
Collaborator

Need tests to cover the read/write of Decimal256.

@TCeason
Contributor Author

TCeason commented Feb 28, 2023

I think I need some help.

I have three questions:

  1. Why are the row group's column chunks different?

With the number 9

in read, it's: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9]

in write, it's: [9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

  2. How should negative numbers like -256 and -1 be handled?

  3. Why are the chunks different when the statistics min/max are the same? In both cases the max is [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9]
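For reference on the byte-order and negative-number questions: the Parquet format stores a DECIMAL backed by FIXED_LEN_BYTE_ARRAY as the unscaled value in two's complement, big-endian byte order. So the layout with 9 in the last byte is the expected one, and negative values are padded with 0xFF rather than 0x00. Below is a minimal sketch of that encoding, assuming the value fits in an `i128`. The helper name `encode_be_fixed` is made up for illustration and is not part of arrow2.

```rust
/// Hypothetical helper (not an arrow2 API): encode `value` as a big-endian
/// two's-complement byte string that is exactly `width` bytes long.
fn encode_be_fixed(value: i128, width: usize) -> Vec<u8> {
    // This sketch only widens; it never truncates below 16 bytes.
    assert!(width >= 16, "width must be at least 16 bytes in this sketch");
    // Sign extension: negative values are padded with 0xFF, others with 0x00.
    let fill = if value < 0 { 0xFF } else { 0x00 };
    let mut out = vec![fill; width - 16];
    // `to_be_bytes` puts the most significant byte first.
    out.extend_from_slice(&value.to_be_bytes());
    out
}
```

With a width of 17, `encode_be_fixed(9, 17)` yields sixteen zero bytes followed by 9, and `encode_be_fixed(-256, 17)` yields sixteen 255 bytes followed by 0, which matches the `max_value` and `min_value` statistics in both test outputs.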

---- io::parquet::read::v2_decimal_39_required stdout ----
the meta is 
ColumnChunkMetaData {
	column_chunk: ColumnChunk { 
		file_path: None, 
		file_offset: 2171, 
		meta_data: Some(
			ColumnMetaData { 
				type_: Type(7), 
				encodings: [Encoding(0), Encoding(3)], 
				path_in_schema: ["decimal_39"], 
				codec: CompressionCodec(0), 
				num_values: 10, 
				total_uncompressed_size: 274, 
				total_compressed_size: 274, 
				key_value_metadata: None, 
				data_page_offset: 1897, 
				index_page_offset: None, 
				dictionary_page_offset: None, 
				statistics: Some(Statistics { max: Some([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9]), min: Some([255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0]), null_count: Some(0), distinct_count: None, max_value: Some([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9]), min_value: Some([255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0]) }), 
				encoding_stats: Some([PageEncodingStats { page_type: PageType(0), encoding: Encoding(0), count: 1 }]), 
				bloom_filter_offset: None }
			), 
		offset_index_offset: None, 
		offset_index_length: None, 
		column_index_offset: None, 
		column_index_length: None, 
		crypto_metadata: None, 
		encrypted_column_metadata: None 
	}, 
	column_descr: ColumnDescriptor { 
		descriptor: Descriptor {
			primitive_type: PrimitiveType { 
				field_info: FieldInfo { name: "decimal_39", repetition: Required, id: None }, logical_type: Some(Decimal(39, 0)), converted_type: Some(Decimal(39, 0)), physical_type: FixedLenByteArray(17) 
			}, 
			max_def_level: 0, 
			max_rep_level: 0 
		}, 
		path_in_schema: ["decimal_39"], 
		base_type: PrimitiveType(PrimitiveType { field_info: FieldInfo { name: "decimal_39", repetition: Required, id: None }, logical_type: Some(Decimal(39, 0)), converted_type: Some(Decimal(39, 0)), physical_type: FixedLenByteArray(17) }) 
	} 
},
the chunk is [21, 6, 21, 212, 2, 21, 212, 2, 92, 21, 20, 21, 0, 21, 20, 21, 0, 21, 0, 21, 0, 18, 28, 24, 17, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 24, 17, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 22, 0, 40, 17, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 24, 17, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 0, 0, 0, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9]
thread 'io::parquet::read::v2_decimal_39_required' panicked at 'assertion failed: `(left == right)`
  left: `Decimal256(39, 0)[-256.0, -1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]`,
 right: `Decimal256(39, 0)[255.0, -340282366920938463463374607431768211201.0, 680564733841876926926749214863536422912.0, 1020847100762815390390123822295304634368.0, 1361129467683753853853498429727072845824.0, 1701411834604692317316873037158841057280.0, 2041694201525630780780247644590609268736.0, 2381976568446569244243622252022377480192.0, 2722258935367507707706996859454145691648.0, 3062541302288446171170371466885913903104.0]`', tests/it/io/parquet/read.rs:54:5


---- io::parquet::write::decimal_39_required_v2 stdout ----
the meta is 
ColumnChunkMetaData {
	column_chunk: ColumnChunk { 
		file_path: None, 
		file_offset: 243, 
		meta_data: Some(
			ColumnMetaData { 
				type_: Type(7), 
				encodings: [Encoding(0), Encoding(3)], 
				path_in_schema: ["a1"], 
				codec: CompressionCodec(0), 
				num_values: 10, 
				total_uncompressed_size: 239, 
				total_compressed_size: 239, 
				key_value_metadata: None, 
				data_page_offset: 4, 
				index_page_offset: None, 
				dictionary_page_offset: None, 
				statistics: Some(Statistics { max: None, min: None, null_count: Some(0), distinct_count: None, max_value: Some([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9]), min_value: Some([255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0]) }), encoding_stats: None, bloom_filter_offset: None 
			}
		), 
		offset_index_offset: Some(358), 
		offset_index_length: Some(11), 
		column_index_offset: Some(309), 
		column_index_length: Some(49), 
		crypto_metadata: None, 
		encrypted_column_metadata: None 
	}, 
	column_descr: ColumnDescriptor { 
		descriptor: Descriptor { primitive_type: PrimitiveType { field_info: FieldInfo { name: "a1", repetition: Optional, id: None }, logical_type: None, converted_type: None, physical_type: FixedLenByteArray(17) }, max_def_level: 1, max_rep_level: 0 }, 
		path_in_schema: ["a1"], 
		base_type: PrimitiveType(PrimitiveType { field_info: FieldInfo { name: "a1", repetition: Optional, id: None }, logical_type: None, converted_type: None, physical_type: FixedLenByteArray(17) }) 
	} 
},
the chunk is [21, 6, 21, 218, 2, 21, 218, 2, 92, 21, 20, 21, 0, 21, 20, 21, 0, 21, 6, 21, 0, 18, 28, 54, 0, 40, 17, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 24, 17, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 0, 0, 0, 5, 255, 3, 0, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
thread 'io::parquet::write::decimal_39_required_v2' panicked at 'assertion failed: `(left == right)`
  left: `Decimal256(39, 0)[-256.0, -1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]`,
 right: `Decimal256(39, 0)[-340282366920938463463374607431768211456.0, -340282366920938463463374607431768211201.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]`', tests/it/io/parquet/write.rs:74:5

parquet-tools show fixtures/pyarrow3/v2/basic_required_10.parquet
+---------+-----------+----------+--------+----------------------------+----------+-------------+--------------+--------------+--------------+
|   int64 |   float64 | string   | bool   | timestamp_ms               |   uint32 |   decimal_9 |   decimal_18 |   decimal_26 |   decimal_39 |
|---------+-----------+----------+--------+----------------------------+----------+-------------+--------------+--------------+--------------|
|    -256 |         0 | Hello    | True   | 1970-01-01 00:00:00        |        0 |        -256 |         -256 |         -256 |         -256 |
|      -1 |         1 | bbb      | True   | 1970-01-01 00:00:00.001000 |        1 |          -1 |           -1 |           -1 |           -1 |
|       2 |         2 | aa       | False  | 1970-01-01 00:00:00.002000 |        2 |           2 |            2 |            2 |            2 |
|       3 |         3 |          | False  | 1970-01-01 00:00:00.003000 |        3 |           3 |            3 |            3 |            3 |
|       4 |         4 | bbb      | False  | 1970-01-01 00:00:00.004000 |        4 |           4 |            4 |            4 |            4 |
|       5 |         5 | abc      | True   | 1970-01-01 00:00:00.005000 |        5 |           5 |            5 |            5 |            5 |
|       6 |         6 | bbb      | True   | 1970-01-01 00:00:00.006000 |        6 |           6 |            6 |            6 |            6 |
|       7 |         7 | bbb      | True   | 1970-01-01 00:00:00.007000 |        7 |           7 |            7 |            7 |            7 |
|       8 |         8 | def      | True   | 1970-01-01 00:00:00.008000 |        8 |           8 |            8 |            8 |            8 |
|       9 |         9 | aaa      | True   | 1970-01-01 00:00:00.009000 |        9 |           9 |            9 |            9 |            9 |
+---------+-----------+----------+--------+----------------------------+----------+-------------+--------------+--------------+--------------+
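The mis-decoded values in the read test are consistent with bytes landing at the wrong offset: for example, 680564733841876926926749214863536422912 is exactly 2 × 2^128. The following is a minimal sketch of how a 17-byte big-endian value could be widened to 32 bytes with sign extension. Both function names are hypothetical and this is not the code from this PR.

```rust
/// Hypothetical helper: sign-extend a big-endian two's-complement byte string
/// (at most 32 bytes) to a 32-byte big-endian representation.
fn sign_extend_be_32(bytes: &[u8]) -> [u8; 32] {
    assert!(!bytes.is_empty() && bytes.len() <= 32);
    // The top bit of the first (most significant) byte is the sign bit.
    let fill = if bytes[0] & 0x80 != 0 { 0xFF } else { 0x00 };
    let mut out = [fill; 32];
    // Right-align the input so its least significant byte stays last.
    out[32 - bytes.len()..].copy_from_slice(bytes);
    out
}

/// Illustration only: read the low 16 bytes as an `i128`.
/// This is only meaningful when the value fits in 128 bits.
fn low_i128(wide: &[u8; 32]) -> i128 {
    let mut low = [0u8; 16];
    low.copy_from_slice(&wide[16..]);
    i128::from_be_bytes(low)
}
```

Feeding the fixture's minimum (sixteen 255 bytes followed by 0) through `sign_extend_be_32` and then `low_i128` gives -256, the value `parquet-tools` shows for `decimal_39`.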


@TCeason
Contributor Author

TCeason commented Mar 1, 2023

This is already fixed. It was caused by a bug in i256::to_be_bytes: the high and low halves were reversed in that function. Fixed in this PR.
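To illustrate the kind of bug described here, the sketch below uses a hypothetical `Wide256` type built from two 128-bit halves. It is not arrow2's `i256`, and the swapped variant is written deliberately wrong for comparison. In a big-endian encoding the high half must come first.

```rust
/// Hypothetical 256-bit signed integer split into two 128-bit halves.
/// This is an illustration, not arrow2's `i256`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Wide256 {
    high: i128, // most significant 128 bits (carries the sign)
    low: u128,  // least significant 128 bits
}

impl Wide256 {
    /// Widen an `i128` by sign-extending into the high half.
    fn from_i128(v: i128) -> Self {
        Wide256 { high: if v < 0 { -1 } else { 0 }, low: v as u128 }
    }

    /// Correct big-endian layout: high half first, then low half.
    fn to_be_bytes(self) -> [u8; 32] {
        let mut out = [0u8; 32];
        out[..16].copy_from_slice(&self.high.to_be_bytes());
        out[16..].copy_from_slice(&self.low.to_be_bytes());
        out
    }

    /// Deliberately wrong layout with the halves reversed, for comparison.
    fn to_be_bytes_swapped(self) -> [u8; 32] {
        let mut out = [0u8; 32];
        out[..16].copy_from_slice(&self.low.to_be_bytes());
        out[16..].copy_from_slice(&self.high.to_be_bytes());
        out
    }
}
```

For the value 9, the correct layout puts 9 in the last byte (index 31), while the swapped layout puts it at index 15, so a reader that expects big-endian bytes would see a different number.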

@TCeason TCeason force-pushed the ISSUE-1411 branch 2 times, most recently from 490cfce to b82d43a Compare March 1, 2023 13:15
@TCeason
Contributor Author

TCeason commented Mar 2, 2023

@sundy-li @jorgecarleitao This pr is ready for review now. THX

src/io/parquet/read/mod.rs (review thread, outdated and resolved)
@TCeason
Contributor Author

TCeason commented Mar 2, 2023

https://github.com/jorgecarleitao/arrow2/actions/runs/4310837007/jobs/7525280388

➤ YN0000: └ Completed in 12s 459ms
➤ YN0000: ┌ Post-resolution validation
➤ YN0028: │ The lockfile would have been modified by this install, which is explicitly forbidden.
➤ YN0000: └ Completed
➤ YN0000: Failed with errors in 12s 583ms
1
Error: docker-compose --file /home/runner/work/arrow2/arrow2/docker-compose.yml run --rm -e ARCHERY_INTEGRATION_WITH_RUST=1 conda-integration exited with a non-zero exit code 1, see the process log above.

@TCeason
Contributor Author

TCeason commented Mar 3, 2023

I think this failure has nothing to do with this pr. Is it possible to merge? @jorgecarleitao

https://github.com/jorgecarleitao/arrow2/actions/runs/4310837007/jobs/7525280388


@jorgecarleitao jorgecarleitao changed the title from "fead(parquet): add support decimal256 read/write in parquet" to "Added support for decimal256 read/write in parquet" Mar 3, 2023
@jorgecarleitao jorgecarleitao merged commit 9749aee into jorgecarleitao:main Mar 3, 2023
ritchie46 pushed a commit to ritchie46/arrow2 that referenced this pull request Mar 29, 2023
ritchie46 pushed a commit to ritchie46/arrow2 that referenced this pull request Apr 5, 2023