🐛 Source Snowflake: In timestamp cursor fields, seconds in cursor values are rounded to ceil and causing to sync rows whose cursor column values are smaller than cursor. #9915
Labels
area/connectors
Connector related issues
community
connectors/source/snowflake
type/bug
Something isn't working
Environment
Current Behavior
Even though there is no change in cursor value, incremental sync pulls up rows whose cursor columns values are smaller than cursor value.
Cursor value(timestamp) is getting rounded to ceil when last (milliseconds) field is more than 500.
Ex: Snowflake timestamp value: 2022-01-28 10:30:33.614 which got round to 2022-01-28T10:30:34Z by airbyte.
Expected Behavior
It should not sync rows whose value is smaller than cursor value.
Logs
Please check them in sequence after reading steps to reproduce.
logs-58-0.txt
logs-59-0.txt
logs-60-0.txt
logs-61-0.txt
logs-62-0.txt
Steps to Reproduce
My snowflake table is
CREATE OR REPLACE TABLE test_timestamp ( id int, request_timestamp_formatted TIMESTAMP_NTZ );
Set request_timestamp_formatted as cursor field.
Add 132 rows to the table.
insert into test_timestamp select * from (select seq2(0), current_timestamp() from table(generator(rowcount => 132))) order by seq2(0) limit 132;
Run Sync. It should sync 132 records.
Add 100 records.
insert into test_timestamp select * from (select seq2(0), current_timestamp() from table(generator(rowcount => 132))) order by seq2(0) limit 100;
It should sync 232 records as per the greater than equal to cursor value logic.
Add 3 more rows.
insert into test_timestamp select * from (select seq2(0), current_timestamp() from table(generator(rowcount => 132))) order by seq2(0) limit 3;
Run Sync. It should sync 103 records: 100 repeated + 3 new.
On all subsequent syncs, It pulls up 103 records every-time.
Ideally it should sync only 3 records where cursor value is equal to newly added 3 records but it keeps on syncing old 100 records as well whose cursor column value is less than cursor value.
![Screenshot 2022-01-31 at 4 17 33 PM](https://user-images.githubusercontent.com/87300701/151780607-70859b1c-4234-49b7-b3db-2a4628689b17.png)
For reference:
My snowflake db structure after these steps.
select count(*), request_timestamp_formatted from test_timestamp group by request_timestamp_formatted;
Slack Thread : https://airbytehq.slack.com/archives/C01MFR03D5W/p1643368624680109
Are you willing to submit a PR?
I am not fully verse with code base. I feel we would need to fix at jdbc level on which this source is built on. I could try to fix given some pointers on code change locations.
The text was updated successfully, but these errors were encountered: