🐛 Source Snowflake: In timestamp cursor fields, seconds in cursor values are rounded to ceil and causing to sync rows whose cursor column values are smaller than cursor. #9915

Closed
ameyabapat-bsft opened this issue Jan 31, 2022 · 4 comments · Fixed by #10242
Labels
area/connectors Connector related issues community connectors/source/snowflake type/bug Something isn't working

Comments

ameyabapat-bsft commented Jan 31, 2022

Environment

  • Airbyte version: 0.35.5-alpha
  • OS Version / Instance: AWS EC2
  • Deployment: Docker
  • Source Connector and version: snowflake - 0.1.5
  • Destination Connector and version: S3 - 0.2.5
  • Severity: Medium
  • Step where error happened: Sync job

Current Behavior

Even though the cursor value has not changed, incremental sync pulls rows whose cursor column values are smaller than the cursor value.
The cursor value (a timestamp) is rounded up to the next second when the milliseconds component is greater than 500.
Example: the Snowflake timestamp value 2022-01-28 10:30:33.614 was rounded to 2022-01-28T10:30:34Z by Airbyte.
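
To make the arithmetic of the example concrete, here is a minimal Python sketch of round-to-nearest-second versus keeping the milliseconds. It is only an illustration: the helper names `round_to_nearest_second` and `format_cursor` are hypothetical, and this is not the connector's actual code, which is not shown in this thread.

```python
from datetime import datetime, timedelta


def round_to_nearest_second(ts: datetime) -> datetime:
    """Round half-up to whole seconds (hypothetical helper mimicking the reported rounding)."""
    truncated = ts.replace(microsecond=0)
    if ts.microsecond >= 500_000:
        truncated += timedelta(seconds=1)
    return truncated


def format_cursor(ts: datetime) -> str:
    """Format with second precision, like the cursor value quoted in the report."""
    return ts.strftime("%Y-%m-%dT%H:%M:%SZ")


# Timestamp taken from the report: 2022-01-28 10:30:33.614
source_value = datetime(2022, 1, 28, 10, 30, 33, 614000)

rounded = round_to_nearest_second(source_value)
print(format_cursor(rounded))                         # 2022-01-28T10:30:34Z
print(rounded > source_value)                         # True: the rounded value is later than the row's value
print(source_value.isoformat(timespec="milliseconds"))  # 2022-01-28T10:30:33.614
```

The rounded value no longer equals any value stored in the cursor column, whereas a representation that keeps the milliseconds preserves it exactly.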

Expected Behavior

It should not sync rows whose cursor column value is smaller than the cursor value.

Logs

Please check them in sequence after reading the steps to reproduce.
logs-58-0.txt
logs-59-0.txt
logs-60-0.txt
logs-61-0.txt
logs-62-0.txt

Steps to Reproduce

  1. Create a Snowflake -> AWS S3 incremental connection with manual sync frequency.
    My Snowflake table is
    CREATE OR REPLACE TABLE test_timestamp ( id int, request_timestamp_formatted TIMESTAMP_NTZ );
    Set request_timestamp_formatted as the cursor field.

Screenshot 2022-01-28 at 4 50 44 PM

  2. Add 132 rows to the table.
    insert into test_timestamp select * from (select seq2(0), current_timestamp() from table(generator(rowcount => 132))) order by seq2(0) limit 132;

  3. Run Sync. It should sync 132 records.

  4. Add 100 records.
    insert into test_timestamp select * from (select seq2(0), current_timestamp() from table(generator(rowcount => 132))) order by seq2(0) limit 100;

  5. Run Sync. It should sync 232 records, as per the greater-than-or-equal-to cursor comparison.

  6. Add 3 more rows.
    insert into test_timestamp select * from (select seq2(0), current_timestamp() from table(generator(rowcount => 132))) order by seq2(0) limit 3;

  7. Run Sync. It should sync 103 records: 100 repeated + 3 new.

  8. On all subsequent syncs, it pulls 103 records every time.

Ideally, each subsequent sync should pull only the 3 newly added records, whose cursor column value equals the cursor value. Instead, it keeps syncing the old 100 records as well, even though their cursor column value is less than the cursor value.
For reference, this is the content of my Snowflake table after these steps:
select count(*), request_timestamp_formatted from test_timestamp group by request_timestamp_formatted;
Screenshot 2022-01-31 at 4 17 33 PM
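
The record counts expected in the steps above can be reproduced with a small in-memory simulation of an incremental sync that selects rows with `cursor_field >= cursor`. This is a sketch under stated assumptions: `incremental_sync` and `insert_batch` are hypothetical helpers, the three batch timestamps are made up (each `insert` statement is assumed to stamp all of its rows with one timestamp), and the cursor is stored exactly, without any rounding.

```python
from datetime import datetime


def incremental_sync(rows, cursor):
    """Select rows with cursor_field >= cursor and return (rows, new cursor)."""
    selected = [r for r in rows if cursor is None or r[1] >= cursor]
    new_cursor = max((r[1] for r in selected), default=cursor)
    return selected, new_cursor


table = []   # rows are (id, request_timestamp_formatted)
cursor = None
counts = []  # number of records pulled by each sync


def insert_batch(n, ts):
    """Hypothetical stand-in for the insert statements in the steps."""
    table.extend((i, ts) for i in range(n))


# Made-up timestamps, one per insert statement (assumption).
t_a = datetime(2022, 1, 28, 10, 30, 33, 614000)
t_b = datetime(2022, 1, 28, 10, 45, 12, 702000)
t_c = datetime(2022, 1, 28, 11, 2, 5, 893000)

for batch_size, ts in [(132, t_a), (100, t_b), (3, t_c)]:
    insert_batch(batch_size, ts)
    selected, cursor = incremental_sync(table, cursor)
    counts.append(len(selected))

# One more sync with no new rows.
selected, cursor = incremental_sync(table, cursor)
counts.append(len(selected))

print(counts)  # [132, 232, 103, 3]
```

With an exact cursor, the final sync pulls only the 3 newest rows. In the behavior reported here, that sync and every later one pull 103 records instead.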

Slack Thread : https://airbytehq.slack.com/archives/C01MFR03D5W/p1643368624680109

Are you willing to submit a PR?

I am not fully versed in the codebase. I think this needs to be fixed at the JDBC level, on which this source is built. I could try to fix it, given some pointers to the relevant code locations.

@ameyabapat-bsft ameyabapat-bsft added needs-triage type/bug Something isn't working labels Jan 31, 2022
@alafanechere alafanechere changed the title In timestamp cursor fields, Seconds in cursor values are rounded to ceil and causing to sync rows whose cursor column values are smaller than cursor. 🐛 Source Snowflake: In timestamp cursor fields, seconds in cursor values are rounded to ceil and causing to sync rows whose cursor column values are smaller than cursor. Feb 1, 2022
@alafanechere alafanechere moved this to Backlog in GL Roadmap Feb 1, 2022
@alafanechere
Contributor

@irynakruk could you please try to reproduce this issue?

@alafanechere
Contributor

I do confirm it looks closely related to this epic: #8904

@alexandr-shegeda
Contributor

will be fixed in the scope of this PR

@alexandr-shegeda alexandr-shegeda linked a pull request Feb 15, 2022 that will close this issue
@ameyabapat-bsft
Author

will be fixed in the scope of this PR

Thanks @alexandr-shegeda
