Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Fivetran UI data differs from Fivetran Log data #63

Closed
2 of 4 tasks
JadAbdallah opened this issue Oct 2, 2022 · 32 comments
Closed
2 of 4 tasks

[Bug] Fivetran UI data differs from Fivetran Log data #63

JadAbdallah opened this issue Oct 2, 2022 · 32 comments
Assignees
Labels
priority:p2 Affects most users; fix needed status:in_review Currently in review type:bug Something is broken or incorrect update_type:models Primary focus requires model updates

Comments

@JadAbdallah
Copy link

Is there an existing issue for this?

  • I have searched the existing issues

Describe the issue

I am seeing discrepancies between some of the usage metrics in the Fivetran UI versus the Logs from the dbt package. This includes dollars spent and MARs. The amount in Fivetran is always slightly higher. The plan rate ($/M) is slightly higher in the dbt logs as well. I'm assuming some of these fields are a result of a mathematical formula, so perhas one of the values is off.

I am querying the data from fivetran_log__usage_mar_destination_history and am on the latest version of the package (0.6.3).
Note however that MARs are also off when querying from fivetran_log__mar_table_history

Relevant error log or model output

No response

Expected behavior

Expect values to be identical for same period

dbt Project configurations

/

Package versions

0.6.3

What database are you using dbt with?

snowflake

dbt Version

1.2.1

Additional Context

No response

Are you willing to open a PR to help address this issue?

  • Yes.
  • Yes, but I will need assistance and will schedule time during our office hours for guidance
  • No.
@JadAbdallah JadAbdallah added the bug label Oct 2, 2022
@fivetran-joemarkiewicz
Copy link
Contributor

Hi @JadAbdallah thanks so much for opening this issue and sorry to hear your totals are off between the package totals and the UI.

Are the totals consistently off, or is this only happening for the current month?

@JadAbdallah
Copy link
Author

@fivetran-joemarkiewicz It's consistent!

@fivetran-joemarkiewicz
Copy link
Contributor

Just curious, do these totals seem to be consistently higher or lower than what is listed on the Fivetran UI?

@JadAbdallah
Copy link
Author

Consistently lower, by around 100 units (for dollars spent). I looked at whether it was a a consistent extra X, due to perhaps an additional parameter in the calculation, but it fluctuated

@fivetran-joemarkiewicz
Copy link
Contributor

@JadAbdallah thanks for these additional details. I have actually been looking into a new table that was rolled out to our Fivetran Log package which may be the reason you are seeing some issue with the end models. A new table incremental_mar has been added to the connector and looks to be replacing the active_volume table (which our package currently uses in the downstream models that calculate your spend).

As such, I have opened Issue #65 to look into this further. I will look into this some more the rest of this week (and into next) and hope to have something for you to try and that could fix this issue.

Before I dig deeper into this new table, can you confirm if you have the incremental_mar table in your Fivetran Log source schema?

@JadAbdallah
Copy link
Author

I dont' see it actually, but do see the active_volume table.

@fivetran-joemarkiewicz
Copy link
Contributor

Hmmm 🤔

Do you know if you have any free MAR or credits? I know the incremental_mar table is intended to help display the paid vs. free MAR. This may be why the active_volume calculation is a bit off on your end. Once the incremenal_mar table is loaded we could then try and switch to using that source.

@JadAbdallah
Copy link
Author

Do you know why I don't have the incremental_mar table? I would assume that if anything, MAR would be higher on my end as I cannot differentiate free from paid. I would expect the Fivetran UI to display billed amount correctly. That amount is higher than what I am seeing from the package.

@fivetran-joemarkiewicz
Copy link
Contributor

I know this incremental_mar table was just recently rolled out. It may not have been rolled out to your account yet.

Would you be open to sharing a screenshot of the usage the Fivetran UI is showing for a period, and the relevant output of the dbt package? If you are uncomfortable sharing in an open GitHub Issue, you can start a private support ticket where we chan discuss off a public forum if you desired.

@JadAbdallah
Copy link
Author

Dbt package:
Screen Shot 2022-10-24 at 6 07 47 PM

Fivetran UI:
Screen Shot 2022-10-24 at 6 08 48 PM

@fivetran-joemarkiewicz
Copy link
Contributor

Thanks for sharing this @JadAbdallah. I have tried digging into the logic on my end and am having some trouble identifying an issue.

I think this may be easier and more efficient to solve directly on a call. Would you be open to scheduling some time with my team and I using our Calendly link? This way we may debug the variance live on a call.

@JadAbdallah
Copy link
Author

Done!

@fivetran-sheringuyen fivetran-sheringuyen added type:bug Something is broken or incorrect and removed bug labels Dec 6, 2022
@pkanter
Copy link

pkanter commented Feb 6, 2023

Following up on this issue. I'm having a similar problem, but with a much larger variance. Our MAR for November was upwards of 12MM, yet the FIVETRAN_LOG__USAGE_MAR_DESTINATION_HISTORY table is showing approximately 600K mar. I can open a new issue, if necessary.

@fivetran-joemarkiewicz
Copy link
Contributor

Hi @pkanter happy to keep the conversation here.

Would you be able to share which version of the package you are using?

@pkanter
Copy link

pkanter commented Feb 6, 2023 via email

@fivetran-joemarkiewicz
Copy link
Contributor

Thanks @pkanter. We released a few updates in 0.7.0 that added some more nuance to the MAR calculation (specifically free vs paid MAR).

What field in the package are you using to calculate your MAR? I wonder if you may only be using the paid MAR, but the sum of paid and free may equate to the 12MM you are seeing?

@pkanter
Copy link

pkanter commented Feb 8, 2023

@fivetran-joemarkiewicz , I'm taking the number directly from the Usage MAR Destnation History table in the fivetran_log package. The dollars are correct, but the rows are off considerably. The sum of free and paid don't even come to ten percent of the 12MM, at 94k and 606k respectively.

@fivetran-joemarkiewicz
Copy link
Contributor

The dollars are correct, but the rows are off considerably.

This statement jumps out the most to me 🤔. I would assume that if the MAR is off then the dollars spent would be as well. I think it may make the most sense if you schedule some time with my team to investigate this live on a call. You can book time here. In the meantime, I will continue to investigate.

@pkanter
Copy link

pkanter commented Feb 8, 2023 via email

@pkanter
Copy link

pkanter commented Feb 8, 2023

One thought...the INCREMENTAL_MAR table is only a staging table right now, no final production table is derived from it. The numbers are correct there.

@fivetran-joemarkiewicz
Copy link
Contributor

Thanks for adding that note. The latest version of the package should be leveraging the incremental_mar table through to the final model. See here for how the staging model should flow through to the final usage_mar_destination_history model.

I know you mentioned you are using v0.7.0, but are there any other configs in your dbt_project.yml that may be causing it to not flow to the final downstream models?

@pkanter
Copy link

pkanter commented Feb 8, 2023

The only vars related to the fivetran_log package are "using_transformations" and "using_triggers" which are both false.

@fivetran-joemarkiewicz
Copy link
Contributor

Are there any model configs related to the fivetran_log package?

@pkanter
Copy link

pkanter commented Feb 8, 2023

No, forgot to mention that.

@fivetran-joemarkiewicz
Copy link
Contributor

Thanks for the additional details. I will continue to look into this. I also appreciate you setting up some time with us for this week. We can go into more debugging then!

@fivetran-joemarkiewicz
Copy link
Contributor

Thanks for meeting with our team @pkanter!

Recapping our meeting, we discovered that this line in the fivetran_log__mar_table_history model is erroneously filtering out records from the incremental_mar staging model.

This line was required with the active_volume staging model. However, since that field was deprecated and replaced with incremental_mar, we no longer need to account for the cumulative totals. Instead, we need to include the previous records to accurately total the mar per connector.

I will soon open a PR to address this issue.

@pkanter
Copy link

pkanter commented Feb 9, 2023 via email

@fivetran-joemarkiewicz fivetran-joemarkiewicz added priority:p2 Affects most users; fix needed status:in_review Currently in review update_type:models Primary focus requires model updates labels Feb 15, 2023
@fivetran-joemarkiewicz
Copy link
Contributor

Hi all,

I wanted to post back to let you know that the changes were merged and a new v0.7.1 release has been cut! You should see this live at the top of the hour.

As the issue has been addressed in the latest release, I will close this issue.

@pkanter
Copy link

pkanter commented Feb 15, 2023 via email

@pkanter
Copy link

pkanter commented Feb 16, 2023

@fivetran-joemarkiewicz , I've updated to the 0.7.1 version of the package. when I ran dbt build, a unique_combination_of_columns test fails. The full message is below...

14:36:47  Failure in test dbt_utils_unique_combination_of_columns_fivetran_log__mar_table_history_connector_id__destination_id__table_name__measured_month (models\fivetran_log.yml)
14:36:47    Got 1310 results, configured to fail if != 0
14:36:47
14:36:47    compiled Code at target\compiled\fivetran_log\models\fivetran_log.yml\dbt_utils_unique_combination_o_9247891a6fc9fc9b6aa8554805786898.sql
14:36:47

@fivetran-joemarkiewicz
Copy link
Contributor

Thanks so much for sharing @pkanter. I am unsure how this test failure did not arise in our development but it looks like another eagle eyed user caught this as well and raised it within Issue #73 and proposed a fix within PR #74.

We are working to address these in a timely manner. I hope to post a test branch in the issue as soon as possible. It would be great if you were open to following there and ensuring you see success before rolling out an 0.7.2 patch.

@fivetran-joemarkiewicz
Copy link
Contributor

@pkanter cross posting this comment from the other thread if you are available to test before we roll out the patch -
#73 (comment). If you have a moment, would you be able to test this branch and verify if the test failures are resolved, and your end totals are still accurate.

packages:
  - git: https://github.com/simon-stepper/dbt_fivetran_log.git
    revision: 73-duplicated-months
    warn-unpinned: false 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority:p2 Affects most users; fix needed status:in_review Currently in review type:bug Something is broken or incorrect update_type:models Primary focus requires model updates
Projects
None yet
Development

No branches or pull requests

4 participants