Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enable parquet to start processing from the specified starting version in the config #644

Merged
merged 1 commit into from
Dec 16, 2024

Conversation

yuunlimm
Copy link
Contributor

@yuunlimm yuunlimm commented Dec 12, 2024

Description

parquet processor was always starting from the min processed version or 0 if not all tables are processed. And we would like the consistency across all processors where it should start from the higher version.

Test Plan

Screenshot 2024-12-11 at 5 27 31 PM
Screenshot 2024-12-11 at 5 27 25 PM

Screenshot 2024-12-11 at 5 26 56 PM

Copy link
Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@yuunlimm yuunlimm marked this pull request as ready for review December 12, 2024 01:28

if let Some(min_processed_version) = min_processed_version {
Ok(std::cmp::max(
min_processed_version + 1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if min_processed_version = config_starting_version = 0, then this will return 1 so version 0 will never be processed. Removing the + 1 will fix this!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! will fix tjos

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pub async fn get_starting_version(
indexer_processor_config: &IndexerProcessorConfig,
conn_pool: ArcDbPool,
) -> Result<u64> {
// Check if there's a checkpoint in the appropriate processor status table.
let latest_processed_version =
get_starting_version_from_db(indexer_processor_config, conn_pool)
.await
.context("Failed to get latest processed version from DB")?;
// If nothing checkpointed, return the `starting_version` from the config, or 0 if not set.
Ok(latest_processed_version.unwrap_or(
indexer_processor_config
.transaction_stream_config
.starting_version
.unwrap_or(0),
))
}

hmm, in that case, would this also return 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  // return the higher of the checkpointed version + 1 and `starting_version`.
    Ok(status.map(|status| {
        std::cmp::max(
            status.last_success_version as u64 + 1,
            indexer_processor_config
                .transaction_stream_config
                .starting_version
                .unwrap_or(0),
        )
    }))
    ```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah lol do you mind changing that too please

@yuunlimm yuunlimm requested a review from dermanyang December 13, 2024 21:41
@yuunlimm yuunlimm force-pushed the 12-11-parquet_starting_version branch from 3dfd2f6 to 7ebce69 Compare December 16, 2024 22:38
Copy link

codecov bot commented Dec 16, 2024

Codecov Report

Attention: Patch coverage is 9.09091% with 10 lines in your changes missing coverage. Please review.

Project coverage is 45.2%. Comparing base (22f92fc) to head (7ebce69).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
rust/sdk-processor/src/utils/starting_version.rs 9.0% 10 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main    #644     +/-   ##
=======================================
- Coverage   47.7%   45.2%   -2.6%     
=======================================
  Files        228     229      +1     
  Lines      27205   27016    -189     
=======================================
- Hits       12983   12216    -767     
- Misses     14222   14800    +578     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@yuunlimm yuunlimm merged commit c094090 into main Dec 16, 2024
11 checks passed
Copy link
Contributor Author

Merge activity

  • Dec 16, 5:52 PM EST: A user merged this pull request with Graphite.

@yuunlimm yuunlimm deleted the 12-11-parquet_starting_version branch December 16, 2024 22:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants