[SDK-parquet] parquet default processor extractor step #601
rust/sdk-processor/src/steps/parquet_default_processor/parquet_default_extractor.rs
pub fn process_transactions(
    transactions: Vec<Transaction>,
) -> (
    Vec<MoveResource>,
    Vec<WriteSetChangeModel>,
    Vec<TransactionModel>,
    Vec<TableItem>,
    Vec<MoveModule>,
) {
    // this will be removed in the future.
    let mut transaction_version_to_struct_count = AHashMap::new();
    let (txns, _, write_set_changes, wsc_details) = TransactionModel::from_transactions(
        &transactions,
        &mut transaction_version_to_struct_count,
    );

    let mut move_modules = vec![];
    let mut move_resources = vec![];
    let mut table_items = vec![];

    for detail in wsc_details {
        match detail {
            WriteSetChangeDetail::Module(module) => {
                move_modules.push(module);
            },
            WriteSetChangeDetail::Resource(resource) => {
                move_resources.push(resource);
            },
            WriteSetChangeDetail::Table(item, _, _) => {
                table_items.push(item);
            },
        }
    }

    (
        move_resources,
        write_set_changes,
        txns,
        table_items,
        move_modules,
    )
}
Is it possible to reuse `process_transactions` from the Parquet default processor?
Or better yet, reuse this one from the Postgres default processor:
pub fn process_transactions(
Yeah, we can reuse `process_transactions` from the Parquet default processor!
Just sharing the context on why I initially decided not to: the new step wouldn't need the struct count map, and having new code in this file lets us simply remove the old code once everything is migrated. It already reuses some of the functions from the existing Parquet processor, though.
> Or better yet, reuse this one from the Postgres default processor:
I don't think we can reuse the current version of `process_transactions` from the Postgres default processor, since the two have diverged due to the Parquet migration. Keeping them as separate functions could simplify the code by avoiding conditionals for deprecated tables, making each version cleaner and more maintainable.
Alternatively, if there's significant overlap, we could refactor the shared logic into helper functions. What do you think: is it better to keep them separate or to refactor the shared portions?
Just to understand the context: which parts of the Postgres and Parquet versions diverged during the migration?
unresolving, also curious about the above ^ @yuunlimm
…out duplicating the parsing code
approving to unblock with some questions
        _parquet_table_items,
        move_modules,
    ),
    _transaction_version_to_struct_count,
curious what the convention is in general for var names with leading underscore?
It means the variables are intentionally unused; the leading underscore keeps the compiler from emitting an unused-variable warning for them.
^ +1
when would you do the leading underscore instead of just the underscore like in
let (txns, _, write_set_changes, wsc_details) = TransactionModel::from_transactions(
&transactions,
&mut transaction_version_to_struct_count,
);
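For context on the question above, here is a minimal sketch of the general Rust behavior, not code from this PR. A bare `_` is a pattern that binds nothing, so a temporary matched against it is dropped at the end of the statement. A name such as `_kept` is an ordinary variable that owns its value until the end of the scope; the underscore prefix only silences the unused-variable warning. The `Noisy` type and the `drop_order` function are hypothetical names introduced purely for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical helper type: records its label in a shared log when dropped.
struct Noisy {
    label: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.label);
    }
}

/// Returns the order of events observed when binding with `_` versus `_kept`.
pub fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        // `_` binds nothing, so this temporary is dropped as soon as the
        // statement ends.
        let _ = Noisy { label: "wildcard", log: Rc::clone(&log) };
        // `_kept` is a real binding: the value lives until the end of this
        // block, and the compiler does not warn that it is never read.
        let _kept = Noisy { label: "named", log: Rc::clone(&log) };
        log.borrow_mut().push("end of scope");
    }
    // Copy the log out in its own statement so the borrow ends before return.
    let events = log.borrow().clone();
    events
}
```

Calling `drop_order()` yields `["wildcard", "end of scope", "named"]`. In practice, `_` suits values that are never needed at all (as in the tuple destructuring quoted above), while `_name` suits values that must stay alive or that document what is being ignored.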
Looks good. Thanks for the detailed description.
lgtm with comments!
Description
This adds an extractor step for the Parquet default processor.
Changes involved:
- A `TableItemConvertible` trait, which is implemented for both the Parquet and Postgres `TableItem`.
- `TableItem::from_write_table_item` changed to return `table_items` only.

Context on why we added a new `RawTableItem` model:
tl;dr
This is to reduce the duplicated code as much as possible and reuse the common parsing logic. RawTableItem represents parsed transaction data. This struct will act as the unified output from the parsing function. Parquet TableItem and Postgres TableItem will implement a TableItemConvertible trait to convert RawTableItem into the appropriate format.
With this approach, parsing logic is centralized, and each processor (Postgres or Parquet) can convert RawTableItem to its respective TableItem type, keeping code DRY and maintainable while supporting format-specific needs.
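
The shape described above could look roughly like the following sketch. Only the names `RawTableItem` and `TableItemConvertible` come from this description; every field, the `from_raw` method, the `PostgresTableItem` and `ParquetTableItem` structs, and the `convert_all` helper are assumptions made for illustration and may not match the actual code in the PR.

```rust
/// Simplified stand-in for the parsed, format-agnostic model.
/// The fields are hypothetical; the real struct likely carries more data.
#[derive(Debug, Clone, PartialEq)]
pub struct RawTableItem {
    pub transaction_version: i64,
    pub table_handle: String,
    pub decoded_key: String,
    pub is_deleted: bool,
}

/// Each storage-specific model converts itself from the shared raw form.
/// The method name `from_raw` is an assumption.
pub trait TableItemConvertible {
    fn from_raw(raw: &RawTableItem) -> Self;
}

/// Hypothetical Postgres-side row.
#[derive(Debug, PartialEq)]
pub struct PostgresTableItem {
    pub transaction_version: i64,
    pub table_handle: String,
    pub decoded_key: String,
    pub is_deleted: bool,
}

/// Hypothetical Parquet-side row with a different column name.
#[derive(Debug, PartialEq)]
pub struct ParquetTableItem {
    pub txn_version: i64,
    pub table_handle: String,
    pub decoded_key: String,
    pub is_deleted: bool,
}

impl TableItemConvertible for PostgresTableItem {
    fn from_raw(raw: &RawTableItem) -> Self {
        Self {
            transaction_version: raw.transaction_version,
            table_handle: raw.table_handle.clone(),
            decoded_key: raw.decoded_key.clone(),
            is_deleted: raw.is_deleted,
        }
    }
}

impl TableItemConvertible for ParquetTableItem {
    fn from_raw(raw: &RawTableItem) -> Self {
        Self {
            txn_version: raw.transaction_version,
            table_handle: raw.table_handle.clone(),
            decoded_key: raw.decoded_key.clone(),
            is_deleted: raw.is_deleted,
        }
    }
}

/// Parse once, then convert to whichever format the caller asks for.
pub fn convert_all<T: TableItemConvertible>(raw: &[RawTableItem]) -> Vec<T> {
    raw.iter().map(T::from_raw).collect()
}
```

With this layout, a processor step would pick its output type at the call site, for example `let rows: Vec<ParquetTableItem> = convert_all(&raw_items);`, while the parsing that produces `raw_items` stays in one place.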