Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move file handling logic from RecordItemParser to RecordFileParser #585

Merged
merged 5 commits into from
Mar 7, 2020

Conversation

apeksharma
Copy link
Contributor

@apeksharma apeksharma commented Mar 5, 2020

Detailed description:

  • Move connection ownership to PostgresWritingRecordParsedItemHandler
    • RecordItemParser.start()+ RecordItemParser.initFile --> onStart()
    • RecordItemParser.finish + RecordItemParser.completeFile --> onEnd()
  • RecordItemParser is agnostic of 'file' and database now (except entity repo, to be fixed later)
  • RecordFileParser is agnostic of database now (except for application state - applicationStatusRepository)
  • Change RecordFileParserTest to use mocks rather than transactionRepository or recordFileRepository.
  • Adds two interfaces: StreamFileListener and RecordStreamFileListener
  • Dependencies:
    • RecordFileParser ---depends on --> RecordStreamFileListener and RecordItemListener
    • RecordItemParser --depends on--> RecordParsedItemHandler
  • Deprecated file id column in t_record_files. To be removed soon.
  • Remove checks for recordFileRepository from RecordItemParser tests

Signed-off-by: Apekshit Sharma [email protected]

Which issue(s) this PR fixes:
Partially addresses #566

Special notes for your reviewer:

Checklist

  • Documentation added
  • Tests updated

- Move connection ownership to RecordFileParser
  - initConnection: earlier RecordItemParser.start()
  - closeConnection: earlier RecordItemParser.finish()
- Move db file handling to RecordFileParser
  - initFile
  - closeFileAndCommit: eariler RecordItemParser.completeFile()
  - Move file testing from RecordItemParserTest to RecordFileParserTest
- RecordItemParser is completely agnostic of 'file' concept now.
- Remove checks for recordFileRepository from RecordItemParser tests

Followup: Remove fileId in t_transactions.

Signed-off-by: Apekshit Sharma <[email protected]>
@apeksharma apeksharma self-assigned this Mar 5, 2020
@apeksharma apeksharma added P2 parser Area: File parsing labels Mar 5, 2020
@apeksharma apeksharma added this to the Mirror 0.7.0 milestone Mar 5, 2020
@codecov
Copy link

codecov bot commented Mar 5, 2020

Codecov Report

Merging #585 into master will increase coverage by 0.14%.
The diff coverage is 76.47%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #585      +/-   ##
============================================
+ Coverage     64.31%   64.45%   +0.14%     
  Complexity      171      171              
============================================
  Files           101      101              
  Lines          3167     3157      -10     
  Branches        366      363       -3     
============================================
- Hits           2037     2035       -2     
+ Misses          976      971       -5     
+ Partials        154      151       -3
Impacted Files Coverage Δ Complexity Δ
...com/hedera/mirror/importer/domain/Transaction.java 82.6% <ø> (ø) 0 <0> (ø) ⬇️
...importer/parser/record/RecordParserProperties.java 92.85% <ø> (-0.48%) 0 <0> (ø)
.../com/hedera/mirror/importer/domain/RecordFile.java 66.66% <100%> (+9.52%) 0 <0> (ø) ⬇️
...porter/parser/record/PostgresWriterProperties.java 66.66% <100%> (+16.66%) 0 <0> (ø) ⬇️
...irror/importer/parser/record/RecordItemParser.java 83.89% <100%> (+1.3%) 0 <0> (ø) ⬇️
...irror/importer/parser/record/RecordFileParser.java 73.05% <72.5%> (+5.79%) 0 <0> (ø) ⬇️
...record/PostgresWritingRecordParsedItemHandler.java 82.56% <74%> (-2.87%) 0 <0> (ø)
...hedera/mirror/importer/util/DatabaseUtilities.java 35.71% <0%> (-21.43%) 0% <0%> (ø)
...r/grpc/retriever/PollingTopicMessageRetriever.java 96% <0%> (-2%) 12% <0%> (ø)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ac0f386...57da8b3. Read the comment docs.

Copy link
Member

@steven-sheehy steven-sheehy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree with the direction of this PR. The RecordFileParser should not have knowledge of database connections, rolling back or postgres specific parsed item handlers.

Recommend moving connection management to PostgresWritingRecordParsedItemHandler and be opened once per file. It can even be opened once per batch due to the wonders of connection pools. It's fine if RecordFileParser needs to open a separate connection for updating t_record_files. Recommend interface segregation principle be used to split file specific logic out from ParsedItemHandler and into new interface (exact name or parameters can be debated):

public interface StreamFileListener {
  void onStart(StreamFileData);
  void onEnd(StreamFileData);
}

PostgresWritingRecordParsedItemHandler should be adapted to implement StreamFileListener directly (not via ParsedItemHandler) only because he manually manages connections and needs to flush onEnd. Other implementations of ParsedItemHandler like message queues would not need to implement it as they don't batch.

RecordFileParser should only have knowledge of StreamFileListener and RecordItemParser should only have knowledge of ParsedItemHandler. This would provide a much cleaner separation of layers while still satisfying your concerns.

@apeksharma
Copy link
Contributor Author

I absolutely agree, don't like the postgres stuff in two places myself.
Will use suggestions above to clean it up.
Please ignore ugliness around fileId in current PR since it's going to be removed in followup.

@apeksharma
Copy link
Contributor Author

last commit was early preview of changes, please take a look.
Am working on fixing tests in meantime.

Signed-off-by: Apekshit Sharma <[email protected]>
Copy link
Contributor Author

@apeksharma apeksharma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

improvement suggestions would be great! There's still so much here (and in general, always), that can be made better.
But if may request, since this PR has grown big and iterations on this will be costly, let's do improvements in followup. There's lot of good stuff here already. :)

Signed-off-by: Apekshit Sharma <[email protected]>
Signed-off-by: Apekshit Sharma <[email protected]>
@apeksharma apeksharma merged commit c92c525 into master Mar 7, 2020
@apeksharma apeksharma deleted the parser_132 branch March 7, 2020 17:28
@steven-sheehy steven-sheehy added the enhancement Type: New feature label Mar 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Type: New feature P2 parser Area: File parsing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants