Return scan results in dataframe fixes #95 #99
* Use version range for Soda spark dependency
* Rewrite spark version range
* Add port and host attributes
Add example test for converting test results to a data frame
I have redefined the tests and added below methods:
TODO:
Awesome! Great work, this is the approach I would expect. I have some remarks about the to_dict
and the `select
@JCZuurmond Please review
Very nicely done! I have no major remarks; I added a couple of minor comments to keep the style of the code more consistent. Could you mark the PR as non-draft after you have finished resolving the comments?
I'll have one more look at it tomorrow when I have more time. I think we should be able to merge it then.
src/sodaspark/scan.py
    return out


def testresults_to_data_frame(testresults: list[TestResult]) -> DataFrame:
Could you make `testresults` two words, `test_results`, in both the function name and the parameter?
One more important thing! Could you add your change to the change log? You deserve the credit for this! You can add your change to the top bullet list with a reference to this PR. And, you can add a
…lkarni87/soda-spark into scan-results-in-dataframe
@JCZuurmond Updated and rebased the branch. Also had to update the schema for
* Return scan results in dataframe
* Update TestResults schema
* Add test for converting test results to a data frame
* Added Tests for Dataframes
* Add optional flag as_frame to execute
* Fix schemas for measurements and scan_error
* Update schema for Test

Co-authored-by: Cor <[email protected]>
- Provides the ability to get the scan results in Dataframes. ([#99](#99))
  - measurements
  - test_results
  - scan_errors
- Use version range for `soda-spark-sql` dependency
- Add `host` and `port` attributes to `_SparkDialect`

Contributors:

- [Anil Kulkarni](https://github.com/anilkulkarni87) ([#99](#99))
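
As a rough illustration of the first change log item, the sketch below flattens a list of result objects into rows with a fixed column order, which is the shape a DataFrame constructor expects. `StubResult`, its fields, and `results_to_rows` are hypothetical stand-ins and not the soda-spark API; the sketch uses only the standard library so it runs without a Spark session.

```python
from dataclasses import asdict, dataclass, fields
from typing import Any, List, Optional, Tuple


@dataclass
class StubResult:
    """Hypothetical stand-in for a scan test result; the fields are assumptions."""

    test_name: str
    passed: bool
    column: Optional[str] = None


def results_to_rows(
    results: List[StubResult],
) -> Tuple[List[str], List[Tuple[Any, ...]]]:
    """Return (column names, rows), with every row in the same column order."""
    columns = [f.name for f in fields(StubResult)]
    rows = []
    for result in results:
        record = asdict(result)
        rows.append(tuple(record[name] for name in columns))
    return columns, rows


columns, rows = results_to_rows(
    [
        StubResult("row_count > 0", True),
        StubResult("missing_count == 0", False, column="id"),
    ]
)
```

With a Spark session available, the rows could then be passed to PySpark's `SparkSession.createDataFrame(rows, columns)`. An explicit schema is likely the safer choice when a column can hold only `None` values, since type inference has nothing to work with in that case.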
This is still a work in progress. The required functions have been added to `scan.py`.
Changes made:
The user has the option to use any of these methods, depending on the use case.
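
One way to picture that choice is an opt-in flag, in the spirit of the `as_frame` flag named in the commit list. The sketch below is an assumption-heavy illustration: `StubScanResult`, `to_table`, and `execute_stub` are hypothetical names, and `to_table` stands in for a real DataFrame constructor so the example runs with the standard library alone.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

Table = Tuple[List[str], List[Tuple[Any, ...]]]


@dataclass
class StubScanResult:
    """Hypothetical container; attribute names follow the change log entry."""

    measurements: List[Dict[str, Any]] = field(default_factory=list)
    test_results: List[Dict[str, Any]] = field(default_factory=list)
    scan_errors: List[Dict[str, Any]] = field(default_factory=list)


def to_table(records: List[Dict[str, Any]]) -> Table:
    """Stand-in for a DataFrame constructor: sorted column names plus row tuples."""
    columns = sorted({key for record in records for key in record})
    rows = [tuple(record.get(name) for name in columns) for record in records]
    return columns, rows


def execute_stub(result: StubScanResult, as_frame: bool = False):
    """Return the raw result by default, or three tables when as_frame is set."""
    if not as_frame:
        return result
    return (
        to_table(result.measurements),
        to_table(result.test_results),
        to_table(result.scan_errors),
    )


scan_result = StubScanResult(
    test_results=[{"test": "row_count > 0", "passed": True}]
)
raw = execute_stub(scan_result)
frames = execute_stub(scan_result, as_frame=True)
```

Keeping the flag off by default means existing callers still receive the raw result object, while callers who want tabular output opt in explicitly.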
TODO: