Return scan results in dataframe fixes #95 #99

anilkulkarni87 · 2021-12-13T09:15:56Z

This is still a work in progress. The required functions have been added to scan.py.

Changes made:

measurements_to_data_frame
testresults_to_data_frame
scanerror_to_data_frame
convert_scan_result_to_spark_data_frames --> Uses the functions above to return a tuple of DataFrames.

Users can call any of these functions directly, depending on the use case.
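To illustrate the idea behind the conversion functions listed above, here is a minimal sketch of the flattening step that would come before building a data frame. It is not the code from this PR: the `Measurement` class, its fields, `MEASUREMENT_COLUMNS`, and `measurements_to_rows` are all hypothetical stand-ins, and the Spark call is only mentioned in a comment so the sketch runs without a Spark session.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class Measurement:
    """Hypothetical stand-in for a Soda measurement; not the real class."""

    metric: str
    column_name: Optional[str]
    value: Any


# Assumed column order; a Spark schema for these rows would follow it.
MEASUREMENT_COLUMNS = ("metric", "column_name", "value")


def measurements_to_rows(measurements: list[Measurement]) -> list[tuple]:
    """Flatten measurement objects into tuples with a fixed column order."""
    rows = []
    for measurement in measurements:
        as_dict = asdict(measurement)
        rows.append(tuple(as_dict[column] for column in MEASUREMENT_COLUMNS))
    return rows


measurements = [
    Measurement(metric="row_count", column_name=None, value=5),
    Measurement(metric="max", column_name="age", value=42),
]
rows = measurements_to_rows(measurements)
# With an active SparkSession, rows like these could be passed to
# spark.createDataFrame(rows, schema) together with a matching schema.
```

Keeping the column order in one place makes it easier to keep the rows and the schema in sync, which is presumably why the schemas needed fixing later in this PR.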

TODO:

Tests for the new methods.
Add as_frame flag to scan function.
Tests for Schema check.

JCZuurmond · 2021-12-13T11:01:46Z

Looks good to me! Once you complete the TODOs you mentioned, we will have the functionality described in the issue.

Pro tip: if you add "fixes #95" to the PR description, the issue gets closed automatically when this PR is merged, see here.

anilkulkarni87 · 2021-12-20T10:51:56Z

I have reworked the tests and added the methods below:

test_measurements_to_data_frame
test_test_results_to_data_frame
test_scanerror_to_data_frame

TODO:
Add as_frame flag to execute function in scan.py
Tests for schema check. This may not be required, I think.

JCZuurmond

Awesome! Great work, this is the approach I would expect. I have some remarks about the to_dict and the `select

anilkulkarni87 · 2021-12-20T22:20:06Z

@JCZuurmond
Added the as_frame flag to the execute function in scan.py.
Added a test, test_scan_execute_return_as_data_frame.

Please review.
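For readers following along, a rough sketch of how an optional `as_frame` flag on `execute` might behave. This is an assumption-laden illustration, not the signature from scan.py: `ScanResult`, `_run_scan`, and `_to_frames` are hypothetical stand-ins, plain lists take the place of Spark DataFrames, and the tuple order (measurements, test results, scan errors) is assumed from the changelog entry.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ScanResult:
    """Hypothetical stand-in for the object a scan returns."""

    measurements: list = field(default_factory=list)
    test_results: list = field(default_factory=list)
    errors: list = field(default_factory=list)


def _run_scan(scan_definition: str) -> ScanResult:
    """Hypothetical stub; a real scan would query the data frame."""
    return ScanResult(measurements=[("row_count", 5)])


def _to_frames(scan_result: ScanResult) -> tuple[list, list, list]:
    """Placeholder conversion; returns lists where the PR returns DataFrames."""
    return (
        list(scan_result.measurements),
        list(scan_result.test_results),
        list(scan_result.errors),
    )


def execute(
    scan_definition: str, as_frame: bool = False
) -> ScanResult | tuple[list, list, list]:
    """Return the raw scan result by default, or the converted tuple on request."""
    scan_result = _run_scan(scan_definition)
    if as_frame:
        return _to_frames(scan_result)
    return scan_result


raw = execute("table_name: demodata")
frames = execute("table_name: demodata", as_frame=True)
```

Defaulting the flag to `False` keeps existing callers unaffected, since they continue to receive the raw result object.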

JCZuurmond

Very nicely done! I have no major remarks; I added a couple of minor comments to keep the style of the code more consistent. Could you mark the PR as non-draft after you have finished resolving the comments?

I'll have one more look at it tomorrow when I have more time, and I think we should be able to merge it then.

JCZuurmond · 2021-12-21T07:53:49Z

src/sodaspark/scan.py

+    return out
+
+
+def testresults_to_data_frame(testresults: list[TestResult]) -> DataFrame:


Could you make testresults two words (test_results), in both the function name and the parameter?

JCZuurmond · 2021-12-21T08:03:11Z

One more important thing! Could you add your change to the changelog? You deserve the credit for this! You can add your change to the top bullet list with a reference to this PR. You can also add a contributors section (above the latest release and below the top bullet list) with a reference to your GitHub profile. See here for an example.

anilkulkarni87 · 2021-12-21T10:09:15Z

@JCZuurmond Updated and rebased the branch. Also had to update the schema for Test. All tests are now passing.
Please review and merge.

* Return scan resuts in dataframe
* Update TestResults schema
* Add test for converting test results to a data frame
* Added Tests for Dataframes
* Add optional flag as_frame to execute
* Fix schemas for measurements and scan_error
* Update schema for Test

Co-authored-by: Cor <[email protected]>

- Provides the ability to get the scan results in Dataframes. ([#99](#99))
  - measurements
  - test_results
  - scan_errors
- Use version range for `soda-spark-sql` dependency
- Add `host` and `port` attributes to `_SparkDialect`

Contributors:
- [Anil Kulkarni](https://github.com/anilkulkarni87) ([#99](#99))

Return scan resuts in dataframe

d1cabbd

anilkulkarni87 marked this pull request as draft December 13, 2021 09:18

anilkulkarni87 mentioned this pull request Dec 13, 2021

Have scan results as a dataframe #95

Closed

JCZuurmond reviewed Dec 13, 2021

src/sodaspark/scan.py (outdated)

Update TestResults schema

c7d3329

anilkulkarni87 changed the title ~~Return scan resuts in dataframe~~ Return scan resuts in dataframe fixed #95 Dec 14, 2021

anilkulkarni87 changed the title ~~Return scan resuts in dataframe fixed #95~~ Return scan results in dataframe fixes #95 Dec 14, 2021

JCZuurmond reviewed Dec 15, 2021

tests/test_scan.py (outdated)

JCZuurmond reviewed Dec 15, 2021

tests/test_scan.py (outdated)

JCZuurmond reviewed Dec 15, 2021

src/sodaspark/scan.py (outdated)

JCZuurmond and others added 4 commits December 16, 2021 09:02

Use version ranges (sodadata#100)

eeb595a

* Use version range for Soda spark dependency
* Rewrite spark version range
* Add port and host attributes

Add test for converting test results to a data frame

71e9f93

Merge pull request #1 from sodadata/scan-results-in-dataframe

129cb55

Add example test for converting test results to a data frame

Added Tests for Dataframes

1f24dc1

JCZuurmond suggested changes Dec 20, 2021

src/sodaspark/scan.py (outdated)

src/sodaspark/scan.py (outdated)

tests/test_scan.py (outdated)

anilkulkarni87 added 2 commits December 20, 2021 12:36

Fix schemas for measurements and scan_error

43772d0

Add optional flag as_frame to execute

acef270

anilkulkarni87 marked this pull request as ready for review December 20, 2021 22:20

Add optional flag as_frame to execute

ef8656a

JCZuurmond suggested changes Dec 21, 2021

anilkulkarni87 and others added 6 commits December 21, 2021 00:56

Updated changelog

2bde2c2

Return scan resuts in dataframe

4a4e649

Update TestResults schema

60ab7bf

Add test for converting test results to a data frame

7cecb5d

Added Tests for Dataframes

568fdcd

Fix schemas for measurements and scan_error

0d87d86

anilkulkarni87 added 6 commits December 21, 2021 01:35

Add optional flag as_frame to execute

a9291b2

Add optional flag as_frame to execute

fec6b06

After rebase

4794205

Merge branch 'scan-results-in-dataframe' of https://github.com/anilku…

e95c7c3

…lkarni87/soda-spark into scan-results-in-dataframe

Updated Changelog

7858022

Update schema for Test

b0a25ed

JCZuurmond merged commit a25986f into sodadata:main Dec 22, 2021

JCZuurmond mentioned this pull request Dec 22, 2021

Release 0.3.0 #102

Merged

anilkulkarni87 deleted the scan-results-in-dataframe branch April 1, 2022 07:41
