Supporting Spark Connect Dataframe #1634

Open
chenbojian opened this issue Aug 6, 2024 · 2 comments

Comments

@chenbojian

chenbojian commented Aug 6, 2024

Missing functionality

Since Databricks Runtime 14, the DataFrame type in notebooks has changed. It used to be pyspark.sql.dataframe.DataFrame, but now it is pyspark.sql.connect.dataframe.DataFrame.
This breaks ydata-profiling, which expects either a pandas.DataFrame or a pyspark.sql.dataframe.DataFrame.

Proposed feature

Support pyspark.sql.connect.dataframe.DataFrame for profiling
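
To illustrate what the type check might look like, here is a minimal sketch that classifies a DataFrame by its fully qualified class name without importing pyspark. The helper `dataframe_backend` and the labels it returns are hypothetical and not part of ydata-profiling; the two class paths are the ones quoted in this issue.

```python
# Fully qualified class names, as quoted in this issue.
CONNECT_SPARK = "pyspark.sql.connect.dataframe.DataFrame"
CLASSIC_SPARK = "pyspark.sql.dataframe.DataFrame"


def dataframe_backend(df):
    """Hypothetical helper: return a label for the kind of DataFrame `df` is.

    Walks the class hierarchy (MRO) of `df` and compares each class's
    fully qualified name with the known Spark DataFrame classes, so no
    pyspark import is needed. The first match along the MRO wins.
    """
    for cls in type(df).__mro__:
        name = f"{cls.__module__}.{cls.__qualname__}"
        if name == CONNECT_SPARK:
            return "spark-connect"
        if name == CLASSIC_SPARK:
            return "spark-classic"
    return "unknown"
```

A dispatcher could then route "spark-connect" to whichever code path ends up supporting it, rather than rejecting the object outright.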

Alternatives considered

No response

Additional context

image
@charleslondon

Bumping this as I also have this issue

@dan-eschman

Bump - same issue here. It would be great if I didn't have to go to pandas first.
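
For reference, the "go to pandas first" workaround can be sketched roughly as below. The helper name `to_pandas_for_profiling` and its `max_rows` parameter are made up for illustration; `limit()` and `toPandas()` are existing Spark DataFrame methods. Note that `toPandas()` collects the data onto the driver, so capping the row count is likely wise for large tables.

```python
def to_pandas_for_profiling(df, max_rows=None):
    """Hypothetical helper: convert a Spark-style DataFrame to pandas.

    Anything exposing `toPandas()` is treated as a Spark DataFrame
    (classic or Spark Connect). Other objects, such as a DataFrame that
    is already pandas, are returned unchanged.
    """
    if hasattr(df, "toPandas"):
        if max_rows is not None:
            # Cap the rows before collecting everything to the driver.
            df = df.limit(max_rows)
        return df.toPandas()
    return df


# Usage sketch (assumes ydata-profiling is installed and `spark_df` exists):
# from ydata_profiling import ProfileReport
# report = ProfileReport(to_pandas_for_profiling(spark_df, max_rows=100_000))
```

This only sidesteps the problem: the profile is computed by pandas on the driver rather than by Spark, which is exactly the detour this issue asks to avoid.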

4 participants