fix: Resolve the issue of conflicts between columns added during the analysis process and the original data columns in the Spark version. #1518

Merged
fabclmnt merged 3 commits into ydataai:develop on May 6, 2024

@frelion frelion commented Dec 9, 2023

#1476

In the Spark version, the program adds auxiliary columns to the DataFrame at runtime, such as Count and Std.

If the original data being analyzed already contains columns with these names, the column names can conflict.
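
To make the conflict concrete, here is a minimal sketch in plain Python that checks which input column names collide with auxiliary names. The set `AUXILIARY_COLUMNS` and the helper `find_conflicts` are hypothetical and exist only for illustration; the description names just Count and Std, and the real list of auxiliary columns may differ.

```python
# Hypothetical set of auxiliary column names. Only "Count" and "Std" are
# mentioned in the description; the actual list may be longer.
AUXILIARY_COLUMNS = {"Count", "Std"}


def find_conflicts(user_columns):
    """Return the input column names that collide with auxiliary names."""
    return [name for name in user_columns if name in AUXILIARY_COLUMNS]


# A dataset that already has a "Count" column would collide.
conflicts = find_conflicts(["Age", "Count", "Income"])
print(conflicts)
```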

Solution:
Before the analysis starts, append the suffix "_customer" to every column name in the DataFrame.
Strip the suffix again when displaying the results.
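
The renaming idea can be sketched on plain lists of column names as below. This is an illustration under stated assumptions, not the code from this PR: the helper names `add_suffix` and `strip_suffix` are hypothetical, and only the "_customer" suffix comes from the description.

```python
SUFFIX = "_customer"  # suffix named in the PR description


def add_suffix(columns):
    """Append the suffix to every column name before analysis."""
    return [name + SUFFIX for name in columns]


def strip_suffix(name):
    """Remove one trailing suffix for display; leave other names untouched."""
    # Slicing instead of str.removesuffix so this also runs on Python 3.8.
    if name.endswith(SUFFIX):
        return name[: -len(SUFFIX)]
    return name


original = ["Count", "Std", "Age"]
renamed = add_suffix(original)          # names no longer clash with "Count"/"Std"
restored = [strip_suffix(n) for n in renamed]
print(renamed)
print(restored)

# Assumption: with a PySpark DataFrame `df`, the rename could be applied as
#   df = df.toDF(*add_suffix(df.columns))
# which may or may not match how the PR applies it.
```

Because only one trailing suffix is stripped, an input column that is itself named something like `x_customer` survives the round trip, and auxiliary columns such as Count pass through `strip_suffix` unchanged.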

@frelion frelion changed the title from "Resolve the issue of conflicts between columns added during the analysis process and the original data columns in the Spark version." to "fix: Resolve the issue of conflicts between columns added during the analysis process and the original data columns in the Spark version." on Dec 9, 2023

@PeterlitsZo PeterlitsZo left a comment

Looks good to me. This solves the problem.

frelion added 2 commits March 26, 2024 10:10
@fabclmnt fabclmnt force-pushed the fix/spark_column_conflict branch from 93555c1 to 925bcee Compare March 26, 2024 17:10
@fabclmnt fabclmnt merged commit ddcb388 into ydataai:develop May 6, 2024
4 of 7 checks passed

codecov bot commented May 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.25%. Comparing base (2d9a24b) to head (925bcee).
Report is 25 commits behind head on develop.

❗ Current head 925bcee differs from pull request most recent head 2d7f8bb. Consider uploading reports for the commit 2d7f8bb to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1518      +/-   ##
===========================================
+ Coverage    90.08%   90.25%   +0.17%     
===========================================
  Files          195      195              
  Lines         6383     6383              
===========================================
+ Hits          5750     5761      +11     
+ Misses         633      622      -11     
Flag Coverage Δ
py3.8-ubuntu-22.04-pandas 90.25% <ø> (+0.17%) ⬆️
