SNOW-1889503: [Local Testing] Group by in empty dataframe adds "phantom" row with NULL group by key #2886

Open
miguelaraujo2 opened this issue Jan 20, 2025 · 0 comments
Labels
bug Something isn't working needs triage Initial RCA is required

Comments

@miguelaraujo2

Please answer these questions before submitting your issue. Thanks!

  1. What version of Python are you using?

3.11.11

  2. What operating system and processor architecture are you using?

Linux-6.8.0-51-generic-x86_64-with-glibc2.39

  3. What are the component versions in the environment (pip freeze)?

N/A

  4. What did you do?

    In local testing mode, performing a group by on an empty DataFrame produces a DataFrame with one row whose group-by key is NULL.

Simple reproducible example:

from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, sum

# Local testing mode: no connection to Snowflake is required.
session = Session.builder.config('local_testing', True).create()

df = session.create_dataframe([[1, 5]], schema=['id', 'count'])

# The filter removes the only row, so the group by runs on an empty DataFrame.
df.filter(col("id") > 1).group_by(col("id")).agg(sum(col("count")).alias("count")).show()

  5. What did you expect to see?

    An empty DataFrame should be shown.
    However, we obtain the following output:

------------------
|"ID"  |"COUNT"  |
------------------
|NULL  |nan      |
------------------

  6. Can you set logging to DEBUG and collect the logs?
    N/A
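
For reference, here is a minimal sketch of the semantics I would expect, written in plain Python with no Snowpark dependency. It is not how Snowpark implements anything; `group_by_sum` is a hypothetical helper that only illustrates the rule that a grouped aggregation emits one row per key actually present, so an empty input yields an empty result.

```python
from collections import defaultdict


def group_by_sum(rows, key, value):
    """Hypothetical helper: sum `value` per distinct `key`, like a SQL GROUP BY."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    # One output row per key that was actually seen; no input rows means no output rows.
    return [{key: k, value: v} for k, v in totals.items()]


rows = [{"id": 1, "count": 5}]

# Same predicate as the repro (id > 1): it removes the only row.
filtered = [r for r in rows if r["id"] > 1]

result = group_by_sum(filtered, "id", "count")
unfiltered_result = group_by_sum(rows, "id", "count")

print(result)             # []
print(unfiltered_result)  # [{'id': 1, 'count': 5}]
```

The empty list for the filtered input corresponds to the empty DataFrame expected above; the (NULL, nan) row reported by local testing has no counterpart here.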
@miguelaraujo2 miguelaraujo2 added bug Something isn't working needs triage Initial RCA is required labels Jan 20, 2025
@github-actions github-actions bot changed the title [Local Testing] Group by in empty dataframe adds "phantom" row with NULL group by key SNOW-1889503: [Local Testing] Group by in empty dataframe adds "phantom" row with NULL group by key Jan 20, 2025