Should SparseArray.astype be dense or Sparse #23125
This also leads to confusing behavior like

In [10]: a.astype(bool)
Out[10]:
[0, True, 0, True]
Fill: 0
IntIndex
Indices: array([1, 3], dtype=int32)

since |
I think this is a regression compared to released version. Compare the last example:
When |
So we have a few choices
2 can be done and is backwards compatible. It means a special case in Block.astype, and means Series[sparse].astype isn't equivalent to SparseArray.astype. 1 is backwards incompatible. It'd be hard to do with a deprecation cycle I think. |
Isn't there also option 3: keep both SparseArray.astype and Series[sparse].astype always sparse? |
Yes, I was assuming we didn't want |
That's in any case what it is doing now (and in 0.23), so not doing that would also be a backwards incompatible change. On the one hand it is convenient to keep it sparse if you quickly want to change the sparse type (without needing to use |
But didn't we discuss that on the SparseArray PR? I seem to remember that you and @kernc argued for keeping it sparse? |
Hmm I don't recall right now. Not having |
The topic is touched here: #22325 (comment) |
How do you mean? Because that is what is already happening on master:
|
In the case of |
Do people (@kernc) have thoughts on whether
Closing this in a few days if there aren't any further comments. |
I'd foremost prefer to avoid shaping the decision that may in future turn out to be a bad idea. 😄 As a user, I've never used
I understand series sparsity to be orthogonal to its underlying dtype. Dtype is the type of values and sparsity is the way they are laid out in memory. The latter can be idempotently transformed from one into the other and back while the former not necessarily so.
Admittedly, I'm no longer as interested in the topic and not regularly working with sparse data, but as a potential user, I'd be annoyed if
But I agree with @jorisvandenbossche's comment that
Perhaps get input from some other sparse users. |
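To make the "sparsity is orthogonal to dtype" argument concrete, here is a minimal pure-Python sketch. `ToySparse` is a hypothetical toy container invented for illustration; it is not how pandas implements `SparseArray`. It only shows that a layout change round-trips exactly, while a value-type change keeps the layout but may lose information.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class ToySparse:
    """Hypothetical toy container: stores only values that differ from `fill`."""
    length: int
    fill: Any
    indices: List[int]
    values: List[Any]

    @classmethod
    def from_dense(cls, dense: List[Any], fill: Any) -> "ToySparse":
        # Remember only the positions whose value is not the fill value.
        idx = [i for i, v in enumerate(dense) if v != fill]
        return cls(len(dense), fill, idx, [dense[i] for i in idx])

    def to_dense(self) -> List[Any]:
        # Rebuild the full list from the fill value plus the stored entries.
        out = [self.fill] * self.length
        for i, v in zip(self.indices, self.values):
            out[i] = v
        return out

    def astype(self, caster: Callable[[Any], Any]) -> "ToySparse":
        # Cast the stored values and the fill value; the layout is untouched.
        return ToySparse(
            self.length,
            caster(self.fill),
            list(self.indices),
            [caster(v) for v in self.values],
        )


dense = [0.0, 1.5, 0.0, 2.5]
sparse = ToySparse.from_dense(dense, fill=0.0)

# Changing the layout (dense -> sparse -> dense) loses nothing.
round_tripped = sparse.to_dense()

# Changing the value type keeps the layout but truncates 1.5 and 2.5.
as_int = sparse.astype(int)
back_to_float = as_int.astype(float).to_dense()
```

In this toy model `astype` never changes whether the data is stored sparsely, which is the behavior the comment above argues for.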
A side question, certainly a little late at that, what blows up if |
Great, I think we're in agreement then.
I have a WIP branch that I hope to push tomorrow doing this.
You can't subclass |
Long term it might be nice to be able to make the distinction between |
Right now SparseArray.astype(numpy_dtype) is sparse:
This is potentially confusing. I did it to match the behavior of SparseSeries, but we may not want that.
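As a hedged illustration of how a caller can sidestep the ambiguity: what `astype` returns for a plain NumPy dtype is exactly what this issue is about, and it may differ between pandas versions. The sketch below therefore asks for each result explicitly. It assumes a pandas version that provides `pd.arrays.SparseArray` and `pd.SparseDtype`; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

arr = pd.arrays.SparseArray([0, 1, 0, 1])

# Explicitly sparse: passing a SparseDtype keeps the result a SparseArray.
sparse_bool = arr.astype(pd.SparseDtype(bool))

# Explicitly dense: materialize a NumPy array first, then cast it.
dense_bool = np.asarray(arr).astype(bool)

print(type(sparse_bool).__name__, sparse_bool.dtype)
print(type(dense_bool).__name__, dense_bool.dtype)
```

Spelling out the intent this way does not depend on whichever default the maintainers settle on for `SparseArray.astype(numpy_dtype)`.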