Explore value sorting determinism (and possible changes) #175
Comments
Thinking on this more and exploring a bit, outlining thoughts and findings so far below. This appeared again in #181, so I'm focusing on figuring out more of the reasons why this might occur.

Patterns

When this happens, the following appear to be consistent patterns:
Possible explanations

As a quick check I tried verifying that PyArrow sorting works the way it should. It seems that it does properly sort all values by all columns when implemented the way it is in CytoTable tests. See here for code demonstrating this. I feel there are several other possibilities for what's occurring which I'll work through in order to verify what's happening.
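For readers following along, here is a minimal sketch of why sorting by all columns is expected to be deterministic while sorting by a subset is not. It uses plain Python rather than PyArrow so it has no dependencies, and it is not the code linked above. The column names and values are hypothetical. In PyArrow the analogous step would be calling `Table.sort_by` with every column name listed.

```python
def sort_rows(rows, columns):
    """Sort rows (dicts) by the given columns, in the given column order."""
    return sorted(rows, key=lambda row: tuple(row[c] for c in columns))


# Hypothetical column names and values, for illustration only.
all_columns = ["Metadata_ImageNumber", "Metadata_ObjectNumber", "AreaShape_Area"]
rows = [
    {"Metadata_ImageNumber": 1, "Metadata_ObjectNumber": 1, "AreaShape_Area": 12.5},
    {"Metadata_ImageNumber": 1, "Metadata_ObjectNumber": 2, "AreaShape_Area": 8.0},
    {"Metadata_ImageNumber": 2, "Metadata_ObjectNumber": 1, "AreaShape_Area": 9.5},
    {"Metadata_ImageNumber": 2, "Metadata_ObjectNumber": 2, "AreaShape_Area": 9.5},
]

# Simulate the same data arriving in a different order.
reordered = list(reversed(rows))

# Sorting by every column: output does not depend on input order.
full_a = sort_rows(rows, all_columns)
full_b = sort_rows(reordered, all_columns)

# Sorting by only one column: ties keep their input order (Python's sort
# is stable), so the output depends on how the rows arrived.
partial_a = sort_rows(rows, ["Metadata_ImageNumber"])
partial_b = sort_rows(reordered, ["Metadata_ImageNumber"])

print(full_a == full_b)
print(partial_a == partial_b)
```

If the all-columns comparison holds but output still differs between runs, that would point away from the sort itself and toward one of the other possibilities.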
for further performance in cytomining#175
* customize sorting capabilities for further performance in #175
* simplify sql; exclude cytotable meta
* exclude duplicate columns
* updating tests
* fixing tests
* simulate csv source by removing meta
* update preset sql to use refined syntax
* address mixed type queries and tests
* simplify and further clarity in test
* correcting comment
* make sorting optional
* fix existing tests
* further sorting options applied
* add a test for unsorted output
From #174:
Because of the importance of this issue, adding that we need example cases where the fix has been validated with datasets larger than those used in testing.