Skip to content
This repository has been archived by the owner on Jan 14, 2025. It is now read-only.

Reduced shared parquet file contention in testing #440

Merged
merged 2 commits into from
Apr 29, 2024
Merged

Reduced shared parquet file contention in testing #440

merged 2 commits into from
Apr 29, 2024

Conversation

wilsonbb
Copy link
Collaborator

@wilsonbb wilsonbb commented Apr 29, 2024

Change Description

Since moving to dask-espr, we've seen increased test failures. In #434 there have been recent failures in test_batch_by_band where a flux column can sometimes be read in as NaNs during the from_parquet call used in that test invocation. Interestingly this only seems to occur when the tests are run as a suite and not when test_batch_by_band is run in isolation.

We can achieve the same effect by having the test build an ensemble using duplicate files of the source and object files. This appears to fix the test when run on github actions, but we should still investigate the underlying issue with from_parquet and then revert this change.

  • My PR includes a link to the issue that I am addressing

Code Quality

  • My code builds (or compiles) cleanly without any errors or warnings
  • My code contains relevant comments and necessary documentation

Bug Fix Checklist

  • My fix includes a new test that breaks as a result of the bug (if possible)

Copy link

github-actions bot commented Apr 29, 2024

Before [8f33ee8] After [1c6dc5a] Ratio Benchmark (Parameter)
33.0±0.8ms 32.4±0.1ms 0.98 benchmarks.time_prune_sync_workflow
33.2±0.6ms 32.0±0.7ms 0.96 benchmarks.time_batch

Click here to view all benchmarks.

@wilsonbb wilsonbb changed the title Hotfix for from_parquet contention Reduced shared parquet file contention in testing Apr 29, 2024
Copy link

codecov bot commented Apr 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.82%. Comparing base (40236a3) to head (5eb4149).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #440   +/-   ##
=======================================
  Coverage   95.82%   95.82%           
=======================================
  Files          25       25           
  Lines        1772     1772           
=======================================
  Hits         1698     1698           
  Misses         74       74           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@wilsonbb wilsonbb requested a review from dougbrn April 29, 2024 18:50
@wilsonbb wilsonbb marked this pull request as ready for review April 29, 2024 18:50
Copy link
Collaborator

@dougbrn dougbrn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that this works, makes me really want to have TAPE generate it's own test data via generator functions rather than use these test datasets...

tests/tape_tests/test_ensemble.py Outdated Show resolved Hide resolved
@wilsonbb wilsonbb merged commit 8f33ee8 into main Apr 29, 2024
7 of 8 checks passed
@wilsonbb wilsonbb deleted the hot_fix branch April 29, 2024 19:01
@wilsonbb wilsonbb mentioned this pull request Apr 29, 2024
3 tasks
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants