Skip to content
This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Added try_new and new to all arrays #873

Merged
merged 4 commits into from
Mar 1, 2022
Merged

Added try_new and new to all arrays #873

merged 4 commits into from
Mar 1, 2022

Conversation

jorgecarleitao
Copy link
Owner

This PR introduces the functions try_new -> Result<Self> and new -> Self to every array.

Background

The function named from_data is currently used to initialize arrays. This name comes from the original arrow-rs implementation, where there is a concept of ArrowData. This concept is not used here, but the name "from_data" remained. Furthermore, this array currently panics, which means that it can only be used on trusted data.

This PR

  • adds two new functions, try_new and new that align with most of Rust's ecosystem.
  • adds try_new_unchecked and new_unchecked for arrays whose validation is O(N) (e.g. variable length, utf8).
  • adds better error messages to the validation

I have also (re-)audited all the invariants of every array to make sure they align with arrow's spec (they do).

Next steps

  • replace from_data throughout the code base by either try_new or new accordingly (depending on whether we expect the invariant to be held)
  • deprecate from_data?

@jorgecarleitao jorgecarleitao added the enhancement An improvement to an existing feature label Mar 1, 2022
@codecov
Copy link

codecov bot commented Mar 1, 2022

Codecov Report

Merging #873 (bf4f4df) into main (fca3b8d) will increase coverage by 0.39%.
The diff coverage is 77.81%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #873      +/-   ##
==========================================
+ Coverage   71.52%   71.92%   +0.39%     
==========================================
  Files         335      338       +3     
  Lines       17989    18499     +510     
==========================================
+ Hits        12867    13305     +438     
- Misses       5122     5194      +72     
Impacted Files Coverage Δ
src/array/list/mod.rs 70.53% <57.50%> (-10.95%) ⬇️
src/array/specification.rs 84.44% <58.33%> (-7.45%) ⬇️
src/array/map/mod.rs 49.36% <73.33%> (+6.74%) ⬆️
src/array/null.rs 46.34% <77.77%> (+8.10%) ⬆️
src/array/fixed_size_list/mod.rs 72.50% <78.57%> (+0.83%) ⬆️
src/array/struct_/mod.rs 67.85% <78.94%> (+3.79%) ⬆️
src/array/primitive/mod.rs 80.85% <80.00%> (+0.61%) ⬆️
src/array/fixed_size_binary/mod.rs 70.32% <82.60%> (+1.90%) ⬆️
src/array/boolean/mod.rs 72.50% <85.71%> (+2.07%) ⬆️
src/array/utf8/mod.rs 66.66% <85.71%> (+1.17%) ⬆️
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fca3b8d...bf4f4df. Read the comment docs.

@jorgecarleitao jorgecarleitao merged commit eb4bc5d into main Mar 1, 2022
@jorgecarleitao jorgecarleitao deleted the validate branch March 1, 2022 19:19
sydduckworth pushed a commit to mindx/arrow2 that referenced this pull request Mar 2, 2022
sydduckworth pushed a commit to mindx/arrow2 that referenced this pull request Mar 2, 2022
@jorgecarleitao jorgecarleitao added feature A new feature and removed enhancement An improvement to an existing feature labels Mar 6, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
feature A new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant