Releases: elixir-explorer/explorer

v0.10.1

28 Nov 22:37
1ca086d

Fixed

  • Fix creation of series of {:list, {:decimal, ...}} containing empty lists.

  • Use i128 for :coef field in the Rust code.

    This field is a positive, arbitrary precision integer on the Elixir side.
    It's convenient to represent it as a signed i128 because that's what the Decimal dtype expects. While you could technically create an ExDecimal struct with a negative coef, it's not a practical concern.

  • Fix Explorer.DataFrame.print/1 for empty dataframes.

  • Fix datetime encoding overflow.

    Previously, we always converted to a microsecond-based representation first and then to the final representation. The intermediate conversion is unnecessary and risks overflows when converting to a different time unit later.
    The new approach converts directly to i64 from the Elixir struct and time unit.

  • Encode millisecond precision for time and datetime series.

  • Fix list struct print bug.

    Fixes an issue where we can't print columns with a dtype like {:list, {:struct, ...}} where the root of the tree isn't a :struct but it contains a :struct.

Deprecated

  • Remove documentation for deprecated functions to_date/1 and to_time/1.
    Both are functions from Explorer.Series that will soon be removed.

Full Changelog: v0.10.0...v0.10.1

SHA256 of the artifacts

d5b411a209e215a435557c17320084efd01f3d8d409c4c62f3cd8ecf24cea2ec  explorer-v0.10.1-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
afa33f6285fc1f2bec84ae4d40a405d6be2956643372650b0c80d5d22b09c5f9  explorer-v0.10.1-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
48efe024015f4210f80858d91a35f0b5018b55af1221b7ce8b3ec03762d9402c  explorer-v0.10.1-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
4c1cf15bda0d706535fe83c647f0f0870c9cfe4d912112aee529e49887c05c74  explorer-v0.10.1-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
4c7302413bad0625f99433712cb43105134f280526dd51a0a6967c0f1ed674c4  libexplorer-v0.10.1-nif-2.15-aarch64-apple-darwin.so.tar.gz
90fbd41ff49ea5f002e9b7a5d81998a8a08999526526c2a48ba0d0c93931af75  libexplorer-v0.10.1-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
7530142224f105c8be1c406b32d735a7b45a9c9a68b70317c711f22d3098d2d0  libexplorer-v0.10.1-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
d4d38fbe70c40ed8d46111ce0b87a6ad3b8faad224e36b9f8c30396f5e4c4739  libexplorer-v0.10.1-nif-2.15-x86_64-apple-darwin.so.tar.gz
c4e0256845332d111bdf0b4c02f0654f4a28f0dbfea8695cb6547b864391731b  libexplorer-v0.10.1-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
167de9dd4aa75864d453cf31daed8d28da3977ae9c5cc1eee445331cb0a45336  libexplorer-v0.10.1-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
aa6554366140259ef9dec5ef197d70a838f7ff7e26797da34fd0dd2ebe49d7ef  libexplorer-v0.10.1-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
ea3146648bec7e1ad30956e4609a5522a15db026c33e51a95c23d90064a1541c  libexplorer-v0.10.1-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
5c049bd578ef80814863a04fbfb56a593b273dae3aa5e7e8f4da62ffd6e84b9d  libexplorer-v0.10.1-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

v0.10.0

24 Oct 00:00
5ee26d5

Added

  • Add support for the decimals data type.

    Decimal dtypes are represented by the {:decimal, precision, scale} tuple,
    where precision is an integer from 0 to 38 and is the maximum number
    of digits that can be represented by the decimal. The scale is the number of
    digits after the decimal point.

    With this addition, we also added the :decimal package as a new dependency.
    The Explorer.Series.from_list/2 function accepts decimal numbers from that
    package as values - %Decimal{}.

    This version has a small number of operations, but is a good foundation.

  • Allow the usage of queries and lazy series outside callbacks and macros.
    This is an improvement to functions that were originally designed to accept callbacks.
    With this change you can now reuse lazy series across different "queries".
    See the Explorer.Query docs for details.

    The affected functions are:

    • Explorer.DataFrame.filter_with/2
    • Explorer.DataFrame.mutate_with/2
    • Explorer.DataFrame.sort_with/2
    • Explorer.DataFrame.summarise_with/2

  • Allow accessing the dataframe inside a query.

  • Add "lazy read" support for Parquet and NDJSON from HTTP(s).

  • Expose more options for Explorer.Series.cut/3 and Explorer.Series.qcut/3.
    These options were available in Polars, but not in our APIs.
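
As a minimal sketch of the decimal dtype described above, the script below builds a series from %Decimal{} values and inspects its dtype. It assumes a standalone script that fetches Explorer through Mix.install/1; the variable name prices is only illustrative, and the exact precision and scale that are inferred are not asserted here.

```elixir
Mix.install([{:explorer, "~> 0.10.0"}])

alias Explorer.Series

# %Decimal{} values come from the :decimal package, which Explorer
# depends on since v0.10.0.
prices = Series.from_list([Decimal.new("1.25"), Decimal.new("10.50")])

# The dtype is reported as a {:decimal, precision, scale} tuple.
IO.inspect(Series.dtype(prices), label: "dtype")
IO.inspect(Series.to_list(prices), label: "values")
```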

Fixed

  • Fix creation of series where a nil value inside a list - for a {:list, any()} dtype -
    could result in an incompatible dtype. This fix will prevent panics for list of lists with
    nil entries.

  • Fix Explorer.DataFrame.dump_ndjson/2 when date time is in use.

  • Fix Explorer.Series.product/1 for lazy series.

  • Accept %FSS.HTTP.Entry{} structs in functions like Explorer.DataFrame.from_parquet/2.

  • Fix encoding of binaries to terms for series of the {:struct, any()} dtype.
    If any inner field of the struct was a binary (:binary dtype), encoding
    caused a panic.

Changed

  • Change the defaults of the functions Explorer.Series.cut/3 and Explorer.Series.qcut/3
    to not include the "break points" column in the resultant dataframe.
    So the :include_breaks is now false by default.
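
The changed default can be illustrated with a short sketch, assuming a standalone script that installs Explorer via Mix.install/1. The sample values and bin edges are arbitrary; only the :include_breaks option comes from the note above.

```elixir
Mix.install([{:explorer, "~> 0.10.0"}])

alias Explorer.{DataFrame, Series}

s = Series.from_list([1.0, 2.0, 3.0])

# Default since v0.10.0: the result has no "break points" column.
without_breaks = Series.cut(s, [1.5, 2.5])

# Pass include_breaks: true to get the previous output shape back.
with_breaks = Series.cut(s, [1.5, 2.5], include_breaks: true)

IO.inspect(DataFrame.names(without_breaks), label: "default")
IO.inspect(DataFrame.names(with_breaks), label: "include_breaks: true")
```

Code that reads the break points column after upgrading from v0.9.x would need to pass the option explicitly.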

Full Changelog: v0.9.2...v0.10.0

SHA256 of the artifacts

28897fbb14f54a6a9d996cccf4d863dfdc27e589324bd22382c434402ddd258a  explorer-v0.10.0-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
f102a65a788cd6cd5d9cecea33ff3e10f924ff90bf9b43b65a5acc86b38f1d40  explorer-v0.10.0-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
d414eabfca363b1731f24917445b05a1b2c969c079e35d9cfb6aa59ef1d02af5  explorer-v0.10.0-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
aa71834d756c39b02dd45e9474e3782758cf3d4617b61bc891a820ec8fb35981  explorer-v0.10.0-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
ab0dc0df737611b73222072e865ba9a29fc4026ae5b33cd337c92c26137c835b  libexplorer-v0.10.0-nif-2.15-aarch64-apple-darwin.so.tar.gz
b837b3523b60e7804a0d38a8466ed8e0f3f68b2cea9ea10051811a0b20b08932  libexplorer-v0.10.0-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
c8d6c07af4a6b76b4769931132f045c1eb02a7916089f4db13481d2f5f1db363  libexplorer-v0.10.0-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
51fa272758b6cf331054ba3f27c13a64347c68f193d0c337240f1c79217678d2  libexplorer-v0.10.0-nif-2.15-x86_64-apple-darwin.so.tar.gz
804602cab628dae931c5ab6c78d0c62b878da764ca5312b61e03b32c0cbb420d  libexplorer-v0.10.0-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
ae8c48ff9b5d68a6587c423f71b5083a0e2e05dc79de7b136479527dd1508dec  libexplorer-v0.10.0-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
493a49c6ff72382c5ea5c79b01430230cc606174559ee596effe1022d8b665ef  libexplorer-v0.10.0-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
e7ead11dbca060e2f596e34507c304213bdda3ef0f26fc7c0f19db4edfd29b63  libexplorer-v0.10.0-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
62e480e536ef9a132bbc542df4a043f52fa6512a2ed8c2c5165ff4cff47b0e07  libexplorer-v0.10.0-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

v0.9.2

27 Aug 13:53
1175ff9

Added

  • Add a new :keep option to the mutate_with/3 function and mutate/3 macro.
    This option allows users to control which columns are retained in the output
    dataframe after a mutation operation. You can use :all (the default) or :none.
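
A minimal sketch of the :keep option, assuming a standalone script that installs Explorer via Mix.install/1. The column names a, b, and total are made up for illustration.

```elixir
Mix.install([{:explorer, "~> 0.10.0"}])

# mutate/3 is a macro, so the module has to be required.
require Explorer.DataFrame, as: DF

df = DF.new(a: [1, 2, 3], b: [10, 20, 30])

# keep: :all is the default, so the original columns are retained.
with_all = DF.mutate(df, [total: a + b], keep: :all)

# keep: :none retains only the columns created by the mutation.
only_new = DF.mutate(df, [total: a + b], keep: :none)

IO.inspect(DF.names(with_all), label: "keep: :all")
IO.inspect(DF.names(only_new), label: "keep: :none")
```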

Fixed

  • Fix handling of "LazySeries" with remote dataframes.
  • Fix typespecs of Explorer.Series.cast/2 by adding a dtype_alias() type.
  • Stop converting io_dtypes() to maps in order to preserve names ordering.

Full Changelog: v0.9.1...v0.9.2

SHA256 of artifacts

6717497ec99ba169d3224f63a59099650311e8e376480327a6251c5c8c9544f2  explorer-v0.9.2-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
a1889f2558a125e4703894db04d1fab2aae2c07daf8ff2724922a73b67376368  explorer-v0.9.2-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
17def23350d5e6367a88734b5b8c1d3d7d7369f61dd4514c22287b5ddb782f3b  explorer-v0.9.2-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
5554f17bbb5823ada068ef7b03fbd7504213c93861395e90490313909c9e524c  explorer-v0.9.2-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
73c1fcc0db80c93b41bb74ee643de6ddc2e6c7053fe6eccd234edb007fa3a044  libexplorer-v0.9.2-nif-2.15-aarch64-apple-darwin.so.tar.gz
e548d17dbf70de230f6a13f4576182f611065d0765c4370ace9e01ec6d1ebb77  libexplorer-v0.9.2-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
14a2f07fcdb815ecc483f4dcedbded982b59be40633528e71634e70e961d9f91  libexplorer-v0.9.2-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
1283a62cd2234d25b4b6d4d35a23a48e8fda2b915e068f91dcceb174c3a492aa  libexplorer-v0.9.2-nif-2.15-x86_64-apple-darwin.so.tar.gz
599e73cc71dac39d4e0a8607a59176655591705a132f7f32b32b90045482e8eb  libexplorer-v0.9.2-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
6ce4df2a9c1815be4f0d0d8fadd0f6cdc55172ebfb79bb77bbd1a008bebb6f09  libexplorer-v0.9.2-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
936e4cd3b9db9039538893fc634b1c34c33e1c8636b00fc396822d10f0bab7c4  libexplorer-v0.9.2-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
b07378f05f51c35f79b20e2fc78dfb804626e6583b19de8aa47be773bd2fe5c8  libexplorer-v0.9.2-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
0afe0cc7410a2c09f30ae81ef57324e69b22f736705224e46b73b36837882250  libexplorer-v0.9.2-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

Build with attestations: https://github.com/elixir-explorer/explorer/actions/runs/10579468339

v0.9.1

15 Aug 21:31
1f2ccbb

Added

  • Add support for saving to the cloud using streaming and the IPC format.
    This will enable saving a lazy frame to the cloud without loading it
    entirely in memory. It only supports saves to S3-compatible storage services.

Changed

  • Force garbage collection on remote gc.

Fixed

  • Re-enable support for saving to the cloud using streaming and the Parquet format.
    This fixes the v0.9.0 release, which had disabled the feature.

  • Fix overwrite of dtypes for Explorer.DataFrame.load_csv/2.
    This was a regression introduced in v0.9.0.

Full Changelog: v0.9.0...v0.9.1

SHA256 of artifacts

13a1063430989ab65536e1195976028fa5a6274fcb71b04e9c77e77ffcc64f62  explorer-v0.9.1-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
eec297e6d1a20c0fb4fcb83a9779dc0199e03252a1df1e985dcc95d44f3e533f  explorer-v0.9.1-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
dedfe9f2e0b0a620038abeb40f8f6ae67decfd5eabb7313bd137168d94b25357  explorer-v0.9.1-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
f2eb81ed0ed7eb5ed8d65a4fe6e6ae86beb77158da0ab8a514cfbd38e42805c9  explorer-v0.9.1-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
142ec5f7898cbea3213dc1f36db3082798b96ac75688d7b7d3cd4521b6c26183  libexplorer-v0.9.1-nif-2.15-aarch64-apple-darwin.so.tar.gz
330cce54a8fc1a3f6ff79f340b3ad966a3705efbc67b13b16a03efca0e0567c4  libexplorer-v0.9.1-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
eba94a784c28729e142143107092cf8c5c7534d443e6825ed68c878d1a01fd40  libexplorer-v0.9.1-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
9ba64fe4ba60bf218049752761cabda7fd5a41401c5938d69aad72a0c74dbf9f  libexplorer-v0.9.1-nif-2.15-x86_64-apple-darwin.so.tar.gz
aceea3bbc047feb7729110f3ec0bf61d087efb0efd76e39c93ccb92485a8595c  libexplorer-v0.9.1-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
4edd090d6c200949d8cacdf0aee393cb0afbe188ec5035d06cb130cbb69c70ea  libexplorer-v0.9.1-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
4abc7a67b27202b468eaa00c9b42afce896edfb0124204356e82128ef7a90a95  libexplorer-v0.9.1-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
43dc79e1907c3230169136b002458b929fda3fc1fd2a6ed3287fa44cbf9db1f7  libexplorer-v0.9.1-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
4f58b77bcbdd9c3c9545e26daf92eeb971a97bd0b382a5855b66b3823be51e7f  libexplorer-v0.9.1-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

v0.9.0

26 Jul 21:08
2537c18

Added

  • Add initial support for SQL queries.

    The Explorer.DataFrame.sql/3 is a function that accepts a dataframe and a SQL query. The SQL is not validated by Explorer, so the queries will be backend dependent. Right now we have only Polars as the backend.

  • Add support for remote series and dataframes.

    Automatically transfer data between nodes for remote series and dataframes and perform distributed garbage collection.

    The functions in Explorer.DataFrame and Explorer.Series will automatically move operations on remote dataframes to the nodes they belong to.
    The Explorer.Remote module provides additional conveniences for manual placement.

  • Add FLAME integration, so we automatically track remote series and dataframes returned from FLAME calls when the :track_resources option is enabled.
    See FLAME for more.

  • Add Explorer.DataFrame.transform/3 that applies an Elixir function to each row. This function is similar to Explorer.Series.transform/2, and as such, it's considered an expensive operation. So it's recommended only if there is no similar dataframe or series operation available.

  • Improve performance of Explorer.Series.from_list/2 for most of the cases where the :dtype option is given. This is especially true when the dtype is :binary.
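
The SQL support listed above can be sketched as follows, assuming a standalone script that installs Explorer via Mix.install/1. The :table_name option and the name "df" used inside the query are assumptions about how the dataframe is exposed to SQL, so check the Explorer.DataFrame.sql/3 docs before relying on them. The call to collect/1 is a precaution in case the result is lazy.

```elixir
Mix.install([{:explorer, "~> 0.10.0"}])

alias Explorer.DataFrame, as: DF
alias Explorer.Series

df = DF.new(a: [1, 2, 3], b: ["x", "y", "z"])

# Assumption: the dataframe is visible to the query under the name
# given in :table_name. Explorer does not validate the SQL itself;
# it is passed on to the backend (Polars).
result =
  df
  |> DF.sql("SELECT a FROM df WHERE a > 1", table_name: "df")
  |> DF.collect()

IO.inspect(result |> DF.pull("a") |> Series.to_list(), label: "a")
```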

Changed

  • Stop inference of dtypes if the :dtype option is given by the user.
    The main goal of this change is to improve performance. We are now delegating the job of decoding the terms as the given :dtype to the backend.

  • Explorer.Series.pow/2 no longer casts to float when the exponent is a signed integer. We are following the way Polars works now, which is to try to execute the operation or raise an exception in case the exponent is negative.

  • Explorer.DataFrame.pivot_wider/4 no longer includes the names_from column name in the new columns when values_from is a list of columns. This is more consistent with its behaviour when values_from is a single column.

  • Explorer.Series.substring/3 no longer cycles to the end of the string if the negative offset surpasses the beginning of that string. In that case, an empty string is returned.

  • The Explorer.Series.ewm_* functions no longer replace nil values with the value at the previous index. They now propagate nil values through to the result series.

  • Saving a dataframe as a Parquet file to S3 services no longer works when streaming is enabled. This is temporary due to a bug in Polars. An exception should be raised instead.

Full Changelog: v0.8.3...v0.9.0

SHA256 of artifacts

aeed3719479b9bbe1e342af272927a75b5ad4f38bd89dd971e739561d1923172  explorer-v0.9.0-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
720914c3e85a0869174cd43c26b65400e6ac0131993ea5796689b2d19cf364e2  explorer-v0.9.0-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
ddd1f1f70d2791fc662fd477c34f9c4aa18a7d4d1a80bf953e39ac3d49924de7  explorer-v0.9.0-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
52138d2657f8af5c85b75d22129b107e0547d0ef4b3abca966de48a1b18be6d6  explorer-v0.9.0-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
6210497158c3479bdf9f8ab8661fca6bc02addde2cce085d5b134b4cc43ad5d4  libexplorer-v0.9.0-nif-2.15-aarch64-apple-darwin.so.tar.gz
78d54509a7a37e8e174cff4cce06329e75740d2cab8936834f8df06ed6a4eaea  libexplorer-v0.9.0-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
3acd48fc82d89eeb74b54db70e3a4f44404f8764f5f63d767f0281a93f91ab35  libexplorer-v0.9.0-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
2c2390d37a0171c0e096e96620959d78b8d66c73907d219d9f205562df933983  libexplorer-v0.9.0-nif-2.15-x86_64-apple-darwin.so.tar.gz
0dca014ba38a6be5705607bdf6560bd9c485dcd1596e8f0c7dde4e69001a2c93  libexplorer-v0.9.0-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
96ec9de1e472a504bd101cd05273873930337089286103911a2d43db72ae48a8  libexplorer-v0.9.0-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
1baee4332cc0f6e5c2e1f505ce3ca98f6350cdcda73a3cde476d2f2690ab2094  libexplorer-v0.9.0-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
320b2fc65700b09f20c58d0674631b67714dffe665df34dd39e9b9984051ee79  libexplorer-v0.9.0-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
7a93e9a34e7dfac721d96dca49c57405b40da5330dfa9ac78add561a234c4040  libexplorer-v0.9.0-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

v0.8.3

10 Jun 20:18
b3bee3b

Added

  • Add new data type for datetimes with timezones: {:datetime, precision, time_zone}
    The old dtype is now {:naive_datetime, precision}.

  • Add option to rechunk the dataframes when using Explorer.DataFrame.from_parquet/3

Changed

  • Change the {:datetime, precision} dtype to {:naive_datetime, precision}.
    The idea is to mirror Elixir's datetime, and introduce support for time zones.
    Please note: {:datetime, precision} will work as an alias for {:naive_datetime, precision} for now but will raise a warning.
    The alias will be removed in a future release.

  • Literal %NaiveDateTime{} structs used in expressions will now have :microsecond precision.
    Previously they defaulted to :nanosecond precision.
    This was incorrect because %NaiveDateTime{} structs only have :microsecond precision.
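
To make the split between the two dtypes concrete, here is a small sketch, assuming a standalone script that installs Explorer via Mix.install/1. The timestamps are arbitrary, and the exact precision and time zone that are inferred are printed rather than asserted.

```elixir
Mix.install([{:explorer, "~> 0.10.0"}])

alias Explorer.Series

# %NaiveDateTime{} values carry no time zone.
naive = Series.from_list([~N[2024-06-10 20:18:00]])

# %DateTime{} values carry a time zone.
aware = Series.from_list([~U[2024-06-10 20:18:00Z]])

IO.inspect(Series.dtype(naive), label: "naive")
IO.inspect(Series.dtype(aware), label: "aware")
```

The first dtype should take the {:naive_datetime, precision} shape and the second the {:datetime, precision, time_zone} shape.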

Fixed

  • Fix regression in Explorer.DataFrame.concat_rows/2.
    It's possible to concat dataframes that are not aligned again.

  • Fix "is_finite" and "is_infinite" from Series to work in the context of an Explorer.Query.

SHA256 of the artifacts

2caba60cb3132e6751bba2879366e5b95551158f344fcd86d3ad39d2ac87a255  explorer-v0.8.3-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
839d89988421790dfc64894ebac830bbdb81b4ae0a9cfb8917935cf767c295cc  explorer-v0.8.3-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
bde5f164e7b46cd30c371c959712507999644a046d41c658649bbeb86077ed3a  explorer-v0.8.3-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
a0a091f6c2171c456f36dd516b03cf789ad028b51b8fb2fa0bdfeed73fce2b8f  explorer-v0.8.3-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
325bdf2b6d13a0aa3366bbf8a02b714610a4625e9b95d0306b66b2f3ac3fa9d6  libexplorer-v0.8.3-nif-2.15-aarch64-apple-darwin.so.tar.gz
0cfe0f315db83686fa1d7d1a276852f6964bda135279822ad946ee619c723ec2  libexplorer-v0.8.3-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
f221d655939a815156881c314d1c1794757dc23afd755eb6144f6b6fea5ee10f  libexplorer-v0.8.3-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
a69917a55aed8b0c0e40b7b3ba92cce5bec818bc1ca4a03b2277921e30f0c48e  libexplorer-v0.8.3-nif-2.15-x86_64-apple-darwin.so.tar.gz
f974fb1e4caa9ee07843e9b2691ade556560c8f1d35291dc73f37249bf6f3477  libexplorer-v0.8.3-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
212422bdceeef98ca7f08648b0bf67520390fc35f924e96ba8f0e667715fd63b  libexplorer-v0.8.3-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
cf3df4dbcc228d1801e5c6c2f258c721d375c22919bbd2690f05e9624d405b55  libexplorer-v0.8.3-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
ff7a17d8a3e6d45f349ecfcb3246664d3aa34681b41e2fce32eb5076a38f0544  libexplorer-v0.8.3-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
fbba507be1059dac16228eee2287906b0311041c7603afa8f7a0a6edb4382fe5  libexplorer-v0.8.3-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

Full Changelog: v0.8.2...v0.8.3

v0.8.2

22 Apr 12:34
990d4e5

Added

  • Add functions to work with strings and regexes.

    Some of the functions have the prefix "re_", because they accept a string that represents a regular expression.

    There is an important detail: we do not accept Elixir regexes, because we cannot guarantee that the backend supports it. Instead we accept a plain string that is "escaped". This means that you can use the ~S sigil to build that string.
    Example: ~S/(a|b)/.

    The added functions are the following:

    • Explorer.Series.split_into/3 - split a string series into a struct of string fields. This function accepts a string as a separator.

    • Explorer.Series.re_contains/2 - check if the string series matches the regex pattern. Like the "non regex" counterpart, it returns a boolean series.

    • Explorer.Series.re_replace/3 - replace all occurrences of a pattern with a replacement in a string series. The replacement can refer to capture groups using ${x}, where x is the group index (starting at 1) or name.

    • Explorer.Series.count_matches/2 - count how many times a substring appears in a string series.

    • Explorer.Series.re_count_matches/2 - count how many times a pattern matches in a string series.

    • Explorer.Series.re_scan/2 - scan for all matches for the given regex pattern.
      This is going to result in a series of lists of strings - {:list, :string}.

    • Explorer.Series.re_named_captures/2 - extract all capture groups as a struct for the given regex pattern. In case the groups are not named, their positions are used as names.

  • Enable the usage of system certificates if the OTP version is 25 or above.

  • Add support for the :streaming option in Explorer.DataFrame.to_csv/3.

  • Support operations with groups in the Lazy Polars backend. This change makes the lazy frame implementation more useful, by supporting the usage of groups in the following functions:

    • Explorer.DataFrame.slice/3

    • Explorer.DataFrame.head/2

    • Explorer.DataFrame.tail/2

    • Explorer.DataFrame.filter_with/2 and the macro version of it, filter/2.

    • Explorer.DataFrame.sort_with/3, although it ignores "maintain order" and "nulls last" options when used with groups.

    • Explorer.DataFrame.mutate_with/2 and its macro version, mutate/2.
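
The regex functions added in this release can be sketched as below, assuming a standalone script that installs Explorer via Mix.install/1. The sample strings and patterns are made up; the patterns are plain strings written with the ~S sigil, as the notes above recommend.

```elixir
Mix.install([{:explorer, "~> 0.10.0"}])

alias Explorer.Series

s = Series.from_list(["apple", "banana", "cherry"])

# Does each string start with "a" or "b"?
matches = Series.re_contains(s, ~S/^(a|b)/)

# Replace every occurrence of "an" with "AN".
replaced = Series.re_replace(s, ~S/an/, "AN")

# Count how many times the pattern "a" matches in each string.
counts = Series.re_count_matches(s, ~S/a/)

IO.inspect(Series.to_list(matches), label: "re_contains")
IO.inspect(Series.to_list(replaced), label: "re_replace")
IO.inspect(Series.to_list(counts), label: "re_count_matches")
```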

Changed

  • We now avoid raising an exception if a nonexistent column is used in Explorer.DataFrame.discard/2.

  • Make the dependency of cacerts optional. This is because people using Erlang/OTP 25 or above can use the certificates provided by the system.
    So you may need to add the dependency of cacerts if your OTP version is older than that.

  • Some precision differences in float operations may appear. This is due to an update in the Polars version to "v0.38.1". Polars is our default backend.

Fixed

  • Fix Explorer.Series.split/2 inside the context of Explorer.Query.

  • Add optional X-Amz-Security-Token header to S3 request. This is needed in case the user is passing down a token for authentication.

  • Fix Explorer.DataFrame.sort_by/3 with groups to respect :nils option.
    This is considering only the eager implementation.

  • Fix inspection of lazy frames in remote nodes.

Pull requests

  • Bump Polars 0.37 by @lkarthee in #861
  • DataFrame.discard/2 - don't raise for non existent column by @lkarthee in #872
  • Add native expression for Series.split/2 by @H12 in #875
  • Bump mio from 0.8.10 to 0.8.11 in /native/explorer by @dependabot in #876
  • Implements Series.split_into/3 by @ryancurtin in #873
  • Update Polars to v0.38 by @philss in #879
  • Add optional x-amz-security-token header to S3 request by @jschniper in #881
  • Rewrite LazyFrame by @philss in #882
  • Update Rustler to v0.32.1 by @philss in #884
  • Fix DF.sort_by/3 with groups to respect :nils option by @philss in #886
  • Update Polars to v0.38.3 by @philss in #887
  • Implements :streaming option for DataFrame.to_csv/3 by @ryancurtin in #889
  • Support operations with groups in the Lazy Polars backend by @philss in #890
  • Bump h2 from 0.3.25 to 0.3.26 in /native/explorer by @dependabot in #891
  • Revert LazyFrame implementation with stack by @philss in #892
  • Refactor eager DF implementation to make use of lazy backend by @philss in #893
  • Add re_contains/2 and re_replace/3 to match with a regex by @philss in #894
  • Add count_matches/2, re_count_matches/2, re_scan/2 and re_named_captures/2 to Series by @philss in #895
  • Add changes to the change log for the upcoming version by @philss in #897
  • Update dependencies by @philss in #899
  • Pass down backend to lazy series and enable re_named_captures/2 usage by @philss in #896
  • Release v0.8.2 by @philss in #900

SHA 256 of Artifacts

fd4d7db73577544d1008827502461fbc82644b44879bf4d50b8c7c2f7a04ad1f  explorer-v0.8.2-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
ba9f6afe86d37e52b7481a29e6011cdc834b2c0196ee6b4235497c4a405fe6e3  explorer-v0.8.2-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
447e3150ebffa1712ed7b6d56e11dd2369126a92bf5466570e4f36ae46f200b9  explorer-v0.8.2-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
2032955e04c6632fd4d6d1015f611b7b90a84a405710d49cce46b7b1e1f52b3d  explorer-v0.8.2-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
9f10c1b25846de37ca2caf271c7728716d5d6783c82e823783ced53cc6a0b4b0  libexplorer-v0.8.2-nif-2.15-aarch64-apple-darwin.so.tar.gz
aceade08eab94230b8f9dc87a5850e5523a7cf7a4222495bf3fa012c4622cd54  libexplorer-v0.8.2-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
0f0341d8a0928554ea2c083a653e599fdd406a0675bfc8e615465e6461726508  libexplorer-v0.8.2-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
63f9ffda8f9dbcacb12a3a522adbde370b78579a23a75183321fb0fa81f0a596  libexplorer-v0.8.2-nif-2.15-x86_64-apple-darwin.so.tar.gz
65793e232a26a91bcfb90867f6392e34228d8f3f23419f500bee47f08c3e8896  libexplorer-v0.8.2-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
3b4d5b1d88cfe416a13e3f0f1fdc87dd54dbf78dff7ff40cb1f64aa7652d5b8a  libexplorer-v0.8.2-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
3afb057bfecdf86199a9dc380be2f44c44f09ac428d806e7983232ba6f15601b  libexplorer-v0.8.2-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
cfab4552f1f3791e38c6b6ce3d5099fbe88f686ec062ac181ef2b35f3432c1e3  libexplorer-v0.8.2-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
7fa4961f08f9278f6b8585d2b4f89d5712ccf222de7a41edb054915d8ec7d50c  libexplorer-v0.8.2-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

Full Changelog: v0.8.1...v0.8.2

v0.8.1

24 Feb 21:29
bfdf07b

Added

  • Add Explorer.Series.field/2 to extract a field from a struct series.
    It returns a new series with the field's dtype.

  • Add Explorer.Series.json_decode/2 that can decode a string series containing valid JSON objects according to dtype.

  • Add eager count/1 and lazy size/1 to Explorer.Series.

  • Add support for maps as expressions inside Explorer.Query. They are "converted" to structs.

  • Add json_path_match/2 to extract a string series from a string containing valid JSON objects.
    See the article JSONPath - XPath for JSON for details about JSON paths.

  • Add Explorer.Series.row_index/1 to retrieve the index of rows starting from 0.

  • Add support for passing the :on column directly (instead of inside a list) in Explorer.DataFrame.join/3.
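
Two of the additions above, field/2 and the bare :on column in join/3, can be sketched together. This assumes a standalone script that installs Explorer via Mix.install/1; all column and field names are illustrative.

```elixir
Mix.install([{:explorer, "~> 0.10.0"}])

alias Explorer.DataFrame, as: DF
alias Explorer.Series

# A list of maps produces a series with a struct dtype.
people = Series.from_list([%{a: 1, b: "x"}, %{a: 2, b: "y"}])

# field/2 extracts one field as a new series with that field's dtype.
a = Series.field(people, "a")

left = DF.new(id: [1, 2], name: ["ann", "bob"])
right = DF.new(id: [1, 2], score: [10, 20])

# The :on column can be given directly instead of inside a list.
joined = DF.join(left, right, on: "id")

IO.inspect(Series.to_list(a), label: "field a")
IO.inspect(DF.names(joined), label: "joined columns")
```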

Changed

  • Remove some deprecated functions from documentation.

  • Change internal representation of the :struct dtype to use list of tuples instead of a map to represent the dtypes of each field. This shouldn't break because we normalise maps to lists when a struct dtype is passed in from_list/2 or cast/2.

  • Update Rustler minimum version to ~> 0.31. Since Rustler is optional, this shouldn't affect most of the users.

Fixed

  • Fix float overflow error so that it returns an argument error instead of crashing the VM.

  • Fix Explorer.DataFrame.print/2 for when the DF contains structs.

Full Changelog: v0.8.0...v0.8.1
Official Changelog: https://hexdocs.pm/explorer/changelog.html

SHA256 of precompiled artifacts

ce4b06cf51f6213b4e1917e52f73c8a09ef57c5cf5e157409122cdd348d00ee3  explorer-v0.8.1-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
b78fb84a8847b17dd857213c9aea69622dff0b6b00233f395e9aaf2e3ee9a923  explorer-v0.8.1-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
481204194b180b5dd4207cc00909f192a3e8f094f08b8be58bbc5e9e058150cd  explorer-v0.8.1-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
f1a77c0f378582e300f17a85fe391eea6bdd673839fdd52d9bc5988906ba6171  explorer-v0.8.1-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
86aec9dd29572a61cd064108b02360161bff3e93e109ec0eb6c3e516cd08a6b4  libexplorer-v0.8.1-nif-2.15-aarch64-apple-darwin.so.tar.gz
a10f4ea3c7c1135b15e4a15186f926eed9c18376d4168442156e7ab9d9678408  libexplorer-v0.8.1-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
ac338d49cc96bdd8646c2e98e4eba877a352d02f84b956055d814dc66b884e1f  libexplorer-v0.8.1-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
ffc4d30c9c6802e5be5429b687f599fa4ca7875178468e880a5a6b2bc7f83663  libexplorer-v0.8.1-nif-2.15-x86_64-apple-darwin.so.tar.gz
e7bd3c239fd11db43f5fa822a5d25ce3c1a4569e33b155cdf341dfc20e5488c1  libexplorer-v0.8.1-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
dba1128914e97a0edca3d0618ef6617a3eed284fbe6ee9d052b95c674e6eac14  libexplorer-v0.8.1-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
90b8960ce6d57b48002a1ddffaab950a746de24f7c45eab7c89d65a075edeb9d  libexplorer-v0.8.1-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
9480ab502d28b7540cf598115c6aadc6a2d1c61eaaa82ffa60fc2e530b0f1e91  libexplorer-v0.8.1-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
c5ffa7c27f6dc44ec31be9eed3d09ff1f3fa9ce4727342af8607160b48d6b686  libexplorer-v0.8.1-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz

v0.8.0

20 Jan 17:43
e0e242b

Added

  • Add explode/2 to Explorer.DataFrame. This function is useful to expand the contents of a {:list, inner_dtype} series into an "inner_dtype" series.

  • Add the new series functions all?/1 and any?/1, to work with boolean series.

  • Add support for the "struct" dtype. This new dtype represents the struct dtype from Polars/Arrow.

  • Add map/2 and map_with/2 to the Explorer.Series module.
    This change enables the usage of the Explorer.Query features in a series.

  • Add sort_by/2 and sort_with/2 to the Explorer.Series module.
    This change enables the usage of the lazy computations and the Explorer.Query module.

  • Add unnest/2 to Explorer.DataFrame. It works by taking the fields of a "struct" (the new dtype) and transforming them into columns.

  • Add pairwise correlation - Explorer.DataFrame.correlation/2 - to calculate the correlation between numeric columns inside a data frame.

  • Add pairwise covariance - Explorer.DataFrame.covariance/2 - to calculate the covariance between numeric columns inside a data frame.

  • Add support for more integer dtypes. This change introduces new signed and unsigned integer dtypes:

    • {:s, 8}, {:s, 16}, {:s, 32}
    • {:u, 8}, {:u, 16}, {:u, 32}, {:u, 64}.

    The existing :integer dtype is now represented as {:s, 64}, and it's still the default dtype for integers. But series and data frames can now work with the new dtypes. Short names for these new dtypes can be used in functions like Explorer.Series.from_list/2. For example, {:u, 32} can be represented with the atom :u32.

    This may bring more interoperability with Nx, and with Arrow related things, like ADBC and Parquet.

  • Add ewm_standard_deviation/2 and ewm_variance/2 to Explorer.Series.
    They calculate the "exponentially weighted moving" variance and standard deviation.

  • Add support for :skip_rows_after_header option for the CSV reader functions.

  • Support {:list, numeric_dtype} for Explorer.Series.frequencies/1.

  • Support pins in cond, inside the context of Explorer.Query.

  • Introduce the :null dtype. This is a special dtype from Polars and Apache Arrow to represent "all null" series.

  • Add Explorer.DataFrame.transpose/2 to transpose a data frame.
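
The new integer dtypes and their short names can be sketched as follows, assuming a standalone script that installs Explorer via Mix.install/1. The values are arbitrary.

```elixir
Mix.install([{:explorer, "~> 0.10.0"}])

alias Explorer.Series

# Without :dtype, integers still default to the signed 64-bit dtype.
default = Series.from_list([1, 2, 3])

# The short name :u32 stands for the {:u, 32} dtype.
unsigned = Series.from_list([1, 2, 3], dtype: :u32)

IO.inspect(Series.dtype(default), label: "default")
IO.inspect(Series.dtype(unsigned), label: ":u32")
```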

Changed

  • Rename the functions related to sorting/arranging of the Explorer.DataFrame.
    Now arrange_with is named sort_with, and arrange is sort_by.

    sort_by/3 is a macro that works through the Explorer.Query module. By contrast, sort_with/2 takes a callback function.

  • Remove unnecessary casts to {:s, 64} now that we support more integer dtypes.
    It affects some functions, like the following in the Explorer.Series module:

    • argsort
    • count
    • rank
    • day_of_week, day_of_year, week_of_year, month, year, hour, minute, second
    • abs
    • clip
    • lengths
    • slice
    • n_distinct
    • frequencies

    And also some functions from the Explorer.DataFrame module:

    • mutate - mostly because of series changes
    • summarise - mostly because of series changes
    • slice

Fixed

  • Fix inspection of series and data frames between nodes.

  • Fix cast of :string series to {:datetime, any()}.

  • Fix mismatched types in Explorer.Series.pow/2, making it more consistent.

  • Normalize sorting options.

  • Fix functions whose reported dtype did not match the result from Polars.
    This fix affects the following functions:

    • quantile/2 in the context of a lazy series
    • mode/1 inside a summarisation
    • strftime/2 in the context of a lazy series
    • mutate_with/2 when creating a column from a NaiveDateTime or Explorer.Duration.

Pull requests

New Contributors

v0.7.2

30 Nov 20:21
e585012

Added

  • Add the functions day_of_year/1 and week_of_year/1 to Explorer.Series.

  • Add filter/2 (a macro) and filter_with/2 to Explorer.Series.

    This change enables the usage of queries, built with Explorer.Query, when
    filtering a series. The main difference is that a series does not have a
    name when used outside a dataframe. So, to refer to the series itself
    inside the query, we can use the special _ variable.

      iex> require Explorer.Series
      iex> s = Explorer.Series.from_list([1, 2, 3])
      iex> Explorer.Series.filter(s, _ > 2)
      #Explorer.Series<
        Polars[1]
        integer [3]
      >
    
  • Add support for the {:list, any()} dtype, where any() can be any other
    valid dtype. This is a recursive dtype that can represent nested lists.
    It's useful for grouping data together in the same series.

  • Add Explorer.Series.mode/1 to get the most common value(s) of the series.

  • Add split/2 and join/2 to the Explorer.Series module.
    These functions are useful to split string series into {:list, :string},
    or to join parts of a {:list, :string} and return a :string series.
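
    A minimal round trip through both functions, assuming Explorer is already a project dependency. The strings and separators are arbitrary.

    ```elixir
    alias Explorer.Series

    s = Series.from_list(["a b", "c d e"])

    # Split each string on a space, producing a list series.
    parts = Series.split(s, " ")

    # Join the parts of each list back into a single string.
    joined = Series.join(parts, "-")

    result = Series.to_list(joined)
    # result is ["a-b", "c-d-e"]
    ```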

  • Expose ddof option for variance, covariance and standard deviation.
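
    As a sketch of what the option changes, the example below computes the variance of the same series with two different "delta degrees of freedom". It assumes Explorer is already a project dependency and that ddof is passed as the second positional argument, with 1 as the default.

    ```elixir
    alias Explorer.Series

    s = Series.from_list([1.0, 2.0, 3.0, 4.0])

    # Default ddof of 1: the sum of squared deviations (5.0) is divided by n - 1.
    sample_var = Series.variance(s)

    # ddof of 0: the same sum is divided by n, giving a smaller value.
    population_var = Series.variance(s, 0)
    ```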

  • Add a new {:f, 32} dtype to represent 32-bit float series.
    It's also possible to use the atom :f32 to create this type of series.
    The atom :f64 can be used as an alias for {:f, 64}, just like the
    :float atom.
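
    The aliases above can be checked with a short sketch, assuming Explorer is already a project dependency. The values are arbitrary.

    ```elixir
    alias Explorer.Series

    f32 = Series.from_list([1.0, 2.5], dtype: :f32)
    f64 = Series.from_list([1.0, 2.5], dtype: :f64)
    float = Series.from_list([1.0, 2.5], dtype: :float)

    dtypes = Enum.map([f32, f64, float], &Series.dtype/1)
    # dtypes is [{:f, 32}, {:f, 64}, {:f, 64}]
    ```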

  • Add lengths/1 and member?/2 to Explorer.Series.
    These functions work with {:list, any()}, where any() is any valid dtype.
    lengths/1 counts the members of each list in a "list" series, and
    member?/2 checks whether a given value is a member of each list.
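
    A small sketch of both functions on a list series, assuming Explorer is already a project dependency. The nested values are made up.

    ```elixir
    alias Explorer.Series

    s = Series.from_list([[1], [1, 2]])

    # Number of elements in each list.
    sizes = Series.lengths(s) |> Series.to_list()
    # sizes is [1, 2]

    # Whether each list contains the value 2.
    has_two = Series.member?(s, 2) |> Series.to_list()
    # has_two is [false, true]
    ```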

  • Add support for streaming parquet files from a lazy dataframe to AWS S3
    compatible services.

Changed

  • Remove restriction on pivot_wider dtypes.
    In the early days, Polars only supported numeric dtypes for the "first"
    aggregation. This is not true anymore, and we can lift this restriction.

  • Change the :float dtype to be represented as {:f, 64}. It's still possible
    to use the atom :float to create float series, but now Explorer.Series.dtype/1
    returns {:f, 64} for 64-bit float series.

Fixed

  • Add missing implementation of Explorer.Series.replace/3 for lazy series.

  • Fix inspection of data frames and series when limit: :infinity is used.

Removed

  • Drop support for the riscv64gc-unknown-linux-gnu target.

    We decided to stop precompiling for this target because it has been hard to maintain.
    Ideally, we will support it again in the future.

Pull requests

New Contributors

Full Changelog: v0.7.1...v0.7.2
Official Changelog: https://github.com/elixir-explorer/explorer/blob/main/CHANGELOG.md

Checksums

The list below contains the SHA256 checksums of the precompiled artifacts.

363e9c8ecd92f2d7ff19cc977ab8fafbed8f5b5f4a9c483d98bb7441b469c5c2  explorer-v0.7.2-nif-2.15-x86_64-pc-windows-gnu--legacy_cpu.dll.tar.gz
abb960b51f56e76d594554c1f7cd082de195a64a04f5795c75ced6126c4b66a5  explorer-v0.7.2-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
d9c5e084f22dc2fc3a4ef808e840c192109e4cd919054f0111f7f1e4f52b97b3  explorer-v0.7.2-nif-2.15-x86_64-pc-windows-msvc--legacy_cpu.dll.tar.gz
96a59887aff5e62b4838fb8d7189ac21b7a39bf6caae11985d8e2daea2d99f15  explorer-v0.7.2-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
d5e384c292fca48941cd5400cf900fa39117f9cc187bfbc1d252d3a72798cd07  libexplorer-v0.7.2-nif-2.15-aarch64-apple-darwin.so.tar.gz
52c4455faec0c12789ecf2fb287f89b4f8350728092fa9de39ca24d1203d3daa  libexplorer-v0.7.2-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
3524ebf3c73246eff8d3fb556786f30d063402cc212a83e3abaceae9f2ff86c5  libexplorer-v0.7.2-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
429ceebb5b7f465c66a6becc1e454ef7383fc58f6cfe081bf71fdd94eb55655b  libexplorer-v0.7.2-nif-2.15-x86_64-apple-darwin.so.tar.gz
029bb5fc6e102449260b706655428cd3ef02b36bd74246375e17897c1a26815d  libexplorer-v0.7.2-nif-2.15-x86_64-unknown-freebsd--legacy_cpu.so.tar.gz
d2075a364a23fc7911b3099716505d8d0df69af1aab944e527c169a96fa2ef58  libexplorer-v0.7.2-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
86b9e9b671c46cb90d0d4bf7b0f2998e59f8715d6f13fde29f8070ad756da648  libexplorer-v0.7.2-nif-2.15-x86_64-unknown-linux-gnu--legacy_cpu.so.tar.gz
fd4e095fafa0055619a49383bd8680abd6639e3d5fc114dd246e0192bcccb5e8  libexplorer-v0.7.2-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
ad3779850c36bf0ff3ca568124da8966cca25ce84912642dc13462cb9ca5a9dd  libexplorer-v0.7.2-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz