L2 distance #439

Merged
eddyxu merged 13 commits into main from lei/distances
Jan 18, 2023

@eddyxu eddyxu commented Jan 18, 2023

Use std::arch, which is available in stable Rust. We could switch to std::simd once it is stabilized.

# On X86_64
RUSTFLAGS="-C target-feature=+avx2,+fma" cargo bench --bench distance
# On M1
cargo bench --bench distance

| CPU | Default | Fallback | Naive Arrow Impl (+AVX2) |
| --- | --- | --- | --- |
| AMD 5900X (X86_64) | 204ms | 1070ms | 1457ms |
| Apple M1 | 1267ms (same as fallback) | 1272ms | 1551ms |

Note that I have not hand-written SIMD for M1 / AArch64 yet; std::arch::aarch64 does not seem to be stable at the moment. Also, no BLAS-based implementation was benchmarked, so that Lance does not take on an extra dependency.
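For readers unfamiliar with std::arch, here is a minimal sketch of the general approach, not the code from this PR. It assumes plain `&[f32]` slices instead of Arrow arrays, and the names `l2`, `l2_scalar`, and `l2_avx2` are hypothetical. On x86_64 it checks for AVX2 and FMA at runtime and otherwise uses a scalar loop, which is also the only path on AArch64.

```rust
/// Scalar reference path: sum of squared differences, then square root.
fn l2_scalar(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Hypothetical AVX2 + FMA kernel: 8 f32 lanes per iteration, scalar tail.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2,fma")]
unsafe fn l2_avx2(a: &[f32], b: &[f32]) -> f32 {
    use std::arch::x86_64::*;

    let chunks = a.len() / 8;
    let mut acc = _mm256_setzero_ps();
    for i in 0..chunks {
        // Unaligned loads of 8 floats from each input.
        let x = _mm256_loadu_ps(a.as_ptr().add(i * 8));
        let y = _mm256_loadu_ps(b.as_ptr().add(i * 8));
        let d = _mm256_sub_ps(x, y);
        // acc = d * d + acc
        acc = _mm256_fmadd_ps(d, d, acc);
    }

    // Horizontal sum of the 8 accumulator lanes.
    let mut lanes = [0.0f32; 8];
    _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    let mut sum: f32 = lanes.iter().sum();

    // Elements left over when the length is not a multiple of 8.
    for i in chunks * 8..a.len() {
        let d = a[i] - b[i];
        sum += d * d;
    }
    sum.sqrt()
}

/// Dispatches to the AVX2 kernel when the CPU supports it.
pub fn l2(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
            // Safe to call: the required CPU features were just detected.
            return unsafe { l2_avx2(a, b) };
        }
    }
    l2_scalar(a, b)
}
```

Runtime detection is one design choice; the benchmark command above instead enables the features at compile time through RUSTFLAGS.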

@eddyxu eddyxu requested a review from changhiskhan January 18, 2023 09:07
@eddyxu eddyxu self-assigned this Jan 18, 2023
@eddyxu eddyxu added the arrow, benchmark, and rust labels Jan 18, 2023

/// Euclidean Distance (L2) from a point to a list of points.
pub fn l2_distance(from: &Float32Array, to: &FixedSizeListArray) -> Result<Arc<Float32Array>> {
assert_eq!(from.len(), to.value_length() as usize);
Contributor

Do these stay in production? These panic, right? Should they return Err instead?

Contributor Author

These are not errors, but bugs. An Err is something users can catch and recover from, which is not the case here.
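To make the two positions in this thread concrete, here is a small sketch of both styles. It is illustrative only: it uses plain slices rather than Arrow arrays, and `DistanceError`, `l2_or_panic`, and `try_l2` are hypothetical names that do not appear in the PR.

```rust
/// Hypothetical error type for the recoverable variant.
#[derive(Debug, PartialEq)]
pub enum DistanceError {
    DimensionMismatch { expected: usize, actual: usize },
}

fn sum_sq_diff(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Treats a dimension mismatch as a caller bug: panics via assert_eq!.
pub fn l2_or_panic(from: &[f32], to: &[f32]) -> f32 {
    assert_eq!(from.len(), to.len(), "dimension mismatch is a caller bug");
    sum_sq_diff(from, to).sqrt()
}

/// Treats a dimension mismatch as a recoverable condition: returns Err.
pub fn try_l2(from: &[f32], to: &[f32]) -> Result<f32, DistanceError> {
    if from.len() != to.len() {
        return Err(DistanceError::DimensionMismatch {
            expected: from.len(),
            actual: to.len(),
        });
    }
    Ok(sum_sq_diff(from, to).sqrt())
}
```

The panicking form fits when the inputs are validated elsewhere and a mismatch can only come from a programming mistake; the Result form fits when the dimensions come from user-supplied data.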

pub fn l2_distance(from: &Float32Array, to: &FixedSizeListArray) -> Result<Arc<Float32Array>> {
assert_eq!(from.len(), to.value_length() as usize);
assert_eq!(to.value_type(), DataType::Float32);
assert_eq!(
Contributor

This seems like an odd constraint, no? Shouldn't there be a non-optimized case that can deal with non-multiples?

Contributor Author

OK, added a fallback to the non-SIMD route for odd-length vectors.
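A rough sketch of what such routing could look like, under several assumptions: the list of points is modeled as a flat `&[f32]` buffer plus a `dimension` instead of a FixedSizeListArray, the lane width of 8 is assumed (the thread does not state it), and the "SIMD route" is stood in for by a chunked loop with 8 independent accumulators rather than real intrinsics. All function names are hypothetical.

```rust
/// Assumed lane width (8 x f32, as with 256-bit registers).
const LANES: usize = 8;

/// Non-SIMD route: works for any vector length.
fn l2_fallback(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Stand-in for the SIMD route: requires the length to be a multiple of LANES.
fn l2_lanes(a: &[f32], b: &[f32]) -> f32 {
    debug_assert_eq!(a.len() % LANES, 0);
    let mut acc = [0.0f32; LANES];
    for (ca, cb) in a.chunks_exact(LANES).zip(b.chunks_exact(LANES)) {
        for i in 0..LANES {
            let d = ca[i] - cb[i];
            acc[i] += d * d;
        }
    }
    acc.iter().sum::<f32>().sqrt()
}

/// L2 distance from one point to each point in a flat buffer.
pub fn l2_distance(from: &[f32], to: &[f32], dimension: usize) -> Vec<f32> {
    assert!(dimension > 0);
    assert_eq!(from.len(), dimension);
    assert_eq!(to.len() % dimension, 0);

    // Pick the route once per call, based on the vector length.
    let kernel: fn(&[f32], &[f32]) -> f32 = if dimension % LANES == 0 {
        l2_lanes
    } else {
        l2_fallback
    };
    to.chunks_exact(dimension).map(|p| kernel(from, p)).collect()
}
```

An alternative design is to run the wide kernel on the largest multiple-of-8 prefix and finish the remainder with the scalar loop, which avoids giving up the fast path entirely for lengths such as 100.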

@eddyxu eddyxu merged commit f7c3838 into main Jan 18, 2023
@eddyxu eddyxu deleted the lei/distances branch January 18, 2023 17:53