Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[coresimd] extracts the no_std components into the coresimd crate #197

Merged
merged 7 commits into from
Nov 22, 2017

Conversation

gnzlbg
Copy link
Contributor

@gnzlbg gnzlbg commented Nov 20, 2017

  • Extracts the no_std components into the coresimd crate, which is unconditionally no_std (most of the library).
  • Re-exports the coresimd crate as stdsimd + std components (e.g. run-time feature detection for ARM)
  • Removes the dependency on the std::os::raw::c_void type by making those intrinsics take a *mut u8 instead.

The crates coresimd and stdsimd currently are not part of the same workspace because I got linker errors. We can fix this either in this PR or a subsequent one. Also, this PR broke Windows tests. PRs to this PR that fix this are welcome.

.gitignore Outdated
/coresimd/target
/stdsimd-test/target
/stdsimd-test/assert-instr-macro/target
/stdsimd-test/simd-test-macro/target
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a convention to ignore all paths containing target? */target/* ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh you can just remove the leading / on the /target line near the top of the file

.travis.yml Outdated
@@ -25,14 +25,15 @@ matrix:
script: |
cargo install clippy
cargo clippy --all -- -D clippy-pedantic
cd coresimd
cargo clippy --all -- -D clippy-pedantic
Copy link
Contributor Author

@gnzlbg gnzlbg Nov 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cargo clippy --allonly works for the crates in a workspace, butcoresimdandstdsimd` are not in the same workspace yet: rust-lang/rust-clippy#2238


cargo fmt --all applies to all crates in the sub-directory, independently of whether they share the same workspace or not. I really don't know what the intended behavior is here, but I like cargo fmt's approach.

.travis.yml Outdated
allow_failures:
- env: RUSTFMT=On TARGET=x86_64-unknown-linux-gnu NO_ADD=1
- env: CLIPPY=On TARGET=x86_64-unknown-linux-gnu NO_ADD=1
install:
- if [ "$NO_ADD" == "" ]; then rustup target add $TARGET; fi

script:
- cargo generate-lockfile
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cargo generate-lockfile only really works if all the crates are in a workspace, so I have added write permissions to the docker runs.

documentation = "https://docs.rs/stdsimd"
homepage = "https://github.com/BurntSushi/stdsimd"
repository = "https://github.com/BurntSushi/stdsimd"
readme = "README.md"
keywords = ["std", "simd", "intrinsics"]
categories = ["hardware-support", "no-std"]
categories = ["hardware-support"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stdsimd always requires std now

@@ -18,6 +18,9 @@ is-it-maintained-issue-resolution = { repository = "BurntSushi/stdsimd" }
is-it-maintained-open-issues = { repository = "BurntSushi/stdsimd" }
maintenance = { status = "experimental" }

[dependencies]
coresimd = { version = "0.0.3", path = "coresimd/" }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've set the first version of coresimd to 0.0.3 to match that of stdsimd, we should release these together.

cargo clean
cargo build --target $target

rustdoc --target $target -o target/doc/$arch src/lib.rs --crate-name stdsimd --library-path target/$target/debug/deps
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now rustdoc needs a build of the coresimd to generate the docs for stdsimd. I haven't taken a detailed look at the docs, and haven't looked at the output of this at all. cargo doc seems to work just fine for stdsimd, wrapping the docs of both crates nicely, although some links on coresimd are still pointing to stdsimd.

ci/run-docker.sh Outdated
@@ -19,7 +19,7 @@ run() {
--env TARGET=$target \
--env FEATURES=$2 \
--env STDSIMD_TEST_EVERYTHING \
--volume `pwd`:/checkout:ro \
--volume `pwd`:/checkout \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not managed to get this to work any other way :(

The checkout directory is not readonly anymore because now building stdsimd requires building coresimd and the lock files are not generated a priori anymore.

ci/run.sh Outdated
cargo test --all --release --target $TARGET --features $FEATURES --verbose -- --nocapture
cd ..
cargo test --all --target $TARGET --features $FEATURES --verbose -- --nocapture
cargo test --all --release --target $TARGET --features $FEATURES --verbose -- --nocapture
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The std crate feature is gone, and so are the two $FEATURES_STD runs.

However, cargo test --all only works for workspaces, so we need to run the tests in coresimd manually...

//! [simd_tracking_issue]: https://github.com/rust-lang/rust/issues/27731
//! [cfg_target_feature_issue]: https://github.com/rust-lang/rust/issues/29717
//! [simd_soundness_bug]: https://github.com/rust-lang/rust/issues/44367
//! [target_feature_impr]: https://github.com/rust-lang/rust/issues/44839
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These docs still need to be updated to reflect coresimd instead of stdsimd.

// x86/x86_64:
any(target_arch = "x86", target_arch = "x86_64")
)]
pub use runtime::{__unstable_detect_feature, __Feature};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In coresimd only x86's run-time is available, stdsimd just re-exports it.

/// Tests the `bit` of `x`.
pub const fn test(x: usize, bit: u32) -> bool {
x & (1 << bit) != 0
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is duplicated in coresimd and stdsimd...

CACHE.store(f(), Ordering::Relaxed);
}
bit::test(CACHE.load(Ordering::Relaxed), bit)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is duplicated in coresimd and stdsimd...

fn test_macros() {
assert!(cfg_feature_enabled!("sse"));
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is duplicated in coresimd and stdsimd...

@@ -424,7 +423,6 @@ pub fn detect_features() -> usize {

#[cfg(test)]
mod tests {
#[cfg(feature = "std")]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std is always available when running coresimd's tests.

@@ -1257,7 +1257,7 @@ pub unsafe fn _mm256_shuffle_epi8(a: u8x32, b: u8x32) -> u8x32 {
/// # #![feature(cfg_target_feature)]
/// # #![feature(target_feature)]
/// #
/// # #[macro_use] extern crate stdsimd;
/// # #[macro_use] extern crate coresimd as stdsimd;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit lazy here because I don't know the best way to fix the examples so that they show coresimd in the docs of coresimd and stdsimd in the docs of stdsimd. So here I just changed as little as possible for now.

@@ -1572,7 +1571,7 @@ pub const _MM_HINT_NTA: i8 = 0;
#[cfg_attr(test, assert_instr(prefetcht1, strategy = _MM_HINT_T1))]
#[cfg_attr(test, assert_instr(prefetcht2, strategy = _MM_HINT_T2))]
#[cfg_attr(test, assert_instr(prefetchnta, strategy = _MM_HINT_NTA))]
pub unsafe fn _mm_prefetch(p: *const c_void, strategy: i8) {
pub unsafe fn _mm_prefetch(p: *const u8, strategy: i8) {
Copy link
Contributor Author

@gnzlbg gnzlbg Nov 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why this intrinsic used c_void, but Intel specifies it as taking a const char*. Anyhow I did not change this to fix the API, but to remove the dependency on c_void...

The intrinsic is also untested, I've filled: #198

@@ -29,7 +28,7 @@ pub unsafe fn _mm_pause() {
#[inline(always)]
#[target_feature = "+sse2"]
#[cfg_attr(test, assert_instr(clflush))]
pub unsafe fn _mm_clflush(p: *mut c_void) {
pub unsafe fn _mm_clflush(p: *mut u8) {
Copy link
Contributor Author

@gnzlbg gnzlbg Nov 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intel specifies that the _mm_clflush intrinsic takes a const void*. I don't know why it takes a *mut c_void, and am slightly uncomfortable with it taking a *mut u8 instead.

The tracking issue to fix this is: rust-lang/rust#36193

I've opened this issue #199 to track the intrinsics in which we break the API because of the dependency on c_void. This should block the stabilization of these intrinsics.

@alexcrichton
Copy link
Member

Looks like some appveyor tests may be failing? (I'd recommend removing --nocapture from CI to get some better output)

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 20, 2017

@alexcrichton done (btw why does --nocapture produce worse output on windows?)

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 20, 2017

So I've removed --nocapture but the output looks the same.

@alexcrichton
Copy link
Member

Oh --nocapture isn't windows specific, it, when passed, just means that the otuput of each test is garbled with all the other outputs, but now that it's not passed the output for each test is in its own delineated section.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 20, 2017 via email

@alexcrichton
Copy link
Member

So FWIW I just added a [workspace] directive to the root Cargo.toml and it worked? I wonder if the link error you got was maybe due to a stale target directory? Or maybe I'm running tests differently?

I'll test on Windows

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 20, 2017 via email

@alexcrichton
Copy link
Member

Aha, I've reproduced the 32-bit failure as well

@alexcrichton
Copy link
Member

Ok I think I've pushed a fix for workspaces, it was I believe just related to disabling tests for the procedural macros

@alexcrichton
Copy link
Member

That seems to have... magically fixed windows?

So I can't actually run the tests locally on Windows due to a faulting instruction. The test is failing on the xsaves test and instruction. The disassembly in visual studio when it faults leads with:

00007FF7EBEF990F  mov         edx,0FFFFFFFFh  
00007FF7EBEF9914  mov         eax,0FFFFFFFFh  
00007FF7EBEF9919  xsaves      [rsp+80h]  

where the faulting instruction is xsaves and RSP is 000000696F3FB580. So I think that's 64-byte aligned? The reported exception is:

Unhandled exception at 0x00007FF7EBEF9919 in coresimd-80b6104084dfdbe5.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF. occurred

I'm not really sure what's going on there... I wonder if maybe xsaves detection isn't working? I apparently have an i7-7700k, although I'm not sure if it actually supports xsaves or not. Do you know what's going on?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 21, 2017

So I think that's 64-byte aligned?

Yes.

I'm not really sure what's going on there... I wonder if maybe xsaves detection isn't working?

I don't think so, we currently only enable xsaves if the CPU supports AVX and XSAVES, and if the OS supports it (OSXSAVE). Also, Appveyor says that xsaves is not enabled, while on your machine it says that it is enabled, so I think run-time detection looks fine.

Do you know what's going on?

This probably means that the xsaves/xsaves64 tests were always broken.

Two things point in this direction:

  • The --nocapture results from a previous Appveyor build show that the test for xsaves returns false on Appveyor, and true on your machine, so these were not tested before on Windows.
  • I disabled xsaves tests on Intel SDE because... I am an idiot, so these are probably not tested on Linux either.

It is weird that you get an error saying that you are trying to read from 0xFFFF.... That looks surprisingly similar to the mask that XSAVES uses, but this mask is not a memory address; it is used to specify which registers should be saved, where 0xFFFF... just means "save all registers".

But IMO the simplest explanation is that they were broken already, and that the bug surfaced only on your machine, so can you comment them out and test if the xsavec tests work?

If so, I would move on with this and I'll try to fix the tests this week (I have some machines with xsaves enabled).

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 21, 2017

@alexcrichton can you post a cat /proc/cpuinfo of that cpu? (not sure how easy it is to do so from windows but maybe you happen to have dual boot?)

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 21, 2017

rebased

@gnzlbg gnzlbg mentioned this pull request Nov 21, 2017
@alexcrichton
Copy link
Member

r=me, but looks like it needs a rebase (feel free to merge after rebasing)

@gnzlbg gnzlbg force-pushed the split branch 2 times, most recently from 965665c to c13fbc4 Compare November 21, 2017 23:03
* Enable a Cargo workspace for the repo
* Disable tests for proc-macro crates
* Move back to mounting source directory read-only
* Refactor test invocation to only test one crate with `--all`
@gnzlbg gnzlbg merged commit f3ee983 into rust-lang:master Nov 22, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants