-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[R][C++] Homebrew nightly test failing with illegal opcode #36685
Comments
I think the ticket with the most thorough investigation of these issues is #31132 I will try and remember / summarize the discussion at the time. The crux of the matter is our parquet simd dispatch arguably violates the ODR. In general, we get away with this. Multiple copies of the symbol are linked in and the appropriate one is chosen via tag dispatch. In homebrew, the MinSizeRel build type is used. For whatever reason, we don't get away with our ODR violation in this case. The compiler / linker collapse all the copies with the same name into one implementation and that implementation has AVX2 instructions. At the time, we discussed several more thorough solutions but none of them were taken due to a fear of performance regression and, probably, a lack of energy at the time for solving this particular problem. We ended with #14342 which overrides MinSizeRel for the files that have multiple symbols. This seemed to work ok. Note that 474b1a5 seems to have modified the same area that #14342 did. It's possible we ended up reverting our change accidentally. |
Why not disable runtime SIMD dispatch by passing |
### Rationale for this change Summary of this problem: #31132 (comment) Why this problem is happen again? Because I removed `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` in #36583. The solution we chose by #14342 was forcing to use `-O2` for SIMD related code. It works for `-DCMAKE_BUILD_TYPE=MinSizeRel` but it doesn't work for Homebrew. Because Homebrew's CC https://github.com/Homebrew/brew/blob/master/Library/Homebrew/shims/super/cc forces to use the same `-O` flag. The default is `-Os`. If we specify `-O2`, Homebrew's CC replaces it to `-Os`. If we use `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"`, Homebrew's CC always use `-O2`. So the solution we chose by #14342 isn't used for Homebrew. But Homebrew thinks that `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` is a workaround. So we need another solution for Homebrew. Here are candidate solutions: 1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE` 2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?` "1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" works because we don't use the runtime SIMD dispatch feature (the problematic feature) entirely. "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" works but I don't know why... If `ENV.runtime_cpu_detection` is called, Homebrew's CC stops replacing `-march=*`. If we call `ENV.runtime_cpu_detection`, `-march=haswell` is used for AVX2 related code and `-march=skylake-avx512` is used for AVX512 including BMI2 related code. If we don't call `ENV.runtime_cpu_detection`, `-march=nehalem` is always used. (Note that SIMD related flags such as `-mbmi2` aren't removed by Homebrew's CC. So I think that SIMD is enabled.) I don't know why but "the one-definition-rule violation" (see the summary for details: #31132 (comment) ) isn't happen. FYI: CPU info for GitHub Actions macOS hosted-runner: ```console $ sysctl hw.optional machdep.cpu hw.optional.adx: 0 hw.optional.aes: 1 hw.optional.avx1_0: 1 hw.optional.avx2_0: 0 hw.optional.avx512bw: 0 hw.optional.avx512cd: 0 hw.optional.avx512dq: 0 hw.optional.avx512f: 0 hw.optional.avx512ifma: 0 hw.optional.avx512vbmi: 0 hw.optional.avx512vl: 0 hw.optional.bmi1: 0 hw.optional.bmi2: 0 hw.optional.enfstrg: 0 hw.optional.f16c: 1 hw.optional.floatingpoint: 1 hw.optional.fma: 0 hw.optional.hle: 0 hw.optional.mmx: 1 hw.optional.mpx: 0 hw.optional.rdrand: 1 hw.optional.rtm: 0 hw.optional.sgx: 0 hw.optional.sse: 1 hw.optional.sse2: 1 hw.optional.sse3: 1 hw.optional.sse4_1: 1 hw.optional.sse4_2: 1 hw.optional.supplementalsse3: 1 hw.optional.x86_64: 1 machdep.cpu.address_bits.physical: 43 machdep.cpu.address_bits.virtual: 48 machdep.cpu.arch_perf.events: 127 machdep.cpu.arch_perf.events_number: 7 machdep.cpu.arch_perf.fixed_number: 0 machdep.cpu.arch_perf.fixed_width: 0 machdep.cpu.arch_perf.number: 4 machdep.cpu.arch_perf.version: 1 machdep.cpu.arch_perf.width: 48 machdep.cpu.cache.L2_associativity: 8 machdep.cpu.cache.linesize: 64 machdep.cpu.cache.size: 256 machdep.cpu.mwait.extensions: 3 machdep.cpu.mwait.linesize_max: 4096 machdep.cpu.mwait.linesize_min: 64 machdep.cpu.mwait.sub_Cstates: 16 machdep.cpu.thermal.ACNT_MCNT: 0 machdep.cpu.thermal.core_power_limits: 0 machdep.cpu.thermal.dynamic_acceleration: 0 machdep.cpu.thermal.energy_policy: 0 machdep.cpu.thermal.fine_grain_clock_mod: 0 machdep.cpu.thermal.hardware_feedback: 0 machdep.cpu.thermal.invariant_APIC_timer: 1 machdep.cpu.thermal.package_thermal_intr: 0 machdep.cpu.thermal.sensor: 0 machdep.cpu.thermal.thresholds: 0 machdep.cpu.tlb.data.small: 64 machdep.cpu.tlb.inst.large: 8 machdep.cpu.tlb.inst.small: 64 machdep.cpu.tlb.shared: 512 machdep.cpu.tsc_ccc.denominator: 0 machdep.cpu.tsc_ccc.numerator: 0 machdep.cpu.xsave.extended_state: 7 832 832 0 machdep.cpu.xsave.extended_state1: 0 0 0 0 machdep.cpu.brand: 0 machdep.cpu.brand_string: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz machdep.cpu.core_count: 3 machdep.cpu.cores_per_package: 4 machdep.cpu.extfamily: 0 machdep.cpu.extfeature_bits: 4967106816 machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI machdep.cpu.extmodel: 3 machdep.cpu.family: 6 machdep.cpu.feature_bits: 18427078393948011519 machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH MMX FXSR SSE SSE2 SS HTT SSE3 PCLMULQDQ MON VMX SSSE3 CX16 SSE4.1 SSE4.2 x2APIC POPCNT AES VMM PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C machdep.cpu.leaf7_feature_bits: 643 0 machdep.cpu.leaf7_feature_bits_edx: 3154117632 machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SMEP ERMS MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD machdep.cpu.logical_per_package: 4 machdep.cpu.max_basic: 13 machdep.cpu.max_ext: 2147483656 machdep.cpu.microcode_version: 1070 machdep.cpu.model: 58 machdep.cpu.processor_flag: 0 machdep.cpu.signature: 198313 machdep.cpu.stepping: 9 machdep.cpu.thread_count: 3 machdep.cpu.vendor: GenuineIntel ``` ### What changes are included in this PR? "1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" because it's straightforward and "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" may also disable runtime SIMD dispatch implicitly. This also adds the following debug information for easy to debug in future: * CPU information for GitHub Actions runner * Homebrew's build logs ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * Closes: #36685 Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
…ache#36705) ### Rationale for this change Summary of this problem: apache#31132 (comment) Why this problem is happen again? Because I removed `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` in apache#36583. The solution we chose by apache#14342 was forcing to use `-O2` for SIMD related code. It works for `-DCMAKE_BUILD_TYPE=MinSizeRel` but it doesn't work for Homebrew. Because Homebrew's CC https://github.com/Homebrew/brew/blob/master/Library/Homebrew/shims/super/cc forces to use the same `-O` flag. The default is `-Os`. If we specify `-O2`, Homebrew's CC replaces it to `-Os`. If we use `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"`, Homebrew's CC always use `-O2`. So the solution we chose by apache#14342 isn't used for Homebrew. But Homebrew thinks that `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` is a workaround. So we need another solution for Homebrew. Here are candidate solutions: 1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE` 2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?` "1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" works because we don't use the runtime SIMD dispatch feature (the problematic feature) entirely. "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" works but I don't know why... If `ENV.runtime_cpu_detection` is called, Homebrew's CC stops replacing `-march=*`. If we call `ENV.runtime_cpu_detection`, `-march=haswell` is used for AVX2 related code and `-march=skylake-avx512` is used for AVX512 including BMI2 related code. If we don't call `ENV.runtime_cpu_detection`, `-march=nehalem` is always used. (Note that SIMD related flags such as `-mbmi2` aren't removed by Homebrew's CC. So I think that SIMD is enabled.) I don't know why but "the one-definition-rule violation" (see the summary for details: apache#31132 (comment) ) isn't happen. FYI: CPU info for GitHub Actions macOS hosted-runner: ```console $ sysctl hw.optional machdep.cpu hw.optional.adx: 0 hw.optional.aes: 1 hw.optional.avx1_0: 1 hw.optional.avx2_0: 0 hw.optional.avx512bw: 0 hw.optional.avx512cd: 0 hw.optional.avx512dq: 0 hw.optional.avx512f: 0 hw.optional.avx512ifma: 0 hw.optional.avx512vbmi: 0 hw.optional.avx512vl: 0 hw.optional.bmi1: 0 hw.optional.bmi2: 0 hw.optional.enfstrg: 0 hw.optional.f16c: 1 hw.optional.floatingpoint: 1 hw.optional.fma: 0 hw.optional.hle: 0 hw.optional.mmx: 1 hw.optional.mpx: 0 hw.optional.rdrand: 1 hw.optional.rtm: 0 hw.optional.sgx: 0 hw.optional.sse: 1 hw.optional.sse2: 1 hw.optional.sse3: 1 hw.optional.sse4_1: 1 hw.optional.sse4_2: 1 hw.optional.supplementalsse3: 1 hw.optional.x86_64: 1 machdep.cpu.address_bits.physical: 43 machdep.cpu.address_bits.virtual: 48 machdep.cpu.arch_perf.events: 127 machdep.cpu.arch_perf.events_number: 7 machdep.cpu.arch_perf.fixed_number: 0 machdep.cpu.arch_perf.fixed_width: 0 machdep.cpu.arch_perf.number: 4 machdep.cpu.arch_perf.version: 1 machdep.cpu.arch_perf.width: 48 machdep.cpu.cache.L2_associativity: 8 machdep.cpu.cache.linesize: 64 machdep.cpu.cache.size: 256 machdep.cpu.mwait.extensions: 3 machdep.cpu.mwait.linesize_max: 4096 machdep.cpu.mwait.linesize_min: 64 machdep.cpu.mwait.sub_Cstates: 16 machdep.cpu.thermal.ACNT_MCNT: 0 machdep.cpu.thermal.core_power_limits: 0 machdep.cpu.thermal.dynamic_acceleration: 0 machdep.cpu.thermal.energy_policy: 0 machdep.cpu.thermal.fine_grain_clock_mod: 0 machdep.cpu.thermal.hardware_feedback: 0 machdep.cpu.thermal.invariant_APIC_timer: 1 machdep.cpu.thermal.package_thermal_intr: 0 machdep.cpu.thermal.sensor: 0 machdep.cpu.thermal.thresholds: 0 machdep.cpu.tlb.data.small: 64 machdep.cpu.tlb.inst.large: 8 machdep.cpu.tlb.inst.small: 64 machdep.cpu.tlb.shared: 512 machdep.cpu.tsc_ccc.denominator: 0 machdep.cpu.tsc_ccc.numerator: 0 machdep.cpu.xsave.extended_state: 7 832 832 0 machdep.cpu.xsave.extended_state1: 0 0 0 0 machdep.cpu.brand: 0 machdep.cpu.brand_string: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz machdep.cpu.core_count: 3 machdep.cpu.cores_per_package: 4 machdep.cpu.extfamily: 0 machdep.cpu.extfeature_bits: 4967106816 machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI machdep.cpu.extmodel: 3 machdep.cpu.family: 6 machdep.cpu.feature_bits: 18427078393948011519 machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH MMX FXSR SSE SSE2 SS HTT SSE3 PCLMULQDQ MON VMX SSSE3 CX16 SSE4.1 SSE4.2 x2APIC POPCNT AES VMM PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C machdep.cpu.leaf7_feature_bits: 643 0 machdep.cpu.leaf7_feature_bits_edx: 3154117632 machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SMEP ERMS MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD machdep.cpu.logical_per_package: 4 machdep.cpu.max_basic: 13 machdep.cpu.max_ext: 2147483656 machdep.cpu.microcode_version: 1070 machdep.cpu.model: 58 machdep.cpu.processor_flag: 0 machdep.cpu.signature: 198313 machdep.cpu.stepping: 9 machdep.cpu.thread_count: 3 machdep.cpu.vendor: GenuineIntel ``` ### What changes are included in this PR? "1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" because it's straightforward and "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" may also disable runtime SIMD dispatch implicitly. This also adds the following debug information for easy to debug in future: * CPU information for GitHub Actions runner * Homebrew's build logs ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * Closes: apache#36685 Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
…ache#36705) ### Rationale for this change Summary of this problem: apache#31132 (comment) Why this problem is happen again? Because I removed `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` in apache#36583. The solution we chose by apache#14342 was forcing to use `-O2` for SIMD related code. It works for `-DCMAKE_BUILD_TYPE=MinSizeRel` but it doesn't work for Homebrew. Because Homebrew's CC https://github.com/Homebrew/brew/blob/master/Library/Homebrew/shims/super/cc forces to use the same `-O` flag. The default is `-Os`. If we specify `-O2`, Homebrew's CC replaces it to `-Os`. If we use `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"`, Homebrew's CC always use `-O2`. So the solution we chose by apache#14342 isn't used for Homebrew. But Homebrew thinks that `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` is a workaround. So we need another solution for Homebrew. Here are candidate solutions: 1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE` 2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?` "1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" works because we don't use the runtime SIMD dispatch feature (the problematic feature) entirely. "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" works but I don't know why... If `ENV.runtime_cpu_detection` is called, Homebrew's CC stops replacing `-march=*`. If we call `ENV.runtime_cpu_detection`, `-march=haswell` is used for AVX2 related code and `-march=skylake-avx512` is used for AVX512 including BMI2 related code. If we don't call `ENV.runtime_cpu_detection`, `-march=nehalem` is always used. (Note that SIMD related flags such as `-mbmi2` aren't removed by Homebrew's CC. So I think that SIMD is enabled.) I don't know why but "the one-definition-rule violation" (see the summary for details: apache#31132 (comment) ) isn't happen. FYI: CPU info for GitHub Actions macOS hosted-runner: ```console $ sysctl hw.optional machdep.cpu hw.optional.adx: 0 hw.optional.aes: 1 hw.optional.avx1_0: 1 hw.optional.avx2_0: 0 hw.optional.avx512bw: 0 hw.optional.avx512cd: 0 hw.optional.avx512dq: 0 hw.optional.avx512f: 0 hw.optional.avx512ifma: 0 hw.optional.avx512vbmi: 0 hw.optional.avx512vl: 0 hw.optional.bmi1: 0 hw.optional.bmi2: 0 hw.optional.enfstrg: 0 hw.optional.f16c: 1 hw.optional.floatingpoint: 1 hw.optional.fma: 0 hw.optional.hle: 0 hw.optional.mmx: 1 hw.optional.mpx: 0 hw.optional.rdrand: 1 hw.optional.rtm: 0 hw.optional.sgx: 0 hw.optional.sse: 1 hw.optional.sse2: 1 hw.optional.sse3: 1 hw.optional.sse4_1: 1 hw.optional.sse4_2: 1 hw.optional.supplementalsse3: 1 hw.optional.x86_64: 1 machdep.cpu.address_bits.physical: 43 machdep.cpu.address_bits.virtual: 48 machdep.cpu.arch_perf.events: 127 machdep.cpu.arch_perf.events_number: 7 machdep.cpu.arch_perf.fixed_number: 0 machdep.cpu.arch_perf.fixed_width: 0 machdep.cpu.arch_perf.number: 4 machdep.cpu.arch_perf.version: 1 machdep.cpu.arch_perf.width: 48 machdep.cpu.cache.L2_associativity: 8 machdep.cpu.cache.linesize: 64 machdep.cpu.cache.size: 256 machdep.cpu.mwait.extensions: 3 machdep.cpu.mwait.linesize_max: 4096 machdep.cpu.mwait.linesize_min: 64 machdep.cpu.mwait.sub_Cstates: 16 machdep.cpu.thermal.ACNT_MCNT: 0 machdep.cpu.thermal.core_power_limits: 0 machdep.cpu.thermal.dynamic_acceleration: 0 machdep.cpu.thermal.energy_policy: 0 machdep.cpu.thermal.fine_grain_clock_mod: 0 machdep.cpu.thermal.hardware_feedback: 0 machdep.cpu.thermal.invariant_APIC_timer: 1 machdep.cpu.thermal.package_thermal_intr: 0 machdep.cpu.thermal.sensor: 0 machdep.cpu.thermal.thresholds: 0 machdep.cpu.tlb.data.small: 64 machdep.cpu.tlb.inst.large: 8 machdep.cpu.tlb.inst.small: 64 machdep.cpu.tlb.shared: 512 machdep.cpu.tsc_ccc.denominator: 0 machdep.cpu.tsc_ccc.numerator: 0 machdep.cpu.xsave.extended_state: 7 832 832 0 machdep.cpu.xsave.extended_state1: 0 0 0 0 machdep.cpu.brand: 0 machdep.cpu.brand_string: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz machdep.cpu.core_count: 3 machdep.cpu.cores_per_package: 4 machdep.cpu.extfamily: 0 machdep.cpu.extfeature_bits: 4967106816 machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI machdep.cpu.extmodel: 3 machdep.cpu.family: 6 machdep.cpu.feature_bits: 18427078393948011519 machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH MMX FXSR SSE SSE2 SS HTT SSE3 PCLMULQDQ MON VMX SSSE3 CX16 SSE4.1 SSE4.2 x2APIC POPCNT AES VMM PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C machdep.cpu.leaf7_feature_bits: 643 0 machdep.cpu.leaf7_feature_bits_edx: 3154117632 machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SMEP ERMS MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD machdep.cpu.logical_per_package: 4 machdep.cpu.max_basic: 13 machdep.cpu.max_ext: 2147483656 machdep.cpu.microcode_version: 1070 machdep.cpu.model: 58 machdep.cpu.processor_flag: 0 machdep.cpu.signature: 198313 machdep.cpu.stepping: 9 machdep.cpu.thread_count: 3 machdep.cpu.vendor: GenuineIntel ``` ### What changes are included in this PR? "1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" because it's straightforward and "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" may also disable runtime SIMD dispatch implicitly. This also adds the following debug information for easy to debug in future: * CPU information for GitHub Actions runner * Homebrew's build logs ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * Closes: apache#36685 Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
Describe the bug, including details regarding any error messages, version, and platform.
The homebrew-r-brew job is failing with
address 0x1060b0219, cause 'illegal opcode'
. This seems to be a recurring problem with using Homebrew and has come up a number of times (#33807, #28343, #14826 ). I suspect - but don't know - that this could be related to an M1 Mac running an x86 binary.I don't believe this is an R-specific problem; rather, I think that the homebrew-r-brew nightly is one of the more comprehensive tests of that particular C++ package. For example, this was also exposed by GDAL's tests on Homebrew. If it is an R-specific problem then I would suggest that the solution is to remove the nightly test: linking to homebrew Arrow is difficult to achieve in practice and isn't something I feel compelled to support.
Link to failure: https://github.com/ursacomputing/crossbow/actions/runs/5540897821/jobs/10113586586#step:8:20367
Output:
Component(s)
C++, R
The text was updated successfully, but these errors were encountered: