Fix segfault in release build with GCC 5. (#419)
* Use GCC 6 and GCC 7 in GitHub Actions to prevent the segfault.

GCC 5 causes a segfault in the Release build.

* Switch to GCC 5.

Disable inline optimization of `CudaStreamOverride`.

* Move `Push` and `Pop` of `CudaStreamOverride` to context.cu.

This prevents the compiler from inlining them.
csukuangfj authored Nov 27, 2020
1 parent 57a3bc6 commit d376902
Showing 7 changed files with 52 additions and 29 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/build.yml
@@ -24,7 +24,7 @@ jobs:
 matrix:
 os: [ubuntu-16.04, ubuntu-18.04]
 cuda: ["10.1", "10.2"]
-gcc: ["5", "6"]
+gcc: ["5"]
 torch: ["1.6.0", "1.7.0"]
 python-version: [3.6, 3.7, 3.8]
36 changes: 23 additions & 13 deletions .github/workflows/run-tests.yml
@@ -20,15 +20,11 @@ jobs:
 strategy:
 matrix:
 os: [ubuntu-18.04]
-cuda: ["10.1", "10.2"]
+cuda: ["10.1"]
 gcc: ["5"]
-torch: ["1.6.0", "1.7.0"]
+torch: ["1.6.0"]
 python-version: [3.6]
-# build_type: ["Release", "Debug"]
-#
-# disable release build for github actions since it results
-# in segfault which CANNOT be reproduced locally.
-build_type: ["Debug"]
+build_type: ["Release", "Debug"]

 steps:
 # refer to https://github.com/actions/checkout
@@ -56,6 +52,10 @@ jobs:
 echo "CXX=/usr/bin/g++-${{ matrix.gcc }}" >> $GITHUB_ENV
 echo "CUDAHOSTCXX=/usr/bin/g++-${{ matrix.gcc }}" >> $GITHUB_ENV
+- name: Display GCC version
+run: |
+gcc --version
 - name: Setup Python ${{ matrix.python-version }}
 uses: actions/setup-python@v2
 with:
@@ -70,11 +70,10 @@ jobs:
 torch: ${{ matrix.torch }}
 shell: bash
 run: |
-python3 -m pip install --upgrade pip
-python3 -m pip install wheel twine
-python3 -m pip install bs4 requests tqdm
-python3 -m pip install dataclasses graphviz
-sudo apt-get install python-pydot python-pydot-ng graphviz
+python3 -m pip install -qq --upgrade pip six
+python3 -m pip install -qq bs4 requests tqdm
+python3 -m pip install -qq dataclasses graphviz
+sudo apt-get -qq install graphviz
 ./scripts/github_actions/install_torch.sh
 python3 -c "import torch; print('torch version:', torch.__version__)"
@@ -94,12 +93,23 @@ jobs:
 cmake -DCMAKE_BUILD_TYPE=${{ matrix.build_type }} ..
 cat k2/csrc/version.h
-- name: Build and Run Tests
+- name: ${{ matrix.build_type }} Build
 shell: bash
 run: |
 echo "number of cores: $(nproc)"
 cd build
 # we cannot use -j here because of limited RAM
 # of the VM provided by GitHub actions
 make
+- name: Display Build Information
+shell: bash
+run: |
+export PYTHONPATH=$PWD/k2/python:$PWD/build/lib:$PYTHONPATH
+python3 -m k2.version
+- name: Run Tests
+shell: bash
+run: |
+cd build
 ctest --output-on-failure
4 changes: 2 additions & 2 deletions .github/workflows/style_check.yml
@@ -14,10 +14,10 @@ on:

 jobs:
 build:
-runs-on: ubuntu-latest
+runs-on: ubuntu-18.04
 strategy:
 matrix:
-python-version: [3.7, 3.8]
+python-version: [3.7]

 steps:
 - uses: actions/checkout@v2
7 changes: 5 additions & 2 deletions INSTALL.md
@@ -41,7 +41,7 @@ you still need to install PyTorch with CUDA support.

 There are two ways to install k2 from pre-built wheel packages.

-### (1) From PyPI using `pip install k2`
+### (1) From PyPI using `pip install --pre k2`

 The wheel packages on PyPI are built using torch==1.6.0+cu101 on Ubuntu 18.04.
 If you are using other Linux systems, the pre-built wheel packages may NOT
@@ -150,10 +150,13 @@ make -j
 ctest --parallel <JOBNUM>
 ```

-If Valgrind is installed, you can check heap corruptions and memory leaks by
+If `valgrind` is installed, you can check heap corruptions and memory leaks by

 ```bash
 cd build
 make -j
 ctest -R <TESTNAME> -D ExperimentalMemCheck
 ```

+**HINT**: You can install `valgrind` with `sudo apt-get install valgrind`
+on Ubuntu.
11 changes: 11 additions & 0 deletions k2/csrc/context.cu
@@ -15,6 +15,17 @@

 namespace k2 {

+void CudaStreamOverride::Push(cudaStream_t stream) {
+  stack_.push_back(stream);
+  stream_override_ = stream;
+}

+void CudaStreamOverride::Pop(cudaStream_t stream) {
+  K2_DCHECK(!stack_.empty());
+  K2_DCHECK_EQ(stack_.back(), stream);
+  stack_.pop_back();
+}

 RegionPtr NewRegion(ContextPtr context, std::size_t num_bytes) {
   // .. fairly straightforward. Sets bytes_used to num_bytes, caller can
   // overwrite if needed.
14 changes: 5 additions & 9 deletions k2/csrc/context.h
@@ -374,18 +374,14 @@ class CudaStreamOverride {
     else
       return stream;
   }
-  void Push(cudaStream_t stream) {
-    stack_.push_back(stream);
-    stream_override_ = stream;
-  }
-  void Pop(cudaStream_t stream) {
-    K2_DCHECK(!stack_.empty());
-    K2_DCHECK_EQ(stack_.back(), stream);
-    stack_.pop_back();
-  }

+  void Push(cudaStream_t stream);

+  void Pop(cudaStream_t stream);

   CudaStreamOverride() : stream_override_(0x0) {}

  private:
   cudaStream_t stream_override_;
   std::vector<cudaStream_t> stack_;
 };
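The change above moves `Push` and `Pop` out of the class body. As a rough, self-contained sketch of that pattern (not the k2 code itself): member functions defined inside a class body are implicitly inline, while functions that are only declared there and defined in a separate source file are not, so callers in other translation units generally have to emit real calls unless link-time optimization is enabled. `FakeStream`, `StreamOverrideSketch`, and `DemoDepth` are hypothetical names used only for illustration, and plain `assert` stands in for `K2_DCHECK`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for cudaStream_t so the sketch compiles without the CUDA toolkit
// (assumption: the real type is an opaque, pointer-like handle).
using FakeStream = void *;

// --- what would live in the header (e.g. context.h) ---
class StreamOverrideSketch {
 public:
  StreamOverrideSketch() : stream_override_(nullptr) {}

  // Declarations only. Without a body in the class, these functions are
  // not implicitly inline.
  void Push(FakeStream stream);
  void Pop(FakeStream stream);

  std::size_t Depth() const { return stack_.size(); }
  FakeStream Current() const { return stream_override_; }

 private:
  FakeStream stream_override_;
  std::vector<FakeStream> stack_;
};

// --- what would live in the source file (e.g. context.cu) ---
void StreamOverrideSketch::Push(FakeStream stream) {
  stack_.push_back(stream);
  stream_override_ = stream;
}

void StreamOverrideSketch::Pop(FakeStream stream) {
  // Debug-only checks, mirroring the K2_DCHECK calls in the diff.
  assert(!stack_.empty());
  assert(stack_.back() == stream);
  (void)stream;  // avoids an unused-parameter warning when NDEBUG is set
  stack_.pop_back();
}

// Small driver: pushes two streams, pops them in reverse order, and
// returns the stack depth observed while both were pushed.
std::size_t DemoDepth() {
  int a = 0, b = 0;
  StreamOverrideSketch s;
  s.Push(&a);
  s.Push(&b);
  std::size_t depth = s.Depth();
  s.Pop(&b);
  s.Pop(&a);
  return depth;
}
```

In a single file like this the compiler is still free to inline the calls; the effect described in the commit message depends on the definitions living in a different translation unit from the callers.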
7 changes: 5 additions & 2 deletions scripts/github_actions/install_cuda.sh
@@ -10,8 +10,11 @@ case "$cuda" in
 filename=cuda_10.0.130_410.48_linux
 ;;
 10.1)
-url=https://developer.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.105_418.39_linux.run
-filename=cuda_10.1.105_418.39_linux.run
+# WARNING: there are bugs in
+# https://developer.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.105_418.39_linux.run
+# with GCC 7. Please use the following version
+url=http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.243_418.87.00_linux.run
+filename=cuda_10.1.243_418.87.00_linux.run
 ;;
 10.2)
 url=http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
