Replace ccache with sccache #10146

Merged
merged 15 commits into from
Feb 3, 2022

Conversation

ajschmidt8
Member

This PR replaces `ccache` with `sccache`.
@ajschmidt8 ajschmidt8 added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jan 27, 2022
@AyodeAwe
Contributor

rerun tests

1 similar comment
@ajschmidt8
Member Author

rerun tests

@codecov

codecov bot commented Jan 27, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.04@57ac8c4). Click here to learn what that means.
The diff coverage is n/a.


@@               Coverage Diff               @@
##             branch-22.04   #10146   +/-   ##
===============================================
  Coverage                ?   10.48%           
===============================================
  Files                   ?      122           
  Lines                   ?    20496           
  Branches                ?        0           
===============================================
  Hits                    ?     2148           
  Misses                  ?    18348           
  Partials                ?        0           


@ajschmidt8
Member Author

pre-cache C++ build time (link to log): 1:34:39.5


@ajschmidt8
Member Author

rerun tests

@davidwendt
Contributor

davidwendt commented Jan 28, 2022

Is there any way to get the ccache hit results for the build? We report those in the Build Metrics Report.
The report values are generated here:

cudf/build.sh

Lines 187 to 194 in b7aa47f

# get the current count before the compile starts
FILES_IN_CCACHE=""
if [[ "$BUILD_REPORT_INCL_CACHE_STATS" == "ON" && -x "$(command -v ccache)" ]]; then
    FILES_IN_CCACHE=$(ccache -s | grep "files in cache")
    echo "$FILES_IN_CCACHE"
    # zero the ccache statistics
    ccache -z
fi

These can be helpful in understanding the results.

@ajschmidt8
Member Author

> Is there any way to get the ccache hit results for the build? We report those in the Build Metrics Report. The report values are generated here:
>
> These can be helpful in understanding the results.

@davidwendt, I had this on my list of things to investigate 🙂. You beat me to it. sccache does have a --show-stats (-s) flag as well. The output is a little different from that of ccache -s. I've posted the output for each below.

ccache -s
cache directory                     /root/.cache/ccache
primary config                      /root/.config/ccache/ccache.conf
secondary config (readonly)         /etc/opt/conda/ccache.conf
cache hit (direct)                     0
cache hit (preprocessed)               0
cache miss                             0
cache hit rate                      0.00 %
cleanups performed                     0
files in cache                         0
cache size                           0.0 kB
max cache size                       5.0 GB
sccache -s
Compile requests                      0
Compile requests executed             0
Cache hits                            0
Cache misses                          0
Cache timeouts                        0
Cache read errors                     0
Forced recaches                       0
Cache write errors                    0
Compilation failures                  0
Cache errors                          0
Non-cacheable compilations            0
Non-cacheable calls                   0
Non-compilation calls                 0
Unsupported compiler calls            0
Average cache write               0.000 s
Average cache read miss           0.000 s
Average cache read hit            0.000 s
Failed distributed compilations       0
Cache location                  S3, bucket: Bucket(name=rapids-sccache, base_url=http://rapids-sccache.s3-us-west-2.amazonaws.com/)

There is also a --zero-stats (-z) flag:

All sccache commands
# sccache
sccache: No command specified
sccache 0.2.15

USAGE:
    sccache [FLAGS] [OPTIONS] [cmd]...

FLAGS:
        --dist-auth       authenticate for distributed compilation
        --dist-status     show status of the distributed client
    -h, --help            Prints help information
    -s, --show-stats      show cache statistics
        --start-server    start background server
        --stop-server     stop background server
    -V, --version         Prints version information
    -z, --zero-stats      zero statistics counters

OPTIONS:
        --package-toolchain <executable> <out>    package toolchain for distributed compilation
        --stats-format <stats-format>
            set output format of statistics [default: text]  [possible values: text, json]


ARGS:
    <cmd>...    

Enabled features:
    S3:        true
    Redis:     true
    Memcached: true
    GCS:       true
    Azure:     true

So it seems that we should be able to get the same information from sccache by just updating the commands to use sccache instead of ccache and changing the grep arguments. Does that sound accurate?
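
For illustration, here is a minimal sketch of how the `build.sh` block quoted earlier might be adapted. It assumes only the `--show-stats` and `--zero-stats` flags listed in the help output above. The function name `snapshot_and_zero_stats` and its optional tool argument are hypothetical, added so the logic can be exercised without sccache installed. This is not necessarily the change that was committed.

```shell
# Sketch: print the current cache statistics, then zero them so that the
# post-build numbers describe only this build.
# $1 is the cache tool to call (defaults to sccache). This parameter is an
# illustrative assumption and is not part of cudf's build.sh.
snapshot_and_zero_stats() {
    tool="${1:-sccache}"
    if [ "${BUILD_REPORT_INCL_CACHE_STATS:-OFF}" = "ON" ] \
        && command -v "$tool" >/dev/null 2>&1; then
        # record the counters before the compile starts
        "$tool" --show-stats
        # zero the statistics counters
        "$tool" --zero-stats
    fi
}
```

Calling `snapshot_and_zero_stats` with `BUILD_REPORT_INCL_CACHE_STATS=ON` before the compile would mirror what the ccache version does today. When the variable is unset or the tool is missing, the function does nothing.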

@ajschmidt8
Member Author

pre-cache C++ build time (link to log): 1:34:39.5

post-cache C++ build time (link to log): 0:05:36.4


@davidwendt
Contributor

> So it seems that we should be able to get the same information from sccache by just updating the commands to use sccache instead of ccache and changing the grep arguments. Does that sound accurate?

I'm not sure. Are the cache hits from a local cache for the build? That is, are the sccache statistics from the server's cache, or are they local to the current build?

@ajschmidt8
Member Author

> > So it seems that we should be able to get the same information from sccache by just updating the commands to use sccache instead of ccache and changing the grep arguments. Does that sound accurate?
>
> I'm not sure. Are the cache hits from a local cache for the build? That is, are the sccache statistics from the server's cache, or are they local to the current build?

The stats are local to the current build. sccache has the capability to set up a shared server, but in the absence of one, it spins up its own for each local build. The short article below talks about this for reference.

https://github.com/mozilla/sccache/blob/master/docs/Jenkins.md

@ajschmidt8
Member Author

@davidwendt, I pushed commit ac47436 with the changes we discussed. Please review when you have a minute.

@davidwendt
Contributor

@ajschmidt8
Member Author

> The cache hit rate looks a little strange in the build metrics report here https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-cpu-cuda-build/CUDA=11.5/7537/Build_20Metrics_20Report/

yeah, interesting. I'll debug.

@ajschmidt8
Member Author

> The cache hit rate looks a little strange in the build metrics report here https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-cpu-cuda-build/CUDA=11.5/7537/Build_20Metrics_20Report/

Just pushed an update. As it turns out, sccache -s outputs different fields depending on whether any stats have been recorded: per-language lines such as Cache hits (C/C++) only appear once there is data. So my original grep pattern didn't work.

"sccache -s" with stats
Compile requests                     610
Compile requests executed            610
Cache hits                           609
Cache hits (C/C++)                   284
Cache hits (CUDA)                    325
Cache misses                           1
Cache misses (CUDA)                    1
Cache timeouts                         0
Cache read errors                      0
Forced recaches                        0
Cache write errors                     0
Compilation failures                   0
Cache errors                           0
Non-cacheable compilations             0
Non-cacheable calls                    0
Non-compilation calls                  0
Unsupported compiler calls             0
Average cache write                0.182 s
Average cache read miss           66.172 s
Average cache read hit             0.153 s
Failed distributed compilations        0
Cache location                  S3, bucket: Bucket(name=rapids-sccache, base_url=http://rapids-sccache.s3-us-west-2.amazonaws.com/)

"sccache -s" with no stats
Compile requests                      0
Compile requests executed             0
Cache hits                            0
Cache misses                          0
Cache timeouts                        0
Cache read errors                     0
Forced recaches                       0
Cache write errors                    0
Compilation failures                  0
Cache errors                          0
Non-cacheable compilations            0
Non-cacheable calls                   0
Non-compilation calls                 0
Unsupported compiler calls            0
Average cache write               0.000 s
Average cache read miss           0.000 s
Average cache read hit            0.000 s
Failed distributed compilations       0
Cache location                  S3, bucket: Bucket(name=rapids-sccache, base_url=http://rapids-sccache.s3-us-west-2.amazonaws.com/)
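
To illustrate the pitfall, here is a minimal sketch of one way to pull only the total counter out of either output shown above. A plain `grep "Cache hits"` would also match the per-language lines once they appear. The helper name `extract_counter` is hypothetical, and this is not necessarily the pattern used in the pushed update.

```shell
# Sketch: read `sccache -s` text output on stdin and print the total value
# for one counter label, e.g. "Cache hits" or "Cache misses".
# A line counts as the total only when the label is followed by nothing but
# spaces and digits, so "Cache hits (C/C++)" and "Cache hits (CUDA)" are skipped.
extract_counter() {
    awk -v label="$1" '
        index($0, label) == 1 {
            rest = substr($0, length(label) + 1)
            if (rest ~ /^ +[0-9]+ *$/) {
                gsub(/ /, "", rest)   # strip the column padding
                print rest
                exit
            }
        }'
}
```

With sccache on the PATH this could be used as `sccache --show-stats | extract_counter "Cache hits"`. It prints the same field whether or not the per-language breakdown is present.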

@davidwendt
Contributor

Ok. This looks great: https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-cpu-cuda-build/CUDA=11.5/7548/Build_20Metrics_20Report/

As an experiment, you can temporarily remove the word static from this line

static constexpr uint32_t DEFAULT_HASH_SEED = 0;
and push a commit to this PR. This change causes a recompile of almost everything. We should see a cache-hit rate of less than 10%.
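
For reference, here is a small sketch of how a hit rate could be derived from the two totals. It assumes the rate is hits divided by hits plus misses, which may not be exactly how the Build Metrics Report computes it. The function name `hit_rate` is hypothetical.

```shell
# Sketch: hit_rate HITS MISSES
# Prints the cache-hit percentage with two decimals, assuming
# rate = hits / (hits + misses). Prints 0.00 when nothing was compiled.
hit_rate() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        total = h + m
        if (total == 0) { print "0.00"; exit }
        printf "%.2f\n", 100 * h / total
    }'
}
```

With the totals from the populated `sccache -s` output earlier in this thread (609 hits, 1 miss), `hit_rate 609 1` gives 99.84.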

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Feb 1, 2022
@ajschmidt8
Member Author

> Ok. This looks great: https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-cpu-cuda-build/CUDA=11.5/7548/Build_20Metrics_20Report/
>
> As an experiment, you can temporarily remove the word static from this line
>
> and push a commit to this PR. This change causes a recompile of almost everything. We should see a cache-hit rate of less than 10%.

done in d3dac36

@davidwendt
Contributor

This looks great: https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-cpu-cuda-build/CUDA=11.5/7552/Build_20Metrics_20Report/
Don't forget to undo the temporary change to types.hpp.

@github-actions github-actions bot removed the libcudf Affects libcudf (C++/CUDA) code. label Feb 1, 2022
@ajschmidt8 ajschmidt8 marked this pull request as ready for review February 2, 2022 17:29
@ajschmidt8 ajschmidt8 requested a review from a team as a code owner February 2, 2022 17:29
@ajschmidt8 ajschmidt8 added the 5 - DO NOT MERGE Hold off on merging; see PR for details label Feb 2, 2022
@ajschmidt8
Member Author

PR is ready for review, but we'll wait to merge until all the other PRs are confirmed working as well.

@ajschmidt8 ajschmidt8 removed the 5 - DO NOT MERGE Hold off on merging; see PR for details label Feb 3, 2022
@ajschmidt8
Member Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 4e58850 into rapidsai:branch-22.04 Feb 3, 2022
@ajschmidt8 ajschmidt8 deleted the new-build-process branch February 3, 2022 21:31