Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cython cache diff compare #13526

Closed
wants to merge 8 commits into from
Closed

Cython cache diff compare #13526

wants to merge 8 commits into from

Conversation

nparley
Copy link
Contributor

@nparley nparley commented Jun 28, 2016

As talked about in #13425 with @gfyoung and @jreback this PR does cython caching by comparing the pyx files and not relying on the git history.

@nparley
Copy link
Contributor Author

nparley commented Jun 28, 2016

At the moment this code does the follow:

  1. Caches all the pyx files at the end of a run under $HOME/.cache/pyxfiles keeping the directory structure incase two cython files have the same name but in different directories.
  2. Caches all the corresponding compiled files for those pyx files.

then at the start of a run:

  1. Finds all the current pyx files in the checkout git repo on travis
  2. For each file compares it to the file that is in the cache
  3. If the file does not exist in cache a new cython file has been added => Do not use cython cache and rebuild
  4. If the file is different from the cython file in cache => Do not use cython cache and rebuild
  5. If the number of pyx files are different then a file has been deleted (a rename will be picked up by 3) => Do not use cython cache and rebuild
  6. If all files are the same => Use cython cache, i.e. extract and touch the compiled pyx files.

The thing this does not do is the case where a file exists on the cache but has been deleted from the repo. Do you think this should be caught as well or should it not matter? I guess a compiled file would be copied over but then just not referenced or used in the build.

@codecov-io
Copy link

codecov-io commented Jun 28, 2016

Current coverage is 84.42%

Merging #13526 into master will increase coverage by 0.08%

@@             master     #13526   diff @@
==========================================
  Files           138        138          
  Lines         51111      51349   +238   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43107      43350   +243   
+ Misses         8004       7999     -5   
  Partials          0          0          

Powered by Codecov. Last updated by e8e0aae...8b4ad0b

@nparley
Copy link
Contributor Author

nparley commented Jun 29, 2016

Tests to perform:

  • If pyx files have not changed then cached cython cache is reused
  • If a pyx file is changed the cython cache is not used
  • If a pyx file is added the cython cache is not used
  • If a pyx file is deleted the cython cache is not used
  • Check a pyx file change when rewritting git history i.e. cache is not used

clear_cache=0

pyx_files=`echo "$pyx_file_list" | wc -l`
pyx_cache_files=`echo "$pyx_cache_file_list" | wc -l`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will not work. you would need to compare .pyx file sizes I think to make this reliable. Just adding / removing a file is not good enough at all. Most often we have a change in actual contents.

Copy link
Contributor Author

@nparley nparley Jun 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actually checking is done below. diff -u $i $PYX_CACHE_DIR${i} for each pyx file . This code is only here to check no pyx files have been deleted in full.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i c.

@jreback jreback added the Build Library building on various platforms label Jun 29, 2016
@jreback jreback added this to the 0.18.2 milestone Jun 29, 2016
git diff HEAD~2 --numstat | grep -E "pyx|pxd"
retval=$(git diff HEAD~2 --numstat | grep -E "pyx|pxd"| wc -l)
echo "Not a PR"
# Uncomment next 2 lines to turn off cython caching not in a PR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you meaning to leave in these comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about it; as a quick way of turning off caching of cython files in the future if something goes wrong. But happy to take them out also.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok that's fine then.

@jreback
Copy link
Contributor

jreback commented Jun 29, 2016

@nparley ok let's try this. ping when green / ready.

@nparley
Copy link
Contributor Author

nparley commented Jun 29, 2016

There is an interesting gocha with the way travis checks out commits. On a branch travis actually checks out the hash. However, for a PR travis just checks out the latest merge. Therefore if you push lots of changes quickly before travis has started (like I just did early), then builds might not actually reflect the push. (They might all be testing the last push).

@nparley
Copy link
Contributor Author

nparley commented Jun 29, 2016

Travis is having problems with Mac OSx runs at the moment. I will not do any more until they fix it.

nparley added 4 commits July 1, 2016 13:59
Delete pyx file

Add pyx file

Change pyx file

Undo pyx change

Un test commit

Test commit after ci change

Change a pyx file

Bug fix
@nparley
Copy link
Contributor Author

nparley commented Jul 1, 2016

@jreback OK this is ready to go.

@jorisvandenbossche
Copy link
Member

Looks good to me!

@jreback jreback closed this in 449e824 Jul 3, 2016
@jreback
Copy link
Contributor

jreback commented Jul 3, 2016

thanks @nparley

let's give this a watch over next few days to make sure its behaving.

@nparley nparley deleted the pyx-diff branch August 15, 2016 09:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Library building on various platforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants