Add file caching to Repository.get_file and get_file_exists.
Review Request #4083 — Created April 28, 2013 and submitted — Latest diff uploaded
Add file caching to Repository.get_file and get_file_exists. We had file caching before in the diffutils code before fetching a file. This ensured that we would only have to perform this operation once per path + revision (so long as it stayed in the cache). That caching support has moved into Repository.get_file directly, meaning that any call sites will get the cached copy, and not just those going through diff code. Along with this, I've finally added caching for get_file_exists. Prior to this, if you uploaded a diff touching files more than once, we'd check (and possibly fetch) those files over and over, which could be expensive (and really hit the rate limits hard for GitHub, in some cases). The get_file_exists caching will first check to see if we've already determined if the file exists. If not, it will check to see if we have actually fetched the file (using get_file) before, and if so, we consider the file to exist. Otherwise, we get it from the repository. Either way, we store that the file was fetched, for future lookups.
Posted a change twice, locally. The first time, I saw the file accesses. The second time, none, and it was a faster post. I then cleared the cache, viewed the diff, and then posted again. Once again, it was fast, and it didn't perform any lookups, due to having had the fetched files in the cache. All unit tests (including the new ones to test for caches and signals) pass.