core: Provide mechanism to cache manifest file content #4518

Merged
7 commits merged into apache:master on Sep 26, 2022

Conversation

rizaon
Contributor

@rizaon rizaon commented Apr 6, 2022

This is a draft PR for the ManifestCache implementation.

The ManifestCache interface closely follows the com.github.benmanes.caffeine.cache.Cache interface, with the addition of a maxContentLength() method. If a stream is longer than maxContentLength(), AvroIO skips caching it to avoid memory pressure. An example implementation, CaffeineManifestCache, can be seen in TestAvroFileSplit.java.
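
A minimal sketch of what an interface along these lines might look like (the names and signatures here are illustrative assumptions, not the exact API in this draft):

// Illustrative sketch only; the real ManifestCache in this draft may differ.
import java.nio.ByteBuffer;
import java.util.function.Function;

public interface ManifestCache {
  // Streams longer than this (in bytes) are read directly and never cached.
  long maxContentLength();

  // Return cached content for a manifest location, loading it on a miss
  // (mirrors Caffeine's Cache#get(key, mappingFunction)).
  ByteBuffer get(String location, Function<String, ByteBuffer> loader);

  // Drop a single entry, e.g. when the underlying file is rewritten or removed.
  void invalidate(String location);
}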

I will tidy this up and add documentation as we go. Looking forward to any feedback.

Closes #4508

@rizaon
Contributor Author

rizaon commented Apr 7, 2022

485e2ab switches the ManifestCache class to use Guava Cache instead. This is easier to do in iceberg-api since we can simply include the Guava cache classes in the shadow jar of iceberg-bundled-guava.

rizaon marked this pull request as ready for review on April 8, 2022 01:02
@rdblue
Contributor

rdblue commented Apr 10, 2022

Switching to use more Guava classes is probably not a good idea.

@rizaon, do you have use cases where this has helped? If so, what were they? This adds quite a bit of complexity and memory overhead that we purposely avoided up to now so I want to make sure it is worth the change.

Have you considered adding a caching FileIO instance? That could be used for more use cases than just manifest files and job planning. For example, a FileIO read-through cache could cache delete files that might be reused for tasks. This could be configured by file path, making it easy to determine what to cache. Plus, we could use more options than just in-memory, like a distributed cache or local disk.
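
A rough sketch of the kind of path-keyed, read-through content cache this suggests; this is an illustration built on Caffeine with made-up class and method names, not code from this PR:

// Hypothetical read-through content cache keyed by file path; not the PR's implementation.
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.time.Duration;
import java.util.function.Function;

public class PathKeyedContentCache {
  private final Cache<String, byte[]> cache =
      Caffeine.newBuilder()
          .maximumWeight(64L * 1024 * 1024) // cap total cached bytes (example value)
          .weigher((String path, byte[] bytes) -> bytes.length)
          .expireAfterAccess(Duration.ofMinutes(10))
          .build();

  // Return cached bytes for a path, loading them with the supplied reader on a miss.
  public byte[] get(String path, Function<String, byte[]> reader) {
    return cache.get(path, reader);
  }

  // Evict one entry, e.g. after the file is deleted or replaced.
  public void invalidate(String path) {
    cache.invalidate(path);
  }
}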

@rizaon
Contributor Author

rizaon commented Apr 11, 2022

Hi @rdblue, thank you for your feedback.

We found a slow query compilation issue against Iceberg tables in our recent Apache Impala build. Impala uses Iceberg's HiveCatalog and a HadoopFileIO instance with an S3A input stream to access data from S3. We ran a full 10 TB TPC-DS benchmark and found that query compilation can take several seconds, while it used to be less than a second with native Hive tables. This slowness in single-query compilation is due to the need to call planFiles several times, even for scan nodes targeting the same table. We also see several socket read operations that spend hundreds of milliseconds during planFiles, presumably due to S3A HTTP HEAD request overhead and backward-seek overhead (issue #4508). This especially hurts fast-running queries.

We tried this caching solution and it made Impala query compilation on Iceberg tables almost 5x faster compared to running without it. Our original solution, however, was to put a Caffeine cache as a singleton in AvroIO.java; I thought it would be better to supply the cache from outside.

I have not considered the solution of adding a caching FileIO instance. I'm pretty new to the Iceberg codebase, but I'm interested in following up on that if it can yield a better integration. Will it require a new class of Catalog/Table as well, or can we improve on the existing HiveCatalog and HadoopFileIO?

@rizaon
Contributor Author

rizaon commented Apr 11, 2022

The relevant Impala JIRA is here:
https://issues.apache.org/jira/browse/IMPALA-11171

@rdblue
Contributor

rdblue commented Apr 11, 2022

@rizaon, caching in the FileIO layer would be much more general. You could do things like detect that the file size is less than some threshold and cache it in memory, or detect file names under metadata and cache those. I highly recommend looking into the FileIO API!
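
For illustration, the policy being described could be as simple as the following predicate (the threshold constant and helper name are assumptions for the example, not Iceberg configuration):

// Hypothetical policy helper; names and values are examples only.
public class CachePolicy {
  // Example threshold: only files at or below this size are cached in memory.
  private static final long MAX_CACHEABLE_LENGTH = 8L * 1024 * 1024; // 8 MiB

  public static boolean shouldCache(String path, long lengthInBytes) {
    // Cache small files under the table's metadata directory (manifest lists,
    // manifests); let large data and delete files go straight to storage.
    return lengthInBytes <= MAX_CACHEABLE_LENGTH && path.contains("/metadata/");
  }
}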

@rizaon
Contributor Author

rizaon commented Apr 14, 2022

c032b7a implements caching as a new FileIO class, CachingHadoopFileIO. A new Tables class, CachingHadoopTables, is also added to assist with testing.

We tried to avoid the lazyStats() call as much as possible by checking the cache first before checking the stream length. This comes with the risk of keeping stale data in memory when the actual file has already been deleted, as shown in CachingHiveTableTest.testMissingMetadataWontCauseHang(). This is probably fine, given the immutable nature of Iceberg metadata. Otherwise, the application should invalidate the cache entry first to force a re-read / lazyStats() call, as in the sketch below.
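
A hypothetical sketch of that invalidation pattern (the cache type and wiring here are assumptions, not the exact CachingHadoopFileIO API):

// Assumed wiring for illustration: a Caffeine byte[] cache keyed by path sitting
// next to a FileIO; not the exact API of CachingHadoopFileIO in this PR.
import com.github.benmanes.caffeine.cache.Cache;
import org.apache.iceberg.io.FileIO;
import org.apache.iceberg.io.InputFile;

class StaleEntryHandling {
  static InputFile forceReread(FileIO io, Cache<String, byte[]> contentCache, String path) {
    // Evict any stale bytes first so the next read goes back to storage,
    // which also re-triggers the length/existence check (the lazyStats() path).
    contentCache.invalidate(path);
    return io.newInputFile(path);
  }
}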

@rizaon
Contributor Author

rizaon commented Apr 18, 2022

Hello @rdblue. This PR is ready for review.

Please let me know if there is any new feedback or request. I'm happy to follow up.

*/
public class CachingFileIO implements FileIO, HadoopConfigurable {
private static final Logger LOG = LoggerFactory.getLogger(CachingFileIO.class);
private static ContentCache sharedCache;
Contributor

Is there a way to pass in the cache rather than making it static?

@jackye1995, do you have any ideas here?

Contributor Author

This remains static in 47b8008. Please let me know if there is a better way to access the cache from the Catalog object or from outside.

@rdblue
Contributor

rdblue commented Apr 24, 2022

@rizaon, there are a couple of PRs that should help you with this. #4608 adds IOUtil.readFully and IOUtil.readRemaining that you can use to fill buffers from an input stream. I've also opened #4623 to add the ByteBufferInputStream classes I was talking about. Those should allow you to easily split the data across several small allocations and wrap them in an InputStream implementation.
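
A sketch of how those pieces might fit together; the IOUtil.readRemaining and ByteBufferInputStream.wrap signatures below are assumptions based on the descriptions above, so treat this as an approximation rather than the merged code:

// Assumed signatures: IOUtil.readRemaining(InputStream, byte[], int, int) returning the
// number of bytes read, and ByteBufferInputStream.wrap(List<ByteBuffer>).
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import org.apache.iceberg.io.ByteBufferInputStream;
import org.apache.iceberg.io.IOUtil;

class ChunkedReadSketch {
  private static final int CHUNK_SIZE = 4 * 1024 * 1024; // 4 MiB slices (example size)

  static ByteBufferInputStream readIntoChunks(InputStream in) throws IOException {
    List<ByteBuffer> chunks = new ArrayList<>();
    while (true) {
      byte[] chunk = new byte[CHUNK_SIZE];
      int read = IOUtil.readRemaining(in, chunk, 0, CHUNK_SIZE);
      if (read > 0) {
        chunks.add(ByteBuffer.wrap(chunk, 0, read));
      }
      if (read < CHUNK_SIZE) {
        break; // hit end of stream
      }
    }
    // Present the small allocations as a single InputStream view again.
    return ByteBufferInputStream.wrap(chunks);
  }
}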

@rizaon
Contributor Author

rizaon commented Apr 26, 2022

Got it, thank you. Will rebase and update this PR once they are merged.

CatalogProperties.IO_MANIFEST_CACHE_MAX_TOTAL_BYTES_DEFAULT);
}

public static long cacheMaxContentLength(FileIO io) {
Contributor

This can be package level protection.

Contributor Author

Done.

@danielcweeks
Contributor

Hey @rizaon, I think this is getting really close now (thanks for sticking with us on this). Now that we've gotten this far, I see there's a little more we can do to consolidate the logic in ContentCache. For example, if you pull the property configuration into the ContentCache constructor, we can remove a lot from ManifestFiles:

 public ContentCache(FileIO fileio) {
    this.fileio = fileio;

    long expireAfterAccessMs = PropertyUtil.propertyAsLong(
        fileio.properties(),
        CatalogProperties.IO_MANIFEST_CACHE_EXPIRATION_INTERVAL_MS,
        CatalogProperties.IO_MANIFEST_CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);

    long maxTotalBytes = PropertyUtil.propertyAsLong(
        fileio.properties(),
        CatalogProperties.IO_MANIFEST_CACHE_MAX_TOTAL_BYTES,
        CatalogProperties.IO_MANIFEST_CACHE_MAX_TOTAL_BYTES_DEFAULT);

    long maxContentLength = PropertyUtil.propertyAsLong(
        fileio.properties(),
        CatalogProperties.IO_MANIFEST_CACHE_MAX_CONTENT_LENGTH,
        CatalogProperties.IO_MANIFEST_CACHE_MAX_CONTENT_LENGTH_DEFAULT);

    ValidationException.check(expireAfterAccessMs >= 0, "expireAfterAccessMs is less than 0");
    ValidationException.check(maxTotalBytes > 0, "maxTotalBytes is equal or less than 0");
    ValidationException.check(maxContentLength > 0, "maxContentLength is equal or less than 0");
    this.expireAfterAccessMs = expireAfterAccessMs;
    this.maxTotalBytes = maxTotalBytes;
    this.maxContentLength = maxContentLength;
    // remainder of the constructor (building the cache itself) is omitted in this snippet
  }

You can also pull CONTENT_CACHES, contentCache(FileIO io), and dropCache(FileIO fileIO) into ContentCache as well.

I played around with moving more of the logic into ContentCache and was able to move everything (minus the newInputFile logic) into ContentCache, which I feel is a cleaner separation of concerns.

@rizaon
Contributor Author

rizaon commented Sep 23, 2022

Hi @danielcweeks, that looks like a nice cleanup, but I have two concerns:

  1. By pushing the config verification down to the ContentCache constructor, we make ContentCache less generic and more specific to the manifest-caching use case. Is that OK?
  2. If we keep fileio as a class member of ContentCache, it becomes a strong reference to the FileIO. Wouldn't that make the weakKeys() in CONTENT_CACHES useless, because there would always be a strong reference to the FileIO key living in the ContentCache object (the value)? See the sketch below.
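
To illustrate the second concern, here is a hedged sketch (ContentCacheLike is a made-up stand-in, not the actual class) of why a strong back-reference from the cached value defeats weakKeys():

// Made-up types for illustration; shows why a value that pins its own key
// prevents weakKeys() entries from ever being garbage collected.
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import org.apache.iceberg.io.FileIO;

class WeakKeyPitfall {
  static class ContentCacheLike {
    private final FileIO io; // strong reference back to the cache key

    ContentCacheLike(FileIO io) {
      this.io = io;
    }
  }

  static final Cache<FileIO, ContentCacheLike> CONTENT_CACHES =
      Caffeine.newBuilder().weakKeys().build();

  static ContentCacheLike cacheFor(FileIO io) {
    // The returned value keeps `io` strongly reachable, so the weakly-held key
    // never becomes collectible while the entry itself is alive.
    return CONTENT_CACHES.get(io, ContentCacheLike::new);
  }
}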

@danielcweeks
Contributor

Good points. Let me take one more pass (I was just hoping to consolidate more of the logic in ContentCache, but you're probably right that it makes more sense to keep the manifest settings with ManifestFiles).

rizaon force-pushed the master-ManifestCache branch from 4175090 to ba7329d on September 24, 2022 00:29
@rizaon
Contributor Author

rizaon commented Sep 24, 2022

In the last GitHub check runs, there was a checkstyle issue and a problem with exception handling when the given FileIO does not implement the properties() method.

I have rebased this PR and added ba7329d to fix these issues.

danielcweeks merged commit 6c7e8d1 into apache:master on Sep 26, 2022
@danielcweeks
Contributor

@rizaon Thanks for your contribution here and keeping this updated! I'm excited to see how this works out.

@rizaon
Contributor Author

rizaon commented Sep 26, 2022

@danielcweeks thank you for accepting this PR!
