Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add function to retrieve and cache vector files #62

Merged
merged 2 commits into from
Apr 12, 2021

Conversation

nickpeihl
Copy link
Member

@nickpeihl nickpeihl commented Mar 31, 2021

Fixes #60. The getVectorDataOfType function takes either a geojson or topojson argument and retrieves and caches the specified vector file from EMS.

This is similar to how the vector tile stylesheets are cached. For file layers we use the memoize function rather than the once function because the argument may differ between function calls.

Note: Reviewers may want to hide whitespace changes.

Copy link
Contributor

@thomasneirynck thomasneirynck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. Just two comments for consideration.

constructor(config: FileLayerConfig, emsClient: EMSClient, proxyPath: string) {
super(config, emsClient, proxyPath);
this._config = config;
}

async getVectorDataOfType(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on changing this to FileLayer#getGeoJson and doing any format conversion automagically in here if necessary?

If the default format is topojson, ems-client still downloads topojson (because it is smaller over the wire), but performs the geojson conversion "automagically".

The reason for this is that if the conversion does not happen, the client (in this case Kibana), would still need to run the topojson->geojson conversion each time still. Conversion-cost is not that huge of a penalty, but nonetheless it is one. And if ems-client already introduces caching, it may as well cache the usable converted geojson result.

The reason I think we want to avoid making this cache format-dependent is how the suggestEMSTermJoin-function proposed here elastic/kibana#94969 might end up being used.

For example, imagine a client looping over all columns in an index-pattern and calling this function with a few sample-values each to "find" a joinable layer. Even though ems-client transparently caches the downloaded result (great!), the topojson->geojson conversion would still need to happen at each call, which in a tight loop will start to be noticeable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I like the idea of fully hiding the format logic behind the client, so it only outputs geojson features up. That way library users don't need to know or deal with TopoJSON and just consume a more familiar data representation.

@@ -37,11 +40,31 @@ type EMSFormats = EmsFileLayerFormatGeoJson | EmsFileLayerFormatTopoJson;
export class FileLayer extends AbstractEmsService {
protected readonly _config: FileLayerConfig;

private _getVectorDataOfType = _.memoize(
Copy link
Contributor

@thomasneirynck thomasneirynck Apr 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should push the caching into the instance of the EMSClient instead, rather than have a memoized result on each individual FileLayer object. The reason for this is that this cache, in theory, can grow with however many layers EMS-exposes. E.g. if a client just loops over all the FileLayer objects and gets the geojson, this will keep 60+ files in memory.

I'd consider using a single LRU-cache (https://www.npmjs.com/package/lru-cache), maybe with a size of 10, and expose it on the EMSClient.

e.g.

EMSClient#getGeoJsonFromCache(layerId)

and

EMSClient#cacheGeoJson(layerId)

FileLayer#getGeoJson can then just use these methods.

This cache is cleared with EMSClient#_invalidateSettings.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I was wondering about that, too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is over engineering, but what about using local storage for caching, so the state is saved between EMS client life cycles? A simple expiration logic would be enough since our assets don't change much.

I believe there are localstorage polyfills for node users and our own tests, but never used them though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is over engineering, but what about using local storage for caching, so the state is saved between EMS client life cycles? A simple expiration logic would be enough since our assets don't change much.

I believe there are localstorage polyfills for node users and our own tests, but never used them though.

I think it's better to retain a limited cache in memory rather than in the local storage. We may need to fix data in the service and I don't think we can invalidate user's locally stored data easily.

If we really wanted to have persistent caching, I think we would use Service Workers in the browser. There we can set TTLs and have native cross-browser support. But I don't think we use Service Workers anywhere in Kibana now and I don't want to add them without having a larger discussion with the team.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough 👍

@nickpeihl nickpeihl requested a review from thomasneirynck April 5, 2021 23:58
Copy link
Contributor

@thomasneirynck thomasneirynck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx, this is great, and going to be a nice little win for any maps user.


describe('ems_client', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add GeoJson retrieval to FileLayer
3 participants