[PoC] Efficient client management to support multiple OpenSearch clusters #1499

Closed
4 of 6 tasks
zhongnansu opened this issue Apr 26, 2022 · 1 comment
Labels: dashboards anywhere, multiple datasource

zhongnansu commented Apr 26, 2022

Problem Statement

As part of #1388, which aims to let Dashboards connect to multiple OpenSearch clusters, the thread discussed adding a new saved object type, datasource, modeled on the index-pattern data model. There is also a proposed PoC PR.

There is an issue with the PoC code. To connect to an external data source, it creates an opensearch-js client (`@opensearch-project/opensearch`) directly in the search strategy:

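import { Client } from '@opensearch-project/opensearch';

// Runs inside the search strategy, so a new client is built for every query.
const client = new Client({
  node: url,
  auth: {
    username,
    password,
  },
});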

This is not resource efficient, since it creates a new client, and with it a new connection, for every query. We need a better approach to client management.

Proposed Solution [WIP]

We can look at how clients are managed in Dashboards core. The lifecycle starts when the Dashboards core plugin spins up: it creates two clients and registers them with the core context, which exposes them to other modules through the http services. It can spawn child clients as needed (via asScoped) from the auth information in a request. Each call returns a new client instance that shares the connection pool with its parent client. Calling close on any parent or child client closes every client. The connections close when Dashboards stops.
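
To make the sharing explicit, here is a toy model of the parent/child behavior described above. ToyClient and ToyConnectionPool are invented for illustration and are not the opensearch-js classes; only the behavior (children reuse the parent's pool, and close on any client affects all of them) is taken from the description.

```typescript
// Toy model only: NOT the opensearch-js implementation.
class ToyConnectionPool {
  readonly node: string;
  closed = false;

  constructor(node: string) {
    this.node = node;
  }
}

class ToyClient {
  private readonly pool: ToyConnectionPool;
  readonly headers: Record<string, string>;

  constructor(pool: ToyConnectionPool, headers: Record<string, string> = {}) {
    this.pool = pool;
    this.headers = headers;
  }

  // A child gets its own headers but reuses the parent's connection pool.
  child(options: { headers?: Record<string, string> }): ToyClient {
    return new ToyClient(this.pool, { ...this.headers, ...options.headers });
  }

  sharesPoolWith(other: ToyClient): boolean {
    return this.pool === other.pool;
  }

  get isClosed(): boolean {
    return this.pool.closed;
  }

  // Closing any client closes the shared pool, and so every client on it.
  close(): void {
    this.pool.closed = true;
  }
}

const parent = new ToyClient(new ToyConnectionPool('https://localhost:9200'));
const scoped = parent.child({ headers: { authorization: 'Basic dXNlcjpwYXNz' } });

// Closing the child also closes the parent, because the pool is shared.
scoped.close();
```

This is why spawning a child per request is cheap, and also why a child cannot simply be pointed at a different cluster: the endpoint belongs to the pool, not to the individual client.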

We could add one more client to the ClusterClient class, dedicated to external data sources, and wire in our logic to spawn child clients from a datasource as needed, similar to the asScoped function.

export class ClusterClient implements ICustomClusterClient {
  public readonly asInternalUser: Client;
  private readonly rootScopedClient: Client;

  // Returns a client scoped to the request: a child of rootScopedClient that
  // carries the request's headers and shares the parent's connection pool.
  asScoped(request: ScopeableRequest) {
    const scopedHeaders = this.getScopedHeaders(request);
    const scopedClient = this.rootScopedClient.child({
      headers: scopedHeaders,
    });
    return new ScopedClusterClient(this.asInternalUser, scopedClient);
  }

  // ... rest of the class omitted
}

A PoC is needed, and I am currently working on it.

Changes needed

  • 1. Along with asCurrentUser and asInternalUser, define a new interface, asDataSourceUser, for context.core.client to call
  • 2. Create a third client in cluster_client, used only for data sources
  • 3. Register the above client with the core context
  • 4. Import the saved object client into the core context, to retrieve the data source meta info used to establish the connection
  • 5. Adjust client config as needed (header, ssl). May need to add new parsing logic specific to dataSourceClient
  • * Code style adjustment. Expose clean interfaces
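
A rough sketch of how steps 1 and 4 could fit together. Everything here is hypothetical: ClientLike, SavedObjectsLike, DataSourceAttributes, the `'datasource'` type name, and createDataSourceScopedClient are stand-ins so the example is self-contained, not the actual Dashboards core types. How the client for a data source is actually built (steps 2 and 5) is deliberately left behind a factory function.

```typescript
// Minimal stand-in for an OpenSearch client.
interface ClientLike {
  search(params: Record<string, unknown>): Promise<unknown>;
}

// Hypothetical data source meta info stored in the datasource saved object.
interface DataSourceAttributes {
  endpoint: string;
  username: string;
  password: string;
}

// Hypothetical subset of the saved object client (step 4).
interface SavedObjectsLike {
  get(type: string, id: string): Promise<{ attributes: DataSourceAttributes }>;
}

// Step 1: a third accessor next to asInternalUser and asCurrentUser.
interface DataSourceScopedClusterClient {
  readonly asInternalUser: ClientLike;
  readonly asCurrentUser: ClientLike;
  asDataSourceUser(dataSourceId: string): Promise<ClientLike>;
}

// Steps 2 and 5 live behind this factory; its implementation is left open.
type DataSourceClientFactory = (attributes: DataSourceAttributes) => ClientLike;

function createDataSourceScopedClient(
  asInternalUser: ClientLike,
  asCurrentUser: ClientLike,
  savedObjects: SavedObjectsLike,
  clientFor: DataSourceClientFactory
): DataSourceScopedClusterClient {
  return {
    asInternalUser,
    asCurrentUser,
    async asDataSourceUser(dataSourceId: string): Promise<ClientLike> {
      // Step 4: look up the connection details by saved object id.
      // 'datasource' as the type name is an assumption.
      const { attributes } = await savedObjects.get('datasource', dataSourceId);
      return clientFor(attributes);
    },
  };
}
```

A search strategy would then call something like `await client.asDataSourceUser(id)` instead of constructing its own Client.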
@zhongnansu zhongnansu self-assigned this Apr 26, 2022
@zhongnansu zhongnansu changed the title [POC Proposal] Efficient client management to support multiple OpenSearch clusters [POC] Efficient client management to support multiple OpenSearch clusters Apr 26, 2022
@zhongnansu zhongnansu added the enhancement New feature or request label Apr 26, 2022
@zhongnansu zhongnansu changed the title [POC] Efficient client management to support multiple OpenSearch clusters [PoC] Efficient client management to support multiple OpenSearch clusters Apr 26, 2022

zhongnansu commented May 9, 2022

Results

My PoC code is in PR zengyan-amazon#2. It is based on the PoC done by @zengyan-amazon (Issue #1388 / PR #1499), which adds the data sources and uses a plain Client for each request to a data source.

However, step 5 is blocked:

  5. Adjust client config as needed (header, ssl). May need to add new parsing logic specific to dataSourceClient

Basically, the child client failed to connect to the test endpoint I gave it and instead routed the request to localhost. It seems that certain connection pool options, for example Connection, cannot be configured when creating a client with client.child(). This is documented in the Elastic client docs:

You can pass to the child every client option you would pass to a normal client, but the connection pool specific options (ssl, agent, pingTimeout, Connection, and resurrectStrategy).
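
To make that restriction visible in code, a hypothetical helper could separate the options a child may override from the pool-specific ones. splitChildOptions is an invented name, not part of any client library; the list of option names is taken from the quote above.

```typescript
// Option names from the quoted documentation: these belong to the connection
// pool, which a child shares with its parent.
const POOL_ONLY_OPTIONS: readonly string[] = [
  'ssl',
  'agent',
  'pingTimeout',
  'Connection',
  'resurrectStrategy',
];

type Options = Record<string, unknown>;

// Hypothetical helper: returns the options that are safe to pass to child(),
// plus the names of the options that would have no effect on a child.
function splitChildOptions(options: Options): { childOptions: Options; ignored: string[] } {
  const childOptions: Options = {};
  const ignored: string[] = [];
  for (const [key, value] of Object.entries(options)) {
    if (POOL_ONLY_OPTIONS.includes(key)) {
      ignored.push(key);
    } else {
      childOptions[key] = value;
    }
  }
  return { childOptions, ignored };
}
```

Logging the `ignored` names when spawning a data source client would at least surface this kind of misconfiguration early instead of silently falling back to the parent's settings.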

Next Steps (proposed new solutions)

  1. Implement a client pooling strategy that keeps a bounded number of clients (e.g. 200), managed with an LRU cache. Within each client's scope, we can still use .child() to handle different users on the same endpoint.
  2. Spend some time exploring the opensearch-js client library, to see whether a minor change would enable the node (endpoint) option for child() instances.
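
A minimal sketch of option 1, assuming one root client per endpoint and a small LRU built on Map insertion order. ClientPool, PooledClient, and the createClient factory are hypothetical names for illustration; PooledClient only models the child() and close() calls the sketch relies on.

```typescript
// Minimal stand-in for the parts of a client this sketch relies on.
interface PooledClient {
  child(options: { headers?: Record<string, string> }): PooledClient;
  close(): void;
}

class ClientPool {
  // Map preserves insertion order, so the first key is the least recently used.
  private readonly clients = new Map<string, PooledClient>();
  private readonly maxSize: number;
  private readonly createClient: (endpoint: string) => PooledClient;

  constructor(maxSize: number, createClient: (endpoint: string) => PooledClient) {
    if (maxSize < 1) {
      throw new RangeError('maxSize must be at least 1');
    }
    this.maxSize = maxSize;
    this.createClient = createClient;
  }

  get size(): number {
    return this.clients.size;
  }

  // Returns the root client for an endpoint, creating it on first use.
  getRootClient(endpoint: string): PooledClient {
    let client = this.clients.get(endpoint);
    if (client !== undefined) {
      // Remove so the set() below re-inserts it as most recently used.
      this.clients.delete(endpoint);
    } else {
      if (this.clients.size >= this.maxSize) {
        this.evictLeastRecentlyUsed();
      }
      client = this.createClient(endpoint);
    }
    this.clients.set(endpoint, client);
    return client;
  }

  // Different users on the same endpoint get children of the same root client.
  getClientForUser(endpoint: string, headers: Record<string, string>): PooledClient {
    return this.getRootClient(endpoint).child({ headers });
  }

  private evictLeastRecentlyUsed(): void {
    for (const [endpoint, client] of this.clients) {
      client.close();
      this.clients.delete(endpoint);
      break; // only the first (oldest) entry
    }
  }
}
```

One caveat: since closing a root client also closes its children, evicting an endpoint that still has in-flight requests would need extra care in a real implementation.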
