[MD]Research on opensearch_service.ts and client management to determine the connection strategy to support multiple datasource #1721

Closed
Tracked by #2003
seraphjiang opened this issue Jun 12, 2022 · 2 comments
Labels: dashboards anywhere, multiple datasource, v3.0.0

Comments

@seraphjiang
Member

@zhongnan to fill in detail

@seraphjiang added the dashboards anywhere and multiple datasource labels on Jun 12, 2022
@seraphjiang seraphjiang changed the title Research on opensearch_service.ts and client management to determine the connection strategy to support multiple datasource [MD]Research on opensearch_service.ts and client management to determine the connection strategy to support multiple datasource Jun 13, 2022
@zhongnansu
Member

zhongnansu commented Jun 15, 2022

The fact is that we need to create multiple clients, each holding its own connections, in order to talk to multiple OpenSearch clusters. The open questions are:

1. Where to initialize the client?

2. How to manage clients efficiently? (Out of scope for this thread)

My previous PoC #1499 doesn't offer a clear solution for question 2, but it does provide some insight into question 1.

Let's first look at how the default opensearch_strategy retrieves a client to talk to OpenSearch. The default search strategy calls the core API, which creates a "child" client for each request. Because a client created with .child() shares its parent's connection pool, this is still efficient.

context.core.opensearch.client.asCurrentUser.search(params),

export class ScopedClusterClient implements IScopedClusterClient {
  constructor(
    public readonly asInternalUser: OpenSearchClient,
    public readonly asCurrentUser: OpenSearchClient
  ) {}
}

asScoped(request: ScopeableRequest) {
  const scopedHeaders = this.getScopedHeaders(request);
  const scopedClient = this.rootScopedClient.child({
    headers: scopedHeaders,
  });
  return new ScopedClusterClient(this.asInternalUser, scopedClient);
}
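To make the "child client" idea concrete, here is a minimal sketch of the pattern, not the real client code. `ConnectionPool` and `SketchClient` are hypothetical stand-ins I made up for illustration; the only point is that each child is a new object with its own headers while the underlying pool is shared.

```typescript
// Hypothetical stand-ins for illustration; these are NOT the real client classes.
class ConnectionPool {
  constructor(public readonly node: string) {}
}

class SketchClient {
  constructor(
    private readonly pool: ConnectionPool,
    private readonly headers: Record<string, string> = {}
  ) {}

  // Mirrors the idea behind .child(): a new client object that reuses the
  // parent's pool and layers its own headers on top of the parent's.
  child(opts: { headers?: Record<string, string> }): SketchClient {
    return new SketchClient(this.pool, { ...this.headers, ...opts.headers });
  }

  sharesPoolWith(other: SketchClient): boolean {
    return this.pool === other.pool;
  }

  getHeaders(): Record<string, string> {
    return { ...this.headers };
  }
}

// One root client per cluster, one cheap child per request.
const root = new SketchClient(new ConnectionPool('http://localhost:9200'), {
  'x-app': 'osd',
});
const scopedA = root.child({ headers: { authorization: 'Basic userA' } });
const scopedB = root.child({ headers: { authorization: 'Basic userB' } });
```

In this sketch the two scoped clients carry different authorization headers but report the same pool, which is why creating one per request stays cheap.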

To support multiple data sources, we also need a way to create and retrieve multiple clients. There are two options:

1. Initialize the client in the data plugin -> search_strategy, the same as the PoC code

const client = new Client({
  node: url,
  auth: {
    username,
    password,
  }
});

2. Initialize the client in core, and expose core APIs for modules to retrieve clients, similar to what I did in the PoC zengyan-amazon#2

The second approach is preferred for the following reasons.

  1. It follows a paradigm similar to the default search strategy: the client would be accessible from core through something like context.core.opensearch.dataSourceClient.asDataSourceUser.search(param)
  2. When other internal and external plugins want to leverage the multi-datasource feature, they get a much cleaner interface by just calling context.core.dataSourceClient. Here is an example of how an external OSD plugin uses the default opensearch client:
    https://github.com/opensearch-project/dashboards-reports/blob/73331021e8e496f82fa80c15d7258ae18fba319d/dashboards-reports/server/routes/reportDefinition.ts#L53
  3. Decoupling it from the data plugin gives us more flexibility when implementing client features, such as client pooling or onboarding other types of data sources (which require different types of clients), because search (data plugin) and client management (OSD core) are logically separated.
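As a rough sketch of what "client management in core" could look like, the snippet below keeps one client per data source and hands it out on request. Everything here is an assumption for illustration: `DataSourceConfig`, `DataSourceClient`, `ClientFactory` and `DataSourceClientRegistry` are hypothetical names, caching by data source id is just one possible strategy, and the fake factory stands in for real client construction.

```typescript
// All names below are hypothetical and exist only for this illustration.
interface DataSourceConfig {
  id: string;
  endpoint: string;
  username: string;
  password: string;
}

interface DataSourceClient {
  endpoint: string;
  search(params: { index: string }): Promise<{ endpoint: string; index: string }>;
}

type ClientFactory = (config: DataSourceConfig) => DataSourceClient;

// Lives in core; plugins ask it for a client instead of building their own.
class DataSourceClientRegistry {
  private readonly clients = new Map<string, DataSourceClient>();

  constructor(private readonly factory: ClientFactory) {}

  // Returns the cached client for this data source, creating it on first use.
  getClient(config: DataSourceConfig): DataSourceClient {
    let client = this.clients.get(config.id);
    if (!client) {
      client = this.factory(config);
      this.clients.set(config.id, client);
    }
    return client;
  }

  get size(): number {
    return this.clients.size;
  }
}

// Fake factory so the sketch runs without a cluster; it counts creations.
let createdCount = 0;
const fakeFactory: ClientFactory = (config) => {
  createdCount += 1;
  return {
    endpoint: config.endpoint,
    search: async (params) => ({ endpoint: config.endpoint, index: params.index }),
  };
};

const registry = new DataSourceClientRegistry(fakeFactory);
const clusterA: DataSourceConfig = {
  id: 'a', endpoint: 'https://a.example:9200', username: 'u', password: 'p',
};
const clusterB: DataSourceConfig = {
  id: 'b', endpoint: 'https://b.example:9200', username: 'u', password: 'p',
};

const firstA = registry.getClient(clusterA);
const secondA = registry.getClient(clusterA); // reuses the cached client
const firstB = registry.getClient(clusterB);  // separate client for another cluster
```

Because the registry takes a factory, the data plugin never needs to know how a client is built, which is the decoupling argued for in point 3.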

Furthermore, a bit about implementation. We can

  1. Wire it in the existing core -> opensearch_service, similar as scoped client creation
  2. Or we can just create a new service called datasource_service in core.

As for which one is better, that is an implementation-level detail we can decide later.

@seraphjiang
Member Author

Thanks @zhongnansu for putting information together.

I like the preferred solution of initializing the client in the core API, so other plugins can use it without changing too much code.

Here are my thoughts on the two implementation options quoted below.
Wiring it into the existing opensearch_service makes sense to me if we only want to support OpenSearch, in the short term as well as the long term. It doesn't seem like too much work for plugin developers to follow the convention and make their plugins multi-datasource compatible. The API signature is clear:

context.core.opensearch.dataSourceClient.asDataSourceUser.search(param)

Creating a new datasource_service seems to allow supporting data source types other than OpenSearch in the future. However, the API signature is not as clear or straightforward to me:

context.core.datasource_service.dataSourceClient.asDataSourceUser.search(param)
or
context.core.datasource_service.opensearch.dataSourceClient.asDataSourceUser.search(param)
or
context.core.datasource_service.mysql.dataSourceClient.asDataSourceUser.search(param)
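One way the signature could stay explicit with several data source types is to key the client lookup by type, so the compiler knows which client shape comes back. This is only a sketch under my own assumptions: `OpenSearchLikeClient`, `SqlLikeClient`, `DataSourceClientsByType` and `DataSourceServiceSketch` are hypothetical names, and the clients are trivial fakes.

```typescript
// Hypothetical client shapes per data source type, for illustration only.
interface OpenSearchLikeClient {
  search(params: { index: string }): string;
}

interface SqlLikeClient {
  query(sql: string): string;
}

// Maps each supported type name to the client shape it exposes.
interface DataSourceClientsByType {
  opensearch: OpenSearchLikeClient;
  mysql: SqlLikeClient;
}

class DataSourceServiceSketch {
  constructor(private readonly clients: DataSourceClientsByType) {}

  // The return type follows the requested type key, so callers get
  // search() for 'opensearch' and query() for 'mysql' without casting.
  getClient<T extends keyof DataSourceClientsByType>(type: T): DataSourceClientsByType[T] {
    return this.clients[type];
  }
}

const service = new DataSourceServiceSketch({
  opensearch: { search: (params) => `search:${params.index}` },
  mysql: { query: (sql) => `query:${sql}` },
});

const osResult = service.getClient('opensearch').search({ index: 'logs' });
const sqlResult = service.getClient('mysql').query('SELECT 1');
```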

@zengyan-amazon any comments?

Furthermore, a bit about implementation. We can

  1. Wire it in the existing core -> opensearch_service, similar as scoped client creation
  2. Or we can just create a new service called datasource_service in core.

@zhongnansu other than the open question above, are we clear to close this research task with a conclusion and move to the design and implementation phase?

2 participants