You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
However, since we also use/query ES directly, a proxy would work far better than changing Kibana.
In the discussion above, it was decided (by Shay) that the place to do it would be at the ES level.
The following is a proposal to achieve it at the ES level. This is some very early planning and I've decided to post it early to make sure I'm not duplicating work or am on a totally wrong path (which is totally possible). Any feedback/comments are appreciated.
Proposal
The general plan is to make a query only node (termed 'search load balancer' in elasticsearch.yml) and have a list of cluster names that we want to query.
During a search, the node will query from each of the shards in each of the clusters and aggregate the results.
Details
Make a query only node
In elasticsearch.yml
node.master: false
node.data: false
I'm hoping this means that we can isolate code changes to only the search portion.
Accommodate multiple clusters
Have ZenDiscover reach out to all the listed clusters to get their state.
ClusterService will contain a map of ClusterName -> ClusterState.
To maintain the interface, clusterService.state() will just return the first cluster.
We can have another interface that will allow us to get the the cluster map.
MulticastZenPing will have to listen for changes in the listed clusters and update the corresponding cluster state.
Searching across multiple clusters
I've only looked at TransportSearchTypeAction and TransportSearchQueryThenFetchAction so far.
In the BaseAsyncAction, we'll need to get all the relevant shards across each cluster and query them (sendExecuteFirstPhase).
We change expectedSuccessfulOps and expectedTotalOps to the multi-cluster counts so that in onFirstPhaseResult we know when to move on (innerMoveToSecondPhase).
In moveToSecondPhase, we again use the metadata from each cluster and do the actual fetch of the documents.
Finally in innerFinishHim, we merge all the results with the SearchPhaseController and return the response via the normally.
Thanks!
The text was updated successfully, but these errors were encountered:
Just noticed an interesting commit that addresses this issue: #4708
Definitely a more thought out approach since it merges the cluster states, but requires a new "tribe" node. I wonder how it will works for single cluster actions that do not require a merged cluster state.
ES with Multi-Cluster Search Support
It'd be nice to be able to query across multiple clusters and get their aggregated results.
Our initial motivation is to view Kibana results across multiple clusters. See: Enhancement: Allow conntections to multiple ES backends from a single Kibana instance.
However, since we also use/query ES directly, a proxy would work far better than changing Kibana.
In the discussion above, it was decided (by Shay) that the place to do it would be at the ES level.
The following is a proposal to achieve it at the ES level. This is some very early planning and I've decided to post it early to make sure I'm not duplicating work or am on a totally wrong path (which is totally possible). Any feedback/comments are appreciated.
Proposal
The general plan is to make a query only node (termed 'search load balancer' in
elasticsearch.yml
) and have a list of cluster names that we want to query.During a search, the node will query from each of the shards in each of the clusters and aggregate the results.
Details
elasticsearch.yml
node.master: false
node.data: false
ZenDiscover
reach out to all the listed clusters to get their state.ClusterService
will contain a map ofClusterName -> ClusterState
.clusterService.state()
will just return the first cluster.MulticastZenPing
will have to listen for changes in the listed clusters and update the corresponding cluster state.TransportSearchTypeAction
andTransportSearchQueryThenFetchAction
so far.BaseAsyncAction
, we'll need to get all the relevant shards across each cluster and query them (sendExecuteFirstPhase
).expectedSuccessfulOps
andexpectedTotalOps
to the multi-cluster counts so that inonFirstPhaseResult
we know when to move on (innerMoveToSecondPhase
).moveToSecondPhase
, we again use the metadata from each cluster and do the actual fetch of the documents.innerFinishHim
, we merge all the results with theSearchPhaseController
and return the response via the normally.Thanks!
The text was updated successfully, but these errors were encountered: