[BUG] Opensearch SSL transport error, master not discovered or elected yet #54
Comments
I've never seen this issue before. @DandyDeveloper @TheAlgo, any idea about this issue from @alborotogarcia? |
That specific bug can be ignored (it shouldn't have any impact on the cluster working).
Is this actually causing a problem? You mention the cluster being green. If you are just trying to run a single-node cluster:
# # minimum_master_nodes need to be explicitly set when bound on a public IP
# # set to 1 to allow single node clusters
# discovery.zen.minimum_master_nodes: 1
# Setting network.host to a non-loopback address enables the annoying bootstrap checks. "Single-node" mode disables them again.
#discovery.type: single-node
Uncomment these and it'll work. If not, we need the full log. |
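As a rough sketch of the single-node suggestion above, the uncommented lines in opensearch.yml would look like this. The note about which line matters is an assumption based on the Elasticsearch 7.x lineage of OpenSearch 1.x, where, as far as I know, `discovery.zen.minimum_master_nodes` is a legacy setting and `discovery.type: single-node` does the actual work.

```yaml
# opensearch.yml fragment (sketch): the settings from the comment above, uncommented.
# Legacy zen setting; assumed to be ignored on OpenSearch 1.x.
discovery.zen.minimum_master_nodes: 1
# Runs the node as a single-node cluster and relaxes the bootstrap checks
# that a non-loopback network.host would otherwise enforce.
discovery.type: single-node
```

If the chart also injects `cluster.initial_master_nodes` through environment variables, that likely needs to be removed as well, since it is not expected to be combined with single-node discovery.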
I meant the green state is what I expected, not what I actually reached, unfortunately, as securityconfig doesn't get started. Thanks for the help @peterzhuamazon @DandyDeveloper! |
FWIW @DandyDeveloper @peterzhuamazon, I forgot to mention: the internal users and other configurations I added (including LDAP users) work if they're kept in their volumes and I redeploy the Helm chart one more time with no securityConfig.config.data. |
this seems to be the problem
|
@smlx I see. From Kubernetes version 1.9.6 onward, the volumeMounts behavior for secret, configMap, downwardAPI and projected volumes has changed to read-only by default, as stated in kubernetes/kubernetes#62099. But I don't understand why it doesn't complain about a read-only filesystem when I just leave the default chart template. Is it another UID that initiates the process? The current fsGroup is set to 1000, as it is in #9. |
I just deployed locally with your exact values and it's working for me; I'm able to write to that directory.
[opensearch@opensearch-cluster-master-0 ~]$ cd data/
[opensearch@opensearch-cluster-master-0 data]$ ls -l
total 20
-rw-rw-r-- 1 opensearch opensearch 5 Sep 24 01:29 batch_metrics_enabled.conf
-rw-rw-r-- 1 opensearch opensearch 5 Sep 24 01:29 logging_enabled.conf
drwxrwxr-x 3 opensearch opensearch 4096 Sep 24 01:29 nodes
-rw-rw-r-- 1 opensearch opensearch 5 Sep 24 01:29 performance_analyzer_enabled.conf
-rw-rw-r-- 1 opensearch opensearch 5 Sep 24 01:29 rca_enabled.conf
[opensearch@opensearch-cluster-master-0 data]$ ls -l nodes/
total 4
drwxrwxr-x 3 opensearch opensearch 4096 Sep 24 01:39 0
[opensearch@opensearch-cluster-master-0 data]$ ls -l nodes/0
total 4
drwxrwxr-x 2 opensearch opensearch 4096 Sep 24 01:29 _state
-rw-rw-r-- 1 opensearch opensearch 0 Sep 24 01:29 node.lock
Edit: I had a bunch of info here that was redundant and incorrect. I misread volumes :) What k8s version are you running? I'm running the latest in my test cluster here. |
Sorry for the delay @DandyDeveloper, I had some issues with my IdP and had to spend time on it. I am running a 3-node k3s cluster, and yes, I am aware that all config files are needed, otherwise it will complain. I run Longhorn as the storage class, but I suspect that if I switch to subPaths for each file mount it may be less error prone. As you said earlier, it may affect the folder that it gets mounted on. Will report back. |
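A minimal sketch of the subPath idea mentioned above, assuming the security files come from a Secret. The volume name `security-config`, the Secret name `opensearch-security-config`, and the mount path are assumptions for illustration, not values taken from the chart.

```yaml
# Pod spec fragment (sketch). All names and paths below are hypothetical.
containers:
  - name: opensearch
    volumeMounts:
      # Mount each file individually instead of the whole directory, so the
      # rest of the securityconfig directory keeps its image-provided content.
      - name: security-config
        mountPath: /usr/share/opensearch/plugins/opensearch-security/securityconfig/config.yml
        subPath: config.yml
      - name: security-config
        mountPath: /usr/share/opensearch/plugins/opensearch-security/securityconfig/internal_users.yml
        subPath: internal_users.yml
volumes:
  - name: security-config
    secret:
      secretName: opensearch-security-config
```

One trade-off to keep in mind: files mounted with subPath are not refreshed when the Secret changes, so the pod has to be restarted to pick up edits.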
@DandyDeveloper We are running into what I perceive as the same or similar issue with the Would you mind reviewing the permissions in the
-rw-r--r-- 1 opensearch opensearch 452868 Jul 8 22:32 saaj-impl-1.5.2.jar
drwxrwsrwt 3 root opensearch 260 Sep 26 12:24 securityconfig
-rw-r--r-- 1 opensearch opensearch 41203 Jul 8 22:32 slf4j-api-1.7.25.jar
[opensearch@opensearch-cluster-master-0 securityconfig]$ ls -l
total 0
lrwxrwxrwx 1 root opensearch 24 Sep 26 12:24 action_groups.yml -> ..data/action_groups.yml
lrwxrwxrwx 1 root opensearch 16 Sep 26 12:24 audit.yml -> ..data/audit.yml
lrwxrwxrwx 1 root opensearch 17 Sep 26 12:24 config.yml -> ..data/config.yml
lrwxrwxrwx 1 root opensearch 25 Sep 26 12:24 internal_users.yml -> ..data/internal_users.yml
lrwxrwxrwx 1 root opensearch 19 Sep 26 12:24 nodes_dn.yml -> ..data/nodes_dn.yml
lrwxrwxrwx 1 root opensearch 16 Sep 26 12:24 roles.yml -> ..data/roles.yml
lrwxrwxrwx 1 root opensearch 24 Sep 26 12:24 roles_mapping.yml -> ..data/roles_mapping.yml
lrwxrwxrwx 1 root opensearch 18 Sep 26 12:24 tenants.yml -> ..data/tenants.yml
lrwxrwxrwx 1 root opensearch 20 Sep 26 12:24 whitelist.yml -> ..data/whitelist.yml
Not sure if this is the issue, but the content of each of the above files looks correct as per these examples. Of note, we have an older version of the OpenSearch charts that does work using the same values file, with the material difference being this block. |
It seems my previous assumption is incorrect. Applying an older version of the OpenSearch Helm chart with the same values file works even with the same folder and file permissions as above.
-rw-r--r-- 1 opensearch opensearch 452868 Jul 8 22:32 saaj-impl-1.5.2.jar
drwxrwsrwt 3 root opensearch 260 Sep 26 12:44 securityconfig
-rw-r--r-- 1 opensearch opensearch 41203 Jul 8 22:32 slf4j-api-1.7.25.jar
[opensearch@opensearch-cluster-master-0 securityconfig]$ ls -l
total 0
lrwxrwxrwx 1 root opensearch 24 Sep 26 12:44 action_groups.yml -> ..data/action_groups.yml
lrwxrwxrwx 1 root opensearch 16 Sep 26 12:44 audit.yml -> ..data/audit.yml
lrwxrwxrwx 1 root opensearch 17 Sep 26 12:44 config.yml -> ..data/config.yml
lrwxrwxrwx 1 root opensearch 25 Sep 26 12:44 internal_users.yml -> ..data/internal_users.yml
lrwxrwxrwx 1 root opensearch 19 Sep 26 12:44 nodes_dn.yml -> ..data/nodes_dn.yml
lrwxrwxrwx 1 root opensearch 16 Sep 26 12:44 roles.yml -> ..data/roles.yml
lrwxrwxrwx 1 root opensearch 24 Sep 26 12:44 roles_mapping.yml -> ..data/roles_mapping.yml
lrwxrwxrwx 1 root opensearch 18 Sep 26 12:44 tenants.yml -> ..data/tenants.yml
lrwxrwxrwx 1 root opensearch 20 Sep 26 12:44 whitelist.yml -> ..data/whitelist.yml
[opensearch@opensearch-cluster-master-0 securityconfig]$
I'll continue debugging. |
When using the latest version of the OpenSearch chart with the same values file as above, these are the exceptions we receive, which prevent Error
opensearch [2021-09-26T12:26:49,688][DEBUG][o.o.s.c.ConfigurationRepository] [opensearch-cluster-master-0] Try to load config ...
opensearch [2021-09-26T12:26:49,689][DEBUG][o.o.s.c.ConfigurationRepository] [opensearch-cluster-master-0] security index not exists (yet)
opensearch [2021-09-26T12:26:49,689][ERROR][o.o.s.c.ConfigurationLoaderSecurity7] [opensearch-cluster-master-0] Exception while retrieving configuration for [INTERNALUSERS, ACTIONGROUPS, CONFIG, ROLES, ROLESMAPPING, TENANTS, NODESDN, WHITELIST, AUDIT] (index=.opendistro_security)
opensearch org.opensearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];
opensearch at org.opensearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:203) ~[opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.cluster.block.ClusterBlocks.globalBlockedRaiseException(ClusterBlocks.java:189) ~[opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.action.get.TransportMultiGetAction.doExecute(TransportMultiGetAction.java:72) ~[opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.action.get.TransportMultiGetAction.doExecute(TransportMultiGetAction.java:53) ~[opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:192) [opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.indexmanagement.rollup.actionfilter.FieldCapsFilter.apply(FieldCapsFilter.kt:141) [opensearch-index-management-1.0.0.0.jar:1.0.0.0]
opensearch at org.opensearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:190) [opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.security.filter.SecurityFilter.apply0(SecurityFilter.java:234) [opensearch-security-1.0.0.0.jar:1.0.0.0]
opensearch at org.opensearch.security.filter.SecurityFilter.apply(SecurityFilter.java:154) [opensearch-security-1.0.0.0.jar:1.0.0.0]
opensearch at org.opensearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:190) [opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.performanceanalyzer.action.PerformanceAnalyzerActionFilter.apply(PerformanceAnalyzerActionFilter.java:99) [opensearch-performance-analyzer-1.0.0.0.jar:1.0.0.0]
opensearch at org.opensearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:190) [opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.action.support.TransportAction.execute(TransportAction.java:168) [opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.action.support.TransportAction.execute(TransportAction.java:96) [opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.client.node.NodeClient.executeLocally(NodeClient.java:99) [opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.client.node.NodeClient.doExecute(NodeClient.java:88) [opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.client.support.AbstractClient.execute(AbstractClient.java:428) [opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.client.support.AbstractClient.multiGet(AbstractClient.java:546) [opensearch-1.0.0.jar:1.0.0]
opensearch at org.opensearch.security.configuration.ConfigurationLoaderSecurity7.loadAsync(ConfigurationLoaderSecurity7.java:211) [opensearch-security-1.0.0.0.jar:1.0.0.0]
opensearch at org.opensearch.security.configuration.ConfigurationLoaderSecurity7.load(ConfigurationLoaderSecurity7.java:102) [opensearch-security-1.0.0.0.jar:1.0.0.0]
opensearch at org.opensearch.security.configuration.ConfigurationRepository.getConfigurationsFromIndex(ConfigurationRepository.java:375) [opensearch-security-1.0.0.0.jar:1.0.0.0]
opensearch at org.opensearch.security.configuration.ConfigurationRepository.reloadConfiguration0(ConfigurationRepository.java:321) [opensearch-security-1.0.0.0.jar:1.0.0.0]
opensearch at org.opensearch.security.configuration.ConfigurationRepository.reloadConfiguration(ConfigurationRepository.java:306) [opensearch-security-1.0.0.0.jar:1.0.0.0]
opensearch at org.opensearch.security.configuration.ConfigurationRepository$1.run(ConfigurationRepository.java:166) [opensearch-security-1.0.0.0.jar:1.0.0.0]
opensearch at java.lang.Thread.run(Thread.java:832) [?:?]
If this turns out to be a different issue from the one that's the topic of this thread, then I'll open a separate issue. |
I believe I found the issue or, at least, a workaround. It appears the behavior of the If the
- name: discovery.zen.minimum_master_nodes
  value: "1"
- name: discovery.zen.ping.unicast.hosts
  value: "opensearch-cluster-master-headless"
...rather than:
- name: cluster.initial_master_nodes
  value: "opensearch-cluster-master-0,opensearch-cluster-master-1,opensearch-cluster-master-2,"
- name: discovery.seed_hosts
  value: "opensearch-cluster-master-headless"
When using "discovery", the failures described above are present and the security indexes are never created, resulting in a red cluster status. I am not very familiar with zen discovery but would likely prefer it so new nodes can discover the cluster state. However, it does not appear to work. All thoughts and support are welcome.
UPDATE 1: It appears that we should be using
UPDATE 2: I modified the StatefulSet to use |
@mprimeaux @DandyDeveloper I followed your suggestions, and here's what worked for me
and left majorVersion: "", though I still can't log in with my IdP |
@mprimeaux @alborotogarcia I did not try out the config and installation as I am away from work for some time. But I am thinking out loud. Can this be something related to the core engine and not the chart? Maybe we might need to look at the security repository to understand more because ideally |
@TheAlgo Here is the logic. It appears to be, in part, an issue with the chart logic since we SHOULD be using different discovery However, I agree with you that something deeper might be going on, so I will also research the security repository. As for the OpenDistro docs, they seem to be stale, given that the discovery attributes are discovery.zen.ping.unicast.hosts and discovery.seed_hosts as per this in the OpenSearch docs. |
@alborotogarcia Thanks, mate. I will try your suggestion in the above reply. |
@mprimeaux We need to change the Helm logic for sure. As part of #21 we changed it in one place and did not change the others, which seems to be what is breaking. Coming to the |
It appears the setting See the cluster settings logic here. I believe this might be a point of focus for the chart logic. |
@DandyDeveloper Also, an Ingress API upgrade from networking.k8s.io/v1beta1 to networking.k8s.io/v1 is needed on Kubernetes 1.22+ |
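One common way to handle the Ingress API change in a Helm template is to branch on what the cluster serves. This is only a sketch; the file name is hypothetical and it is not how the chart currently does it.

```yaml
# templates/ingress.yaml fragment (sketch): pick the Ingress API version
# based on what the target cluster reports as available.
{{- if .Capabilities.APIVersions.Has "networking.k8s.io/v1/Ingress" }}
apiVersion: networking.k8s.io/v1
{{- else }}
apiVersion: networking.k8s.io/v1beta1
{{- end }}
kind: Ingress
```

Switching the apiVersion alone is not enough: in networking.k8s.io/v1 the backend fields moved from `serviceName`/`servicePort` to `service.name` and `service.port`, and `pathType` is required, so the rest of the template needs a matching branch.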
@alborotogarcia Coincidentally, I noticed this also, and, in addition it seems
{{- if and .Values.ingress.ingressClassName }}
ingressClassName: {{ .Values.ingress.ingressClassName | quote }}
{{- end }}
I will create a new issue and related PR today for the |
Closing this for now as it seems to be resolved by the community. Thanks. |
I'm seeing this error when trying to stand up this infrastructure with the latest images: https://github.com/opensearch-project/data-prepper/tree/main/examples/log-ingestion And using this fluentbit.conf:
It is not clear to me from this thread what modifications I can make to the docker-compose or other settings to resolve this. This is the error:
|
Describe the bug
Can't reproduce default demo setup on kubernetes.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Cluster gets GREEN state
Plugins
Please list all plugins currently enabled.
Screenshots
If applicable, add screenshots to help explain your problem.
Host/Environment (please complete the following information):
Additional context
Add any other context about the problem here.