Add support for template construction compatible with kubernetes hints #135624

Closed
ChrsMark opened this issue Jul 4, 2022 · 17 comments
Labels
enhancement New value added to drive a business result Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@ChrsMark
Member

ChrsMark commented Jul 4, 2022

As discussed at elastic/elastic-agent#613 (comment), Fleet UI can be enhanced to provide specific input templates that can be enabled and populated by the hints-based autodiscovery implemented in the kubernetes provider.

Fleet UI should be able to produce an inputs.d ConfigMap like the following:

apiVersion: v1
kind: ConfigMap
metadata:
  name: elastic-agent-standalone-inputs
data:
  redis.yml: |-
    inputs:
     - name: templates.d/redis/0.3.6
       type: redis/metrics
       data_stream.namespace: default
       use_output: default
       streams:
         - data_stream:
             dataset: redis.info
             type: metrics
           metricsets:
           - info
           hosts:
           - "${kubernetes.hints.redis.info.host|'127.0.0.1:6379'}"
           idle_timeout: 20s
           maxconn: 10
           network: tcp
           period: "${kubernetes.hints.redis.info.period|'10s'}"
           condition: ${kubernetes.hints.redis.info.enabled} == true
         - data_stream:
             dataset: redis.key
             type: metrics
           metricsets:
           - key
           hosts:
           - "${kubernetes.hints.redis.key.host|'127.0.0.1'}:${kubernetes.hints.redis.info.port|'6379'}"
           idle_timeout: 20s
           key.patterns:
             - limit: 20
               pattern: '*'
           maxconn: 10
           network: tcp
           period: "${kubernetes.hints.redis.key.period|'10s'}"
           condition: ${kubernetes.hints.redis.key.enabled} == true

So the flow for creating this new ConfigMap is like this:

  1. Fleet retrieves all the available packages/integrations from the Registry.
  2. One by one, Fleet constructs the config blocks using the default values defined in the package spec.
    a. For every setting that is a known "hint", it populates the value with the hint placeholder/variable, like ${kubernetes.hints.redis.info.host}. The fallback should be the default value, so the final value of the setting looks like ${kubernetes.hints.redis.info.host|'127.0.0.1:6379'}.
    b. For every data_stream in the config block, it adds the proper condition so that the stream is enabled only by the hint mechanism: condition: ${kubernetes.hints.redis.key.enabled} == true
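
The flow above can be illustrated with a minimal Python sketch. It is not Fleet's implementation: the function names (hint_placeholder, hint_condition, build_stream) are hypothetical, the stream is a plain dict rather than a real package spec, and it assumes that a setting's name equals its hint name (the example ConfigMap maps the hosts setting to a host hint, which this sketch does not handle).

```python
from typing import Any, Dict, Set


def hint_placeholder(package: str, data_stream: str, setting: str, default: Any) -> str:
    """Return a hint variable that falls back to the package default."""
    return "${kubernetes.hints.%s.%s.%s|'%s'}" % (package, data_stream, setting, default)


def hint_condition(package: str, data_stream: str) -> str:
    """Return the condition that enables a stream only through its hint."""
    return "${kubernetes.hints.%s.%s.enabled} == true" % (package, data_stream)


def build_stream(
    package: str,
    data_stream: str,
    defaults: Dict[str, Any],
    known_hints: Set[str],
    stream_type: str = "metrics",  # assumption: metrics, as in the redis example
) -> Dict[str, Any]:
    """Build one stream block from package defaults (hypothetical helper)."""
    stream: Dict[str, Any] = {
        "data_stream": {"dataset": f"{package}.{data_stream}", "type": stream_type}
    }
    for setting, default in defaults.items():
        if setting in known_hints:
            # Step 2a: known hints become placeholders with the default as fallback.
            stream[setting] = hint_placeholder(package, data_stream, setting, default)
        else:
            # Settings that are not hints keep the plain default value.
            stream[setting] = default
    # Step 2b: the stream is only enabled by the hint mechanism.
    stream["condition"] = hint_condition(package, data_stream)
    return stream


stream = build_stream("redis", "info", {"period": "10s", "maxconn": 10}, {"period"})
print(stream["period"])
print(stream["condition"])
```

Here only period is treated as a known hint, so maxconn keeps its default of 10 while period and condition come out in the same form as in the redis.info stream of the ConfigMap above.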

This ConfigMap is meant to be mounted manually at elastic-agent-standalone/elastic-agent-standalone-daemonset-configmap.yaml, and also to be included in the full manifest that Fleet UI constructs, as implemented by #114439.

This is related to elastic/elastic-agent#662.

@ChrsMark ChrsMark added enhancement New value added to drive a business result Team:Fleet Team label for Observability Data Collection Fleet team labels Jul 4, 2022
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@joshdover
Contributor

  • Fleet retrieves all the available packages/integrations from the Registry.

I hope we don't mean every package? This is not going to scale well as we get to 1k+ packages. Fleet can't be downloading all of the packages to produce this. I think we need either:

  • The user to select which integrations they may want to use; and/or
  • A default, constrained set of popular packages that we want to support out of the box, with the ability to add additional packages

Another question is where in the UI should this be shown? Only in the standalone agent configuration UI?

@ChrsMark
Member Author

ChrsMark commented Jul 5, 2022

  • Fleet retrieves all the available packages/integrations from the Registry.

I hope we don't mean every package? This is not going to scale well as we get to 1k+ packages. Fleet can't be downloading all of the packages to produce this. I think we need either:

How large would this information be in bytes? Note that we only care about the package spec and not the assets at this point. Could a caching mechanism in Kibana help here?

  • The user to select which integrations they may want to use; and/or
  • A default, constrained set of popular packages that we want to support out of the box, with the ability to add additional packages

I think that would work, but it would not comply with the goal of the feature, which is to provide minimal configuration steps. Imagine that these templates could be completely hidden from the user, since they only act as a low-level implementation detail. However, if including all of them proves not to be performant, we need to revisit and reconsider this.

Another question is where in the UI should this be shown? Only in the standalone agent configuration UI?

Yes, to my mind this should be shown only in the standalone agent configuration UI.

@mlunadia @gizas any thoughts on the above comments/concerns?

@kpollich
Member

kpollich commented Jul 5, 2022

How large would this information be in bytes? Note that we only care about the package spec and not the assets at this point. Could a caching mechanism in Kibana help here?

We have the package registry API that can provide some info, e.g. https://epr.elastic.co/search?experimental=true, but this doesn't include detailed information about variables, like default values, types, etc. To resolve that level of detail, we need to either query the list linked above and then query the API for each individual package, e.g. https://epr.elastic.co/package/1password/1.4.0/, or download every package.

Fleet does some in-memory caching of downloaded packages to save on repeated downloads of packages, but that doesn't help us in the "cold start" case where we need to download every single package to determine if and how Fleet should be generating Kubernetes templates for them. I agree with @joshdover's points above that we need some way to limit the list of packages we're querying here, either by user input or by a hardcoded allow list.

Packages can easily be several megabytes on average, and if a package ships with prebuilt assets like ML jobs it can be quite a bit larger.
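
To make the cost of the first option concrete, here is a small hedged sketch of the two-step lookup: one search request, then one detail request per package. The per-package URL pattern is taken from the example URL above; the name and version keys of a search entry are an assumption of this sketch, and the helper names are hypothetical. No network access is performed.

```python
from typing import Dict, List

BASE_URI = "https://epr.elastic.co"


def package_detail_url(entry: Dict[str, str], base_uri: str = BASE_URI) -> str:
    """Build the per-package URL from a search entry.

    Assumes each entry carries 'name' and 'version' keys.
    """
    return f"{base_uri}/package/{entry['name']}/{entry['version']}/"


def count_requests(entries: List[Dict[str, str]]) -> int:
    """One search request plus one detail request per listed package."""
    return 1 + len(entries)


# Hand-written sample entries standing in for a real search response.
sample = [
    {"name": "1password", "version": "1.4.0"},
    {"name": "redis", "version": "1.2.0"},
]
urls = [package_detail_url(entry) for entry in sample]
print(urls)
print(count_requests(sample))
```

The request count grows linearly with the number of packages, which is why a cold start without a cache or an allow list is the expensive case.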

@ChrsMark
Member Author

ChrsMark commented Jul 6, 2022

Out of curiosity I drafted the following python script to measure what we are discussing:

get_packages.py
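#!/usr/bin/env python3
"""Download every package listed by the Elastic Package Registry (EPR)."""
import ssl

import requests
import wget

# Disable certificate verification for the downloads done by wget.
# NOTE: this is insecure and only acceptable for a throwaway local experiment.
ssl._create_default_https_context = ssl._create_unverified_context

base_uri = 'https://epr.elastic.co'

# The search endpoint returns a JSON list with one entry per package.
res = requests.get(base_uri + '/search?experimental=true')
res.raise_for_status()

packages = res.json()
for pkg in packages:
    print(pkg)
    print("\n")
    # 'download' holds the path of the package .zip, relative to the registry.
    path = pkg.get('download')
    if not path:
        # Skip entries without a download path instead of crashing.
        continue
    uri = base_uri + path
    print(uri)
    print("\n")
    # Saves the .zip into the current working directory.
    wget.download(uri)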

Running this script from my local machine, I managed to download all the packages in less than 2 minutes, and the total storage used is 117M.

$ time ./packagesTest/get_packages.py 
....
3.10s user 1.89s system 4% cpu 1:43.66 total
$ ls -l packagesTest | wc -l
     154
$ du -sh packagesTest                                       
117M	packagesTest
Downloaded packages
$ ls -lh packagesTest

-rw-r--r--  1 chrismark  staff   855K Jul  6 11:40 1password-1.4.0.zip
-rw-r--r--  1 chrismark  staff   2.5M Jul  6 11:40 activemq-0.3.0.zip
-rw-r--r--  1 chrismark  staff    27K Jul  6 11:40 akamai-1.0.1.zip
-rw-r--r--  1 chrismark  staff   1.4M Jul  6 11:40 apache-1.3.5.zip
-rw-r--r--  1 chrismark  staff   1.0M Jul  6 11:41 apm-8.3.0.zip
-rw-r--r--  1 chrismark  staff    18K Jul  6 11:40 atlassian_bitbucket-1.2.1.zip
-rw-r--r--  1 chrismark  staff    23K Jul  6 11:40 atlassian_confluence-1.3.0.zip
-rw-r--r--  1 chrismark  staff    22K Jul  6 11:40 atlassian_jira-1.3.0.zip
-rw-r--r--  1 chrismark  staff   365K Jul  6 11:40 auditd-3.1.0.zip
-rw-r--r--  1 chrismark  staff   823K Jul  6 11:40 auditd_manager-1.0.0.zip
-rw-r--r--  1 chrismark  staff   2.1M Jul  6 11:40 auth0-1.0.0.zip
-rw-r--r--  1 chrismark  staff   4.9M Jul  6 11:40 aws-1.17.1.zip
-rw-r--r--  1 chrismark  staff    15K Jul  6 11:41 aws_logs-0.2.3.zip
-rw-r--r--  1 chrismark  staff   348K Jul  6 11:40 awsfargate-0.1.1.zip
-rw-r--r--  1 chrismark  staff   818K Jul  6 11:40 azure-1.1.10.zip
-rw-r--r--  1 chrismark  staff   488K Jul  6 11:40 azure_application_insights-1.0.1.zip
-rw-r--r--  1 chrismark  staff   196K Jul  6 11:40 azure_billing-1.0.1.zip
-rw-r--r--  1 chrismark  staff   1.8M Jul  6 11:40 azure_metrics-1.0.5.zip
-rw-r--r--  1 chrismark  staff   249K Jul  6 11:40 barracuda-0.9.0.zip
-rw-r--r--  1 chrismark  staff   124K Jul  6 11:40 bluecoat-0.8.0.zip
-rw-r--r--  1 chrismark  staff   241K Jul  6 11:42 carbon_black_cloud-1.0.3.zip
-rw-r--r--  1 chrismark  staff    28K Jul  6 11:42 carbonblack_edr-1.3.0.zip
-rw-r--r--  1 chrismark  staff   1.0M Jul  6 11:40 cassandra-1.1.0.zip
-rw-r--r--  1 chrismark  staff   123K Jul  6 11:40 cef-2.0.3.zip
-rw-r--r--  1 chrismark  staff    91K Jul  6 11:40 checkpoint-1.5.1.zip
-rw-r--r--  1 chrismark  staff   1.1M Jul  6 11:40 cisco-0.12.5.zip
-rw-r--r--  1 chrismark  staff   825K Jul  6 11:40 cisco_asa-2.4.2.zip
-rw-r--r--  1 chrismark  staff   474K Jul  6 11:40 cisco_duo-1.2.4.zip
-rw-r--r--  1 chrismark  staff    43K Jul  6 11:40 cisco_ftd-2.2.2.zip
-rw-r--r--  1 chrismark  staff    23K Jul  6 11:40 cisco_ios-1.6.0.zip
-rw-r--r--  1 chrismark  staff   159K Jul  6 11:40 cisco_ise-0.1.0.zip
-rw-r--r--  1 chrismark  staff   1.8M Jul  6 11:40 cisco_meraki-0.5.1.zip
-rw-r--r--  1 chrismark  staff   180K Jul  6 11:40 cisco_nexus-0.5.1.zip
-rw-r--r--  1 chrismark  staff   228K Jul  6 11:40 cisco_secure_email_gateway-0.1.0.zip
-rw-r--r--  1 chrismark  staff    27K Jul  6 11:40 cisco_secure_endpoint-2.4.1.zip
-rw-r--r--  1 chrismark  staff    27K Jul  6 11:40 cisco_umbrella-1.0.1.zip
-rw-r--r--  1 chrismark  staff   1.6M Jul  6 11:40 cloud_security_posture-0.0.16.zip
-rw-r--r--  1 chrismark  staff   947K Jul  6 11:41 cloudflare-2.0.1.zip
-rw-r--r--  1 chrismark  staff   285K Jul  6 11:41 cockroachdb-0.2.0.zip
-rw-r--r--  1 chrismark  staff   977K Jul  6 11:41 crowdstrike-1.3.4.zip
-rw-r--r--  1 chrismark  staff   156K Jul  6 11:41 cyberark-0.4.4.zip
-rw-r--r--  1 chrismark  staff   547K Jul  6 11:41 cyberarkpas-2.4.2.zip
-rw-r--r--  1 chrismark  staff   123K Jul  6 11:41 cylance-0.8.1.zip
-rw-r--r--  1 chrismark  staff    34M Jul  6 11:41 dga-0.0.2.zip
-rw-r--r--  1 chrismark  staff   713K Jul  6 11:41 docker-1.2.0.zip
-rw-r--r--  1 chrismark  staff   860K Jul  6 11:41 elastic_agent-1.3.3.zip
-rw-r--r--  1 chrismark  staff    97K Jul  6 11:41 elasticsearch-0.2.0.zip
-rw-r--r--  1 chrismark  staff   222K Jul  6 11:41 endpoint-8.3.0.zip
-rw-r--r--  1 chrismark  staff   233K Jul  6 11:41 f5-0.9.0.zip
-rw-r--r--  1 chrismark  staff    17K Jul  6 11:41 fim-1.0.0.zip
-rw-r--r--  1 chrismark  staff    24K Jul  6 11:41 fireeye-1.4.0.zip
-rw-r--r--  1 chrismark  staff   3.7K Jul  6 11:41 fleet_server-1.2.0.zip
-rw-r--r--  1 chrismark  staff   402K Jul  6 11:41 fortinet-1.6.2.zip
-rw-r--r--  1 chrismark  staff   602K Jul  6 11:41 gcp-1.9.2.zip
-rw-r--r--  1 chrismark  staff    11K Jul  6 11:41 gcp_pubsub-1.0.1.zip
-rw-r--r--  1 chrismark  staff   402B Jul  6 11:35 get_packages.py
-rw-r--r--  1 chrismark  staff   761K Jul  6 11:41 github-1.0.2.zip
-rw-r--r--  1 chrismark  staff    95K Jul  6 11:41 google_workspace-1.5.1.zip
-rw-r--r--  1 chrismark  staff   236K Jul  6 11:41 haproxy-0.7.0.zip
-rw-r--r--  1 chrismark  staff   1.0M Jul  6 11:41 hashicorp_vault-1.4.0.zip
-rw-r--r--  1 chrismark  staff   1.1M Jul  6 11:41 hid_bravura_monitor-1.0.3.zip
-rw-r--r--  1 chrismark  staff   8.2K Jul  6 11:41 http_endpoint-1.1.0.zip
-rw-r--r--  1 chrismark  staff   9.1K Jul  6 11:41 httpjson-1.2.4.zip
-rw-r--r--  1 chrismark  staff   1.6M Jul  6 11:41 iis-0.8.0.zip
-rw-r--r--  1 chrismark  staff   112K Jul  6 11:41 imperva-0.8.0.zip
-rw-r--r--  1 chrismark  staff   159K Jul  6 11:41 infoblox-0.8.0.zip
-rw-r--r--  1 chrismark  staff   151K Jul  6 11:41 infoblox_nios-0.1.0.zip
-rw-r--r--  1 chrismark  staff   1.4M Jul  6 11:41 iptables-0.10.1.zip
-rw-r--r--  1 chrismark  staff    11K Jul  6 11:41 journald-0.0.2.zip
-rw-r--r--  1 chrismark  staff   715K Jul  6 11:41 juniper-1.1.0.zip
-rw-r--r--  1 chrismark  staff   279K Jul  6 11:41 juniper_junos-0.2.1.zip
-rw-r--r--  1 chrismark  staff   381K Jul  6 11:41 juniper_netscreen-0.2.0.zip
-rw-r--r--  1 chrismark  staff    68K Jul  6 11:41 juniper_srx-1.3.1.zip
-rw-r--r--  1 chrismark  staff   298K Jul  6 11:41 kafka-1.2.2.zip
-rw-r--r--  1 chrismark  staff    21K Jul  6 11:41 keycloak-1.3.1.zip
-rw-r--r--  1 chrismark  staff    23K Jul  6 11:41 kibana-1.0.2.zip
-rw-r--r--  1 chrismark  staff   1.5M Jul  6 11:41 kubernetes-1.21.1.zip
-rw-r--r--  1 chrismark  staff   591K Jul  6 11:41 linux-0.6.7.zip
-rw-r--r--  1 chrismark  staff   5.6K Jul  6 11:41 log-1.0.0.zip
-rw-r--r--  1 chrismark  staff   534K Jul  6 11:41 logstash-1.1.0.zip
-rw-r--r--  1 chrismark  staff    22K Jul  6 11:41 m365_defender-1.0.4.zip
-rw-r--r--  1 chrismark  staff    16K Jul  6 11:41 mattermost-1.2.0.zip
-rw-r--r--  1 chrismark  staff   854K Jul  6 11:41 microsoft-1.1.0.zip
-rw-r--r--  1 chrismark  staff   741K Jul  6 11:41 microsoft_defender_endpoint-2.2.1.zip
-rw-r--r--  1 chrismark  staff    17K Jul  6 11:41 microsoft_dhcp-1.4.2.zip
-rw-r--r--  1 chrismark  staff   1.0M Jul  6 11:41 microsoft_sqlserver-1.1.1.zip
-rw-r--r--  1 chrismark  staff   152K Jul  6 11:41 mimecast-1.0.0.zip
-rw-r--r--  1 chrismark  staff    48K Jul  6 11:41 modsecurity-1.0.0.zip
-rw-r--r--  1 chrismark  staff   176K Jul  6 11:41 mongodb-1.3.1.zip
-rw-r--r--  1 chrismark  staff   729K Jul  6 11:41 mysql-1.2.1.zip
-rw-r--r--  1 chrismark  staff    21K Jul  6 11:41 mysql_enterprise-1.0.1.zip
-rw-r--r--  1 chrismark  staff   1.4M Jul  6 11:41 nats-1.2.0.zip
-rw-r--r--  1 chrismark  staff   134K Jul  6 11:41 netflow-2.0.1.zip
-rw-r--r--  1 chrismark  staff   122K Jul  6 11:40 netscout-0.8.0.zip
-rw-r--r--  1 chrismark  staff   385K Jul  6 11:41 netskope-1.0.1.zip
-rw-r--r--  1 chrismark  staff   354K Jul  6 11:41 network_traffic-1.3.1.zip
-rw-r--r--  1 chrismark  staff   1.8M Jul  6 11:41 nginx-1.3.1.zip
-rw-r--r--  1 chrismark  staff   1.6M Jul  6 11:41 nginx_ingress_controller-1.2.0.zip
-rw-r--r--  1 chrismark  staff   707K Jul  6 11:41 o365-1.6.0.zip
-rw-r--r--  1 chrismark  staff   464K Jul  6 11:41 okta-1.8.0.zip
-rw-r--r--  1 chrismark  staff    18K Jul  6 11:41 oracle-1.0.2.zip
-rw-r--r--  1 chrismark  staff   628K Jul  6 11:41 osquery-1.3.0.zip
-rw-r--r--  1 chrismark  staff   109K Jul  6 11:41 osquery_manager-1.3.1.zip
-rw-r--r--  1 chrismark  staff   1.8M Jul  6 11:41 panw-2.2.2.zip
-rw-r--r--  1 chrismark  staff    26K Jul  6 11:41 panw_cortex_xdr-1.2.1.zip
-rw-r--r--  1 chrismark  staff   873K Jul  6 11:42 pfsense-1.0.3.zip
-rw-r--r--  1 chrismark  staff   676K Jul  6 11:41 postgresql-1.2.0.zip
-rw-r--r--  1 chrismark  staff   2.4M Jul  6 11:41 problemchild-0.0.2.zip
-rw-r--r--  1 chrismark  staff   713K Jul  6 11:42 prometheus-0.7.0.zip
-rw-r--r--  1 chrismark  staff   147K Jul  6 11:42 proofpoint-0.7.0.zip
-rw-r--r--  1 chrismark  staff   241K Jul  6 11:42 proofpoint_tap-0.1.0.zip
-rw-r--r--  1 chrismark  staff    26K Jul  6 11:42 pulse_connect_secure-1.0.1.zip
-rw-r--r--  1 chrismark  staff    24K Jul  6 11:42 qnap_nas-1.2.1.zip
-rw-r--r--  1 chrismark  staff    46K Jul  6 11:42 rabbitmq-1.2.0.zip
-rw-r--r--  1 chrismark  staff   119K Jul  6 11:42 radware-0.7.0.zip
-rw-r--r--  1 chrismark  staff   319K Jul  6 11:42 redis-1.2.0.zip
-rw-r--r--  1 chrismark  staff   548K Jul  6 11:41 santa-3.1.0.zip
-rw-r--r--  1 chrismark  staff   985K Jul  6 11:42 security_detection_engine-8.1.1.zip
-rw-r--r--  1 chrismark  staff   283K Jul  6 11:42 sentinel_one-0.1.0.zip
-rw-r--r--  1 chrismark  staff    29K Jul  6 11:42 snort-0.3.1.zip
-rw-r--r--  1 chrismark  staff    31K Jul  6 11:42 snyk-1.2.1.zip
-rw-r--r--  1 chrismark  staff   192K Jul  6 11:42 sonicwall-0.8.1.zip
-rw-r--r--  1 chrismark  staff   658K Jul  6 11:42 sonicwall_firewall-0.1.1.zip
-rw-r--r--  1 chrismark  staff   198K Jul  6 11:42 sophos-2.2.2.zip
-rw-r--r--  1 chrismark  staff   111K Jul  6 11:42 squid-0.8.0.zip
-rw-r--r--  1 chrismark  staff   312K Jul  6 11:42 stan-1.2.0.zip
-rw-r--r--  1 chrismark  staff   602K Jul  6 11:42 suricata-2.1.0.zip
-rw-r--r--  1 chrismark  staff   280K Jul  6 11:42 symantec-0.1.3.zip
-rw-r--r--  1 chrismark  staff   300K Jul  6 11:42 symantec_endpoint-1.0.1.zip
-rw-r--r--  1 chrismark  staff   113K Jul  6 11:41 synthetics-0.9.4.zip
-rw-r--r--  1 chrismark  staff   1.0M Jul  6 11:42 system-1.16.2.zip
-rw-r--r--  1 chrismark  staff   6.6K Jul  6 11:41 tcp-1.1.0.zip
-rw-r--r--  1 chrismark  staff   140K Jul  6 11:42 tenable_sc-1.2.2.zip
-rw-r--r--  1 chrismark  staff    57K Jul  6 11:40 ti_abusech-1.3.2.zip
-rw-r--r--  1 chrismark  staff   320K Jul  6 11:40 ti_anomali-1.3.3.zip
-rw-r--r--  1 chrismark  staff    74K Jul  6 11:41 ti_cybersixgill-1.4.1.zip
-rw-r--r--  1 chrismark  staff    42K Jul  6 11:41 ti_misp-1.4.1.zip
-rw-r--r--  1 chrismark  staff    29K Jul  6 11:40 ti_otx-1.3.2.zip
-rw-r--r--  1 chrismark  staff    22K Jul  6 11:42 ti_recordedfuture-1.0.1.zip
-rw-r--r--  1 chrismark  staff    31K Jul  6 11:42 ti_threatq-1.3.2.zip
-rw-r--r--  1 chrismark  staff   116K Jul  6 11:40 tomcat-1.4.1.zip
-rw-r--r--  1 chrismark  staff   560K Jul  6 11:42 traefik-1.2.0.zip
-rw-r--r--  1 chrismark  staff   6.5K Jul  6 11:41 udp-1.1.1.zip
-rw-r--r--  1 chrismark  staff   1.8M Jul  6 11:42 vsphere-0.1.0.zip
-rw-r--r--  1 chrismark  staff   325K Jul  6 11:42 windows-1.12.4.zip
-rw-r--r--  1 chrismark  staff    16K Jul  6 11:41 winlog-1.5.2.zip
-rw-r--r--  1 chrismark  staff   1.8M Jul  6 11:42 zeek-2.1.0.zip
-rw-r--r--  1 chrismark  staff    14K Jul  6 11:42 zerofox-1.3.1.zip
-rw-r--r--  1 chrismark  staff   481K Jul  6 11:42 zookeeper-1.2.0.zip
-rw-r--r--  1 chrismark  staff    28K Jul  6 11:42 zoom-1.3.1.zip
-rw-r--r--  1 chrismark  staff   112K Jul  6 11:42 zscaler-0.5.1.zip
-rw-r--r--  1 chrismark  staff   320K Jul  6 11:42 zscaler_zia-2.1.0.zip
-rw-r--r--  1 chrismark  staff   370K Jul  6 11:42 zscaler_zpa-1.0.0.zip

Are those numbers expected @kpollich ? I wonder if those numbers are actually risky in terms of performance, since this action should take place only once on Kibana's "first" load time and then the constructed ConfigMap can be cached. Would a background job along with the caching help here?

If these indicators are concerning, then I would take a step back and reconsider the approach/solution. Regarding the specific comment

I agree with @joshdover's points above that we need some way to limit the list of packages we're querying here, either by user input or by a hardcoded allow list.

how do you envision this selection? Would that mean that by default we only select some packages, but we also give users the option to select and download all of them? Wouldn't that lead to the risk of downloading everything again if users choose "select all"?

One thing that I would like to make clear here is the purpose of this feature. Hints-based autodiscovery serves the cases where users want full automation and no "restarts", with as little configuration as possible. So having users select/deselect the packages to be included makes us diverge from that purpose. In addition, if for any reason users want to add something more, they will need to go back to Fleet UI, regenerate the templates, and finally restart the Agent.
This is more of a hybrid approach and not fully automated based on hints.

Having said this, I think that if Kibana and Fleet UI cannot solve this issue efficiently we need to reconsider.

Some quick alternatives here:

  1. Would it be possible, instead of downloading all the artifacts for the packages, to only download the packages' spec from https://github.com/elastic/package-storage/tree/production/packages directly?
  2. I would even consider forgetting about supporting this in Fleet UI and implementing the template+ConfigMap construction in elastic-package. Then we will have something like elastic-package createk8sTemplates, which would provide us with the wanted ConfigMap. With something like this, we could even have a nightly job that uploads this ConfigMap to https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone if there is any diff. Then the only thing that users need to do is download https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone and deploy, which is exactly what they are doing today. Keep in mind that even today the standalone policy we provide is quite static and not frequently updated at https://github.com/elastic/elastic-agent/blob/main/deploy/kubernetes/elastic-agent-standalone/elastic-agent-standalone-daemonset-configmap.yaml#L27, but this is somewhat expected when it comes to the standalone experience.

cc: @gizas

@MichaelKatsoulis
Contributor

I would even consider forgetting about supporting this in Fleet UI and implementing the template+ConfigMap construction in elastic-package. Then we will have something like elastic-package createk8sTemplates, which would provide us with the wanted ConfigMap.

This is a good idea. Each time there is a new update in one of the packages (in the vars of the data streams?), a new ConfigMap will be constructed, and a PR can be opened against the Kibana project as well to update https://github.com/elastic/kibana/blob/main/x-pack/plugins/fleet/server/services/elastic_agent_manifest.ts#L8, which is currently used for the standalone agent. This is not expected to happen very often.

@ChrsMark
Member Author

ChrsMark commented Jul 6, 2022

To my mind, updating https://github.com/elastic/kibana/blob/main/x-pack/plugins/fleet/server/services/elastic_agent_manifest.ts#L8 is another story that is unrelated to the templates construction and should be handled on top of this. For example, even today, if we change https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone then Kibana's part will be outdated. Based on this, maybe updating Kibana's part should happen in any case if changes are detected at https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone.

@gizas
Contributor

gizas commented Jul 6, 2022

Trying to follow all of the above, I synced with Christos on the small details. Just some clarifications:

  • This story focuses solely on standalone (for the managed agent I want to hide some of the above steps, but that is another story)
  • Agreed that updating elastic_agent_manifest.ts#L8 adds more complexity here; we can track it separately
  • I would add the idea of creating the k8s template on the EPR side. As long as EPR downloads packages, why can't we create the templates there and serve them with another API request?

@mtojek
Contributor

mtojek commented Jul 6, 2022

Ok, I see that this thread expanded quickly, so let me clarify a few things, as Ecosystem owns elastic-package and package-registry.

Would it be possible, instead of downloading all the artifacts for the packages, to only download the packages' spec from https://github.com/elastic/package-storage/tree/production/packages directly?

package-storage as a repository will be deprecated soon. By soon, I mean the end of July/August. We will switch to https://package-storage.elastic.co/, which is based on buckets. I strongly recommend not considering package-storage v1 (Git) as a component.

We have the package registry API that can provide some info, e.g. https://epr.elastic.co/search?experimental=true, but this doesn't include detailed information about variables, like default values, types, etc. To resolve that level of detail, we need to either query the list linked above and then query the API for each individual package, e.g. https://epr.elastic.co/package/1password/1.4.0/, or download every package.

We don't plan to extend EPR to perform any extra logic apart from serving package indices and redirecting to package-storage to download .zip or static artifacts.

I would add the idea of creating the k8s template on the EPR side. As long as EPR downloads packages, why can't we create the templates there and serve them with another API request?

EPR is intended to be a static component with a simple search facility. We don't aim to put extra processing logic there.

@gizas
Contributor

gizas commented Jul 6, 2022

So you don't leave us many possibilities there :)

I guess the only 2 final candidates are:

  • To create the templates on the Kibana side, if @kpollich agrees that no performance issue would apply
  • To create them on a daily basis and keep them somewhere they can be picked up by Kibana on start

Also @mtojek how about the part:

2. and implement the template+ConfigMap construction in elastic-package. Then we will have something like elastic-package createk8sTemplates, which would provide us with the wanted ConfigMap

Can we plan for it? Do you see any issues?

@mtojek
Contributor

mtojek commented Jul 6, 2022

  2. and implement the template+ConfigMap construction in elastic-package. Then we will have something like elastic-package createk8sTemplates, which would provide us with the wanted ConfigMap

It is something I'd like to understand better as elastic-package's actions refer to development lifecycle (build, lint, format, test, stack, etc.). I don't see how the createk8sTemplates action fits there, but maybe we can evaluate/rephrase it.

@ChrsMark
Member Author

ChrsMark commented Jul 6, 2022

  2. and implement the template+ConfigMap construction in elastic-package. Then we will have something like elastic-package createk8sTemplates, which would provide us with the wanted ConfigMap

It is something I'd like to understand better as elastic-package's actions refer to development lifecycle (build, lint, format, test, stack, etc.). I don't see how the createk8sTemplates action fits there, but maybe we can evaluate/rephrase it.

The goal here is simple. We want to produce a static kubernetes ConfigMap with the templates from the latest versions of packages (for now). In order to make it available to our users we can store it in the upstream repository at https://github.com/elastic/elastic-agent/blob/main/deploy/kubernetes/elastic-agent-standalone/elastic-agent-standalone-daemonset-configmap.yaml. Our official docs at the moment redirect our users to download our proposed manifests from there (see docs).

So a developer from the cloudnative team, which mainly maintains these manifests, would need a tool to automate the construction of this ConfigMap. This is where elastic-package comes into play.

In order to automate this process even more, once the tooling is implemented we can add a nightly automation run that re-runs the elastic-package createk8sTemplates command and checks for diffs against the upstream at https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone, opening a PR if there is a diff.
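
The diff check of such a nightly run could look roughly like the following sketch. It only covers the comparison step, using Python's standard difflib module; fetching the upstream file, running the (proposed, not yet existing) createk8sTemplates command, and opening the PR are out of scope, and the function names are hypothetical.

```python
import difflib


def configmap_diff(upstream: str, generated: str) -> str:
    """Return a unified diff between the upstream and the regenerated ConfigMap."""
    lines = difflib.unified_diff(
        upstream.splitlines(keepends=True),
        generated.splitlines(keepends=True),
        fromfile="upstream",
        tofile="generated",
    )
    return "".join(lines)


def needs_pull_request(upstream: str, generated: str) -> bool:
    """A PR is only needed when the regenerated ConfigMap differs."""
    return bool(configmap_diff(upstream, generated))


# Stand-in contents; a real job would read these from files.
upstream_text = "period: 10s\nmaxconn: 10\n"
generated_text = "period: 20s\nmaxconn: 10\n"

print(needs_pull_request(upstream_text, upstream_text))
print(needs_pull_request(upstream_text, generated_text))
print(configmap_diff(upstream_text, generated_text))
```

Keeping the check as a pure text comparison means the job stays independent of how the templates were produced.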

In this way users following our docs will only have to curl -L -O https://raw.githubusercontent.com/elastic/elastic-agent/8.3/deploy/kubernetes/elastic-agent-standalone-kubernetes.yaml the same way they do today and the hints feature will be available for them in a transparent way.

To make the proposal more complete, Kibana's side should be synced with https://raw.githubusercontent.com/elastic/elastic-agent/8.3/deploy/kubernetes/elastic-agent-standalone-kubernetes.yaml from time to time in order to keep the "hardcoded" manifest (https://github.com/elastic/kibana/blob/main/x-pack/plugins/fleet/server/services/elastic_agent_manifest.ts#L8) up to date. But this need exists even today, since the hardcoded manifest is not updated if something changes at https://raw.githubusercontent.com/elastic/elastic-agent/8.3/deploy/kubernetes/elastic-agent-standalone-kubernetes.yaml. Again, the creation and existence of the new templates' ConfigMap is an implementation detail and hidden from the end user.

@kpollich
Member

kpollich commented Jul 6, 2022

I wonder if those numbers are actually risky in terms of performance, since this action should take place only once on Kibana's "first" load time and then the constructed ConfigMap can be cached. Would a background job along with the caching help here?

Fleet's setup process on boot blocks Kibana's healthy status, so adding 2+ minutes of degraded time to Kibana on boot is a nonstarter here. A background job seems like a better fit if we place responsibility for the generation of these ConfigMap objects on Kibana.

To create them on a daily basis and keep them somewhere they can be picked up by Kibana on start

This is a better solution in my mind. If the ConfigMap object is truly a static list of every single package that supports Kubernetes autodiscovery hints, it doesn't seem necessary for Kibana to generate that list "on-demand". I expect the rate of change for these hints to be fairly slow, so handling them through a CI job seems a lot better to me.

how do you envision this selection? Would that mean that by default we only select some packages, but we also give users the option to select and download all of them? Wouldn't that lead to the risk of downloading everything again if users choose "select all"?

I guess I just don't fully understand the use case here. To me, it seems like there'd be an overwhelming amount of config I might not need in this ConfigMap object. For example if we include k8's hints in 10-15 packages the ConfigMap is going to include definition blocks for each of them. To me, this seems like a lot of noise and area for confusion - but then again I am a total novice with Kubernetes, so my understanding of this use case is limited.

You are correct though. If we allow selection here we still run the risk of downloading all packages in order to resolve the default for each variable.

The solution of a static ConfigMap maintained by CI feels the safest to me.

@mtojek
Contributor

mtojek commented Jul 6, 2022

Folks, I'm afraid that you're forgetting about the scaling factor. We need to think about the situation where we have 1kk packages. Do we want to keep updating config maps at that scale?

Also, how do you plan to support those config maps if the format depends on the Elastic stack version?

I suggest going back to square one and rethinking the procedure. Generating templates on a nightly basis and introducing coupling between packages and Fleet doesn't sound like a safe choice. What if we start accepting community packages? We won't be able to store information about community pkgs in Kibana.

@ChrsMark
Member Author

ChrsMark commented Jul 6, 2022

@kpollich @gizas fyi, we had a chat with @mtojek to make things clearer. What we will be evaluating is implementing the template construction compatible with kubernetes hints in a CI component, similar to what we have for buckets' indexing etc.

In that case the ConfigMap with the templates will be constructed asynchronously by a job and will be available to our users through https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone. With that we have the same UX that we have today, as described at https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html, and Kibana/Fleet are not affected in any way.

We will only support "templates" based on the latest packages, since the logic in standalone is decoupled from packages' updates/versions etc., and we just need input policies that work with the defined Agent. This would mean that for 8.3 we will ship a manifest available at https://raw.githubusercontent.com/elastic/elastic-agent/8.3/deploy/kubernetes/elastic-agent-standalone-kubernetes.yaml which will include templates that are compatible with version 8.3. This is a convention that we will handle at construction time.

Regarding scaling, since we will only be including the "latest" compatible packages, at the moment we are talking about ~150 packages and hence ~150 input templates. As we scale, we can consider selection options, but we don't foresee any crucial blocker here.
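
As a rough illustration of keeping only the "latest" version per package, here is a minimal sketch. It assumes plain numeric MAJOR.MINOR.PATCH versions (no pre-release tags) and does not check compatibility with a given stack version; the helper names are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_versions(entries: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Keep only the highest version seen for each package name."""
    latest: Dict[str, str] = {}
    for name, version in entries:
        if name not in latest or version_key(version) > version_key(latest[name]):
            latest[name] = version
    return latest


# Hand-written sample list of (package, version) pairs.
entries = [
    ("redis", "0.3.6"),
    ("redis", "1.2.0"),
    ("redis", "1.10.0"),
    ("nginx", "1.3.1"),
]
print(latest_versions(entries))
```

With one template per package, the number of input templates equals the number of distinct package names, regardless of how many versions the registry holds.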

Having said this, since we agree on taking the safest approach we can consider this issue as "stalled" for now and close it soon if we have the CI's approach moving forward :).

@gizas
Contributor

gizas commented Jul 6, 2022

Thank you! As all teams are unblocked and there are no issues with performance, sure, we can go with the above. [For all to be synced] The proposal is: CI construction of templates, placing them under https://raw.githubusercontent.com/elastic/elastic-agent/8.3/deploy/kubernetes/elastic-agent-standalone-kubernetes.yaml

We will need this issue to track any work that might be needed in Fleet UI to update manifests etc. after the templates are done.

@ChrsMark ChrsMark assigned ChrsMark and unassigned ChrsMark Jul 11, 2022
@ChrsMark
Member Author

@gizas elastic/elastic-agent#613 seems completed. Do we still need this one for any reason?


7 participants