
[BUG] Repository host runs Ubuntu on Azure/RHEL cluster #1868

Closed
sk4zuzu opened this issue Nov 20, 2020 · 2 comments

sk4zuzu (Contributor) commented on Nov 20, 2020

Describe the bug
While looking at #1824 I used config snippets designed for Epiphany 0.5 with epicli built from the develop branch. Since PR #1687 the snippets in question are no longer compatible with the current code, but we don't assert that in any way! 😱

As a result, epicli apply on Azure/RHEL without an explicitly defined infrastructure/virtual-machine document for the repository VM fails, because the default config for that VM uses the UbuntuServer OS image.
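
A hypothetical workaround would be to define an explicit RHEL override for the repository VM, analogous to the other *-machine-rhel overrides in the config below. The document name repository-machine is an assumption here (the default document name may differ between Epiphany versions), so treat this only as an illustrative sketch:

---
kind: infrastructure/virtual-machine
name: repository-machine-rhel
provider: azure
based_on: repository-machine   # assumed name of the default repository VM document
specification:
  storage_image_reference:
    publisher: RedHat
    offer: RHEL
    sku: 7-RAW
    version: "7.7.2019090418"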

To Reproduce
Steps to reproduce the behavior:

  1. use a config snippet for an older Epiphany version (for example 0.5)
  2. execute epicli apply ...
  3. observe the Ansible error:
{"asctime": "04:35:52", "levelname": "INFO", "name": "cli.engine.ansible.AnsibleCommand", "message": "TASK [repository : Create epirepo repository] *************************************************************************************************************************************************"}
{"asctime": "04:35:52", "levelname": "INFO", "name": "cli.engine.ansible.AnsibleCommand", "message": "task path: /shared/build/test/ansible/roles/repository/tasks/Debian/setup.yml:4"}
{"asctime": "04:35:54", "levelname": "ERROR", "name": "cli.engine.ansible.AnsibleCommand", "message": "fatal: [mop-test-repository-vm-0]: FAILED! => {\"changed\": true, \"cmd\": \"set -o pipefail && /tmp/epi-repository-setup-scripts/create-repository.sh /var/www/html/epirepo true |& tee /tmp/epi-repository-setup-scripts/create-repository.log\", \"delta\": \"0:00:00.064175\", \"end\": \"2020-11-20 04:35:54.116775\", \"msg\": \"non-zero return code\", \"rc\": 2, \"start\": \"2020-11-20 04:35:54.052600\", \"stderr\": \"\", \"stderr_lines\": [], \"stdout\": \"disabling default repositories...\\nlibdpkg-perl not found, installing...\\ndpkg: error: cannot access archive '/var/www/html/epirepo/packages/libdpkg-perl*.deb': No such file or directory\", \"stdout_lines\": [\"disabling default repositories...\", \"libdpkg-perl not found, installing...\", \"dpkg: error: cannot access archive '/var/www/html/epirepo/packages/libdpkg-perl*.deb': No such file or directory\"]}"}

Please notice the *.deb extension in the log. 😱

Expected behavior
Our epicli command should detect that the cluster config is not compatible, inform the user with an error message, and point to the documentation describing the solution to this problem.

Config files

---
kind: epiphany-cluster
name: default
provider: azure
specification:
  admin_user:
    key_path: /shared/ssh/testenvs/id_rsa
    name: operations
  cloud:
    region: North Europe
    subscription_name: XXX
    use_public_ips: true
    use_service_principal: true   
  components:
    kafka:
      count: 1
      machine: kafka-machine-rhel
    kubernetes_master:
      count: 1
      machine: kubernetes-master-machine-rhel
    kubernetes_node:
      count: 3
      machine: kubernetes-node-machine-rhel
    load_balancer:
      count: 1
      machine: lb-machine-rhel  
    logging:
      count: 2
      machine: logging-machine-rhel
    monitoring:
      count: 1
      machine: monitoring-machine-rhel
    postgresql:
      count: 2
      machine: postgresql-machine-rhel
    rabbitmq:
      count: 2
      machine: rabbitmq-machine-rhel
    ignite:
      count: 2
      machine: ignite-machine-rhel
    opendistro_for_elasticsearch:
      count: 2
      machine:  opendistro-machine-rhel
  name: test
  prefix: qa
title: Epiphany cluster Config
---
kind: infrastructure/virtual-machine
name: kafka-machine-rhel
provider: azure
based_on: kafka-machine
specification:
  storage_image_reference:
    publisher: RedHat
    offer: RHEL
    sku: 7-RAW
    version: "7.7.2019090418"
---
kind: infrastructure/virtual-machine
name: kubernetes-master-machine-rhel
provider: azure
based_on: kubernetes-master-machine
specification:
  storage_image_reference:
    publisher: RedHat
    offer: RHEL
    sku: 7-RAW
    version: "7.7.2019090418"
---
kind: infrastructure/virtual-machine
name: kubernetes-node-machine-rhel
provider: azure
based_on: kubernetes-node-machine
specification:
  storage_image_reference:
    publisher: RedHat
    offer: RHEL
    sku: 7-RAW
    version: "7.7.2019090418"
---
kind: infrastructure/virtual-machine
name: logging-machine-rhel
provider: azure
based_on: logging-machine
specification:
  storage_image_reference:
    publisher: RedHat
    offer: RHEL
    sku: 7-RAW
    version: "7.7.2019090418"
---
kind: infrastructure/virtual-machine
name: monitoring-machine-rhel
provider: azure
based_on: monitoring-machine
specification:
  storage_image_reference:
    publisher: RedHat
    offer: RHEL
    sku: 7-RAW
    version: "7.7.2019090418"
---
kind: infrastructure/virtual-machine
name: postgresql-machine-rhel
provider: azure
based_on: postgresql-machine
specification:
  storage_image_reference:
    publisher: RedHat
    offer: RHEL
    sku: 7-RAW
    version: "7.7.2019090418"
---
kind: infrastructure/virtual-machine
name: lb-machine-rhel
provider: azure
based_on: load-balancer-machine
specification:
  storage_image_reference:
    publisher: RedHat
    offer: RHEL
    sku: 7-RAW
    version: "7.7.2019090418"
---
kind: infrastructure/virtual-machine
name: rabbitmq-machine-rhel
provider: azure
based_on: rabbitmq-machine
specification:
  storage_image_reference:
    publisher: RedHat
    offer: RHEL
    sku: 7-RAW
    version: "7.7.2019090418" 
---
kind: infrastructure/virtual-machine
name: ignite-machine-rhel
provider: azure
based_on: ignite-machine
specification:
  storage_image_reference:
    publisher: RedHat
    offer: RHEL
    sku: 7-RAW
    version: "7.7.2019090418" 
---
kind: infrastructure/virtual-machine
name: opendistro-machine-rhel
provider: azure
based_on: logging-machine
specification:
  storage_image_reference:
    publisher: RedHat
    offer: RHEL
    sku: 7-RAW
    version: "7.7.2019090418" 
---
kind: configuration/kubernetes-master
name: default
provider: azure
specification:
  advanced:
    networking:
      plugin: flannel
    certificates:
      location: /etc/kubernetes/pki
      expiration_days: 24855
      renew: true
---
kind: configuration/postgresql
name: default
provider: azure
specification:
  replication:
    enable: true
    user: postgresql-replication-user
    password: postgresql-replication-password
    max_wal_senders: 5
    wal_keep_segments: 32
  additional_components:
    pgbouncer:
      enabled: yes
  extensions:
    pgaudit:
      enabled: yes
title: Postgresql
---        
kind: configuration/rabbitmq
title: "RabbitMQ"
provider: azure
name: default
specification:
  version: 3.7.10
  rabbitmq_user: rabbitmq
  rabbitmq_group: rabbitmq
  logrotate_period: weekly
  logrotate_number: 10
  ulimit_open_files: 65535
  amqp_port: 5672
  rabbitmq_use_longname: true
  rabbitmq_policies: []
  rabbitmq_plugins:
    - rabbitmq_management_agent
    - rabbitmq_management
  custom_configurations: []
  cluster:
    is_clustered: true

OS (please complete the following information):

  • RHEL
  • most likely CentOS

Cloud Environment (please complete the following information):

  • Azure

Additional context
N/A

sk4zuzu added the type/bug, status/grooming-needed and priority/critical (Show-stopper! You better start it now) labels on Nov 20, 2020
przemyslavic (Collaborator) commented:

Note that the user may not explicitly specify all components; in that case default values are used, and the default counts are not always 0. So validating only the input yaml file may not be sufficient, unless we force the user to explicitly specify all components or we validate the manifest file.
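
As a rough illustration of the manifest-validation idea, a standalone sketch (not the actual epicli code; it only assumes the rendered manifest is a multi-document YAML file and that PyYAML is available) could compare the OS image offer across all infrastructure/virtual-machine documents:

import sys
import yaml  # PyYAML


def check_vm_image_consistency(manifest_path):
    """Fail if the VM documents in a rendered manifest mix OS image offers
    (e.g. RHEL machines plus a default UbuntuServer repository VM)."""
    with open(manifest_path) as stream:
        docs = [doc for doc in yaml.safe_load_all(stream) if doc]

    offers = {}
    for doc in docs:
        if doc.get("kind") != "infrastructure/virtual-machine":
            continue
        image = doc.get("specification", {}).get("storage_image_reference", {})
        offer = image.get("offer")
        if offer:
            offers[doc.get("name", "<unnamed>")] = offer

    if len(set(offers.values())) > 1:
        sys.exit(
            "Inconsistent OS images across virtual machines: "
            + ", ".join(f"{name}={offer}" for name, offer in sorted(offers.items()))
        )


if __name__ == "__main__":
    check_vm_image_consistency(sys.argv[1])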

przemyslavic added this to the S20201203 milestone on Nov 20, 2020
sk4zuzu self-assigned this on Nov 20, 2020
przemyslavic self-assigned this on Nov 25, 2020
przemyslavic (Collaborator) commented:

✅ An assertion has been added to check whether all components, both those specified explicitly in the yaml configuration file and those enabled implicitly by the defaults, have the same operating system defined.

plirglo closed this as completed on Dec 1, 2020