New Feature: allow specifying of target host machine on every individual resource/data source #1071

Open
memetb opened this issue Feb 22, 2024 · 4 comments

Comments

@memetb
Contributor

memetb commented Feb 22, 2024

System Information

Linux distribution

any

Terraform version

terraform -v

Terraform v1.5.7
on linux_amd64

Provider and libvirt versions

git describe --always --abbrev=40 --dirty

c8facd868b69360f4542f8907eb1cf4c7b2fffa8-dirty

Description of Issue/Question

The current recommendation for connecting to multiple hosts is to declare multiple providers and assign them aliases. Indeed, this is how HashiCorp themselves recommend using multiple providers with AWS.
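
For reference, a minimal sketch of that alias-based approach. The host names (host-a.example.com, host-b.example.com), the alias names, and the resource bodies are placeholders for illustration, not taken from any real setup:

```hcl
// One provider block per libvirt host, told apart by alias.
provider "libvirt" {
  alias = "host_a"
  uri   = "qemu+ssh://root@host-a.example.com/system"
}

provider "libvirt" {
  alias = "host_b"
  uri   = "qemu+ssh://root@host-b.example.com/system"
}

// Each resource picks its host through the provider meta-argument.
resource "libvirt_volume" "data" {
  provider = libvirt.host_a
  name     = "data.qcow2"
  // ...
}

resource "libvirt_domain" "app" {
  provider = libvirt.host_b
  name     = "app"
  // ...
}
```

The provider meta-argument takes a static reference, so the target host cannot be computed from a variable at the resource level.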

However, AWS is a service, whereas libvirt connects directly to machines, so the use case is not exactly the same. Take, for instance, an enterprise that colocates multiple racks in multiple geographic locations (TOR, NYC, SFC). Each rack has an id (rackN), and each rack holds 5 general-purpose compute machines (generalN), 3 GPU-enabled machines (gpuN), and 3 storage servers (storageN) with dedicated connectivity between them. This gives each server an FQDN slug like tor-rack1-general2.evilcorp.com.
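
As a rough illustration, such a slug could be assembled from its parts with Terraform's built-in format() and lower() functions. The variable and local names here are made up for the sketch:

```hcl
variable "region" {
  type        = string
  description = "Site code, e.g. TOR, NYC or SFC."
}

locals {
  domain = "evilcorp.com"

  // <region>-rack<N>-<role><N>.<domain>, lower-cased to match the slug above.
  general2_fqdn = format("%s-rack%d-general%d.%s", lower(var.region), 1, 2, local.domain)
}

output "general2_fqdn" {
  value = local.general2_fqdn
}
```

With region set to "TOR" this produces tor-rack1-general2.evilcorp.com.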

Now let's say an IaC engineer wants to create a module for an application that will allocate a CPU frontend and a GPU backend, and allocate a storage device on the storage server.

The desired use-case for such a module would be something like so:

terraform {
  required_version = ">= 0.13"
  required_providers {
    libvirt = {
      source  = "dmacvicar/libvirt"
      version = "1.0.0"
    }
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.0"
    }
    digitalocean = {
      source  = "digitalocean/digitalocean"
      version = "~> 2.0"
    }
  }
}

provider "digitalocean" {
  token = var.do_token
}

provider "cloudflare" {
  api_token = var.cf_token
}

provider "libvirt" {
  uri = "qemu+ssh://${var.target}/system?sshauth=privkey&no_verify=true"
}

module "dev" {
  source = "./foobar"

  region   = "TOR"
  hostname = "development"
  //...
}

module "uat" {
  source = "./foobar"

  region   = "TOR"
  hostname = "user-acceptance"
  //...
}

module "production" {
  source = "./foobar"

  region   = "TOR"
  hostname = "production"
  //...
}

Under the hood, the module foobar/main.tf would connect to an unknown number of host machines to accomplish this goal, something like:

resource "libvirt_volume" "local-data" {
  host = "${var.region}-rack1-storage.evilcorp.com"
  // ...
}

resource "libvirt_domain" "frontend" {
  host = "${var.region}-rack1-general2.evilcorp.com"
  // ...
}

resource "libvirt_domain" "compute" {
  host = "${var.region}-rack1-gpu2.evilcorp.com"
  // ...
}

Note how doing this with duplicated providers forces an abstraction leak: the IaC engineer working on the backend stuff now has to inform the application programmers that they need 3 providers to be able to call this module. Any changes to metal implementations cause application programmers to need to change their code, or IaC engineers to muck with code that isn't theirs.
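
To make the leak concrete, here is a sketch of what the same module looks like with aliased providers. The alias names (storage, general, gpu), the file layout, and the rack and host numbers in the URIs are assumptions for illustration; configuration_aliases requires Terraform 0.15 or later:

```hcl
// foobar/versions.tf: the module has to announce every alias it expects.
terraform {
  required_providers {
    libvirt = {
      source                = "dmacvicar/libvirt"
      configuration_aliases = [libvirt.storage, libvirt.general, libvirt.gpu]
    }
  }
}

// main.tf (the caller): one provider block per physical host...
provider "libvirt" {
  alias = "storage"
  uri   = "qemu+ssh://tor-rack1-storage1.evilcorp.com/system"
}

provider "libvirt" {
  alias = "general"
  uri   = "qemu+ssh://tor-rack1-general2.evilcorp.com/system"
}

provider "libvirt" {
  alias = "gpu"
  uri   = "qemu+ssh://tor-rack1-gpu2.evilcorp.com/system"
}

// ...and every module call has to wire all of them through.
module "dev" {
  source = "./foobar"

  providers = {
    libvirt.storage = libvirt.storage
    libvirt.general = libvirt.general
    libvirt.gpu     = libvirt.gpu
  }

  region   = "TOR"
  hostname = "development"
  // ...
}
```

If the module later needs a fourth host, both the configuration_aliases list and every caller's providers map have to change.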

I have implemented a full solution to this feature request here: https://github.com/memetb/terraform-provider-libvirt/compare/main...memetb:terraform-provider-libvirt:multi-connection?expand=1
Associated PR #1072

I have not submitted it as a PR because I used a currently outstanding PR as my branch base (I can rebase if there's interest for this to be pushed through).

Implementation Notes

A couple of improvements have been made which are of note:

  • there was an abstraction leak whereby definition modules made use of locks and therefore needed to call the configuration object. This leak has been corrected: a lock is now obtained via an interface, which allows for...
  • per-connection locking, to allow for simultaneous deployment across any number of machines
  • for backwards compatibility, the host and uri arguments are both optional:
    • the provider uri argument, when specified, acts as a default URI to connect to
    • a host argument can be passed to any of the resources or data sources: if present, that host URI is used; if absent, the default URI is used
    • if neither is present, an error is raised
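
The fallback rules above might look like this from the user's side. The host argument is the one proposed in this issue and is not part of the released provider; the example mirrors the bare host name form used earlier in the thread, though the notes also call it a "host uri", so the exact value format is an assumption. Host numbers and resource names are illustrative:

```hcl
// Optional default target for resources that do not set host.
provider "libvirt" {
  uri = "qemu+ssh://tor-rack1-general1.evilcorp.com/system"
}

// No host argument: falls back to the provider uri.
resource "libvirt_volume" "on_default" {
  name = "scratch.qcow2"
  // ...
}

// host argument present (proposed, hypothetical): overrides the default
// for this resource only.
resource "libvirt_volume" "on_storage" {
  host = "tor-rack1-storage1.evilcorp.com"
  name = "local-data.qcow2"
  // ...
}

// If the provider block had no uri and a resource had no host,
// the proposal says an error is raised.
```
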
@scabala
Contributor

scabala commented Sep 2, 2024

Hi,
I strongly disagree with this provider handling multiple locations. Libvirt is just a service running on a single machine, and this provider is meant as a means to connect to it. If there is a need to communicate with multiple libvirt instances, there's a solution: multiple providers. If there's a need to orchestrate machines across different hosts, there are different solutions for that like oVirt, Proxmox, OpenStack, CloudStack or others.

@memetb
Contributor Author

memetb commented Sep 3, 2024

I strongly disagree with this provider handling multiple locations. Libvirt is just a service running on a single machine, and this provider is meant as a means to connect to it.

libvirt+ssh is a network service. Any machine which has a libvirt & ssh setup can be part of a cluster of servers that this terraform provider can access.

terraform is an orchestration technology: the entire purpose of terraform is that it allows spinning up resources non-locally.

I'm not sure what your objection exactly is, given that the two conditions you cite are already in use in the wild.

If there's a need to orchestrate machines across different hosts, there are different solutions for that like oVirt, Proxmox, OpenStack, CloudStack or others.

I'm not sure if you are suggesting the solution to having a terraform based problem is to use a non-terraform solution?

@scabala
Contributor

scabala commented Sep 3, 2024

libvirt by itself works only at the single-machine level. I see no point in trying to somehow emulate it working on multiple hosts at the provider level when a perfectly working mechanism already exists: multiple providers.
Having it in the provider directly complicates the internal structure and introduces another abstraction layer that has no counterpart in pure libvirt. Honestly, it is a huge effort with a limited use case and, as mentioned before, with an existing mechanism to address it.

I'm not sure if you are suggesting the solution to having a terraform based problem is to use a non-terraform solution?

You are trying to achieve multi-host orchestration, which is not what libvirt is. It is not a libvirt problem domain, so I suggested tools that solve the multi-host problem.

@memetb
Contributor Author

memetb commented Sep 16, 2024

You are trying to achieve multi-host orchestration, which is not what libvirt is. It is not a libvirt problem domain, so I suggested tools that solve the multi-host problem.

I'm not sure if you're saying "libvirt through terraform", or just libvirt. The latter is a non-issue to begin with: I'm not trying to achieve anything in the libvirt library itself.

For the former, I have explained exactly the use case where this is useful:

The current recommendation for connecting to multiple hosts is to declare multiple providers and assign them aliases. Indeed, this is how HashiCorp themselves recommend using multiple providers with AWS.

However, AWS is a service, whereas libvirt connects directly to machines, so the use case is not exactly the same. Take, for instance, an enterprise that colocates multiple racks in multiple geographic locations (TOR, NYC, SFC).

The only difference between AWS and libvirt is that the AWS API acts as an interposing stateless service, allowing the declaration of a single provider and the use of slug values to "route" to the destination resource.

This is not about orchestrating the lifecycles of services on multiple hosts. It's about being able to declare (IaC) where your enterprise infrastructure lives (split across multiple bare-metal host locations).
