Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for S3 as an AWS DMS Endpoint (target). #1685

Merged
merged 1 commit into from
May 7, 2018

Conversation

altay13
Copy link

@altay13 altay13 commented Sep 15, 2017

  • Added support for s3 as dms endpoint (with tests).

As AWS added the new feature to use a S3 DMS endpoint as a target endpoint we would be happy to have it also in terraform. We at Takeaway.com are using terraform for provisioning our AWS infrastructure. Currently we are implementing the DMS for replicating the data from production slave database to Redshift and S3 as a separate task, as we want to have a backup of the data.

The feature is already tested in our environment and working in forked project.
Hope everything done in a proper way including the description.
Thanks in advance.

@altay13 altay13 changed the title Add support for S3 as an AWS DMS target. Add support for S3 as an AWS DMS Endpoint (target). Sep 17, 2017
@altay13 altay13 closed this Sep 18, 2017
@altay13 altay13 reopened this Sep 18, 2017
@kurtmaile
Copy link

Hi, Awesome - when will this be released? Great stuff will be super useful

@altay13
Copy link
Author

altay13 commented Sep 25, 2017

Hi, @stack72
Maybe you can help with this pull request. We are waiting for the response for pushing it to production.
Thanks in advance

@radeksimko radeksimko added the enhancement Requests to existing resources that expand the functionality or scope. label Sep 27, 2017
@robbruce
Copy link
Contributor

robbruce commented Nov 5, 2017

Very useful, cheers for the work!

Not had the chance to test yet, resource creation was successful, however I have a couple of observations

  • If the kms_key_arn attribute is supplied, then plan/apply detects a change as a s3 endpoint does not have a kms key; the applied change would be a forces new resource
  • Resulting extra connection attributes are CsvDelimiter=,;CsvRowDelimiter=\n;CompressionType=GZIP;csvRowDelimiter=\n;compressionType=GZIP;csvDelimiter=, in the AWS console; I'm sure that the attributes should be compressionType not CompressionType for example.

It would be good to be able to control these or just allow the extra_connection_attributes attribute to pass these in because there are more than what you have hard coded.

@altay13
Copy link
Author

altay13 commented Nov 5, 2017

@robbruce
Thanks for the feedback.. I will check it in within 1-2 days. Most probably you are right.
Cheers

@radeksimko radeksimko added the size/L Managed by automation to categorize the size of a PR. label Nov 15, 2017
@altay13
Copy link
Author

altay13 commented Nov 21, 2017

Hey @robbruce

  1. Thanks for the feedback. I think I have fixed the issue related to extra connection attributes.
  2. About the kms key. It is set to force new in params because of other endpoint types, I can not change it. And as far as I know there is no interface for validation function in schema which can give you possibilities to validate the input based on the other parameter value. I think it should be in terraform, I will think about implementation, but it is another project.
    For now as a workaround I have added the conflict checks which will alert the kms conflict if it is set with bucket_name or bucket_folder parameter.

@robbruce
Copy link
Contributor

robbruce commented Dec 8, 2017

Thanks @altay13, I can confirm these are working.

@robbruce
Copy link
Contributor

robbruce commented Jan 3, 2018

@radeksimko - when can this be merged in?

@altay13 - any chance you can update your branch with master? Having to run from a compiled version of your code, but new features have been released

@altay13 altay13 force-pushed the master branch 2 times, most recently from 7e962a5 to 72efec2 Compare January 3, 2018 12:57
@altay13
Copy link
Author

altay13 commented Jan 3, 2018

@radeksimko @robbruce master branch is synced and commits are squashed.

@radeksimko radeksimko added service/s3 Issues and PRs that pertain to the s3 service. service/databasemigrationservice and removed service/s3 Issues and PRs that pertain to the s3 service. labels Feb 8, 2018
@robbruce
Copy link
Contributor

robbruce commented Mar 1, 2018

@altay13 looks like some changes were made and so now this conflicts, are you able to re-merge again?

@radeksimko its a bit unfair for a pull request to exist for so long and to expect the author to keep their code in line with upstream changes. Can this be escalated?

@ghost
Copy link

ghost commented Mar 1, 2018

Are we any closer to getting this?

@ghost ghost added the size/L Managed by automation to categorize the size of a PR. label Mar 2, 2018
@altay13
Copy link
Author

altay13 commented Mar 2, 2018

@robbruce @melvinio16 @radeksimko
Guys, I have rebased my master with upstream/master again :)

@ShahNewazKhan
Copy link

ShahNewazKhan commented Mar 2, 2018

Hi everyone, what are the options that need to be set to test this?

# Create a new endpoint
resource "aws_dms_endpoint" "test" {
  database_name               = "test"    # Should this be s3 bucket name?
  endpoint_id                 = "test-dms-endpoint-tf"
  endpoint_type               = "target"
  engine_name                 = "s3"
  extra_connection_attributes = "" # Do we need to populate anything here?
  kms_key_arn                 = "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012"
  ssl_mode                    = "none"

  tags {
    Name = "test"
  }

  username = "test"
}

Thanks in advance!

@altay13
Copy link
Author

altay13 commented Mar 3, 2018

Hi @ShahNewazKhan

resource "aws_s3_bucket" "dms_bucket" {
  bucket = "dms_s3_test_bucket"
  acl    = "public"

  tags {
    Name        = "Test bucket for dms s3 endpoint"
    Environment = "Test"
  }
}



resource "aws_dms_endpoint" "s3_endpoint" {
  endpoint_id                 = "testing-s3-target-endpoint"

  endpoint_type               = "target"
  engine_name                 = "s3"

  # kms_key_arn                 = "${var.kms_key_arn}"

  service_access_role          = "arn:aws:iam::blabla:role/dms-access-for-endpoint-foo"

  bucket_name                 = "${aws_s3_bucket.dms_bucket.id}"
  bucket_folder               = "test"

  extra_connection_attributes = "CompressionType=GZIP;CsvDelimiter=,;CsvRowDelimiter=\\n"

  tags {
    Name        = "Test bucket for dms s3 endpoint"
    Environment = "Test"
  }
}

In extra_connection_attributes as you can see, you can customize csv file parameters

@robbruce
Copy link
Contributor

robbruce commented Apr 6, 2018

@bflad can you help to get this in the v1.14.0 release?

@bflad
Copy link
Contributor

bflad commented Apr 6, 2018

Sorry I do not have time to look at this today before the release is cut. For those interested, we use 👍 votes on the original issue or PR comment to help with prioritization.

@robbruce
Copy link
Contributor

robbruce commented Apr 6, 2018

@bflad I suspect that this will not get voted for that often as DMS is not a commonly used service within AWS, however S3 DMS Endpoint has been supported by AWS since the 7th March 2017. Currently the workaround is to either use this code as a patched version or to use an embedded CloudFormation template; Cloudformation supported DMS in July 2017. Can it be in v1.15.0?

Copy link
Contributor

@bflad bflad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @altay13 for this contribution! Can you please see my initial comments below and let me know if you have any questions or do not have time to implement the feedback?

@@ -117,6 +120,14 @@ func resourceAwsDmsEndpoint() *schema.Resource {
Type: schema.TypeString,
Optional: true,
},
"bucket_name": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To better match the AWS API and SDK, we should nest these attributes under a new configuration block:

"s3_settings": {
  Type:     schema.TypeList,
  Optional: true,
  MaxItems: 1,
  Elem: &schema.Resource{
    Schema: map[string]*schema.Schema{
      "bucket_name": {
        Type:     schema.TypeString,
        Required: true,
      },
      "bucket_folder": {
        Type:     schema.TypeString,
        Optional: true,
      },
    },
  },

This will allow us to also implement other attributes of the S3 configuration better such as compression_type, csv_delimiter, and csv_row_delimiter without polluting the top attribute namespace.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea
Will implement it also

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bflad
I have checked it. If we will implement it like this, then what should we do with extra_connection_attributes attribute? Because even if it is not like that in AWS API, but it is like this in AWS UI. Only bucket_name and folder is separated in UI.


hasChanges = true

goto DONE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please implement this without goto? Even if that means duplicated code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback @bflad
Ok I will implement without goto.
Usually I also don't like goto approach, but usually as I thought it is normal in GO 👍
Will fix it and other feedback right away

d.Set("service_access_role", endpoint.S3Settings.ServiceAccessRoleArn)
d.Set("bucket_folder", endpoint.S3Settings.BucketFolder)
d.Set("bucket_name", endpoint.S3Settings.BucketName)
d.Set("extra_connection_attributes",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems these should available attributes nested under the new s3_settings configuration rather than creating our own attribute implementation.

`, randId, engineName, actionList)
}

func dmsEndpointDynamoDbConfig(randId string) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we just statically assign all the dynamodb:* and s3:* permissions?

@bflad bflad added the waiting-response Maintainers are waiting on response from community or contributor. label Apr 13, 2018
@robbruce
Copy link
Contributor

@altay13 @bflad Really glad this is getting some traction. The S3 connection attributes differ depending if the endpoint is a source or a target:

https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.S3.html
https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.S3.html

Please make sure that extra connection attributes can still be passed through for anything which is not part of s3_settings

@microamp
Copy link
Contributor

microamp commented May 2, 2018

I wonder if we should only stick to the target endpoint for S3 in this PR to keep it simple. Is that an option? @bflad

I understand that both source and target endpoints are supported but awscli only mentions the target endpoint as shown below (if that matters).

$ aws dms create-endpoint help
...
       --s3-settings (structure)
          Settings  in JSON format for the target Amazon S3 endpoint. For more
          information about the available settings, see the  Extra  Connection
          Attributes  section  at Using Amazon S3 as a Target for AWS Database
          Migration Service .

       Shorthand Syntax:

          ServiceAccessRoleArn=string,ExternalTableDefinition=string,CsvRowDelimiter=string,CsvDelimiter=string,BucketFolder=string,BucketName=string,CompressionType=string

       JSON Syntax:

          {
            "ServiceAccessRoleArn": "string",
            "ExternalTableDefinition": "string",
            "CsvRowDelimiter": "string",
            "CsvDelimiter": "string",
            "BucketFolder": "string",
            "BucketName": "string",
            "CompressionType": "none"|"gzip"
          }
...
$ aws dms create-endpoint --generate-cli-skeleton
{
    ...
    "S3Settings": {
        "ServiceAccessRoleArn": "",
        "ExternalTableDefinition": "",
        "CsvRowDelimiter": "",
        "CsvDelimiter": "",
        "BucketFolder": "",
        "BucketName": "",
        "CompressionType": "gzip"
    },
    ...
}

@microamp
Copy link
Contributor

microamp commented May 4, 2018

I created a new pull request which has some additional work on top of @altay13's existing commit. Please have a look if you have been following this PR.

@altay13
Copy link
Author

altay13 commented May 4, 2018

@microamp
Hello mate
Thanks for your contribution. But I have applied the pull request since November, and it just got reviewed 3 weeks ago. And now I am working on it, and actually you are stepping on my toes. It is not so nice. Lets call it not very nice way of contribution.
And actually the last time I wrote a comment and waiting for the response. I had another plans regarding implementation which is almost equals with the proposition from @bflad
I have the implementations just got the last point regarding extra stuff.
Thanks in advance.

@microamp
Copy link
Contributor

microamp commented May 6, 2018

Sorry if I made you feel that way. It wasn't intentional at all.

In my defence, I could see no indication that it was still being actively worked on. You posted a comment

Will fix it and other feedback right away

23 days ago as of today. The PR was labelled as 'waiting-response' on the same day and it hasn't changed since. I also saw someone making a pull request on behalf of someone else in this project after 20 days of inactivity. Perhaps not directly comparable to this one, but I doubt he violated CoC then. I also did not steal your work. I clearly acknowledged that and kept your commit around in my pull request even though that meant additional work involved in fixing conflicts, etc.

I will close my pull request. I hope this gets merged soon.

@altay13
Copy link
Author

altay13 commented May 7, 2018

@microamp
Sorry if I sound rude and thanks for understanding. Actually I have made already the changes written in the review. But I have also asked a question about the external settings, as if because I have made already projects in my last company and terraform are with this external settings. So I asked actually contributor what about that, if totally change the external settings parameter to the only way as is written in review, it means after switching to normal terraform they will have to change the terraform files. If it is ok, then I will test my code and make a pull request in these days.

Again, thanks for understanding.

@bflad
Copy link
Contributor

bflad commented May 7, 2018

For what its worth, the implementation of #4447 is almost exactly what we would expect this PR to look like and had it not been closed, it was likely to be merged since it properly attributed credit to the original author and passed acceptance testing.

But I have also asked a question about the external settings, as if because I have made already projects in my last company and terraform are with this external settings. So I asked actually contributor what about that, if totally change the external settings parameter to the only way as is written in review, it means after switching to normal terraform they will have to change the terraform files.

As a project we cannot reasonably make concessions for any implementation prior to officially released resources of the provider when they introduce potential maintenance burdens or unexpected behavior with the API. In almost all cases we prefer to follow the API implementation. Using the API implementation offers a simpler translation from other AWS SDK/CloudFormation/CLI implementations, lowering the barrier of entry for migrating between these projects or cross-referencing documentation of those projects.

For reference, other AWS projects utilize S3 settings as such to follow the API:

Extra connection attributes can still be used to control configuration like cannedAclForObjects and likely the attributes you mention previously. Although, if something like the bucketName extra connection attribute is provided only to extra_connection_attributes but not s3_settings.0.bucket_name, then Terraform will likely (correctly) report the missing attribute in s3_settings.0.bucket_name to match the API returning that value for that attribute (if it does). This is not something we should try to workaround nor should we provide the inverse to inject these attributes back into extra_connection_attributes. The previous implementations will need to adjust their implementation to match the behavior introduced by the official release.

@altay13
Copy link
Author

altay13 commented May 7, 2018

To be honest, @bflad and @microamp, now it is already late to close the pull request and to redo the stuff what was done. If you @bflad say it is normal way and it would be merged, then I am ok with that to reopen the #4447 pull request and merge it and as if you say it is done on top of this branch.
I am just saying it would be nicer to just ask before doing it, because obviously after your review and feedback frankly saying it is not a difficult part to implement it.
It is not worth it arguing in this case. I am ok with re-opening and merging the pull request.
I am just now merged and solved the conflicts in my branch, but it is useless to re-write the same code which @microamp wrote already.
Lets do the way which is recommended by you @bflad
Thanks in advance

@bflad bflad merged commit 3380f00 into hashicorp:master May 7, 2018
@bflad bflad added this to the v1.18.0 milestone May 7, 2018
bflad added a commit that referenced this pull request May 7, 2018
@microamp
Copy link
Contributor

Many thanks to both of you for handling this professionally. I regret not asking first, and I will be more careful next time. Thanks guys.

@bflad
Copy link
Contributor

bflad commented May 10, 2018

This has been released in version 1.18.0 of the AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

@ghost
Copy link

ghost commented Apr 6, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Apr 6, 2020
@breathingdust breathingdust removed the waiting-response Maintainers are waiting on response from community or contributor. label Sep 17, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement Requests to existing resources that expand the functionality or scope. size/L Managed by automation to categorize the size of a PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants