[ECR] [request]: Cross Region Replication for Repositories #140

Closed
RyPeck opened this issue Jan 30, 2019 · 54 comments
Labels
ECR Amazon Elastic Container Registry Proposed Community submitted issue

Comments

@RyPeck

RyPeck commented Jan 30, 2019

Tell us about your request
Cross region replication for images and tags in ECR Repositories

Which service(s) is this request for?
ECR

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
We have containers deployed in multiple regions. We would like to rely on AWS to replicate the containers across regions, just like we do for S3 Objects in an S3 Bucket.

With this feature, we will easily be able to use AWS PrivateLink for ECR in each region where we run containers in VPCs without internet access.

In the absence of this feature, we can build our own solution to copy images to multiple regions with each build or pay the costs (per GB and latency) involved in pulling images from a single region to another. This will also remove a single point of failure, ECR in a single region, from our current setup.

Are you currently working around this issue?
Currently pulling from a single region.

@RyPeck RyPeck added the Proposed Community submitted issue label Jan 30, 2019
@deleugpn

This would be an amazing feature. I have been trying to build a pipeline with cross-region deploy and ECR has been a blocker for me. It's a bit annoying to set up ECR in every desired region and have just one region (the pipeline region) push to all of them before running a cross-region deploy.

@ghost

ghost commented Jan 30, 2019

@RyPeck is part of a team in my organization that we provide infrastructure/container services for. This would be a very useful feature.

@tabern tabern added the ECR Amazon Elastic Container Registry label Jan 31, 2019
@max-rocket-internet

Cross region replication for images and tags in ECR Repositories

Yes or what would be even better is just a global ECR option. i.e. ecr.aws/my-org/my-image. And this would be transparently replicated in the same way a CDN is. This is how Google's GCR does it and it's much simpler.

@ghost

ghost commented Feb 18, 2019

Definitely want to see this getting worked on soon.

ECR is a single point of failure in the ECS design. Ideally there would be a global endpoint that images are pushed to and pulled from, with images cached at the region level closest to your ECS / EKS / Fargate workloads, and with the ability to pull from a different region if there are timeout issues, etc.

We are using Azure Container Registry just now as it has the feature.

@jtoberon

jtoberon commented Feb 21, 2019

Thanks for your input, everyone. Before I joined AWS, I worked on a team that had to build cross region replication on top of ECR, too!

I'm going to leave this as Proposed only because it's not on the top of our priority list yet. In the meantime, it would be useful to hear more about what folks want:

  1. When do you want to replicate an image? Is it sufficient to replicate all images, or do you want some control over which images are replicated?
  2. Do you want to replicate across accounts?
  3. What are the primary problems you're trying to solve with cross region replication? For example, you might be trying to speed up image pulls, reduce your data transfer bill, build another region for disaster recovery, etc.
  4. How much control do you really want to have over referring to an image? Today, ECR includes a "region" part in the registry URL: https://aws_account_id.dkr.ecr.region.amazonaws.com. Let's say we remove the region. Do you want DNS control over where this URL points to, or do you want ECR to figure out the best way to serve the request (knowing that this can have cost and performance implications)?
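To make the "region part" of the registry URL concrete, here is a minimal Python sketch that builds and parses the regional hostname format quoted in question 4. The helper names (`registry_host`, `parse_registry_host`, `image_uri`) are hypothetical, and the sketch assumes the standard `amazonaws.com` suffix only; other partitions are not handled.

```python
import re
from typing import Tuple

# Matches the regional hostname format quoted above:
# <aws_account_id>.dkr.ecr.<region>.amazonaws.com
_REGISTRY_RE = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com$"
)


def registry_host(account_id: str, region: str) -> str:
    """Build the regional registry hostname for an account."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def parse_registry_host(host: str) -> Tuple[str, str]:
    """Split a registry hostname back into (account_id, region)."""
    match = _REGISTRY_RE.match(host)
    if match is None:
        raise ValueError(f"not a regional ECR registry hostname: {host!r}")
    return match["account"], match["region"]


def image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Full image reference; note that the region is baked into it."""
    return f"{registry_host(account_id, region)}/{repository}:{tag}"
```

Because the region is part of every image reference, the same image pushed to two regions has two different URIs, which is why several commenters below ask for a region-free name.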

@kadrach
Member

kadrach commented Feb 21, 2019

Primarily speeding up image pulls of sometimes multi-GB images in ECS (and AWS Batch), halfway around the world.

(ECR as a pull-through cache would be nice!)

@deleugpn

deleugpn commented Feb 21, 2019

You people just made me realise I don't need this to achieve what I want. I have been using a constant to set the image of the Task Definition for so long I didn't notice I could use only one ECR across all regions. Yeah it will make Fargate download slower but for my use case that's not a big deal.

With that in mind, I think the best option for me would be if I could choose to omit the region from the ECR address and Amazon would serve the image from the nearest available location behind a static link. Like putting CloudFront in front of ECR. I wonder if that's possible without making my images public.

@pmontanari

Hi @jtoberon,

My need for multi Region Replication is mainly for Disaster Recovery. Ideally we would remove the region from the registry URL and let ECR figure out the best way to serve the request.

@jlambert121

@jtoberon

  • Our ideal situation would be to have replication happen at layer creation (and synchronization might be a better word since we'd like removals to happen with a single delete as well).
  • Cross account replication would be something we would likely use if available, but isn't a requirement for us today.
  • The two problems we'd be looking to solve are DR and locality (cost and speed) for multi-region deployments.
  • A single global name to represent a repository would be ideal, letting ECR figure out the best location to serve the image from. Most of our traffic is from ECR to ECS, and this would simplify configuration while (hopefully) still adding a bit of resiliency in the event of a regional ECR outage.

@rdpa

rdpa commented Mar 6, 2019

@jtoberon

  1. For my project's use case it would be sufficient to replicate all images.
  2. My project's use case does not require replicating across accounts.
  3. Many of the above reasons are applicable: speeding up image pulls, reducing the data transfer bill, etc. My project uses active-active datacenters in multiple regions, but currently only pulls from and pushes to one "main" ECR region.
  4. Having ECR figure out the best way to serve the request would be ideal in active-active architectures, where if one ECR region goes down, ECR can simply figure out where to pull from. DNS control would also be nice in this scenario, but I'm not sure both are possible at the same time.

@mbelang

mbelang commented Mar 8, 2019

@jtoberon

  1. Yes cross region would be a must
  2. Cross account is also a problem that we are facing

@RyPeck
Author

RyPeck commented Mar 28, 2019

  1. When do you want to replicate an image? Is it sufficient to replicate all images, or do you want some control over which images are replicated?

When I push. I want to replicate all images.

  2. Do you want to replicate across accounts?

Possibly. Having cross account permissions work for replicated images would be a requirement.

  3. What are the primary problems you're trying to solve with cross region replication? For example, you might be trying to speed up image pulls, reduce your data transfer bill, build another region for disaster recovery, etc.

Cross region image pulls are the primary motivator; replication will reduce the data transfer bill and speed up image pulls.

  4. How much control do you really want to have over referring to an image? Today, ECR includes a "region" part in the registry URL: https://aws_account_id.dkr.ecr.region.amazonaws.com. Let's say we remove the region. Do you want DNS control over where this URL points to, or do you want ECR to figure out the best way to serve the request (knowing that this can have cost and performance implications)?

Including a "region" part in the registry URL seems acceptable. This feels similar to spinning up S3 Buckets in a different region and setting up replication.

@ajohnstone

ajohnstone commented Mar 28, 2019

  • When do you want to replicate an image? Is it sufficient to replicate all images, or do you want some control over which images are replicated?

Replication at the repository level would be more than sufficient.

  • Do you want to replicate across accounts?

Yes, unfortunately ECR is currently too limited due to IAM permissions not being granular enough to cover image/tags. As such, we cannot prevent images from being pulled from ECR if an individual image had a vulnerability or had not been marked as scanned. See use cases in #230

  • What are the primary problems you're trying to solve with cross region replication? For example, you might be trying to speed up image pulls, reduce your data transfer bill, build another region for disaster recovery, etc.

DR and isolation between regions. Vulnerability scanning and pulling images.

  • How much control do you really want to have over referring to an image? Today, ECR includes a "region" part in the registry URL: https://aws_account_id.dkr.ecr.region.amazonaws.com. Let's say we remove the region. Do you want DNS control over where this URL points to, or do you want ECR to figure out the best way to serve the request (knowing that this can have cost and performance implications)?
  1. A generic endpoint that is globally distributed and points to the nearest AWS-owned POP/region.
  2. VPC endpoints to ECR with the same generic endpoint. DNS would ideally be the same, except that it points to the interface endpoint. The VPC endpoint should support a policy.

@israelp

israelp commented Apr 2, 2019

@jtoberon
For my project:

  1. Replicate on push. I would prefer to mark a repository for replication (starting with all images would be fine).
  2. No need to replicate across accounts. I'm using one account for images, and multiple other accounts pull the images. I need the repository permissions (repository metadata) to be replicated as well.
  3. I'm using VPC endpoints, and I don't want my Fargate cluster to go public.
  4. Yes, I would prefer that you provide private DNS for it, as with other VPC interface endpoints.
    Thank you.
    IP

@raghukumarc

raghukumarc commented Apr 11, 2019

We are looking at ECR cross-account cross-region replication for DR. I am sure most of the ECS users are building it in house for redundancy in case of Region failures or for DR.

@Globegitter

For our use-case we only need cross-region replication within the same account, and only to regions we control. The main use-cases would be to speed up image pulls, have further redundancy and especially to reduce our data transfer bill.

@barooi

barooi commented Jul 4, 2019

@jtoberon

We're also very much interested in a Replication solution.

The design we are currently implementing is described below.

We have a "CICD" AWS Account, where we build and push our dev builds to ECR repositories. A release consists of promoting (copying) the relevant images to repositories in a separate "Release" AWS Account.
We define these repositories as "master" repos (for releases and dev builds).

We run our clusters in multiple AWS Accounts (DTAP). Each of them defines the same repositories ("slave" repos in this case) for each region we are active in. For instance, Production runs in 7 regions.
We will implement a pull system that checks for the existence of the requested image version in the regional slave ECR. If it is not found, we pull the image from the relevant master repo into the slave repo.

Production, acceptance and test will only pull images from the Releases master, while our Development clusters will pull from both master sources.

Which brings us to your questions.

When do you want to replicate an image? Is it sufficient to replicate all images, or do you want some control over which images are replicated?

Ideally we want to define multiple master repositories for a slave repo. Replication only occurs when an image is not present and we do not need to control which images are replicated.
It would be nice to have an ordering in place for which source repo to query first for missing versions (release master > development master).

As such, the slave repos work as a caching proxy to one or more master repos.
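The caching-proxy behaviour described above can be sketched with in-memory stand-ins. This is only an illustration of the lookup order, not an ECR feature or API: `Repo` and `resolve` are hypothetical names, and repositories are modelled as plain tag-to-digest dictionaries.

```python
from typing import Dict, List, Optional


class Repo:
    """In-memory stand-in for an image repository (tag -> digest)."""

    def __init__(self, name: str, images: Optional[Dict[str, str]] = None):
        self.name = name
        self.images = dict(images or {})


def resolve(tag: str, regional: Repo, sources: List[Repo]) -> Optional[str]:
    """Return the digest for `tag`, copying it into `regional` on a miss.

    Sources are queried in list order, so putting the release repo
    before the development repo gives the requested priority.
    """
    if tag in regional.images:
        return regional.images[tag]  # already cached, nothing to replicate
    for source in sources:
        digest = source.images.get(tag)
        if digest is not None:
            regional.images[tag] = digest  # replicate only when missing
            return digest
    return None  # not found in any source
```

For example, with `sources=[release, dev]` a tag present in both is taken from `release`, and a tag that exists only in `dev` is still found on the second lookup.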

Do you want to replicate across accounts?
Yes, we have multiple accounts as described.

What are the primary problems you're trying to solve with cross region replication? For example, you might be trying to speed up image pulls, reduce your data transfer bill, build another region for disaster recovery, etc.
All of the above!

How much control do you really want to have over referring to an image? Today, ECR includes a "region" part in the registry URL: https://aws_account_id.dkr.ecr.region.amazonaws.com. Let's say we remove the region. Do you want DNS control over where this URL points to, or do you want ECR to figure out the best way to serve the request (knowing that this can have cost and performance implications)?
I think region can be dropped, but even with region still present I think we can manage.

@cloventt

cloventt commented Aug 8, 2019

@jtoberon

This feature is highly desirable for my use-case. Pushing to multiple regions at once is our current workaround, but it can significantly increase the time taken for our deployment pipeline to run.

@mbelang

mbelang commented Aug 9, 2019

@jtoberon

  1. Yes cross region would be a must
  2. Cross account is also a problem that we are facing

I will clarify my comment from above.

  1. Still valid
  2. We would like to have cross account replication because we are currently pushing our images to a single ECR in one of our accounts and pulling those images from other accounts/regions.
  3. Mostly image pull speed, data transfer bills
  4. Removing the region from the DNS name is a must for us as we use a single ECR for all images. I do not know if it would be possible, but removing the account from the DNS name would be nice too, as users do not care which account it sits in.

We decided to use that model to speed up our CI/CD pipeline. Pushing to a single ECR makes a lot of sense IMHO. We are also using tags to promote images from dev to prod, so there is no need to copy images to another registry for every environment. The caveat of this is transfer cost and slow pulls when, say, we push in ca-central-1 and pull from eu-central-1.

@max-rocket-internet

I think we can summarise everything in this issue with one simple request: please just copy Google Cloud Container Registry 😃

@mbelang

mbelang commented Aug 12, 2019

This is what I had in mind but was shy to say it 😂

@algestam

Found this issue while reviewing our DR plan and ensuring that our ECR images will be available in case of a region failure.

@jtoberon

  1. Whenever an image is pushed would be enough for our needs.
  2. Cross account replication would not be needed for our needs but I can definitely see a need for it in other projects.
  3. Primarily Disaster recovery, secondarily image pull speed
  4. Dropping the region from the URL would be nice. It would still be ok with having it in place though.

Until this has been implemented we will run our own solution to copy images to other regions.

@bminahan73

We have our registries in a centralized account. For our use case images do not need to be replicated cross-account but would definitely still need to be pulled from multiple accounts.

Cross-region replication would assist us in disaster recovery and speed up our build/deploy process quite a bit. Since we need our images in two distinct locations for our DR plan, we currently do two docker pushes, one to each of two distinct regions in the same account. This unnecessarily adds time to the push part of our pipeline, potentially a lot of time if it's a large image.

As for removing the region from the URL, our resources currently pull images based on the region they are deployed to. For example, AWS Batch definitions in us-east-2 pull images from us-east-2. There isn't really a reason for this other than potentially faster pull speeds from the same region (though that doesn't always turn out to be the case).

If AWS handled this decision for us and resolved the URL based on response time or something similar, that would also be a huge improvement for DR. This would allow our services in us-east-2 to automatically pull images from us-west-2 if ECR in us-east-2 was experiencing issues, for example. Today that would be a manual change.
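The fallback behaviour requested here (prefer the local region, fall back to another replica when it has issues) could look roughly like the following sketch. The function names and the `is_healthy` callback are hypothetical; how health is actually determined (response time, a failed pull, a status check) is left open, as it is in the comment.

```python
from typing import Callable, List, Optional


def candidate_regions(local_region: str, replica_regions: List[str]) -> List[str]:
    """Local region first (if it holds a replica), then the rest in order."""
    ordered = [local_region] if local_region in replica_regions else []
    ordered += [r for r in replica_regions if r != local_region]
    return ordered


def pick_region(
    local_region: str,
    replica_regions: List[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy region to pull from, or None if all fail."""
    for region in candidate_regions(local_region, replica_regions):
        if is_healthy(region):
            return region
    return None
```

With replicas in us-east-2 and us-west-2, a service in us-east-2 would keep pulling locally and switch to us-west-2 only when the health check for us-east-2 fails.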

@bminahan73

Another headache is ensuring all image tags are in sync across regions, not just the images themselves. In our workflow many individuals can adjust tags on images for various business purposes. Enforcing this is currently a process issue: if you change a tag in one region, do it in the other region too. We could automate this away but would love a managed solution.

@ayush-sharma

We've just started using ECR as our container registry, and this is currently a blocker for us. Are there any existing solutions to back up ECR repos and images somewhere? My main concern is preventing accidental deletion (or deliberate deletion in the event of a credentials breach) and reducing latency when pulling from a different region.

@anshul0915zinnia

Very much needed for our use case as well.

@michielvermeir

The single-account, single-region design of ECR is just a pain in the ass. I think most of us would really appreciate a singular registry endpoint, with some settings on which accounts/regions you would like replication for, and not have all this complexity unnecessarily exposed.

I thought org.ecr.amazonaws.com or ecr.amazonaws.com/org/ were nice suggestions. Coping with different registry endpoints involves retagging container images a lot, lots of shuffling bytes around.

@PatrickXYS

Our use case:

We're running an EKS cluster and deploying the Kubeflow application. The point here is that we need to create a Kubeflow Notebook Server with the provided AWS Kubeflow image (hosted on ECR). In Kubeflow, there's no functionality that dynamically detects the user's region and provides the corresponding ECR image.

We have to let users use one single ECR image from a fixed region, which is a pain point on our side. Users may suffer from poor pull performance from different regions.

What we expect:

A single ECR image reference without a region specified, with the ECR team taking care of the traffic or image duplication across regions.

@eist76

eist76 commented Apr 28, 2020

This should be prioritized as it is a much needed feature. AWS customers need to be able to easily replicate ECR across different regions without any workarounds (CodeBuild, Lambda, ...).

@virajpadte

Went through the series of comments here and would like to add that I am currently facing a scenario where I need image replication between eu-north-1, us-east-1 and ap-southeast-1. The reason is simple: we are trying to use ECR as a private repo solution across our organization. I am currently going down the path of following https://github.com/aws-samples/amazon-ecr-cross-region-replication but if there is an added feature from the ECR team that would be awesome!

@jdjaro

jdjaro commented Jun 18, 2020

+1 for this. We need multi region deployments for resiliency in the event of a regional outage, and having the images stuck in a repo in a single region is a major point of failure. There's no point in having a stack deployed in a backup region if there are no images available to run in it.
Currently it appears to be a choice between this (as also mentioned by @virajpadte above), and using an ECR push event to trigger a Lambda that would copy the image to an ECR repo in another region. Both of these approaches seem like a lot of additional work for something that ECR should support by default.
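The push-event approach mentioned above can be sketched as a pure planning step: given an ECR push event, work out which copies are needed. The event shape below (`detail-type` of "ECR Image Action" with `repository-name`, `image-tag`, `action-type` and `result` fields) is assumed from the EventBridge events ECR emits; `plan_copies` is a hypothetical helper, and the actual copy (pull, retag, push) is deliberately left out.

```python
from typing import Dict, List


def plan_copies(event: Dict, target_regions: List[str]) -> List[Dict[str, str]]:
    """Map one ECR push event to the source/destination image URIs to copy.

    Returns an empty list for anything that is not a successful, tagged push.
    """
    detail = event.get("detail", {})
    if detail.get("action-type") != "PUSH" or detail.get("result") != "SUCCESS":
        return []
    tag = detail.get("image-tag")
    if not tag:
        return []  # this sketch only handles tagged images

    account = event["account"]
    source_region = event["region"]
    repository = detail["repository-name"]

    def uri(region: str) -> str:
        return f"{account}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"

    return [
        {"source": uri(source_region), "destination": uri(region)}
        for region in target_regions
        if region != source_region  # never copy a region onto itself
    ]
```

A Lambda handler (or a CodeBuild job) would then carry out each planned copy; that part is where most of the "additional work" mentioned above lives.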

@sun-mir

sun-mir commented Jun 18, 2020

Has anybody used https://github.com/uber/kraken to augment this functionality?

@omieomye

A quick update to the ECR community, thanks for continuing to comment and influence this ask. We're actively working on it. High level, we're aiming to tackle a push into a primary region ECR, replicating into N other region ECRs. We're looking to support both single-account across multi-region, and multi-account across multi-region scenarios. As soon as possible, we'll move it to the Coming Soon stage.

@Mehul1313

Hello, what is the ETA for this feature? We are looking to copy the Docker images to another region for DR.
We are taking the approach of using the docker push command to push the image from us-east-1 to us-west-2. Has anyone tried this approach? My only concern is the latency.

@oslobodian

Hello, extremely needed for our use case.

@TimoSchmechel

Would also very much love this feature.

@sc-alscient

This https://aws.amazon.com/blogs/containers/advice-for-customers-dealing-with-docker-hub-rate-limits-and-a-coming-soon-announcement/ talks about geo-replication for public containers in the new public registry. You would assume that it could be done with private ones as well now?

@AbdoNile

AbdoNile commented Dec 7, 2020

Hi,
Glad to see this with a "coming soon" label.
Will this include cross account replication as well? My use case requires this.

@mwarkentin

It seems like this should be announced soon:

@joshuastern

https://aws.amazon.com/about-aws/whats-new/2020/12/amazon-ecr-announces-cross-region-replication-of-images/

@magJ

magJ commented Dec 9, 2020

Support recently added, API reference documentation:
https://docs.aws.amazon.com/AmazonECR/latest/APIReference/API_PutReplicationConfiguration.html
https://docs.aws.amazon.com/AmazonECR/latest/APIReference/API_PutRegistryPolicy.html

User Guide documentation still seems to be unavailable.
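As a rough illustration of what the PutReplicationConfiguration API linked above expects, the sketch below builds a replication configuration document: one rule whose destinations are region and registry ID pairs. The helper name `replication_configuration` is hypothetical, and the document shape should be checked against the API reference before use.

```python
import json
from typing import Dict, List, Optional


def replication_configuration(
    source_registry_id: str,
    regions: List[str],
    destination_registry_id: Optional[str] = None,
) -> Dict:
    """Build a one-rule replication configuration.

    If no destination registry is given, images replicate within the same
    account (cross-region only); otherwise they go to the other account.
    """
    registry_id = destination_registry_id or source_registry_id
    destinations = [{"region": r, "registryId": registry_id} for r in regions]
    return {"rules": [{"destinations": destinations}]}


# Example: replicate within account 123456789012 to two other regions.
# With boto3, a dict like this is what would be passed as the
# replicationConfiguration argument of put_replication_configuration.
config = replication_configuration("123456789012", ["us-west-2", "eu-west-1"])
print(json.dumps(config, indent=2))
```

For the cross-account case the destination registry additionally has to allow the source account to replicate into it, which is what the PutRegistryPolicy API linked above is for.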

@omieomye

omieomye commented Dec 9, 2020

Shipped. https://aws.amazon.com/blogs/containers/cross-region-replication-in-amazon-ecr-has-landed/. The blog calls out some improvements we're already beginning to tackle. Thank you for being part of the Amazon ECR community!

@omieomye omieomye closed this as completed Dec 9, 2020
@odg0318

odg0318 commented Jan 12, 2021

This seems to be available only per registry not repository, right?

@heidemn

heidemn commented Jan 16, 2021

@odg0318 yes, but it seems there are plans for more fine-grained control:
https://aws.amazon.com/de/blogs/containers/cross-region-replication-in-amazon-ecr-has-landed/

What’s next? [...]

  • Replication status APIs to surface the progress of the replication process for an image.
  • The ability to add filters so that only a subset of repositories and images are replicated.
  • Notifications on replication events such as the completion of a copy.
  • Support for manifest lists.

@christopher-wong

The docs here mention:

After this, every time you push an image to the private ECR repository (or call the replicate API explicitly) ECR automatically replicates the image.

Is this "replicate API" available?

@heidemn

heidemn commented Jan 31, 2021

Is this "replicate API" available?

Haven't seen it anywhere. Probably the marketing stuff was written too early ;-)

FYI I did some testing of edge cases, if anybody is interested.
Not all of these details seem to be documented.

  • Replication creates the destination repo implicitly if it doesn't exist.
    But it does NOT copy the settings from the source repo:
    Source repo: Immutable tags, scan on push, KMS encryption.
    Destination repo, implicitly created: Mutable tags, scan on push disabled, AES encryption.
    -> If you need custom settings also on the destination repo, you must either create it upfront, or modify the settings later.
    This could be considered a bug, but it is unlikely that AWS would change the current behavior.
  • For an Ubuntu-sized image, it takes approx. 2-4 seconds until the replicated image is available with docker pull:
    I pushed the image to eu-west-1, and afterwards ran in a loop docker pull ...us-east-1...; sleep 1. -> succeeded on 3rd pull.
    (But note that I tested this on a Sunday.)
  • Tag immutability is evaluated in each region/registry separately:
    • E.g. replicating eu-west-1 -> us-east-1, source repo = mutable, destination repo = immutable.
      -> If you push image 1 to eu-west-1 repo:latest, then push image 2 to eu-west-1 repo:latest, then the tag in us-east-1 is not updated with the 2nd push.
    • E.g. source repo = immutable, destination repo = mutable:
      Since a tag-overwriting push to the source repo already fails, there's nothing to replicate.
      -> Neither source repo tag nor destination repo tag are overwritten.
  • As expected, pushing the same image to source repo and directly afterwards to destination repo works fine.
    Of course it's useless and a waste of time, but good to know that outdated build scripts pushing to both repos can't do harm.
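The timing test above (pull in a loop with a one-second sleep until the replicated image appears) can be written as a small polling helper. This is a sketch under the assumption that availability is checked by a caller-supplied function; `wait_until` is a hypothetical name, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    interval: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call check() until it returns True; return the 1-based attempt number.

    Raises TimeoutError if check() never succeeds within max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        if check():
            return attempt
        if attempt < max_attempts:
            sleep(interval)  # wait before the next attempt
    raise TimeoutError(f"not available after {max_attempts} attempts")
```

To reproduce the experiment, `check` could run `docker pull` against the destination region through `subprocess.run([...]).returncode == 0`; a return value of 3 would then correspond to the "succeeded on 3rd pull" observation.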

@RLThomaz

RLThomaz commented Feb 2, 2021

I haven't read all comments - there are too many - but, is there a way to trigger the replication of existing repositories and tags? At least the "latest" tag of each repository should be replicated, but that's not the case when the repo already exists.

Edit: perhaps we should be able to trigger the replication by pushing the same image again (this won't trigger anything currently) since this won't actually do anything (such as upload) besides checking that the image exists.

@michaelb990

Sorry, looks like I missed a lot here!

Yes, replication is a registry setting, not a repository setting. We're working on adding the ability to filter which repositories and tags are replicated as part of the replication configuration (#1186).

@christopher-wong the blog post has now been updated. You're right, there's no "replicate API", replication is triggered on push only.

@RLThomaz we don't currently have a way to trigger replication of existing images in repositories. You can always retag them which will trigger a replication job.
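A minimal sketch of the retagging workaround, assuming a boto3-style ECR client: fetch the manifest of the existing tag with `batch_get_image` and write it back under a new tag with `put_image`. The function name `retag` is hypothetical, and whether a given retag actually starts a replication job is taken from the comment above rather than verified here.

```python
from typing import Any


def retag(ecr_client: Any, repository: str, existing_tag: str, new_tag: str) -> None:
    """Add `new_tag` to the image currently tagged `existing_tag`.

    No layers are uploaded; only the manifest is re-submitted.
    """
    response = ecr_client.batch_get_image(
        repositoryName=repository,
        imageIds=[{"imageTag": existing_tag}],
    )
    images = response.get("images", [])
    if not images:
        raise LookupError(f"{repository}:{existing_tag} not found")
    ecr_client.put_image(
        repositoryName=repository,
        imageManifest=images[0]["imageManifest"],
        imageTag=new_tag,
    )
```

With boto3 installed, `ecr_client` would be `boto3.client("ecr", region_name=...)` for the source region. Some manifest types may also need the manifest media type passed along, which this sketch does not handle.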

We have more work planned for replication this year, so feel free to check out the other issues or open your own!

The ones that I've been following are: #1202 #1200 #1194 #1193 #1186

Thanks a lot for the feedback! Keep it coming.

@RLThomaz

RLThomaz commented Feb 6, 2021

@michaelb990 thanks for your answer. Yes, I realised that and I opened a ticket to request that adding a new region to the registry's cross-region replication configuration triggers replication to the new region. In the meantime, I've found some solutions of my own and I'm using Lambda and CFN to solve this.

You can check it out here: #1252
