Unable to configure read autoscaling for dynamoDB #1267

eraac · 2019-11-15T11:27:47Z

Describe the bug
DynamoDB autoscaling doesn't work for reads.

To Reproduce
Steps to reproduce the behavior:

Download this config and use it for Loki
Start Loki (latest commit)

Expected behavior
Autoscaling works for both read and write.

Environment:

Infrastructure: EC2 instance (with an IAM role attached to the instance)

Screenshots dynamodb console

logs

level=info ts=2019-11-18T10:16:21.962837889Z caller=table_manager.go:220 msg="synching tables" expected_tables=1
level=info ts=2019-11-18T10:16:24.044363479Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)

Question
Can DynamoDB with autoscaling configured work without #1226? Everything I try with Loki 0.4.0 fails. Am I missing something in the configuration, or has no one tried this? (I would be a bit surprised.)

eraac · 2019-11-15T13:02:48Z

More information

Loki logs (UTC+0)

level=info ts=2019-11-15T10:10:08.392559986Z caller=dynamodb_table_client.go:301 msg="updating provisioned throughput on table" table=loki_index_2602 old_read=30 old_write=10 new_read=30 new_write=300
level=info ts=2019-11-15T10:11:08.392703076Z caller=dynamodb_table_client.go:301 msg="updating provisioned throughput on table" table=loki_index_2602 old_read=30 old_write=10 new_read=30 new_write=300
level=info ts=2019-11-15T10:42:02.824300182Z caller=dynamodb_table_client.go:301 msg="updating provisioned throughput on table" table=loki_index_2602 old_read=30 old_write=10 new_read=30 new_write=300
level=info ts=2019-11-15T11:13:22.33312897Z caller=dynamodb_table_client.go:301 msg="updating provisioned throughput on table" table=loki_index_2602 old_read=30 old_write=10 new_read=30 new_write=300
level=info ts=2019-11-15T11:15:22.298819647Z caller=dynamodb_table_client.go:301 msg="updating provisioned throughput on table" table=loki_index_2602 old_read=30 old_write=10 new_read=30 new_write=300
level=info ts=2019-11-15T12:15:22.320476983Z caller=dynamodb_table_client.go:301 msg="updating provisioned throughput on table" table=loki_index_2602 old_read=30 old_write=10 new_read=30 new_write=300

I have the feeling that the new_write and the old_write are reversed.

level=info ts=2019-11-15T15:41:30.605722793Z caller=dynamodb_table_client.go:308 msg="updating provisioned throughput on table" table=loki_index_2602 old_read=30 old_write=10 new_read=30 new_write=100
level=info ts=2019-11-15T15:52:30.607094963Z caller=dynamodb_table_client.go:308 msg="updating provisioned throughput on table" table=loki_index_2602 old_read=30 old_write=16 new_read=30 new_write=100

dynamodb console (UTC+0)
Metrics

Scaling activities

Scaling activities when table is inactive

cloudtrail log (UTC+1)
CloudTrail logs from 12:10 to 12:20 for the UpdateTable event: https://gist.github.com/Eraac/8c6336119fae9e4bd164a4118dcd7d3f

Some entries are weird, like this one

bboreham · 2019-11-28T14:41:30Z

It looks like the table manager is ignoring your desire to use AWS auto-scaling, and simply overwriting the write capacity with the static value of 300 each time it spots that the provisioned throughput has changed.

~~Possibly because applicationautoscaling needs a sub-field url ?~~ This shouldn't be necessary.

I gave up on AWS auto-scaling long ago and wrote the "metrics-based scaling".

bboreham · 2019-11-28T14:45:55Z

Given more logs, especially from the beginning of the run, it's possible that something might give a clue.

> I have the feeling that the new_write and the old_write are reversed.

Old is what it found; new is what it's about to set it to.
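
To make the direction of each change explicit, here is a minimal Python sketch that parses one of the log lines quoted earlier in the thread and reports the old and new write capacity. The helper names `parse_logfmt` and `describe_write_change` are illustrative only; they are not part of Loki or Cortex.

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Split a logfmt line into a dict; shlex keeps the quoted msg value together."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def describe_write_change(line: str) -> str:
    """Report what the table manager found (old) and what it is about to set (new)."""
    f = parse_logfmt(line)
    old, new = int(f["old_write"]), int(f["new_write"])
    direction = "up" if new > old else "down" if new < old else "unchanged"
    return f"{f['table']}: write {old} -> {new} ({direction})"


# Log line copied from the thread.
LINE = (
    "level=info ts=2019-11-15T10:10:08.392559986Z caller=dynamodb_table_client.go:301 "
    'msg="updating provisioned throughput on table" table=loki_index_2602 '
    "old_read=30 old_write=10 new_read=30 new_write=300"
)

print(describe_write_change(LINE))
```

Read this way, the line says the table was found at 10 write units and is about to be set to 300.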

eraac · 2019-11-28T15:33:38Z

> Old is what it found; new is what it's about to set it to.

Just now, from the logs:

level=info ts=2019-11-28T14:55:27.960803543Z caller=dynamodb_table_client.go:308 msg="updating provisioned throughput on table" table=loki_index_2604 old_read=100 old_write=47 new_read=100 new_write=30
level=info ts=2019-11-28T15:13:25.962475431Z caller=dynamodb_table_client.go:308 msg="updating provisioned throughput on table" table=loki_index_2604 old_read=100 old_write=35 new_read=100 new_write=30

graph

I've configured a cooldown of 1h, but the last two updates are less than 20 minutes apart (generating a LimitExceededException).
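
As a quick check of that interval, the following Python sketch computes the gap between the two log timestamps above and compares it with the configured one-hour cooldown. The helper `parse_ts` is illustrative; it drops the sub-second digits because `datetime` stores at most microseconds.

```python
from datetime import datetime, timedelta


def parse_ts(ts: str) -> datetime:
    """Parse an RFC 3339 timestamp from the logs, ignoring the fractional seconds."""
    return datetime.strptime(ts.split(".")[0], "%Y-%m-%dT%H:%M:%S")


COOLDOWN = timedelta(hours=1)  # the cooldown stated in this comment

# Timestamps copied from the two log lines above.
first = parse_ts("2019-11-28T14:55:27.960803543Z")
second = parse_ts("2019-11-28T15:13:25.962475431Z")

gap = second - first
print(f"gap between updates: {gap}, shorter than cooldown: {gap < COOLDOWN}")
```

The two updates are just under 18 minutes apart, well inside the one-hour window.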

> Given more logs, especially from the beginning of the run

I have 268840 lines of logs, but I can grep for some keywords.

autoscaling

level=info ts=2019-11-28T15:19:27.038040931Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:20:24.083825356Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:20:27.015308895Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:21:24.013114472Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:21:27.031446917Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:22:22.036986715Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:22:25.034468798Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:23:22.022567013Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:23:27.541360812Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:24:24.035129993Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:24:27.025391016Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:25:24.172227843Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:25:27.026707703Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:26:24.016891186Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:26:27.03978903Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:27:24.036532193Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:27:27.014925951Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:28:24.010808289Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:28:27.02810959Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)
level=info ts=2019-11-28T15:29:22.031204169Z caller=aws_autoscaling.go:144 msg="enabling autoscaling on table" table=(MISSING)

mostlyAtNight · 2019-12-19T08:10:20Z

Hi all,

I'm experiencing similar problems here. I thought I'd set up Loki to set write (and read) autoscaling on active tables (and the first inactive one), but it's only setting write autoscaling for some reason.

I thought this was because I updated my configuration after my tables were initially created, but I checked the settings in AWS on a newly created active index table last night and can see that it's still missing the read autoscaling settings:

The logs show this when the new table was created:

level=info ts=2019-12-18T23:50:49.046121842Z caller=table_manager.go:220 msg="synching tables" expected_tables=8
level=info ts=2019-12-18T23:50:49.068163415Z caller=table_manager.go:363 msg="creating table" table=loki_prod_index_2607
level=info ts=2019-12-18T23:52:49.04608598Z caller=table_manager.go:220 msg="synching tables" expected_tables=8
level=info ts=2019-12-18T23:54:49.046061857Z caller=table_manager.go:220 msg="synching tables" expected_tables=8

...and my config is shown below. Any help you can give is much appreciated. Happy to attach more information if it helps.

config:
  schema_config:
    configs:
    - from: 2019-11-01
      store: aws
      object_store: aws
      schema: v9
      index:
        prefix: loki_prod_index_
        period: 168h
  storage_config:
    aws:
      s3: s3://REDACTED:REDACTED@eu-west-1/lutra-loki-prod
      dynamodbconfig:
        dynamodb: dynamodb://REDACTED:REDACTED@eu-west-1
        applicationautoscaling: https://REDACTED:REDACTED@eu-west-1
  table_manager:
    retention_deletes_enabled: true
    retention_period: 26208h
    index_tables_provisioning:
      # Active tables
      # Active tables will use provisioned throughput mode, not on-demand
      provisioned_throughput_on_demand_mode: false
      # Starting provisioned throughput values for new tables
      provisioned_read_throughput: 1
      provisioned_write_throughput: 1
      read_scale:
        enabled: true
        role_arn: arn:aws:iam::REDACTED:role/aws-service-role/dynamodb.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_DynamoDBTable
        # 1 unit is $0.000147 / hr = ~$0.1 / mo. / table
        min_capacity: 1
        # ~$0.5 / mo. on 1 active table
        max_capacity: 10
        # DynamoDB minimum seconds between each autoscale up.
        out_cooldown: 1800
        # DynamoDB minimum seconds between each autoscale down.
        in_cooldown: 3600
        # DynamoDB target ratio of consumed capacity to provisioned capacity.
        target: 80
      write_scale:
        enabled: true
        role_arn: arn:aws:iam::REDACTED:role/aws-service-role/dynamodb.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_DynamoDBTable
        # 1 unit is $0.000735 / hr = ~$0.5 / mo. / table
        min_capacity: 1
        # ~$5 / mo. on 1 active table
        max_capacity: 10
        # DynamoDB minimum seconds between each autoscale up. (default)
        out_cooldown: 1800
        # DynamoDB minimum seconds between each autoscale down. (default)
        in_cooldown: 3600
        # DynamoDB target ratio of consumed capacity to provisioned capacity.
        target: 80
      # Inactive tables
      # The most recent inactive table will still be auto-scaled but the 
      # rest should be set to use on-demand throughput
      inactive_throughput_on_demand_mode: true
      inactive_read_throughput: 1
      inactive_write_throughput: 1
      inactive_write_scale_lastn: 0
      inactive_read_scale_lastn: 1
      inactive_read_scale:
        enabled: true
        role_arn: arn:aws:iam::REDACTED:role/aws-service-role/dynamodb.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_DynamoDBTable
        # 1 unit is $0.000147 / hr = ~$0.1 / mo. / table
        min_capacity: 1
        # ~$0.5 / mo. on 1 inactive table
        max_capacity: 10
        # DynamoDB minimum seconds between each autoscale up.
        out_cooldown: 1800
        # DynamoDB minimum seconds between each autoscale down.
        in_cooldown: 3600
        # DynamoDB target ratio of consumed capacity to provisioned capacity.
        target: 80
      #inactive_write_scale:
      #  enabled: true
      #  role_arn: arn:aws:iam::REDACTED:role/aws-service-role/dynamodb.application-autoscaling.amazonaws.com/AWSServiceRoleForApplicationAutoScaling_DynamoDBTable
      #  # 1 unit is $0.000147 / hr = ~$0.1 / mo. / table
      #  min_capacity: 1
      #  # ~$0.5 / mo. on 1 inactive table
      #  max_capacity: 1
      #  # DynamoDB minimum seconds between each autoscale up.
      #  out_cooldown: 1800
      #  # DynamoDB minimum seconds between each autoscale down.
      #  in_cooldown: 3600
      #  # DynamoDB target ratio of consumed capacity to provisioned capacity.
      #  target: 80
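
The cost comments in the config can be checked with a little arithmetic. The sketch below is only an estimate: it takes the per-unit hourly prices from the comments as given (they are not verified against current AWS pricing) and assumes an average month of 730 hours. The function name `monthly_cost` is illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month

READ_PRICE = 0.000147   # $ per read unit per hour, taken from the config comments
WRITE_PRICE = 0.000735  # $ per write unit per hour, taken from the config comments


def monthly_cost(units: int, price_per_unit_hour: float) -> float:
    """Approximate monthly cost of keeping `units` capacity units provisioned."""
    return units * price_per_unit_hour * HOURS_PER_MONTH


# min_capacity is 1 and max_capacity is 10 in the config above.
for label, price in (("read", READ_PRICE), ("write", WRITE_PRICE)):
    for units in (1, 10):
        print(f"{label} x{units}: ${monthly_cost(units, price):.2f}/month")
```

Under these assumptions, one read unit comes to roughly $0.1 per month and one write unit to roughly $0.5, matching the comments. Ten write units land near $5, while ten read units come out closer to $1 than the ~$0.5 noted in the read_scale comment.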

mostlyAtNight · 2020-01-14T13:00:20Z

Hi people,

Any thoughts on this issue? Let me know if I can provide more useful information.

Kind regards,

Pete

bboreham · 2020-01-19T19:38:13Z

Setting the read scaling parameters on DynamoDB is not implemented, sorry.

Note that I plan to remove the AWS auto-scaling code from Cortex entirely.

EDIT: I think what happened is there was no way to set read scaling parameters, then they were added for the metrics-based scaling, and never implemented for AWS auto-scaling.

mostlyAtNight · 2020-01-20T08:58:43Z

Hi @bboreham - thanks for your reply. Would read/write scaling work if I switch to metrics-based scaling?

eraac · 2020-01-21T08:05:25Z

Makes sense. If I understand correctly, Loki uses Cortex with the AWS autoscaling mode, so the Loki team would have to update the code to use the "metrics-based" scaling mode (which is implemented in Cortex independently of AWS)?

I guess this would also fix the weird behavior with write scaling? (See my second post.)

stale · 2020-02-20T08:45:45Z

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

Typraeurion mentioned this issue Feb 13, 2020

InvalidParameter: minimum field value of 1 #1692

Closed

stale bot added the stale label Feb 20, 2020

stale bot closed this as completed Feb 27, 2020

stijndehaes mentioned this issue May 15, 2020

Add an option to use on demand Dynamodb #2079

Closed

cyriltovena pushed a commit to cyriltovena/loki that referenced this issue Jun 11, 2021

Merge pull request grafana#1267 from JoeWrightss/patch-2

3fd07a9

Fix fmt.Errorf() error message
