Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Follower stats #126

Merged
merged 1 commit into from
Sep 7, 2021
Merged

Follower stats #126

merged 1 commit into from
Sep 7, 2021

Conversation

gbbafna
Copy link
Collaborator

@gbbafna gbbafna commented Sep 2, 2021

Description

API to aggregate follower cluster's metrics at cluster and index level

Issues Resolved

#125

Sample Response

➜  curl   -XGET "${FOLLOWER}/_plugins/_replication/follower_stats"
{
  "num_syncing_indices" : 4,
  "num_bootstrapping_indices" : 0,
  "num_paused_indices" : 0,
  "num_shard_tasks" : 4,
  "num_index_tasks" : 4,
  "operations_written" : 8,
  "operations_read" : 8,
  "failed_read_requests" : 0,
  "throttled_read_requests" : 0,
  "failed_write_requests" : 0,
  "throttled_write_requests" : 0,
  "follower_checkpoint" : -4,
  "leader_checkpoint" : 4,
  "total_write_time_millis" : 825,
  "index_stats" : {
    "leader-01" : {
      "operations_written" : 3,
      "operations_read" : 3,
      "failed_read_requests" : 0,
      "throttled_read_requests" : 0,
      "failed_write_requests" : 0,
      "throttled_write_requests" : 0,
      "follower_checkpoint" : -1,
      "leader_checkpoint" : 2,
      "total_write_time_millis" : 222
    },
    "follower-01" : {
      "operations_written" : 3,
      "operations_read" : 3,
      "failed_read_requests" : 0,
      "throttled_read_requests" : 0,
      "failed_write_requests" : 0,
      "throttled_write_requests" : 0,
      "follower_checkpoint" : -1,
      "leader_checkpoint" : 2,
      "total_write_time_millis" : 143
    },
    "follower-04" : {
      "operations_written" : 1,
      "operations_read" : 1,
      "failed_read_requests" : 0,
      "throttled_read_requests" : 0,
      "failed_write_requests" : 0,
      "throttled_write_requests" : 0,
      "follower_checkpoint" : -1,
      "leader_checkpoint" : 0,
      "total_write_time_millis" : 298
    },
    "leader-03" : {
      "operations_written" : 1,
      "operations_read" : 1,
      "failed_read_requests" : 0,
      "throttled_read_requests" : 0,
      "failed_write_requests" : 0,
      "throttled_write_requests" : 0,
      "follower_checkpoint" : -1,
      "leader_checkpoint" : 0,
      "total_write_time_millis" : 162
    }
  }
}

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@gbbafna gbbafna force-pushed the follower-stats branch 2 times, most recently from cda7fed to 92024b8 Compare September 6, 2021 07:35
Copy link
Collaborator

@krishna-ggk krishna-ggk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a few comments.

Can you also share sample response type?

Comment on lines 36 to 53
var shardStats :MutableMap<ShardId, FollowerShardStats> = mutableMapOf()
var indexStats :MutableMap<String, FollowerIndexStats> = mutableMapOf()

var pausedIndices :Int = 0
var failedIndices :Int = 0
var bootstrappingIndices :Int = 0
var syncingIndices :Int = 0
var shardTaskCount :Int = 0
var indexTaskCount :Int = 0

var opsWritten: Long = 0
var opsWriteFailures: Long = 0
var opsWriteThrottles: Long = 0
var opsRead: Long = 0
var opsReadFailures: Long = 0
var opsReadThrottles: Long = 0
var localCheckpoint: Long = 0
var remoteCheckpoint: Long = 0
var totalWriteTime: Long = 0

companion object {
private val log = LogManager.getLogger(FollowerStatsResponse::class.java)
}

constructor(inp: StreamInput) : super(inp) {
shardStats = inp.readMap(::ShardId, ::FollowerShardStats)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to always derive from shardStats and only store that?

Main concerns on storing the same data twice is potential paths to introduce bugs where the paths are inconsistently handled.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the same in latest revision.

Comment on lines 261 to 260
retentionLeaseHelper.renewRetentionLease(leaderShardId, indexShard.lastSyncedGlobalCheckpoint, followerShardId)
followerClusterStats.stats[followerShardId]!!.localCheckpoint = indexShard.lastSyncedGlobalCheckpoint

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we extract renew + stats update into a member method? This is so we don't miss if we introduce newer paths to renew.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is difficult to do that as we are updating stats at 4 different places in 3 different scenarios :

  1. GetChanges success
  2. GetChanges failure and want to know the reason as well.
  3. Retention lease success

@gbbafna gbbafna requested review from krishna-ggk and tbhanu-amzn and removed request for saikaranam-amazon September 6, 2021 18:10
tbhanu-amzn
tbhanu-amzn previously approved these changes Sep 6, 2021
Signed-off-by: Gaurav Bafna <[email protected]>
@gbbafna gbbafna merged commit bcfa725 into opensearch-project:main Sep 7, 2021
@gbbafna gbbafna deleted the follower-stats branch September 7, 2021 08:37
gbbafna added a commit to gbbafna/cross-cluster-replication that referenced this pull request Sep 7, 2021
gbbafna added a commit that referenced this pull request Sep 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants