-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CBG-3688 make sure kv_pool_size is overriden correctly for DCP #6680
Conversation
- create two functions for getting the connection string - GetGoCBConnString - GetGoCBConnStringForDCP Simplifications to API: - Fix a confusing invariant where GetGoCBConnString would initialize KvPoolSize on BucketSpec by requiring a constructor that does string parsing of server value to pull out kv_pool_size. This will fail if an invalid kv_pool_size value is created. - Move the creation of modified connection string for the bootstrap connection to the caller (rest/main.go). This means that both serverless and regular callers will have values set on the connection string for bootstrap connection. Before these values were only set for serverless bootstrap: - max_perhost_idle_http_connections - max_idle_http_connections - idle_http_connection_timeout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments/questions.
base/bucket.go
Outdated
@@ -119,6 +118,11 @@ type CouchbaseBucketType int | |||
|
|||
// Full specification of how to connect to a bucket | |||
type BucketSpec struct { | |||
BucketSpecOptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This refactor feels arbitrary - I don't think KvPoolSize is any more or less of an 'option' than things like BucketOpTimeout. What's the problem that this refactor fixes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KvPoolSize isn't an input parameter, it is only ever read from the connection string. It is parsed from the connection string and then stored when creating a bucket to determine how to throttle kvops in the below code.
numPools := 1
if spec.KvPoolSize > 0 {
numPools = spec.KvPoolSize
}
gocbv2Bucket.kvOps = make(chan struct{}, MaxConcurrentSingleOps*nodeCount*numPools)
A different way of addressing the immutability of the struct would be to never set this value and avoid the use of a constructor for this struct. Then, inside GetGocbV2BucketFromCluster
you could call getKvPoolSize(spec.Server)
which would reparse the connection string. The reason I didn't do it like this was to only parse the connection string once but this is such a cheap operation (string parsing) compared to bucket opening that it would be fine.
Do you think this would be more readable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not set on 2f038b9 but it simplifies the PR.
This will behave identically to the original PR but in order for serverless values to work, you have to modify the connection string on the BucketSpec
to have kv_pool_size=1
. This is done in GetBucketSpec
anyway to add dcp_buffer_size
and kv_buffer_size
custom values so I think this is probably a reasonable refactor.
The downsides (which aren't bad):
- the setting of an implicit
kv_pool_size
is set asDefaultGocbKvPoolSize
inside some functions -BucketSpec.getGoCBConnString
andBucketSpec.GetKvPoolSize
and they have the potential to get out of sync. - the place that we will validate that we haven't passed
kv_pool_size
twice in a connection string will take place when opening a bucket and not loading a database. This is unlikely behavior we didn't test for anyway. - we parse the connection string twice when creating a
Bucket
, once to get the connection string for a cluster and another for figuring out thekv_pool_size
. This is a cheap string operation vs a kv op for a rare condition.
…ion to GetKvPoolSize
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this makes sense. Added a suggestion to go further with removing KvPoolSize from the bucket spec and just extract it from a computed connstring as needed.
base/collection.go
Outdated
if spec.KvPoolSize > 0 { | ||
numPools = spec.KvPoolSize | ||
|
||
numPools, err := spec.GetKvPoolSize() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like GetKvPoolSize should be a helper function that takes a connString, and doesn't use the spec? That way we can be certain that the value here matches what's actually getting sent to gocb (on the conn string), and we don't need to have any logic inside GetKvPoolSize around default values, etc.
That would also let us get rid of the test-only getKVConnectionPol [sic], and just use the new connString based function for test verification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this to a generic helper function so I could use it for kv_pool_size
and kv_buffer_size
.
base/dcp_sharded.go
Outdated
@@ -296,7 +291,7 @@ func initCBGTManager(ctx context.Context, bucket Bucket, spec BucketSpec, cfgSG | |||
|
|||
// Since cbgt initializes a buffer per CBS node per partition in most cases (vbuckets in partitions can't be grouped by CBS node), | |||
// setting the small buffer size used in cbgt 1.3.2. (see CBG-3341 for potential optimization of this value) | |||
options["kvConnectionBufferSize"] = "16384" | |||
options["kvConnectionBufferSize"] = "16384" // This value will be overridden by the BucketSpec connection string. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be appropriate to log something at warn in the case where the connection string is setting a larger value for this, to help us triage possible memory issues that would result. The message would be along the lines of "dcp_buffer_size is set to %d on the connection string - this will result in increased memory usage"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For clarity, this is actually kv_buffer_size
: https://sourcegraph.com/github.com/couchbase/gocbcore@8c8c79204421aa47127a042ac2ac560d87920645/-/blob/agent_config.go?L411
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented this idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good, wanted one bit of clarification.
rest/server_context.go
Outdated
@@ -455,9 +455,33 @@ func (sc *ServerContext) getOrAddDatabaseFromConfig(ctx context.Context, config | |||
return sc._getOrAddDatabaseFromConfig(ctx, config, options) | |||
} | |||
|
|||
func GetBucketSpec(ctx context.Context, config *DatabaseConfig, serverConfig *StartupConfig) (spec base.BucketSpec, err error) { | |||
// GetBucketSpec returns a BucketSpec from a given DatabaseConfig and StartupConfig. | |||
func GetBucketSpec(ctx context.Context, config *DatabaseConfig, serverConfig *StartupConfig) (*base.BucketSpec, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit in changing to a pointer return for BucketSpec here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had changed it to a pointer in a previous iteration because the value that it returns is effectively nil and you wouldn't want to use it. It doesn't matter in this case because we check the error value of the function and performing a construction/copy on BucketSpec is rare and cheap.
I restored the previous behavior.
* CBG-3688 make sure kv_pool_size is overriden correctly for DCP - create two functions for getting the connection string - GetGoCBConnString - GetGoCBConnStringForDCP Simplifications to API: - Fix a confusing invariant where GetGoCBConnString would initialize KvPoolSize on BucketSpec but making BucketSpec stateless and moving operations to be performed on an connection string - Remove redundant ca_cert_path, set by auth - Centralize modifying the connection string defaults for serverless on Bootstrap and Bucket creation.
Simplifications to API:
Fix a confusing invariant where GetGoCBConnString would initialize KvPoolSize on BucketSpec by requiring a constructor that does string parsing of server value to pull out kv_pool_size. This will fail if an invalid kv_pool_size value is created.
Move the creation of modified connection string for the bootstrap connection to the caller (rest/main.go). This means that both serverless and regular callers will have values set on the connection string for bootstrap connection. Before these values were only set for serverless bootstrap:
Questions:
ca_cert_path
for Bucket connections and not for the bootstrap connection? The only way that I imagine that this works is if the ca cert is in the system already or embedded in the pem file. I tried to use our x509 tests but they are broken because they don't run automatically. I tried to fix them but it is somewhat non trivial.Pre-review checklist
fmt.Print
,log.Print
, ...)base.UD(docID)
,base.MD(dbName)
)docs/api
Integration Tests
GSI=true,xattrs=true
https://jenkins.sgwdev.com/job/SyncGateway-Integration/2303/