
Remove context as a parameter the user provides #474

Closed

Conversation

@rylev rylev (Contributor) commented Nov 1, 2021

As a continuation of the discussion started here:

This removes the Context parameter from all pipeline-based operations while keeping the context as an internal detail: all policies still have access to the Context and can use it as a key/value store (though this functionality still needs to be implemented).

We originally included Context as a parameter for all operations because we believed it would be essential for cancelation just like in other SDKs. However, #457 and the related discussion showed that we don't need to send an explicit struct for doing cancelation since Rust's futures allow for cancelation through other means. The only remaining feature that Context is used for is as a key/value store which is a secondary feature that is not widely used. Requiring the context argument in every operation call is a heavy price to pay to support this feature.

This change makes it so that, by default, users don't have to pass a Context object, since that would require typing Context::new() when they almost never need a custom context object. Instead, the context is constructed internally so that the pipeline (and thus all individual policies) can still take advantage of the Context.

Open questions:

  • Presumably there are cases where the user wants to set a key/value in the Context key/value store when setting up an operation. After this PR this will no longer be possible. Since Rust does not offer default args or method overloading, to add this feature back we will need to add a new method for each operation that allows setting Context. Before we add this back, I'd like to understand what use cases there are for the user to set a key/value pair in the Context from outside of the pipeline.
  • Right now each operation is required to construct a Context object that gets passed to a PipelineContext object. We may be able to simply construct the Context from inside the PipelineContext constructor.
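To make the shape of the change concrete, here is a minimal sketch of the before/after signatures. All types here (Context, GetCollectionOptions, GetCollectionResponse, CollectionClient) are simplified stand-ins, not the real azure_core types:

```rust
// Simplified stand-ins for illustration only.
#[derive(Default)]
pub struct Context; // placeholder for the internal key/value store

pub struct GetCollectionOptions;

pub struct GetCollectionResponse {
    pub name: String,
}

pub struct CollectionClient {
    name: String,
}

impl CollectionClient {
    // Before this PR (hypothetical signature):
    //   fn get_collection(&self, ctx: Context, options: GetCollectionOptions) -> ...
    // After: the Context no longer appears in the signature; the
    // operation constructs it internally and hands it to the pipeline.
    pub fn get_collection(&self, _options: GetCollectionOptions) -> GetCollectionResponse {
        let _ctx = Context::default(); // built internally, not by the caller
        GetCollectionResponse {
            name: self.name.clone(),
        }
    }
}
```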

cc @JeffreyRichter @heaths

@MindFlavor MindFlavor (Contributor) left a comment

Oh, yes! I've hated that useless Context since we added it a few months ago! 👍😀

I wholeheartedly agree with this PR. ❤️

I would go further, though: we now have an aptly named PipelineContext that instantiates a useless Context (I've shown an example below).

Can we remove the Context altogether? Any required pipeline-related context information should go in PipelineContext anyway.

    options: GetCollectionOptions,
) -> crate::Result<GetCollectionResponse> {
    let mut request = self.prepare_request_with_collection_name(http::Method::GET);

-   let mut pipeline_context = PipelineContext::new(ctx, ResourceType::Collections.into());
+   let mut pipeline_context =
+       PipelineContext::new(Context::new(), ResourceType::Collections.into());

Maybe we can tackle this in a following PR but I think we can get rid of the Context here as well: PipelineContext is already enough (no need to instantiate an inner, useless Context).
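That suggestion could look something like the following sketch, with the Context created inside the PipelineContext constructor so no caller ever instantiates one. The types are simplified stand-ins for the real azure_core ones:

```rust
// Simplified stand-ins for illustration only.
#[derive(Default)]
struct Context; // placeholder for the key/value store

struct PipelineContext {
    context: Context,
    resource_type: String,
}

impl PipelineContext {
    // Callers supply only the resource type; the Context becomes an
    // implementation detail created right here.
    fn new(resource_type: impl Into<String>) -> Self {
        Self {
            context: Context::default(),
            resource_type: resource_type.into(),
        }
    }
}
```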

@yoshuawuyts yoshuawuyts (Contributor) left a comment

I like how much lighter these API calls feel after removing the required Context param. I think this will feel a lot nicer for end-users. And because individual methods still take both *Options instances and required arguments, it doesn't seem like we're losing much in terms of configurability or expressivity.

I'm in favor of this!

@rylev rylev force-pushed the remove-context-parameter branch from 39ff91d to acd0feb on November 1, 2021 12:36
@JeffreyRichter (Member)

I'm not sure what happened with the discussion here. The Context still serves a purpose; it was never just about cancellation. Customers need a way to specify these key/value pairs and pass them down into the pipeline. This functionality is still a requirement, so we can't just get rid of Context without having some way to pass the key/value pairs through the system. The key/value pairs are required to make distributed tracing work and to override policy behavior on a per-operation basis, such as changing the number of retries.

So, if we can't come up with a replacement for passing the key/value pairs down, then we still need Context.

@MindFlavor (Contributor)

Customers need a way to specify these key/value pairs and pass them down into the pipeline. This functionality is still a requirement, so we can't just get rid of Context without having some way to pass the key/value pairs through the system. The key/value pairs are required to make distributed tracing work and to override policy behavior on a per-operation basis, such as changing the number of retries.

Ok, that makes sense.
If its only remaining purpose is to influence pipeline execution, I suggest renaming it to something like PipelineOptions. Context is misleading, especially since it no longer represents a cancellation context.

Also I would prefer to make it optional.

What do you think?

@yoshuawuyts yoshuawuyts (Contributor) commented Nov 2, 2021

@MindFlavor Now that we know we should keep exposing the ability for customers to pass key-value pairs [1] into API calls, it opens up the opportunity to revisit the design of our endpoint calls as a whole.

Another ergonomic challenge we face, similar to the "empty context", is "empty options" instances:

// no options set
let options = CreateDatabaseOptions::new();
let database = client.create_database(Context::new(), "my_database", options).await?;

// various options set
let options = CreateDatabaseOptions::new()
    .consistency_level(ConsistencyLevel::Strong);
let mut cx = Context::new();
cx.insert("key", "value");
let database = client.create_database(cx, "my_database", options).await?;

I believe we could experiment with the async builder pattern to instead enable us to do something along the lines of:

// no options set
let database = client.create_database("my_database").await?;

// various options set
let database = client
    .create_database("my_database")
    .consistency_level(ConsistencyLevel::Strong)
    .insert("key", "value")
    .await?;

Adopting an API like this would take a lot of figuring out. In particular I'm uncertain how far we can get without .await respecting IntoFuture. But I think now that we have a sharper understanding of the requirements for our APIs, it's worth taking a step back and looking at our designs as a whole.

Footnotes

  1. For this use I think we'd be better served using a typemap, but for simplicity let's assume we'd use a hashmap-like interface for now.
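The builder-plus-.await idea above can be sketched with std's IntoFuture trait, which was not yet stable at the time of this discussion (it stabilized later, in Rust 1.64). CreateDatabaseBuilder and ConsistencyLevel are hypothetical names, and the body just echoes the database name instead of calling a real pipeline:

```rust
use std::future::{Future, IntoFuture};
use std::pin::Pin;

#[derive(Clone, Copy)]
enum ConsistencyLevel {
    Strong,
}

struct CreateDatabaseBuilder {
    name: String,
    consistency_level: Option<ConsistencyLevel>,
}

impl CreateDatabaseBuilder {
    fn new(name: impl Into<String>) -> Self {
        Self {
            name: name.into(),
            consistency_level: None,
        }
    }
    // Each optional setting is a chainable method; unset options
    // simply stay at their defaults.
    fn consistency_level(mut self, level: ConsistencyLevel) -> Self {
        self.consistency_level = Some(level);
        self
    }
}

// Implementing IntoFuture lets callers `.await` the builder directly.
impl IntoFuture for CreateDatabaseBuilder {
    type Output = String;
    type IntoFuture = Pin<Box<dyn Future<Output = String>>>;

    fn into_future(self) -> Self::IntoFuture {
        Box::pin(async move {
            // A real implementation would send the request through the
            // pipeline; here we just return the database name.
            self.name
        })
    }
}
```

With this in place, `CreateDatabaseBuilder::new("my_database").consistency_level(ConsistencyLevel::Strong).await` reads exactly like the proposed API.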

@MindFlavor (Contributor)

I believe we could experiment with the async builder pattern to instead enable us to do something along the lines of:

// no options set
let database = client.create_database("my_database").await?;

// various options set
let database = client
    .create_database("my_database")
    .consistency_level(ConsistencyLevel::Strong)
    .insert("key", "value")
    .await?;

❤️

This is basically the same API I was aiming for before we switched to the current approach. Notice the similarities with this old example:

// Let's get 3 entries at a time.
let response = client
    .list_documents()
    .consistency_level(response.unwrap())
    .max_item_count(3i32)
    .execute::<MySampleStruct>()
    .await?;

So I really like your proposal 😃. In my opinion the fluent syntax is much easier to read than spreading the parameters across multiple structs (context, options, and so on), but this is fundamentally different from what we are doing right now, so 🤷 ...

@heaths heaths (Member) commented Nov 2, 2021

So, if we can't come up with a replacement for passing the key/value pairs down, then we still need Context.

@JeffreyRichter what key/value pairs, exactly? If you mean something like a property bag in JS, options model in .NET, etc., I would think each method - if it needs it - would follow the same practice. Typically, that's {MethodName}Options across most languages.

@JeffreyRichter (Member)

About Context: Let's take a step back and look at the overall architecture.

  1. Customers create a client using 3 things: an endpoint, credentials, and options. These 3 things apply to ALL operations performed via this client object. Also, in other languages, clients must be thread safe (usually done by making policy objects immutable) so that multiple threads can share a client; for Rust, it is OK to clone a client (but this can cause a problem with credentials).

  2. Internally, the client ctor creates a pipeline which is an ordered list of policies (the passed-in credential gets turned into a policy). The concept of a "pipeline" is not generally exposed to customers; it is an internal implementation detail. There are many policies in the pipeline such as retry, telemetry, logging, transport, credential, etc. Almost all of these policies can be configured by the customer. Customers accomplish this by setting values in the options structure passed to the client ctor. Once these policy options are set, they are immutable for the lifetime of the client object. If the customer wants to perform some operations with a different transport or different logging, then the customer must create a different client object using the desired options structure.

  3. ALL client operations take inputs, and these are used to create an HTTP request object (using the client's endpoint) which is then sent down the pipeline. This means that ALL policies are applied to ALL client operations in exactly the same way. So, policies are operation agnostic, but they are specific to an Azure service. At the end of the pipeline is the transport policy, which sends the HTTP request to the service. The service responds to the transport policy, which then returns an HTTP response back through the pipeline. If an error occurred, the retry policy sends the request again, and some policies execute once per retry (logging, some credential, and transport). FYI: some policies execute once per operation (telemetry, some credential, distributed tracing). When the client method gets the response, it typically deserializes some response headers and/or JSON body into a structure which is returned to customer code.

As you can see, the pipeline policies are always applied to ALL operations of a client. But, occasionally, a customer wants to override the behavior for a specific call to an operation. For example, if the customer knows they are downloading a 4 terabyte blob, they might like to have different retry policy options. To accomplish this, the customer sets some per-method call "context" with a key like "retry" that has a value like "maxTries=10". The customer sets this key/value context and then passes it to the client's blob download function which passes it to the pipeline causing it to flow through all the policies. The retry policy looks specifically to see if a "retry" key exists in the context and if it does, it reads the value and uses maxTries=10 instead of what was passed to the client ctor's options structure. End result: For this one call to downloadBlob, the retry policy is altered by the customer.
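The retry-override mechanism described above can be sketched as follows, using a simple string map in place of the real Context type; the "retry" key and "maxTries=10" value format come from the example in this thread, and effective_max_tries is a hypothetical helper:

```rust
use std::collections::HashMap;

// Simplified stand-in for the Context key/value store.
#[derive(Default)]
struct Context {
    values: HashMap<&'static str, String>,
}

impl Context {
    fn insert(&mut self, key: &'static str, value: impl Into<String>) {
        self.values.insert(key, value.into());
    }
    fn get(&self, key: &str) -> Option<&str> {
        self.values.get(key).map(String::as_str)
    }
}

// The retry policy prefers a per-call "retry" entry in the Context
// over the client-wide default configured at construction time.
fn effective_max_tries(client_default: u32, ctx: &Context) -> u32 {
    ctx.get("retry")
        .and_then(|v| v.strip_prefix("maxTries="))
        .and_then(|n| n.parse().ok())
        .unwrap_or(client_default)
}
```

A call that sets `ctx.insert("retry", "maxTries=10")` overrides the retry count for that one operation only; all other calls keep the client-wide default.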

Here's another example: it is highly likely that the customer's code is inside a service itself. In this case, another service can be the client of the customer's service and the customer's service is the client of an Azure service. The request INCOMING to the customer's service can have special HTTP headers on it that are used for distributed tracing. These headers must flow from the originating service through the customer's service, and then through to Azure services. Here is how this is accomplished: The customer writes code that receives the incoming HTTP request. Then, the customer creates a "context" and sets a "tracing" key with the headers as its value. The customer MUST pass this context around their code as it represents the INCOMING request. Eventually, the customer service code calls to Azure by invoking a method on a client.
This context is passed to the method and the distributed tracing policy in the pipeline looks for the "tracing" key in the context. If found, this policy extracts the INCOMING headers, applies them to the OUTGOING HTTP request and sends these headers to the Azure service. End result: the distributed tracing information flows from the originating service through the customer's service and then all the way to the Azure services. It's now possible to see everything that happened in response to a single operation invoked by some end-user somewhere on the Internet.
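The tracing flow above reduces to two steps: capture the incoming headers into the context, then have the tracing policy apply them to the outgoing request. A minimal sketch, assuming the W3C Trace Context `traceparent` header and plain string maps in place of real request types:

```rust
use std::collections::HashMap;

type Headers = HashMap<String, String>;

// Step 1: the customer's service captures the incoming tracing header
// from the request it received (this is what gets stored in the context).
fn capture_tracing(incoming: &Headers) -> Option<String> {
    incoming.get("traceparent").cloned()
}

// Step 2: the distributed-tracing policy in the pipeline copies the
// captured header onto the OUTGOING request to the Azure service.
fn apply_tracing(outgoing: &mut Headers, traceparent: Option<String>) {
    if let Some(tp) = traceparent {
        outgoing.insert("traceparent".to_owned(), tp);
    }
}
```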

In Go, you can see some of our code where we set these key/value pairs on a context here. See the call to context.WithValue.

.NET uses AsyncLocal variables (similar to thread-local variables) as a way for customers to set these key/value pairs so they flow through the pipeline without polluting client method signatures. This makes the methods "look nice," but it introduces other problems, so it is not preferred.

Java has an actual Context class for the key/value pairs. If you look here you'll see a bunch of client methods that expose a Context parameter (it's always the last parameter). In C++, Context is also always the last parameter, see here for some examples on the KeyVault KeyClient.

For Go, Context is always the first parameter. See here for some examples in the Cosmos client.

@MindFlavor (Contributor)

Thank you @JeffreyRichter for clearing things up. So you are confirming that the only thing missing is:

As you can see, the pipeline policies are always applied to ALL operations of a client. But, occasionally, a customer wants to override the behavior for a specific call to an operation. For example, if the customer knows they are downloading a 4 terabyte blob, they might like to have different retry policy options. To accomplish this, the customer sets some per-method call "context" with a key like "retry" that has a value like "maxTries=10". The customer sets this key/value context and then passes it to the client's blob download function which passes it to the pipeline causing it to flow through all the policies. The retry policy looks specifically to see if a "retry" key exists in the context and if it does, it reads the value and uses maxTries=10 instead of what was passed to the client ctor's options structure. End result: For this one call to downloadBlob, the retry policy is altered by the customer.

That is exactly what @yoshuawuyts is proposing: an optional way to override the pipeline's configured values on a per-call basis. What I would prefer is to rename it from Context to PipelineOptionsOverride (or something like that). The mere fact that we are debating what Context is, is evidence that it is poorly named, IMHO.

The SDK customer can create a PipelineOptionsOverride (with tracing and whatnot) and pass it to every call they need to customize (besides what they did when creating the pipeline in the first place).

As for how to do it, Rust does not have implicits [1] like Scala does (and it seems C# effectively does too, with AsyncLocal). While something could be cooked up with macros, I'd rather have the explicit, optional parameter or, even better, the builder pattern @yoshuawuyts mentioned.

Footnotes

  1. Discussed here: https://internals.rust-lang.org/t/implicit-parameters/14514.

@cataggar added the labels Azure.Core (the azure_core crate) and design-discussion (an area of design currently under discussion and open to team and community feedback) on Nov 3, 2021
@JeffreyRichter (Member)

I'm open to a different name. I do not think it should have "Pipeline" in the name because the pipeline is not a concept that is immediately exposed to customers. There are also many precedents for tweaking an operation call by way of some "context". For example, in the .NET SDK, this search gives 127 hits for "RequestContext". And, of course, Google calls the context a Context in Go.

My point is that the term Context is frequently used for this purpose.
"PipelineOptionsOverride" is not an ideal name. Also, the distributed tracing example is not about overriding pipeline options at all.
Right now, I'd stick with "Context" but if others want to suggest alternatives, then I'm open to that.

@rylev rylev (Contributor, Author) commented Nov 3, 2021

Right now, I'd stick with "Context" but if others want to suggest alternatives, then I'm open to that.

I'd like to stick with Context for the time being, if only to make sure we're on the same page about everything else.

@yoshuawuyts's suggestion above for using the builder pattern seems ideal to me. It allows the user to specify optional things like options and key/values without being required to write the boilerplate of CreateDatabaseOptions::new() and Context::new() every time (since these are often left in their default states).

Something like the following would look and feel really nice:

database_client
    .create_database("my_database_name") // if the user doesn't have any options or context, they'd just call `.await` here
    .consistency_level(ConsistencyLevel::Strong) // an option the user doesn't need to specify
    .insert(RetryCount::new(5)) // a "key/value" specified as a type and looked up internally by type
    .await?;
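The "key/value specified as a type" idea in the snippet above is essentially a typemap: values are stored and looked up by their Rust type rather than by a string key. A minimal sketch using std's TypeId, with RetryCount as a hypothetical entry type:

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

// A typemap: at most one value per type, looked up by TypeId.
#[derive(Default)]
struct Context {
    values: HashMap<TypeId, Box<dyn Any>>,
}

impl Context {
    fn insert<T: Any>(&mut self, value: T) {
        self.values.insert(TypeId::of::<T>(), Box::new(value));
    }
    fn get<T: Any>(&self) -> Option<&T> {
        self.values
            .get(&TypeId::of::<T>())
            .and_then(|v| v.downcast_ref::<T>())
    }
}

// Hypothetical per-call override that a retry policy could look up.
struct RetryCount(u32);
```

A policy would then call `ctx.get::<RetryCount>()` and fall back to the client-wide default when the entry is absent.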

@JeffreyRichter the above seems able to handle everything you mentioned and, while different from other clients, feels like the idiomatic way Rust usually handles optional arguments. Thoughts?

I'm going to close this PR for now since the number of changes distracts from the topic at hand. As a next step, we'll move one of the operations (likely create_database) over to this builder pattern to get feedback on it.

This is also not too dissimilar to @MindFlavor's original design albeit with some key tweaks.

@rylev rylev closed this Nov 3, 2021
@JeffreyRichter (Member)

Actually, the context is usually NOT kept in its default state. That is true for our simple examples, but if the customer is building a service, then the Context MUST be flowed through in order for distributed tracing to work. In that case, ALL of our methods and the customer's methods MUST pass the Context explicitly. Failure to do this results in no distributed tracing.

@rylev rylev deleted the remove-context-parameter branch November 8, 2021 18:23