Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Etags #352

Open
haacked opened this issue Jan 30, 2014 · 14 comments
Open

Support Etags #352

haacked opened this issue Jan 30, 2014 · 14 comments
Labels
Status: Pinned A way to keep old or long lived issues around Status: Up for grabs Issues that are ready to be worked on by anyone Type: Feature New feature or request

Comments

@haacked
Copy link
Contributor

haacked commented Jan 30, 2014

GitHub API supports Etags! But Octokit.net does not.

We should support this! But it's a teensy bit tricky. If we only support this at the ApiConnection or Connection level, then using Octokit becomes kind of a pain for everyone.

Ideally, it's part of our top level api. But that poses a problem as well, do we just add an Etag property to the models? Then what do we do about cases like this?

IObservable<Repository> GetRepositories();

The ETag for that request (or set of requests) is for the list of repositories. Each individual repository won't have an etag.

Thoughts?

@haacked
Copy link
Contributor Author

haacked commented Jan 30, 2014

For reference, this is how Octokit.objc does it: https://github.com/octokit/octokit.objc/blob/master/OctoKit/OCTResponse.m#L22

Maybe our clients need to return response objects and not just the actual entities.

@shiftkey
Copy link
Member

Maybe our clients need to return response objects and not just the actual entities

I think that's the best place to put this data. It kinda sucks, but a simple abstraction to expose the HTTP data is the easiest way to do it:

public interface IResponse<T>
{
    public string ETag { get; }
    // other fields
    public T Data { get; }
}

How this mixes with our Task<T>/IObservable<T> responses is probably something worth spiking - I think it's a rather significant change...

@pmacn
Copy link

pmacn commented Jan 30, 2014

I kinda like what they have going on over in octokit.rb for this, akin to exposing an IResponse GetLastResponse() method/property/whatnot either on the connection or the client.

Snipped from the https://github.com/octokit/octokit.rb readme

user      = Octokit.user 'andrewpthorp'
response  = Octokit.last_response
etag      = response.headers[:etag]

@pmacn
Copy link

pmacn commented Jan 30, 2014

Pretty sure this approach will still cause issues with the reactive client methods that go through GetAndFlattenAllPages since there's no simple way to know when there's a new last IResponse available.

Unless you expose an IObservable<IResponse> Responses { get; } on the reactive clients. Although initially that just sounds silly, I'll let it simmer.

@haacked
Copy link
Contributor Author

haacked commented Jan 30, 2014

Ok, I woke up at 5 AM with a bit of inspiration about this I want to run by you. 😄

Let's take a step back and think about what the purpose of etags are in the first place. For the most part, they make caches more efficient in order to reduce bandwidth. They potentially reduce load on the server if the server has an efficient means to determine whether an entity has changed.

So why should a client to Octokit.net care? Well, it allows them to make conditional requests which don't count against their rate limit. But this is a lot of work for a client to manage properly.

What if we did it for them?

We could maintain an in-memory cache of requests and responses indexed by etag. So if you call GetRepositoriesForCurrent, we'd make a conditional request and if it's not modified, we'd return the set from the cache.

This in-memory cache could be replaced with other implementations. For example, you could inject a SQLite backed cache instead of an in-memory one.

We don't have to be a perfect cache to be effective. We could use the most recently used approach if memory consumption was crazy.

Alternatively, we could just automatically cache the etag (and not the response data) and just provide a method to make the conditional request test. This would allow clients to test if their cache is up to date.

@pmacn
Copy link

pmacn commented Jan 30, 2014

I didn't wake up at 5am! Mainly because I didn't get to bed until 1.
But this pretty much covers what I was thinking as I was falling asleep. Just caching request/response pairs.

Although I was envisioning a setup where the client was simply given the option to turn on/off caching and then optionally change the "cache provider" (i.e. the in-memory, file system, SQLite alternatives) Also possibly make the cache modular in some way so that you could share a cache between client instances. Not sure if people are going to be constructing clients left and right.

@haacked
Copy link
Contributor Author

haacked commented Jan 30, 2014

@half-ogre any thoughts on this?

@haacked
Copy link
Contributor Author

haacked commented Feb 11, 2014

/cc @paulcbetts for your thoughts. Sounds like we might be able to plug-in Akavache.Http.

@half-ogre
Copy link
Contributor

@haacked I think there's a place for @shiftkey's idea of Response<T> regardless of etags, but I also like the idea of a cache provider for requests along the lines of what you describe. I think not caching at all is the best default behavior, but making it easy to "turn on" simple caching (like what you describe) or drop in your own cache provider would be pretty sweet.

@nigel-sampson
Copy link
Contributor

I have a naive implementation of the discussed approach at https://github.com/nigel-sampson/octokit.caching

@shiftkey
Copy link
Member

Fun fact: we currently have this IApiRespose type which captures the relevant fields to do this:

public interface IApiResponse<out T>
{
   T Body { get; }

  IResponse HttpResponse { get; }
}

public interface IResponse
{
    object Body { get; }

    IReadOnlyDictionary<string, string> Headers { get; }

    ApiInfo ApiInfo { get; }

    HttpStatusCode StatusCode { get; }

    string ContentType { get; }
}

The part that I don't think we've tackled is how to apply the ETag on subsequent requests. I like pushing this down to the HTTP layer, so perhaps a wrapper like Octokit.Caching is still the way to go.

Anyway, I'm leaning towards doing more HttpMessageHandler-based stuff through things like #808 - taking that to it's logical conclusion would likely support something like this...

@darrelmiller
Copy link
Contributor

If you plug in Tavis.HttpCache it will transparently take advantage of Etags/LastModified assuming you allow the response to be privately cached. It will automatically convert a GET into a conditional if the cached response is stale and will serve the response from the cache if a 304 comes back.

@mitchdenny
Copy link
Contributor

@shiftkey pointed me at this issue from #1636 where I added a big +1 to the issue of dealing with etags/last-modified. @haacked mentioned that folks want support for this for caching, but for me the other big one (and the reason I commented in the first place) was dealing with concurrency issues.

In my scenario I have GitHub spraying events at a webhook endpoint which dynamically scales and I have no real control over which endpoint gets the request. So the ability to query to see whether a change has occurred since I read an entity from GitHub allows me to introduce some serialization in an effort to enforce consistency (this is causing me some grief right now with check runs).

For caching I think the ApiConnection support that exists today is okay provided you don't have multiple threads rolling through the same GitHub client (I'm not sure if there are any protections for this already). But having something that is attached to the response or entities would be useful.

The way I was thinking about this API was that it would be done via the entities, so you might end up with something like this:

// We get a checkrun somehow - could be via an API call, or via a webhook.
var client = GetClientFromSomewhere();
var run = await client.Checks.Runs.Get(repositoryId, checkRunId);

var update = new CheckRunUpdate(run) {
  Status = new StringEnum<CheckStatus>(CheckStatus.InProgress)
}

await client.Checks.Run.Update(update);

Behind the scenes, all types that participate in concurrency control would implement an interface which the client can use when making the request to set the appropriate headers. The act of constructing CheckRunUpdate passing in the CheckRun instance is signalling that you want to participate in the currency checks.

When a request fails because of this an exception would be thrown so that the developer could react - but there would also be a non-exception version of the API, for example:

if (await client.Checks.Run.TryUpdate(update))
{
  // Compensate for concurrency issue.
}

Ideally you could use ETags instead of date time stamps for concurrency control as it provides stronger guarantees (making an assumption here about the GitHub backend) - but the interface that the entities implement would allow them to signal the kind of concurrency control they support.

@martincostello
Copy link
Contributor

I came across this issue after reading @JamieMagee's great blog post Making the most of GitHub rate limits and seeing if there were any improvements I could make to my Octokit.NET usage to use less of my rate limits.

It turns out for my specific C# code usage there wasn't really anything worth the investment, particularly as it requires a bit of jiggery-pokery to pass the headers to the API requests once you grab the ETag from the ApiInfo.

At the moment it seems like to easily deal with ETags and 304 responses you have to drop down to a very low level, at which point the value provided by Octokit gets less and less as the easy abstractions aren't usable with the ETags.

First-class support of some sort for this in the future would be most welcome.

@nickfloyd nickfloyd added Type: Feature New feature or request Status: Pinned A way to keep old or long lived issues around and removed category: feature labels Oct 27, 2022
@nickfloyd nickfloyd moved this to 🔥 Backlog in 🧰 Octokit Active Dec 5, 2022
@nickfloyd nickfloyd added the Status: Up for grabs Issues that are ready to be worked on by anyone label Jan 13, 2023
@nickfloyd nickfloyd added the hacktoberfest Issues for participation in Hacktoberfest label Sep 21, 2023
@nickfloyd nickfloyd removed the hacktoberfest Issues for participation in Hacktoberfest label Jan 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Pinned A way to keep old or long lived issues around Status: Up for grabs Issues that are ready to be worked on by anyone Type: Feature New feature or request
Projects
Status: 🔥 Backlog
Development

Successfully merging a pull request may close this issue.

9 participants