Add source headers to hosting API calls #181

Closed
fershad opened this issue Dec 19, 2023 · 7 comments · Fixed by #184

@fershad
Contributor

fershad commented Dec 19, 2023

Is your feature request related to a problem? Please describe.
Not a problem, but something that would help us (Green Web Foundation) later down the line to understand where API requests are coming from.

Describe the solution you'd like
Add a custom header x-greencheck-src to the API fetch requests made to the Greencheck API.

const req = await fetch(
`https://api.thegreenwebfoundation.org/greencheck/${domain}`
);

const req = await fetch(`${apiPath}/${domainsString}`);

There should be a default value x-greencheck-src: "co2js" but there should be a mechanism for that to be changed to a value that's set by the user.
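
As a rough illustration of the proposal above, here is a minimal sketch of a default header value with a user override. The helper name `greencheckHeaders` and the `options.source` parameter are hypothetical and only there for illustration; they are not part of CO2.js.

```javascript
// Hypothetical helper, not part of CO2.js. The header name and the default
// value come from the proposal above.
const DEFAULT_SOURCE = "co2js";

// Build the custom header, falling back to the default when no source is given.
function greencheckHeaders(source = DEFAULT_SOURCE) {
  return { "x-greencheck-src": source };
}

// `options.source` is an assumed name for the user-supplied override.
async function checkDomain(domain, options = {}) {
  const req = await fetch(
    `https://api.thegreenwebfoundation.org/greencheck/${domain}`,
    { headers: greencheckHeaders(options.source) }
  );
  return req.json();
}
```

Calling `checkDomain("example.com")` would send the default value, while `checkDomain("example.com", { source: "my-tool" })` would send the caller's own value.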

@fershad fershad added this to the 0.14 milestone Dec 19, 2023
@mrchrisadams
Member

Fish, given the traffic we get I'm very much in favour of this, but would you mind adding a bit of info outlining where the x-greencheck-src: "co2js" syntax came from? I get the x- prefix for unofficial headers, but I'm less confident commenting on the rest because I'm not so familiar with comparable prior work.

I think there may be existing conventions we can refer to for identifying API clients or user agents, that have been implemented in various tooling for parsing logs, rate-limiting and so on.

This would save us work further down the line if we follow an existing convention or spec for handling API traffic.

@fershad
Contributor Author

fershad commented Dec 20, 2023

Maybe @philsturgeon might have some idea.

Phil, we're hoping to introduce a way to see how many checks against the Green Web Dataset API are coming from different tools/providers.

I've suggested the idea of a custom x- header here as a way for folks to self-report when sending a request. Outside of an API key solution, is there any other convention for how we might be able to do this?

@philsturgeon

From what I understand of the requirements, you can use the User-Agent header for this. That's exactly what it's for!

@mrchrisadams
Member

mrchrisadams commented Dec 20, 2023

Yeah, thanks @philsturgeon - I agree that the closest thing is likely the User Agent Header - it's in the HTTP spec as a SHOULD:

14.43 User-Agent

The User-Agent request-header field contains information about the
user agent originating the request. This is for statistical purposes,
the tracing of protocol violations, and automated recognition of user
agents for the sake of tailoring responses to avoid particular user
agent limitations. User agents SHOULD include this field with
requests. The field can contain multiple product tokens (section 3.8)
and comments identifying the agent and any subproducts which form a
significant part of the user agent. By convention, the product tokens
are listed in order of their significance for identifying the
application.

   User-Agent     = "User-Agent" ":" 1*( product | comment )

Example:

   User-Agent: CERN-LineMode/2.15 libwww/2.17b3

It's true that it's mainly used for browsers, but we know scraper/crawler bots use it too.

MDN also shares some examples of API clients and other binaries, like curl or PostmanRuntime, sending it in the docs below:

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent

This leaves open the question of what the User-Agent string ought to be by default. I'm less sure here.

My starting point might be the library name and version - we could go with something like co2.js/0.13.1 - for example.

This would give us an idea of what API support might be across the clients hitting our API if we introduced new features over time, and it only uses characters we know are already used in other UA strings.

Because we don't restrict access or require API keys right now, setting the UA explicitly would presumably at least give us a useful basis for understanding more about the sources of traffic.
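
To make the idea concrete, here is a minimal sketch of how UA strings in request logs could be grouped by their leading product token (the `name/version` form from the spec text quoted above). The function names are hypothetical, and the regular expression is a simplification of the full grammar, not a complete parser.

```javascript
// Extract the first product token ("name/version") from a User-Agent string.
// Simplified: it reads up to the first whitespace or parenthesis only.
function firstProduct(userAgent) {
  const match = /^([^\s\/()]+)(?:\/([^\s()]+))?/.exec(userAgent || "");
  if (!match) return null;
  return { name: match[1], version: match[2] ?? null };
}

// Tally requests by the name of their leading product token.
function countByProduct(userAgents) {
  const counts = new Map();
  for (const ua of userAgents) {
    const product = firstProduct(ua);
    const key = product ? product.name : "unknown";
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

With a default like co2.js/0.13.1, library traffic would land in its own bucket, separate from browser UA strings that start with Mozilla/5.0.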

Update: It might be the case that whatever runtime we use already sends something we can use. I just checked the logs, and we definitely get it for most of the traffic - or rather, requests mostly use the same UA strings as browsers, so we can't easily differentiate API traffic from browser traffic right now.

More below:
https://blog.postman.com/what-are-http-headers
https://en.wikipedia.org/wiki/User-Agent_header

This question on Stack Exchange was useful context, too:
https://softwareengineering.stackexchange.com/questions/355670/does-it-make-sense-for-user-agent-to-be-required-for-rest-apis

@philsturgeon

Anyone and everyone can set the user agent header, most just don’t bother. I generally get iOS clients to set it so I can see which versions people are using without invasive telemetry, things like that, so this seems like basically the same thing.

@sfishel18
Contributor

👋 hello! i came across this project in the Frontend Focus newsletter, and if you're looking for new contributors i'd love to help out! this particular issue looks like a good introductory one. could i take a crack at it?

@fershad
Contributor Author

fershad commented Dec 30, 2023

@sfishel18 thanks for reaching out. We'd love a PR for this. Based on the conversation between Chris and Phil above, here's a small spec:

  • Add a User-Agent header to the fetch requests in the original comment.
  • The header should have the value co2js/<version>, where <version> is the version number of the library being used to make the request.
  • Ideally, the version number should update by itself, without us having to manually change it every release.
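
The first two points above could look roughly like the sketch below. The helper name `userAgentHeaders` and the way `version` is passed in are assumptions for illustration only; how the version reaches the code is left open here, and some browsers may not pass through a User-Agent header set this way.

```javascript
// Hypothetical helper: builds the header in the co2js/<version> form from the spec.
function userAgentHeaders(version) {
  return { "User-Agent": `co2js/${version}` };
}

// Same request as in the original comment, with the header attached.
// `version` is assumed to be supplied by the caller in this sketch.
async function checkDomain(domain, version) {
  const req = await fetch(
    `https://api.thegreenwebfoundation.org/greencheck/${domain}`,
    { headers: userAgentHeaders(version) }
  );
  return req.json();
}
```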

Let me know if you get stuck or need a hand.

sfishel18 added a commit to sfishel18/co2.js that referenced this issue Dec 31, 2023
the version from package.json is used to inject an environment variable
when esbuild runs, which the running code can read to create the correct
user agent string when making requests.

fixes thegreenwebfoundation#181
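
A minimal sketch of the approach the commit message describes, assuming esbuild's `define` option is used for the injection. The variable name `CO2JS_VERSION` and both function names are assumptions for illustration and may not match the actual commit.

```javascript
// Build-time side: turn package.json contents into an esbuild `define` map.
// esbuild expects each value to be a string of JavaScript source, hence JSON.stringify.
function versionDefine(pkg) {
  return { "process.env.CO2JS_VERSION": JSON.stringify(pkg.version) };
}

// In a build script this map would be handed to esbuild, for example:
//   require("esbuild").build({ ...buildOptions, define: versionDefine(require("./package.json")) });

// Runtime side: after the build, the expression below is replaced with a string literal.
function userAgent() {
  return `co2js/${process.env.CO2JS_VERSION}`;
}
```

Because the value is read from package.json at build time, the user agent string follows the release version without a manual edit.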
@github-project-automation github-project-automation bot moved this from Now to Later in CO2.js public roadmap Jan 28, 2024