Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add headers impl #38986

Closed

Conversation

Ethan-Arrowood
Copy link
Contributor

@Ethan-Arrowood Ethan-Arrowood commented Jun 10, 2021

This PR adds a Headers class implementation based on the WHATWG Fetch Spec. It leaves off many pieces of the spec due to the non-browser environment. Additionally, it deviates from the spec slightly for sake of performance improvements.

While it maintains support for spec based initialization, folks looking for most-performant solution should utilize a flattened array.

Header entries are kept sorted at all times so all read-based use cases are fast as possible. Writing is also kept fast by using a probe-based binary search/insert.

This code is based on my work in undici-fetch. The headers api is the most stable of the pieces and has no dependencies on other things like whatwg readablestreams, so it makes for a good-first-pr for fetch in Node! I also believe this class could be utilized by other HTTP aspects of the Node API - but the initial goal is relation to Fetch.

@github-actions github-actions bot added build Issues and PRs related to build files or the CI. needs-ci PRs that need a full CI run. labels Jun 10, 2021
@jasnell
Copy link
Member

jasnell commented Jun 10, 2021

Wooo! Obviously still needs the docs. When those are added, let's be sure to mark this experimental for now.

@devsnek
Copy link
Member

devsnek commented Jun 10, 2021

Any reason this isn't just a map of arrays?

function validateName(name) {
  if (!name.match(/^[!#$%&'*+\-.^`|~\w]+$/)) {
    throw new ERR_OUT_OF_RANGE('name', 'a valid header name', name);
  }
  return toUSVString(name).toLowerCase();
}

function validateValue(value) {
  value = toUSVString(value)
    .replace(/^[\r\n\t ]+/g, '')
    .replace(/[\r\n\t ]+$/, '');
  if (/[\0\r\n]/.test(value)) {
    throw new ERR_OUT_OF_RANGE('name', 'a valid header value', value);
  }
  return value;
}

class Headers {
  #headers = new SafeMap();

  constructor(init = undefined) {
    if (init) {
      if (init instanceof Headers) {
        this.#headers = new SafeMap([...init.#headers]);
      } else if (ArrayIsArray(init)) {
        for (const header of init) {
          if (header.length !== 2) {
            throw new ERR_OUT_OF_RANGE();
          }
          this.append(header[0], header[1]);
        }
      } else {
        for (const pair of ObjectEntries(init)) {
          this.append(pair[0], pair[1]);
        }
      }
    }
  }

  append(name, value) {
    value = validateValue(value);
    name = validateName(name);
    const existing = this.#headers.get(name.toLowerCase());
    if (existing) {
      existing.push(value);
    } else {
      this.#headers.set(name.toLowerCase(), [value]);
    }
  }

  delete(name) {
    name = validateName(name);
    this.#headers.delete(name);
  }

  get(name) {
    name = validateName(name);
    const values = this.#headers.get(name);
    if (!values) {
      return null;
    }
    return values.join(', ');
  }

  has(name) {
    name = validateName(name);
    return this.#headers.has(name);
  }

  set(name, value) {
    value = validateValue(value);
    name = validateName(name);
    this.#headers.set(name, [value]);
  }
}

@Ethan-Arrowood
Copy link
Contributor Author

Ethan-Arrowood commented Jun 10, 2021

Any reason this isn't just a map of arrays?

I believe this was previously considered, but a flat array of strings was assumed to be more efficient in regards to space and speed.

Emphasis on assumed; would require some benchmarking and more research on arrays vs Map

@targos
Copy link
Member

targos commented Jun 10, 2021

How do we want to expose this and the other fetch features ( before they become globals)?

@devsnek
Copy link
Member

devsnek commented Jun 10, 2021

would require some benchmarking and more research on arrays vs Map

benchmarks would be good.

@timdp
Copy link
Contributor

timdp commented Jun 10, 2021

Any reason this isn't just a map of arrays?

I believe this was previously considered, but a flat array of strings was assumed to be more efficient in regards to space and speed.

Emphasis on assumed; would require some benchmarking and more research on arrays vs Map

Yeah, I remember assuming that. 🙂 It'd indeed be interesting to benchmark!

@Ethan-Arrowood
Copy link
Contributor Author

@devsnek here is some preliminary benchmarks: https://gist.github.com/Ethan-Arrowood/e132fb72e5ef4156dfc0de3b80ef3c16

A Map based implementation is only slightly faster in some cases and in some scenarios. It seem that the overhead of the Map api is not faster than just using arrays; and that the binarySort is actually quite fast!

The iteration of Map impl could be faster if the entries method was implemented differently; such as sorting as the headers array is created from the Map iterator.

@Ethan-Arrowood Ethan-Arrowood marked this pull request as ready for review June 21, 2021 17:36
@Ethan-Arrowood
Copy link
Contributor Author

Ethan-Arrowood commented Jun 21, 2021

Alright this PR is ready for review. All docs and tests from undici-fetch have been copied over (and transformed to the Node way). This is my first large API addition to Node so I'm probably missing some things 🤷‍♂️

(Note i'm currently working through all the linter issues now)

Copy link
Contributor

@aduh95 aduh95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to run the linter on the markdown file, there are a few CI errors that are reported

doc/api/fetch.md Show resolved Hide resolved
lib/internal/fetch/headers.js Outdated Show resolved Hide resolved
lib/internal/fetch/headers.js Outdated Show resolved Hide resolved
lib/internal/fetch/headers.js Outdated Show resolved Hide resolved
lib/internal/fetch/headers.js Outdated Show resolved Hide resolved
lib/internal/fetch/headers.js Outdated Show resolved Hide resolved
test/parallel/test-headers.js Outdated Show resolved Hide resolved
test/parallel/test-headers.js Outdated Show resolved Hide resolved
test/parallel/test-headers.js Outdated Show resolved Hide resolved
test/parallel/test-headers.js Outdated Show resolved Hide resolved
Copy link
Contributor

@bl-ue bl-ue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which is prefered for indention in the docs: tabs or spaces, and how many? I noticed you're using both a single tab and two spaces for indentation here.

@Ethan-Arrowood
Copy link
Contributor Author

How do I get the node linter to auto fix?

@aduh95
Copy link
Contributor

aduh95 commented Jun 21, 2021

How do I get the node linter to auto fix?

You can do make lint-js-fix.

@Ethan-Arrowood
Copy link
Contributor Author

And how do I deal with strings that exceed 80 characters in length? The linter isn't fixing those automatically

@targos
Copy link
Member

targos commented Jun 21, 2021

Unfortunately, ESLint doesn't have an auto-fix for this rule. I usually reformat the code with Prettier in these cases.

@Ethan-Arrowood
Copy link
Contributor Author

I usually reformat the code with Prettier in these cases.

Do you just npx prettier <file_name>?

@jasnell
Copy link
Member

jasnell commented Jul 6, 2021

@Ethan-Arrowood ... would you mind if I opened a PR against your branch? I have some more detailed changes I'd like to make here to get it closer to spec compliance?

Actually, nevermind. What I want to do here is not going to work. I was thinking of using the Headers class directly in the HTTP/3 implementation but there are several limitations that make it difficult to do so on the server side while remaining spec compliant. What I'm thinking instead is to implement a lower level "header list" class that does what I need that can eventually sit under this class (replacing the kHeaderList's use of an array.

@Ethan-Arrowood
Copy link
Contributor Author

Thank you for the review @jasnell

What I'm thinking instead is to implement a lower level "header list" class that does what I need that can eventually sit under this class (replacing the kHeaderList's use of an array.

I originally built something like this but the moment you do anything more than an array you lose some of the major speed and memory performance benefits that makes this implementation unique (and potentially viable).

I was rather thinking of extracting some of the methods I'm using in this class and making them a sort of 'Headers List Utility Set' that you can then utilize in your own Headers class implementation and still reap the same perf benefits of an underlying array. We sorta already have that since the binary search / sort is its own function, and so are the normalize/validate functions.

@Ethan-Arrowood
Copy link
Contributor Author

Summary of changes today:

  • incorporated @jasnell review
  • fixed linting errors
  • worked on docs; some things are still failing in make doc-only that I'm not sure how to fix

I can tell we are getting very close with this 🙌

lib/fetch.js Outdated Show resolved Hide resolved
lib/fetch.js Outdated Show resolved Hide resolved
lib/internal/fetch/headers.js Outdated Show resolved Hide resolved
lib/internal/fetch/headers.js Outdated Show resolved Hide resolved
lib/internal/fetch/headers.js Show resolved Hide resolved
lib/internal/fetch/headers.js Outdated Show resolved Hide resolved
lib/internal/fetch/headers.js Outdated Show resolved Hide resolved
test/parallel/test-headers.js Outdated Show resolved Hide resolved
test/parallel/test-headers.js Outdated Show resolved Hide resolved
test/parallel/test-headers.js Outdated Show resolved Hide resolved
Ethan-Arrowood and others added 4 commits July 26, 2021 10:04
Co-authored-by: Antoine du Hamel <[email protected]>
Co-authored-by: Antoine du Hamel <[email protected]>
@@ -29,6 +29,7 @@
* [Domain](domain.md)
* [Errors](errors.md)
* [Events](events.md)
* [Fetch](fetch.md)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit concerned that folks will misinterpret this as being a full fetch implementation...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the best way to introduce this without implementing all of fetch?

I could make a new feature/fetch branch and we can merge all fetch work to that until its completely ready. I'm really eager to start integrating undici-fetch with the streams/web api

fill(this, init);
}

append(name, value) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have to add an argument length check for the public methods. When you call .append() or .append('foo') on Chrome/Edge/FireFox/Safari you get a TypeError about invalid arg length / missing arguments. Should I use the ERR_MISSING_ARGS to accomplish this?

@Ethan-Arrowood
Copy link
Contributor Author

Do we want to close this PR in favor of adding Fetch via Undici instead? Fetch in Undici core is now under active development by @ronag

@dnalborczyk
Copy link
Contributor

the end-users should be able to access it somehow without using --expose-internals: either as a global (in which case, it would make this semver-major), either as a new core module.

@aduh95 why would this require a semver major? I think it's safe, as the global will be shadowed and generally speaking shouldn't cause any problems. if this would break anything, browsers could never add any new globals either. I think a semver minor would be just fine.

@aduh95
Copy link
Contributor

aduh95 commented Oct 22, 2021

the end-users should be able to access it somehow without using --expose-internals: either as a global (in which case, it would make this semver-major), either as a new core module.

@aduh95 why would this require a semver major? I think it's safe, as the global will be shadowed and generally speaking shouldn't cause any problems. if this would break anything, browsers could never add any new globals either. I think a semver minor would be just fine.

Adding new globals is considered semver-major, see #20306 for context.

@dnalborczyk
Copy link
Contributor

Do we want to close this PR in favor of adding Fetch via Undici instead? Fetch in Undici core is now under active development by @ronag

I think Headers should be added even with fetch following later, as the class is very usable on its own. I've seen plenty of projects just using node-fetch and using nothing but the Headers class for easier header handling.

https://nodejs.org/api/http.html#messagerawheaders

const headers = new Headers(unflat(request.rawHeaders))

I also think it should be added (lazily?) to Incoming and Outgoing messages.

@dnalborczyk
Copy link
Contributor

dnalborczyk commented Oct 22, 2021

Adding new globals is considered semver-major, see #20306 for context.

thanks. I agree with @guybedford here: #20306 (comment) I don't see how or why node.js deviates from a browser in that context.

@Ethan-Arrowood
Copy link
Contributor Author

Closing in favor of undici fetch 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Issues and PRs related to build files or the CI. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.