Migrate Validation to Typebox #772

Merged
merged 13 commits into from
Aug 7, 2024

Collaborator

@eddie-atkinson eddie-atkinson commented May 4, 2024

Migrate the validation flow to use Typebox. This PR:

  • Adds Yahoo Finance types and associated validation
  • Migrates all the modules which require validation to use Typebox
  • Removes the previous JSON schema validation, associated artifacts and CI actions

codecov bot commented May 4, 2024

Codecov Report

Attention: Patch coverage is 98.64130% with 5 lines in your changes missing coverage. Please review.

Project coverage is 96.99%. Comparing base (5a6a6c8) to head (6af931d).
Report is 36 commits behind head on devel.

Files with missing lines        Patch %   Lines
src/lib/datetime.ts             95.65%    2 Missing ⚠️
src/lib/yahooFinanceTypes.ts    94.59%    2 Missing ⚠️
src/modules/historical.ts       93.75%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            devel     #772      +/-   ##
==========================================
+ Coverage   93.30%   96.99%   +3.68%     
==========================================
  Files          27       30       +3     
  Lines         732      964     +232     
  Branches      247      212      -35     
==========================================
+ Hits          683      935     +252     
+ Misses         49       29      -20     

@gadicc
Owner

gadicc commented May 5, 2024

This all looks great, @eddie-atkinson!

General comments:

  • Glad you started first with the coercion to make sure it will work from the get-go.
  • Typebox looks great... love the composition and readability.
  • Style is all great, love everything... thanks for the unit tests early on (not a given for everyone), edge cases, error handling and reporting. Great job 🙏

Minor nitpick:

  • For the copied code in date/time, could you add the commit SHA so we can more easily track any further work upstream. Either just in the comment or as the link (i.e. instead of master in the URL). I also wonder if there's any better way to create a union with the existing upstream functions, rather than copying and editing. Did you look at that at all? Anyway, super minor, I wouldn't waste more time on this now - let's get everything done first and revisit later.

  • Re nullable naming, will respond inline above.

Thanks again, this is all great. Looking forward to following along and very excited to write new code using this in the future... the current system works well but can be quite tedious constantly rebuilding the schema to test things or forgetting to compile/commit on changes `:) So yeah, I'm excited, and big thanks again for all your efforts here 🙏

@eddie-atkinson
Collaborator Author

Hi @gadicc

For the copied code in date/time, could you add the commit SHA so we can more easily track any further work upstream

Done.

I also wonder if there's any better way to create a union with the existing upstream functions, rather than copying and editing. Did you look at that at all?

So my understanding is that the example directory this code sits in does not get bundled with typebox itself and is instead "for reference purposes only" with the intent that you copy the code you need. That being said, I share your concern about having it just sitting there, I'd prefer to defer to standard library functionality if we could. I can look into it as a follow up.

I've added a couple more commits which add a validateAndCoerceTypebox function. It is pretty simplistic, it just attempts to decode a value for a given schema, catches the error if there is one, logs it if desired and then re-throws it. The format of the error can be seen in the snapshot of the test in the PR. I'm not sure how keen you are to keep the error format consistent with what was there previously, is that part of the library's interface? If not, I feel that the error message as produced by typebox is quite readable. It includes the schema error, the value and the path. However, I am happy to spend some time making it look more like the old one if desired.
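
A minimal sketch of the decode / catch / log / re-throw flow described above. To keep it free of dependencies, `Decoder` is a hypothetical stand-in for a TypeBox schema plus its decode call, and the names `validateAndCoerce`, `Logger`, and `numberFromString` are illustrative assumptions rather than the PR's actual code.

```typescript
// Hypothetical stand-in for "a schema that can decode a value";
// in the PR this role is played by a TypeBox schema.
interface Decoder<T> {
  decode(value: unknown): T; // throws on invalid input
}

interface Logger {
  error(message: string): void;
}

// Attempt to decode; on failure, optionally log, then re-throw the same error.
function validateAndCoerce<T>(
  decoder: Decoder<T>,
  value: unknown,
  options: { logErrors: boolean; logger: Logger },
): T {
  try {
    return decoder.decode(value);
  } catch (err) {
    if (options.logErrors) {
      const message = err instanceof Error ? err.message : String(err);
      options.logger.error(`Validation failed: ${message}`);
    }
    throw err;
  }
}

// Example decoder (illustrative): accepts numbers or numeric strings.
const numberFromString: Decoder<number> = {
  decode(value: unknown): number {
    if (typeof value === "number") return value;
    if (typeof value === "string" && value.trim() !== "" && !isNaN(Number(value))) {
      return Number(value);
    }
    throw new Error(`Expected a number, got ${JSON.stringify(value)}`);
  },
};
```

Because the original error is re-thrown untouched, callers still see whatever path and value details the underlying validator attached to it.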

There were also a couple of test cases I dropped from the old validateAndCoerceTypes function, notably the it('fails on invalid options usage') test. To my mind invalid options use is protected against by the Typescript compiler yelling if you pass an invalid option. However, I'm keen to understand if that's a test case you think is still needed; it would be simple to add if desired.

Looking forward to hearing more feedback, and I'll start hacking on the perf tests in the meantime

@gadicc
Owner

gadicc commented May 7, 2024

All looks great as usual, @eddie-atkinson; thanks again!

date/time types from examples

Ah yes, right you are... didn't even notice that 😅 If it's not included in the npm package there's really not much we can do, but as you say, I guess this was their intention anyway. Definitely fine for now, and if we notice any bugs, we now have the commit hash (thanks) to report upstream or rebase upstream fixes (by hand).

validateAndCoerceTypebox & error format

Awesome, thanks! Absolutely fine with the new error reporting. All we did before was use ajv's format... no API promise here, just something the end user can understand - within reason. My only note is that - as you'll see in the existing validate() function - some common errors also logged a huge amount of unnecessary data - and we'd do our best to only log the relevant parts. This is something we'll have to address eventually, as it greatly affects usability / DX of the library... but I think once we have everything else in place, it shouldn't be too hard to create similar functionality, as we'll have real data to test against. Pity I didn't create tests for the existing functionality - my bad 😅 (Or maybe typebox logs better errors from the get-go, not sure).

invalid options use is protected against by the Typescript compiler yelling if you pass an invalid option

The non-obvious reason for this (which I should have mentioned in a comment) is for library usage by non-TS users. That's the only reason why we still need to validate at runtime. And I guess that - unfortunately - means we'll need to use typebox for each module's options too. Sorry 🙈
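
To illustrate why the runtime check still matters for plain JavaScript callers, here is a small sketch. `InvalidOptionsError` is the name used in the test quoted in this thread; the `ExampleOptions` shape, `allowedKeys`, and `assertValidOptions` are hypothetical and only show the idea.

```typescript
class InvalidOptionsError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "InvalidOptionsError";
    // Keeps instanceof working when compiled to ES5 targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical option shape, for illustration only.
interface ExampleOptions {
  lang?: string;
  count?: number;
}

const allowedKeys = ["lang", "count"];

// A JavaScript caller can pass anything, so the shape is checked at runtime.
function assertValidOptions(options: unknown): ExampleOptions {
  if (typeof options !== "object" || options === null || Array.isArray(options)) {
    throw new InvalidOptionsError("options must be an object");
  }
  const record = options as Record<string, unknown>;
  for (const key of Object.keys(record)) {
    if (allowedKeys.indexOf(key) === -1) {
      throw new InvalidOptionsError(`unknown option: ${key}`);
    }
  }
  if (record["lang"] !== undefined && typeof record["lang"] !== "string") {
    throw new InvalidOptionsError("lang must be a string");
  }
  if (record["count"] !== undefined && typeof record["count"] !== "number") {
    throw new InvalidOptionsError("count must be a number");
  }
  return record as ExampleOptions;
}
```

A TypeScript caller would get a compile error for `{ invalid: true }`; a JavaScript caller only finds out through a check like this one.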

But step by step... the most important thing is to get the foundations up - which you've done an excellent job of. Both systems can exist side by side for a while and we can convert everything gradually.

So, thanks again. I continue to follow with interest and appreciation :)

await expect(rwo({ invalid: true })).rejects.toThrow(InvalidOptionsError);
});

it("accepts empty queryOptions", async () => {
Collaborator Author

This test is slightly weird because no module other than search seems to support this. But I'm not really sure why

@eddie-atkinson
Collaborator Author

Hi @gadicc,

Me again.

A few updates on where I'm at.

I've added some more functionality for using typebox to validate options and responses, notably moduleExecTypebox.ts and search.tb.ts. The current approach I have taken is to build the modules that use typebox in parallel. I'm not sure if you have thoughts on how we could roll this change out but my current plan was to build the typebox functionality in parallel and then we could replace modules one by one with the Typebox implementations as we gain more confidence that it worked. This does have the annoying side effect of increasing the module's bundle size as we will include both typebox and ajv in the final output for a while, so happy to discuss other options. The current implementation allows users to opt into the new functionality by calling yf.typebox.<moduleName>().
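
The opt-in layout described above could look roughly like the sketch below. The function bodies are placeholders and `SearchFn` is a hypothetical, simplified signature; only the `yf.typebox.<moduleName>()` shape comes from the comment.

```typescript
// Hypothetical module signature; the real modules take more arguments.
type SearchFn = (query: string) => string;

// Stand-ins for the existing (ajv-validated) and new (typebox-validated) paths.
const searchAjv: SearchFn = (query) => `ajv:${query}`;
const searchTypebox: SearchFn = (query) => `typebox:${query}`;

const yf = {
  search: searchAjv, // default path stays unchanged
  typebox: {
    search: searchTypebox, // opt-in path, swapped in module by module
  },
};
```

With this layout, existing callers of `yf.search()` are unaffected while early adopters call `yf.typebox.search()`.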

For the naming of the typebox files I have adopted .tb.ts as a file extension to distinguish them from the main code path. Ideally I'd have liked to segregate them into a typebox directory, but ts-json-schema-generator and typescript were interacting badly and throwing all sorts of weird errors during schema generation (can't wait to take a flamethrower to that codegen step 🫠 ).

In terms of the mechanical process of converting typescript interfaces to Typebox, that part is pretty easy. You can basically copy the entire file wholesale into the tool that the maintainer of Typebox created and copy out the result with two important caveats:

  1. The output will convert number and date types to Type.Number() and Type.Date() respectively. Running the unit tests I found the JSON outputs in the code base actually encode these values in a variety of ways (those captured in yahooFinanceTypes.ts). So for these values I switched the generated types to YahooNumber and YahooFinanceDate.
  2. The nice comments with examples you wrote get stripped so they have to be copied back
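
To make the first caveat concrete, the sketch below coerces a few encodings into plain `number` and `Date` values. The specific encodings handled here are assumptions for illustration (the comment only says the values are encoded "in a variety of ways"), and `coerceNumber` / `coerceDate` are hypothetical helpers, not the `YahooNumber` / `YahooFinanceDate` types from the PR.

```typescript
// Assumed encodings (illustrative only):
//   numbers: 12.3 or { raw: 12.3, fmt: "12.30" }
//   dates:   epoch seconds, ISO 8601 strings, or { raw: epochSeconds }
type RawWrapper = { raw: number; fmt?: string };

function isRawWrapper(value: unknown): value is RawWrapper {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { raw?: unknown }).raw === "number"
  );
}

function coerceNumber(value: unknown): number {
  if (typeof value === "number") return value;
  if (isRawWrapper(value)) return value.raw;
  throw new Error(`Cannot coerce to number: ${JSON.stringify(value)}`);
}

function coerceDate(value: unknown): Date {
  if (value instanceof Date) return value;
  if (typeof value === "number") return new Date(value * 1000); // seconds -> ms
  if (isRawWrapper(value)) return new Date(value.raw * 1000);
  if (typeof value === "string") {
    const parsed = new Date(value);
    if (!isNaN(parsed.getTime())) return parsed;
  }
  throw new Error(`Cannot coerce to Date: ${JSON.stringify(value)}`);
}
```

A generated `Type.Number()` or `Type.Date()` would reject the wrapped forms, which is why the generated types had to be swapped for coercing ones.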

I had a bit of a play around with a performance test without getting very far, generally it seems like the example files in the repo aren't large enough for a meaningful performance difference to manifest.

Keen to discuss what the steps are from here, I'm not sure whether we want to merge this PR or split it up a bit to make reviewing more tractable

@gadicc
Owner

gadicc commented May 15, 2024

Amazing work as usual @eddie-atkinson - big thanks! 🙏

Not worried at all about including both ajv and typebox for now... the small difference in bundle size won't make a big difference outside of the browser. In any event, it won't be long lived.

So yeah, current parallel staging looks great. I have some thoughts on best next steps, but I'm just in the middle of some travel, so will need to get back. The code I've reviewed so far all looks great but I'm a few commits behind. Need to get back to you on the code comment above too (you're right, in theory, if it's only used internally, no need to validate at runtime - but I'll double check).

Yeah, all the tschema scripts etc., although they're all stable and work great (if you don't change anything), can become a hassle, as you saw, so yeah, another reason why I was excited by your proposal to move to something both clearer and more maintainable :) Thanks also for appeasing the coverage gods! No easy feat :D

In short, thanks again, and will be in touch soon!

@gadicc
Owner

gadicc commented May 26, 2024

@eddie-atkinson, thanks again for your patience. I'm still abroad and have had a lot less time online than anticipated. To that end, I'm very happy where this is all going, and in light of my limited time now, happy for you to take the lead on how best to implement this.

My original thoughts were to rebase and squash the commits down into 1) all preliminary support (packages, coercion, etc), 2) commit(s) for module(s). For the latter, I had originally thought it would be nice to deal with a single file, so the diff would show the exact changes to switch from the old way to the new way. But that's not mutually exclusive to the parallel track we've taken here... we can have both the original and .tb.ts files co-exist as you've described which takes off the pressure somewhat, and later down the line when we want to use the new versions only, we can just rename the files over the old versions and the diffs will still look great. Hope that makes sense! Those were my thoughts but as I said since I have a lot less time online than expected now, and you have all this stuff fresh in your mind, happy to go another route on this. To save some time and energy I also don't mind squashing everything as is into a single commit.

So, just let me know how you feel about everything and if there's still more immediate work you'd like to do or if things are ready to merge now and to continue on from there. And most importantly, thanks again for all your exceptional work on this and patience too! 🙏

@eddie-atkinson
Collaborator Author

Hi @gadicc,

I'm still abroad and have had a lot less time online than anticipated.

Enjoy it! There's always more time to sit in front of a computer. In this profession time away from the computer is an essential thing to keeping the fire burning over the long-term.

My original thoughts were to rebase and squash the commits down into 1) all preliminary support (packages, coercion, etc), 2) commit(s) for module(s).

I'm happy to work it this way. My plan for this PR was to try out a few modules and see how it all hangs together. After having done a few of the more hairy modules I'm confident we can handle all of them.

The rough plan I was thinking of was to:

  1. Make a PR for all the setup work (coercion, validateAndParse etc)
  2. Make separate PRs for each module

I was keen to make a separate PR for each module to make reviews more tractable as some of the modules are incredibly large. For the smaller ones I'm also happy to combine a few together, I just know search for example will be a bit of a hefty review.

I had originally thought it would be nice to deal with a single file, so the diff would show the exact changes to switch from the old way to the new way. But that's not mutually exclusive to the parallel track we've taken here... we can have both the original and .tb.ts

We might be able to have our cake and eat it here. Just move files like so search.ts -> search.ajv.ts and add a new search.ts file.

Anyways, happy to work the PR flow however it's going to make it easiest to review. I am conscious that this shouldn't result in breaking changes, but it's hard to be certain the TS interface won't change

@eddie-atkinson eddie-atkinson force-pushed the devel branch 5 times, most recently from 36f3052 to 7dcf40d Compare June 4, 2024 12:05
@gadicc
Owner

gadicc commented Jun 17, 2024

Hey @eddie-atkinson

Thanks again for all your efforts here and your patience during my travels.

I'm back now. I don't think you're waiting on me for anything but let me know if that's the case. In general, super stoked about all of this, agree with everything, and you can let me know when you're ready for final review before merge (granted that we may split into additional PRs).

Also, when you have a moment, won't you please drop me a mail at [email protected] to discuss something else. Thanks :)

@eddie-atkinson
Collaborator Author

eddie-atkinson commented Jun 17, 2024

Hey @gadicc,

Not blocked on anything at the moment. I took your suggestion to do one commit per module, then we can merge this PR and do the big switcheroo in a follow up.

I'm a bit busy with work due to EOFY projects, but will build up some more steam on this soon :)

won't you please drop me a mail at [email protected] to discuss something else.

I've flicked you an email, look out for an email from an address ending in my surname at gmail dot com

@gadicc
Owner

gadicc commented Jun 19, 2024

Ok great, that's all perfect, thanks for the update and good luck with all the EOFY stuff! :D

And thanks for the mail, confirming receipt; will be in touch there in due course :)

@eddie-atkinson eddie-atkinson force-pushed the devel branch 3 times, most recently from e674bff to d105e8c Compare July 21, 2024 12:42
@eddie-atkinson
Collaborator Author

Hi @gadicc,

A small update on where this is at:

  1. I have migrated every module except quote.ts. This is proving quite difficult, not because the migration is any harder than any of the other modules, but because I get a crash when generating the schema.json artifact due to the typebox types being exported as part of the function signature for the queryOptionOverrides parameter. This is incredibly weird and I have not had this issue for other modules, and am therefore at a little bit of a loss.
  2. I am getting a failure in the CI pipeline where the tests are getting SIGKILLed. Again, I'm at a bit of a loss here, as the tests are passing on my machine 🤷

Therefore, in order to make progress I am thinking of changing tack slightly. I might follow up on your original suggestion to migrate the modules in place instead of as a two step process. So instead of adding a quote.tb.ts and then migrating subsequently I will do it in one step. This will allow me to do away with the JSON schema generation in the same step, resolving my issue with the schema generation. What do you think?

On the unit tests issue, I am not sure how to proceed. Is this something you could take a look at?

@gadicc
Owner

gadicc commented Jul 21, 2024

Hey @eddie-atkinson

Let me start off like always with big thanks for all your amazing work here!

At the risk of these being famous last words, I'm feeling pretty confident with the changes and happy to move forward with this. I think (and maybe you can confirm) that there are some tests for failing validation that pass with both frameworks, in which case, I think we have great coverage. If we get bad reports in the wild we can always revert. And if each module is its own commit this will be very easy.

Re failing CI, I'm just going to take a guess here that it's running out of memory. There are aloooot of tests. I think if we stop running tests for both frameworks, it will work. I'm guided here of course by everything passing on your machine. But I've seen this kind of thing before (at that time, I picked a VM with higher RAM, but not sure if I have the option of doing that again within the free limits, and without two frameworks, it won't be necessary).

Very exciting that we got to this stage! So thanks again for everything, really - and super excited to get these merged.

@eddie-atkinson eddie-atkinson force-pushed the devel branch 4 times, most recently from 0566048 to 3516c5d Compare July 26, 2024 05:07
You can revert to the old behaviour with:

```js
yahooFinance._disallowAdditionalProps();
```

Collaborator Author

@gadicc I guess this is probably a breaking change given the docs previously stated users could use this if required.

Note: We can support this if desired, by having this as an option on the config object to strip extraneous properties and then using that config value to run Clean from typebox to trim the result sets.

Though I'm not sure I understand the use case for wanting to throw if the payload has extraneous properties but otherwise matches our validation spec

Owner

This is something I would actually like to keep, in one form or another. Doesn't necessarily need identical API since we prefixed with _.

  • For the user, most will never need this unless they're pedantic. We don't need to cover this.

  • For contributors, this has been the most useful way to see if we're adequately covering the entire response (that's why it's the default for NODE_ENV=test). I don't really want to accept commits unless they actually provide full typing for everything Yahoo is supplying - otherwise tests should fail.

So, the big requirement here is more for test validation during development rather than regular runtime validation of the library.

Collaborator Author

Ok not an issue.

I've added this as an option to the options.validation field and set it to default to NODE_ENV === "test" so that covers off the need for stricter types during development, whilst giving the options for callers to specify it themselves where it's useful to them with the caveat that it's an internal only option.

Let me know if this is sufficient to support all parties, if not I can have a bit more of a think about it
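
A rough sketch of that behaviour: strict in tests, lenient otherwise. The flag name `throwOnAdditionalProps` and the helper functions are hypothetical, and the environment name is passed in as a parameter so the example does not depend on `process.env`.

```typescript
interface ValidationOptions {
  // Hypothetical flag name: when true, unknown properties cause a throw.
  throwOnAdditionalProps: boolean;
}

// env is passed in rather than read from process.env so the sketch runs
// anywhere; a real library would likely read NODE_ENV directly.
function defaultValidationOptions(env: string | undefined): ValidationOptions {
  return { throwOnAdditionalProps: env === "test" };
}

function findAdditionalProps(value: Record<string, unknown>, knownKeys: string[]): string[] {
  return Object.keys(value).filter((key) => knownKeys.indexOf(key) === -1);
}

function checkResult(
  value: Record<string, unknown>,
  knownKeys: string[],
  options: ValidationOptions,
): Record<string, unknown> {
  const extras = findAdditionalProps(value, knownKeys);
  if (extras.length > 0 && options.throwOnAdditionalProps) {
    throw new Error(`Unexpected properties: ${extras.join(", ")}`);
  }
  return value;
}
```

Under this default, a new field from Yahoo fails the test suite (prompting contributors to type it) but does not break end users at runtime.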

Owner

Brilliant, yes, that's absolutely perfect! 🙏

@eddie-atkinson eddie-atkinson changed the title [DRAFT]: Add initial types and unit tests for coercion of tricky inputs using typebox Migrate Validation to Typebox Jul 26, 2024
@eddie-atkinson eddie-atkinson marked this pull request as ready for review July 26, 2024 07:04
@eddie-atkinson
Collaborator Author

Hi @gadicc,

I'm pleased to say that I think this PR is ready for review. I apologise for how long it is, the change of strategy to replace the entire existing flow has meant that I've touched a lot more files than I would have liked to.

My hope is most of the changes to the module code should be relatively easy to follow. That being said, there are several cases where I've had to reorder the types in addition to changing them over to Typebox. This is an unfortunate consequence of the fact that the schemas are objects as opposed to being purely types, so each schema has to be defined before another schema can reference it. To reduce cognitive load I would be happy to write a different PR that simply orders the types in the same way as the typebox schemas have been ordered here if that would make the diff here simpler to review.

that there are some tests for failing validation that pass with both frameworks, in which case, I think we have great coverage.

There are a couple in moduleExec.spec.ts that assert that the behaviour for failed validation is consistent, and this is something I informally verified along the way (read: messed up the schema and had failing tests), so I'm pretty confident we will be fine.

My only major concern is if there have been subtle changes to the Typescript interface which would cause broken builds, if not runtime issues, for end users. I have tried pretty hard to be consistent, but it is nevertheless possible.

Anyway, looking forward to your feedback and let me know if there is anything you'd like me to add tests for, or change to make this a safer and easier PR to review 😎

@gadicc
Owner

gadicc commented Jul 26, 2024

@eddie-atkinson, amazing! Thanks so much.

Going through this now. Some of my comments will be inline (like answering about additionalProperties and logger validation), so my reply might come in pieces. I'll send another message clarifying when I'm totally finished.

No worries about the length. Honestly, I'm going through things now and I'm getting so happy at every piece of the pipeline you've removed related to schema stuff, it's a joy for me and I hope will make it easier for others to contribute too. So thanks again for taking on this big project, and for the many, many, many hours you've put into it amongst your other priorities. It's been a pleasure working together on this!

Comment on lines 24 to 38
export class FailedYahooValidationError extends Error {
  name = "FailedYahooValidationError";
  result: any;
  errors?: null | ErrorObject[];

  constructor(
    message: string,
    { result, errors }: { result: any; errors?: null | ErrorObject[] }
  ) {
    super(message);
    this.result = result;
    this.errors = errors;
  }
}

Owner

This one unfortunately we'll need to keep, otherwise it's a breaking change.

I don't actually mind if we return the typebox errors here (vs ajv), I realise that's still a breaking change but I'll bite the bullet on this one as I don't think anyone is really inspecting the errors closely (beyond the message string property, which we should retain), but, we should still export a FailedYahooValidationError class and use it so that users can match with instanceof.

Collaborator Author

Not an issue, I've added it back with some light modification of the type to be an array of Typebox errors. As you say this is a breaking change, so up to you as to whether we do a major or minor bump. But from the perspective of an instanceof check nothing should have changed
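
As a sketch of what stays compatible: the class name, the `message` string, and `instanceof` matching are preserved, while the element type of `errors` changes. `ValidationIssue` below is a hypothetical, simplified stand-in for Typebox's own error type, and `parseOrThrow` exists only for the example.

```typescript
// Hypothetical, simplified error entry; the PR uses Typebox's own error type.
interface ValidationIssue {
  path: string;
  message: string;
}

class FailedYahooValidationError extends Error {
  name = "FailedYahooValidationError";
  result: unknown;
  errors?: null | ValidationIssue[];

  constructor(
    message: string,
    { result, errors }: { result: unknown; errors?: null | ValidationIssue[] },
  ) {
    super(message);
    // Keeps instanceof working when compiled to ES5 targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.result = result;
    this.errors = errors;
  }
}

// Illustrative caller: rejects anything that is not a finite number.
function parseOrThrow(result: unknown): number {
  if (typeof result === "number" && isFinite(result)) return result;
  throw new FailedYahooValidationError("Failed Yahoo Schema validation", {
    result,
    errors: [{ path: "/", message: "Expected number" }],
  });
}
```

User code that does `catch (e) { if (e instanceof FailedYahooValidationError) ... }` and reads `e.message` keeps working; only code that inspects individual `errors` entries would notice the change.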

Owner

Fantastic, thanks! Yes, that's exactly what I was looking for... I'll take the risk on the part of the breaking change that I don't believe anyone is using, while we retain compatibility with the more popular part. Thanks!

@gadicc
Owner

gadicc commented Jul 26, 2024

Ok I think I'm actually done :D Admittedly I went through the rest of the stuff quite quickly but honestly, it all looks great. I'm really happy with the way this has all gone. So, I think, just my two comments above (on additionalProperties and FailedYahooValidationError) and then we'll be good to go! 🎉

My only major concern is if there have been subtle changes to the Typescript interface which would cause broken builds, if not runtime issues, for end users. I have tried pretty hard to be consistent, but it is nevertheless possible.

I'm confident enough to move forward with this. We can add a small console.log() with a small notice about the changes and for people to report anything odd, to be removed a month later or so. (As much as these notices annoy people, but it's important; I have an open issue for improving the logging we do too, for another time).

@gadicc
Owner

gadicc commented Aug 6, 2024

Thanks! Will give this one more look through later today / tomorrow but I believe we've covered everything and I expect to merge! So thanks again, will confirm soon 🙏

@gadicc
Owner

gadicc commented Aug 7, 2024

Massive thanks, again, @eddie-atkinson! This was just a big job, and so greatly appreciated - since it so greatly improves DX. Thanks also for the rebase/reword/squashes so I can merge as is. All looks great.

Again, from going through all this, and from all our back-and-forth, I'm acutely aware of how much time you put into this, how much attention you paid to detail, to understanding the lib, style, flow, etc.

Merging now. Hopefully nothing else urgent comes up and everyone will have some time with this in dev before we publish the next release. Will tag you here or where relevant if anything comes up. I know you have other work commitments but hope you have a bit of time for a small break and celebration after this. If you have a https://buymeacoffee.com/ link, that's the least I could do. Thanks again! 🙏

@gadicc gadicc merged commit ae257e6 into gadicc:devel Aug 7, 2024
3 checks passed
@eddie-atkinson
Collaborator Author

@gadicc I think you're the one who deserves the thanks, you have been responsive, encouraging and pragmatic throughout this process. This was my first major contribution to open source, and you have honestly been the dream maintainer to work with on this, so once again thank you :)

If you have a https://buymeacoffee.com/ link, that's the least I could do.

I could never take money from someone who selflessly maintains software for others. After all, I can build my stock tracking app on CloudFlare Workers now, you've already saved me $5 😛

@gadicc
Owner

gadicc commented Aug 7, 2024

Haha, ok, fine! `:)

Greatly appreciate all the kind words and was a pleasure collaborating 🙏

You're also now the project's # 3 top human contributor with 13 commits (while acknowledging the rebase), and +6,021 and -15,003 LoC. The community thanks you!

@gadicc
Owner

gadicc commented Sep 16, 2024

🎉 This PR is included in version 2.12.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
