Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should result validation be optional #16

Closed
gadicc opened this issue Feb 5, 2021 · 11 comments
Closed

Should result validation be optional #16

gadicc opened this issue Feb 5, 2021 · 11 comments

Comments

@gadicc
Copy link
Owner

gadicc commented Feb 5, 2021

Regarding the validation

As far as I can understand the validation function validateAndCoerceTypes() is called twice?

  1. On the request - taking the query overrides, schemaKey and moduleName into account
  2. On the response from Yahoo - taking the result from the response and schemaKey into account

Am I correct in saying that if the validation would fail it'd throw an error and cause the application to "fall over"? Ofcourse the
developer could handle the error and do something else, but I'm getting to the point below.

I'm not worried about the first case 1) On the request since that's something that's stable and in our domain. However, since > case 2) On the response from Yahoo is outside of our domain, I was thinking...

Would it make sense to let the developer using this library decide if they would like to turn on/off validation? The user story for > this would be:

As a developer
I want to know that node-yahoo-finance2 library will perform the requests I tell it to regardless of what response Yahoo gives us
So that I know that the app won't throw an error and fall over

Originally posted by @pudgereyem in #13 (comment)

@gadicc
Copy link
Owner Author

gadicc commented Feb 5, 2021

Well, it's all about your last line:

So that I know that the app won't throw an error and fall over

Say I have some typical code like:

function calculateStockHoldingValue(symbol) {
  const { qty } = await db.holdings.findOne({ symbol });
  const { price } = await yahooFinance.quoteSummary(symbol, { modules: "price" });

  return qty * price.regularMarketPrice;
}

Say qty = 50 and unexpectedly, price.regularMarketPrice is undefined.

const result = 50 * undefined; // NaN

In the best case scenario, this will crash my app somewhere else. In the worst case scenario, the NaN will be used with other calculations, which will also be NaNs, which could then crash all other unrelated parts of my app.

Or with dates:

const { price: { regularMarketTime }} = await yahooFinance.quoteSummary('AAPL');

// Uncaught TypeError: Cannot read property 'getTime' of undefined
regularMarketTime.getTime();  //  "But it worked fine in development!!"

// Even worse is it IS defined, but is not a Date.  That *won't* throw an error in my code,
// instead, I'll be making a comparison that will be true when it should be false, etc.
const goodTimeToTrade = regularMarketTime > anotherDateObject;

Having said all that, if these options happen frequently, we could at least make all fields optional, and allow additional properties (at least then typescript won't allow you to use a property that could be undefined without a check). My preference however is to have a bit more pain now and then have pain-free development and production ever after - but let's see what the next few days look like :)

Don't forget, the user can and should be catching these errors:

let result;
try {
  result = await yahooFinance.search('gold');
} catch(e) {
  // do nothing with invalid result
  return;
}
// Everything below here will be safe and won't throw unexpected errors
$('input').value(result.Result[0].name);

However, I will add a way for the user to still get access to the result and errors to figure out if they want to regardless, understanding of the risk, say: (not possible yet)

let result;
try {
  result = await yahooFinance.search('gold');
} catch(e) {
  // i'll take responsibility for this
  result = e.result; // maybe
}
// and will do my own validation
if (result && isArray(result.Result) && result.Result[0] && typeof result.Result[0].name === 'string')
  $('input').value(result.Result[0].name);

gadicc added a commit that referenced this issue Feb 5, 2021
The Error object will now include "result" and "error" properties.
Note, func itself can turn off logging, but no way to use this yet.
@pudgereyem
Copy link
Contributor

pudgereyem commented Feb 5, 2021

Hey @gadicc, thanks a lot for creating a separate issue for this and nice writeup.

I agree with you that strict schema validation is great since it would prevent errors as in the examples you provided above. But I don't think I'm a big fan of having strict schema validation on the response from a third party that in theory could change / add to their response whenever.

Let me give you an example of where strict schema validation on the response is unwanted, in my humble opinion.

Example of where strict schema validation on the response is unwanted

Let's say that I want to get the stock symbol for a listed company, but I only have the company name at hand.

My code could look something like this:
import yahooFinance2 from "yahoo-finance2";
import { show } from "./utils";

const getSymbolForCompanyName = async (companyName: string) => {
  // 1. Search for the company name to retrieve the stock symbol
  let searchResult;
  try {
    searchResult = await yahooFinance2.search(companyName);
  } catch (e) {
    throw e;
  }

  // 2. Throw an error if the search didn't return any quotes
  if (searchResult === undefined || !searchResult.quotes.length) {
    throw new Error(`Could not find any quotes for the search`);
  }

  // 3. Filter out the quotes to only include "stocks"
  const onlyQuotesWithQuoteTypeEquity = searchResult.quotes.filter(
    (searchQuote) => {
      if ('quoteType' in searchQuote && searchQuote.quoteType === 'EQUITY') {
        return true;
      } else {
        return false;
      }
    },
  );

  // 4. Get the symbol (type guarded)
  let symbol;
  if (
    'quoteType' in onlyQuotesWithQuoteTypeEquity[0] &&
    'symbol' in onlyQuotesWithQuoteTypeEquity[0]
  ) {
    symbol = onlyQuotesWithQuoteTypeEquity[0].symbol;
  } else {
    throw new Error(`The quote we found din't have a symbol`);
  }

  return symbol;
};

show(async () => {
  // Please change the companyName below
  const companyName = "Apple";
  const symbol = await getSymbolForCompanyName(companyName);
  return `We searched for company name ${companyName} and found the symbol: ${symbol}`;
});

I forked your CodeSandbox demo and added my code, but the proxy gives me 403, you can copy the code from here though.

What could happen that's "out of our control" that would break this code when not necessary?

First, let's set the scene for this particular use case

  1. The only request we make in this case to Yahoo is search
  2. The response we get back from Yahoo we assume to be SearchResult and it's strictly validated with JSON schema here
  3. We are only interested in looking at a particular subset of this response, namely SearchResult.quotes

So, let's ask ourselves; what could happen to the response from Yahoo that would case our validation to throw an error when in reality it wasn't needed?

  1. If Yahoo decide to add an additional property to the response SearchResult
  2. If Yahoo decide to change (or add) what they return in quotes[]
  3. If Yahoo decide to change (or add) what they return in news[]

The first issue I believe can be avoided by setting additionalProperties to true. This could be done for the other types, e.g SearchQuoteYahooEquity as well.

However, the other two issues, where Yahoo could possibly decide to change (or add) to quotes or news are harder to account for. For example, they could possibly add another type of "News Object", which doesn't affect my code at all, but would still cause the request to throw an error.

Closing thoughts

  1. I think "Strict schema validation" on the response is AWESOME to have in development, because this way we can keep the Types and the Schema up to date as time goes on.
  2. However, I think that "Strict schema validation" is not something I'd like to have in PRODUCTION. I'd rather deal with the validation that is relevant to my "case", as I did above, see The quote we found din't have a symbol.
  3. For some, strict schema validation might be great tho - which is why I think this should be something that you can opt in/out for

Eager to hear your thoughts on this @gadicc 💯

@pudgereyem
Copy link
Contributor

pudgereyem commented Feb 5, 2021

@gadicc also, here's my thoughts on your suggestion to allow the developer to opt out of the validation.

However, I will add a way for the user to still get access to the result and errors to figure out if they want to regardless, understanding of the risk, say: (not possible yet)

let result;
try {
  result = await yahooFinance.search('gold');
} catch(e) {
  // i'll take responsibility for this
  result = e.result; // maybe
}
// and will do my own validation
if (result && isArray(result.Result) && result.Result[0] && typeof result.Result[0].name === 'string')
  $('input').value(result.Result[0].name);

At a first glance I don't like this approach, because in the developers opinion it's not an error. It's an active choice to simply not validate the response. I'm not sure what the best way is, but one way would be to pass an argument to the request itself, pseudo code below.

// search for gold but don't validate the response
const result = await yahooFinance.search('gold', { validateResponse: false});
// and will do my own validation
if (result && isArray(result.Result) && result.Result[0] && typeof result.Result[0].name === 'string')
  $('input').value(result.Result[0].name);

@gadicc
Copy link
Owner Author

gadicc commented Feb 5, 2021

Hey @pudgereyem

Thanks for writing all this up. I'm up for making it possible to opt-out out of validation. I don't believe on forcing anything on developers as long as the risks are clear.

Your proposal looks good, let me play around a bit and see what it looks like.

@pudgereyem
Copy link
Contributor

@gadicc awesome! Is there anything I could help out with? I had a quick look at #8 and I'm definitely up for adding another module.

I was thinking that my approach could be:

  1. Create a new issue for a particular module, e.g recommendations by symbol
  2. Write up a proposed solution
  3. Get your thoughts
  4. Create a PR

Thoughts?

@gadicc
Copy link
Owner Author

gadicc commented Feb 5, 2021

Think we're good! You can go check it out.

Just note from https://github.com/gadicc/node-yahoo-finance2/blob/devel/docs/validation.md that it's a moduleOption and not a qqueryOption, so:

const result = await yahooFinance.search('gold', {}, { validateResponse: false});

@gadicc
Copy link
Owner Author

gadicc commented Feb 5, 2021

Is there anything I could help out with? I had a quick look at #8 and I'm definitely up for adding another module.

That would be amazing!!

You could probably just take a stab at the code and submit a PR... it's pretty modular now with moduleExec.

@pudgereyem
Copy link
Contributor

@gadicc sweet! Will try it out later tonight and look into a new module as well! 💯

gadicc pushed a commit that referenced this issue Feb 5, 2021
# [1.5.0](v1.4.2...v1.5.0) (2021-02-05)

### Bug Fixes

* **search:** change longname to be an optional property ([9b8128d](9b8128d))
* **validate:** honor _opts: { validation: { logErrors: true }} ([1e0ebae](1e0ebae))
* **validate:** show errors by default ([ab87ad3](ab87ad3))

### Features

* **modules:** allow { validateResult: false } to not throw ([#16](#16)) ([dc199b5](dc199b5))
gadicc added a commit that referenced this issue Feb 6, 2021
If { validateResult: false } we already don't throw from earlier fix.
However, we still logged the errors, which doesn't really make sense.
Although the user could turn off error logging in general for all
requests, this fix will turn off error logging "automatically" just
for the specific request where { validateResult: false }.
gadicc pushed a commit that referenced this issue Feb 6, 2021
## [1.5.2](v1.5.1...v1.5.2) (2021-02-06)

### Bug Fixes

* **modules:** don't log errors when {validateResult:false} ([#16](#16)) ([29f23dc](29f23dc))
@gadicc
Copy link
Owner Author

gadicc commented Feb 8, 2021

@pudgereyem thanks, let me know if you have any issues with it and i'll re-open. And of course feel free to open issues on anything else.

@gadicc gadicc closed this as completed Feb 8, 2021
@pudgereyem
Copy link
Contributor

@gadicc sweet! I spent the weekend offline mostly but will have a look now! Hope you had a nice weekend.

@gadicc
Copy link
Owner Author

gadicc commented Feb 8, 2021

Yes, sure, no pressure at all from my side! Just wanted to tidy up a bit :)

I aimed to spend the weekend offline but didn't have strong enough willpower to avoid working on a few more things 😅 But it's probably for the best as I'll have a lot less time to work on this this week.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants