Change approach to fake test data #835

Open
shaunagm opened this issue Jun 6, 2023 · 3 comments
Labels
- enhancement: Impact - something should be added to or changed about Parsons that isn't causing a current breakage
- medium priority: Priority - this doesn't need to be addressed immediately, but will broadly impact Parsons users
- needs discussion: needs community input and/or maintainer discussion

Comments

shaunagm (Collaborator) commented Jun 6, 2023

Currently, our connector tests involve large amounts of fake data, usually in JSON format (but occasionally stored as Python dicts, CSVs, or other formats). Sometimes this data is incorporated into the tests themselves, making them hard to read. Sometimes it's put in separate files, which is better, but it's still not ideal to have, say, a 400-line test data file to test just one connector.

Are there other approaches that might be more readable, easier to maintain, and easier to write? (I know generating the test data is often the most annoying part of writing tests for connectors.)

I'm aware of tools like Factory Boy, but that's for Python objects, not really for data. There's also Faker, which seems more promising.
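To make the idea concrete, here is a dependency-free sketch of the kind of generated contact records a tool like Faker would produce; the field names and the `fake_person` helper are hypothetical, and seeding keeps the "random" data reproducible across test runs:

```python
import random
import string

def fake_person(rng: random.Random) -> dict:
    """Generate one fake contact record; field names are illustrative only."""
    first = "".join(rng.choices(string.ascii_lowercase, k=6)).title()
    last = "".join(rng.choices(string.ascii_lowercase, k=8)).title()
    return {
        "id": rng.randint(1, 10_000),
        "first_name": first,
        "last_name": last,
        "email": f"{first.lower()}.{last.lower()}@example.com",
    }

# A fixed seed means every test run sees identical "fake" data.
rng = random.Random(835)
records = [fake_person(rng) for _ in range(3)]
```

The appeal over hand-written 400-line fixture files is that the shape of the data is defined once, in a few lines, and the volume is a parameter.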

Another option might be making use of JSON Schemas, although "validate the schema" isn't a huge part of the tests we're doing.
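For illustration, here is a minimal hand-rolled type check of the sort a schema-based approach formalizes; the `jsonschema` package provides real JSON Schema validation, and everything below (the `SCHEMA` dict and `matches_schema` helper) is a hypothetical stdlib stand-in:

```python
# Hypothetical, minimal schema check: field name -> expected Python type.
SCHEMA = {"id": int, "email": str}

def matches_schema(record: dict, schema: dict) -> bool:
    """Return True if every schema field is present with the expected type."""
    return all(
        isinstance(record.get(field), expected)
        for field, expected in schema.items()
    )
```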

(I don't love that any of these approaches would involve adding another dependency - maybe it's time to separate out the handful of dev dependencies, like we do the docs dependencies?)

Whatever we do, we should make sure to document it really well so that it makes the lives of people writing Parsons tests easier rather than harder and more confusing.

What do folks think?

shaunagm added the enhancement, medium priority, and needs discussion labels on Jun 6, 2023
corasaurus-hex commented Jun 7, 2023

What do you think about something like hypothesis-jsonschema? The plus side to using something like this is that you can default to running just one example per test but can, in CI or otherwise, use more examples to stress test.
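The "one example locally, more in CI" idea can be sketched without Hypothesis itself, e.g. with an environment variable controlling how many generated records a test loops over; the variable name and `check_record` assertion below are hypothetical:

```python
import os
import random

# Hypothetical knob: 1 example by default, more when CI sets it higher.
NUM_EXAMPLES = int(os.environ.get("PARSONS_TEST_EXAMPLES", "1"))

def check_record(record: dict) -> None:
    """Stand-in for a real connector assertion."""
    assert 0 <= record["id"] <= 100

rng = random.Random(0)
for _ in range(NUM_EXAMPLES):
    check_record({"id": rng.randint(0, 100)})
```

Hypothesis does this properly (shrinking failing examples, per-profile example counts), but the sketch shows the cost/benefit trade-off being discussed.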

shaunagm (Collaborator, Author) commented:

@corasaurus-hex great suggestion. I haven't used hypothesis-jsonschema before, so my main concern is around usability for people who don't have engineering backgrounds. There's also the general time to implement vs. other solutions. But this definitely deserves consideration!

corasaurus-hex commented:
@shaunagm that's an extremely fair take; it's definitely more challenging and time-consuming to implement, and maybe a little confusing if a test fails in one instance and not in another because the data is all generated. So, consider that suggestion retracted.

As a side note, it looks like Factory Boy can create dicts, which I wasn't aware of.
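Factory Boy does ship a `factory.DictFactory` for exactly this. As a hedged, dependency-free sketch of the pattern it supports (the `contact_factory` name and its default fields are hypothetical):

```python
def contact_factory(**overrides) -> dict:
    """Return a fake contact dict; keyword args override the defaults.

    A stdlib stand-in for the pattern Factory Boy's DictFactory supports.
    """
    record = {
        "id": 1,
        "first_name": "Test",
        "email": "test@example.com",
    }
    record.update(overrides)
    return record

# Tests only spell out the fields they actually care about:
vip = contact_factory(email="vip@example.com")
```

This keeps individual tests readable: each one states only the fields relevant to what it asserts, and the shared defaults live in one place.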
