Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(v2): add single quote to CSV output #4950

Merged
merged 12 commits into from
Sep 29, 2022

Conversation

wanlingt
Copy link
Contributor

@wanlingt wanlingt commented Sep 27, 2022

Problem

Works towards #133

Solution

Use regex expressions to determine if the field value starts with a formula character. If so, prefix it with a single quote. The only exceptions to this are field values that only contain numbers, + or - e.g. +65, -1.

Design doc here

Breaking Changes

  • Yes - this PR contains breaking changes
    • Downloaded CSVs might have cells that start with an additional single quote, if any of their field values start with a formula character

Tests

  • Enter a value starting with a formula character into a storage form's short text, long text and table fields. Download the responses. These values should be prefixed with a single quote.
  • Repeat the same for a form's feedback.

@wanlingt wanlingt marked this pull request as ready for review September 27, 2022 02:43
Copy link
Contributor

@karrui karrui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, but wait for one more set of eyes pls :P.

@amitogp @mantariksh would we need to communicate this change to users too? not sure what we decided on.

Comment on lines 17 to 18
export const PURE_NUMBER_REGEXP = new RegExp('^[+-][0-9]+')
export const FORMULA_INJECTION_REGEXP = new RegExp('^[+=@-].*')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments:

  • No need to use the RegExp constructor, unless we are constructing a pattern from parts
  • Pure numbers regexp should have an end anchor as well. Otherwise in its current form, it still allows formulas like this: +3*SUM(A1) (in fact do add a test for an input like that please)
  • Checking for .* at the end of a regexp can be discarded
Suggested change
export const PURE_NUMBER_REGEXP = new RegExp('^[+-][0-9]+')
export const FORMULA_INJECTION_REGEXP = new RegExp('^[+=@-].*')
export const PURE_NUMBER_REGEXP = /^[+-]\d+(\.\d+)?$/
export const FORMULA_INJECTION_REGEXP = /^[+=@-]/

Maybe we can simplify PURE_NUMBER_REGEXP to /^[+-][\d.]+$/ ? That's not exact number regexp, but it's safe from an injection perspective.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking about whether we should check just /^[+-.\d]*$/ for expressions like "-4+2-0" or something like that, since i guess those are formulas but i don’t think they actually pose any risk (correct me if i’m wrong). It might be simpler to describe to admins too, to explain the behavior of what we add quotes to and what we don’t

Copy link
Contributor Author

@wanlingt wanlingt Sep 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justynoh yeah I don't think they pose any risks if they are just digits and + or -. Didn't come across that being mentioned in any of the sites which talked about formula injection - the malicious commands all contained at least 1 alphabet. thanks for spotting that case!
Made the change in 858f4dc, decided to go with /^[+-.\d]+$/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timotheeg thanks tim! added the additional test here c0b1511

Copy link
Contributor Author

@wanlingt wanlingt Sep 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timotheeg 's offline comment: - is a range separator, so it should only be added at the end of the regexp. otherwise it captures ASCII characters between + and .
e0da940

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of curiosity, then, if they all contain at least 1 alphabet, why not just use a single regex /^[-+@=].*[a-zA-Z]/ haha. Maybe there is a case I'm not seeing

Copy link
Contributor Author

@wanlingt wanlingt Sep 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justynoh that regex wouldn't capture other symbols (e.g. |) and formulas like =SUM(A1). also I think it's better to play it safe and not have Excel react to any formulas at all (other than ones with numbers and + | -, which we know are harmless).

@amitogp
Copy link

amitogp commented Sep 28, 2022

@karrui @wanlingt the plan is to release it with react and announce it as a security improvement along with other react enhancements. Something like: "Enhanced security to stop malicious inputs to form responses." @pearlyong please help in adding it in our react enhancements. @kennethchangOPENGOV will need your help in adding details of this to our guide.

@wanlingt
Copy link
Contributor Author

wanlingt commented Sep 29, 2022

@karrui @wanlingt the plan is to release it with react and announce it as a security improvement along with other react enhancements. Something like: "Enhanced security to stop malicious inputs to form responses." @pearlyong please help in adding it in our react enhancements. @kennethchangOPENGOV will need your help in adding details of this to our guide.

@amitogp I've created new issues for these followup actions
@pearlyong - #4970
@kennethchangOPENGOV - #4971

@wanlingt wanlingt merged commit 7a154fd into form-v2/develop Sep 29, 2022
@wanlingt wanlingt deleted the form-v2/formula-injection branch September 29, 2022 03:56
.tz('Asia/Singapore')
.format('DD MMM YYYY hh:mm:ss A')

const insertedLine = `${insertedCreatedDate},'${feedback.comment},${feedback.rating}`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wanlingt there's a bug here right? there should be a single quote in the expected line

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or actually should not have anything here since there was no change to AngularJS's feedback csv generator.

Suggested change
const insertedLine = `${insertedCreatedDate},'${feedback.comment},${feedback.rating}`
const insertedLine = `${insertedCreatedDate},${feedback.comment},${feedback.rating}`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my bad, you are right! this tests for feedback csv generator in AngularJS not React. Will have to add a separate test for React

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in your PR 8b33acc

@justynoh justynoh mentioned this pull request Oct 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants