Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix AForm Validation for fields with non-digit characters #14307

Closed
wants to merge 6 commits into from

Conversation

cincodenada
Copy link

@cincodenada cincodenada commented Nov 24, 2021

This is a proposed fix for #14306 - I believe the core issue is that mergeChanges() tries to use the keyboard event to predict what the value of the field will be after the change is applied, and gets it wrong for changes that don't involve an explicit selection, at least in my environment.

The cause here is that mergeChanges() seems to assume that character insertions/appends will come with selStart and selEnd set, and thus it can just treat them like any other selection replacement. That isn't the case on in my environment, where appends (as well as single-character insertions and deletions) have selStart and selEnd set to -1. This results in the predicted value of the field being a single character: the character being appended.

This ends up not totally breaking things - for validations that will pass if each individual character typed is a valid first character in the field (since that's what it gets interpreted as), things are mostly fine - a ZIP code can be entered just fine, although length is not constrained until the field is exited.

But for fields with hyphens in them, for instance, pdf.js won't allow you to type the hyphen (because "-" by itself is not the beginning of a valid SSN), and then when the field is blurred, the final validation runs, and complains about the missing hyphens.

This PR updates mergeChanges() so that if selStart and selEnd are -1 it assumes the event is either an append and predicts the result as value + change, or a delete and predicts the result as value.slice(0, -1). This fixes the most common cases, but still leaves room for some very strange behavior if the user inserts or deletes characters in the middle.

Because we don't have selStart and selEnd, I don't see any way of distinguishing between and append an an insertion, or a deletion at the end of the value vs an internal deletion. In this PR I just assume that changes are happening at the end of the string, but that means we'll be quite wrong about internal deletions/insertions, and if someone goes back and inserts a character we'll predict that it was added at the end of the string.

Fortunately that prediction isn't persisted, and internal insertions/deletions should be rare, so I think it's worth this most-of-the-way fix that should cover the majority of cases. If there is some way to determine where single-character changes are being made, from the event or some other source, that would of course be optimal, but I didn't see one, and I don't know enough about how these events are generated to know if it's something we could add ourselves, or if it's something in some PDF standard somewhere.

Also improve variable names for clarity, mask/char -> maskChar/valueChar
to make clear that these are both characters
This is how my pdf.js behaves, I'm not sure if there are environmental
differences, but these tests now demonstrate the failure that I'm seeing
as an end-user
Without a selStart/selEnd (that is, with them at -1), which is how
simple appends show up in my environment, the existing method ended up
predicting that the resulting field value was always a single character
- the character being appended.

Instead, check if we have no selection, and if so, assume the event is
an append.

This is suboptimal, as it could also be a delete or an internal
insertion, but I don't see any way to distinguish those, so in order to
not let perfect be the enemy of good, we assume it's an append, which is
by far the most common case
We can at least tell that it's a delete because change is an empty
string
This was a simple off-by-one error I ran into while fixing, it wouldn't
accept 867-5309 because that's 8 characters, so it switched to the
longer format and thus the validation failed.

Move the threshold to > 8 instead, and also use the longer format if the
field starts with a parenthesis, since otherwise there's not actually a
way to successfully enter the long form
@cincodenada
Copy link
Author

I found the contribution guide I had somehow glossed over the first time and ran unit tests, and fixed up some expectations - I plan to circle back and add some more unit tests to cover the additional cases I ran into here as well.

mergeChange(event) {
let value = event.value;
if (typeof value !== "string") {
value = value.toString();
}
// If there was no selection, it's a single-character change
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did you find this information ?
In the specs (see http://dev.datalogics.com/cookbook/document/AcrobatDC_js_api_reference.pdf#page=381&zoom=auto,-214,751) about selStart:
The starting position of the current text selection during a keystroke event.
I just tried in Acrobat to add text in textfield and print the changes.

  • I entered: abcdef
  • I selected bcd and I pressed g
    and I get the following output:
change=a, selStart=0, selEnd=0
change=b, selStart=1, selEnd=1
change=c, selStart=2, selEnd=2
change=d, selStart=3, selEnd=3
change=e, selStart=4, selEnd=4
change=f, selStart=5, selEnd=5
change=g, selStart=1, selEnd=4

My understanding is that the selStart is the position of the caret before the change.
When there is no selection, selEnd is equals to selStart else it's the position of the end of the selection.

Copy link
Author

@cincodenada cincodenada Nov 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found it through some testing and logging with pdf.js, and seeing what the events being created actually looked like, and they had -1 for single-character edits.

Like I said (maybe here or on the bug, I don't remember), I wasn't sure whether the behavior of these events was in a specification or something somewhere or not - it looks like it is, which I guess means the bug is that something else in pdf.js isn't populating selStart/selEnd for these events? That would also make sense, and would solve the edge cases as well.

@@ -361,15 +361,15 @@ describe("Scripting", function () {
name: "Keystroke",
willCommit: false,
change: "o",
selStart: 4,
selEnd: 4,
selStart: -1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the changes in this file are very likely wrong.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes may be wrong in that they are against spec, but they are correct in that they are what pdf.js is emitting currently. It sounds like pdf.js's behavior in creating these events is the actual bug though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So to be clear, I agree these will probably need to end up being reverted, hopefully replaced by some tests (or fixed test) of the generation of events for single-character edits, to validate that they are populating selStart/selEnd like this further up in the process.

const mask = _cMask.charAt(i);
const char = _value.charAt(i);
const checker = checkers.get(mask);
for (let i = 0, ii = _value.length; i < ii; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my bad... thank you to point that out.

/*
* Take an event object and reconstruct what we think the result will be once
* the change is applied, so we can validate against it and cancel the event
* if need be
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Here, and everywhere in the patch, please make sure to always end all sentences with a period.

Also, please keep https://github.com/mozilla/pdf.js/wiki/Squashing-Commits in mind when updating the patch.

@calixteman
Copy link
Contributor

The main issue is here:
https://github.com/mozilla/pdf.js/blob/master/src/display/annotation_layer.js#L1072
elementData.beforeInputSelectionRange is null when it shouldn't.

There is another problem here:
https://github.com/mozilla/pdf.js/blob/master/src/scripting_api/event.js#L29-L30
where we should have ?? instead of ||.

@@ -564,7 +564,7 @@ class AForm {
event.change.length +
event.selStart -
event.selEnd;
if (finalLen >= 8) {
if (finalLen > 8 || event.value[0] === "(") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the rhs of || is useless but definitely the lhs is a way better like this.

Copy link
Author

@cincodenada cincodenada Nov 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RHS is necessary because we validate every time the field changes. For final validation you're right that it's unnecessary, but for the validation-during-typing, without the RHS if someone starts typing (888) the validation will stop them - for some reason the first character seems to be ignored for validation-during-typing, but the second parenthesis will cause an error:
Screen Shot 2021-11-24 at 3 34 02 PM

This is because (888) only has 5 characters, so this code selects the second pattern (999-9999), and (888) does not match the beginning of the that pattern. Instead, in that case we want to select the first pattern to match against, which is what the RHS here is doing.

I can add some tests around this too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense, +1.

@calixteman
Copy link
Contributor

And I just checked MDN:
https://developer.mozilla.org/en-US/docs/Web/API/HTMLElement/beforeinput_event
and beforeinput is now available, so we should likely use it now instead of this workaround.

@cincodenada
Copy link
Author

cincodenada commented Nov 24, 2021

The main issue is here:
https://github.com/mozilla/pdf.js/blob/master/src/display/annotation_layer.js#L1072
elementData.beforeInputSelectionRange is null when it shouldn't.

Okay, so yeah sounds like this code will work fine if selStart/selEnd is always present, which it sounds like is the behavior of Acrobat. So then, the root bug is further up the stack, where we're not passing down selStart/selEnd for edits without an explicit selection. Are you on the trail of a fix for that, or should I try digging in further?

There is another problem here:
https://github.com/mozilla/pdf.js/blob/master/src/scripting_api/event.js#L29-L30
where we should have ?? instead of ||.

Oh, good catch - I looked at that code and didn't even notice but you're right, that code is dropping zeroes improperly. I wonder if that's connected to the reason the first character isn't validating during typing...

@cincodenada
Copy link
Author

In summary: I'm happy to continue digging here and move this PR towards integrating beforeInput and getting a fix in higher up where it belongs, but I don't want to do duplicate work, so I can also just revert the workaround bits and this PR can just be the smaller fixes around phone numbers. Either way I'll squash things down once things are settled - do y'all prefer a small number of atomic commits, or one commit per PR?

@calixteman
Copy link
Contributor

I'm fine with a PR to only modify aform.js and it'd be nice if you could add a test to check that format is fine.
It could something similar to:

it("should not validate a number on a keystroke event", async () => {
const refId = getId();
const data = {
objects: {
field: [
{
id: refId,
value: "",
actions: {
Validate: [
`AFNumber_Keystroke(null, 0, null, null, null, null);`,
],
},
type: "text",
name: "MyField",
},
],
},
appInfo: { language: "en-US", platform: "Linux x86_64" },
calculationOrder: [],
dispatchEventName: "_dispatchMe",
};
sandbox.createSandbox(data);
await sandbox.dispatchEventInSandbox({
id: refId,
value: "123s456.789",
name: "Keystroke",
willCommit: true,
});
expect(send_queue.has("alert")).toEqual(true);
expect(send_queue.get("alert")).toEqual({
command: "alert",
value:
"The value entered does not match the format of the field [ MyField ]",
});
});
});

where an alert is handled and to:
it("should trigger a Keystroke event and change it", async () => {
const refId = getId();
const data = {
objects: {
field: [
{
id: refId,
value: "",
actions: {
Keystroke: [`event.change = "a";`],
},
type: "text",
},
],
},
appInfo: { language: "en-US", platform: "Linux x86_64" },
calculationOrder: [],
};
sandbox.createSandbox(data);
await sandbox.dispatchEventInSandbox({
id: refId,
value: "hell",
name: "Keystroke",
willCommit: false,
change: "o",
selStart: 4,
selEnd: 4,
});

where a text is entered (using selStart & selEnd).
Once that stuff is fixed then I think you could work on a second PR to add beforeinput stuff.

@calixteman
Copy link
Contributor

@cincodenada, any progress ?
If you don't have time to work on that stuff, please tell it and I'll fix the issues myself.

@calixteman
Copy link
Contributor

I push this PR: #14429
in order to address issues found by @cincodenada.
@cincodenada I hope you're well and thank you for your help.

@calixteman calixteman closed this Jan 7, 2022
calixteman added a commit to calixteman/pdf.js that referenced this pull request Jan 8, 2022
 - it aims to fix issue mozilla#14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`).
calixteman added a commit to calixteman/pdf.js that referenced this pull request Jan 15, 2022
 - it aims to fix issue mozilla#14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
 - make dispatchEvent in the quickjs sandbox async.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Jan 15, 2022
 - it aims to fix issue mozilla#14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
 - make dispatchEvent in the quickjs sandbox async.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Jan 22, 2022
 - it aims to fix issue mozilla#14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
 - make dispatchEvent in the quickjs sandbox async.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Jan 22, 2022
 - it aims to fix issue mozilla#14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
 - make dispatchEvent in the quickjs sandbox async.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Jan 22, 2022
 - it aims to fix issue mozilla#14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
 - make dispatchEvent in the quickjs sandbox async.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Jan 23, 2022
 - it aims to fix issue mozilla#14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
 - make dispatchEvent in the quickjs sandbox async.
calixteman added a commit to calixteman/pdf.js that referenced this pull request Jan 23, 2022
 - it aims to fix issue mozilla#14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
 - make dispatchEvent in the quickjs sandbox async.
bh213 pushed a commit to bh213/pdf.js that referenced this pull request Jun 3, 2022
 - it aims to fix issue mozilla#14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
 - make dispatchEvent in the quickjs sandbox async.
rousek pushed a commit to signosoft/pdf.js that referenced this pull request Aug 10, 2022
 - it aims to fix issue mozilla#14307;
 - this event has been added recently in Firefox and we can now use it;
 - fix few bugs in aform.js or in annotation_layer.js;
 - add some integration tests to test keystroke events (see `AFSpecial_Keystroke`);
 - make dispatchEvent in the quickjs sandbox async.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants