Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SourceSpan interference #151

Closed
B3zaleel opened this issue Aug 22, 2018 · 8 comments · Fixed by #246
Closed

SourceSpan interference #151

B3zaleel opened this issue Aug 22, 2018 · 8 comments · Fixed by #246
Assignees
Labels

Comments

@B3zaleel
Copy link
Contributor

Currently if you try to select a predefined function like sin (\sin) in the example project's Textbox, it doesn't get selected. It can only get selected when you place at least 8 characters before it (which is equal to the number of characters of the functions' actual leading characters, which in this case is "\mathrm{\sin}". After that you have to select the characters that precede the function. Try this after clearing the textbox's content "TRYTHISFxn\cos^2(\theta) + \sin^2(\theta)=1".
Here the cos and sin seem to be bound together.

@ForNeVeR ForNeVeR self-assigned this Aug 22, 2018
@ForNeVeR ForNeVeR removed their assignment Sep 1, 2019
@rstm-sf
Copy link
Contributor

rstm-sf commented Oct 6, 2019

Is the issue obsolete? In master

@ForNeVeR
Copy link
Owner

ForNeVeR commented Oct 6, 2019

Nope, it isn't. Check this (the selected text in the text box vs. the highlighted formula part):
image

@rstm-sf
Copy link
Contributor

rstm-sf commented Oct 6, 2019

Sorry, I misunderstood.

@rstm-sf
Copy link
Contributor

rstm-sf commented Nov 5, 2019

I have a hunch that this is due to alias parsing (see PredefinedTexFormulas.xml and #214). And I think #215 has a similar behavior.

@rstm-sf
Copy link
Contributor

rstm-sf commented Nov 5, 2019

If so, then another similar issue #132

@ForNeVeR
Copy link
Owner

ForNeVeR commented Nov 9, 2019

Your conclusion looks right to me. Please don't fix this yet, I have students who're dispatched to fix this issue :)

@ForNeVeR
Copy link
Owner

Alright, here're some thoughts.

As usual, this task is much harder that I initially thought. Let me explain.

The root cause of the issue is that our formula parser isn't hygienic enough for the predefined formulae. Consider how we parse any simple thing like \sin:

  1. First, we correctly parse the command (\sin) and find it in the predefined formula list: https://github.com/ForNeVeR/wpf-math/blob/36707619581933b6c9fef213627fdb7beb24236b/src/WpfMath/Data/PredefinedTexFormulas.xml#L288-L295
  2. Then, we start creating the formula according to the instructions in that XML. <CreateFormula> is simple (it just defines and initializes a variable), and MethodInvocation calls an AddOperator method.
  3. The AddOperator method then parses its first argument (namely, \mathrm{sin}) as a completely new formula using our TexFormulaParser.
  4. Which will find a \mathrm style-based command, will parse it and then create an atom. The resulting atom's source is a text string that was passed to a parser: \mathrm{sin}. It has absolutely no relation to the string that was entered by user, it's just a parser leftover.
  5. But, still, this fake atom source will be used in the resulting atom tree we produce. That's the issue.

@dedok1997 tried to fix the issue in #227 by performing some maneuveres around the atom sources in this particular call, AddOperator. But, as you can see, the issue is bigger: it is in parser's hygiene. It just should've never created the wrong sources for atoms in the first place, and then we wouldn't have to fix them up. Even after the fix, these "wrong", fake sources are visible in the atom tree (as serialized by our tests).

Also, post-parse atom fixup from #227 requires us to make the atoms' sources mutable. This is a big issue: earlier, we had problems because some code (AFAIK the same troublesome TexPredefinedFormulaParser) was caching some atoms in static locations, but that particular snippet seems gone now. I don't want to make our atoms mutable again.

There's this interesting comment in the TexFormulaHelper, see #132:

TODO[F]: Review the cases where the formula gets constructed from this.Formula.RootAtom wrapped in something (e.g. SetFixedTypes): in these cases, it looks like we should clean the SourceSpan inside of a formula and set it for the new root atom only?

It looks like the issue is much bigger I even thought when I was writing this comment :)

(But we'll fix #132 here, too.)

Expected parser behavior

First of all, let's state the behavior expected from our parser in cases when it encounters a predefined formula. I believe that, for these cases, it should return an atom that has a Source pointing to a command name (e.g. to \sin or sin in our case, doesn't matter which one). Children of the root atom shouldn't have their Source set at all (i.e. Source = null).

How to achieve that

To me it is now obvious that we should create a special parser mode, hygienic parsing, when it saves no source information at all. Only the hygienic parsers should be used when parsing any of the predefined formula stuff, and the root source for the resulting atom should be forcibly set to the one we want to use.

The situation is slightly complicated by the fact that our parsing itself sometimes may rely on proper Sources saved to atoms. But hopefully we haven't too much of this, and we'll be able to fix it (replace by working with parser's position instead of any parsed atoms' Sources in the parser code).

There's another big issue: how to set the Source of the root atom? I have no good answer yet, but I see two possibilities: we could either wrap the whole atom into a RowAtom with proper Source, or we could define an abstract WithSource method that returns a modified copy of atom with its source reset to a desired value. Notably, RowAtom already have such method.

The new hygienic parsing mode will probably require some architectural solution to make sure we're not leaking any source information: e.g. we could pass a Parser instance to the Atom itself, and it will ask the parser if it's allowed to store the source information, or we could pass a lambda, or something else like that. We'll see.

@ForNeVeR
Copy link
Owner

The Plan

(Thanks to @gsomix for a productive discussion.)

After thinking about that more, I think I've found the solution which is good (i.e. non-duggy), follows the modern practices regarding text editor implementation and won't require us to write much code.

I think that it's okay to include source information from additional text sources (e.g. macro text or predefined formula or whatever): we just need to mark these sources accordingly!

  • The initial issue isn't in our API, it is in the FormulaControl that's trying to highlight boxes based on the Source indices while never checking if the Source is right. We could add the check and it will solve the problem.

  • But I suggest we add a little bit of protection here by marking the sources accordingly: we'll add a SourceKind property into the SourceSpan class, and distinguish various sources (user input, predefined formula, whatever) by this property.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants