MathJax fails on entities #2650

Closed
fast-reflexes opened this issue Mar 14, 2021 · 11 comments
Labels
Accepted Issue has been reproduced by MathJax team Code Example Contains an illustrative code example, solution, or work-around Fixed Test Needed v3 v3.1

@fast-reflexes

fast-reflexes commented Mar 14, 2021

Issue Summary

When MathJax encounters a non-existing entity, it tries to download a file that does not exist and then fails. I don't expect MathJax to be able to handle non-existing entities, but the error is still odd (should produce MathProcessing error or something instead). Is this behaviour expected and desired?

Steps to Reproduce:

Check code pen https://codepen.io/fast-reflexes/pen/LYbabaY and check the console (the actual browser console) for an error message when MathJax tries to download non-existing file https://cdnjs.cloudflare.com/ajax/libs/mathjax/3.1.2/es5/util/entities/f.js when presented with non-existing entity &fpp;.


Technical details:

  • MathJax Version: 3.1.2
  • Client OS: Mac OS Catalina 10.15.7
  • Browser: Chrome Version 88.0.4324.192 (Official Build) (x86_64)
@dpvc
Member

dpvc commented Mar 16, 2021

MathJax doesn't load all the entity names initially for two reasons: first, there are a lot of them, and most pages won't be using them, and second, for in-browser use, the browser will usually translate the entities before MathJax runs, so there is usually no reason for it to need them. In the case of an unknown entity, however, you are right, MathJax will try to load one of its files and fail. This was set up for use in server-side node applications, and doesn't work in-browser.

There are two possible solutions:

First, there is an option that controls whether to try to load the extra entity names, and you can turn that off:

MathJax = {
  startup: {
    ready() {
      MathJax._.util.Entities.options.loadMissingEntities = false;
      MathJax.startup.defaultReady();
    }
  }
};

(Note that you don't need the loader section in your codepen, as loading tex-mml-chtml.js already includes the files you specified.)

The second would be to load all the entity names, so MathJax won't try to load any individual files. For this, use

MathJax = {
  loader: { load: ["input/mml/entities"] }
};

That will mean all the entity definitions are loaded, and MathJax won't need to load any other files.
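For anyone who wants to switch between the two work-arounds, here is a minimal sketch of a helper that builds either configuration. The helper name `buildMathJaxConfig` and its `preloadAllEntities` flag are hypothetical (not part of MathJax); only the `loadMissingEntities` option and the `input/mml/entities` component come from the snippets above.

```javascript
// Hypothetical helper (not a MathJax API): returns one of the two
// work-around configurations described above.
function buildMathJaxConfig({ preloadAllEntities = false } = {}) {
  if (preloadAllEntities) {
    // Work-around 2: load every entity definition up front.
    return { loader: { load: ['input/mml/entities'] } };
  }
  // Work-around 1: never try to fetch per-letter entity files.
  return {
    startup: {
      ready() {
        MathJax._.util.Entities.options.loadMissingEntities = false;
        MathJax.startup.defaultReady();
      }
    }
  };
}

// Intended use, assuming this runs before the MathJax script is loaded:
// window.MathJax = buildMathJaxConfig({ preloadAllEntities: true });
```

The second form costs an extra download of the full entity table, while the first leaves unknown entities unresolved, so which one fits depends on the page.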

As it happens, I had been looking at this myself the day before your issue was posted, and was already planning to change this for in-browser use. It is good to have an issue tracker to remind me to make sure that happens.

@dpvc dpvc added this to the 3.1.3 milestone Mar 16, 2021
@dpvc dpvc added Accepted Issue has been reproduced by MathJax team v3 labels Mar 16, 2021
@dpvc dpvc self-assigned this Mar 16, 2021
@fast-reflexes
Author

fast-reflexes commented Mar 17, 2021

Thank you for your answer!

(Note that you don't need the loader section in your codepen, as loading tex-mml-chtml.js already includes the files you specified.)

I have noticed this as well and I think that this should be in the documentation. Now it says in the docs on MathML that either a mml: {} configuration block should be present or that the load array should contain input/mathml.

  • Regarding trying to load the entities: I totally understand your answer, but it doesn't matter that the entity in question does not exist, since MathJax tries to download a file covering all entities beginning with a certain letter (I assume), like .../entities/f.js, and I don't think that this should fail. You already said that you plan to look into this; I just want to say that I think either this behaviour should be disabled in the non-server version OR the file in question should exist on the CDN. I also don't think it's a good solution to have to either disable or enable the import, even though it's good to know that this can be done :)

  • Just wanted to add that I got this behaviour in the browser when using React. React doesn't translate all entities into their characters; the ones it doesn't handle are emitted as plain text instead (I have filed a report about this with React as well, but apparently it has something to do with XSS vulnerabilities with some entities). As a result, the browser does NOT translate the entity (because React has marked it as text), so it remains untranslated. MathJax then tries to load the corresponding entity file and fails, despite the entity being valid, and the error message is very confusing to someone who does not know all of this background. Two examples where this happens are the entities ≈ and ≥, but there are many others.
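One possible way to sidestep the React behaviour described above is to replace named entities with their Unicode characters before the MathML string ever reaches React. The following is only a sketch: `replaceNamedEntities` and `ENTITY_MAP` are hypothetical names, and the two entries in the map (`approx` and `GreaterEqual`) are assumed stand-ins for the entities mentioned, not a complete table.

```javascript
// Hypothetical, deliberately tiny entity table (an assumption for this
// sketch); a real project would need a fuller list.
const ENTITY_MAP = {
  approx: '\u2248',       // ≈
  GreaterEqual: '\u2265'  // ≥
};

// Replace known named entities with their characters. Anything not in the
// map (including XML escapes such as &lt; and &amp;) is left untouched.
function replaceNamedEntities(mathml, map = ENTITY_MAP) {
  return mathml.replace(/&([A-Za-z][A-Za-z0-9]*);/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(map, name) ? map[name] : match
  );
}

const cleaned = replaceNamedEntities('<mo>&approx;</mo>');
```

Because the string then contains only literal characters, neither React nor MathJax has any entity left to interpret.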

Thanks a lot for your input and best regards!

@fast-reflexes
Author

... and oh, I forgot in all my eagerness to talk about potential bugs, thanks for a TRULY GREAT framework :)

@dpvc
Member

dpvc commented Mar 17, 2021

Thanks for your kind words, and your additional information about React. I agree that you should not get the message and that that needs to be fixed. That is what I meant when I said I was going to change that in the next release. I gave you the other alternatives as a work-around until then.

@fast-reflexes
Author

fast-reflexes commented Mar 18, 2021

What bugs me a bit is that in the setting with React that I'm talking about ... if the MathML is given as a string to the mathml2chtml function with the same entity (≈ for example), then it works. But perhaps this is because this way of doing it bypasses React and renders the MathML with the browser before typesetting, thus effectively removing the problem caused by React (and the browser translates it to a symbol before typesetting).

You can see the behaviour here https://codesandbox.io/s/mathjax-react-entities-bug-egeef?file=/src/App.js Try changing the entity in App.js to ≈ and check the browser console (not Sandbox console) and spot the error. Then change back and hit the button and you see that it works perfectly despite having ≈ in mj.js.

In the same example we can see that React handles ≥ in the same way (it translates it to text) but here MathJax can sort it on its own without trying to fetch an entity file... changing it to ≥ does not work however ...

I will stop dwelling on this now :D

@dpvc
Member

dpvc commented Mar 18, 2021

Then change back and hit the button and you see that it works perfectly despite having ≈ in mj.js.

I'm not able to get this to work as you say, but perhaps this is what's happening: MathJax only tries to load the file once, and if it fails, the next entity starting with that letter will be considered undefined. So that may be what is causing the behavior that you see.

In the same example we can see that React handles ≥ in the same way (it translates it to text) but here MathJax can sort it on its own without trying to fetch an entity file

MathJax contains a number of the entity definitions without having to download a file. I believe these are the ones from the MathML entities list if I remember correctly. Note that ≥ is among those. So it can process some, but not all, entities without loading a file.
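To make the two behaviours above concrete (a built-in table that needs no download, and a per-letter file that is only attempted once), here is a simplified model. It is an illustration only and not MathJax's actual source: `createEntityResolver`, the synchronous `loadFile` callback, the lower-casing of the first letter, and the sample table entry are all assumptions of the sketch.

```javascript
// Illustrative model only; this is not how MathJax is implemented.
// builtin:  object mapping entity names to characters (no load needed)
// loadFile: function(letter) returning more definitions, or throwing
function createEntityResolver(builtin, loadFile) {
  const known = new Map(Object.entries(builtin));
  const attempted = new Set(); // letters whose file was already requested

  return function resolve(name) {
    if (known.has(name)) return known.get(name);
    const letter = name.charAt(0).toLowerCase();
    if (!attempted.has(letter)) {
      attempted.add(letter); // remember the attempt even if it fails
      try {
        for (const [key, value] of Object.entries(loadFile(letter))) {
          known.set(key, value);
        }
      } catch (err) {
        // A failed load is not retried; the name stays undefined.
      }
    }
    return known.get(name);
  };
}

// Usage: a loader that always fails, as with the missing CDN files.
const resolve = createEntityResolver(
  { GreaterEqual: '\u2265' }, // hypothetical built-in entry
  (letter) => { throw new Error(`entities/${letter}.js not found`); }
);
resolve('GreaterEqual'); // found in the built-in table, no load attempted
resolve('approx');       // one failed load for "a", result is undefined
resolve('asymp');        // no second attempt for "a", result is undefined
```

In this model only the first unknown entity for a given letter triggers a request, which matches the observation that the error appears once and later entities with the same initial are simply treated as undefined.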

@fast-reflexes
Author

fast-reflexes commented Mar 18, 2021

I'm not able to get this to work as you say, but perhaps this is what's happening: MathJax only tries to load the file once, and if it fails, the next entity starting with that letter will be considered undefined. So that may be what is causing the behavior that you see.

Sorry was a bit unclear.

  • Navigate to the sandbox
  • Change ≈ to ≈ in App.js on row 14.
  • Reload the right window and verify in the browser console (NOT the sandbox console) that MathJax tried, and failed, to load a.js.
  • Change back in App.js on row 14 to ≈
  • Reload the right window anew and verify that the loading error of a.js is gone in the cleared console and that the approx sign is successfully rendered.
  • NOW, press the button Typeset and witness that MathJax is able to typeset with ≈ (as specified in mj.js) perfectly as long as it is done with mathml2chtml (it does not try to download a non-existing file any longer).

@dpvc
Member

dpvc commented Mar 18, 2021

OK, thanks for the updated instructions. I am able to reproduce it now.

I think it has something to do either with React or with the way codesandbox.io works. It turns out that in some cases, the entity ≈ is being converted by the browser before MathJax sees it and sometimes not. You can check this yourself by adding

  startup: {
    ready() {
      const Entities = window.MathJax._.util.Entities;
      const translate = Entities.translate;
      Entities.translate = (text) => {
        console.log('translate: ', text);
        return translate(text);
      }
      window.MathJax.startup.defaultReady();
    }
  }

into the window.MathJax = {...} in the mj.js file. This will show you all the strings that MathJax is checking for entities.

On the initial run (unedited App.js), I get

translate:  – "10"
translate:  – "/"
translate:  – "3"
translate:  – "≈"
translate:  – "3.33"
translate:  – "10"
translate:  – "/"
translate:  – "3"
translate:  – "≥"
translate:  – "3.33"

Note that the initial ≈ on the fourth line has already been turned into U+2248 (whereas the ≥ has not). Then, after changing to ≈ as in your second bullet point, I get

translate:  – "10" 
translate:  – "/"
translate:  – "3"
translate:  – "≈"

followed by the error message about the file not being found.

Changing back to ≈ and loading the right window, I get the same as the initial output above. Then clicking the typeset button, I get

translate:  – "10"
translate:  – "/"
translate:  – "3"
translate:  – "≈"
translate:  – "3.33"

where the ≈ entity has already been replaced by the browser before MathJax sees the string, so it is never called on to make the substitution itself (as it is for the &GreaterThan; entity).

So this is not about MathJax handling ≈ without loading the file, since it never sees the ≈ entity in this case. It is already replaced by the time MathJax gets it, so that is being done, probably by the browser, sometime earlier, either by how React handled the inserted HTML, or how codesandbox.io does. I'm not sure which.
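When reading logs like the ones above, it can be hard to tell by eye whether a string still holds an entity reference or an already-substituted character. The following sketch of a small diagnostic could be called from inside the logging wrapper; `inspectForEntities` is a hypothetical helper name, not something MathJax provides.

```javascript
// Hypothetical diagnostic helper: reports named entity references that are
// still present in a string, plus the code points of any non-ASCII
// characters (which usually means the substitution already happened).
function inspectForEntities(text) {
  const entityNames = Array.from(
    text.matchAll(/&([A-Za-z][A-Za-z0-9]*);/g),
    (m) => m[1]
  );
  const nonAscii = Array.from(text)
    .filter((ch) => ch.codePointAt(0) > 0x7f)
    .map((ch) =>
      'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
    );
  return { entityNames, nonAscii };
}

// Example: a string with one unresolved entity and one literal character.
const report = inspectForEntities('&approx; vs \u2248');
// report.entityNames is ['approx'] and report.nonAscii is ['U+2248']
```

An empty `entityNames` list together with a code point in `nonAscii` indicates that the replacement happened before the string reached the function being logged.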

Incidentally, if I load the sandbox fresh, select mj.js for editing, and paste in the configuration code above, the auto-run occurs and I see the initial translate messages, but with &GreaterThan; also substituted initially:

translate:  – "10"
translate:  – "/"
translate:  – "3"
translate:  – "≈"
translate:  – "3.33"
translate:  – "10"
translate:  – "/"
translate:  – "3"
translate:  – "≥"
translate:  – "3.33"

Then if I reload the right-hand frame using the reload icon, I get &GreaterThan; explicitly like my first listing above. So I suspect it has more to do with the sandbox than with React itself. In any case, the cause is not within MathJax.

@dpvc
Member

dpvc commented Mar 18, 2021

Note: the typeset button properly renders the ≈ initially (on a fresh load of the sandbox), as well as after all your bullet points. In all cases, the ≈ in the call to mathml2chtml is being replaced when MathJax converts the string to DOM elements (via window.DOMParser and its parseFromString() method). It is the "Typeset after React has rendered" copy of ≈ that is not being replaced by the browser for some reason. The one from the typeset button is always replaced.

@fast-reflexes
Author

Thanks for taking your time to answer!

Your observations are in line with mine. However, I don't think we should blame the sandbox, as I have been able to reproduce the exact same results locally without sandboxes. Nonetheless, I think it is as you say... the BROWSER itself knows how to interpret the entity, so if it's up to the browser, the entity is correctly translated before MathJax processing. When we use the mathml2chtml function, MathJax interacts with the browser directly and so it works, but if React processes the entity first, then it decides to turn it into a string. The browser then accepts it as a string, which forces MathJax to do the download since it sees an entity. I have filed an issue with React regarding this but, as I said, they purposely don't translate all entities due to XSS vulnerabilities, which are currently not something that I understand.

The reason for my interest is that I'm writing a library for MathJax use in React, so this is very valuable to me. But I agree with you that the only problem with MathJax is that it tries to download non-existing files; other than that, MathJax is clean... and we have already discussed that, so I'm happy with this outcome.

All the best to you my friend!

@dpvc
Member

dpvc commented Mar 18, 2021

Thanks for the kind words. It was an interesting problem to look into. Good luck with your project!

dpvc added a commit to mathjax/MathJax-src that referenced this issue Mar 19, 2021
…ilter for entities to load the full entity component when MathML input is used. (mathjax/MathJax#2650)
@dpvc dpvc removed their assignment Mar 19, 2021
@dpvc dpvc added the Code Example Contains an illustrative code example, solution, or work-around label Apr 1, 2021
dpvc added a commit to mathjax/MathJax-src that referenced this issue Apr 8, 2021
Make path resolving more flexible, and don't fail when loading entity files. (mathjax/MathJax#2650)
@dpvc dpvc added Merged Merged into develop branch and removed Ready for Review labels Apr 8, 2021