Problems for preview of an entry with % sign in title #6753

Closed
1 task done
crystalfp opened this issue Aug 10, 2020 · 4 comments · Fixed by #6760
Labels
bug Confirmed bugs or reports that are very likely to be bugs entry-preview

Comments

@crystalfp

JabRef version 5.1-PullRequest6152.694--2020-06-28--314bc7d
Windows 10 10.0 amd64
Java 14.0.1

In the preview of an entry whose title contains % signs, the % signs disappear.

Steps to reproduce the behavior:

  1. Load the following entry in a biblatex library:
@WWW{Thalheimer2006,
  author  = {Will Thalheimer},
  date    = {2006-05-01},
  title   = {{People remember 10\%, 20\%…Oh Really?}},
  url     = {https://www.worklearning.com/2006/05/01/people_remember/},
  urldate = {2020-05-23},
  langid  = {english},
}
  2. Select the entry and look at the preview (the standard one):
WWW (Thalheimer2006)
Thalheimer, W.
People remember 10 20Oh Really? 
2006

Notice that the % signs and the ellipsis (…) are missing.
  3. Now switch the preview style to IEEE. The title is rendered correctly:

[1]
W. Thalheimer, “People remember 10%, 20%…Oh Really?,” 2020-05-23. https://www.worklearning.com/2006/05/01/people_remember/.
@Siedlerchr added the entry-preview and bug labels on Aug 11, 2020
@Vince250598
Contributor

Vince250598 commented Aug 12, 2020

The problem is in the format method of the class HTMLChars. The '%' case falls into the last else branch of the method, where there is a TODO. I made two small modifications so that it works for this case, but I didn't do extensive testing.

@Override
    public String format(String inField) {
        int i;
        String field = inField.replaceAll("&|\\\\&", "&amp;") // Replace & and \& with &amp;
                              .replaceAll("[\\n]{2,}", "<p>") // Replace double line breaks with <p>
                              .replace("\n", "<br>") // Replace single line breaks with <br>
                              .replace("\\$", "&dollar;") // Replace \$ with &dollar;
                              .replaceAll("\\$([^$]*)\\$", "\\{$1\\}"); // Replace $...$ with {...} to simplify conversion

        StringBuilder sb = new StringBuilder();
        StringBuilder currentCommand = null;

        char c;
        boolean escaped = false;
        boolean incommand = false;

        for (i = 0; i < field.length(); i++) {
            c = field.charAt(i);
            if (escaped && (c == '\\')) {
                sb.append('\\');
                escaped = false;
            } else if (c == '\\') {
                if (incommand) {
                    /* Close Command */
                    String command = currentCommand.toString();
                    String result = HTML_CHARS.get(command);
                    sb.append(Objects.requireNonNullElse(result, command));
                }
                escaped = true;
                incommand = true;
                currentCommand = new StringBuilder();
            } else if (!incommand && ((c == '{') || (c == '}'))) {
                // Swallow the brace.
            } else if (Character.isLetter(c) // Change 1: the former "|| (c == '%')" check was removed here
                    || StringUtil.SPECIAL_COMMAND_CHARS.contains(String.valueOf(c))) {
                escaped = false;

                if (!incommand) {
                    sb.append(c);
                } else {
                    currentCommand.append(c);
                    testCharCom:
                    if ((currentCommand.length() == 1)
                            && StringUtil.SPECIAL_COMMAND_CHARS.contains(currentCommand.toString())) {
                        // This indicates that we are in a command of the type
                        // \^o or \~{n}
                        if (i >= (field.length() - 1)) {
                            break testCharCom;
                        }

                        String command = currentCommand.toString();
                        i++;
                        c = field.charAt(i);
                        String commandBody;
                        if (c == '{') {
                            String part = StringUtil.getPart(field, i, false);
                            i += part.length();
                            commandBody = part;
                        } else {
                            commandBody = field.substring(i, i + 1);
                        }
                        String result = HTML_CHARS.get(command + commandBody);

                        sb.append(Objects.requireNonNullElse(result, commandBody));

                        incommand = false;
                        escaped = false;
                    } else {
                        // Are we already at the end of the string?
                        if ((i + 1) == field.length()) {
                            String command = currentCommand.toString();
                            String result = HTML_CHARS.get(command);
                            /* If found, then use translated version. If not,
                             * then keep
                             * the text of the parameter intact.
                             */
                            sb.append(Objects.requireNonNullElse(result, command));
                        }
                    }
                }
            } else {
                if (!incommand) {
                    sb.append(c);
                } else if (Character.isWhitespace(c) || (c == '{') || (c == '}')) {
                    String command = currentCommand.toString();

                    // Test if we are dealing with a formatting
                    // command.
                    // If so, handle.
                    String tag = getHTMLTag(command);
                    if (!tag.isEmpty()) {
                        String part = StringUtil.getPart(field, i, true);
                        i += part.length();
                        sb.append('<').append(tag).append('>').append(part).append("</").append(tag).append('>');
                    } else if (c == '{') {
                        String argument = StringUtil.getPart(field, i, true);
                        i += argument.length();
                        // handle common case of general latex command
                        String result = HTML_CHARS.get(command + argument);
                        // If found, then use translated version. If not, then keep
                        // the text of the parameter intact.

                        if (result == null) {
                            if (argument.isEmpty()) {
                                // Maybe a separator, such as in \LaTeX{}, so use command
                                sb.append(command);
                            } else {
                                // Otherwise, use argument
                                sb.append(argument);
                            }
                        } else {
                            sb.append(result);
                        }
                    } else if (c == '}') {
                        // This end brace terminates a command. This can be the case in
                        // constructs like {\aa}. The correct behaviour should be to
                        // substitute the evaluated command and swallow the brace:
                        String result = HTML_CHARS.get(command);
                        // If the command is unknown, just print it:
                        sb.append(Objects.requireNonNullElse(result, command));
                    } else {
                        String result = HTML_CHARS.get(command);
                        sb.append(Objects.requireNonNullElse(result, command));
                        sb.append(' ');
                    }
                } else { // '\%' ends up in this branch
                    sb.append(c); // Change 2: this line was added
                    /*
                     * TODO: this point is reached, apparently, if a command is
                     * terminated in a strange way, such as with "$\omega$".
                     * Also, the command "\&" causes us to get here. The former
                     * issue is maybe a little difficult to address, since it
                     * involves the LaTeX math mode. We don't have a complete
                     * LaTeX parser, so maybe it's better to ignore these
                     * commands?
                     */
                }

                incommand = false;
                escaped = false;
            }
        }

        return sb.toString().replace("~", "&nbsp;"); // Replace any remaining ~ with &nbsp; (non-breaking spaces)
    }
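The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not JabRef's actual code: the class `PercentEscapeSketch` and its `fixed` flag are hypothetical, and the scanner keeps only the branches that matter for `\%` (no brace handling, no `HTML_CHARS` lookup). With `fixed == false`, `%` is collected as part of the command name, and the command is then discarded together with the character that closes it. With `fixed == true`, `%` itself closes the (empty) command and is appended to the output.

```java
class PercentEscapeSketch {

    /**
     * Reduced version of the scanning loop. A backslash opens a command,
     * letters extend its name, and any other character closes it.
     *
     * @param fixed false mimics the reported behaviour, true mimics the two changes
     */
    static String format(String field, boolean fixed) {
        StringBuilder out = new StringBuilder();
        StringBuilder command = null; // non-null while a command is open

        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == '\\') {
                if (command != null) {
                    out.append(command); // flush an unfinished command name
                }
                command = new StringBuilder();
            } else if (command == null) {
                out.append(c); // ordinary text outside a command
            } else if (Character.isLetter(c) || (!fixed && (c == '%'))) {
                command.append(c); // old behaviour: '%' counts as part of the name
            } else if (Character.isWhitespace(c)) {
                out.append(command).append(' '); // unknown command: keep its name
                command = null;
            } else {
                // The command was closed by some other character.
                // In both modes the collected name is dropped here.
                if (fixed) {
                    out.append(c); // the added line: keep the closing character
                }
                command = null;
            }
        }
        if (command != null) {
            out.append(command); // command still open at the end of the input
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Title from the report; the ellipsis is written as a Unicode escape.
        String title = "People remember 10\\%, 20\\%\u2026Oh Really?";
        System.out.println(format(title, false)); // People remember 10 20Oh Really?
        System.out.println(format(title, true));  // People remember 10%, 20%…Oh Really?
    }
}
```

The first output line matches the broken preview from the report: each `\%` disappears together with the character that follows it (the comma and the ellipsis).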

@Siedlerchr
Member

@Vince250598 Thanks for looking into it. It would be really cool if you could create a PR with your changes. That helps us to better see the differences. Regarding tests, we already have automated tests (HTMLCharsTest) for this class, so your changes can easily be tested.
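As an illustration of what regression cases for this bug might look like, here is a small table-driven check in plain Java. It is only a sketch under assumptions: `FormatRegressionSketch`, `failures`, and the stand-in formatter are hypothetical and do not reflect how HTMLCharsTest is written. In a real test the formatter passed in would be the `format` method of HTMLChars.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

class FormatRegressionSketch {

    /** Runs every input through the formatter and collects mismatches. */
    static List<String> failures(UnaryOperator<String> formatter, Map<String, String> cases) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, String> testCase : cases.entrySet()) {
            String actual = formatter.apply(testCase.getKey());
            if (!testCase.getValue().equals(actual)) {
                failures.add(testCase.getKey() + ": expected \"" + testCase.getValue()
                        + "\" but got \"" + actual + "\"");
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Input -> expected preview text; the first case is the title from this issue.
        Map<String, String> cases = new LinkedHashMap<>();
        cases.put("People remember 10\\%, 20\\%\u2026Oh Really?",
                  "People remember 10%, 20%\u2026Oh Really?");
        cases.put("100\\%", "100%");

        // Stand-in formatter (assumption): only unescapes \% so the sketch runs on its own.
        UnaryOperator<String> standIn = s -> s.replace("\\%", "%");

        System.out.println(failures(standIn, cases)); // an empty list means all cases passed
    }
}
```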

@Siedlerchr
Member

Thanks to @Vince250598 this issue should now be fixed in the latest master (currently building, ready in approx 30 min)
http://builds.jabref.org/master/

@crystalfp
Author

The latest master fixed it. Thanks!

@koppor moved this to Done in Prioritization on Nov 10, 2022