Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicit conflict marker detection #629

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Explicit conflict marker detection #629

wants to merge 2 commits into from

Conversation

koppor
Copy link
Member

@koppor koppor commented Sep 26, 2022

Fixes JabRef#9167

WIP, because the parsing architecture is a bit complicated here.

We cannot "just" read the whole file, because it could be very slow when reading large data bases.

koppor and others added 2 commits September 26, 2022 20:39
Co-authored-by: Christoph <[email protected]>
Co-authored-by: Carl Christian Snethlage <[email protected]>
Co-authored-by: Houssem Nasri <[email protected]>
Co-authored-by: Christoph <[email protected]>
Co-authored-by: Carl Christian Snethlage <[email protected]>
Co-authored-by: Houssem Nasri <[email protected]>
Co-authored-by: Benedikt Tutzer <[email protected]>
@koppor
Copy link
Member Author

koppor commented Oct 11, 2022

Two implementation ideas:

  1. Dive into the grammar and add "exception" paths for conflict markers
  2. Do a pre-reading of the file and check for conflict markers
  3. Spawn a parallel thread checking for conflict markers. If the normal reading thread returns, with an error, check what the checking thread said.

Think, we could go for 2 even though file loading might be slower?! -- https://www.amitph.com/java-read-write-large-files-efficiently/


ParserResult expected = ParserResult.fromErrorMessage("Found git conflict markers");

assertEquals(expected, parserResult);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the test passing by checking for git markers inside the read method after consuming a newline character.
I also called checkForGitMarkers at the beginning of the file to ensure it's called on the first line.

private int read() throws IOException {
        int character = pushbackReader.read();

        if (!isEOFCharacter(character)) {
            pureTextFromFile.offerLast((char) character);
        }
        if (character == '\n') {
            line++;
            checkForGitConflictMarker();
        }
        return character;
    }

This is the logic. It looks for a line that starts with the 'ours' marker, which is represented by the symbol <<<<<<<. Then it continues to skip lines until it reaches the 'theirs' marker >>>>>>>.

    private void checkForGitConflictMarker() throws IOException {
        skipSpace();
        int markerCount = 0;
        // Looking for the 'ours' marker
        char c;
        while ((c = (char) peek()) == '<' && !isEOFCharacter(c)) {
            read();
            markerCount++;
        }
        if (markerCount == 7) {
            parserResult.addWarning("Found git conflict markers at line %d".formatted(line));
            // Skip 'ours' marker <<<<<<<
            skipLine();
            // Keep skipping lines until we hit the beginning of 'theirs' marker >>>>>>>
            while (peek() != '>' && !isEOFCharacter(peek())) {
                skipLine();
            }
            // Skip 'theirs' marker if we haven't hit EOF already
            if (!isEOFCharacter(peek())) {
                skipLine();
            }
        }
    }

    private void skipLine() throws IOException {
        while (peek() != '\n' && !isEOFCharacter(peek())) {
            read();
        }
        skipOneNewline();
    }

I had to modify the test slightly to pass because the logic I used would continue parsing after the marker, resulting in a parser result with two entries when the expected parser result is zero. I changed it to check whether the warning list contains the git conflict warning.

       assertTrue(parserResult.warnings().contains("Found git conflict markers at line 3"));

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the commit? 🎉👀

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't make a commit. I got the inspiration of the solution while working on another PR, so I just made the changes on that PR's branch. However, you can use the code above inside BibTeXParser, and don't forget to set co-authored with me 😁.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants