Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve user comments in bib file #1471

Merged
merged 38 commits into from
Jul 14, 2016
Merged
Show file tree
Hide file tree
Changes from 34 commits
Commits
Show all changes
38 commits
Select commit Hold shift + click to select a range
d6d6721
Add failing test with preceding comment
lenhard Jun 3, 2016
cb85443
Add failing parser test
lenhard Jun 3, 2016
b50c3d6
Improve test naming
lenhard Jun 3, 2016
2c39c89
Reuse Globals.ENCODING_PREFIX in test
lenhard Jun 3, 2016
b547703
Check explicitly for encoding line and purge it
lenhard Jun 3, 2016
ce56e23
Add changelog entry
lenhard Jun 3, 2016
fe79b41
Write out user comments also for modified entries
lenhard Jun 3, 2016
ec589e8
Add test to check preservation of ENCODING_PREFIX inside an entry
lenhard Jun 3, 2016
c7f4694
Make BibEntryWriter more robust when dealing with faulty parsed seria…
lenhard Jun 3, 2016
3957680
Add test with user comment in file
lenhard Jun 8, 2016
399bcb9
Add test that changes and reformats an entry with a user comment
lenhard Jun 8, 2016
14db4bf
Add test with user comment before String
lenhard Jun 8, 2016
392864e
Preserve newlines also when used with bibtex strings
lenhard Jun 8, 2016
9c23e94
Add test for serialization of epilog
lenhard Jun 8, 2016
21eac95
Fix string detection in test
lenhard Jun 8, 2016
3206b29
In case of change, only remove trailing whitespaces between user comm…
lenhard Jun 8, 2016
a1c98c5
Merge branch 'master' into preserve-comments
lenhard Jun 14, 2016
678c83e
Remove unused variable
lenhard Jun 14, 2016
7a3522a
Remove unused import
lenhard Jun 14, 2016
51d82a2
Reformat changelog entry
lenhard Jul 8, 2016
1dca1a7
Move comment detection logic to BibtexString
lenhard Jul 8, 2016
cee181f
Compute user comment in string only once
lenhard Jul 8, 2016
6496190
Move user comment computation to BibEntry
lenhard Jul 8, 2016
afea9b0
Add test for comment in the same line as BibEntry
lenhard Jul 8, 2016
dc4ae0f
Merge branch 'master' into preserve-comments
lenhard Jul 8, 2016
ccddef0
Remove unnecessary epilog test
lenhard Jul 8, 2016
6d1ac65
Remove redundant test bib file
lenhard Jul 8, 2016
0972b51
Remove redundant asserts
lenhard Jul 8, 2016
b4b288a
Elevate registry actions
stefan-kolb Jul 13, 2016
67a4885
Delete stuff
stefan-kolb Jul 13, 2016
9000f1c
Revert "Elevate registry actions"
stefan-kolb Jul 13, 2016
d970a4c
Revert "Delete stuff"
stefan-kolb Jul 13, 2016
c1db48f
Merge branch 'master' into preserve-comments
lenhard Jul 14, 2016
75d6262
Merge branch 'preserve-comments' of github.com:JabRef/jabref into pre…
lenhard Jul 14, 2016
98cbe98
Use optional in assert
lenhard Jul 14, 2016
72de4ce
Remove duplicate test
lenhard Jul 14, 2016
e5e5c44
Remove unnecessary asserts
lenhard Jul 14, 2016
a8a7b20
Remove unused import of Optional
lenhard Jul 14, 2016
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ We refer to [GitHub issues](https://github.com/JabRef/jabref/issues) by using `#
## [Unreleased]

### Changed
- [#1026](https://github.com/JabRef/jabref/issues/1026) JabRef does no longer delete user comments outside of BibTeX entries and strings

### Fixed

Expand Down
10 changes: 10 additions & 0 deletions src/main/java/net/sf/jabref/exporter/BibDatabaseWriter.java
Original file line number Diff line number Diff line change
Expand Up @@ -317,6 +317,8 @@ private void writeString(Writer fw, BibtexString bs, Map<String, BibtexString> r
return;
}

writeUserCommentsForString(bs, fw);

if(isFirstStringInType) {
fw.write(Globals.NEWLINE);
}
Expand Down Expand Up @@ -354,6 +356,14 @@ private void writeString(Writer fw, BibtexString bs, Map<String, BibtexString> r
fw.write("}" + Globals.NEWLINE);
}

private void writeUserCommentsForString(BibtexString string, Writer out) throws IOException {
String userComments = string.getUserComments();

if(!userComments.isEmpty()) {
out.write(userComments + Globals.NEWLINE);
}
}

/**
* Write all strings in alphabetical order, modified to produce a safe (for
* BibTeX) order of the strings if they reference each other.
Expand Down
33 changes: 14 additions & 19 deletions src/main/java/net/sf/jabref/importer/fileformat/BibtexParser.java
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@
import java.util.Objects;
import java.util.Optional;

import net.sf.jabref.Globals;
import net.sf.jabref.MetaData;
import net.sf.jabref.importer.ParserResult;
import net.sf.jabref.logic.bibtex.FieldContentParser;
Expand Down Expand Up @@ -323,29 +324,23 @@ private String dumpTextReadSoFarToString() {
// if there is no entry found, simply return the content (necessary to parse text remaining after the last entry)
if (indexOfAt == -1) {
return purgeEOFCharacters(result);
} else {

//skip all text except newlines and whitespaces before first @. This is necessary to remove the file header
int runningIndex = indexOfAt - 1;
while (runningIndex >= 0) {
if (!Character.isWhitespace(result.charAt(runningIndex))) {
} else if(result.contains(Globals.ENCODING_PREFIX)) {
// purge the encoding line if it exists
int runningIndex = result.indexOf(Globals.ENCODING_PREFIX);
while(runningIndex < indexOfAt) {
if(result.charAt(runningIndex) == '\n') {
break;
} else if(result.charAt(runningIndex) == '\r') {
if(result.charAt(runningIndex + 1) == '\n') {
runningIndex++;
}
break;
}
runningIndex--;
}

if(runningIndex > -1) {
// We have to ignore some text at the beginning
// so we view the first line break as the end of the previous text and don't store it
if(result.charAt(runningIndex + 1) == '\r') {
runningIndex++;
}
if(result.charAt(runningIndex + 1) == '\n') {
runningIndex++;
}
runningIndex++;
}

return result.substring(runningIndex + 1);
} else {
return result;
}
}

Expand Down
24 changes: 17 additions & 7 deletions src/main/java/net/sf/jabref/logic/bibtex/BibEntryWriter.java
Original file line number Diff line number Diff line change
Expand Up @@ -37,22 +37,32 @@ public void write(BibEntry entry, Writer out, BibDatabaseMode bibDatabaseMode) t
/**
* Writes the given BibEntry using the given writer
*
* @param entry The entry to write
* @param out The writer to use
* @param entry The entry to write
* @param out The writer to use
* @param bibDatabaseMode The database mode (bibtex or biblatex)
* @param reformat Should the entry be in any case, even if no change occurred?
* @param reformat Should the entry be in any case, even if no change occurred?
*/
public void write(BibEntry entry, Writer out, BibDatabaseMode bibDatabaseMode, Boolean reformat) throws IOException {
// if the entry has not been modified, write it as it was
if (!reformat && !entry.hasChanged()) {
out.write(entry.getParsedSerialization());
return;
}

writeUserComments(entry, out);
out.write(Globals.NEWLINE);
writeRequiredFieldsFirstRemainingFieldsSecond(entry, out, bibDatabaseMode);
out.write(Globals.NEWLINE);
}

private void writeUserComments(BibEntry entry, Writer out) throws IOException {
String userComments = entry.getUserComments();

if(!userComments.isEmpty()) {
out.write(userComments + Globals.NEWLINE);
}
}

public void writeWithoutPrependedNewlines(BibEntry entry, Writer out, BibDatabaseMode bibDatabaseMode) throws IOException {
// if the entry has not been modified, write it as it was
if (!entry.hasChanged()) {
Expand All @@ -71,7 +81,7 @@ public void writeWithoutPrependedNewlines(BibEntry entry, Writer out, BibDatabas
* @throws IOException
*/
private void writeRequiredFieldsFirstRemainingFieldsSecond(BibEntry entry, Writer out,
BibDatabaseMode bibDatabaseMode) throws IOException {
BibDatabaseMode bibDatabaseMode) throws IOException {
// Write header with type and bibtex-key.
TypedBibEntry typedEntry = new TypedBibEntry(entry, Optional.empty(), bibDatabaseMode);
out.write('@' + typedEntry.getTypeForDisplay() + '{');
Expand Down Expand Up @@ -127,9 +137,9 @@ private void writeKeyField(BibEntry entry, Writer out) throws IOException {
/**
* Write a single field, if it has any content.
*
* @param entry the entry to write
* @param out the target of the write
* @param name The field name
* @param entry the entry to write
* @param out the target of the write
* @param name The field name
* @throws IOException In case of an IO error
*/
private void writeField(BibEntry entry, Writer out, String name, int indentation) throws IOException {
Expand Down
25 changes: 25 additions & 0 deletions src/main/java/net/sf/jabref/model/entry/BibEntry.java
Original file line number Diff line number Diff line change
Expand Up @@ -629,6 +629,31 @@ public BibEntry withField(String field, String value) {
return this;
}

/*
* Returns user comments (arbitrary text before the entry), if they exist. If not, returns the empty String
*/
public String getUserComments() {

if (parsedSerialization != null) {

try {
// get the text before the entry
String prolog = parsedSerialization.substring(0, parsedSerialization.indexOf('@'));

// delete trailing whitespaces (between entry and text)
prolog = prolog.replaceFirst("\\s+$", "");

// if there is any non whitespace text, write it
if (prolog.length() > 0) {
return prolog;
}
} catch (StringIndexOutOfBoundsException ignore) {
// if this occurs a broken parsed serialization has been set, so just do nothing
}
}
return "";
}

public Set<String> getFieldAsWords(String field) {
String fieldName = toLowerCase(field);
Set<String> storedList = fieldsAsWords.get(fieldName);
Expand Down
24 changes: 24 additions & 0 deletions src/main/java/net/sf/jabref/model/entry/BibtexString.java
Original file line number Diff line number Diff line change
Expand Up @@ -152,4 +152,28 @@ public String getParsedSerialization() {
public boolean hasChanged(){
return hasChanged;
}

/*
* Returns user comments (arbitrary text before the string) if there are any. If not returns the empty string
*/
public String getUserComments() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now an exact copy of BibEntry.getUserComments right? Maybe add a superclass BibItem containing this method and then BibEntry and BibString derive from this superclass (maybe there is even more common code which could be extracted to the super class)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to always slip into the role of your antagonist, but I am against creating a class hierarchy. In most cases, it makes the code harder to understand, and the code duplication savings are just not worth the effort. In addition, one has to adhere to the liskov substitution principle to create a good class hierarchy, but this is hard to do right.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, having an experienced antagonist is probably the best way to learn something 😄

So now ignoring everything about the actual implementation and only speaking about "business objects": There are different items which can be contained in a bib file. For example @Comments, @Strings and normal bib entries. They all have a similar structure (e.g begin with @, followed by some identifier and then braces which surround the actual content) and all of them could have some text comments in front of them. Thus there is also a common behavior when it comes to parsing and writing.
So how would my deer antagonist reflect this common structure and behavior in java code?

My idea would be: BibString, BibEntry, BibComment all derive from BibItem. BibItem has methods to parse and write at least the form @something { abstract method to parse / write value }. Then the writer just gets a list of BibItems and invokes the write method on them. Similarly the parser returns a list of BibItems while the parsing of every item was done in the subclass. But again...this idea somehow corresponds from a service implementation to a full-blown domain model....so yeah, maybe we should discuss this in the next dev call ;-)

if(parsedSerialization != null) {

try {
// get the text before the string
String prolog = parsedSerialization.substring(0, parsedSerialization.indexOf('@'));

// delete trailing whitespaces (between string and text)
prolog = prolog.replaceFirst("\\s+$", "");
// if there is any non whitespace text, write it with proper line separation
if (prolog.length() > 0) {
return prolog;
}
} catch(StringIndexOutOfBoundsException ignore) {
// if this occurs a broken parsed serialization has been set, so just do nothing
}
}

return "";
}
}
77 changes: 77 additions & 0 deletions src/test/java/net/sf/jabref/exporter/BibDatabaseWriterTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -291,6 +291,82 @@ public void roundtrip() throws IOException {
}
}

@Test
public void roundtripWithUserComment() throws IOException {
Path testBibtexFile = Paths.get("src/test/resources/testbib/bibWithUserComments.bib");
Charset encoding = StandardCharsets.UTF_8;
ParserResult result = BibtexParser.parse(ImportFormat.getReader(testBibtexFile, encoding));

SavePreferences preferences = new SavePreferences().withEncoding(encoding).withSaveInOriginalOrder(true);
BibDatabaseContext context = new BibDatabaseContext(result.getDatabase(), result.getMetaData(),
new Defaults(BibDatabaseMode.BIBTEX));

databaseWriter.writePartOfDatabase(stringWriter, context, result.getDatabase().getEntries(), preferences);
try (Scanner scanner = new Scanner(testBibtexFile,encoding.name())) {
assertEquals(scanner.useDelimiter("\\A").next(), stringWriter.toString());
}
}

@Test
public void roundtripWithUserCommentAndEntryChange() throws IOException {
Path testBibtexFile = Paths.get("src/test/resources/testbib/bibWithUserComments.bib");
Charset encoding = StandardCharsets.UTF_8;
ParserResult result = BibtexParser.parse(ImportFormat.getReader(testBibtexFile, encoding));

BibEntry entry = result.getDatabase().getEntryByKey("1137631").get();
entry.setField("author", "Mr. Author");

SavePreferences preferences = new SavePreferences().withEncoding(encoding).withSaveInOriginalOrder(true);
BibDatabaseContext context = new BibDatabaseContext(result.getDatabase(), result.getMetaData(),
new Defaults(BibDatabaseMode.BIBTEX));

databaseWriter.writePartOfDatabase(stringWriter, context, result.getDatabase().getEntries(), preferences);

try (Scanner scanner = new Scanner(Paths.get("src/test/resources/testbib/bibWithUserCommentAndEntryChange.bib"),encoding.name())) {
assertEquals(scanner.useDelimiter("\\A").next(), stringWriter.toString());
}
}

@Test
public void roundtripWithUserCommentBeforeString() throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an exact copy of

public void roundtrip() throws IOException {

Path testBibtexFile = Paths.get("src/test/resources/testbib/complex.bib");
Charset encoding = StandardCharsets.UTF_8;
ParserResult result = BibtexParser.parse(ImportFormat.getReader(testBibtexFile, encoding));

SavePreferences preferences = new SavePreferences().withEncoding(encoding).withSaveInOriginalOrder(true);
BibDatabaseContext context = new BibDatabaseContext(result.getDatabase(), result.getMetaData(),
new Defaults(BibDatabaseMode.BIBTEX));

databaseWriter.writePartOfDatabase(stringWriter, context, result.getDatabase().getEntries(), preferences);
try (Scanner scanner = new Scanner(testBibtexFile,encoding.name())) {
assertEquals(scanner.useDelimiter("\\A").next(), stringWriter.toString());
}
}

@Test
public void roundtripWithUserCommentBeforeStringAndChange() throws IOException {
Path testBibtexFile = Paths.get("src/test/resources/testbib/complex.bib");
Charset encoding = StandardCharsets.UTF_8;
ParserResult result = BibtexParser.parse(ImportFormat.getReader(testBibtexFile, encoding));

BibtexString string = result.getDatabase().getStringValues().iterator().next();
if(string.getContent().isEmpty()) {
// do nothing
} else {
string.setContent("my first string");
}

SavePreferences preferences = new SavePreferences().withEncoding(encoding).withSaveInOriginalOrder(true);
BibDatabaseContext context = new BibDatabaseContext(result.getDatabase(), result.getMetaData(),
new Defaults(BibDatabaseMode.BIBTEX));

databaseWriter.writePartOfDatabase(stringWriter, context, result.getDatabase().getEntries(), preferences);

try (Scanner scanner = new Scanner(testBibtexFile,encoding.name())) {
assertEquals(scanner.useDelimiter("\\A").next(), stringWriter.toString());
}
}

@Test
public void writeSavedSerializationOfEntryIfUnchanged() throws IOException {
BibEntry entry = new BibEntry();
Expand Down Expand Up @@ -543,4 +619,5 @@ public void writeEntriesInOriginalOrderWhenNoSaveOrderConfigIsSetInMetadata() th
+ Globals.NEWLINE
, stringWriter.toString());
}

}
Loading