Skip to content

Commit

Permalink
Sanitize URLs in file fields to handle invalid pipe characters ('|') (#…
Browse files Browse the repository at this point in the history
…12156)

* Sanitize URLs in file fields to handle invalid pipe characters ('|') Closes #11876.

- Introduced URLUtil.createUri() and URLUtil.create() to handle URL sanitization.
- Replaced direct calls to URI.create() and URI.create().toURL() with the new utility methods.
- URLs containing the pipe character ('|') are now properly encoded as '%7C' to prevent parsing errors.
- Added test cases to URLUtilTest to verify correct sanitization and URL creation.

* Sanitize URLs in file fields to handle invalid pipe characters ('|') Closes #11876.

- Introduced URLUtil.createUri() and URLUtil.create() to handle URL sanitization.
- Replaced direct calls to URI.create() and URI.create().toURL() with the new utility methods.
- URLs containing the pipe character ('|') are now properly encoded as '%7C' to prevent parsing errors.
- Added test cases to URLUtilTest to verify correct sanitization and URL creation.
- Added @archtest to ensure that the URI.create() method is not directly called in the codebase.

* Discard changes to src/main/java/module-info.java

* Sanitize URLs in file fields to handle invalid pipe characters ('|') Closes #11876.

- Introduced URLUtil.createUri() and URLUtil.create() to handle URL sanitization.
- Replaced direct calls to URI.create() and URI.create().toURL() with the new utility methods.
- URLs containing the pipe character ('|') are now properly encoded as '%7C' to prevent parsing errors.
- Added test cases to URLUtilTest to verify correct sanitization and URL creation.
- Added @archtest to ensure that the URI.create() method is not directly called in the codebase.

* Sanitize URLs in file fields to handle invalid pipe characters ('|') Closes #11876.

- Introduced URLUtil.createUri() and URLUtil.create() to handle URL sanitization.
- Replaced direct calls to URI.create() and URI.create().toURL() with the new utility methods.
- URLs containing the pipe character ('|') are now properly encoded as '%7C' to prevent parsing errors.
- Added test cases to URLUtilTest to verify correct sanitization and URL creation.
- Added @archtest to ensure that the URI.create() method is not directly called in the codebase.

* Refine comment

* Fix checkstyle

* Fix import

* Update Openrewrite

* Move URLUtil to logic (where possible)

* Fix architecture test

* Fix FQN

* Compilefix

* Add Arch exception

* Fix imports

* Add CHANGELOG.md entry

---------

Co-authored-by: Oliver Kopp <[email protected]>
  • Loading branch information
hitalo-siriano and koppor authored Nov 25, 2024
1 parent b453985 commit 861a6ce
Show file tree
Hide file tree
Showing 61 changed files with 328 additions and 246 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -115,6 +115,7 @@ Note that this project **does not** adhere to [Semantic Versioning](https://semv
- We fixed an issue where the "Do not ask again" checkbox was not working, when asking for permission to use Grobid [koppor#556](https://github.com/koppor/jabref/issues/566).
- We fixed an issue where we display warning message for moving attached open files. [#10121](https://github.com/JabRef/jabref/issues/10121)
- We fixed an issue where it was not possible to select selecting content of other user's comments.[#11106](https://github.com/JabRef/jabref/issues/11106)
- We fixed an issue when handling URLs containing a pipe (`|`) character. [#11876](https://github.com/JabRef/jabref/issues/11876)
- We fixed an issue where web search preferences "Custom API key" table modifications not discarded. [#11925](https://github.com/JabRef/jabref/issues/11925)
- We fixed an issue when opening attached files in [extra file columns](https://docs.jabref.org/finding-sorting-and-cleaning-entries/filelinks#adding-additional-columns-to-entry-table-for-file-types). [#12005](https://github.com/JabRef/jabref/issues/12005)
- We fixed an issue where trying to open a library from a failed mounted directory on Mac would cause an error. [#10548](https://github.com/JabRef/jabref/issues/10548)
Expand Down
4 changes: 2 additions & 2 deletions build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ plugins {

id 'idea'

id 'org.openrewrite.rewrite' version '6.27.0'
id 'org.openrewrite.rewrite' version '6.27.2'

id "org.itsallcode.openfasttrace" version "3.0.1"
}
Expand Down Expand Up @@ -400,7 +400,7 @@ dependencies {
xjc group: 'org.glassfish.jaxb', name: 'jaxb-xjc', version: '3.0.2'
xjc group: 'org.glassfish.jaxb', name: 'jaxb-runtime', version: '3.0.2'

rewrite(platform("org.openrewrite.recipe:rewrite-recipe-bom:2.19.0"))
rewrite(platform("org.openrewrite.recipe:rewrite-recipe-bom:2.22.0"))
rewrite("org.openrewrite.recipe:rewrite-static-analysis")
rewrite("org.openrewrite.recipe:rewrite-logging-frameworks")
rewrite("org.openrewrite.recipe:rewrite-testing-frameworks")
Expand Down
Original file line number Diff line number Diff line change
@@ -1,14 +1,14 @@
package org.jabref.gui.entryeditor.citationrelationtab.semanticscholar;

import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.List;

import org.jabref.logic.importer.FetcherException;
import org.jabref.logic.importer.ImporterPreferences;
import org.jabref.logic.importer.fetcher.CustomizableKeyFetcher;
import org.jabref.logic.net.URLDownload;
import org.jabref.logic.util.URLUtil;
import org.jabref.model.entry.BibEntry;

import com.google.gson.Gson;
Expand Down Expand Up @@ -38,7 +38,7 @@ public List<BibEntry> searchCitedBy(BibEntry entry) throws FetcherException {

URL citationsUrl;
try {
citationsUrl = URI.create(getAPIUrl("citations", entry)).toURL();
citationsUrl = URLUtil.create(getAPIUrl("citations", entry));
} catch (MalformedURLException e) {
throw new FetcherException("Malformed URL", e);
}
Expand All @@ -62,7 +62,7 @@ public List<BibEntry> searchCiting(BibEntry entry) throws FetcherException {

URL referencesUrl;
try {
referencesUrl = URI.create(getAPIUrl("references", entry)).toURL();
referencesUrl = URLUtil.create(getAPIUrl("references", entry));
} catch (MalformedURLException e) {
throw new FetcherException("Malformed URL", e);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,6 @@

import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.nio.file.Path;
import java.util.ArrayList;
Expand Down Expand Up @@ -38,6 +37,7 @@
import org.jabref.logic.l10n.Localization;
import org.jabref.logic.util.BackgroundTask;
import org.jabref.logic.util.TaskExecutor;
import org.jabref.logic.util.URLUtil;
import org.jabref.logic.util.io.FileUtil;
import org.jabref.model.database.BibDatabaseContext;
import org.jabref.model.entry.BibEntry;
Expand Down Expand Up @@ -186,7 +186,7 @@ private List<LinkedFileViewModel> findAssociatedNotLinkedFiles(BibEntry entry) {

public boolean downloadFile(String urlText) {
try {
URL url = URI.create(urlText).toURL();
URL url = URLUtil.create(urlText);
addFromURLAndDownload(url);
return true;
} catch (MalformedURLException exception) {
Expand Down
79 changes: 6 additions & 73 deletions src/main/java/org/jabref/gui/fieldeditors/URLUtil.java
Original file line number Diff line number Diff line change
@@ -1,89 +1,22 @@
package org.jabref.gui.fieldeditors;

import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Objects;
import java.util.Optional;

import org.jabref.gui.externalfiletype.ExternalFileTypes;
import org.jabref.gui.frame.ExternalApplicationsPreferences;

/**
* URL utilities for URLs in the JabRef GUI.
* <p>
* For logic-oriented URL utilities see {@link org.jabref.logic.util.URLUtil}.
*/
public class URLUtil {
private static final String URL_EXP = "^(https?|ftp)://.+";

// Detect Google search URL
private static final String GOOGLE_SEARCH_EXP = "^https?://(?:www\\.)?google\\.[\\.a-z]+?/url.*";

private URLUtil() {
}

/**
* Cleans URLs returned by Google search.
*
* <example>
* If you copy links from search results from Google, all links will be enriched with search meta data, e.g.
* https://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&&url=http%3A%2F%2Fwww.inrg.csie.ntu.edu.tw%2Falgorithm2014%2Fhomework%2FWagner-74.pdf&ei=DifeVYHkDYWqU5W0j6gD&usg=AFQjCNFl638rl5KVta1jIMWLyb4CPSZidg&sig2=0hSSMw9XZXL3HJWwEcJtOg
* </example>
*
* @param url the Google search URL string
* @return the cleaned Google URL or @code{url} if no search URL was detected
*/
public static String cleanGoogleSearchURL(String url) {
Objects.requireNonNull(url);

if (!url.matches(GOOGLE_SEARCH_EXP)) {
return url;
}
// Extract destination URL
try {
URL searchURL = URI.create(url).toURL();
// URL parameters
String query = searchURL.getQuery();
// no parameters
if (query == null) {
return url;
}
// extract url parameter
String[] pairs = query.split("&");

for (String pair : pairs) {
// "clean" url is decoded value of "url" parameter
if (pair.startsWith("url=")) {
String value = pair.substring(pair.indexOf('=') + 1);

String decode = URLDecoder.decode(value, StandardCharsets.UTF_8);
// url?
if (decode.matches(URL_EXP)) {
return decode;
}
}
}
return url;
} catch (MalformedURLException e) {
return url;
}
}

/**
* Checks whether the given String is a URL.
* <p>
* Currently only checks for a protocol String.
*
* @param url the String to check for a URL
* @return true if <c>url</c> contains a valid URL
*/
public static boolean isURL(String url) {
try {
URI.create(url).toURL();
return true;
} catch (MalformedURLException | IllegalArgumentException e) {
return false;
}
}

/**
* Look for the last '.' in the link, and return the following characters.
* <p>
Expand All @@ -96,7 +29,7 @@ public static Optional<String> getSuffix(final String link, ExternalApplications
String strippedLink = link;
try {
// Try to strip the query string, if any, to get the correct suffix:
URL url = URI.create(link).toURL();
URL url = org.jabref.logic.util.URLUtil.create(link);
if ((url.getQuery() != null) && (url.getQuery().length() < (link.length() - 1))) {
strippedLink = link.substring(0, link.length() - url.getQuery().length() - 1);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
import org.jabref.gui.preferences.GuiPreferences;
import org.jabref.logic.integrity.FieldCheckers;
import org.jabref.logic.l10n.Localization;
import org.jabref.logic.util.URLUtil;
import org.jabref.model.entry.field.Field;
import org.jabref.model.strings.StringUtil;

Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
package org.jabref.gui.linkedfile;

import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.Optional;

Expand All @@ -14,6 +13,7 @@
import org.jabref.gui.preferences.GuiPreferences;
import org.jabref.logic.l10n.Localization;
import org.jabref.logic.util.TaskExecutor;
import org.jabref.logic.util.URLUtil;
import org.jabref.model.database.BibDatabaseContext;
import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.LinkedFile;
Expand Down Expand Up @@ -61,7 +61,7 @@ public void execute() {
}

try {
URL url = URI.create(urlforDownload.get()).toURL();
URL url = URLUtil.create(urlforDownload.get());
LinkedFileViewModel onlineFile = new LinkedFileViewModel(
new LinkedFile(url, ""),
entry,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,6 @@

import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
Expand All @@ -25,6 +24,7 @@
import org.jabref.gui.util.FileDialogConfiguration;
import org.jabref.logic.FilePreferences;
import org.jabref.logic.l10n.Localization;
import org.jabref.logic.util.URLUtil;
import org.jabref.logic.util.io.FileNameCleaner;
import org.jabref.logic.util.io.FileUtil;
import org.jabref.model.database.BibDatabaseContext;
Expand Down Expand Up @@ -166,7 +166,7 @@ public LinkedFile getNewLinkedFile() {

if (LinkedFile.isOnlineLink(link.getValue())) {
try {
return new LinkedFile(description.getValue(), URI.create(link.getValue()).toURL(), fileType, sourceUrl.getValue());
return new LinkedFile(description.getValue(), URLUtil.create(link.getValue()), fileType, sourceUrl.getValue());
} catch (MalformedURLException e) {
return new LinkedFile(description.getValue(), link.getValue(), fileType, sourceUrl.getValue());
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -20,10 +20,10 @@
import javafx.scene.paint.Color;

import org.jabref.gui.actions.ActionFactory;
import org.jabref.gui.fieldeditors.URLUtil;
import org.jabref.gui.icon.IconTheme;
import org.jabref.gui.preferences.GuiPreferences;
import org.jabref.logic.l10n.Localization;
import org.jabref.logic.util.URLUtil;
import org.jabref.model.entry.identifier.DOI;
import org.jabref.model.strings.StringUtil;

Expand Down
5 changes: 3 additions & 2 deletions src/main/java/org/jabref/gui/theme/StyleSheetFile.java
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,6 @@

import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
Expand All @@ -11,6 +10,8 @@
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;

import org.jabref.logic.util.URLUtil;

import com.google.common.base.Strings;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
Expand Down Expand Up @@ -54,7 +55,7 @@ final class StyleSheetFile extends StyleSheet {

StyleSheetFile(URL url) {
this.url = url;
this.path = Path.of(URI.create(url.toExternalForm()));
this.path = Path.of(URLUtil.createUri(url.toExternalForm()));
reload();
}

Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
package org.jabref.logic.ai.chatting.model;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
Expand All @@ -9,6 +8,7 @@
import java.util.stream.Collectors;

import org.jabref.logic.ai.AiPreferences;
import org.jabref.logic.util.URLUtil;

import com.google.gson.Gson;
import dev.langchain4j.data.message.AiMessage;
Expand Down Expand Up @@ -62,7 +62,7 @@ public Response<AiMessage> generate(List<ChatMessage> list) {
String baseUrl = aiPreferences.getSelectedApiBaseUrl();
String fullUrl = baseUrl.endsWith("/") ? baseUrl + "chat/completions" : baseUrl + "/chat/completions";
HttpRequest httpRequest = HttpRequest.newBuilder()
.uri(URI.create(fullUrl))
.uri(URLUtil.createUri(fullUrl))
.header("Content-Type", "application/json")
.POST(HttpRequest.BodyPublishers.ofString(requestBody))
.timeout(Duration.ofSeconds(60))
Expand Down
4 changes: 2 additions & 2 deletions src/main/java/org/jabref/logic/importer/fetcher/ACS.java
Original file line number Diff line number Diff line change
@@ -1,12 +1,12 @@
package org.jabref.logic.importer.fetcher;

import java.io.IOException;
import java.net.URI;
import java.net.URL;
import java.util.Objects;
import java.util.Optional;

import org.jabref.logic.importer.FulltextFetcher;
import org.jabref.logic.util.URLUtil;
import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.field.StandardField;
import org.jabref.model.entry.identifier.DOI;
Expand Down Expand Up @@ -45,7 +45,7 @@ public Optional<URL> findFullText(BibEntry entry) throws IOException {

if (link != null) {
LOGGER.info("Fulltext PDF found @ ACS.");
return Optional.of(URI.create(source.replaceFirst("/abs/", "/pdf/")).toURL());
return Optional.of(URLUtil.create(source.replaceFirst("/abs/", "/pdf/")));
}
return Optional.empty();
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,13 @@

import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.util.Objects;
import java.util.Optional;

import org.jabref.logic.importer.FulltextFetcher;
import org.jabref.logic.util.URLUtil;
import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.field.StandardField;
import org.jabref.model.entry.identifier.DOI;
Expand Down Expand Up @@ -48,7 +48,7 @@ public Optional<URL> findFullText(BibEntry entry) throws IOException {
if (code == 200) {
LOGGER.info("Fulltext PDF found @ APS.");
try {
return Optional.of(URI.create(pdfRequestUrl).toURL());
return Optional.of(URLUtil.create(pdfRequestUrl));
} catch (MalformedURLException e) {
LOGGER.warn("APS returned malformed URL, cannot find PDF.");
}
Expand Down Expand Up @@ -77,7 +77,7 @@ private Optional<String> getId(String doi) {

URLConnection con;
try {
con = URI.create(doiRequest).toURL().openConnection();
con = URLUtil.create(doiRequest).openConnection();
con.connect();
con.getInputStream();
String[] urlParts = con.getURL().toString().split("abstract/");
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
Expand Down Expand Up @@ -34,6 +33,7 @@
import org.jabref.logic.importer.PagedSearchBasedFetcher;
import org.jabref.logic.importer.fetcher.transformers.ArXivQueryTransformer;
import org.jabref.logic.integrity.BracesCorrector;
import org.jabref.logic.util.URLUtil;
import org.jabref.logic.util.io.XMLUtil;
import org.jabref.logic.util.strings.StringSimilarity;
import org.jabref.model.entry.BibEntry;
Expand Down Expand Up @@ -710,7 +710,7 @@ public ArXivEntry(Node item) {
if (linkTitle.equals(Optional.of("pdf"))) {
pdfUrlParsed = XMLUtil.getAttributeContent(linkNode, "href").map(url -> {
try {
return URI.create(url).toURL();
return URLUtil.create(url);
} catch (MalformedURLException e) {
return null;
}
Expand Down
Loading

0 comments on commit 861a6ce

Please sign in to comment.