-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for POST + other improvements of the server #657
Conversation
1. There is now proper support for GET and POST queries. The POST queries now also allow both variants specified in the SPARQL 1.1 standard: with a URL-encoded query string ("query=...") and with content type "application/sparql-query" and just the unencoded SPARQL query in the body of the POST request. 2. Improved the URL parsing helper functions along the way. Eventually, this should be done with a propery library, but until we are there, let's improve what we have. 3. Used the opportunity to add a command-line argument for specifying the access token to be used for restricted API calls like clear-cache-complete. This was simple enough and completes work from an earlier PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This already looks great, I only have a few smaller comments.
We need a Mocking framework for the Server code to write unit tests for it, but that's a story for another day:)
src/ServerMain.cpp
Outdated
@@ -62,6 +63,9 @@ int main(int argc, char** argv) { | |||
"The basename of the index files (required)."); | |||
add("port,p", po::value<int>(&port)->required(), | |||
"The port on which HTTP requests are served (required)."); | |||
add("access-token,a", | |||
po::value<std::string>(&accessToken)->default_value("CHANGE_ME"), | |||
"Access token for restricted API calls."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really ok to have the "restricted" API calls via the default "CHANGE_ME" token, when somebody forgets to set the token?
src/engine/Server.cpp
Outdated
auto filenameAndParams = [&request]() { | ||
if (request.method() == http::verb::get) { | ||
// For a GET request, `request.target()` yields the part after the domain, | ||
// which is a concatenation of the path and the query string (the query |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me as a human the usage of the term query
(or query string
) for
the GET parameters with ?
and &
connections is a little bit confusing, because we use query
typically as a synonym for SPARQL query`
src/engine/Server.cpp
Outdated
// auto filenameAndParams = | ||
// ad_utility::UrlParser::parseTarget(request.target()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be deleted.
src/engine/Server.cpp
Outdated
LOG(DEBUG) << "Request method: \"" << request.method() << "\"" | ||
<< ", target: \"" << request.target() << "\"" | ||
<< ", body: \"" << request.body() << "\"" << std::endl; | ||
auto filenameAndParams = [&request]() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lambda (logic + error handling + Comments) has become so long that it can become a separate (static) function of the Server
class,
As it also takes only one parameter (the request
).
1. If something goes wrong during query processing as far the URL parameters is concerned, the error message in the log and the error message returned to the client are now in sync. 2. Better logging also at other places, in particular, when processing the commands. 3. If the server is started without --access-token, there is now no access to the restricted API calls at all (no matter which value is provided in the URL parameters for access-token). 4. When an access token is provided and it doesn't match or the server was started without --access-toknen, the whole request is now ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine, as soon as the catch-co_await thing is fixed, we can merge it.
1. Now avoid co_await in catch block because it's against the C++ standard. 2. Also log the query processing time because it was simple enough and somewhat connected (request logging was improved at various places)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
One line which I still think can be removed, but otherwise this looks good to go.
Does this now yield the expected responses when sending the wrong Content-type?
src/engine/Server.cpp
Outdated
std::optional<std::string> exceptionErrorMsg; | ||
std::optional<http::response<http::string_body>> badRequestResponse; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the badRequestResponse
variable unused? If so, remove it!
src/engine/Server.cpp
Outdated
co_return co_await send(std::move(response)); | ||
}; | ||
// Process the request using the `process` method and if it throws an | ||
// exception, log the error message and send a HTTP/1.1 404 Bad Request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think 400 is Bad Request and 404 is not found.
Note: The log was quite inconsistent regarding what it logs and what it sends to the server. Made that more consistent now. Also logging some more relevant information to the INFO log, like the total time for processing a query or the basic parameters of a request (method, content type, accept hearders). I tested it for a number of queries and it worked alright.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my standpoint, this may now be merged!
1. When access to a URL parameter is denied or when there are URL parameters that were not processed, now abort the query processing with a 400 error. In particular, this has the consequence that a file is only served if there are no URL parameters (which makes sense). 2. In processQuery, now do the query planning *after* determining the media type. Before it could happen that the query planning took some time and then the processing aborted because of an invalid media type, which didnt't make sense. Log the time required for the query planning. 3. Support an explicit "ping" action with an optional message. This is useful for the qlever script and beyond. For example, the ping from the qlever script is now logged in a way that we can see that it came from the qlever script. 4. Make sparql-results+json the default media type, as required by the standard. This is possible now because the latest version of the QLever UI explicitly asks for qlever-results+json for every SPARQL query it sends. 5. ALso log the size of the result when it has been computed. This useful information was missing from the INFO log so far. 6. Minor reformatting of a few strangely formatted comments that I encountered on the way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please merge it, but don't add additional features.
In the future please split different issues into different PRs, I identify at least:
- Implementation of POST
- Implementation of access tokens
- Improved logging (in various places unrelated to the above)
- more consistent error handling in the Server.
Thanks, but I am not totally convinced, let's talk about it later. The size of this PR is average for our PRs. I know that it started out as only adding POST. Then it turned out to become a more general improve "process" and "processQuery" because the issues I eventually dealt with (which you mention in your list) are relatively tightly connected. The --access-token was just a small leftover from a previous PR. |
Currently, just a TODO, but this should be feasible with relatively few lines of code.