convert templated queries to named queries and separate concerns by introducing named query middleware #2412
Comments
Potential test platform https://scholia.portal.mardi4nfdi.de/ |
see also #2063 and ad-freiburg/qlever#859 |
I am trying to understand this. Do you propose the use of |
@fnielsen the intention is information hiding: callers should not need to see what the actual query looks like. Take the specific query for your personal QID Q20980928: on QLever it would be e.g. https://qlever.cs.uni-freiburg.de/wikidata/084HGc, which does not run out of the box. The query on Wikidata does give 71 results, but the URL shortening fails, so I can't give a short link here, and I purposely don't intend to show the details of the query: you'd just be interested in the result. Our pyLodStorage library already allows commands such as:
which will pick up the query specification from a YAML file containing the author_events_fan named query spec (see result below).
author_events_fan result
|
We have been hard at work on our Graph Split experiment [1], and we
What is the WDQS Graph Split experiment?
We want to address the growing size of the Wikidata graph by splitting
See our previous update for more details [2].
Who should care?
Anyone who uses WDQS through the UI or programmatically should check
What are those test endpoints?
We expose 3 test endpoints, for the full, main and scholarly articles
The endpoints are:
Each of the endpoints is backed by a single dedicated server of
What kind of feedback is useful?
We expect queries that don’t require scholarly articles to work
We want to hear about:
General use cases or classes of queries which break under federation
Examples of queries and pointers to code will be helpful in your feedback.
Where should feedback be sent?
You can reach out to us using the project’s talk page [1], the
Will feedback be taken into account?
Yes! We will review feedback and it will influence our path forward.
Have fun!
[1] https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split
|
There is now a Wikimedia Hackathon 2024 project task for this https://phabricator.wikimedia.org/T363894 |
Check out http://snapquery.bitplan.com/query/scholia/author_list-of-publications http://snapquery.bitplan.com has the demo and project is at https://github.com/WolfgangFahl/snapquery with further links to the Hackathon results - thanks to Tim and Dennis for making this happen! |
I get |
@fnielsen there is another server at https://snapquery.wikidata.dbis.rwth-aachen.de/query/scholia/author_list-of-publications which might work. A socket connection is created which might not work behind firewalls or on internet connections with high latency. |
Version 0.0.8 of snapquery is ready. It has e.g. params_stats
query
SELECT count(*),
params
FROM "QueryDetails"
GROUP BY params
ORDER BY 1 desc
result
|
Some queries have more complex query parameters, like here:
SELECT ?venue (COUNT(DISTINCT ?work) AS ?number_of_works) (COUNT(?citing_work) AS ?number_of_citations)
WHERE {
VALUES ?venue { {% for q in qs %} wd:{{ q }} {% endfor %} }
OPTIONAL {
?work wdt:P1433 ?venue .
OPTIONAL { ?citing_work wdt:P2860 ?work }
}
}
GROUP BY ?venue |
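For a list-valued parameter such as qs in the template above, a named query layer would have to expand the list into the VALUES clause itself. A minimal sketch of that step in plain Python, assuming the parameter is a list of Wikidata QIDs; the helper name values_clause and the QID check are illustrative assumptions and not taken from Scholia or snapquery:

```python
import re

# Wikidata item identifiers: "Q" followed by a positive integer.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def values_clause(variable: str, qids: list[str]) -> str:
    """Build a SPARQL VALUES clause, e.g. 'VALUES ?venue { wd:Q1 wd:Q2 }'.

    Hypothetical helper: it does by hand what the Jinja loop
    '{% for q in qs %} wd:{{ q }} {% endfor %}' does in the template.
    """
    if not qids:
        raise ValueError("at least one QID is required")
    for qid in qids:
        # Reject anything that is not a plain QID so that parameter
        # values cannot inject arbitrary SPARQL into the query text.
        if not QID_PATTERN.fullmatch(qid):
            raise ValueError(f"not a valid QID: {qid!r}")
    items = " ".join(f"wd:{qid}" for qid in qids)
    return f"VALUES ?{variable} {{ {items} }}"
```

Validating each value before it reaches the query text matters once parameters arrive through a public service rather than from trusted code.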
Is your feature request related to a problem? Please describe.
Blazegraph is getting close to the 4TB limit. The Wikimedia Foundation is testing a graph split in Q1/2024.
This will eventually, and likely, force the use of:
Also, there is the already limiting timeout of 1 min of the official WDQS.
Describe the solution you'd like
Describe alternatives you've considered
Get your own copy of Wikidata and use it; see the CEUR-WS Vol-3262 paper "Getting and hosting your own copy of Wikidata".
Additional context
Search Platform Office Hours 2023-12-06
Named Query handling:
Queries may be referenced these days with e.g. short URLs, which are supported by both the Wikidata Query Service and QLever. Personally I think it would be good to go one step further and have "named queries". See e.g. https://cr.bitplan.com/index.php/List_of_Queries as an example for queries. Scholia also uses a similar idea internally. See https://github.com/WDscholia/scholia/tree/master/scholia/app/templates. Quite a few of these queries have only a few parameters. E.g. https://github.com/WDscholia/scholia/blob/master/scholia/app/templates/author_topics.sparql only takes a single q identifier as input.
In my own pyLodStorage project https://pypi.org/project/pyLodStorage/ I am already offering named queries, but without parameters. WolfgangFahl/pyLoDStorage#113 is the issue to parameterize the queries. The queries are described in YAML files in this solution. I imagine a RESTful service that takes a query name and a set of parameters and returns the result in a SPARQL server compatible way. This would mean that the details of the query (e.g. whether it is federated or on which endpoint it runs) are hidden. I believe that this approach would work well with the intended Wikidata split attempt in Q1/2024.
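The lookup step of such a service could be sketched as follows. This is only an illustration under stated assumptions: the NamedQuery class, the REGISTRY dictionary, the query name author_works and its query text are all hypothetical and are not the pyLodStorage or Scholia implementation. The point is that the caller passes only a name and parameters, while the endpoint and query text stay inside the middleware:

```python
import re
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class NamedQuery:
    """One named query; all fields are hidden from the caller."""
    name: str
    endpoint: str      # where the query runs (could change after a graph split)
    sparql: Template   # query text with $placeholders
    params: tuple      # names of the required parameters


# Hypothetical registry; a real one might be loaded from YAML files.
REGISTRY = {
    "author_works": NamedQuery(
        name="author_works",
        endpoint="https://query.wikidata.org/sparql",
        # Illustrative query text only: works whose author (P50) is $q.
        sparql=Template("SELECT ?work WHERE { ?work wdt:P50 wd:$q }"),
        params=("q",),
    ),
}


def resolve(name: str, params: dict) -> tuple:
    """Return (endpoint, sparql) for a named query and its parameters."""
    query = REGISTRY.get(name)
    if query is None:
        raise KeyError(f"unknown named query: {name}")
    if set(params) != set(query.params):
        raise ValueError(f"expected parameters {query.params}, got {tuple(params)}")
    for key, value in params.items():
        # Assumption: every parameter is a Wikidata QID.
        if not re.fullmatch(r"Q[1-9][0-9]*", value):
            raise ValueError(f"parameter {key} is not a valid QID: {value!r}")
    return query.endpoint, query.sparql.substitute(params)
```

A REST handler would then call resolve("author_works", {"q": "Q20980928"}), send the resulting query to the returned endpoint, and pass the SPARQL result through unchanged. Moving a query to another endpoint or making it federated would only change the registry entry, not the callers.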
Links:
Previous analysis of blazegraph alternatives:
Qlever federation
Scaling Wikidata Query Service - Split the Graph experiment