
Implement a resolver with a second backend for collections #57

Open
PonteIneptique opened this issue Jun 1, 2017 · 4 comments

@PonteIneptique
Member

Currently, the only resolver we have uses a backend that reads directly from XML or from cache.

This new resolver should:

  • Inherit from the wonderful current resolver, to allow switching from one to the other and improving both side by side.
  • Provide a connection to a backend for the RDFLib store, most probably through an ORM that allows multiple solutions, such as SQLAlchemy, or a more graph-oriented database (less common for devs...) like Mongo (I could not find better examples for now).
    1. Retrieving metadata about a text is already fully dependent on RDFLib, but...
    2. Graph traversal of the collection will need to be rewritten to work through RDFLib, i.e. rewrite getitem(), .parent, .ancestors, .descendants to access this information via RDFLib. This mostly means the new resolver should have its own Collection/Textgroup/etc. system. But again, the modification would be light...
  • Implement caching of answers for this metadata (because the cache would then really be used as a cache).
  • (Optional) Potentially think about reusing the same backend to store the list of references for each text. That could speed up some other parts of the code (?)
@sonofmun
Contributor

sonofmun commented Jul 21, 2017

Some Notes:

  • The new resolver (the Extended Resolver) would manage the store that already exists as a global in MyCapytain (it stores all the metadata of all files). It should definitely reuse an RDFLib Store Adapter, most likely the SQLAlchemy one, because it provides another layer of adaptation ( https://github.com/RDFLib/rdflib-sqlalchemy )
  • MyCapytain will be responsible for setting up the Graph, and Nautilus for adapting the store for the Extended Resolver: Nautilus would also need to provide subclasses of MyCapytain's collections to deal with the tree navigation that currently happens through dictionaries and lists (.descendants, .children, .readableDescendants, etc.)
  • The collection metadata will be removed from the cache in favor of the RDFLib store
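The division of labor in the last two bullets — the store is authoritative, the cache holds only disposable answers — can be sketched as follows. This is a minimal sketch under assumptions: the class name and a plain dict standing in for the rdflib-sqlalchemy store are hypothetical, not Nautilus API:

```python
# Hypothetical sketch: the cache sits in front of the metadata store and is
# used "really as a cache" -- dropping it never loses data, because the store
# (an RDFLib/SQLAlchemy store in practice, a dict here) keeps the truth.

class CachedMetadataResolver:
    def __init__(self, store):
        self.store = store   # authoritative source of collection metadata
        self._cache = {}     # disposable answer cache

    def metadata(self, identifier):
        if identifier not in self._cache:
            # Cache miss: the answer is rebuilt from the store,
            # not by reparsing the XML corpus.
            self._cache[identifier] = self.store[identifier]
        return self._cache[identifier]

    def invalidate(self, identifier=None):
        """Drop one answer or the whole cache; the store keeps the truth."""
        if identifier is None:
            self._cache.clear()
        else:
            self._cache.pop(identifier, None)
```

With this shape, cache eviction costs one store lookup per miss instead of a full reparse.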

@sonofmun
Contributor

More information on the current process:

The setup:

  1. The resolver, with the resources that need parsing, is declared in https://github.com/OpenGreekAndLatin/leipzig_cts/blob/master/modules/capitains/templates/app.py.erb#L75-L79
  2. The inventory is then built with this information in https://github.com/OpenGreekAndLatin/leipzig_cts/blob/master/modules/capitains/templates/update_capitains_repos.rb.erb#L68-L71 : every time our corpora change, we rebuild part of the cache by running a parse to put the inventory in cache
  3. parse is called by the manager, which goes through every XML file (text or metadata) to build the information needed: https://github.com/Capitains/Nautilus/blob/master/capitains_nautilus/cts/resolver.py#L161-L258
  4. The app just calls the resolver in most of its queries; it is basically the core of the app.
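The four setup steps above can be sketched like this. All names and the fake parse are hypothetical stand-ins for the Nautilus manager and CTS parsing, shown only to make the flow concrete:

```python
# Hypothetical sketch of the setup flow: the manager's parse step walks every
# XML file once and the resulting inventory is stored in the cache, which the
# resolver then serves from on every app query.

def parse_inventory(xml_files):
    """Stand-in for the manager's parse: builds one inventory entry per file."""
    inventory = {}
    for path in xml_files:
        # In Nautilus this reads CTS text/metadata files; here we fake it.
        inventory[path] = {"parsed": True}
    return inventory


class Resolver:
    def __init__(self, xml_files, cache):
        self.xml_files = xml_files
        self.cache = cache

    def setup(self):
        # Steps 2-3: rebuild the cached inventory whenever the corpora change.
        self.cache["inventory"] = parse_inventory(self.xml_files)

    def query(self, path):
        # Step 4: the app's queries all go through the resolver.
        return self.cache["inventory"][path]
```

The expensive work (the full parse) happens once at setup; queries only read the cached inventory.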

Workflow when running:

  1. Any time we need to access metadata (name of a text, citation scheme, the text itself), we hit the inventory.
  2. The inventory is defined here: https://github.com/Capitains/Nautilus/blob/master/capitains_nautilus/cts/resolver.py#L87-L91
  3. If it has been dropped from cache, the resolver reparses the whole thing. This is most likely the cause of the 502 errors, because the reparse can take a really long time within a normal request (i.e., it should not happen).
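The failure mode in step 3 can be illustrated with a small sketch (hypothetical names; a counter stands in for the expensive full reparse):

```python
# Hypothetical sketch of the runtime workflow: every metadata access hits the
# cached inventory, and a cache miss triggers a full reparse inside the
# request -- the slow path suspected of causing the 502s.

class InventoryCache:
    def __init__(self, parse_all):
        self.parse_all = parse_all   # expensive: walks every XML file
        self._inventory = None
        self.reparse_count = 0       # instrumentation for the example

    def drop(self):
        """Simulates the cache evicting the inventory."""
        self._inventory = None

    def get(self):
        if self._inventory is None:
            # Step 3: the whole corpus is reparsed inside the request, which
            # can take long enough for the proxy to answer with a 502.
            self.reparse_count += 1
            self._inventory = self.parse_all()
        return self._inventory
```

Backing the inventory with a persistent store (as proposed in this issue) would turn that cache miss into a cheap lookup instead of a full reparse.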

@sonofmun
Contributor

sonofmun commented Aug 7, 2017

All these new functions should be unit tested.

@PonteIneptique
Member Author

Partially implemented in #68
