Use case: Equivalent of "namespacing" Fedora to accommodate multisites #396
Comments
@rosiel cool! This really needs to be discussed. My first guess would be: we could decide what to sync and where to sync it based on a given, arbitrary, configurable predicate. But this needs to be explored further, mostly because if the sync is happening from Fedora to Drupal 8 (for example, the resource did not originate in Islandora), then that sync utility would need a reverse map for this, something like: or we could define that URL as an RDF property? There are many, many ways to define the same thing. Thanks a lot!! |
I think the easiest way to deal with this would be structure. From the repository root, you'd need to have separate containers for each multisite. Then you could re-index per container. That pattern could be applied to multitenancy with appropriate authz. |
@dannylamb you mean LDP based? I can see some scalability issues with that, specifically if we use the default PID minter, which is handy for avoiding an unbalanced tree. It also makes filtering in a triple store kinda complex (like "show me all objects that are descendants of X": what if that "descendant of" is 5 steps away, with different predicates?). Good talk for the next CLAW call! |
Have I mentioned how much I hate that semantics and storage are jumbled up in Fedora? |
"That pattern could be applied to multitenancy with appropriate authz." -> this sounds cool but is way over my head. What should I read to fill in my blanks? |
@rosiel I think this sounds more complex than it is. In my mind @dannylamb is proposing a Fedora 4 repo structure of
Then you can set authorization based on the root-level elements, i.e. Bob is admin of site1 and Jane is admin of site2, but neither can access the other's repository contents. But @DiegoPino is right that this might have issues with unbalanced trees. Perhaps we should pull @ruebot in and have him do one of his performance and scaling massive ingests to see how it works if you create 3-4 root-level objects and ingest a ratio of objects into each. Like
and see how ingest and response times go? This test would run directly on Fedora and so could avoid any issues of PHP/Drupal in its timing. |
There is no (performance) problem at all with an unbalanced tree, at least from the Fedora side. The problem is having too many children of a single node/resource. |
That's what the pair-tree PID minter protects against. |
@ajs6f when you say "too many children of a single node", do you mean just having a tonne of children under a single node, or do the children have to be direct children of the single node?
versus
|
Yes, as @DiegoPino says, it's too many immediate/direct children that are a problem. |
In fact, if you can guarantee by other means (particularly by controlling your own id minting) that you won't stick too many children under a single parent, then you shouldn't use the hierarchy builder minter. You should just use PUT and stick things wherever it makes sense. |
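As a rough illustration of what a pair-tree style minter buys you, here is a minimal Python sketch. It is not Fedora's actual minter code; the function names, the four-level depth, and the two-character segment width are assumptions chosen for illustration. The idea is simply that the leading characters of a minted identifier become intermediate path segments, so no single parent accumulates a huge number of direct children.

```python
import uuid


def pairtree_path(identifier: str, levels: int = 4, width: int = 2) -> str:
    """Build 'aa/bb/cc/dd/<identifier>' from the identifier's leading characters.

    Hypothetical helper: depth and segment width are illustrative assumptions.
    """
    compact = identifier.replace("-", "")
    if len(compact) < levels * width:
        raise ValueError("identifier too short for the requested pair-tree depth")
    # Slice the first levels*width characters into fixed-width segments.
    segments = [compact[i * width:(i + 1) * width] for i in range(levels)]
    return "/".join(segments + [identifier])


def mint_path() -> str:
    """Mint a random UUID and place it in the pair-tree (hypothetical helper)."""
    return pairtree_path(str(uuid.uuid4()))
```

If you control your own identifiers and already know that no parent will receive too many direct children, the intermediate segments add nothing, which is the case for skipping the minter and using PUT at a meaningful path.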
So I have two concerns here:
|
|
It sounds like you are suggesting that the hierarchical structure that is built into Fedora 4 Objects would be *sometimes* meaningful and *sometimes* arbitrary. Does this sound like a solid plan? (I am not being sarcastic; I actually don't know). Would it be better to include an extra, hereditary predicate and let pid-minters populate the hierarchy for optimal storage/retrieval? |
No, what I am telling you is that the hierarchical structure that is built into Fedora 4 Objects is now sometimes meaningful and sometimes arbitrary if you use the hierarchical ID minter. I'm suggesting you decide whether you can avoid that. I don't know what the phrase "hereditary predicate" means. |
On the Islandora Metadata Interest Group, a discussion was started on OAI-PMH support. In addition to some wanted features, the idea of namespaces came up. Our use case is different from that of @rosiel, and I wanted to add it here.
|
@uconnjeustis can you create a separate issue for this if this is a separate use case? Also, I think it would be a really good idea to talk this out on a future CLAW call, so please do not hesitate in adding it to the agenda, and attending the meeting. |
Not a CLAW-specific issue, either. Might be worth bringing up on a Fedora call-- some documentation of best practices would be good. |
My use case, as it's slightly different though related to this issue, is now in Islandora-CLAW/CLAW-478. Please direct responses there. Thanks. I just came back from vacation and think I missed the last CLAW meeting. I'll check the schedule and try to hop on the next one. |
Should the current migration sprint account for how to make Fedora 3.x PID namespaces migrate over losslessly? Just askin'. Related issue: #822. |
I would think mapping PID namespaces to LDP containers would be best. |
I think organizing objects by stuffing them in a container per namespace would separate them out nicely if you really want to solidify the distinction. FWIW, so long as we stuff the PID in a field somewhere, we can then query on it to do things like "Get me all objects that were in namespace X". |
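To make the two ideas above concrete (one container per namespace, and querying on a stored PID), here is a minimal Python sketch. It assumes the usual Fedora 3.x `namespace:id` PID form; the function names and the `<root>/<namespace>/<id>` container layout are hypothetical, not anything the migration tooling is known to implement.

```python
def pid_namespace(pid: str) -> str:
    """Return the namespace part of a 'namespace:id' PID."""
    namespace, sep, local_id = pid.partition(":")
    if not sep or not namespace or not local_id:
        raise ValueError(f"not a namespace:id PID: {pid!r}")
    return namespace


def container_path(pid: str, root: str = "") -> str:
    """Map a PID to a per-namespace container path (hypothetical layout)."""
    namespace = pid_namespace(pid)  # also validates the PID shape
    local_id = pid.partition(":")[2]
    return f"{root}/{namespace}/{local_id}"


def objects_in_namespace(pids, namespace: str):
    """Filter stored PIDs: 'get me all objects that were in namespace X'."""
    return [pid for pid in pids if pid_namespace(pid) == namespace]
```

Note that the match is on the exact namespace, so `lsu` and `lsu-sc` are treated as distinct namespaces.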
Do containers suffer from the many-direct-children scalability issue discussed above? |
Fedora suffers from that problem. There's nothing inherent in LDP that causes that problem, but to the extent that you're committed to Fedora, you would have to deal with it. |
Worth mentioning here: It's unhealthy to think about multisites in a D8/CLAW context the way they were applied in Islandora 7.x. Multisites, by definition, imply different DB tables (not speaking about the domain access module), which means one site cannot access another site's entities. That makes splitting, or better said, reusing nodes/entities from one site to another extremely complex, not recommended, or even impossible without hacking (now speaking about the [domain access module](https://www.drupal.org/project/domain)). FYI: There have been a lot of discussions about the whole multisite approach here: https://www.drupal.org/project/drupal/issues/2306013 |
FWIW, we have been using namespace prefixes in D7 to accomplish multisite without actually using multisite. We serve a consortium of ~20 members, each with its own namespace prefix; using this scheme lets us support the idea of 'sub-institutions' (to arbitrary depth, in theory). I'm glad to share more, and at the very least, we have plenty of data like this that we could use to test a migration along the lines proposed above. We don't use RELS-EXT to define the relationship, so every collection is really just a child of root.
Effectively, however, this flat example represents two top-level institutions, lsu and latech, and one subinstitution of lsu, lsu-sc:
|
If we're storing the 7.x PID as per #822, and we're creating taxonomies as described in #888, maybe we should provide an option to create and populate a taxonomy of PID namespaces and assign the relevant value to each new CLAW node on the migration fly. That way, we get the ability immediately after migration to do some of the things in CLAW we were doing in the source 7.x with PID namespaces. I'm not suggesting we do this during the migration sprint, but maybe after. Might be a good first issue for someone (like me but it doesn't necessarily have to be me) to take on. |
Linking to #926 |
Now that migrate_7x_claw migrates the 7.x object's PID to the corresponding D8 node's Related issue: #822. |
Following from my previous comment, I've written a Context condition plugin that will be useful for objects migrated from 7.x. It tests the namespace part of a PID in a D8 islandora_object node's Here's the configuration form of a context that uses it, with a reaction (which is part of the core Context module) being to use the Bartik theme: Here's a screenshot of a node that has one of the registered namespaces: And a screenshot of a node that does not have one of the registered namespaces (i.e., reaction isn't executed): Currently, we don't have any context reactions that would be useful in a "multisite" setup (just to bring this back to @rosiel's original use case), but it would be possible to write some reactions that replicate 7.x multisite behavior. If people think this Context condition will be useful, I can open a PR against https://github.com/Islandora-CLAW/islandora to add it. |
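For readers who want the condition/reaction logic spelled out, here is a minimal Python sketch. The actual plugin is a Drupal (PHP) Context condition and this is not its code; the function name, parameters, and return convention are hypothetical. It only illustrates the described behavior: the reaction (here, switching to the Bartik theme) fires when the PID's namespace is one of the registered namespaces, and nothing happens otherwise.

```python
from typing import Iterable, Optional


def select_theme(pid: str,
                 registered_namespaces: Iterable[str],
                 themed: str = "bartik") -> Optional[str]:
    """Return the reaction's theme if the PID's namespace is registered, else None.

    Hypothetical illustration of a namespace condition plus a theme reaction.
    """
    namespace, sep, _local_id = pid.partition(":")
    # Condition: the PID must have a namespace, and it must be registered.
    if sep and namespace in set(registered_namespaces):
        return themed  # reaction is executed
    return None  # condition not met, reaction is not executed
```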
Just throwing these here in case they are of use later. |
Seeing @bondjimbond's awesome work on multitenancy, I would be happy to close this ticket as the multitenancy use case is more thoroughly expanded in #1300, and that sounds like a more advisable set up for multitenant systems. Namespacing was never really the issue; it was more about dividing up content. To summarize the output from this thread:
Thank you @DiegoPino @dannylamb @whikloj @ajs6f @uconnjeustis @mjordan @jpeak5 @ruebot @Natkeeran for your work on this thread. So... we good to close this thread? |
@rosiel awesome summary and relating of issues. One thing I'd like to offer though:
That's not necessarily true. A while back I put together https://github.com/mjordan/ip_range_access specifically to use Context for access control. I'd love to get some additional eyes on it. I wrote that module to replace a capability of the 7.x Islandora Context module that we use to control access to some licensed vendor content we host in our Islandora repo, and that we make accessible from off campus via Ezproxy. |
I think there's a reason that Contexts doesn't come with a "deny access" reaction. It works on the node or media's page. This does not carry through to Views, blocks, or other ways of exposing content. So if you're using this, be very careful. |
@rosiel thanks for the heads up. We haven't tested that module for those things yet but certainly will. |
Was told to create an issue for this, apologies if duplicated.