Migration Issues with DKAN for an already existing Drupal7-DKAN-based Data Platform #3828
Replies: 3 comments 2 replies
-
Hi @nplathe - there are quite a few things here, so we may need to break this into a few different discussions. Let me focus on the question about schemas here. This is definitely a challenging model you are describing, as in my opinion it blurs the line a bit between data and metadata. Arguably your sets of sources and patents are themselves two datasets, which your primary datasets are referencing. You have probably realized that you can define two additional custom schemas - source and patent in your case - as long as they are represented with valid JSON Schema documents. You should even be able to create a form in the Drupal admin UI for each of them with minimal effort. However, as you have noted, our current search is very dataset-focused. The good news is that the search functionality is quite isolated in its own module, metastore_search, which could be disabled and replaced with your own customized copy of it. I think by re-implementing the DataSource plugin and a couple of other classes, you could create an index that brings in all three schemas. But obviously this would be new territory. It's something we eventually need to figure out ourselves if we are going to open up the schema system to be as flexible as we want it to be, but it may not happen on our end fast enough to meet your timeline.
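To make the custom-schema idea a bit more concrete, here is a minimal sketch of what a "source" schema could look like as a JSON Schema document, written in Python so it can be serialized and sanity-checked. The property names (`identifier`, `title`, `sourceType`) and both helper functions are assumptions made up for illustration; they are not taken from DKAN or Plasma-MDS, and how the resulting file is registered in a DKAN project is not shown.

```python
import json

# Hypothetical JSON Schema for a "source" item. The property names are
# illustrative assumptions, not taken from DKAN or Plasma-MDS.
SOURCE_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Plasma source",
    "type": "object",
    "required": ["identifier", "title"],
    "properties": {
        "identifier": {"type": "string"},
        "title": {"type": "string"},
        "sourceType": {"type": "string"},
    },
}


def missing_required(schema: dict, item: dict) -> list:
    """Return the required property names absent from item.

    This only checks the "required" keyword; it is not full
    JSON Schema validation.
    """
    return [name for name in schema.get("required", []) if name not in item]


def to_schema_document(schema: dict) -> str:
    """Serialize the schema to the JSON text that would be saved as a file."""
    return json.dumps(schema, indent=2)


if __name__ == "__main__":
    print(to_schema_document(SOURCE_SCHEMA))
    print(missing_required(SOURCE_SCHEMA, {"title": "Example source"}))
```

A "patent" schema would follow the same pattern with its own properties. For real validation, a full JSON Schema validator should be used rather than the `required`-only check above.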
-
All this said, even if the search indexing is ironed out, we still have the frontend issue. We are very aware that building DKAN primarily to power decoupled React catalog apps creates a high barrier to entry. @dgading may want to say more about this, but we are starting to think about and experiment with a better out-of-the-box experience for DKAN that does not require a project to have a React developer on staff. That will require a few different things: some Twig work to get at least most of the basic DKAN functionality on a normal Drupal-rendered page, and some improvements to the Search API integration (probably a lot of overlap with the previous comment here) to allow metadata listings to be created with Views.
-
Would be happy to discuss collaborating if you'd like to contribute to pushing some of these items forward faster. Either way, we're trying to be more communicative and transparent about progress. Also, you may be aware that some people have been working on an alternative approach that is more of a straight rebuild of DKAN's D7 version: https://www.drupal.org/project/ekan
-
Hi, there.
My name is Nick Plathe, and I work together with Markus Becker (markus-m-becker) at the Leibniz Institute for Plasma Science and Technology. We are participating in a project called QPTdat, whose goal is to establish infrastructure and tools supporting data-driven research and knowledge transfer, so we work a lot with metadata in our research field. Although we and our colleagues are experienced in programming, none of us are experienced web developers, and most of us are severely limited by time and topic constraints.
Initial Situation
Over the last couple of years, we established a data platform called INPTDAT, which uses the Drupal 7-based DKAN modules (moniker: "old DKAN" or "DKAN 1") and Drupal itself to provide several features such as different content types, searchable information based on our own JSON Schema Plasma-MDS, etc. However, with EOL for Drupal 7 drawing nearer and IT security in mind, we are forced to migrate to a newer version or find another solution soon.
We have already experimented with the newer version of DKAN based on Drupal 8+ (moniker: "new DKAN" or "DKAN 2"), but had to realise that DKAN 2, as modern as it is, behaves very differently from old DKAN, and that using the harvester alone is not enough for a proper migration. Because of the trouble we have had with migrating, we want to reach out to the developers and the community of DKAN to ask and discuss a few questions on how to deal with these issues.
Management of Metadata in the backend
As mentioned, we are developing new metadata standards for our research field in order to introduce FAIR principles into daily research processes. On our platform, we currently distinguish between datasets (which would be covered by new DKAN as well), patents and plasma sources, since these types of content hold different information relevant to different audiences. With DKAN 1, we could easily declare new content types, but with the decoupled frontend and the changed functionality of the backend, this is no longer an option.
So one of the most important questions for us would be:
Following the previous question, we tried to implement additional schemas for our needs, because a "one size fits all" logic is not applicable to our use case: the types of content differ too much. While we are aware that the DKAN developers know about the current state of support for new schemas in new DKAN, postponing the imminent update of our data platform and waiting for additional features to be implemented is problematic because of the aforementioned security issues. What we have tested so far is injecting our schema into the dataset.json located in the metastore package of new DKAN, which contains the DCAT schema for storing information about datasets in new DKAN but, in its vanilla state, does not entirely fulfil our needs. However, with this approach the dataset entries not only hold duplicate information, since our schema preserves part of the information already present in DCAT, but are also more difficult to manage, especially since updates may break our approach. Which leads to the question:
Frontend dynamics
Let's assume for a moment that we managed to somehow circumvent the issues with the backend of DKAN. While the frontend provides a nice and comprehensive view of a dataset, it lacks a dynamic way of selecting which metadata are presented. Since we want to include metadata other than the current selection, this would be a crucial function, making the frontend more flexible. Additionally, it would be nice not to rely only on tags or topics when searching for datasets in the frontend, especially in the facet selection menu of the search. Of course, building a new frontend based on the provided API is technically possible, but without dedicated personnel or time this is not an option for us, and it would become problematic if DKAN ships any API-breaking updates.
Which brings up the following question:
And finally, one issue we have had is how to realise our several content types. We would like to present our platform in a uniform way, with all the content we already have. Of course, we could split our content and let part of it be handled by Drupal, but that looks neither uniform nor nice, since Drupal essentially only serves as a backend in new DKAN, as opposed to old DKAN.
So, the last question:
Remarks
We hope we have explained properly what our needs are and which problems we encountered with new DKAN. We intend to solve these issues, but are not able to do so without some help.
Kind regards