Solr: modification of fields related to access rights #462

Open
e-maud opened this issue Nov 28, 2024 · 4 comments

@e-maud
Member

e-maud commented Nov 28, 2024

New information regarding access rights and copyrights

...are coming in the main SOLR document index (solr2), and would need to be reflected in the middle layer.

As defined in the access-right schema, information should be displayed at newspaper and content item levels, and be shipped to both the WebApp and the API / Python Library.

Content item level (SOLR)

This is where the changes happen in the main Solr index: some existing fields are modified and new fields are added.

Since field values are always the same, some short surrogates are being used. The mapping between the full values and their surrogates is currently in enums.py in the solr repo, but will move to impresso-essentials.

  • Information on the data domain

    • The former field access_right_s becomes data_domain_s.
    • The field is stored and indexed.
    • Possible values are defined in enum.py#L23
  • Information on copyright

    • New field copyright_detail_i.
    • The field is currently stored and indexed.
    • Possible values are defined in enum.py#L33
  • Information on permitted use, per action:

    • New fields: perm_use_explore_plain, perm_use_get_tr_plain and perm_use_get_img_plain.
    • The fields are stored but not indexed.
    • Possible values are defined in enum.py#L62
  • Bitmaps
    Already implemented and discussed.
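
To illustrate what "stored but not indexed" means for a consumer, here is a minimal sketch of the query parameters a client might send. The field names come from the list above; the helper name, the `id` field and the filter value are assumptions made for the example, not part of the actual index or middle layer. Stored fields can be returned through `fl`, while a filter (`fq`) only works on an indexed field such as data_domain_s (unless docValues are enabled, which is not mentioned here).

```python
# Hypothetical helper: the function name, the "id" field and any filter value
# passed in are illustrative assumptions. Only the access-right field names
# are taken from the issue text.
RETURNED_FIELDS = [
    "id",  # assumed unique key of a content item
    "data_domain_s",
    "copyright_detail_i",
    "perm_use_explore_plain",
    "perm_use_get_tr_plain",
    "perm_use_get_img_plain",
]


def build_select_params(query="*:*", data_domain=None):
    """Build Solr /select parameters that return all access-right fields.

    Filtering is only offered on data_domain_s, because the perm_use_* fields
    are described as stored but not indexed.
    """
    params = {"q": query, "fl": ",".join(RETURNED_FIELDS)}
    if data_domain is not None:
        # Simple quoting only; a real client should escape the value properly.
        params["fq"] = f'data_domain_s:"{data_domain}"'
    return params
```

If filtering on permitted use turns out to be needed, the perm_use_* fields would have to be indexed first, and a second `fq` clause could then be added in the same way.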

Questions

  1. Are the possible values of copyright_detail_i OK, or too cryptic / too similar to bitmaps?
  2. Do you want to filter on permitted uses (in which case the fields need to be indexed)?
  3. The source of truth for the mapping is now in the solr repo and will move to impresso-essentials as an Enum: is that OK for you, or would you prefer a JSON file?
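
For question 3, a rough sketch of what an Enum-based mapping between full values and short surrogates could look like. The class name, member names and surrogate strings below are placeholders invented for the example; the real values live in the solr repo and are not reproduced here. For an integer field such as copyright_detail_i the surrogates would presumably be ints rather than strings.

```python
from enum import Enum


class PlaceholderMapping(Enum):
    """Placeholder mapping: member name = full value, member value = surrogate.

    These members are NOT the real values from the solr repo.
    """

    FULL_VALUE_ONE = "s1"
    FULL_VALUE_TWO = "s2"


def to_full_value(surrogate: str) -> str:
    """Look up the full value for a surrogate read from Solr."""
    # Enum lookup by value raises ValueError for an unknown surrogate.
    return PlaceholderMapping(surrogate).name


def to_surrogate(full_value: str) -> str:
    """Look up the surrogate to write to Solr for a full value."""
    # Enum lookup by name raises KeyError for an unknown full value.
    return PlaceholderMapping[full_value].value
```

Compared with a JSON file, an Enum gives lookups in both directions and fails loudly on unknown values, but consumers outside Python would need to mirror the mapping themselves.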

Newspaper level (MySQL)

  • Human-readable rights statement (property_id = 36 as before) that includes data domain + permitted use, per period.

Example in current new mysql:

image

This information should continue to be shown on the newspaper page as it is currently.

@theorm
Member

theorm commented Nov 29, 2024

Hi @e-maud , thanks for the detailed description of the changes. My answers:

Are the possible values of copyright_detail_i OK or too cryptic / similar to bitmaps?
Doesn't matter for IML. As long as they are consistent. We can add our own mapping in IML if they need to be changed.

Do you want to filter on permitted uses (in which case the fields need to be indexed)?
We can but I'm not sure if this is needed (I can't think of any scenario right now). I think it's a question for @danieleguido and @mduering.

The source of truth for mapping is now in solr repo, will go in impresso-essentials as Enum: Ok for you, or you prefer a JSON file?
We won't be reading the JSON file automatically, so an Enum is fine.

What was the reason for changing access_right_s to data_domain_s? It's easy to change in the code, but then it won't be compatible with the old Solr instance. Is it right to say that the access_right_s field has been deprecated and removed and data_domain_s has been added? It will be easier to treat it this way.

@e-maud
Member Author

e-maud commented Nov 29, 2024

Hi @theorm,

  • about copyrights values: perfect, thanks.

  • permitted use indexed or not: OK, let's wait for Daniele and Marten.

  • source of truth in Enum: perfect as well. I will let you know once it changes repo. Values should not change.

  • about the change of access_right_s to data_domain_s: the reason is that the field name 'access rights' was quite underspecified / vague, covering both copyright status and access rights, whereas this is now more neatly and precisely defined (copyright status, permitted use, data domain and bitmap jointly contribute to the definition of access rights for a CI). You are completely right regarding compatibility with the old Solr, I overlooked this. Yes, it is right to see access_right_s as deprecated and removed, rather than changed, and data_domain_s as added.
    But then, is it OK for IML to handle both ways for a while?

@danieleguido
Contributor

hi @e-maud and @theorm, I agree that adding data_domain_s is better than replacing, no problem to add the other fields. Regarding permitted uses, there is no need to filter on them imo, as we already have data_domain_s

@theorm
Member

theorm commented Dec 2, 2024

But then, is it OK for IML to handle both ways for a while?

I think it is. We just need to make sure the app doesn't break when one of these fields is missing - basically we will be treating both of them as optional. At least to start with.
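
A minimal sketch of what "treating both of them as optional" could look like, written in Python purely for illustration (it is not the IML code, and the function name is hypothetical). It assumes a Solr document arrives as a plain mapping and that the new field should win when both are present; it does not translate between the old and new value vocabularies.

```python
from typing import Any, Mapping, Optional


def resolve_data_domain(doc: Mapping[str, Any]) -> Optional[str]:
    """Return the data domain of a content item, tolerating either Solr schema.

    Prefers the new data_domain_s field, falls back to the deprecated
    access_right_s field, and returns None when neither is present.
    """
    for field in ("data_domain_s", "access_right_s"):
        value = doc.get(field)
        if value is not None:
            return value
    return None
```

With this shape, a document from the old Solr instance (only access_right_s) and one from the new instance (only data_domain_s) both resolve without errors, and a document with neither field yields None instead of breaking the app.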
