Add External Metadata Validation code to Dataverse codebase #8155

Closed
djbrooke opened this issue Oct 14, 2021 · 2 comments · Fixed by #8245

@djbrooke
Contributor

Overview of the Feature Request

Spam deposits in the Harvard Dataverse installation began increasing in mid-2020. As of late 2020, this has been largely mitigated by a solution based on SpamAssassin. That code has not been merged to develop and lives in another repository, so we have to create a custom build for Harvard Dataverse with each release. It was not merged to develop at the time because we did not expect other installations to need it, as most installations are not as open for deposit as the Harvard Dataverse installation. We'd like to bring it to this codebase because:

  • other installations may get spam and we want them to be able to respond quickly through a configuration change instead of needing to do a custom build
  • it takes time to do the custom build for each release for the Harvard Dataverse repository and we'd like to spend that time doing other things
  • the spam checker will not change any behavior around publishing/editing unless it's explicitly enabled, so it is OK to bring into this repository, and it gives us the chance to change/improve on it if we want to do so.

What kind of user is the feature intended for?

@landreev, as he is the one who has had to do the custom builds. :)

What inspired the request?

The 5.7 release and the need to do a custom build made us reflect on this process and start a discussion.

What existing behavior do you want changed?

The need to do a custom build for Harvard Dataverse with each release.

Any brand new behavior do you want to add to Dataverse?

This will make the spam checker available, but an installation would need to set up a script to turn it on.
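As a rough illustration of what such a script could look like, here is a minimal sketch in Python. Everything in it is an assumption made for illustration and is not taken from the Harvard code: it assumes the script is handed the path to a JSON metadata export as its first argument, and that exit status 0 means "accept" while any other status means "reject". The phrase list and function names are hypothetical.

```python
import json
import sys

# Hypothetical block list; a real installation would choose its own rules.
BLOCKED_PHRASES = ("buy now", "free download", "casino")


def collect_strings(node):
    """Recursively yield every string value found in a parsed JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from collect_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_strings(item)


def looks_like_spam(metadata):
    """Return True if any string value contains a blocked phrase."""
    return any(
        phrase in text.lower()
        for text in collect_strings(metadata)
        for phrase in BLOCKED_PHRASES
    )


def main(path):
    """Validate one exported metadata file and return an exit status."""
    with open(path, encoding="utf-8") as handle:
        metadata = json.load(handle)
    # Assumption: 0 means the metadata is accepted, 1 means it is rejected.
    return 1 if looks_like_spam(metadata) else 0


# Only act when invoked with exactly one argument (the assumed export path).
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv[1]))
```

The check is a plain function over the parsed metadata, so the rules could be swapped for something stronger (for example, a call out to SpamAssassin) without touching the file handling or the exit-status convention.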

Any related open or closed issues to this feature request?

#7291

@djbrooke changed the title from "Add Spam Filter code to Dataverse codebase" to "Add External Metadata Validation code to Dataverse codebase" on Oct 27, 2021
@djbrooke
Contributor Author

  • Code is in a branch in the dataverse.harvard.edu repo
  • We'd want to remove some parts of the existing code (the SpamAssassin config file) and bring what's reasonable into develop
  • Add a release note about possible other uses for this, like metadata validation (would be good for @djbrooke to learn what other uses there are!)
  • There is some desire to bring this into workflows, but we'd need to build a workflow for Dataverses and we can handle this later - there is some debate about whether this is a separate feature or not (put a comment about this being added later!)

@djbrooke added the Small label on Oct 27, 2021
@landreev
Contributor

I'll get this out of the way quickly, as soon as 5.8 is tagged.

landreev added a commit that referenced this issue Nov 17, 2021
…8155).

Also, changed "dataverse" to "dataverse collections" in the default error messages.
landreev added a commit that referenced this issue Nov 17, 2021
landreev added a commit that referenced this issue Nov 17, 2021
landreev added a commit that referenced this issue Nov 17, 2021
landreev added a commit that referenced this issue Nov 19, 2021
… script sections, clarifying the way the script runs on a saved metadata export. (#8155)