Check that sectionTitle and sectionAbbrev match throughout the data #30

Open
ajnyga opened this issue Oct 31, 2024 · 6 comments

Owner

ajnyga commented Oct 31, 2024

An important validity check is to make sure that the sectionTitle and sectionAbbrev fields are used correctly.

If the data contains, for example, Articles - ART and one row happens to have Articles - REV, the conversion still produces a valid XML document, but the import fails with:

PHP Fatal error: Uncaught Error: Call to a member function getData() on null in /classes/search/ArticleSearchIndex.inc.php:38
Basically, the whole issue containing the error is not imported, and you end up with Submission objects without Publications. This is probably the problem reported in pkp/pkp-lib#9755.

Collaborator

ronste commented Nov 1, 2024

Yes, I have had comparable issues. One needs to verify that the sections exist in the OJS installation before import, in particular if custom sections are used.

I don't think we can add this verification to the conversion script, since we don't know whether "Articles - REV" is intended to be a valid section. If it is, it needs to be created manually before import.

We could, however, print a warning when we encounter section definitions that differ from the one default section (Articles - ART) that OJS ships with in a default installation. This test would need to be localized, though.

What do you think?

Owner Author

ajnyga commented Nov 1, 2024

I mean a case where you have both "Articles - ART" and "Articles - REV" in the same data. For each section title there should always be just one abbreviation in use. Of course, if the data has only "Articles - REV" throughout the whole sheet, that is not a problem, but the mix is something that should probably be prevented.
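
A minimal sketch of such a check, written in Python purely for illustration. The row dictionaries and the function name `find_conflicting_abbrevs` are hypothetical and not taken from the conversion script; only the column names sectionTitle and sectionAbbrev come from the data.

```python
from collections import defaultdict


def find_conflicting_abbrevs(rows):
    """Return {sectionTitle: sorted abbreviations} for every title
    that is paired with more than one sectionAbbrev in the rows."""
    seen = defaultdict(set)
    for row in rows:
        title = (row.get("sectionTitle") or "").strip()
        abbrev = (row.get("sectionAbbrev") or "").strip()
        if title:
            seen[title].add(abbrev)
    return {t: sorted(a) for t, a in seen.items() if len(a) > 1}


# Hypothetical sheet rows: the second row carries the wrong abbreviation.
rows = [
    {"sectionTitle": "Articles", "sectionAbbrev": "ART"},
    {"sectionTitle": "Articles", "sectionAbbrev": "REV"},
    {"sectionTitle": "Reviews", "sectionAbbrev": "REV"},
]

for title, abbrevs in find_conflicting_abbrevs(rows).items():
    print(f"Warning: section '{title}' uses several abbreviations: {', '.join(abbrevs)}")
```

A sheet that uses "Articles - REV" consistently would pass this check, which matches the case described above as unproblematic.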

Collaborator

ronste commented Nov 1, 2024

> I mean a case where you have both "Articles - ART" and "Articles - REV" in the same data. For each section title there should always be just one abbreviation in use. Of course, if the data has only "Articles - REV" throughout the whole sheet, that is not a problem, but the mix is something that should probably be prevented.

Ok, I see your point, and it is slightly different from my example. However, why should identical section titles always have identical abbreviations? Couldn't there be two sections named "Articles" in the same data that have different settings in OJS? E.g. one where abstracts are required and another where they are not? Or with different word count limits? Just theoretically ...

Owner Author

ajnyga commented Nov 1, 2024

I guess they could, but at the moment that causes the import to break down. Of course, this could be something that should be fixed on the import side.

Owner Author

ajnyga commented Nov 1, 2024

The weird thing is that using the UI I can create duplicate sections "Articles - ART" without any problem. Also, I guess that if I create sections "Articles - ART" and "Articles - REV", export content and try to import it again, then pkp/pkp-lib#9755 happens?

Collaborator

ronste commented Nov 7, 2024

I looked into the details of this problem again and was able to reproduce the above-mentioned error only when the section definition was missing from the XML file.

E.g. if my XML file looks like:

  <issue ...>
    ...
    <sections>
      <section ref="ART" seq="0">
        <abbrev locale="en_US">ART</abbrev>
        <title locale="en_US">Articles</title>
      </section>
      <section ref="REV" seq="1">
        <abbrev locale="en_US">REV</abbrev>
        <title locale="en_US">Reviews</title>
      </section>
    </sections>
    <articles ...>
      <article ...>
        ...
        <publication ... section_ref="REV" ...>
          ...
        </publication>
      </article>
    </articles>
  </issue>

and the section "Reviews" does not yet exist in the installation, OJS creates it automatically and all articles are imported correctly. So this is not an OJS issue but a problem of an inconsistent XML file. A simple xmllint run will point out the missing section reference.
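
As a rough illustration of the consistency condition (every section_ref must point to a section defined in the same issue), here is a small Python sketch using only the standard library. It is not the xmllint check and not code from the conversion script; the function names are hypothetical, and the sample document is a stripped-down stand-in for the structure shown above.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def undefined_section_refs(xml_text):
    """Return every section_ref used by a <publication> that has no
    matching <section ref="..."> inside the same <issue>."""
    root = ET.fromstring(xml_text)
    missing = []
    for issue in root.iter():
        if local_name(issue.tag) != "issue":
            continue
        defined, used = set(), []
        for el in issue.iter():
            name = local_name(el.tag)
            if name == "section":
                defined.add(el.get("ref"))
            elif name == "publication":
                used.append(el.get("section_ref"))
        missing.extend(ref for ref in used if ref not in defined)
    return missing


# Simplified sample: "REV" is referenced but never defined.
sample = """<issue>
  <sections>
    <section ref="ART" seq="0">
      <abbrev locale="en_US">ART</abbrev>
      <title locale="en_US">Articles</title>
    </section>
  </sections>
  <articles>
    <article><publication section_ref="ART"/></article>
    <article><publication section_ref="REV"/></article>
  </articles>
</issue>"""

print(undefined_section_refs(sample))
```

The tag names are compared without their namespace so that the sketch works whether or not the file declares one.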

My last commit (from yesterday) to the new branch prints how many issues, sections, articles, ... were added to the XML file. If it added two sections and you expected only one, this already hints at a typo in the Excel sheet.
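
For illustration only, a summary of that kind could be computed from a generated file roughly as follows. This Python sketch is an assumption about the idea, not the code from the commit; `count_elements` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(xml_text, names=("issue", "section", "article")):
    """Count elements by tag name (namespace prefix ignored) and
    return the totals for the requested names."""
    root = ET.fromstring(xml_text)
    counts = Counter(el.tag.rsplit("}", 1)[-1] for el in root.iter())
    return {name: counts[name] for name in names}


# Simplified sample with one issue, two sections and one article.
sample_doc = """<issue>
  <sections>
    <section ref="ART" seq="0"/>
    <section ref="REV" seq="1"/>
  </sections>
  <articles>
    <article><publication section_ref="ART"/></article>
  </articles>
</issue>"""

for name, total in count_elements(sample_doc).items():
    print(f"{name}: {total}")
```

Two sections in the summary where only one was expected would be the hint mentioned above.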

I could also do an automatic validation before saving the file. Do you think this would be useful?
