Adding description info to the fileDscr section in DDI CodeBook. #5051 #10938
Overall, looks fine. I'm trusting that "notes" is the right place to put the file descriptions. I did leave a couple other comments.
Hey @landreev, while testing this I noticed something. When I upload a CSV file, I see the correct DDI output. When I upload a Stata file, the format is different and notes appears twice. Here is the Stata test file (I compressed it so I could add it here; please unzip it first).
@ofahimIQSS We cannot guarantee that we will successfully ingest every file in a potentially "ingestable" file format. This is especially true of CSV. We will try to parse and ingest any CSV file uploaded by the user (as long as it is below the ingestable size limit, if one is defined), but the attempt may or may not succeed. Some of the more common reasons why ingest of a CSV file fails: ingest will stop if the first line is not a comma-separated list of what look like the names of the individual variables; ingest will fail unless every row contains the same number of comma-separated fields.
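For anyone preparing a CSV test file, the two failure conditions above can be approximated locally before uploading. The following is only a rough sketch, not the actual ingest code: `check_csv_shape` and `_is_number` are hypothetical helper names, and the "looks like a variable name" test (non-empty and not a bare number) is an assumed heuristic.

```python
import csv
import io


def _is_number(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def check_csv_shape(text):
    """Return a list of problems that would likely block ingest; empty if none found."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    # Assumed heuristic: a header cell should be non-empty and not a bare number.
    for name in header:
        if not name.strip() or _is_number(name):
            problems.append(f"header field {name!r} does not look like a variable name")
    # Every data row must have the same number of fields as the header.
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(f"row {row_no} has {len(row)} fields, expected {len(header)}")
    return problems
```

Calling `check_csv_shape(open("test.csv", newline="").read())` on a candidate file (the file name is a placeholder) returns an empty list when neither condition is triggered. A clean result does not guarantee that ingest will succeed.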
Thanks for the clarification @landreev - Merging PR
What this PR does / why we need it:
Apparently, users have been asking for this since 2018: for tabular files that have the Description field populated, that description was never exported in the DDI (non-ingested files always had their descriptions exported, in the corresponding `<otherMat>` sections).
There is no obvious field under `<fileDscr>` in the DDI Codebook schema for it, which is probably why we chose not to export it back in the day (?), but putting it into another dedicated free-text `<notes>` field seems like a reasonable solution.
The RestAssured export tests are passing, so un-drafting the PR.
I kept the changes minimal to stay under the "3" estimate.
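To make the intended result easier to picture, here is a small sketch of how one might check an exported DDI for per-file notes. It is illustrative only: the sample XML is hand-written, the `level` and `subject` attribute values and the file name are made up, `file_notes` is a hypothetical helper, and the `ddi:codebook:2_5` namespace is assumed to match what the export uses.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for DDI Codebook 2.5 exports.
DDI_NS = {"ddi": "ddi:codebook:2_5"}

# Hand-written sample; attribute values and names are illustrative, not real export output.
SAMPLE_DDI = """<codeBook xmlns="ddi:codebook:2_5">
  <fileDscr ID="f1">
    <fileTxt><fileName>survey.tab</fileName></fileTxt>
    <notes level="file" subject="File Description">Wave 1 survey responses</notes>
  </fileDscr>
</codeBook>"""


def file_notes(ddi_xml):
    """Map each fileDscr's file name to the text of its direct <notes> children."""
    root = ET.fromstring(ddi_xml)
    result = {}
    for file_dscr in root.findall(".//ddi:fileDscr", DDI_NS):
        name = file_dscr.findtext(
            "ddi:fileTxt/ddi:fileName", default="", namespaces=DDI_NS
        )
        result[name] = [
            (note.text or "").strip()
            for note in file_dscr.findall("ddi:notes", DDI_NS)
        ]
    return result


if __name__ == "__main__":
    for name, notes in file_notes(SAMPLE_DDI).items():
        print(name, notes)
```

Running the same function against a real export should show the file's description among the notes listed for the tabular file. A file may carry other `<notes>` entries as well, so the check should look for the description text rather than expect exactly one note.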
Which issue(s) this PR closes:
Special notes for your reviewer:
Suggestions on how to test this:
Straightforward. Upload some file that's known to be ingestable (Stata, CSV ... doesn't matter). Populate the description field in the file metadata. Publish the dataset. Look at the DDI export; the description should now be showing in the corresponding `<fileDscr ...>`.
For extra credit, look at the file under Data Explorer and verify that the new `<notes>` element isn't causing any trouble there (the Explorer relies on the DDI for viewing and, in the latest version, editing).
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?:
Additional documentation: