Map Avram to MQA Schema #210

Open
nichtich opened this issue Dec 12, 2024 · 4 comments

Comments

@nichtich
Contributor

I think we can map an Avram schema of format family flat (e.g. CSV files) to a corresponding MQA Schema. Avram schemas for MARC and PICA would first require addition of these formats and their locator languages (path) to metadata-qa-api.

@pkiraly
Owner

pkiraly commented Dec 12, 2024

@nichtich Sorry, I do not really understand the idea. Could you add an example for the CSV file?

@nichtich
Contributor Author

Sample flat record from Avram specification:

{
  "fields": [
    { "tag": "given", "value": "Henriette" },
    { "tag": "given", "value": "Davidson" },
    { "tag": "surname", "value": "Avram" },
    { "tag": "birth", "value": "1919-10-07" }
  ]
}

Same in CSV (with non-standard internal separator '|'):

given,surname,birth
Henriette|Davidson,Avram,1919-10-07
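
To make the correspondence between the two samples above concrete, here is a minimal Python sketch that reads the CSV and rebuilds the flat record. The function name `csv_to_flat_records` and the `internal_sep` parameter are hypothetical names chosen for illustration; the `|` separator is the non-standard convention of this example, not part of CSV or Avram.

```python
import csv
import io


def csv_to_flat_records(text, internal_sep="|"):
    """Turn CSV text into a list of flat records of the form {"fields": [...]}.

    Assumption: repeated values share one cell and are split on internal_sep.
    """
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        fields = []
        for tag, cell in row.items():
            # Skip surplus columns (tag is None) and empty or missing cells.
            if tag is None or not cell:
                continue
            for value in cell.split(internal_sep):
                fields.append({"tag": tag, "value": value})
        records.append({"fields": fields})
    return records


sample = "given,surname,birth\nHenriette|Davidson,Avram,1919-10-07\n"
records = csv_to_flat_records(sample)
```

Each CSV row becomes one record, and each column name becomes the `tag` of one or more fields.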

Sample Avram schema:

{
  "family": "flat",
  "fields": {
    "given": {
      "label": "given name",
      "required": false,
      "repeatable": true
    },
    "surname": {
      "required": true,
      "repeatable": true
    },
    "birth": {
      "description": "date of birth in YYYY-MM-DD format",
      "required": false,
      "repeatable": false,
      "pattern": "^[0-9-]+$"
    }
  }
}
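
As a rough illustration of what this schema expresses, the following Python sketch checks a flat record against the three constraints used above (`required`, `repeatable`, `pattern`). It is not a full Avram validator: the `validate` function is a hypothetical helper, missing `required` and `repeatable` keys are assumed to default to false, the pattern is applied with `re.search`, and fields not listed in the schema are not reported.

```python
import re
from collections import Counter

schema = {
    "family": "flat",
    "fields": {
        "given": {"label": "given name", "required": False, "repeatable": True},
        "surname": {"required": True, "repeatable": True},
        "birth": {
            "description": "date of birth in YYYY-MM-DD format",
            "required": False,
            "repeatable": False,
            "pattern": "^[0-9-]+$",
        },
    },
}

record = {
    "fields": [
        {"tag": "given", "value": "Henriette"},
        {"tag": "given", "value": "Davidson"},
        {"tag": "surname", "value": "Avram"},
        {"tag": "birth", "value": "1919-10-07"},
    ]
}


def validate(record, schema):
    """Return a list of error messages; an empty list means no violation found."""
    errors = []
    counts = Counter(field["tag"] for field in record["fields"])
    for tag, definition in schema["fields"].items():
        n = counts.get(tag, 0)
        if definition.get("required", False) and n == 0:
            errors.append(f"missing required field '{tag}'")
        if not definition.get("repeatable", False) and n > 1:
            errors.append(f"field '{tag}' must not be repeated")
    for field in record["fields"]:
        pattern = schema["fields"].get(field["tag"], {}).get("pattern")
        if pattern and not re.search(pattern, field["value"]):
            errors.append(f"value of '{field['tag']}' does not match {pattern}")
    return errors


errors = validate(record, schema)
```

Note that the sample pattern only restricts the characters of `birth`; it would also accept a value such as `19-19`.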

This could be expressed as an MQA Schema as well.

However, flat CSV files are rarely validated at all in practice. Either you get dirty CSV, or you get more strictly formatted data, but then it's not CSV but JSON, XML or some other format.

@pkiraly
Owner

pkiraly commented Dec 19, 2024

Thanks for the example, now it is clear. The sample Avram schema could be transformed to MQA schema as:

format: CSV
fields:
  - path: given
    name: given name
    rules:
    - minCount: 0
  - path: surname
    name: surname
    rules:
    - minCount: 1
  - path: birth
    name: birth
    description: date of birth in YYYY-MM-DD format
    rules:
    - minCount: 0
    - pattern: ^\\d{4}-\\d{2}-\\d{2}$

In that simple case the translation is straightforward. Do you have a repository with full examples of such flat schemas that I could use as base inputs for an Avram2MQA "translation" class?
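
For discussion, the translation of this simple case could be sketched in Python as follows. This is only an assumption about what an Avram2MQA step might do, not existing code in metadata-qa-api: `avram_to_mqa` and its `fmt` parameter are hypothetical names, the output is a plain dictionary mirroring the YAML structure shown in this thread, and only the keys that appear in that example (`path`, `name`, `description`, `minCount`, `pattern`) are produced. The Avram `repeatable` flag is left unmapped, and the Avram pattern is copied verbatim instead of being tightened by hand.

```python
avram_schema = {
    "family": "flat",
    "fields": {
        "given": {"label": "given name", "required": False, "repeatable": True},
        "surname": {"required": True, "repeatable": True},
        "birth": {
            "description": "date of birth in YYYY-MM-DD format",
            "required": False,
            "repeatable": False,
            "pattern": "^[0-9-]+$",
        },
    },
}


def avram_to_mqa(avram, fmt="CSV"):
    """Translate a flat Avram schema into a dict shaped like an MQA schema."""
    if avram.get("family") != "flat":
        raise ValueError("this sketch only handles Avram schemas of family 'flat'")
    fields = []
    for tag, definition in avram["fields"].items():
        # Use the Avram label as the name, falling back to the field tag.
        field = {"path": tag, "name": definition.get("label", tag)}
        if "description" in definition:
            field["description"] = definition["description"]
        # required: true maps to minCount 1, anything else to minCount 0.
        rules = [{"minCount": 1 if definition.get("required") else 0}]
        if "pattern" in definition:
            rules.append({"pattern": definition["pattern"]})
        field["rules"] = rules
        fields.append(field)
    return {"format": fmt, "fields": fields}


mqa_schema = avram_to_mqa(avram_schema)
```

Serializing `mqa_schema` to YAML would need a YAML library, which is left out here to keep the sketch dependency-free.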

@nichtich
Contributor Author

I'm not sure whether this repository is the right code base and it's not urgent, so let's keep Avram2MQA (and possibly other transformations such as from/to Data Package Table Schema) for 2025.


2 participants