String datatype requires a language, even when the data is bytes not text #9191

Closed
azaroth42 opened this issue Jan 19, 2023 · 3 comments

Comments

@azaroth42
Collaborator

Arches Version/Branch: stable/7.2.1

Describe the bug and how to reproduce it:

Some string data, such as identifiers, does not have a language. For example, "sh89054622" is a string, but it is not in any human language. The current implementation of the string datatype requires a language, as represented by the language map in the JSON-LD.

This could be addressed with a second datatype, such as string-without-language, or with a "magic" language code such as zxx (ISO 639-2: "no linguistic content; not applicable"), which would then be treated differently from strings with real languages.

@azaroth42
Collaborator Author

azaroth42 commented Jan 19, 2023

The following change to to_rdf in the string datatype would let zxx, or another configured language code, mark a value as a raw string with no language tag:

        if edge_info["range_tile_data"] is not None:
            g.add((edge_info["d_uri"], RDF.type, URIRef(edge.domainnode.ontologyclass)))
            # Language code that marks a value as a plain string; falls back to 'zxx'
            # when NULL_LANGUAGE_CODE is not defined in settings.
            null_code = getattr(settings, "NULL_LANGUAGE_CODE", "zxx")
            for key, entry in edge_info["range_tile_data"].items():
                if entry["value"]:
                    # Omit the language tag when the key is the null language code.
                    lang = None if key == null_code else key
                    g.add((edge_info["d_uri"], URIRef(edge.ontologyproperty), Literal(entry["value"], lang=lang)))
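
As a standalone illustration of the branching in that snippet, here is a minimal sketch that runs without Arches or rdflib. The function name `iter_literals` and the sample tile data are hypothetical; only the `{language_code: {"value": ...}}` shape and the `zxx` default are taken from the snippet above.

```python
from typing import Iterator, Optional, Tuple

# Assumption: mirrors the proposed settings.NULL_LANGUAGE_CODE default.
NULL_LANGUAGE_CODE = "zxx"


def iter_literals(
    tile_data: dict, null_code: str = NULL_LANGUAGE_CODE
) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (value, lang) pairs; lang is None for the null language code."""
    for lang, entry in tile_data.items():
        value = entry.get("value")
        if not value:
            continue  # skip empty values, as the snippet above does
        yield value, (None if lang == null_code else lang)


# Hypothetical tile data: one identifier and two language-tagged entries.
tile_data = {
    "zxx": {"value": "sh89054622"},
    "en": {"value": "Example label"},
    "fr": {"value": ""},
}
pairs = list(iter_literals(tile_data))
```

Each pair could then be handed to rdflib as `Literal(value, lang=lang)`; with `lang=None` that produces a plain literal without a language tag.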

A related issue is that the default language code set in the card designer doesn't get associated with the card, so you can't default identifier values to zxx and actual language-bearing content to a real language such as en. The values also don't render (for me at least) in the report or editor.

(The above also has the fix for #9192)

(My current workaround is simply to set en as NULL_LANGUAGE_CODE and never use languages on data)

@SDScandrettKint
Member

SDScandrettKint commented Sep 6, 2023

I've recently run into this issue as well; some string data is universal to all languages. My suggestion was similar to yours: the creation of a new datatype, which I called "keyword".

Example use cases for the "keyword" datatype would be identifiers such as names or IDs, or parts of an address such as a postcode or a street name. These data have no specific language but, at the moment, for them to show up in each of the desired languages, they need to be duplicated for each language. This means holding the same postcode for every language even though the value is constant. This practice seems unnecessary, and the duplicated data makes things look messy. Hence the potential need for a "keyword" datatype that has the properties of the string datatype as it was before i18n was implemented.
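
To make the duplication concrete, here is a rough sketch of how a lookup could avoid it if language-neutral values were stored once under a null language code. This is not existing Arches behavior: the helper `display_value`, the use of `zxx` as the neutral key, and the sample data are all assumptions for illustration.

```python
from typing import Optional


def display_value(
    tile_data: dict, requested_lang: str, null_code: str = "zxx"
) -> Optional[str]:
    """Return the value for requested_lang, else the language-neutral value."""
    # Try the requested language first, then the assumed neutral code.
    for code in (requested_lang, null_code):
        value = tile_data.get(code, {}).get("value")
        if value:
            return value
    return None


# Hypothetical data: a postcode stored once, a street name stored per language.
postcode = {"zxx": {"value": "SW1A 1AA"}}
street = {"en": {"value": "High Street"}, "cy": {"value": "Stryd Fawr"}}

postcode_en = display_value(postcode, "en")
postcode_cy = display_value(postcode, "cy")
street_cy = display_value(street, "cy")
```

With a dedicated "keyword" datatype the fallback would not be needed at all, since the value would simply be a plain string with no language map.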

@chiatt
Member

chiatt commented Sep 12, 2023

We could simply re-add the old string datatype with a new name, so this wouldn't be difficult to implement. I like keyword or maybe string-identifier, but I think string-without-language would be most explicit, helping graph designers understand the implications of their datatype selection.

apeters added a commit that referenced this issue Mar 1, 2024
apeters added a commit that referenced this issue Mar 15, 2024
apeters added a commit that referenced this issue Mar 25, 2024
…xt.htm


add aria-label for required status, re #9191

Co-authored-by: Jacob Walls <[email protected]>
apeters added a commit that referenced this issue Mar 25, 2024
apeters added a commit that referenced this issue Apr 4, 2024
apeters added a commit that referenced this issue Apr 4, 2024
@chiatt chiatt closed this as completed Apr 4, 2024