Provide a client code hook for controlling model truthiness #110

Closed
lu-pl opened this issue Oct 23, 2024 · 4 comments · Fixed by #113

Comments

@lu-pl
Contributor

lu-pl commented Oct 23, 2024

Model truthiness is an important metric for the rdfproxy grouping mechanism. Currently, the logic for determining model truthiness is hard-coded: a model is recognized as truthy if any of its fields is truthy (see rdfproxy.mapper.ModelBindingsMapper._get_unique_models, line 35).

This is a sane default, yet certain frontend demands require different model truth conditions.

Current behavior

With the current implementation, a simple model definition like

from fastapi import FastAPI
from pydantic import BaseModel, ConfigDict
from rdfproxy import Page, SPARQLModelAdapter

query = """
select ?parent ?child ?name
where {
    values (?parent ?child ?name) {
        ('x' 'c' 'foo')
        ('y' 'd' UNDEF)
        ('y' 'e' UNDEF)
        ('z' UNDEF UNDEF)
    }
}
"""

class Child(BaseModel):
    name: str | None = None

class Parent(BaseModel):
    model_config = ConfigDict(group_by="parent")

    parent: str
    children: list[Child]

adapter = SPARQLModelAdapter(
    target="https://query.wikidata.org/bigdata/namespace/wdq/sparql",
    query=query,
    model=Parent,
)

app = FastAPI()

@app.get("/")
def base_route(page: int = 1, size: int = 100) -> Page[Parent]:
    return adapter.query(page=page, size=size)

yields the following result:

{
    "items": [
        {
            "parent": "x",
            "children": [
                {
                    "name": "foo"
                }
            ]
        },
        {
            "parent": "y",
            "children": []
        },
        {
            "parent": "z",
            "children": []
        }
    ],
    "page": 1,
    "size": 100,
    "total": 3,
    "pages": 1
}

According to the currently hard-coded truth condition for model instances, a model is truthy if any of its fields is truthy; so the above configuration correctly returns empty arrays for y and z children, because for those rows, the single Child field name is None.

However, it might very well be desirable for backend implementers and API consumers to differentiate between "no object" and "an object with a single null value/only null values". Currently, this is not possible.

Solution proposal

One solution is to provide a hook that lets client code control the conditions for model instance truthiness, by supporting a model_bool field in pydantic.ConfigDict.

The model_bool property would accept arguments of type

  1. Callable
    Client code may provide a callable of arity 1 which receives the model instance as argument at runtime.

  2. str
    A string value for model_bool defines the truthiness of the field denoted by that string value as general truth condition for the model.

  3. Iterable[str]
    An Iterable[str] value for model_bool defines the truthiness of the model as the conjunction of all fields referenced in the iterable, i.e. the model is only considered to be truthy if all the referenced fields have truthy values.

This way it would be possible to adapt the above example to allow objects with only a single null value like so:

class Child(BaseModel):
    model_config = ConfigDict(model_bool=lambda model: True)
    
    name: str | None = None

The expected result would then be:

{
    "items": [
        {
            "parent": "x",
            "children": [
                {
                    "name": "foo"
                }
            ]
        },
        {
            "parent": "y",
            "children": [
                {
                    "name": null
                },
                {
                    "name": null
                }
            ]
        },
        {
            "parent": "z",
            "children": [
                {
                    "name": null
                }
            ]
        }
    ],
    "page": 1,
    "size": 100,
    "total": 3,
    "pages": 1
}
@lu-pl
Contributor Author

lu-pl commented Oct 23, 2024

Type for model_bool callable arguments:

class ModelBoolPredicate(Protocol):
    def __call__(self, model: _TModelInstance) -> bool: ...
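
A self-contained sketch of how a plain function could satisfy this protocol follows. The definition of `_TModelInstance` is not given in the thread, so the unbound TypeVar here is an assumption; `has_name` is a hypothetical example predicate.

```python
from types import SimpleNamespace
from typing import Protocol, TypeVar

# Assumption: _TModelInstance is a TypeVar over model instances; the real
# definition in rdfproxy may differ (e.g. be bound to pydantic.BaseModel).
_TModelInstance = TypeVar("_TModelInstance")


class ModelBoolPredicate(Protocol):
    def __call__(self, model: _TModelInstance) -> bool: ...


def has_name(model) -> bool:
    """Example predicate: truthy only when the model has a non-empty name."""
    return bool(getattr(model, "name", None))


# Any callable with a matching signature conforms structurally.
predicate: ModelBoolPredicate = has_name
result = predicate(SimpleNamespace(name="foo"))
```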

@lu-pl lu-pl self-assigned this Oct 23, 2024
@lu-pl lu-pl added the enhancement New feature or request label Oct 23, 2024
@lu-pl
Contributor Author

lu-pl commented Oct 24, 2024

Example for model_bool with a str argument

A string value for model_bool defines the truthiness of the field denoted by that string value as general truth condition for the model.

So e.g. for the following Child definition

class Child(BaseModel):
    model_config = ConfigDict(model_bool="child")

    name: str | None = None
    child: str | None = None

the expected result would be:

{
    "items": [
        {
            "parent": "x",
            "children": [
                {
                    "name": "foo",
                    "child": "c"
                }
            ]
        },
        {
            "parent": "y",
            "children": [
                {
                    "name": null,
                    "child": "d"
                },
                {
                    "name": null,
                    "child": "e"
                }
            ]
        },
        {
            "parent": "z",
            "children": []
        }
    ],
    "page": 1,
    "size": 100,
    "total": 3,
    "pages": 1
}

The result row ('z' UNDEF UNDEF) produces an empty array for the children field, because the truth condition for Child is defined as the truthiness of its child field.

Likewise, a row ('z' UNDEF 'bar') would also return an empty array for the children field.

@lu-pl
Contributor Author

lu-pl commented Oct 24, 2024

Note that it would currently NOT be possible to achieve

{
    "items": [
        {
            "parent": "x",
            "children": [
                {
                    "name": "foo"
                }
            ]
        },
        {
            "parent": "y",
            "children": [
                {
                    "name": null
                },
                {
                    "name": null
                }
            ]
        },
        {
            "parent": "z",
            "children": []
        }
    ],
    "page": 1,
    "size": 100,
    "total": 3,
    "pages": 1
}

by excluding a model field from serialization like so:

from pydantic import Field

class Child(BaseModel):
    model_config = ConfigDict(model_bool="child")

    # excluded from serialization, but intended to drive truthiness
    child: str | None = Field(default=None, exclude=True)
    name: str | None = None

This is due to the kludgy implementation of the currently hard-coded model truthiness logic, which relies on serialization.
I consider this a bug, issue pending.

@lu-pl
Contributor Author

lu-pl commented Oct 24, 2024

This is due to the kludgy implementation of the currently hard-coded model truthiness logic, which relies on serialization.
I consider this a bug, issue pending.

This might actually be a very easy fix: instead of calling _model.model_dump().values() in the above-mentioned line 35, dict-casting the model should do the trick, since the serializer won't run in that case.

See #112 .

lu-pl added a commit that referenced this issue Oct 28, 2024
lu-pl added a commit that referenced this issue Nov 6, 2024
lu-pl added a commit that referenced this issue Nov 6, 2024
@lu-pl lu-pl added this to the Model truthiness milestone Nov 7, 2024
lu-pl added a commit that referenced this issue Nov 8, 2024
lu-pl added a commit that referenced this issue Nov 8, 2024
lu-pl added a commit that referenced this issue Nov 11, 2024
lu-pl added a commit that referenced this issue Nov 11, 2024
lu-pl added a commit that referenced this issue Nov 15, 2024
Model truthiness is an important metric for the rdfproxy grouping mechanism.
Currently, a model is considered truthy if at least one of its fields is truthy.

This is a sane default, yet certain frontend demands require different model truth conditions.
The feature provides the option for client code to specify conditions/predicates for determining model truthiness.

Closes #110, closes #112.
lu-pl added a commit that referenced this issue Nov 15, 2024
lu-pl added a commit that referenced this issue Nov 15, 2024
Model truthiness is an important metric for the rdfproxy grouping mechanism.
Currently, a model is considered truthy if at least one of its fields is truthy.

This is a sane default, yet certain frontend demands require different model truth conditions.
The feature introduces a model_bool option in the model_config that allows client code to specify conditions/predicates for determining model truthiness.

Closes #110, closes #112.
lu-pl added a commit that referenced this issue Nov 15, 2024
lu-pl added a commit that referenced this issue Nov 15, 2024
lu-pl added a commit that referenced this issue Nov 15, 2024
lu-pl added a commit that referenced this issue Nov 16, 2024
lu-pl added a commit that referenced this issue Nov 16, 2024
b1rger added a commit that referenced this issue Nov 19, 2024
lu-pl added a commit that referenced this issue Nov 19, 2024
lu-pl added a commit that referenced this issue Nov 19, 2024
lu-pl added a commit that referenced this issue Nov 20, 2024
lu-pl added a commit that referenced this issue Nov 20, 2024
lu-pl added a commit that referenced this issue Nov 25, 2024
lu-pl added a commit that referenced this issue Nov 25, 2024