Filtering resources
Many use cases only need a specific subset of a resource. For example, when querying the `animals` endpoint in the registration API, you may be interested in only the alive animals, the ones that were on the farm in a specific period, or maybe the ones that require attention.
Some filters are quite common, others are very use-case specific. Selecting on a date range is very common. Not all filters are equally easy to implement. For example, the 'requires attention' filter may be easy to implement for servers that work closely with a farmer and already have task lists and similar features, but much less so for basic registration systems.
Providing proper filters on an endpoint may reduce the load on the server (since it can deliver less data), but only if that filter is easy to add or readily available for that server. A client can always filter out the animals it needs for a specific use case, provided that it has the information needed to do so. As such, within ICAR ADE, filters are not compulsory. Instead, the standard focuses on creating common names for possible filters. However, a client cannot depend on the data source implementing a given filter, so it should expect that more data may be delivered than the filter parameters suggest. Common filters, specifically the ones that filter on required data in the message, should be easy to implement and as such are recommended for any data source.
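Because a server may ignore a filter it does not implement, a client can re-apply the filter locally as a safety net. The following is a minimal sketch of that idea; the function name `filter_locally` and the sample records are made up for illustration and are not part of the standard.

```python
from typing import Any, Iterable


def filter_locally(
    members: Iterable[dict[str, Any]], field: str, allowed: set[str]
) -> list[dict[str, Any]]:
    """Keep only the members whose `field` value is one of `allowed`."""
    return [m for m in members if m.get(field) in allowed]


# Hypothetical members returned for a request with gender=Female by a
# server that does not implement the gender filter.
response_members = [
    {"name": "Bella", "gender": "Female"},
    {"name": "Bruno", "gender": "Male"},
]

females = filter_locally(response_members, "gender", {"Female"})
```

If the server did apply the filter, the local pass simply keeps everything, so the client code stays the same in both cases.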
(As a side note: discovery of which filters are available for a data source is a topic we are investigating.)
The ADE standard thus states:
- a server MAY implement any filter it deems relevant for an endpoint
- a server SHOULD implement RECOMMENDED filters
- whenever a server implements a filter, it MUST use the naming conventions as provided by the standard
When no naming convention exists for a filter, a server can choose its own name, keeping the back- and forward compatibility rules in mind:
Consider if your filter is potentially standardisable. If other regions or vendors may require something similar, define it in such a way that it may become part of the standard at one time. By doing this, no technical changes may be required by the time a new version comes out which includes this filter. If your filter is not potentially standardisable, then make sure that the name you choose does not potentially conflict with future fields. E.g., use a prefix for your region or company name. (See also Backward and forward compatibility ).
In many cases, a filter parameter relates to a specific field in the message. In those cases, simply name the filter the same as the field name. For example, using the `animals` endpoint again: expected filters can be `gender=Female` or `specie=Buffalo`. This would filter all animals based on those constraints. Note that these act as an AND operation: if both the `gender` and `specie` filters are specified, only female buffaloes will be returned. If the same field is specified more than once, e.g. `specie=Cow` and `specie=Buffalo`, this is interpreted as an OR (for that field). The field name is assumed to be located directly beneath the `member` field (the wrapper object found in most messages). If the field is nested deeper, the parent fields should be included in the filter, concatenated with a "-". E.g., for the `treatment-programs` endpoint, you could specify a filter `diagnoses-name`.
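The AND/OR semantics and the "-" path convention described above can be sketched as a small matcher. This is an illustration only: the helper names `lookup` and `matches` are hypothetical, the sample records are made up, and treating a list-valued parent (here `diagnoses`) as "match if any element matches" is an assumption rather than something the text prescribes.

```python
from typing import Any
from urllib.parse import parse_qsl


def lookup(member: dict[str, Any], name: str) -> list[Any]:
    """Resolve a filter name such as 'diagnoses-name' to the values it points at.

    Each '-' descends one level below the member object; lists are traversed
    element by element (an assumption made for this sketch).
    """
    values: list[Any] = [member]
    for part in name.split("-"):
        found: list[Any] = []
        for value in values:
            candidates = value if isinstance(value, list) else [value]
            for candidate in candidates:
                if isinstance(candidate, dict) and part in candidate:
                    found.append(candidate[part])
        values = found
    flat: list[Any] = []
    for value in values:
        flat.extend(value if isinstance(value, list) else [value])
    return flat


def matches(member: dict[str, Any], params: list[tuple[str, str]]) -> bool:
    """AND across different filter names, OR across repeated values of one name."""
    wanted: dict[str, set[str]] = {}
    for name, value in params:
        wanted.setdefault(name, set()).add(value)
    return all(
        any(str(v) in accepted for v in lookup(member, name))
        for name, accepted in wanted.items()
    )


animals = [
    {"gender": "Female", "specie": "Buffalo"},
    {"gender": "Male", "specie": "Buffalo"},
    {"gender": "Female", "specie": "Cow"},
    {"gender": "Female", "specie": "Goat"},
]
params = parse_qsl("gender=Female&specie=Cow&specie=Buffalo")
selected = [a for a in animals if matches(a, params)]

program = {"diagnoses": [{"name": "Mastitis"}, {"name": "Lameness"}]}
has_mastitis = matches(program, [("diagnoses-name", "Mastitis")])
```

Here `selected` keeps the female buffalo and the female cow. Note that this sketch only covers equality filters; suffixes for ranges need separate handling because they also use "-".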
In other cases, you may need to be able to filter on a range, for example all animals born in a specific period. Here, we can again use the field name (`birthDate`) and add a suffix indicating the beginning and/or end of the period. `birthDate-from=2020-01-01` and `birthDate-to=2020-02-01` should give you all animals born in January 2020. Note that the `from` is inclusive and the `to` is exclusive (another way to look at this is to expand the birthDate to a timestamp, filling out the time part with `00:00:00`).
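The inclusive `from` and exclusive `to` bounds amount to a half-open interval. A minimal sketch, assuming plain calendar dates and a hypothetical helper name `in_range`:

```python
from datetime import date
from typing import Optional


def in_range(
    value: date, from_: Optional[date] = None, to: Optional[date] = None
) -> bool:
    """Half-open check: from_ <= value < to. A missing bound is not enforced."""
    if from_ is not None and value < from_:
        return False
    if to is not None and value >= to:
        return False
    return True


# birthDate-from=2020-01-01 and birthDate-to=2020-02-01
start, end = date(2020, 1, 1), date(2020, 2, 1)
born_in_january = [
    d
    for d in (date(2019, 12, 31), date(2020, 1, 1), date(2020, 1, 31), date(2020, 2, 1))
    if in_range(d, start, end)
]
```

Using a half-open interval means consecutive periods (January, February, and so on) can be queried without overlap or gaps.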
Fields like the animal id are composite fields: they are composed of an id and a scheme, which allows us to support different regions and countries: `"animal": { "id": "NL 877034232", "scheme": "nl-v1" }`. To create a filter for this, we simply concatenate the subfields and create two parameters: `animal-id="NL 877034232"` and `animal-scheme="nl-v1"`. As both parameters work as an AND operation (see above), this will select precisely that animal within the Dutch numbering scheme. It is not recommended to allow just one of these parameters to be used without the other: while selecting all animals with any Dutch id is fine (although a very unlikely use case), selecting animals only on their id is ambiguous.
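As an illustration, a client could build the two parameters together and a server could refuse a request that supplies only one of them. This is a sketch under assumptions: rejecting a lone parameter with an error is one possible reading of "not recommended", and the helper name `matches_animal` is hypothetical.

```python
from typing import Any
from urllib.parse import urlencode

# Client side: both halves of the composite identifier travel together.
query = urlencode({"animal-id": "NL 877034232", "animal-scheme": "nl-v1"})


def matches_animal(member: dict[str, Any], params: dict[str, str]) -> bool:
    """Match a member on its composite animal identifier (id AND scheme)."""
    has_id = "animal-id" in params
    has_scheme = "animal-scheme" in params
    if has_id != has_scheme:
        # Design choice for this sketch: an id without a scheme (or vice
        # versa) is treated as an invalid request.
        raise ValueError("animal-id and animal-scheme must be used together")
    if not has_id:
        return True  # no animal filter requested
    animal = member.get("animal", {})
    return (
        animal.get("id") == params["animal-id"]
        and animal.get("scheme") == params["animal-scheme"]
    )


record = {"animal": {"id": "NL 877034232", "scheme": "nl-v1"}}
is_match = matches_animal(
    record, {"animal-id": "NL 877034232", "animal-scheme": "nl-v1"}
)
```

A server might equally choose to accept a lone `animal-scheme`; the sketch takes the stricter option for simplicity.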
The same applies to fields with units. For example, a milking visit duration is specified as: `"milkingVisitDuration": { "value": 349, "unitCode": "SEC" }`. A range filter for this would be: `milkingVisitDuration-value-from=60 & milkingVisitDuration-unitCode-from=SEC` to select all visits equal to or longer than a minute. The from/to suffix is appended to both the value and the unitCode field to keep things generic.
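A range check on a value-with-unit field might look like the sketch below, using the parameter names listed in the filter table further down. Several points are assumptions made for illustration: the helper name `duration_in_range`, the choice to only compare values whose `unitCode` equals the one in the filter (no unit conversion), and the exclusive upper bound, which mirrors the date range convention.

```python
from typing import Any


def duration_in_range(
    duration: dict[str, Any],
    params: dict[str, str],
    field: str = "milkingVisitDuration",
) -> bool:
    """Check a {'value', 'unitCode'} object against -from / -to parameters."""
    value, unit = duration["value"], duration["unitCode"]

    lower = params.get(f"{field}-value-from")
    if lower is not None:
        # Inclusive lower bound; the unit must match (no conversion here).
        if unit != params.get(f"{field}-unitCode-from") or value < float(lower):
            return False

    upper = params.get(f"{field}-value-to")
    if upper is not None:
        # Exclusive upper bound (assumed, by analogy with date ranges).
        if unit != params.get(f"{field}-unitCode-to") or value >= float(upper):
            return False

    return True


at_least_a_minute = {
    "milkingVisitDuration-value-from": "60",
    "milkingVisitDuration-unitCode-from": "SEC",
}
long_visit = duration_in_range({"value": 349, "unitCode": "SEC"}, at_least_a_minute)
short_visit = duration_in_range({"value": 45, "unitCode": "SEC"}, at_least_a_minute)
```

A server that stores durations in mixed units would need a conversion step before comparing; that is left out of this sketch.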
In some cases, there is no direct field which can be used to implement a filter. For example, which animals are present at a location at a specific point in time is derived data: to know this, you need to process all arrivals and departures. If the server has implemented this using events, then even for the server it may not be trivial to answer. If the server does want to support a filter for such a use case, the ADE standard will provide a recommended name for it.
A client may need to synchronise its state with a server. For that it can be necessary to track changes from a specific source from a specific point in time. As such, filters like 'meta-modified-from' and 'meta-source' are RECOMMENDED.
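As an illustration of such a synchronisation loop, a client could remember the latest modification time it has seen and pass it as `meta-modified-from` on the next request. The sketch below is hedged: the helper names `sync_query` and `next_watermark`, the source value `example-source`, and the assumption that each member carries an ISO 8601 timestamp at `meta.modified` are made up for this example.

```python
from datetime import datetime, timezone
from typing import Any
from urllib.parse import urlencode


def sync_query(last_sync: datetime, source: str) -> str:
    """Build the query string for fetching changes since `last_sync`."""
    return urlencode(
        {"meta-modified-from": last_sync.isoformat(), "meta-source": source}
    )


def next_watermark(members: list[dict[str, Any]], previous: datetime) -> datetime:
    """Return the newest modification time seen so far."""
    stamps = [
        datetime.fromisoformat(m["meta"]["modified"])
        for m in members
        if "modified" in m.get("meta", {})
    ]
    return max(stamps + [previous])


last_sync = datetime(2020, 1, 1, tzinfo=timezone.utc)
query = sync_query(last_sync, "example-source")

# Hypothetical members returned by the server for that query.
changed = [
    {"meta": {"modified": "2020-01-03T08:30:00+00:00"}},
    {"meta": {"modified": "2020-01-02T12:00:00+00:00"}},
]
last_sync = next_watermark(changed, last_sync)
```

If the lower bound is inclusive, as it is for date ranges, records modified exactly at the watermark will be delivered again on the next run, so the client should tolerate duplicates.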
There may be use cases for fuzzy matching on e.g. descriptions or names (e.g., "all animal ids that start with an 'N'"). We could consider adding a suffix for these use cases (e.g. animal.name.match). At this point, we do not have enough concrete use cases to make a recommendation for this. Following our design principles, we leave fuzzy matching out of the specification until we have a need for it.
We expect the filter list to grow based on use cases. As such, to be able to track changes and have a proper review process, we should keep it in a markdown file under version control in git. For now, I've collected some of the proposed filters here:
| Resource | Filter | Description |
|---|---|---|
| meta | meta-source | select a specific source |
| | meta-modified-from & meta-modified-to | select a range for the modified timestamp |
| | meta-created-from & meta-created-to | select a range for the created timestamp |
| | meta-creator | select the creator of the record |
| | meta-validFrom & meta-validTo | select a range in which the record is valid |
| deprecated | start-date-time | prefer to use meta-modified-from |
| | end-date-time | prefer to use meta-modified-to |
| common | animal-id + animal-scheme | select a specific animal |
| | location-id + location-scheme | select a specific location |
| milking-visits | milkingStartingDateTime-from | range filter based on the milking start time |
| | milkingStartingDateTime-to | range filter based on the milking start time |
| | milkingVisitDuration-value-from & milkingVisitDuration-unitCode-from | range filter based on the milking visit duration |
| | milkingVisitDuration-value-to & milkingVisitDuration-unitCode-to | range filter based on the milking visit duration |
| | milkingDuration-value-from & milkingDuration-unitCode-from | range filter based on the milking duration |
| | milkingDuration-value-to & milkingDuration-unitCode-to | range filter based on the milking duration |
| | milkingType | filters on milking type |
| | ... | |
| | quarterMilkings-icarQuarterId | filters on milking visits that have a recording for a specific quarter |
In our collections definition, we allow for pagination based on JSON-LD. Typically, the pagination is driven by the server: previous/next pages are determined by URIs as provided in the `view` object in the message. Optionally, a client could steer the pagination by providing query parameters. Typically, these query parameters are named similarly to the fields in the `view` object (e.g. `page=2`). We do not anticipate a naming clash for these query parameters, but we will have to be careful if we introduce fields in the actual payload that are similar to those in the view object.
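Server-driven pagination can be consumed by following the URIs in the `view` object until none is left. The sketch below assumes the next-page URI is found at `view.next` and the records at `member`; those key names, the helper `iter_members`, and the example URLs are assumptions for illustration. The HTTP call is passed in as a `fetch` function so the sketch stays independent of any particular HTTP library.

```python
from typing import Any, Callable, Iterator, Optional


def iter_members(
    first_url: str, fetch: Callable[[str], dict[str, Any]]
) -> Iterator[dict[str, Any]]:
    """Yield every member across all pages, following view.next links."""
    url: Optional[str] = first_url
    while url:
        page = fetch(url)
        yield from page.get("member", [])
        # Stop when the server no longer advertises a next page.
        url = page.get("view", {}).get("next")


# Stand-in for a real HTTP client: two hypothetical pages keyed by URL.
fake_pages = {
    "https://example.org/animals?gender=Female": {
        "member": [{"name": "Bella"}],
        "view": {"next": "https://example.org/animals?gender=Female&page=2"},
    },
    "https://example.org/animals?gender=Female&page=2": {
        "member": [{"name": "Daisy"}],
        "view": {},
    },
}

names = [
    m["name"]
    for m in iter_members("https://example.org/animals?gender=Female", fake_pages.__getitem__)
]
```

Following the links as given, rather than constructing page URLs on the client, leaves the server free to carry the filter parameters along in whatever form it prefers.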
We also discussed the OData standard. This was rejected since it feels like a heavy burden on the server-side implementation. Also, since it allows for complex queries, it may put a heavy load on the server, which is hard to manage through syntax alone. Simply using OData syntax in a limited way felt misleading, since we would not support the full standard.
An alternative to these filters would be to allow querying by example. This offers a powerful and relatively easy way to search. We may consider defining this functionality at a later stage on a separate endpoint (e.g. `../search`). This way, interested parties can simply publish this endpoint while others can stick with the current proposal.
Note that parties are still free to add OData or search support, as long as it fits the guidelines as set out at Backward and forward compatibility.
For more on the in-depth discussion, see the original ticket issue 130.