Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deeply-nested filters don't work with Elasticsearch #5642

Closed
colinodell opened this issue Jun 28, 2023 · 1 comment
Closed

Deeply-nested filters don't work with Elasticsearch #5642

colinodell opened this issue Jun 28, 2023 · 1 comment

Comments

@colinodell
Copy link
Contributor

colinodell commented Jun 28, 2023

API Platform version(s) affected: 2.6.8 (and likely earlier) through 3.x

Description

MatchFilter and TermFilter do not properly support multi-level nested fields in Elasticsearch if the top level is a collection.

What Works

It seems to work fine if the top level is an object. For example, given this sample document from the docs linked above:

{
  "driver" : {
        "last_name" : "McQueen",
        "vehicle" : [
            {
                "make" : "Powell Motors",
                "model" : "Canyonero"
            },
            {
                "make" : "Miller-Meteor",
                "model" : "Ecto-1"
            }
        ]
    }
}

Asking API Platform to filter on driver.vehicle.model would generate a valid, working query like this:

{
    "nested": {
        "path": "driver.vehicle",
        "query": {
            "term": {"driver.vehicle.model": "Canyonero"}
        }
    }
}

Interestingly, the Elasticsearch docs suggest the query should instead be doubly-nested like this:

{
    "nested": {
        "path": "driver", // <-- instead of driver.vehicle
        "query": {
            "nested": { // <-- additional nested layer
                "path": "driver.vehicle",
                "query": {
                    "term": {"driver.vehicle.model": "Canyonero"}
                }
            }
        }
    }
}

But I'm assuming ES will happily accept either version. Regardless, everything described above seems to work just fine.

What's Broken

However, if we modify the structure to have an array of drivers as the top-level property like this:

{
    "race": "Piston Cup",
    "drivers": [
        {
            "last_name" : "McQueen",
            "vehicle" : [
                {
                    "make" : "Powell Motors",
                    "model" : "Canyonero"
                },
                {
                    "make" : "Miller-Meteor",
                    "model" : "Ecto-1"
                }
            ]
        }
    ]
}

Attempting to filter on drivers.vehicle.model generates this query:

{
    "nested": {
        "path": "drivers",
        "query": {
            "term": {"drivers.vehicle.model": "Canyonero"}
        }
    }
}

Which fails to match any documents because the path is missing the .vehicle part. Manually adding the .vehicle part causes the expected documents to appear in the results:

{
    "nested": {
        "path": "drivers.vehicle", // <-- this change seems to work
        "query": {
            "term": {"drivers.vehicle.model": "Canyonero"}
        }
    }
}

Using a proper deeply-nested query also works (but API Platform won't generate this):

{
    "nested": {
        "path": "drivers", // <-- instead of drivers.vehicle
        "query": {
            "nested": { // <-- additional nested layer
                "path": "drivers.vehicle",
                "query": {
                    "term": {"drivers.vehicle.model": "Canyonero"}
                }
            }
        }
    }
}

Likely Cause

I think the current behavior is caused by FieldDatatypeTrait::getNestedFieldPath() never recursively calling itself when the top-level type is a collection. It will never check the rest of the path to see if it might be a nested object.

How to reproduce

I don't have a reproducer for how the query fails to work in Elasticsearch, but I did tweak TermFilterTest::testApplyWithNestedProperty() to demonstrate the unexpected query:

    public function testApplyWithDeeplyNestedProperty(): void
    {
        $libraryType = new Type(Type::BUILTIN_TYPE_ARRAY, false, Library::class, true, new Type(Type::BUILTIN_TYPE_INT), new Type(Type::BUILTIN_TYPE_OBJECT, false, Library::class));
        $bookType = new Type(Type::BUILTIN_TYPE_ARRAY, false, Book::class, true, new Type(Type::BUILTIN_TYPE_INT), new Type(Type::BUILTIN_TYPE_OBJECT, false, Book::class));
        $messageType = new Type(Type::BUILTIN_TYPE_STRING);

        $propertyMetadataFactoryProphecy = $this->prophesize(PropertyMetadataFactoryInterface::class);
        $propertyMetadataFactoryProphecy->create(Library::class, 'libraries')->willReturn((new ApiProperty())->withBuiltinTypes([$libraryType]))->shouldBeCalled();
        $propertyMetadataFactoryProphecy->create(Library::class, 'book')->willReturn((new ApiProperty())->withBuiltinTypes([$bookType]))->shouldBeCalled();
        $propertyMetadataFactoryProphecy->create(Book::class, 'message')->willReturn((new ApiProperty())->withBuiltinTypes([$messageType]))->shouldBeCalled();

        $resourceClassResolverProphecy = $this->prophesize(ResourceClassResolverInterface::class);
        $resourceClassResolverProphecy->isResourceClass(Library::class)->willReturn(true)->shouldBeCalled();
        $resourceClassResolverProphecy->isResourceClass(Book::class)->willReturn(true)->shouldBeCalled();

        $nameConverterProphecy = $this->prophesize(NameConverterInterface::class);
        $nameConverterProphecy->normalize('libraries.book.message', Library::class, null, Argument::type('array'))->willReturn('libraries.book.message')->shouldBeCalled();
        $nameConverterProphecy->normalize('libraries.book', Library::class, null, Argument::type('array'))->willReturn('libraries.book')->shouldBeCalled();
        $nameConverterProphecy->normalize('libraries', Library::class, null, Argument::type('array'))->willReturn('libraries')->shouldBeCalled();

        $termFilter = new TermFilter(
            $this->prophesize(PropertyNameCollectionFactoryInterface::class)->reveal(),
            $propertyMetadataFactoryProphecy->reveal(),
            $resourceClassResolverProphecy->reveal(),
            $this->prophesize(IriConverterInterface::class)->reveal(),
            $this->prophesize(PropertyAccessorInterface::class)->reveal(),
            $nameConverterProphecy->reveal(),
            ['libraries.book.message' => null]
        );

        self::assertEquals(
            ['bool' => ['must' => [['nested' => ['path' => 'libraries', 'query' => ['nested' => ['path' => 'libraries.book', 'query' => ['term' => ['libraries.book.message' => 'Krupicka']]]]]]]]],
            $termFilter->apply([], Library::class, null, ['filters' => ['libraries.book.message' => 'Krupicka']])
        );
    }

Possible Solution

First, fix FieldDatatypeTrait::getNestedFieldPath() so it returns the complete parent path.

Once that's done, modify MatchFilter::getQuery() and TermFilter::getQuery() so that they return deeply nested queries as shown in the Elasticsearch documentation. This seems "more correct" than the current approach of only using one nested key regardless of how deeply nested the parameter actually is.

Additional Context

api-platform/api-platform#1375 describes an issue with filtering nested properties where getNestedFieldPath() is also a suspected culprit. My case seems different because the top-level item in my path is a collection and not an object - the execution path is very different.

@stale
Copy link

stale bot commented Aug 27, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 27, 2023
@soyuka soyuka added enhancement and removed stale labels Aug 28, 2023
soyuka pushed a commit that referenced this issue Sep 14, 2023
* chore(elasticsearch): define a valid, known type for 'baz'

* fix(elasticsearch): fix nested object paths (api-platform/api-platform#1375)

* fix(elasticsearch): support additional nesting within collections (#5642)

---------

Co-authored-by: Colin O'Dell <[email protected]>
@soyuka soyuka closed this as completed Oct 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants