Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Searching with multiple STI models broken since 4.0 #1259

Closed
cars10 opened this issue May 3, 2019 · 7 comments
Closed

Searching with multiple STI models broken since 4.0 #1259

cars10 opened this issue May 3, 2019 · 7 comments

Comments

@cars10
Copy link

cars10 commented May 3, 2019

Hi,

consider having 3 STI models

class Animal < ApplicationRecord
  searchkick inheritance: true
end

class Dog < Animal; end
class Cat < Animal; end
class Fish < Animal; end

And searching like this:

Searchkick.search '*', models: [Dog, Cat]

This will not return any objects of type Dog. The problem seems to be that you changed how the index / @index_mapping (new name) variable is calculated, see 7ce0440#diff-9e535cec87dc0b7137c4f81bdabdafd7R60

On searchkick 3.x it was an array so you could have duplicates of the same index in there. In our example all three sti classes use the same index (animals). With the change in searchkick 4 you changed that to be a hash, but because each key has to be unique we cannot have the same behaviour as before. We will end up with only Cat getting searched because it overrides the previously set key for Dog in @index_mapping.

Edit:
The hash will look something like this in the first step:

@index_mapping # => {'animals' => Dog}

And in the next iteration:

@index_mapping # => {'animals' => Cat}

What we actually want is that both models are searched.

@cars10 cars10 changed the title Searching with mutliple STI models broken since 4.0 Searching with multiple STI models broken since 4.0 May 3, 2019
@ankane
Copy link
Owner

ankane commented May 3, 2019

Hey @cars10, all of the inheritance tests are passing. Please use the gist in the Contributing Guide to reproduce.

@cars10
Copy link
Author

cars10 commented May 3, 2019

Sure.

require "bundler/inline"

gemfile do
  source "https://rubygems.org"

  gem "activerecord"
  gem "sqlite3"
  gem "searchkick", git: "https://github.com/ankane/searchkick.git"
end

require "active_record"

puts "Searchkick version: #{Searchkick::VERSION}"
puts "Elasticsearch version: #{Searchkick.server_version}"

ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"

ActiveRecord::Migration.create_table :animals do |t|
  t.string :name
  t.string :type
end

class Animal < ActiveRecord::Base
    searchkick inheritance: true
end
  
class Dog < Animal; end
class Cat < Animal; end
class Fish < Animal; end

Animal.reindex
Dog.create!(name: "Dog1")
Cat.create!(name: "Cat1")

Animal.search_index.refresh
search = Searchkick.search("*", fields: [:name], models: [Dog, Cat], execute: false)
response = search.execute

# this only returns 1 mapping, but we are searching for 2 models. See description in my issue
pp response.options[:index_mapping] 

# this only returns 1 cat, not 1 cat and 1 dog.
pp response.results

Edit:
Just for completion, the same thing worked with 3.1.3.

require "bundler/inline"

gemfile do
  source "https://rubygems.org"

  gem "activerecord"
  gem "sqlite3"
  gem "searchkick", git: "https://github.com/ankane/searchkick.git", tag: "v3.1.3"
end

require "active_record"

puts "Searchkick version: #{Searchkick::VERSION}"
puts "Elasticsearch version: #{Searchkick.server_version}"

ActiveRecord::Base.establish_connection adapter: "sqlite3", database: ":memory:"

ActiveRecord::Migration.create_table :animals do |t|
  t.string :name
  t.string :type
end

class Animal < ActiveRecord::Base
    searchkick inheritance: true
end
  
class Dog < Animal; end
class Cat < Animal; end
class Fish < Animal; end

Animal.reindex
Dog.create!(name: "Dog1")
Cat.create!(name: "Cat1")

Animal.search_index.refresh
search = Searchkick.search("*", fields: [:name], index_name: [Dog, Cat], execute: false)
response = search.execute

# this returns 1 cat and 1 dog - on searchkick 3.1.3
pp response.results

@ankane
Copy link
Owner

ankane commented May 3, 2019

Thanks @cars10, will check it out when I have a chance

@ankane
Copy link
Owner

ankane commented May 3, 2019

Hey @cars10, after an initial look, the examples above aren't the documented way to search inherited models. In your 2nd example (with Searchkick 3.1.3), add a fish record and you'll see it returns 3 results. You want to use the type option that's documented in the inheritance section of the readme.

Animal.search("*", type: [Cat, Dog])

There may be some magic we can do to make the models option take inheritance into account.

@ankane
Copy link
Owner

ankane commented May 3, 2019

Alright, the models option should behave much more intuitively on master (commits above + b567a65). The approach documented in the inheritance section is still preferred when possible as it prevents extra database calls.

Thanks for the detailed report!

@ankane ankane closed this as completed May 3, 2019
@cars10
Copy link
Author

cars10 commented May 6, 2019

Thanks for the fix. What would be the recommended way to search for multiple STI models and a different model together? for example

Searchkick.search("*", models: [Dog, Cat, Person])

I think i cannot use the type option here that you recommended above.

@ankane
Copy link
Owner

ankane commented May 7, 2019

I think this situation is pretty rare (searching two child models + a different one), but depending on your case, you could use multi-search instead (if results don't need to be scored together) or add Searchkick to each of the child models instead of the parent.

Ideally, we could replace the type option with models to make the interface simpler. We basically need an OR query that looks like:

[{_index: "animal", type: "dog"}, {_index: "animal", type: "cat"}, {_index: "person"}]

However, _index does not work with aliases right now (the indices query removed in Elasticsearch 6 did support them). We could query the aliases to get the indexes, but this would add latency and more importantly, introduce a race condition where the query could return no results for a split second during a reindex process.

Hopefully support for aliases will be added to the _index field in the future. Here's the official issue for it: elastic/elasticsearch#23306

@lock lock bot locked as resolved and limited conversation to collaborators Jun 6, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants