Added rrf method
ankane committed Sep 3, 2024
1 parent 8fde24e commit 17e6e6d
Showing 6 changed files with 46 additions and 4 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -1,6 +1,7 @@
## 5.4.0 (unreleased)

- Added experimental `knn` option
- Added experimental `rrf` method
- Added experimental support for `_raw` to `where` option
- Added warning for `exists` with non-`true` values
- Added warning for full reindex and `:queue` mode
8 changes: 7 additions & 1 deletion README.md
@@ -1875,7 +1875,13 @@ semantic_search = Product.search(knn: {field: :embedding, vector: [1, 2, 3]}, li
Searchkick.multi_search([keyword_search, semantic_search])
```
To combine the results, use a reranking model
To combine the results, use Reciprocal Rank Fusion (RRF)
```ruby
Searchkick::Reranking.rrf(keyword_search, semantic_search)
```
Or a reranking model
```ruby
rerank = Informers.pipeline("reranking", "mixedbread-ai/mxbai-rerank-xsmall-v1")
5 changes: 4 additions & 1 deletion examples/hybrid.rb
@@ -45,7 +45,10 @@ class Document < ActiveRecord::Base

Searchkick.multi_search([keyword_search, semantic_search])

# to combine the results, use a reranking model
# to combine the results, use Reciprocal Rank Fusion (RRF)
p Searchkick::Reranking.rrf(keyword_search, semantic_search).map { |v| v[:result].content }

# or a reranking model
rerank = Informers.pipeline("reranking", "mixedbread-ai/mxbai-rerank-xsmall-v1")
results = (keyword_search.to_a + semantic_search.to_a).uniq
p rerank.(query, results.map(&:content), top_k: 5).map { |v| results[v[:doc_id]] }.map(&:content)
1 change: 1 addition & 0 deletions lib/searchkick.rb
@@ -25,6 +25,7 @@
require_relative "searchkick/record_indexer"
require_relative "searchkick/relation"
require_relative "searchkick/relation_indexer"
require_relative "searchkick/reranking"
require_relative "searchkick/results"
require_relative "searchkick/raw"
require_relative "searchkick/version"
28 changes: 28 additions & 0 deletions lib/searchkick/reranking.rb
@@ -0,0 +1,28 @@
module Searchkick
  module Reranking
    def self.rrf(first_ranking, *rankings, k: 60)
      rankings.unshift(first_ranking)
      rankings.map!(&:to_ary)

      ranks = []
      results = []
      rankings.each do |ranking|
        ranks << ranking.map.with_index.to_h { |v, i| [v, i + 1] }
        results.concat(ranking)
      end

      results =
        results.uniq.map do |result|
          score =
            ranks.sum do |rank|
              r = rank[result]
              r ? 1.0 / (k + r) : 0.0
            end

          {result: result, score: score}
        end

      results.sort_by { |v| -v[:score] }
    end
  end
end
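
The method above can be tried without Elasticsearch or OpenSearch. The following is a minimal standalone sketch of the same scoring rule, assuming plain arrays of strings in place of Searchkick results; `rrf_sketch` and the sample data are hypothetical names used only for illustration and are not part of the gem.

```ruby
# Standalone sketch of Reciprocal Rank Fusion on plain arrays.
# `rrf_sketch` is a hypothetical helper, not part of Searchkick.
def rrf_sketch(*rankings, k: 60)
  # For each ranking, map every item to its 1-based position.
  ranks = rankings.map do |ranking|
    ranking.each_with_index.map { |v, i| [v, i + 1] }.to_h
  end

  # Each item scores 1 / (k + rank) per ranking it appears in, and 0 otherwise.
  scored = rankings.flatten(1).uniq.map do |item|
    score = ranks.sum { |rank| rank[item] ? 1.0 / (k + rank[item]) : 0.0 }
    {result: item, score: score}
  end

  # Highest fused score first.
  scored.sort_by { |v| -v[:score] }
end

# Hypothetical sample rankings.
keyword  = ["bear"]
semantic = ["bear", "dog", "cat"]

fused = rrf_sketch(keyword, semantic)
fused.each { |v| puts format("%-5s %.5f", v[:result], v[:score]) }
```

Like the method in the commit, the sketch returns an array of hashes with `:result` and `:score` keys, so items that rank well in several lists rise to the top even if no single list puts them first.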
7 changes: 5 additions & 2 deletions test/hybrid_test.rb
@@ -24,8 +24,11 @@ def test_multi_search
    semantic_search = Product.search(knn: {field: :embedding, vector: [1, 2, 3]})
    Searchkick.multi_search([keyword_search, semantic_search])

    results = Searchkick::Reranking.rrf(keyword_search, semantic_search)
    expected = ["The bear is growling", "The dog is barking", "The cat is purring"]
    assert_equal expected.first(1), keyword_search.map(&:name)
    assert_equal expected, semantic_search.map(&:name)
    assert_equal expected, results.map { |v| v[:result].name }
    assert_in_delta 0.03279, results[0][:score]
    assert_in_delta 0.01612, results[1][:score]
    assert_in_delta 0.01587, results[2][:score]
  end
end
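
The expected scores in the test follow from the `1.0 / (k + r)` rule with the default `k: 60`. The short check below reproduces them, assuming (as the `expected.first(1)` assertion suggests) that the keyword search returns only the first document and the semantic search returns all three in order. The variable names are illustrative only.

```ruby
k = 60 # default value of the k: keyword in the rrf signature

# "The bear is growling" is rank 1 in both searches.
bear = 1.0 / (k + 1) + 1.0 / (k + 1)
# "The dog is barking" appears only in the semantic search, at rank 2.
dog = 1.0 / (k + 2)
# "The cat is purring" appears only in the semantic search, at rank 3.
cat = 1.0 / (k + 3)

scores = {bear: bear, dog: dog, cat: cat}
scores.each { |name, s| puts format("%-4s %.5f", name, s) }
```

The second value comes out slightly above the `0.01612` literal in the test, which still passes because Minitest's `assert_in_delta` uses a default delta of `0.001`.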
end
