-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug: Thresholding by distance doesn't work #502
Comments
We tried seeing if we'd get incorrect results when doing this brute force, like below. BEGIN;
DROP INDEX segment_embedding_idx;
WITH distances AS (
SELECT
segment.id, segment.category,
embedding <=> '[0.048492431640625,0.004207611083984375,0.020538330078125,-0.06622314453125,-0.009918212890625,0.0819091796875,-0.0247344970703125,-0.01090240478515625,0.0679931640625,-0.040924072265625,-0.0034122467041015625,0.072021484375,0.0008335113525390625,-0.006832122802734375,0.00824737548828125,-0.03314208984375,0.025909423828125,0.04852294921875,-0.0171661376953125,0.0048370361328125,-0.0230712890625,-0.01190185546875,-0.0043792724609375,0.01007080078125,-0.052276611328125,0.0180816650390625,-0.00858306884765625,0.0078582763671875,-0.050811767578125,0.007328033447265625,0.0712890625,0.033416748046875,-0.03240966796875,-0.0029735565185546875,-0.0399169921875,-0.07537841796875,-0.0631103515625,-0.02191162109375,-0.0292510986328125,0.063232421875,0.093017578125,0.022674560546875,-0.0186920166015625,0.0041046142578125,-0.0300140380859375,-0.0941162109375,0.038970947265625,0.036651611328125,-0.0223388671875,0.002079010009765625,0.00818634033203125,-0.0187225341796875,0.031982421875,-0.01050567626953125,-0.00460052490234375,-0.0517578125,-0.0269317626953125,-0.038055419921875,0.007781982421875,0.038482666015625,-0.05145263671875,0.00878143310546875,0.021575927734375,0.053375244140625,0.0005512237548828125,0.03448486328125,-0.01157379150390625,0.0237274169921875,-0.05712890625,0.099365234375,-0.0019664764404296875,0.056243896484375,-0.00019276142120361328,-0.027069091796875,0.1966552734375,-0.01096343994140625,0.00528717041015625,-0.0244140625,-0.0211944580078125,0.0116424560546875,-0.053863525390625,-0.042266845703125,-0.0241546630859375,0.016571044921875,0.01535797119140625,0.0203704833984375,-0.036376953125,0.029815673828125,-0.048858642578125,-0.027984619140625,0.0044097900390625,0.049072265625,-0.06146240234375,0.002292633056640625,-0.01023101806640625,-0.0404052734375,0.03216552734375,-0.0093536376953125,-0.01009368896484375,0.03692626953125,-0.0257568359375,0.0193939208984375,0.004894256591796875,0.005229949951171875,-0.007106781005859375,0.06427001953125,0.007354736328125,0.050048828125,0.05267333984375,0.044952392578125,-0.0908203125,-0.02679443359375,0.01000213623046875,0.0247955322265625,0.0106658935546875,-0.0261688232421875,0.0207672119140625,0.045806884765625,0.035491943359375,-0.014556884765625,0.0421142578125,-0.045654296875,-0.048065185546875,0.0234832763671875,-0.14111328125,-0.004230499267578125,0.024566650390625,-0.0244293212890625,0.00693511962890625,0.01145172119140625,0.040283203125,-0.06060791015625,0.00921630859375,0.029083251953125,-0.09417724609375,0.028656005859375,-0.00201416015625,0.09478759765625,-0.01023101806640625,0.0249786376953125,0.018310546875,-0.02783203125,-0.022979736328125,-0.032073974609375,-0.0009412765502929688,0.00971221923828125,0.0071563720703125,-0.06329345703125,-0.01146697998046875,0.0149078369140625,0.0007505416870117188,0.01476287841796875,-0.0672607421875,-0.05224609375,-0.004695892333984375,0.006256103515625,0.00579833984375,0.0203704833984375,-0.01812744140625,-0.003818511962890625,0.007167816162109375,0.041107177734375,-0.01318359375,0.01480865478515625,-0.05267333984375,-0.0017557144165039062,0.011199951171875,-0.0153656005859375,0.0550537109375,0.03387451171875,-0.0196990966796875,0.033782958984375,-0.0158233642578125,0.0,0.037567138671875,0.0020351409912109375,-0.0206451416015625,-0.0242767333984375,0.037933349609375,0.03271484375,-0.035003662109375,-0.0303802490234375,-0.0282440185546875,0.0411376953125,0.023040771484375,0.183349609375,0.034759521484375,-0.01788330078125,0.07275390625,-0.025360107421875,0.0416259765625,0.037567138671875,-0.04315185546875,-0.06610107421875,0.001766204833984375,0.06866455078125,0.0164642333984375,-0.014373779296875,-0.033905029296875,-0.01079559326171875,-0.03521728515625,0.07550048828125,-0.01229095458984375,0.0278778076171875,0.039764404296875,-0.046722412109375,-0.0186309814453125,0.033660888671875,0.00748443603515625,0.0007295608520507812,-0.0947265625,0.04132080078125,-0.0140380859375,0.01447296142578125,-0.051544189453125,-0.055419921875,0.02691650390625,0.0095062255859375,-0.01568603515625,0.004299163818359375,-0.041656494140625,-0.0280303955078125,0.07684326171875,0.00970458984375,0.04351806640625,-0.078125,-0.0384521484375,0.0222930908203125,-0.003314971923828125,-0.055999755859375,0.01373291015625,0.01485443115234375,-0.04168701171875,0.04571533203125,-0.04693603515625,-0.0166473388671875,-0.072998046875,-0.0240478515625,0.002040863037109375,0.0220794677734375,-0.0009369850158691406,-0.027069091796875,0.04248046875,-0.0750732421875,-0.0960693359375,-0.03765869140625,-0.0173187255859375,-0.0325927734375,0.050750732421875,-0.052520751953125,-0.038818359375,-0.177001953125,-0.0188751220703125,0.0130462646484375,0.04058837890625,0.042236328125,-0.03338623046875,-0.031982421875,0.04229736328125,-0.05474853515625,0.0287017822265625,0.007640838623046875,-0.033447265625,0.0033664703369140625,-0.055694580078125,-0.0714111328125,-0.036346435546875,0.00142669677734375,-0.03070068359375,0.0299530029296875,-0.03369140625,-0.00966644287109375,0.005718231201171875,0.0187225341796875,-0.0213775634765625,-0.0185546875,-0.04180908203125,-0.0450439453125,-0.0229034423828125,0.04693603515625,0.036163330078125,0.061004638671875,-0.0672607421875,0.0009226799011230469,-0.046600341796875,-0.0242767333984375,-0.058624267578125,-0.1256103515625,0.023223876953125,0.0295867919921875,-0.046234130859375,0.0163421630859375,0.050384521484375,-0.025115966796875,0.037994384765625,0.027984619140625,0.0261383056640625,-0.0117645263671875,-0.004352569580078125,0.00960540771484375,0.07293701171875,0.062225341796875,-0.01007080078125,-0.02685546875,-0.0005059242248535156,-0.0122833251953125,-0.024749755859375,0.036529541015625,-0.03192138671875,0.015869140625,0.0777587890625,0.0109405517578125,-0.050018310546875,0.06787109375,0.08477783203125,-0.035736083984375,0.01482391357421875,0.05218505859375,-0.004047393798828125,-0.0165252685546875,0.1019287109375,-0.04510498046875,0.022186279296875,0.005382537841796875,0.004451751708984375,0.01129913330078125,-0.048980712890625,0.047027587890625,-0.0146026611328125,-0.012939453125,-0.0210418701171875,-0.0308990478515625,-0.057220458984375,0.0005764961242675781,0.01168060302734375,-0.004848480224609375,0.01641845703125,-0.0226898193359375,0.0187835693359375,0.1494140625,0.055328369140625,-0.0178985595703125,-0.0169219970703125,-0.043792724609375,-0.0238800048828125,-0.013580322265625,0.0104522705078125,0.0526123046875,-0.033050537109375,0.00775909423828125,-0.025238037109375,0.00724029541015625,-0.025909423828125,-0.00861358642578125,0.040679931640625,0.1619873046875,0.0263519287109375,-0.0229034423828125,0.0101165771484375,-0.041107177734375,-0.0301361083984375,0.006916046142578125,-0.0175323486328125,0.01502227783203125,-0.0161895751953125,-0.01323699951171875,-0.004383087158203125,-0.0087432861328125,-0.0511474609375,0.00356292724609375,0.011383056640625,-0.0548095703125,-0.039947509765625,0.045318603515625,-0.01392364501953125,-0.01788330078125,0.025634765625,0.00868988037109375,-0.00244903564453125,0.0030956268310546875,0.0032634735107421875,0.06744384765625,0.0064697265625,0.005268096923828125,-0.014190673828125,-0.01392364501953125,0.01105499267578125,0.0307769775390625,-0.017425537109375,0.002033233642578125,0.034698486328125,-0.0250244140625,-0.00201416015625,-0.0233154296875,-0.019287109375,0.0181884765625,0.008575439453125,-0.03564453125,0.045379638671875,0.003162384033203125,0.0053863525390625,0.011322021484375,-0.01247406005859375,-0.04608154296875,0.02166748046875,0.0238189697265625,0.022125244140625,0.0035915374755859375,0.022064208984375,0.00432586669921875,-0.03546142578125,0.01241302490234375,0.02740478515625,0.004619598388671875,-0.09027099609375,-0.052764892578125,0.0325927734375,-0.02593994140625,0.0094146728515625,-0.01071929931640625,0.01453399658203125,0.00960540771484375,0.05364990234375,0.003635406494140625,-0.020416259765625,-0.0279083251953125,-0.004726409912109375,-0.007053375244140625,0.004291534423828125,-0.00629425048828125,-0.0200958251953125,-0.049163818359375,0.02899169921875,-0.06494140625,-0.0117034912109375,-0.04913330078125,0.0148468017578125,-0.070556640625,0.04052734375,0.0009059906005859375,0.021759033203125,-0.0352783203125,-0.0341796875,-0.050262451171875,0.00033593177795410156,0.043792724609375,-0.024658203125,0.0182952880859375,-0.01245880126953125,-0.01078033447265625,-0.0157012939453125,0.045257568359375,0.0224609375,-0.0308074951171875,0.06298828125,0.0051727294921875,0.00559234619140625,-0.025604248046875,-0.0242919921875,0.01508331298828125,-0.01375579833984375,0.0262298583984375,-0.043365478515625,-0.032958984375,-0.0101776123046875,-0.0250244140625,-0.0118408203125,0.0208282470703125,-0.006626129150390625,0.047698974609375,0.0027217864990234375,-0.023956298828125,-0.0016622543334960938,0.051513671875,-0.001529693603515625,-0.0250396728515625,-0.0016460418701171875,-0.0206451416015625,0.047576904296875,-0.06134033203125,0.03338623046875,-0.0251922607421875,0.17041015625,0.04351806640625,-0.07464599609375,0.03369140625,1.4543533325195312e-05,-0.0011463165283203125,-0.048004150390625,0.00016164779663085938,0.0247344970703125,-0.0215301513671875,0.03076171875,-0.016937255859375,-0.0943603515625,-0.026824951171875,0.01163482666015625,0.039276123046875,-0.01502227783203125,0.00400543212890625,0.01165771484375,0.0169677734375,-0.033050537109375,-0.03863525390625,-0.09918212890625,0.04339599609375,-0.04229736328125,-0.005825042724609375,-0.030731201171875,-0.0289306640625,0.0068206787109375,0.0158538818359375,0.043853759765625,0.005344390869140625,-0.0231170654296875,0.0287017822265625,0.01654052734375,0.041961669921875,0.005596160888671875,0.0307769775390625,0.0102081298828125,0.03472900390625,0.0164337158203125,0.00878143310546875,0.061492919921875,-0.0435791015625,-0.08380126953125,0.0120391845703125,-0.020660400390625,-0.049346923828125,0.03900146484375,-0.03497314453125,0.00788116455078125,-0.053192138671875,0.0191497802734375,0.02227783203125,0.0333251953125,-0.00919342041015625,-0.104248046875,-0.049774169921875,-0.0645751953125,0.028350830078125,-0.03839111328125,-0.01654052734375,-0.00263214111328125,-0.007259368896484375,0.017120361328125,-0.00391387939453125,-0.0848388671875,0.006214141845703125,-0.0175018310546875,-0.051971435546875,-0.048828125,-0.08807373046875,0.00550079345703125,0.030303955078125,0.00983428955078125,-0.0084686279296875,0.01213836669921875,0.01532745361328125,0.003978729248046875,0.03924560546875,-0.02044677734375,0.0391845703125,-0.051513671875,-0.02532958984375,-0.0233612060546875,0.03936767578125,-0.0286712646484375,-0.037811279296875,0.05712890625,0.0418701171875,0.0294036865234375,0.006137847900390625,-0.0252532958984375,0.003887176513671875,-0.0084991455078125,0.003604888916015625,-0.01824951171875,0.04150390625,-0.08380126953125,-0.00284576416015625,0.01535797119140625,0.0009307861328125,-0.00376129150390625,0.01319122314453125,-0.057586669921875,0.0261383056640625,0.01068878173828125,0.0295867919921875,-0.00908660888671875,-0.0201873779296875,-0.01419830322265625,0.0159149169921875,0.06597900390625,-0.0137176513671875,0.014495849609375,0.01540374755859375,0.01279449462890625,-0.0262298583984375,-0.0182952880859375,0.04644775390625,-0.00547027587890625,-0.0021648406982421875,-0.03570556640625,-0.04400634765625,0.00457000732421875,0.01441192626953125,0.0352783203125,0.043182373046875,-0.0010404586791992188,-0.032196044921875,-0.0079345703125,-0.043792724609375,-0.016204833984375,0.05029296875,0.004558563232421875,-0.055328369140625,-0.0265960693359375,0.028350830078125,0.03192138671875,0.029693603515625,0.040435791015625,0.004878997802734375,-0.034515380859375,0.025146484375,-0.01326751708984375,-0.0140838623046875,-0.0008788108825683594,-0.0230865478515625,0.0088653564453125,0.00522613525390625,-0.0179290771484375,0.0303802490234375,0.032928466796875,-0.0184326171875,0.033935546875,-0.002887725830078125,-0.0090484619140625,0.0176849365234375]'
AS distance
FROM
segment
)
SELECT distances.id, distances.distance, distances.category
FROM distances
WHERE distances.distance < 0.97
ORDER BY distances.distance
LIMIT 30
;
ROLLBACK;
|
Without the indexing, you can always get the right results, and everything is done in a brute force way. With the index, currently the index didn't handle the Here we can push down the And for the result accuracy, all the vector index are approximate nearest neighbor algorithm, which means the results are "approximate" (not exactly the same with the real results). You can increase the |
Made an issue at #503 |
Thanks for the detailed explanation! This makes sense. Will be really great to have that feature. |
Tangential to #501
I'm trying to store the computed cosine distances in a new column, and then order the results by that column while filtering out rows that have a distance greater than some threshold. I believe there's a bug when trying to do thresholding
Dataset Context
My dataset has 3 tables
collection
,asset
, andsegment
, setup like so:I'm using the
0.3.0-alpha2
docker container for all my tests.I then build some standard indexes on theese tables, followed by the HNSW index. Here is a fully reproducible script to reproduce the dataset:
https://gist.github.com/rsomani95/56194e33364208019a09846cb9bacabb#file-build_dataset_with_indexes-sql-L1-L113
Query: store distance + filter by threshold + order by distance (Does not work)
Results
I believe the issue is the addition of the
WHERE
clause.Below, I'm sharing a query that does the ordering correctly:
Query: store distance + order by distance
The returned results are ordered by distance correctly:
Results
These results are ordered correctly.
There is one additional discrepancy with the incorrect query above: the distance values themselves are different. In the erroneous query, the lowest distance is ~0.95 (2nd last row), whereas here the closest distance here is ~0.96. I'm not sure which one is correct however.
The text was updated successfully, but these errors were encountered: