Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance drop on excessive number of IN() set conditions. #62

Open
oruchreis opened this issue Aug 18, 2020 · 4 comments
Open

Performance drop on excessive number of IN() set conditions. #62

oruchreis opened this issue Aug 18, 2020 · 4 comments

Comments

@oruchreis
Copy link

oruchreis commented Aug 18, 2020

Hi,
I'm using Reindexer a few months without any performance issues. And I'm quite happy with query performance especially for queries that have IN(set) conditions. But lately, some namespaces growed more in these months, and I'm facing some performance issues on specific queries on big namespaces which have items about 1 million. I've figured out if the count of values in the set condition is smaller from a limit it executes index method, if it is bigger then this limit, it executes scan method which is very slow. And I think this limit is changing by items count in the namespace.
I've uploaded dump file of the namespace to reproduce this issue. For this specific namespace, If the set condition has 679 values, it executes index method, but if the count of values is 680 then it executes scan method which results slower query. Also I've figured out if it executes scan method, the current activities page in the face displays as select_loop which takes about 30-50sn. And if the count of this scan queries increased over time, then cpu usage are also increasing exponentially.
I've tested on v2.6.2, 2.9.1, 2.11.0, 2.11.1 and 3.0.0 on both linux and windows.
Here is the dump of namespace to reproduce the issue:
TestNs.zip

And the query which executes scan method instead of index method. It executed scan in v3.0.0 and prior versions but if it doesn't execute scan method you can increase amount of values in the IN condition.

EXPLAIN SELECT * FROM TestNs
WHERE  (QuotaDate >= 637352928000000000 AND QuotaDate <= 637356384000000000) AND  
 HotelRoomId IN (
'9580','9581','9582','15461','9165','8107','15465','9594','9595','9596','9597','15466','9602','15467','9603','15468','15469','15470','15471','9604','15472','9605','15473','9606','15474','15475','15476','9607','15477','15478','9166',
'9187','9612','9613','15487','9617','9618','9794','9619','8522','9620','15488','15489','15490','15491','15492','15493','9824','9825','9828','9636','9630','9631','9829','9632','9639','8061','8053','8054','15510','15513','15514',
'9661','9662','15520','9541','15521','9877','8059','8060','9667','15523','15524','15525','15527','15528','15529','9671','16579','16580','16581','15530','16582','16583','15531','9668','16584','15526','9672','9669','14841','9673',
'9670','9674','9845','9675','15538','9676','9677','15539','13766','9678','13767','19438','15540','13768','19439','9679','9904','8290','8443','9683','9684','9685','9686','9687','15544','9688','15543','9689','23143','15545','15546',
'23141','9694','15551','9451','9695','15552','9696','15553','15554','15555','8128','8129','14936','8130','8114','8115','8116','8120','15556','15557','9702','9701','15559','9703','8602','9704','15560','9705','9706','15561','9707',
'15563','15564','15565','9714','8395','8021','8022','8023','9708','8029','8031','9709','9710','13289','13290','19483','19484','15576','9719','9720','9718','15575','9721','9722','14601','11329','11336','11332','9723','15582',
'9724','9725','9156','9726','15588','9729','15589','9730','9731','15591','15592','8709','9751','8306','15612','15613','15614','11350','10198','15615','9753','9754','9756','8452','8435','15621','9757','15623','23247','23248',
'15622','15624','15626','15627','15625','9764','9765','9766','9767','9768','9769','13366','9770','9771','9772','9773','19437','9869','9776','9777','9779','15639','9780','15640','8953','9781','9782','9783','9784','8954',
'8952','8669','9785','8667','8666','9786','8670','9787','8673','9788','9789','8674','8672','9790','9791','8671','9792','8987','19577','9793','9971','19578','15643','9795','9796','9797','9511','9512','15656','17807','16923','13259',
'19600','20408','19601','9805','19602','9806','23320','23321','23322','23323','23324','23325','23326','23328','23327','9406','23329','8311','8466','9408','15678','15679','15680','15683','15681','15682','15686','15687','9817',
'15688','15689','15690','15691','9818','15692','15693','15694','9819','15695','9820','15696','9821','8625','9822','9823','15697','14213','9826','9827','8511','15705','15704','15703','8185','8186','8187','8190','9844','10188',
'10189','10190','10191','9851','9852','9853','9854','9855','9856','9859','9444','9863','15736','9866','9178','9180','15741','9870','9871','15743','15744','15745','15746','9874','15747','15748','9875','9410','8854','8088','9883',
'9884','9885','9858','19725','9888','9872','9399','15760','9895','15761','23474','9893','15762','15763','9894','9892','23479','9905','9906','9907','9908','9903','9909','9910','9911','9912','15778','15781','15782','15783','15784',
'9922','9923','10151','9936','9949','9950','9967','9968','9969','19784','15804','15805','13635','9301','15811','15812','15813','19809','15816','15815','15817','15818','15819','15822','9981','9982','15823','10008','15824','15825',
'15826','10012','10013','19850','19851','19852','10019','19846','10020','10021','19855','15829','19853','15830','19849','15831','19854','19838','13633','19839','13634','19840','19841','19842','19843','19830','19831','19832','19833',
'14789','19856','15834','19834','19835','11346','19836','11347','11348','19837','8717','8723','10046','10047','19864','10057','23622','23621','10059','23620','10060','10061','10076','10077','10078','10079','10080','10081','10082',
'10083','15879','23641','10128','15880','15881','15882','15883','9266','15884','15885','15886','15887','8358','8143','19921','19923','19924','8715','8716','10135','15902','10136','8904','8905','8906','8999','8937','9000','8938',
'10139','9001','8939','8945','8837','14800','8838','10140','8839','8872','14799','8873','10142','10143','10144','10145','9267','15915','15916','15917','14938','14939','8407','8408','19946','8062','19947','19948','9556','10177',
'8413','15934','10179','15935','10181','10182','10183','10184','10185','10186','15937','15940','10187','15941','19965','10192','10193','10194','19966','10195','19967','9433','9464','12928','8513','10205','10206','10137','10138',
'10141','10209','10210','10211','15961','10212','10152','8855','10217','10218','17184','20028','16450','8578','15985','15986','20056','20057','8733','20058','20059','8543','15991','20062','15992','8230','15993','15995','12558',
'20069','14766','14768','15996','15997','14779','15998','15999','20077','16000','20078','16001','20080','16002','20081','16008','9077','9078','9079','16011','16012','16015','16014','16013','23871','20130','7950','7951','7952',
'8044','8045','8049','12528','16047','16048','16049','16050','16051','14053','9445','11213','16060','11214','16063','16064','16065','11217','11218','16066','16067','16068','16069','16070','16071','16072','16073','16076','16077',
'16078','23959','23960','9513','20187','20188','9924','16088','16087','16490','20197','16089','20198','12691','16090','16091','16092','11276','11278','11287','16093','11288','11289','11280','15318','11286','23979','9226','11246',
'11247','16101','16102','16103','20207','16104','13558','15462','13559','15463','13560','11245','16207','20208','11248','20209','11249','11250','11251','11252','16113','9442','11253','11254','16112','8591','11255','16111','16110',
'16109','9238','17212','11265','11266','16118','16119','11267','11268','16120','8786','8785','8787','8789','16132','8788','16133','16135','16134','15271','11339','8803','9539','9540','14797','12502','12503','11291','11292','11293',
'11294','8159','11295','11296','24026','11264','11297','11298','11300','11274','11301','11275','11365','11302','16565','11303','16567','16492','20261','24044','20263','11343','16152','11344','20264','11277','20262','16153','16154',
'11279','16860','16861','8510','11281','11307','11282','11283','11284','16166','11285','16167','16168','16169','24058','20286','24059','16165','11290','24060','17103','17104','24065','8012','11313','11314','8658','17712',
'16192','11315','16196','16197','16200','20310','16201','11316','20311','16202','20312','11317','11327','11318','16203','12570','12569','24086','11320','11322','11323','11330','11331','16215','11333','11334','16216','16217',
'11337','11338','16223','11340','11342','16225','16226','16227','11345','16228','20353','16229','7960','16230','11349','24131','20363','20364','20360','16237','20361','20362','20365','20366','20367','20368','8648','11366',
'16249','11367','16250','11362','11363','16251','16253','11373','23664','23665','16286','11400','11401','11402','8668','16308','16307','16309','12621','11364','20410','16316','20411','16318','11414','16317','11415','11416',
'24207','11417','11418','11420','11421','24209','11422','9157','9218','9219','9220','20425','11425','20426','15029','11426','20427','24217','16331','11427','11428','20432','20431','20430','20429','20428','8812','8811',
'24232','11432','11434','9059','9060','16349','16350','16351','16352','16353','16354','16355','16356','16357','20459','20460','20461','20462','8308','20463','16365','20465','20464','16366','14609','14610','16487','14518',
'20467','16377','14649','16388','16389','20502','20503','20504','20505','15270','8529','9283','24363','16886','20433','20434','17784','17782','17781','14210','14211','14212','24390','16430','8579','16556','16431',
'16432','20536','16433','20537','20538','16434','20539','16435','9534','12930','16444','9396','12463','16446','12464','12465','12466','20563','16448','12467','9371','16449','20561','9976','9977','16451','16452','8546',
'8547','8548','8549','8551','8552','8553','20564','8307','20570','20571','24452','16466','16467','24453','13619','16476','16475','12712','12710','16484','12498','12499','16485','16486','16488','16489','16491','17175','16493',
   '8219','12504','18000','24498','12505','12506','12507','12508','13695','24506','24507','24508','16506','16507','16508','9270','8675','8676','8677','8437','9386','9387','9878','9879','14617','16510','12529','16511',
'12530','8747','16512','12531','16513','20631','16515','16516','20632','16518','9458','9310','20633','16522','20079','12546','12547','12548','12549','16542','16691','12555','12556','12557','12561','12562','12563','12564',
'12565','8467','8468','12566','16551','16552','16553','12567','16554','12568','24562','16555','16557','16558','13398','16563','16564','16566','16568','12595','12596','12598','12599')
@oruchreis
Copy link
Author

oruchreis commented Aug 27, 2020

I've debugged for this query, and I've ended up again indexunordered.cc. As you mentioned here that increasing kMaxSelectivityPercentForIdset slows down some queries. So I didn't change this constant, instead I've added a threshold constant for the condition of choosing index or scan method. So it prevents scan method if the item count greater than this threshold.

https://github.com/oruchreis/reindexer/blob/c363c40d5f785ca8637ba5a4094a8cdfa130a0d8/cpp_src/core/index/indexunordered.cc#L195

I couldn't benchmarked in depth though, I don't know if it slowed down another query, but as far as I saw this workaround solved my performance issues on wide in() conditions for now. For this reason I didn't created a PR for this, because I don't know if it is a solution or not:
master...oruchreis:set_cond_selectivity

@oruchreis oruchreis changed the title Performance drop on big sized set conditions. Performance drop on excessive number of IN() set conditions. Sep 7, 2020
@oruchreis
Copy link
Author

Hi, have you able to look through this issue? We have splitted our queries because of this issue, big queries and small ones. if we send big queries to reindexer, it will become unresponsive soon.
Thanks.

@slowcheetahzzz
Copy link
Contributor

slowcheetahzzz commented Nov 20, 2020

Hi, have you able to look through this issue? We have splitted our queries because of this issue, big queries and small ones. if we send big queries to reindexer, it will become unresponsive soon.
Thanks.

Hello, @oruchreis. We've been quite busy developing other issues and didn't have any chance to take a look at your problem - we are quite sorry about that. I've just dubbed this issue in our local repository (we use gitlab) and we're going to fix it in the nearest future. Sorry for all the inconveniences and thank you for your patience.

@olegator77
Copy link
Contributor

olegator77 commented Dec 30, 2020

Hi @oruchreis. We have investigated issue.

  1. The reason of original optimization for indexunordered, which have been introduced in 2.9.2, is following: when condition with many arguments of IN operator is secondary (e.g. another query conditions are high selectivity, and already drops results count to few tens or thousands), then use comparator for left tens-thousands of items is more efficient, than request to index, got many results and intersect them.
    Unfortunately request selectivity for choose method is not correct in all cases, e.g. like your case. To fix this there are serious changes it query optimizer are required - need to add second pass of optimization (current 1 pass implementation can't predict, selectivity of another query conditions).
    In next version we will add another factor for enable optimization - number of another conditions in query.

  2. The actual reason of performance problem with your query is bug in comparator - the IN for strings has O(N*M) complexity (M - number of arguments of IN). After fix it will be O(N) amortized. For your case without any another changes this fix will reduce time from 1500ms to 8ms

@Restream Restream deleted a comment Oct 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants