Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

About range query performance questions. #2266

Closed
MochiXu opened this issue Nov 20, 2023 · 12 comments
Closed

About range query performance questions. #2266

MochiXu opened this issue Nov 20, 2023 · 12 comments

Comments

@MochiXu
Copy link
Contributor

MochiXu commented Nov 20, 2023

Recently, I'm interpreting tantivy in ClickHouse. Here is the schema in tantivy:

let mut schema_builder = Schema::builder();
    schema_builder.add_u64_field("row_id", INDEXED | STORED | FAST);
    schema_builder.add_text_field("text", TEXT);
    let schema = schema_builder.build();

    let mut index = match Index::create_in_dir(dir_str, schema) {
        Ok(index) => index,
        Err(_) => return std::ptr::null_mut(),
    };

And this is my FFI function in Rust, it will consume 0.99s.

#[no_mangle]
pub extern "C" fn tantivy_search_in_range(ir: *mut IndexR, query_ptr: *const c_char, lrange: u64, rrange: u64) -> bool {
    let query_c_str = unsafe {
        assert!(!query_ptr.is_null());
        CStr::from_ptr(query_ptr)
    };

    let schema = unsafe { (*ir).index.schema() };
    let row_id = schema.get_field("row_id").expect("missing field row_id");
    let text = schema.get_field("text").expect("missing field text");

    let searcher = unsafe { (*ir).reader.searcher() };

    let range_query = RangeQuery::new_u64("row_id".to_string(), lrange..rrange);

    let top_docs = searcher.search(&range_query, &Docs::with_limit(std::u64::MAX as usize)).expect("failed to search");

    false
}

If only text search is applied, it will only consume 0.2s.

#[no_mangle]
pub extern "C" fn tantivy_search_in_range(ir: *mut IndexR, query_ptr: *const c_char, lrange: u64, rrange: u64) -> bool {
    let query_c_str = unsafe {
        assert!(!query_ptr.is_null());
        CStr::from_ptr(query_ptr)
    };
    let query_str = query_c_str.to_str().expect("failed to get &str from cstr");
    let query_string = query_str.to_string();

    let schema = unsafe { (*ir).index.schema() };
    let row_id = schema.get_field("row_id").expect("missing field row_id");
    let text = schema.get_field("text").expect("missing field text");

    let searcher = unsafe { (*ir).reader.searcher() };

    let query_parser = QueryParser::for_index(unsafe { &(*ir).index }, vec![text]);
    let text_query = query_parser.parse_query(query_str).expect("tantivy failed to parse query");
    
    let top_docs = searcher.search(&text_query, &Docs::with_limit(std::u64::MAX as usize)).expect("failed to search");

    false
}

In fact, I want to filter the row_id using lrange and rrange first, and then do a text search based on the filtered results. I thought this would speed up the query results, but now it looks like range queries are too slow.
The data set I used was wiki abstract 5 million, with each text containing about 200 words.

Would you please give me some advice on what I can do to get a more efficient query😨

@PSeitz
Copy link
Contributor

PSeitz commented Nov 20, 2023

Can you retest without setting the field fast?
If the field is fast we use the columnar storage for range queries. But if only few terms hit the range, a query on the inverted index would be better.

@MochiXu
Copy link
Contributor Author

MochiXu commented Nov 20, 2023

@PSeitz Thank you very much for your reply. I use INDEXED retest it. Here I give a minimum test case, the time consume is lower.

You can download the wiki_5m dataset from this link, the datasets is about 1GB.
The variable recreate_index in main function can decide whether recreate tantivy index or not. If you changed field_options for test, please modify recreate_index together.

Here is some information about test case:

  • range query: I will use RangeQuery::new_u64("row_id".to_string(), 80000..90000) to search from index.
  • text query: I will use term volunteer to search.
  • boolean query: combine range query and text query.
BooleanQuery::new(vec![
      (Occur::Must, Box::new(range_query)),
      (Occur::Must, Box::new(text_query)),
  ]);

Here is the test result:

  • row_id field_options is FAST:
range query answer size:10000, range_duration:280ms
text query answer size:2574, range_duration:0ms
boolean query answer size:5, range_duration:333ms
  • row_id field_options is INDEXED:
range query answer size:10000, range_duration:31ms
text query answer size:2574, range_duration:0ms
boolean query answer size:5, range_duration:30ms

One more thing I want to figure out:
In boolean query, I want to know the order in which range query and text query are executed 😺.

Here is my minimum test case:

extern crate tantivy;

use std::fs;
use std::path::Path;
use std::time::Instant;

use tantivy::collector::Count;
use tantivy::query::BooleanQuery;
use tantivy::query::QueryParser;
use tantivy::query::RangeQuery;
use tantivy::query_grammar::Occur;
use tantivy::schema::*;
use tantivy::Index;
use tantivy::ReloadPolicy;
use serde::{Deserialize, Serialize};


#[derive(Deserialize, Serialize)]
struct Item {
    id: u64,
    title: String,
    body: String,
}


fn recreate_or_load_index(index_path: &Path, recreate: bool, schema: &Schema) -> tantivy::Result<Index> {
    if recreate {
        if index_path.exists() {
            let _ = fs::remove_dir_all(index_path);
        }
        let _ = fs::create_dir_all(index_path);
        let mut index = Index::create_in_dir(index_path, schema.clone()).expect("create index error");
        let _ = index.set_default_multithread_executor();
        Ok(index)
    } else {
        let mut index = Index::open_in_dir(index_path).expect("msg");
        let _ = index.set_default_multithread_executor();
        Ok(index)
    }
}


fn index_from_json_file(index:&Index, json_file_path: &Path, schema: &Schema) -> tantivy::Result<()> {
    // Create index_writer and set merge policy.
    let mut index_writer = index.writer_with_num_threads(8, 1024 * 1024 * 1024 * 8).unwrap();
    let mut merge_policy = tantivy::merge_policy::LogMergePolicy::default();
    merge_policy.set_max_docs_before_merge(100_000);
    merge_policy.set_min_num_segments(4);
    index_writer.set_merge_policy(Box::new(merge_policy));

    let row_id = schema.get_field("row_id").unwrap();
    let title = schema.get_field("title").unwrap();
    let body = schema.get_field("body").unwrap();

    // Read datasets.
    let json_content = fs::read_to_string(json_file_path).expect("file should be read");
    let documents: Vec<Item> = serde_json::from_str(&json_content)?;

    // Index data from json.
    for doc in documents {
        let mut temp = Document::default();
        temp.add_u64(row_id, doc.id);
        temp.add_text(title, doc.title);
        temp.add_text(body, doc.body);
        let _ = index_writer.add_document(temp);
    }
    index_writer.commit()?;

    // Release merging threads.
    let segment_ids = index
    .searchable_segment_ids()
    .expect("Searchable segments failed.");
    if segment_ids.len() > 1 {
        index_writer.merge(&segment_ids).wait()?;
        index_writer.wait_merging_threads()?;
    }
    Ok(())

}

fn main() -> tantivy::Result<()> {
    // Tantivy index files local path.
    let index_path = Path::new("/home/mochix/workspace_github/tantivy_demo/index_path");
    // Datasets, you can download it from AWS S3: wget https://myscale-example-datasets.s3.amazonaws.com/wiki_560w.json
    let json_file_path = Path::new("/home/mochix/workspace_github/tantivy_demo/wiki_560w.json");
    // If you want to recreate tantivy index, make it be true.
    let recreate_index = true;

    let mut schema_builder = Schema::builder();
    schema_builder.add_u64_field("row_id", FAST);
    schema_builder.add_text_field("title", TEXT);
    schema_builder.add_text_field("body", TEXT);
    let schema = schema_builder.build();
    
    let index = recreate_or_load_index(index_path, recreate_index, &schema)?;
    if recreate_index {
        let _ = index_from_json_file(&index, json_file_path, &schema);
    }

    let reader = index
    .reader_builder()
    .reload_policy(ReloadPolicy::OnCommit)
    .try_into()?;
    let searcher = reader.searcher();

    let range_query_start = Instant::now();
    let range_query = RangeQuery::new_u64("row_id".to_string(), 80000..90000);
    let range_size = searcher.search(&range_query, &Count).expect("range query error");
    let range_query_duration = range_query_start.elapsed().as_millis();
    println!("range query answer size:{:?}, range_duration:{:?}ms", range_size, range_query_duration);

    let text_query_start = Instant::now();
    let query_parser = QueryParser::for_index(&index, vec![schema.get_field("body").unwrap()]);
    let text_query = query_parser.parse_query("volunteer")?;
    let text_ans_size = searcher.search(&text_query, &Count)?;
    let text_query_duration = text_query_start.elapsed().as_millis();
    println!("text query answer size:{:?}, range_duration:{:?}ms", text_ans_size, text_query_duration);

    let boolean_query_start = Instant::now();
    let boolean_query = BooleanQuery::new(vec![
        (Occur::Must, Box::new(range_query)),
        (Occur::Must, Box::new(text_query)),
    ]);
    let boolean_ans_size = searcher.search(&boolean_query, &Count)?;
    let boolean_query_duration = boolean_query_start.elapsed().as_millis();
    println!("boolean query answer size:{:?}, range_duration:{:?}ms", boolean_ans_size, boolean_query_duration);


    Ok(())
}

@PSeitz
Copy link
Contributor

PSeitz commented Nov 20, 2023

These numbers seem off. Are you running in release mode? What CPU are you on?

In boolean query, I want to know the order in which range query and text query are executed 😺.

They run at the same time, but with INDEXED, a datastructure is created beforehand for efficient intersection.

@MochiXu
Copy link
Contributor Author

MochiXu commented Nov 20, 2023

Where can I find out whether this test case is running in release or debug mode?
My test case Cargo.toml is:

[package]
name = "tantivy_demo"
version = "0.1.0"
edition = "2018"

[dependencies]
tantivy = "0.21.1"
tempfile = "3.2"
serde = { version = "1.0.127", features = ["derive"] }
serde_json = "1.0.66"

[[bin]]
name = "tantivy_demo"
path = "src/main.rs"

Here is my cpu info:

❯ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         48 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  16
  On-line CPU(s) list:   0-15
Vendor ID:               AuthenticAMD
  Model name:            AMD Ryzen 9 6900HX with Radeon Graphics
    CPU family:          25
    Model:               68
    Thread(s) per core:  2
    Core(s) per socket:  8
    Socket(s):           1
    Stepping:            1
    Frequency boost:     enabled
    CPU max MHz:         4933.8862
    CPU min MHz:         1600.0000
    BogoMIPS:            6587.56
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmpe
                         rf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topo
                         ext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveo
                         pt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter
                          pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization features:
  Virtualization:        AMD-V
Caches (sum of all):
  L1d:                   256 KiB (8 instances)
  L1i:                   256 KiB (8 instances)
  L2:                    4 MiB (8 instances)
  L3:                    16 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-15
Vulnerabilities:
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Mitigation; safe RET, no microcode
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected

@fulmicoton
Copy link
Collaborator

You just need to add --release to you cargo run or cargo test command

@MochiXu
Copy link
Contributor Author

MochiXu commented Nov 20, 2023

Here is test result in release mode:

  • row_id field_options is INDEXED:
❯ ./target/release/tantivy_demo 
range query answer size:10000, range_duration:5ms
text query answer size:2574, range_duration:0ms
boolean query answer size:5, range_duration:5ms

@PSeitz
Copy link
Contributor

PSeitz commented Nov 20, 2023

What about with FAST?

@MochiXu
Copy link
Contributor Author

MochiXu commented Nov 20, 2023

@PSeitz Here is FAST in release mode

❯ ./target/release/tantivy_demo
range query answer size:10000, range_duration:29ms
text query answer size:2574, range_duration:0ms
boolean query answer size:5, range_duration:26ms

Based on the results from the release mode of INDEXED, a search through 5.6 million lines of text takes 0ms.
However, the speed decreases when combined with a range query(5ms).
My requirement is to determine if any row_id within the specified range is present in the results of a text search.

Could you offer some suggestions? The integration of Tantivy into ClickHouse demands stringent requirements regarding search latency, which is quite critical in our process.

@MochiXu
Copy link
Contributor Author

MochiXu commented Nov 21, 2023

I’ve realized that by using a custom Collector, I can gather the results for row_id. My current schema is as follows:

schema_builder.add_u64_field("row_id", FAST | INDEXED);
schema_builder.add_text_field("title", TEXT);
schema_builder.add_text_field("body", TEXT);

And the code for the custom Collector is provided below (code file: row_id_collector.rs):

use tantivy::columnar::Column;
// ---
// Importing tantivy...
use tantivy::collector::{Collector, SegmentCollector};
use tantivy::query::QueryParser;
use tantivy::schema::{Schema, FAST, INDEXED, TEXT, Field};
use tantivy::{doc, Index, Score, SegmentReader};


pub struct RowIdCollector {
    pub row_id_field: String,
}

impl RowIdCollector {
    pub fn with_field(row_id_field: String) -> RowIdCollector {
        RowIdCollector { row_id_field }
    }
}

impl Collector for RowIdCollector {
    // That's the type of our result.
    // Our standard deviation will be a float.
    type Fruit = Vec<u64>;

    type Child = RowIdSegmentCollector;

    fn for_segment(
        &self,
        _segment_local_id: u32,
        segment_reader: &SegmentReader,
    ) -> tantivy::Result<Self::Child> {
        let row_id_reader = segment_reader.fast_fields().u64(&self.row_id_field)?;
        Ok(RowIdSegmentCollector {
            row_id_reader: row_id_reader,
            row_ids: Vec::new(),
        })
    }

    fn requires_scoring(&self) -> bool {
        // this collector does not care about score.
        false
    }

    fn merge_fruits(&self, segment_row_ids: Vec<Self::Fruit>) -> tantivy::Result<Self::Fruit> {
        Ok(segment_row_ids.into_iter().flatten().collect())
    }
}

pub struct RowIdSegmentCollector {
    row_id_reader: Column,
    row_ids: Vec<u64>,
}

impl SegmentCollector for RowIdSegmentCollector {
    type Fruit = Vec<u64>;

    fn collect(&mut self, doc: u32, _score: Score) {
        // Since we know the values are single value, we could call `first_or_default_col` on the
        // column and fetch single values.
        for value in self.row_id_reader.values_for_doc(doc) {
            self.row_ids.push(value);
        }
    }

    fn harvest(self) -> <Self as SegmentCollector>::Fruit {
        self.row_ids
    }
}

And here is the test case result in release mode:

❯ ./target/release/tantivy_demo
range query answer size:10000, range_query_duration: 24.703 ms
text query answer size:2574, text_query_duration: 0.118 ms
boolean query answer size:5, boolean_query_duration: 11.758 ms
row_id answer size:2574, row_id_collector_query_duration: 0.777 ms

I am interested in knowing how to modify this custom Collector to accelerate the query if the indexing method for the row_id column is INDEXED rather than FAST | INDEXED.

Based on yesterday's experience, using INDEXED yielded better results than FAST. Therefore, I am considering changing the indexing method of row_id to INDEXED. This means I need to modify the for_segment method of my custom Collector. I am curious about how to find similar example code in the Tantivy source code. Could you guide me on this? 🥺

@PSeitz
Copy link
Contributor

PSeitz commented Nov 21, 2023

Based on the results from the release mode of INDEXED, a search through 5.6 million lines of text takes 0ms.

That's not how it works, it's 1 lookup in the inverted index and return the docs.

However, the speed decreases when combined with a range query(5ms).

Range Query with INDEXED does 10_000 lookups.
Range Query with FAST does a fullscan over the 5.6million docs. This is close to a pathological case how the intersection algorithm works in this case. The distribution of the docs from the search is quite uniform. The skipping is relatively short and the fast field range query always scans chunks of docs.

The issue here is that seek, doesn't just move the cursor foward, but also loads the next doc in this DocSet.

E.g. if DocSet1 (Term DocSet) has docids [1_000_000, 2_000_000] and DocSet2 (FastField DocSet) is empty.
DocSet1 would call seek(1_000_000) on DocSet2. There's no next hit on DocSet2 and would try to load the next docid, doing a fullscan.

Show DocIds For TermQuery

hit:2693
hit:5232
hit:8717
hit:11396
hit:11551
hit:15126
hit:18079
hit:19384
hit:26570
hit:29538
hit:33228
hit:34408
hit:43163
hit:45752
hit:48777
hit:49029
hit:52346
hit:60052
hit:65910
hit:70748
hit:71233
hit:78036
hit:79163
hit:81731
hit:82433
hit:84981
hit:85891
hit:91327
hit:97348
hit:97378
hit:97758
hit:102461
hit:113383
hit:116688
hit:118032
hit:118098
hit:119591
hit:122237
hit:126246
hit:128203
hit:128808
hit:130609
hit:131208
hit:136276
hit:140294
hit:140756
hit:140788
hit:144749
hit:146657
hit:146886
hit:150969
hit:151223
hit:153866
hit:154855
hit:161728
hit:162534
hit:163081
hit:169103
hit:169606
hit:171902
hit:172141
hit:172989
hit:173039
hit:173055
hit:173120
hit:173523
hit:174259
hit:175109
hit:178578
hit:182622
hit:183159
hit:184821
hit:188031
hit:190642
hit:193326
hit:195727
hit:198021
hit:200637
hit:201507
hit:201692
hit:205145
hit:205309
hit:205464
hit:214898
hit:224189
hit:225944
hit:235564
hit:250388
hit:253567
hit:253594
hit:256686
hit:257085
hit:260438
hit:260441
hit:260443
hit:260471
hit:260556
hit:260695
hit:260853
hit:260856
hit:262210
hit:265224
hit:265929
hit:266464
hit:269289
hit:269304
hit:269492
hit:269552
hit:274301
hit:275523
hit:279053
hit:279288
hit:279680
hit:279681
hit:287601
hit:287757
hit:289911
hit:294220
hit:294605
hit:295453
hit:296057
hit:317717
hit:321075
hit:321104
hit:321271
hit:321547
hit:321548
hit:324347
hit:324852
hit:327414
hit:328677
hit:329098
hit:329325
hit:333065
hit:334275
hit:336724
hit:337091
hit:337226
hit:338980
hit:341889
hit:343365
hit:346549
hit:347006
hit:350734
hit:351281
hit:352102
hit:352677
hit:353489
hit:353612
hit:353637
hit:353675
hit:354015
hit:355071
hit:355654
hit:356922
hit:358636
hit:359561
hit:361215
hit:363456
hit:365885
hit:366302
hit:366392
hit:366739
hit:366858
hit:369073
hit:371240
hit:371273
hit:371278
hit:371396
hit:376732
hit:376741
hit:376765
hit:376865
hit:376867
hit:378830
hit:378838
hit:381111
hit:382131
hit:382326
hit:383138
hit:390957
hit:390962
hit:391452
hit:401437
hit:410925
hit:410961
hit:416546
hit:416693
hit:417824
hit:421726
hit:424114
hit:424477
hit:426936
hit:428321
hit:428324
hit:430319
hit:430577
hit:432384
hit:434909
hit:435970
hit:436055
hit:438227
hit:438229
hit:438240
hit:438380
hit:441515
hit:441670
hit:444199
hit:444397
hit:446093
hit:449578
hit:453278
hit:455825
hit:457231
hit:458012
hit:461202
hit:461672
hit:463472
hit:463894
hit:466791
hit:474577
hit:475090
hit:475274
hit:475561
hit:478483
hit:480062
hit:485510
hit:485797
hit:489664
hit:501235
hit:502758
hit:504514
hit:514297
hit:515840
hit:516208
hit:516646
hit:519344
hit:520160
hit:520887
hit:528502
hit:530476
hit:532617
hit:534463
hit:534675
hit:536355
hit:539558
hit:540091
hit:542569
hit:542610
hit:543991
hit:544015
hit:549092
hit:551747
hit:557294
hit:558018
hit:558538
hit:560895
hit:564717
hit:566862
hit:568494
hit:569793
hit:580969
hit:581346
hit:582921
hit:585367
hit:586647
hit:587988
hit:588773
hit:589440
hit:593060
hit:594596
hit:594854
hit:598487
hit:601092
hit:609049
hit:610482
hit:610641
hit:612068
hit:613822
hit:613852
hit:617220
hit:624665
hit:626328
hit:626867
hit:627869
hit:627922
hit:638118
hit:638809
hit:640929
hit:641326
hit:647145
hit:652021
hit:654964
hit:657166
hit:657984
hit:659303
hit:659958
hit:661623
hit:665022
hit:667303
hit:669093
hit:674252
hit:675250
hit:678847
hit:683480
hit:683893
hit:687706
hit:694781
hit:695060
hit:702240
hit:704129
hit:704552
hit:706300
hit:709321
hit:711058
hit:711625
hit:717560
hit:717901
hit:725322
hit:731287
hit:731905
hit:735227
hit:737216
hit:739476
hit:739668
hit:740040
hit:741701
hit:741934
hit:744268
hit:745075
hit:745499
hit:747539
hit:750061
hit:763469
hit:764636
hit:765217
hit:766904
hit:769449
hit:769844
hit:772304
hit:774323
hit:774977
hit:780940
hit:784451
hit:793998
hit:797961
hit:800547
hit:803977
hit:804360
hit:809740
hit:811075
hit:811785
hit:812053
hit:815216
hit:816234
hit:818049
hit:819325
hit:819648
hit:824601
hit:827316
hit:828184
hit:839416
hit:841115
hit:841980
hit:844209
hit:844933
hit:845144
hit:856010
hit:857727
hit:857855
hit:861044
hit:862292
hit:871727
hit:872951
hit:877247
hit:878443
hit:879749
hit:882472
hit:885263
hit:888391
hit:888893
hit:890169
hit:891148
hit:891547
hit:891838
hit:891884
hit:892508
hit:895794
hit:896147
hit:898598
hit:900531
hit:900656
hit:901316
hit:910982
hit:912664
hit:913125
hit:913290
hit:916834
hit:919642
hit:928322
hit:932800
hit:932933
hit:934272
hit:937384
hit:937667
hit:940496
hit:940563
hit:940565
hit:941818
hit:941950
hit:944511
hit:960576
hit:960584
hit:963738
hit:971520
hit:978576
hit:978980
hit:980912
hit:980914
hit:981285
hit:982384
hit:982417
hit:984976
hit:985842
hit:986517
hit:986937
hit:994232
hit:996030
hit:996450
hit:996852
hit:997630
hit:1001034
hit:1002189
hit:1002221
hit:1002241
hit:1002357
hit:1003919
hit:1005699
hit:1006224
hit:1007212
hit:1012150
hit:1016877
hit:1017032
hit:1017036
hit:1022833
hit:1023606
hit:1027128
hit:1033266
hit:1034528
hit:1035041
hit:1035390
hit:1038109
hit:1041024
hit:1041388
hit:1041570
hit:1042684
hit:1042947
hit:1044098
hit:1044099
hit:1044765
hit:1044839
hit:1048065
hit:1050599
hit:1051445
hit:1051736
hit:1054004
hit:1056466
hit:1060403
hit:1061396
hit:1062053
hit:1063426
hit:1064080
hit:1064242
hit:1064243
hit:1064246
hit:1070149
hit:1071236
hit:1072025
hit:1072027
hit:1074228
hit:1074877
hit:1076583
hit:1076830
hit:1081607
hit:1084697
hit:1087104
hit:1089016
hit:1089057
hit:1093966
hit:1098872
hit:1099505
hit:1099559
hit:1100360
hit:1102428
hit:1104284
hit:1104385
hit:1104434
hit:1105129
hit:1105138
hit:1105241
hit:1105922
hit:1108173
hit:1109881
hit:1112343
hit:1113713
hit:1115922
hit:1115929
hit:1116939
hit:1119995
hit:1124259
hit:1126037
hit:1126414
hit:1129001
hit:1130226
hit:1132469
hit:1134273
hit:1135042
hit:1135498
hit:1136562
hit:1137991
hit:1140744
hit:1146185
hit:1147361
hit:1148405
hit:1149042
hit:1149237
hit:1150540
hit:1152934
hit:1153705
hit:1156184
hit:1160394
hit:1169210
hit:1169674
hit:1174884
hit:1175371
hit:1176909
hit:1186355
hit:1187785
hit:1187881
hit:1188067
hit:1188900
hit:1189220
hit:1192271
hit:1193136
hit:1194846
hit:1195563
hit:1197135
hit:1197466
hit:1199007
hit:1200612
hit:1201145
hit:1207944
hit:1210758
hit:1211996
hit:1212361
hit:1217431
hit:1217902
hit:1218421
hit:1221992
hit:1222195
hit:1224113
hit:1226465
hit:1226984
hit:1226986
hit:1227311
hit:1229740
hit:1232712
hit:1233181
hit:1233245
hit:1235299
hit:1236974
hit:1237924
hit:1242540
hit:1242576
hit:1244804
hit:1245317
hit:1246104
hit:1251734
hit:1255405
hit:1255646
hit:1255748
hit:1256521
hit:1258974
hit:1261177
hit:1262938
hit:1263750
hit:1263969
hit:1264009
hit:1267095
hit:1268987
hit:1269597
hit:1276201
hit:1277197
hit:1277491
hit:1278496
hit:1281239
hit:1287945
hit:1288341
hit:1289178
hit:1292083
hit:1294713
hit:1296431
hit:1301959
hit:1305313
hit:1306556
hit:1306675
hit:1309687
hit:1309932
hit:1312640
hit:1314073
hit:1321869
hit:1323679
hit:1329393
hit:1332643
hit:1332761
hit:1335138
hit:1339367
hit:1343039
hit:1343088
hit:1344162
hit:1350806
hit:1353831
hit:1354197
hit:1354342
hit:1354543
hit:1356532
hit:1359120
hit:1359289
hit:1362555
hit:1362880
hit:1370038
hit:1376831
hit:1381568
hit:1384438
hit:1384665
hit:1384774
hit:1385479
hit:1390037
hit:1390557
hit:1391224
hit:1393797
hit:1394647
hit:1397043
hit:1397457
hit:1399244
hit:1399406
hit:1400804
hit:1406066
hit:1410709
hit:1413025
hit:1417189
hit:1417808
hit:1418784
hit:1419368
hit:1421408
hit:1422273
hit:1430091
hit:1430500
hit:1433113
hit:1435051
hit:1439721
hit:1443985
hit:1450178
hit:1454518
hit:1454535
hit:1456531
hit:1459519
hit:1461856
hit:1468509
hit:1468595
hit:1471102
hit:1482465
hit:1486867
hit:1488989
hit:1489194
hit:1495516
hit:1495722
hit:1496979
hit:1498910
hit:1504289
hit:1507975
hit:1508103
hit:1508986
hit:1519451
hit:1519761
hit:1519840
hit:1524130
hit:1532692
hit:1533295
hit:1534825
hit:1535786
hit:1537014
hit:1540407
hit:1543894
hit:1556102
hit:1557507
hit:1558471
hit:1559466
hit:1561221
hit:1568243
hit:1568604
hit:1569532
hit:1571790
hit:1579705
hit:1580404
hit:1582094
hit:1583937
hit:1585428
hit:1586491
hit:1590317
hit:1590673
hit:1590692
hit:1590699
hit:1599071
hit:1599668
hit:1604517
hit:1608169
hit:1611045
hit:1611162
hit:1611469
hit:1611881
hit:1612840
hit:1615711
hit:1619409
hit:1619953
hit:1620145
hit:1620725
hit:1622263
hit:1627081
hit:1628994
hit:1630163
hit:1632178
hit:1638630
hit:1644968
hit:1645293
hit:1647985
hit:1650270
hit:1652071
hit:1656505
hit:1656730
hit:1660015
hit:1660021
hit:1660753
hit:1661257
hit:1663279
hit:1663753
hit:1664057
hit:1664612
hit:1673182
hit:1675252
hit:1676676
hit:1681023
hit:1683600
hit:1685861
hit:1692669
hit:1696556
hit:1697005
hit:1697171
hit:1698179
hit:1698240
hit:1702071
hit:1702913
hit:1706196
hit:1709325
hit:1709860
hit:1710641
hit:1717272
hit:1718448
hit:1718456
hit:1718686
hit:1718708
hit:1719063
hit:1722493
hit:1727765
hit:1728117
hit:1728190
hit:1728778
hit:1739573
hit:1744500
hit:1744610
hit:1752051
hit:1753369
hit:1762048
hit:1762551
hit:1763772
hit:1764624
hit:1765331
hit:1769819
hit:1770930
hit:1771053
hit:1771071
hit:1771252
hit:1771596
hit:1771599
hit:1774215
hit:1778464
hit:1778599
hit:1780857
hit:1785792
hit:1786462
hit:1786552
hit:1787475
hit:1787901
hit:1788498
hit:1788501
hit:1788523
hit:1790216
hit:1792891
hit:1795321
hit:1796201
hit:1800649
hit:1801156
hit:1801581
hit:1805202
hit:1805416
hit:1810574
hit:1810711
hit:1810726
hit:1810935
hit:1812640
hit:1813528
hit:1814010
hit:1815292
hit:1816717
hit:1819599
hit:1819706
hit:1822909
hit:1823702
hit:1827828
hit:1828236
hit:1828360
hit:1828365
hit:1828366
hit:1829162
hit:1831161
hit:1833450
hit:1834144
hit:1834334
hit:1835004
hit:1835263
hit:1835820
hit:1839850
hit:1841742
hit:1841809
hit:1842288
hit:1845411
hit:1845826
hit:1846196
hit:1846256
hit:1846626
hit:1851582
hit:1855589
hit:1857988
hit:1858357
hit:1860962
hit:1870795
hit:1872256
hit:1873099
hit:1875646
hit:1878332
hit:1878785
hit:1879204
hit:1883244
hit:1886663
hit:1889316
hit:1889379
hit:1889785
hit:1890768
hit:1891458
hit:1892895
hit:1893116
hit:1894914
hit:1895232
hit:1895270
hit:1895893
hit:1896729
hit:1896992
hit:1897990
hit:1902567
hit:1907645
hit:1908571
hit:1914453
hit:1915109
hit:1919956
hit:1919959
hit:1920486
hit:1925406
hit:1925456
hit:1925551
hit:1927982
hit:1928608
hit:1931412
hit:1936685
hit:1937153
hit:1940482
hit:1941824
hit:1943199
hit:1943821
hit:1946248
hit:1946566
hit:1946659
hit:1950082
hit:1952205
hit:1952451
hit:1953724
hit:1957159
hit:1959844
hit:1963287
hit:1963791
hit:1964854
hit:1968171
hit:1969101
hit:1971388
hit:1971824
hit:1974397
hit:1974407
hit:1974749
hit:1975242
hit:1977048
hit:1982394
hit:1987825
hit:1992315
hit:1996256
hit:2002373
hit:2005547
hit:2008083
hit:2008101
hit:2008242
hit:2008358
hit:2008570
hit:2011258
hit:2014495
hit:2015461
hit:2017213
hit:2017599
hit:2019162
hit:2028529
hit:2029264
hit:2029361
hit:2034952
hit:2036878
hit:2037042
hit:2037713
hit:2040161
hit:2040817
hit:2042139
hit:2042242
hit:2052803
hit:2053987
hit:2055755
hit:2056081
hit:2069935
hit:2070147
hit:2070251
hit:2081238
hit:2082669
hit:2083891
hit:2085956
hit:2088449
hit:2091338
hit:2100133
hit:2100448
hit:2101420
hit:2106610
hit:2115043
hit:2116549
hit:2123359
hit:2125214
hit:2133229
hit:2138015
hit:2140311
hit:2145344
hit:2152951
hit:2153568
hit:2156724
hit:2158327
hit:2162496
hit:2164986
hit:2167231
hit:2168562
hit:2173624
hit:2174076
hit:2174227
hit:2176011
hit:2178396
hit:2192146
hit:2196682
hit:2208346
hit:2209052
hit:2209071
hit:2214325
hit:2219185
hit:2221619
hit:2223179
hit:2223752
hit:2223987
hit:2225748
hit:2229648
hit:2229811
hit:2236059
hit:2238264
hit:2240638
hit:2243182
hit:2247854
hit:2249814
hit:2252835
hit:2254338
hit:2259737
hit:2262195
hit:2264389
hit:2264485
hit:2265891
hit:2266450
hit:2269351
hit:2284010
hit:2286839
hit:2289093
hit:2290730
hit:2291159
hit:2292768
hit:2295035
hit:2297463
hit:2297632
hit:2299783
hit:2301135
hit:2303915
hit:2306189
hit:2312908
hit:2312913
hit:2317641
hit:2319272
hit:2330251
hit:2330268
hit:2331164
hit:2331431
hit:2331726
hit:2331884
hit:2333104
hit:2335146
hit:2339176
hit:2339254
hit:2339672
hit:2339677
hit:2339685
hit:2339688
hit:2340962
hit:2341474
hit:2349380
hit:2349502
hit:2349519
hit:2350744
hit:2351033
hit:2367460
hit:2368320
hit:2369253
hit:2369449
hit:2372032
hit:2374170
hit:2374198
hit:2374353
hit:2375339
hit:2376565
hit:2376907
hit:2382539
hit:2385812
hit:2386504
hit:2386986
hit:2393581
hit:2395443
hit:2397437
hit:2403783
hit:2404755
hit:2406114
hit:2408196
hit:2408322
hit:2408605
hit:2409857
hit:2412128
hit:2412453
hit:2414303
hit:2424618
hit:2424788
hit:2424829
hit:2426894
hit:2427869
hit:2431680
hit:2432295
hit:2432545
hit:2432581
hit:2440824
hit:2440859
hit:2443540
hit:2444996
hit:2447623
hit:2448963
hit:2450149
hit:2450265
hit:2450268
hit:2450908
hit:2459347
hit:2459383
hit:2460419
hit:2460567
hit:2463752
hit:2465238
hit:2465668
hit:2467021
hit:2467028
hit:2472387
hit:2475166
hit:2476820
hit:2483813
hit:2484755
hit:2485114
hit:2485946
hit:2487591
hit:2491784
hit:2493883
hit:2493887
hit:2495683
hit:2497340
hit:2502164
hit:2503344
hit:2504310
hit:2506328
hit:2508283
hit:2510893
hit:2511008
hit:2511149
hit:2512305
hit:2512332
hit:2512380
hit:2512550
hit:2513004
hit:2513809
hit:2516268
hit:2516293
hit:2518181
hit:2518540
hit:2521149
hit:2521620
hit:2525247
hit:2525612
hit:2528427
hit:2528742
hit:2529046
hit:2529065
hit:2532957
hit:2533075
hit:2539173
hit:2539211
hit:2540430
hit:2541394
hit:2542756
hit:2542894
hit:2542973
hit:2542975
hit:2546691
hit:2546733
hit:2547086
hit:2547314
hit:2547320
hit:2547323
hit:2547338
hit:2547343
hit:2547350
hit:2547353
hit:2550402
hit:2550403
hit:2553689
hit:2555351
hit:2557660
hit:2558038
hit:2558484
hit:2562731
hit:2567757
hit:2570033
hit:2574368
hit:2574520
hit:2574771
hit:2580666
hit:2583047
hit:2583482
hit:2585907
hit:2589010
hit:2589122
hit:2590363
hit:2590380
hit:2590982
hit:2593201
hit:2593260
hit:2593266
hit:2593846
hit:2598369
hit:2598953
hit:2599511
hit:2599719
hit:2603295
hit:2604633
hit:2604644
hit:2605081
hit:2607738
hit:2608725
hit:2609954
hit:2621211
hit:2621556
hit:2624111
hit:2625619
hit:2626361
hit:2628357
hit:2630207
hit:2640961
hit:2644752
hit:2646940
hit:2647404
hit:2654940
hit:2655016
hit:2655675
hit:2655827
hit:2661344
hit:2661531
hit:2661610
hit:2662403
hit:2667405
hit:2668719
hit:2669308
hit:2670575
hit:2671926
hit:2674703
hit:2674947
hit:2677287
hit:2679722
hit:2684980
hit:2686252
hit:2686617
hit:2688206
hit:2692549
hit:2700168
hit:2701296
hit:2706210
hit:2707857
hit:2709264
hit:2710397
hit:2712442
hit:2714920
hit:2715662
hit:2719144
hit:2723005
hit:2726645
hit:2727519
hit:2727700
hit:2727815
hit:2729238
hit:2731013
hit:2738321
hit:2741902
hit:2742041
hit:2746296
hit:2747820
hit:2749986
hit:2751246
hit:2755396
hit:2756143
hit:2757357
hit:2759303
hit:2762093
hit:2764232
hit:2770067
hit:2770298
hit:2770966
hit:2771694
hit:2774275
hit:2775941
hit:2779655
hit:2779928
hit:2783137
hit:2785045
hit:2788434
hit:2791243
hit:2794281
hit:2796227
hit:2800094
hit:2801591
hit:2801953
hit:2804032
hit:2804766
hit:2806727
hit:2810194
hit:2823101
hit:2826085
hit:2830673
hit:2831071
hit:2834270
hit:2837482
hit:2837591
hit:2838018
hit:2841751
hit:2841973
hit:2842847
hit:2843530
hit:2843624
hit:2847580
hit:2853878
hit:2854128
hit:2854404
hit:2855064
hit:2860732
hit:2863414
hit:2865244
hit:2869580
hit:2869916
hit:2876554
hit:2880365
hit:2882814
hit:2884776
hit:2893565
hit:2902334
hit:2905384
hit:2905713
hit:2906175
hit:2906498
hit:2906937
hit:2915486
hit:2921149
hit:2921256
hit:2921817
hit:2922352
hit:2923136
hit:2927486
hit:2928978
hit:2929288
hit:2930018
hit:2931813
hit:2940458
hit:2940621
hit:2941072
hit:2944607
hit:2945008
hit:2945598
hit:2951335
hit:2953588
hit:2954616
hit:2955326
hit:2960370
hit:2962196
hit:2964540
hit:2965549
hit:2967470
hit:2967863
hit:2971769
hit:2978425
hit:2985318
hit:2986910
hit:2988429
hit:2989233
hit:2989805
hit:2989809
hit:2994336
hit:2995973
hit:3001155
hit:3003161
hit:3003444
hit:3005480
hit:3006431
hit:3007412
hit:3011774
hit:3013912
hit:3016751
hit:3018978
hit:3022824
hit:3031921
hit:3032474
hit:3032505
hit:3032520
hit:3034820
hit:3035414
hit:3036725
hit:3037467
hit:3049805
hit:3050555
hit:3055119
hit:3055829
hit:3061377
hit:3061414
hit:3063866
hit:3068963
hit:3074661
hit:3076317
hit:3079036
hit:3079126
hit:3082422
hit:3084137
hit:3084315
hit:3087078
hit:3088235
hit:3088606
hit:3088974
hit:3089130
hit:3092364
hit:3095813
hit:3099050
hit:3100265
hit:3102035
hit:3113970
hit:3115546
hit:3116987
hit:3118199
hit:3119706
hit:3121798
hit:3122405
hit:3123611
hit:3124220
hit:3127395
hit:3130137
hit:3134318
hit:3134710
hit:3135876
hit:3137196
hit:3138627
hit:3138692
hit:3139190
hit:3139258
hit:3140222
hit:3145909
hit:3146036
hit:3148419
hit:3148576
hit:3149998
hit:3152732
hit:3152982
hit:3154177
hit:3157566
hit:3161005
hit:3161126
hit:3161922
hit:3165357
hit:3166399
hit:3167310
hit:3167529
hit:3170105
hit:3171242
hit:3174055
hit:3174149
hit:3174513
hit:3176980
hit:3183184
hit:3183209
hit:3183239
hit:3184185
hit:3192817
hit:3192825
hit:3192963
hit:3193551
hit:3195667
hit:3198568
hit:3199478
hit:3200608
hit:3202189
hit:3202234
hit:3204678
hit:3206166
hit:3210784
hit:3210973
hit:3212276
hit:3212339
hit:3212492
hit:3212495
hit:3212962
hit:3212985
hit:3213161
hit:3216065
hit:3217454
hit:3217878
hit:3218453
hit:3220660
hit:3222620
hit:3223839
hit:3225018
hit:3225275
hit:3225412
hit:3226933
hit:3229209
hit:3229323
hit:3229932
hit:3229936
hit:3229940
hit:3230301
hit:3234534
hit:3234823
hit:3234909
hit:3235038
hit:3235227
hit:3235254
hit:3235273
hit:3235279
hit:3235799
hit:3240075
hit:3242538
hit:3242540
hit:3243024
hit:3245673
hit:3246702
hit:3247345
hit:3247348
hit:3247367
hit:3247388
hit:3247403
hit:3247411
hit:3247464
hit:3247522
hit:3250393
hit:3252152
hit:3252384
hit:3252872
hit:3259812
hit:3264704
hit:3266311
hit:3270818
hit:3271893
hit:3281896
hit:3284453
hit:3294348
hit:3296799
hit:3297393
hit:3298315
hit:3299633
hit:3304264
hit:3306372
hit:3306584
hit:3308095
hit:3314409
hit:3315726
hit:3315800
hit:3315816
hit:3320601
hit:3320619
hit:3320623
hit:3325849
hit:3326258
hit:3331749
hit:3332207
hit:3334164
hit:3337538
hit:3338559
hit:3339383
hit:3343551
hit:3345979
hit:3353701
hit:3358945
hit:3359429
hit:3360050
hit:3361255
hit:3365473
hit:3366530
hit:3368704
hit:3373833
hit:3375734
hit:3378754
hit:3381529
hit:3382583
hit:3384840
hit:3384917
hit:3388511
hit:3389143
hit:3394448
hit:3396745
hit:3396884
hit:3398449
hit:3401997
hit:3405187
hit:3409043
hit:3409728
hit:3411704
hit:3413031
hit:3416887
hit:3418014
hit:3420073
hit:3421262
hit:3430284
hit:3435894
hit:3445512
hit:3450016
hit:3451771
hit:3452911
hit:3453489
hit:3453538
hit:3453646
hit:3454873
hit:3459726
hit:3461068
hit:3461150
hit:3461299
hit:3468092
hit:3471862
hit:3472661
hit:3478892
hit:3480284
hit:3481032
hit:3485207
hit:3491996
hit:3502974
hit:3506619
hit:3508650
hit:3510128
hit:3513243
hit:3513926
hit:3516831
hit:3519976
hit:3519997
hit:3524293
hit:3532072
hit:3533017
hit:3533767
hit:3536204
hit:3538990
hit:3540477
hit:3541398
hit:3542576
hit:3552368
hit:3554374
hit:3557056
hit:3566925
hit:3568139
hit:3568572
hit:3569369
hit:3570076
hit:3571917
hit:3574006
hit:3576554
hit:3593029
hit:3596696
hit:3599101
hit:3600250
hit:3602941
hit:3603052
hit:3603283
hit:3604315
hit:3606181
hit:3606322
hit:3611188
hit:3611261
hit:3616043
hit:3621981
hit:3626265
hit:3627233
hit:3634599
hit:3636635
hit:3639719
hit:3649266
hit:3650629
hit:3652364
hit:3656406
hit:3661894
hit:3663902
hit:3667488
hit:3672720
hit:3675135
hit:3678833
hit:3678916
hit:3678947
hit:3682752
hit:3685306
hit:3689515
hit:3690643
hit:3692056
hit:3697805
hit:3701126
hit:3703035
hit:3703513
hit:3703629
hit:3707454
hit:3712733
hit:3717913
hit:3725501
hit:3725508
hit:3727213
hit:3728570
hit:3728587
hit:3733111
hit:3734134
hit:3735076
hit:3736714
hit:3738253
hit:3743044
hit:3746531
hit:3746537
hit:3747089
hit:3766265
hit:3768218
hit:3768809
hit:3771390
hit:3774179
hit:3774954
hit:3784261
hit:3789210
hit:3791508
hit:3792137
hit:3792149
hit:3792202
hit:3793386
hit:3793783
hit:3794093
hit:3796676
hit:3796974
hit:3801241
hit:3802886
hit:3803670
hit:3803675
hit:3804331
hit:3805281
hit:3818233
hit:3818835
hit:3820337
hit:3820423
hit:3823114
hit:3825799
hit:3826788
hit:3827860
hit:3828866
hit:3832307
hit:3832319
hit:3836629
hit:3841544
hit:3842271
hit:3842274
hit:3842315
hit:3842501
hit:3843712
hit:3843841
hit:3848656
hit:3849811
hit:3849815
hit:3849854
hit:3849871
hit:3850767
hit:3857806
hit:3857873
hit:3857874
hit:3859692
hit:3861242
hit:3863999
hit:3865025
hit:3865041
hit:3865067
hit:3866220
hit:3866513
hit:3867449
hit:3869886
hit:3870900
hit:3871531
hit:3871544
hit:3874433
hit:3874712
hit:3876445
hit:3878585
hit:3882831
hit:3884166
hit:3887070
hit:3887157
hit:3888291
hit:3890218
hit:3892457
hit:3894325
hit:3895211
hit:3895362
hit:3895386
hit:3895698
hit:3895746
hit:3898916
hit:3901317
hit:3901517
hit:3907286
hit:3910366
hit:3910375
hit:3910378
hit:3910379
hit:3910559
hit:3910917
hit:3911049
hit:3911051
hit:3914416
hit:3915350
hit:3915710
hit:3916539
hit:3917093
hit:3919309
hit:3920596
hit:3920687
hit:3920767
hit:3922415
hit:3922711
hit:3924504
hit:3924771
hit:3924777
hit:3926165
hit:3927401
hit:3928097
hit:3928349
hit:3928403
hit:3928538
hit:3931710
hit:3933178
hit:3934536
hit:3937349
hit:3939740
hit:3939755
hit:3939760
hit:3939782
hit:3939900
hit:3944070
hit:3945173
hit:3945624
hit:3945658
hit:3945764
hit:3945769
hit:3946579
hit:3946968
hit:3947598
hit:3948389
hit:3958039
hit:3958101
hit:3958987
hit:3962071
hit:3968099
hit:3968211
hit:3968214
hit:3968296
hit:3969697
hit:3976566
hit:3979434
hit:3982145
hit:3986370
hit:3991221
hit:3991257
hit:3993198
hit:3994816
hit:3996881
hit:4003136
hit:4004791
hit:4005117
hit:4005263
hit:4008416
hit:4009448
hit:4009866
hit:4010552
hit:4011351
hit:4012075
hit:4012765
hit:4014735
hit:4018444
hit:4022771
hit:4023711
hit:4025844
hit:4028377
hit:4030651
hit:4030787
hit:4031844
hit:4034918
hit:4035890
hit:4036496
hit:4036630
hit:4037368
hit:4038492
hit:4042771
hit:4049498
hit:4051438
hit:4056690
hit:4056692
hit:4057386
hit:4060459
hit:4062692
hit:4063536
hit:4065697
hit:4067351
hit:4070317
hit:4077459
hit:4086314
hit:4086669
hit:4087066
hit:4089593
hit:4093434
hit:4096093
hit:4099503
hit:4099746
hit:4101801
hit:4102180
hit:4104709
hit:4108602
hit:4111489
hit:4114428
hit:4114476
hit:4115564
hit:4115779
hit:4115877
hit:4115973
hit:4118138
hit:4123139
hit:4125453
hit:4125613
hit:4126233
hit:4126605
hit:4127282
hit:4127511
hit:4127701
hit:4128013
hit:4129085
hit:4133274
hit:4145601
hit:4147048
hit:4152140
hit:4153450
hit:4154665
hit:4156638
hit:4156980
hit:4157133
hit:4158116
hit:4158852
hit:4161931
hit:4163641
hit:4169310
hit:4174099
hit:4174867
hit:4174978
hit:4179535
hit:4194251
hit:4197296
hit:4208094
hit:4208727
hit:4212166
hit:4213497
hit:4221854
hit:4224140
hit:4229173
hit:4233893
hit:4234465
hit:4237620
hit:4238977
hit:4239099
hit:4241623
hit:4241652
hit:4247651
hit:4247894
hit:4251670
hit:4252204
hit:4252255
hit:4261435
hit:4263388
hit:4269233
hit:4277070
hit:4279601
hit:4280039
hit:4289949
hit:4292490
hit:4292621
hit:4298406
hit:4298650
hit:4302896
hit:4303650
hit:4304490
hit:4316638
hit:4321512
hit:4321527
hit:4321905
hit:4326033
hit:4326117
hit:4326237
hit:4327774
hit:4330065
hit:4331216
hit:4334333
hit:4334392
hit:4335136
hit:4335289
hit:4335482
hit:4336469
hit:4339019
hit:4339358
hit:4349893
hit:4352256
hit:4353690
hit:4354476
hit:4356810
hit:4359710
hit:4366664
hit:4367699
hit:4368141
hit:4368721
hit:4369987
hit:4373079
hit:4373290
hit:4376775
hit:4380648
hit:4381517
hit:4388960
hit:4390014
hit:4390209
hit:4395688
hit:4396539
hit:4397843
hit:4397845
hit:4405427
hit:4415994
hit:4419725
hit:4422297
hit:4424018
hit:4426458
hit:4430704
hit:4433807
hit:4434514
hit:4435128
hit:4436944
hit:4440108
hit:4450040
hit:4451272
hit:4452071
hit:4452564
hit:4452807
hit:4455858
hit:4459724
hit:4462245
hit:4462294
hit:4464760
hit:4465915
hit:4467009
hit:4467128
hit:4468344
hit:4477360
hit:4479253
hit:4480541
hit:4480850
hit:4481186
hit:4482340
hit:4483443
hit:4483481
hit:4486583
hit:4486655
hit:4489888
hit:4490202
hit:4497378
hit:4497775
hit:4499948
hit:4503266
hit:4503496
hit:4508112
hit:4509521
hit:4509573
hit:4510162
hit:4511571
hit:4518198
hit:4518206
hit:4518266
hit:4523068
hit:4528087
hit:4528146
hit:4532143
hit:4534742
hit:4536489
hit:4536889
hit:4538687
hit:4542634
hit:4543194
hit:4544252
hit:4544910
hit:4547930
hit:4549471
hit:4555599
hit:4556364
hit:4556809
hit:4557028
hit:4558664
hit:4561199
hit:4562682
hit:4563921
hit:4564104
hit:4566977
hit:4571749
hit:4571840
hit:4574357
hit:4574445
hit:4578809
hit:4581074
hit:4581603
hit:4583070
hit:4583804
hit:4585269
hit:4585365
hit:4587048
hit:4587432
hit:4587650
hit:4589633
hit:4594951
hit:4598008
hit:4598065
hit:4598115
hit:4598353
hit:4598357
hit:4598367
hit:4600870
hit:4601865
hit:4602293
hit:4602426
hit:4607724
hit:4608597
hit:4608638
hit:4609110
hit:4609134
hit:4610954
hit:4611061
hit:4611846
hit:4611847
hit:4612369
hit:4613402
hit:4616265
hit:4616559
hit:4616927
hit:4616932
hit:4617001
hit:4617007
hit:4617013
hit:4618296
hit:4618685
hit:4619752
hit:4623180
hit:4623460
hit:4627318
hit:4627368
hit:4633846
hit:4635131
hit:4635822
hit:4635981
hit:4636454
hit:4636470
hit:4636982
hit:4638770
hit:4642477
hit:4644335
hit:4644617
hit:4644722
hit:4646031
hit:4649134
hit:4653315
hit:4656225
hit:4658676
hit:4658714
hit:4659213
hit:4662296
hit:4662315
hit:4662426
hit:4663980
hit:4667392
hit:4667654
hit:4670605
hit:4670738
hit:4672333
hit:4673774
hit:4673924
hit:4673931
hit:4678837
hit:4687226
hit:4690162
hit:4693014
hit:4694896
hit:4695565
hit:4696062
hit:4703658
hit:4704060
hit:4710253
hit:4710395
hit:4711925
hit:4717591
hit:4717905
hit:4718070
hit:4719790
hit:4729136
hit:4729728
hit:4736320
hit:4741117
hit:4744005
hit:4744500
hit:4745444
hit:4745683
hit:4747610
hit:4748348
hit:4752686
hit:4756006
hit:4756457
hit:4756564
hit:4757428
hit:4758232
hit:4758547
hit:4772566
hit:4772660
hit:4773000
hit:4774066
hit:4774444
hit:4775330
hit:4778120
hit:4778260
hit:4781641
hit:4789957
hit:4790815
hit:4791610
hit:4795474
hit:4801259
hit:4801717
hit:4804093
hit:4805754
hit:4805916
hit:4808785
hit:4810246
hit:4811236
hit:4813981
hit:4821388
hit:4823603
hit:4824216
hit:4826395
hit:4835313
hit:4836867
hit:4840417
hit:4843347
hit:4845311
hit:4845803
hit:4846525
hit:4850822
hit:4852331
hit:4852345
hit:4857788
hit:4859768
hit:4862035
hit:4866107
hit:4870070
hit:4871542
hit:4875811
hit:4877092
hit:4877410
hit:4882825
hit:4883886
hit:4884021
hit:4893333
hit:4903744
hit:4903964
hit:4904710
hit:4906093
hit:4906259
hit:4910241
hit:4912149
hit:4913795
hit:4919328
hit:4919365
hit:4924145
hit:4924632
hit:4925928
hit:4926271
hit:4928637
hit:4933750
hit:4934275
hit:4942181
hit:4942771
hit:4943812
hit:4949065
hit:4963428
hit:4965341
hit:4973243
hit:4973350
hit:4976968
hit:4979300
hit:4980819
hit:4986947
hit:4989160
hit:4991836
hit:4992379
hit:5001674
hit:5003541
hit:5006749
hit:5008874
hit:5009488
hit:5010656
hit:5012705
hit:5014266
hit:5020284
hit:5023591
hit:5028318
hit:5028751
hit:5032935
hit:5039863
hit:5042063
hit:5044227
hit:5047652
hit:5048153
hit:5048702
hit:5052553
hit:5054381
hit:5057026
hit:5059001
hit:5062355
hit:5065283
hit:5074063
hit:5074982
hit:5075614
hit:5076344
hit:5077182
hit:5081233
hit:5082819
hit:5084416
hit:5085415
hit:5085426
hit:5088261
hit:5091056
hit:5097289
hit:5099888
hit:5101019
hit:5101344
hit:5102899
hit:5105658
hit:5105934
hit:5105935
hit:5106042
hit:5107642
hit:5109596
hit:5110126
hit:5113107
hit:5113522
hit:5115212
hit:5119392
hit:5119498
hit:5119504
hit:5122947
hit:5123017
hit:5127396
hit:5127494
hit:5127642
hit:5128017
hit:5128082
hit:5129401
hit:5131138
hit:5134216
hit:5137005
hit:5139698
hit:5141886
hit:5142999
hit:5150071
hit:5150635
hit:5150642
hit:5150812
hit:5150817
hit:5151170
hit:5151563
hit:5155277
hit:5155674
hit:5157681
hit:5161576
hit:5163605
hit:5166690
hit:5166736
hit:5166738
hit:5166854
hit:5166877
hit:5167974
hit:5168477
hit:5168910
hit:5169217
hit:5169523
hit:5169988
hit:5172002
hit:5176666
hit:5180037
hit:5185396
hit:5188339
hit:5188566
hit:5188748
hit:5196599
hit:5197125
hit:5199958
hit:5201800
hit:5203826
hit:5204024
hit:5206292
hit:5206403
hit:5208863
hit:5213194
hit:5214186
hit:5214550
hit:5214912
hit:5214917
hit:5223306
hit:5233487
hit:5233776
hit:5237249
hit:5241026
hit:5241191
hit:5244196
hit:5244579
hit:5248736
hit:5248883
hit:5248887
hit:5248899
hit:5251855
hit:5254275
hit:5255343
hit:5255345
hit:5255424
hit:5255559
hit:5260356
hit:5261859
hit:5262962
hit:5263400
hit:5263744
hit:5265793
hit:5267725
hit:5269507
hit:5273412
hit:5273571
hit:5273572
hit:5273864
hit:5276274
hit:5280773
hit:5280828
hit:5284753
hit:5288490
hit:5288677
hit:5289982
hit:5292629
hit:5296047
hit:5300856
hit:5302518
hit:5302837
hit:5304178
hit:5307785
hit:5307787
hit:5309026
hit:5309065
hit:5311492
hit:5312805
hit:5312905
hit:5315699
hit:5315869
hit:5318733
hit:5320685
hit:5321041
hit:5322004
hit:5325280
hit:5330160
hit:5330180
hit:5330182
hit:5330189
hit:5330193
hit:5330262
hit:5330265
hit:5332198
hit:5333129
hit:5333725
hit:5337556
hit:5338889
hit:5338928
hit:5339900
hit:5344390
hit:5344866
hit:5347201
hit:5347411
hit:5350292
hit:5352796
hit:5357091
hit:5359983
hit:5360050
hit:5362470
hit:5364789
hit:5368404
hit:5369361
hit:5370044
hit:5370606
hit:5372416
hit:5375024
hit:5379350
hit:5382809
hit:5391962
hit:5392748
hit:5393846
hit:5394641
hit:5396163
hit:5399581
hit:5400600
hit:5400797
hit:5400996
hit:5402993
hit:5403362
hit:5407228
hit:5407302
hit:5411939
hit:5412407
hit:5412976
hit:5418367
hit:5420173
hit:5424684
hit:5426245
hit:5428657
hit:5434305
hit:5434486
hit:5441513
hit:5442562
hit:5442770
hit:5445885
hit:5446363
hit:5446513
hit:5447622
hit:5453347
hit:5456866
hit:5457166
hit:5461448
hit:5462110
hit:5464127
hit:5464794
hit:5465778
hit:5466911
hit:5468117
hit:5469307
hit:5471382
hit:5473198
hit:5473919
hit:5476851
hit:5476934
hit:5477533
hit:5477963
hit:5478070
hit:5479906
hit:5481040
hit:5484836
hit:5485759
hit:5488706
hit:5490768
hit:5495465
hit:5497395
hit:5503764
hit:5504476
hit:5511547
hit:5515569
hit:5521754
hit:5527364
hit:5529979
hit:5530092
hit:5534006
hit:5538303
hit:5541905
hit:5549914
hit:5550026
hit:5550526
hit:5554826
hit:5558235
hit:5562122
hit:5564068
hit:5569276
hit:5569693
hit:5572724
hit:5572971
hit:5577321
hit:5577580
hit:5579384
hit:5579464
hit:5579780
hit:5581770
hit:5582751
hit:5586361
hit:5588369
hit:5589645
hit:5589650
hit:5590343
hit:5591309
hit:5594968
hit:5599340
hit:5602824
hit:5603198
hit:5606095
hit:5608529
hit:5614911
hit:5618517

size:2574

Could you offer some suggestions? The integration of Tantivy into ClickHouse demands stringent requirements regarding search latency, which is quite critical in our process.

What are the requirements?

I am interested in knowing how to modify this custom Collector to accelerate the query if the indexing method for the row_id column is INDEXED rather than FAST | INDEXED.

The code for range queries on the fast fields FAST or inverted index INDEXED is here:
https://github.com/quickwit-oss/tantivy/tree/main/src/query/range_query.

The biggest gain would probably be on this type of query to sort the index by row_id and then exploit the sorting in the RangeDocSet, by finding the relevant docid range via a binary search.

@MochiXu
Copy link
Contributor Author

MochiXu commented Nov 21, 2023

Thank you for your patient explanation, now I understand the scanning method of the FAST index.
My specific requirement is to write an FFI search function for C++ usage.

The tantivy index schema structure looks like:

let mut schema_builder = Schema::builder();
schema_builder.add_u64_field("row_id", FAST | INDEXED);
schema_builder.add_text_field("text", TEXT);
let schema = schema_builder.build();

This FFI search function takes three important parameters:

  • query_ptr the string used for searching
  • lrange the left interval of row_id_range
  • rrange the right interval of row_id_range.

It looks like:

pub struct IndexR {
    pub path: String,
    pub index: Index,
    pub reader: IndexReader,
}

#[no_mangle]
pub extern "C" fn tantivy_search_in_range(ir: *mut IndexR, query_ptr: *const c_char, lrange: u64, rrange: u64) -> bool {
    // ...
}

The function returns a boolean value.
This function will search for the string in the tantivy index and collect a set of corresponding row_ids from the search results. Then, it will determine if there is an overlap between this row_ids set and the row_id_range provided by the function parameters [lrange, rrange]. If there is an overlap, it returns true, otherwise false.

🎯 Everything became clear after that.

1️⃣ ➡️ Initially, I tried using the following more traditional logic to determine if the row_ids overlap with the row_id_range:

let docs_results = searcher.search(&query, &Docs::with_limit(limit as usize)).expect("failed to search");
for (_score, doc_address) in docs_results {
    let row_id_column = &row_id_columns[doc_address.segment_ord as usize];
    let row_id_value = row_id_column.values_for_doc(doc_address.doc_id).next().unwrap_or(0);
    if lrange <= row_id_value && row_id_value <= rrange {
        return true;
    }
}
false

2️⃣ ➡️ Later, I planned to optimize this FFI interface. The goal of the optimization is to reduce latency. So, I thought of combining a range query. First, I would limit the row_id according to the function's parameters [lrange, rrange], and then perform a text search within this specified range [lrange, rrange]. If a result is found, this FFI function would immediately return true; otherwise, it would return false. The code would look something like this:

let range_query = RangeQuery::new_u64("row_id".to_string(), lrange..rrange);
let query_parser = QueryParser::for_index(&index, vec![schema.get_field("text").unwrap()]);
let text_query = query_parser.parse_query("volunteer")?;
let boolean_query = BooleanQuery::new(vec![
    (Occur::Must, Box::new(range_query)),
    (Occur::Must, Box::new(text_query)),
]);
let boolean_ans_size = searcher.search(&boolean_query, &Count)?;
if boolean_ans_size !=0 {
    return true;
}
false

However, the latency when using a range query was still relatively high, as shown in the previous results I posted. Executing a text search alone takes 0.118 ms, but adding a range query increases the latency to 5ms (with the row_id field using an INDEXED only index).

3️⃣ ➡️ Therefore, I started to consider whether it would be possible to work on the custom Collector used in the text search. By directly collecting the row_ids corresponding to the search results in the Collector, it's feasible to determine during the collector process whether the [lrange, rrange] range has been hit. Code looks like:

let row_id_hit_size = searcher.search(&text_query, &RowIdCollector::with_field("row_id".to_string(), lrange, rrange))?;
if row_id_hit_size.unwrap().get_hit_count() !=0 {
    return true;
}
false

The custom Collector code is below:

use tantivy::columnar::Column;
use tantivy::collector::{Collector, SegmentCollector};
use tantivy::{ Score, SegmentReader};

#[derive(Default)]
pub struct RowIdRangeHit {
    hit: bool,
    hit_count: u64,
}

impl RowIdRangeHit {
    pub fn new() -> RowIdRangeHit {
        RowIdRangeHit {
            hit: false,
            hit_count: 0,
        }
    }

    pub fn get_hit_count(&self) -> u64 {
        self.hit_count
    }

    pub fn mark_hit(&mut self) {
        self.hit_count += 1;
        self.hit = true;
    }

    pub fn is_hit(&self) -> bool {
        self.hit
    }
}

pub struct RowIdCollector {
    pub row_id_field: String,
    pub lrange: u64,
    pub rrange: u64,
}

impl RowIdCollector {
    pub fn with_field(row_id_field: String, lrange: u64, rrange: u64) -> RowIdCollector {
        RowIdCollector { row_id_field, lrange, rrange }
    }
}

impl Collector for RowIdCollector {
    type Fruit = Option<RowIdRangeHit>;
    type Child = RowIdSegmentCollector;

    fn for_segment(
        &self,
        _segment_local_id: u32,
        segment_reader: &SegmentReader,
    ) -> tantivy::Result<Self::Child> {
        let row_id_reader_ = segment_reader.fast_fields().u64(&self.row_id_field)?;
        Ok(RowIdSegmentCollector {
            row_id_reader: row_id_reader_,
            row_id_range_hit: RowIdRangeHit::new(),
            lrange: self.lrange,
            rrange: self.rrange,
        })
    }

    fn requires_scoring(&self) -> bool {
        // this collector does not care about score.
        false
    }

    fn merge_fruits(&self, segment_row_ids: Vec<Self::Fruit>) -> tantivy::Result<Self::Fruit> {
        let mut row_id_range_hit = RowIdRangeHit::new();
        for segment_row_id_range_hit in segment_row_ids.into_iter().flatten() {
            if segment_row_id_range_hit.is_hit() {
                row_id_range_hit.hit = true;
                row_id_range_hit.hit_count += segment_row_id_range_hit.get_hit_count();
            }
        }
        Ok(Some(row_id_range_hit))
    }
}

pub struct RowIdSegmentCollector {
    row_id_reader: Column,
    row_id_range_hit: RowIdRangeHit,
    pub lrange: u64,
    pub rrange: u64,
}

impl SegmentCollector for RowIdSegmentCollector {
    type Fruit = Option<RowIdRangeHit>;

    fn collect(&mut self, doc: u32, _score: Score) {
        for row_id_ in self.row_id_reader.values_for_doc(doc) {
            if row_id_ >= self.lrange && row_id_ <= self.rrange {
                self.row_id_range_hit.mark_hit();
            }
        }
    }

    fn harvest(self) -> <Self as SegmentCollector>::Fruit {
        Some(self.row_id_range_hit)
    }
}

This approach is currently the most effective method I have found for minimizing latency. I tested this custom Collector in the test code mentioned earlier (using the row_id with a FAST | INDEXED indexing method). The search latency was reduced to 0.717 ms, which is very exciting.

range query answer size:10000, range_query_duration: 15.182 ms
text query answer size:2574, text_query_duration: 0.126 ms
boolean query answer size:5, boolean_query_duration: 12.160 ms
row_id hit count size:5, row_id_collector_query_duration: 0.717 ms

I'm wondering if changing the indexing method of row_id to INDEXED only could further accelerate the entire query process. I suspect that in cases where there are a large number of search results, the efficiency of this custom Collector might also be affected by the row_id FAST index. For instance, if I search for the word "a", I might get results like the following:

❯ ./target/release/tantivy_demo                   
range query answer size:10000, range_query_duration: 53.900 ms
text query answer size:1796537, text_query_duration: **3.695 ms**
boolean query answer size:3109, boolean_query_duration: 11.944 ms
row_id hit count size:3109, row_id_collector_query_duration: **11.278 ms**

So, if I change the indexing method of row_id to INDEXED only, how should I modify the for_segment function in my custom Collector? The fast_fields used inside this function would require the indexing method of row_id to be FAST, right?

Thank you for your patient reading.💗

@PSeitz
Copy link
Contributor

PSeitz commented Nov 21, 2023

My specific requirement is to write an FFI search function for C++ usage.

I meant the "stringent requirements regarding search latency"

So, if I change the indexing method of row_id to INDEXED only, how should I modify the for_segment function in my custom Collector? The fast_fields used inside this function would require the indexing method of row_id to be FAST, right?

Similar to the range_query. It's slower for large ranges.

DocSets And Intersection

The problem can be simplified two DocSets, each set hits a list of DocIds.

DocSet1  DocSet2
1
2
3
..
100_000  100_000
100_001
...
200_000  200_000

The intersection would result in two DocIds [100_000, 200_000].

On its own, DocSet1 is bad for the INDEXED variant. It also largely depends on the number of terms though, since we need to load all the docs per term and build the BitSet from it.
On its own, DocSet2 is bad for FAST, because there is no index so we need to scan the whole index, although there are just 2 docs.
With intersection it gets more complicated, on which DocSet is going to drive the seek. Currently it seems to alternate, but that doesn't seem the best choice since DocSets may have quite different performance characteristics regarding seek.

Currently tantivy always uses the columnar storage with FAST, but that should be conditional.

You collector manually implements the intersection. It alleviates the problem that seek also forwards to the next DocId.
It also generates new issues, since it is the final destination it can't cause seek on another DocSet, therefore there is no seek on the Term Query.

Early Exit

Your use case is also special, that you would want to early exit the query after the first hit. There is no such mechanism currently, but it could be added by changing how the query is driven with some parameter. There several methods, but one of them is: https://github.com/quickwit-oss/tantivy/blob/main/src/query/weight.rs#L26

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants