Searching in Swift? #13

keehun · 2016-02-27T05:50:44Z

Hi,

Been trying to get search to work in Swift. I've spent the last week or so figuring out how to scrape a lot of HTML and have about 220,000 documents in Lucene. Searching with Java works really well. Exactly what I need.

Now I'm trying to bring it to iOS with Swift. I've got the framework set up correctly, and the CLuceneSearchService class initializes fine.

    func application(application: UIApplication, didFinishLaunchingWithOptions launchOptions: [NSObject: AnyObject]?) -> Bool {
        searchService = CLuceneSearchService(indexPath: NSBundle.mainBundle().resourcePath!.stringByAppendingString("/LuceneIndexes"))
        if (searchService as BRSearchService?) == nil {
            print("unable to initialize search database")
        }

        let results:BRSearchResults = searchService.search("piano")
        NSLog("Results: %i", results.count())

        return true
    }

When this code runs, the search always returns 0 results. I'm not sure if:

- CLuceneSearchService isn't actually reading the imported indexes (I've checked that the path is correct and that the files exist). How do I check how many documents CLuceneSearchService is looking at? IndexReader doesn't seem to be available.
- I need to recreate the indexes with iOS code rather than Java code, although I thought the whole point was that it wouldn't matter.

Any pointers on why this isn't working? The term "piano" should match many documents (134 hits in the Java code).

Thanks

p.s. Here's the Java code for the search:

    public static void searchIndex(String searchString) throws IOException, ParseException {
        System.out.println("Searching for '" + searchString + "'");
        Directory directory = FSDirectory.getDirectory(INDEX_DIRECTORY);
        IndexReader indexReader = IndexReader.open(directory);
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);

        Analyzer analyzer = new StopAnalyzer();

        QueryParser queryParser = new QueryParser(JSON_WORD_SEARCH, analyzer);
        Query query = queryParser.parse(searchString);
        Hits hits = indexSearcher.search(query);
        System.out.println("Number of hits: " + hits.length());

        Iterator<Hit> it = hits.iterator();
        while (it.hasNext()) {
            Hit hit = it.next();
            org.apache.lucene.document.Document document = hit.getDocument();
            String path = document.get(JSON_WORD_DISPLAY);
            System.out.println("Hit: " + path);
        }
    }

keehun · 2016-02-27T06:10:12Z

Ok--I think it has to do with the default field names that BRFullTextSearch has set. There's no easy way to change this if using CocoaPods--will recompile with a dependent project and use my own field keys and see if that'll work.

keehun · 2016-02-27T07:12:24Z

The problem actually was that the built-in search code was using the wrong analyzer for my indexes. It seems really weird that I have to edit the framework itself to change these settings. Can they be changed from my own code, leaving the library as-is? The snowball analyzer is hard-coded into [CLuceneSearchService defaultAnalyzer], and search: has defaultAnalyzer hard-coded in, so I'm guessing that it's not yet implemented. I propose these become settings that can be readily changed.

Once I changed this method to what I have here:

- (std::auto_ptr<Analyzer>)analyzerForLanguage:(NSString *)lang {
    std::auto_ptr<Analyzer> stopanalysis(new lucene::analysis::StopAnalyzer());
    return stopanalysis;
}

it worked perfectly. I also had to change the field keys. That seems like something that should definitely be in a setting/property.

I do get that Lucene and BRFullTextSearch were built to be used to both create and search documents (thus removing the need to set your own field keys or analyzer types). Creating an index elsewhere and importing it into a project wasn't the first consideration, so I understand these design choices.

msqr · 2016-02-28T18:34:05Z

Hi @keehun,

Some parts of this API have not been exposed for extensibility or configurability yet, as you have discovered. This is due more to time constraints than to oversight. In your case, the defaultAnalyzer method can be overridden by subclassing CLuceneSearchService to return the Analyzer instance you'd like to use.

The generalTextFields internal array should be a configurable property on CLuceneSearchService, but as a work-around you can avoid using kBRSearchFieldNameTitle or kBRSearchFieldNameValue field keys in your BRIndexable implementations. As you're creating the index outside of the app, you could just use the expected field names to match what CLuceneSearchService is written to support by default.

Thanks for your feedback!

msqr · 2016-02-28T18:43:38Z

@keehun I forgot to also mention, you can set

@property (nonatomic, getter=isStemmingDisabled) BOOL stemmingDisabled;

to YES to turn off stemming, which makes the Analyzer more similar to using a plain StopAnalyzer, that is, a StandardTokenizer along with a LowerCaseFilter and StopFilter.
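
As a rough illustration of why both the analyzer and the field key have to match between indexing and searching, here is a toy sketch in plain Java. Nothing in it is a Lucene, CLucene, or BRFullTextSearch API: the class, the "word" and "title" field names, and the one-line "stemmer" are all hypothetical stand-ins, and the stemmer is not the Snowball algorithm. It only shows the general effect: terms written by one analyzer are not found when the query is passed through a different one, or looked up under a field that was never indexed.

```java
import java.util.*;

// Toy model only: none of this is Lucene, CLucene, or BRFullTextSearch code.
class AnalyzerMismatchDemo {
    // Stand-in for a stop analyzer: lowercase, split on non-letters, drop stop words.
    static List<String> stopAnalyze(String text) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("a", "and", "of", "the"));
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Stand-in for a stemming analyzer: strips one trailing "s".
    // This is NOT the Snowball algorithm, just enough to change the terms.
    static List<String> stemAnalyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : stopAnalyze(text)) {
            boolean plural = token.length() > 3 && token.endsWith("s");
            terms.add(plural ? token.substring(0, token.length() - 1) : token);
        }
        return terms;
    }

    // Index two documents under the hypothetical field "word" with the stop analyzer.
    // Keys have the form "field:term"; values are the ids of matching documents.
    static Map<String, Set<Integer>> buildIndex() {
        Map<String, Set<Integer>> index = new HashMap<>();
        String[] docs = {"Grand pianos", "Upright pianos for sale"};
        for (int id = 0; id < docs.length; id++) {
            for (String term : stopAnalyze(docs[id])) {
                index.computeIfAbsent("word:" + term, k -> new HashSet<>()).add(id);
            }
        }
        return index;
    }

    // Count the documents that contain at least one of the query terms in the given field.
    static int hits(Map<String, Set<Integer>> index, String field, List<String> queryTerms) {
        Set<Integer> matched = new HashSet<>();
        for (String term : queryTerms) {
            matched.addAll(index.getOrDefault(field + ":" + term, Collections.emptySet()));
        }
        return matched.size();
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> index = buildIndex();
        System.out.println(hits(index, "word", stopAnalyze("pianos")));  // 2: same field, same analyzer
        System.out.println(hits(index, "word", stemAnalyze("pianos")));  // 0: query term became "piano"
        System.out.println(hits(index, "title", stopAnalyze("pianos"))); // 0: field was never indexed
    }
}
```

In this toy, only the first query finds anything. This matches the approach described in this thread: either build the index with the analyzer and field names the search service expects, or make the search service use the analyzer and field names the index was built with.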

keehun closed this as completed Feb 27, 2016
