Elasticsearch
Groovity-elasticsearch is a data-source implementation for the groovity-data module that allows an application to leverage Elasticsearch as a datastore for reading and/or writing data.
The Elasticsearch data source needs to know the Elasticsearch base URL; by default it looks for Elasticsearch on localhost at the default port.
es.baseUrl - defaults to http://localhost:9200/; configure a different value to override it
e.g. -Des.baseUrl=http://somewhere.else:9200/
This data source defaults to a 60 second timeout for connecting to and reading from Elasticsearch; you can override this with the http.timeout configuration variable
e.g. -Dhttp.timeout=120
or, to apply the timeout to the Elasticsearch data source only:
-D/data/sources/elasticsearch/http.timeout=240
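The relationship between the global and the source-scoped timeout can be pictured with a small sketch. The resolver below is hypothetical and is not Groovity's actual implementation; it assumes that a value scoped to /data/sources/elasticsearch takes precedence over a global one, and that the documented 60 second default applies when neither is set.

```groovy
// Illustrative only: a hypothetical resolver, not Groovity's actual lookup code.
// Assumption: the scoped key wins over the global key; 60 is the documented default.
def resolveTimeout = { Map<String, String> props ->
    def scoped = props['/data/sources/elasticsearch/http.timeout']
    def global = props['http.timeout']
    (scoped ?: global ?: '60').toInteger()
}

def onlyGlobal = resolveTimeout(['http.timeout': '120'])
def both = resolveTimeout([
    'http.timeout': '120',
    '/data/sources/elasticsearch/http.timeout': '240'
])
def neither = resolveTimeout([:])

println "global only: ${onlyGlobal}s, both set: ${both}s, neither set: ${neither}s"
```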
The Elasticsearch data source recognizes 4 special configuration options on a data type:
es.index - the name of the elasticsearch index or alias to query for the type
es.type - the name of the type to search for within the elasticsearch index
es.date - the name of a date field to be used to watch for data changes
es.dateFormat - the date format associated with the date field (not needed if date is represented in raw millis)
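As a rough illustration of es.dateFormat, the sketch below shows a configuration for a type whose date field holds formatted strings rather than raw millis. The index, type, and field names are made up, and the snippet assumes the format is a java.text.SimpleDateFormat pattern, which this document does not confirm; check the module before relying on it.

```groovy
import java.text.SimpleDateFormat

// Hypothetical configuration; in a Groovity data type script this map
// would be declared as `static conf` rather than a local variable.
def conf = [
    source          : 'elasticsearch',
    'es.index'      : 'orders',      // illustrative index name
    'es.type'       : 'order',       // illustrative type name
    'es.date'       : 'updatedAt',   // illustrative date field
    // Assumption: the value is a java.text.SimpleDateFormat pattern
    'es.dateFormat' : "yyyy-MM-dd'T'HH:mm:ss.SSSZ"
]

// Sanity check that the pattern parses the kind of string stored in the documents
def parsed = new SimpleDateFormat(conf['es.dateFormat']).parse('2020-01-15T10:30:00.000+0000')
println "parsed epoch millis: ${parsed.time}"
```

Testing the pattern against a sample value like this is a cheap way to catch a mismatch between the configured format and the stored strings.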
This module contains a single trait, IsElasticDoc, that can be applied to DataModels to automatically capture the _index, _type, _id and _version meta fields and ingest the document _source, as well as build the pointer for a newly created model.
For example, from the unit tests:
static conf=[
    source:'elasticsearch',
    ttl:60,
    refresh:45,
    'es.index':'unit_test_shoe_inventory',
    'es.type':'shoe',
    'es.date':'modified'
]

public class Shoe implements DataModel, Stored, IsElasticDoc{
    boolean mens
    boolean womens
    boolean kids
    int eyelets
    float size
    Date modified

    def setModified(Date d){
        this.modified = d
    }

    def setModified(Number n){
        this.modified = new Date(n.toLong())
    }
}

new Shoe()
To create and store a new shoe:
factory('shoe').putAll(
    mens:true,
    size:10.5,
    eyelets:6,
    modified: System.currentTimeMillis()
).store()
This module also comes with a default elasticsearch data type that can be used to perform ad-hoc reads and writes against Elasticsearch without defining a custom data type. For this to work you have to manually configure the _index and _type on a model:
load '/data/factory'
factory('elasticsearch').putAll(
    _index: 'myIndex',
    _type: 'myType',
    color: 'purple',
    size: 10
).store()
The elasticsearch module allows you to perform queries using native Elasticsearch query string syntax:
load '/data/factory'
factory('elasticsearch','myIndex/myType/_search?q=size:>8').each{
    //...
}
The results that come back from using the elasticsearch data type are plain maps. You can also use query string syntax with your custom types, and the results will be a list of DataModels of your type:
load '/data/factory'
factory('shoe','_search?q=size:>8&mens:true&sort=size:desc').each{
    // ...
}
Elasticsearch queries may return a mixture of types, and this will be reflected in the results. Let's say you have multiple custom data types configured in the same index and you want to search across all of them; you can define an additional data type that is attached to a specific index, but not to a specific type, and use it to query or watch against all types in that index.
public static conf = [
    source: 'elasticsearch',
    ttl: 30,
    refresh: 15,
    'es.index': 'content',
    'es.date': 'modified'
]

static contentWatcher

static start(){
    contentWatcher = load('/data/factory').watch('content'){ pointer ->
        //aggressively refresh cache entries for all content when modified
        load('/data/factory').refresh(pointer)
    }
}

static destroy(){
    contentWatcher?.cancel(true)
}

class Content implements DataModel, IsElasticDoc {
}

new Content()
Then to query you would call:
load '/data/factory'
factory('content','_search?q=weather').each{
    //...
}
As long as the type of each result in elasticsearch exactly corresponds to the groovity type name, the result list from the query will contain an appropriate mixture of your DataModel types based on the elasticsearch types. So it pays to make sure you use the same type names in groovity and elasticsearch. The Content class itself doesn't implement any domain fields as it is only used here as a placeholder for the other data types.
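Because a query against such an index-wide type can return several model classes in one list, calling code usually needs to branch on the concrete type. The sketch below shows one way to do that in plain Groovy. The Article and Video classes and the hand-built result list are stand-ins invented for illustration; real Groovity types would implement DataModel and IsElasticDoc, and the list would come from the factory call.

```groovy
// Stand-in classes for illustration only
class Article { String title }
class Video { String title; int seconds }

// Stand-in for the mixed list a query such as
// factory('content','_search?q=weather') might return
def results = [
    new Article(title: 'Storm warning'),
    new Video(title: 'Radar loop', seconds: 42),
    new Article(title: 'Heat wave')
]

// Group by concrete class so each type can be handled separately
def byType = results.groupBy { it.getClass().simpleName }
byType.each { typeName, items ->
    println "${typeName}: ${items*.title.join(', ')}"
}
```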
You can perform more complex elasticsearch queries, for example processing aggregations or highlights, by using the elasticsearch "source" parameter on the search query url, or by directly supplying a JSON format query in place of a URL format query string. If you use either of these mechanisms, the target data type must be prepared to ingest the raw elasticsearch response and explicitly process hits and/or aggregations.
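As a minimal sketch of the two mechanisms, the snippet below builds a JSON query with an aggregation, encodes it for use with the "source" URL parameter, and then walks a raw response for hits and aggregation buckets. The field names reuse the shoe example, the aggregation name by_eyelets is made up, and the response is a hand-written, abbreviated sample rather than real output. How a Groovity data type receives the raw response is not shown here; only the map traversal is illustrated. Depending on the Elasticsearch version, a source_content_type parameter may also be required alongside source.

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Query body: shoes larger than size 8, bucketed by eyelet count
def query = [
    query: [range: [size: [gt: 8]]],
    aggs : [by_eyelets: [terms: [field: 'eyelets']]],
    size : 5
]
def json = JsonOutput.toJson(query)

// URL form: pass the JSON through the "source" parameter, URL-encoded
def prefix = '_search?source='
def searchUrl = prefix + URLEncoder.encode(json, 'UTF-8')

// Hand-written, abbreviated sample of a raw search response
def raw = new JsonSlurper().parseText('''
{
  "hits": {"hits": [
    {"_type": "shoe", "_id": "1", "_source": {"size": 10.5, "eyelets": 6}},
    {"_type": "shoe", "_id": "2", "_source": {"size": 9.0,  "eyelets": 6}},
    {"_type": "shoe", "_id": "3", "_source": {"size": 12.0, "eyelets": 8}}
  ]},
  "aggregations": {"by_eyelets": {"buckets": [
    {"key": 6, "doc_count": 2},
    {"key": 8, "doc_count": 1}
  ]}}
}
''')

// Explicitly pull out the document sources and the bucket counts
def sources = raw.hits.hits.collect { it._source }
def counts = raw.aggregations.by_eyelets.buckets.collectEntries { [(it.key): it.doc_count] }

println searchUrl
println "documents: ${sources.size()}, counts by eyelets: ${counts}"
```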