Heatmap Details
Converting the Solr response to a map layer is straightforward. The heatmap works better if the count data is binned into classes. At least two independent groups have used Jenks classifications and computed them with the geostats JavaScript library. From Wikipedia:
The Jenks optimization method, also called the Jenks natural breaks classification method, is a data clustering method designed to determine the best arrangement of values into different classes. This is done by seeking to minimize each class’s average deviation from the class mean, while maximizing each class’s deviation from the means of the other groups. In other words, the method seeks to reduce the variance within classes and maximize the variance between classes.
The number of classes may be based on what looks best with your data or, more simply, on the number of colors in the color map you want to use.
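As a rough illustration of the binning step, the sketch below maps a count to a class and a color once the class boundaries are known. It does not compute Jenks breaks itself. It assumes the boundaries arrive as an ascending array (lowest value, inner breaks, highest value), which is the shape we would expect from a classification routine such as the one in geostats. The names classIndex and colorForCount, the break values and the colors are all hypothetical.

```javascript
// Hypothetical helper: return the zero-based class for a count, given
// ascending class boundaries [min, b1, ..., max].
// A count equal to a boundary falls into the lower class.
function classIndex(count, breaks) {
  for (var i = 1; i < breaks.length - 1; i++) {
    if (count <= breaks[i]) {
      return i - 1;
    }
  }
  // Anything above the last inner break belongs to the top class.
  return breaks.length - 2;
}

// Hypothetical helper: pick the color for a count. The color map must
// contain one color per class, i.e. breaks.length - 1 entries.
function colorForCount(count, breaks, colors) {
  return colors[classIndex(count, breaks)];
}

// Illustrative values only: three classes, three colors.
var breaks = [1, 5, 20, 100];
var colors = ["#ffffb2", "#fd8d3c", "#bd0026"];
```

With these illustrative values, a cell holding 3 items gets the first color and a cell holding 50 items gets the last one. Choosing the number of classes to match the number of colors keeps this lookup trivial.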
Once classified, the counts can be rendered on a map. This library uses OpenLayers 2.x as well as the HeatmapLayer library from Bjoern Hoehrmann. Essentially, each non-zero count is used to create a HeatmapSource, which is then added to the HeatmapLayer. The size of the HeatmapSource object in pixels is based on the grid size of the Solr data and the number of pixels on the map.
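The following is a minimal sketch of how non-zero counts could be turned into point sources with a pixel size. It is not the code from this library. It assumes the Solr heatmap facet has already been unpacked into an object with columns, rows, minX, maxX, minY, maxY and counts_ints2D, that the first row of counts_ints2D is the northernmost, and that all-zero rows may be null. The function name heatmapCells, the mapWidthPx and mapWidthDeg parameters and the radius formula (half a cell width in pixels) are assumptions made for illustration.

```javascript
// Hypothetical helper: list the center, count and pixel radius of every
// non-zero cell in a Solr heatmap grid.
//   hm          - unpacked heatmap facet (see assumptions above)
//   mapWidthPx  - width of the visible map in pixels
//   mapWidthDeg - width of the visible map extent in degrees of longitude
function heatmapCells(hm, mapWidthPx, mapWidthDeg) {
  var cellWidth = (hm.maxX - hm.minX) / hm.columns;   // degrees per column
  var cellHeight = (hm.maxY - hm.minY) / hm.rows;     // degrees per row
  // Assumption: draw each source with a radius of half a cell, in pixels.
  var radiusPx = (cellWidth / mapWidthDeg) * mapWidthPx / 2;
  var cells = [];

  for (var row = 0; row < hm.rows; row++) {
    var counts = hm.counts_ints2D[row];
    if (!counts) {
      continue; // a null row means every count in it is zero
    }
    for (var col = 0; col < hm.columns; col++) {
      if (counts[col] > 0) {
        cells.push({
          lon: hm.minX + (col + 0.5) * cellWidth,
          lat: hm.maxY - (row + 0.5) * cellHeight, // rows run north to south
          count: counts[col],
          radiusPx: radiusPx
        });
      }
    }
  }
  return cells;
}
```

Each returned entry carries what a heatmap source needs: a location at the center of the cell, an intensity derived from the count, and a size in pixels.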
When the user clicks on the time slider, selects an item format or enters a keyword, the function web/js/app/views/collections/jda.view.collection.results.js:search is called with an event and a reset flag. The search function calls fetch and web/js/app/collections/jda.collection.item.js:setSearch on this.collection. The code updates an instance of Browser.Items.Collection (jda.view.collection.results.js) which calls the load function in Browser.Items.MapCollection (jda.collection.item.js). This load function can access the UI search info via jda.app.resultsView.getSearch() and generate a request to Solr.
We are using Solr heatmaps to replace the GPU-based map tiles. Those map tiles contained a red icon for each item. Clicking on the map calls up a list of items near the mouse event. The GPU-based version of JDA sends a getFeatureInfo request and puts the list of nearby items in a pop-up window. This request includes a SQL query and looks like:
http://geops.cga.harvard.edu/getFeatureInfo&_SQL=select id from jda where dist(point(media_geo_longitude,media_geo_latitude),point(140.45471209375,37.756601563905)) < 2.746582031260301 LIMIT 50&COLUMNS=all
The request goes to the geops (GPU) server. The point is the location of the mouse event. The request returns at most 50 items. The distance (in this case, 2.74...) changes with zoom level. It is computed with:
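// Current map extent, in the map's own projection.
bounds = jda.app.eventMap.map.getExtent();
// Convert the extent to longitude/latitude (EPSG:4326).
latLngBounds = bounds.transform(jda.app.eventMap.map.getProjectionObject(), new OpenLayers.Projection("EPSG:4326"));
// Scale the visible width in degrees by 1000 / window width in pixels, so the distance shrinks as the user zooms in.
dist = 1000 * Math.abs(latLngBounds.left - latLngBounds.right) / window.innerWidth;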
Here is a sample response (text removed and simplified):
{"n":1,"results": [{"description":"...","display_name":"Hypercities","goog_x":15495339.15782527,"goog_y":4924635.272722726,"id":"391836","layer_type":"Tweet","media_creator_realname":"Unknown","media_creator_username":"skobl_fumiya","media_date_created":1300332945,"media_geo_latitude":40.40270,"media_geo_longitude":139.1970,"media_type":"Tweet","text":"...","thumbnail_url":"http://a0.twimg.com/profile_images/1238183171/skobl_fumiya_normal.jpg","title":"","uri":"48256182158229504","username":"hypercities"}]}
Removing the geops server means the JDA client must obtain item data in a new way. Since this data is in Solr, we will use it as the source. The required distance calculation can use the new field of type RPT and the Solr geofilt spatial filter.
Code in jda.collection.item.js receives the mouse click event on the map and generates a Solr request. Essentially, we want to return the items that are within a few dozen pixels of the click event. However, the Solr query expects a point expressed as latitude, longitude and a distance in kilometers. These values are computed from the mouse click and the map projection. Solr sorts the returned items based on their distance from the mouse click.
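A rough sketch of that conversion is shown below. It is not the code in jda.collection.item.js. It assumes the click has already been converted to latitude and longitude (for example with OpenLayers), that one degree of latitude is about 111.32 km, and that a degree of longitude shrinks with the cosine of the latitude. The helper names pixelsToKm and nearbyItemsParams, and the field name passed to them, are hypothetical. The parameters themselves (fq={!geofilt}, sfield, pt, d and a geodist() sort) follow Solr's documented spatial search syntax.

```javascript
// Approximate length of one degree of latitude, in kilometers.
var KM_PER_DEGREE = 111.32;

// Hypothetical helper: convert a radius in screen pixels to kilometers.
//   lonDegreesPerPixel - visible width in degrees / map width in pixels
//   lat                - latitude of the click, in degrees
function pixelsToKm(pixels, lonDegreesPerPixel, lat) {
  var latRadians = lat * Math.PI / 180;
  return pixels * lonDegreesPerPixel * KM_PER_DEGREE * Math.cos(latRadians);
}

// Hypothetical helper: build Solr parameters that return the items within
// radiusKm of the click, nearest first.
function nearbyItemsParams(lat, lon, radiusKm, field, rows) {
  return {
    q: "*:*",
    fq: "{!geofilt}",        // keep only items inside the circle
    sfield: field,           // name of the spatial (RPT) field, an assumption
    pt: lat + "," + lon,     // Solr expects latitude first
    d: String(radiusKm),     // radius in kilometers
    sort: "geodist() asc",   // closest items first
    rows: String(rows)
  };
}
```

For example, nearbyItemsParams(37.7566, 140.4547, pixelsToKm(30, 0.01, 37.7566), "hypothetical_rpt_field", 50) describes a request for up to 50 items around a click near the coordinates used in the old getFeatureInfo example.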
The distance we are using for heatmaps is larger than in the previous solution. That is, if you click on the map you will get items from farther away. This is needed because the heatmap elements on the map are offset from the actual location of the items. Solr heatmaps generate counts based on cells. For example, when a cell has one item, that heatmap feature is rendered at the center of the cell, not at the item's location. Clicking on the map returns a list of nearby items. "Nearby items" is defined as those items whose distance is less than a threshold. So, we make this radius larger than before to ensure the offset doesn't cause items to be excluded.
It would be nice if rendering the heatmap were quicker. I used the Chrome developer tools to see where time was being spent. In one test, it took 270ms between the time the Solr request returned and the time the heatmap was displayed. 3ms was browser overhead; the remaining 267ms was in the success function. Of that, most of the time was in the Jenks classification library (77ms) and the OpenLayers libraries (addLayer 76ms, redraw 81ms). So 234ms, or 87.6% of the success function's time, is spent computing the Jenks classification and dealing with OpenLayers. Less than 14% of the CPU time is spent on making heatmap cells or executing code I wrote. Significant performance improvements can come from using OpenLayers more efficiently or creating the classifications more efficiently.
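The percentages above can be reproduced from the reported timings. The snippet below only restates that arithmetic; the variable names and the percentOf helper are made up for illustration.

```javascript
// Timings reported above, in milliseconds.
var timings = { jenks: 77, addLayer: 76, redraw: 81 };
var successTotal = 267; // time spent in the success function

// Hypothetical helper: percentage of total, rounded to one decimal place.
function percentOf(part, total) {
  return Math.round((part / total) * 1000) / 10;
}

// Time spent in the Jenks library and in OpenLayers.
var libraryTime = timings.jenks + timings.addLayer + timings.redraw;

// Share of the success function spent in those libraries, and the remainder.
var libraryShare = percentOf(libraryTime, successTotal);
var remainingShare = percentOf(successTotal - libraryTime, successTotal);
```

Here libraryTime is 234ms and libraryShare is 87.6, which leaves about 12.4% for everything else in the success function.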