Commit

Merge branch 'master' into bokeh_charts
philippjfr committed Dec 19, 2015
2 parents 1534e03 + 22e8b38 commit 39a81e0
Showing 39 changed files with 1,047 additions and 266 deletions.
90 changes: 90 additions & 0 deletions doc/Tutorials/Elements.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@
" <dt>[``Scatter``](#Scatter)</dt><dd>Discontinuous collection of points indexed over a single dimension.</dd>\n",
" <dt>[``Points``](#Points)</dt><dd>Discontinuous collection of points indexed over two dimensions.</dd>\n",
" <dt>[``VectorField``](#VectorField)</dt><dd>Cyclic variable (and optional auxiliary data) distributed over two-dimensional space.</dd>\n",
" <dt>[``Spikes``](#Spikes)</dt><dd>A collection of horizontal or vertical lines at various locations with fixed height (1D) or variable height (2D).</dd>\n",
" <dt>[``SideHistogram``](#SideHistogram)</dt><dd>Histogram binning data contained by some other ``Element``.</dd>\n",
" </dl>\n",
"\n",
Expand Down Expand Up @@ -461,6 +462,95 @@
"points + points[0.3:0.7, 0.3:0.7].hist()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### ``Spikes`` <a id='Spikes'></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Spikes represent any number of horizontal or vertical line segments with fixed or variable heights. This element has a number of uses: as a rug plot it gives an overview of a one-dimensional distribution, and in more domain-specific cases it can visualize spike trains in neurophysiology or spectra in physics and chemistry applications.\n",
"\n",
"In the simplest case, a Spikes object represents a 1D distribution:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%%opts Spikes (alpha=0.4)\n",
"xs = np.random.rand(50)\n",
"ys = np.random.rand(50)\n",
"hv.Points((xs, ys)) * hv.Spikes(xs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When supplying two dimensions to the Spikes object, the second dimension is mapped onto the line height. Optionally, you may also supply a ``cmap`` and ``color_index`` to map one of the dimensions to color. In this way we can, for example, plot a mass spectrum:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%%opts Spikes (cmap='Reds')\n",
"hv.Spikes(np.random.rand(20, 2), kdims=['Mass'], vdims=['Intensity'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Another possibility is to draw a number of spike trains, as you would encounter in neuroscience. Here we generate 10 separate random spike trains and distribute them evenly across the space by setting their ``position``. By also declaring some yticks, each spike train can be labeled individually:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%%opts Spikes NdOverlay [show_legend=False]\n",
"hv.NdOverlay({i: hv.Spikes(np.random.randint(0, 100, 10), kdims=['Time'])(plot=dict(position=0.1*i))\n",
" for i in range(10)})(plot=dict(yticks=[((i+1)*0.1-0.05, i) for i in range(10)]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally we may use ``Spikes`` to visualize marginal distributions as adjoined plots:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%%opts Spikes (alpha=0.05) [spike_length=0.5] AdjointLayout [border_size=0]\n",
"points = hv.Points(np.random.randn(500, 2))\n",
"points << hv.Spikes(points['y']) << hv.Spikes(points['x'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
Expand Down
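The ``Spikes`` examples above accept a plain 1D array because of a corresponding change to the data interfaces later in this commit: element types flagged with ``_1d = True`` keep their single column instead of having an integer x-index prepended. A minimal sketch of that reshape rule (``reshape_1d`` is a hypothetical helper for illustration, not HoloViews API):

```python
import numpy as np

def reshape_1d(data, is_1d):
    """Hypothetical mirror of the 1D reshape logic added in this commit."""
    if data.ndim == 1:
        if is_1d:
            # 1D element types (e.g. Spikes): keep the single column.
            return np.atleast_2d(data).T
        # Other element types: prepend an implicit integer x-dimension.
        return np.column_stack([np.arange(len(data)), data])
    return data

spikes_cols = reshape_1d(np.array([0.2, 0.5, 0.9]), is_1d=True)
curve_cols = reshape_1d(np.array([0.2, 0.5, 0.9]), is_1d=False)
print(spikes_cols.shape)  # (3, 1)
print(curve_cols.shape)   # (3, 2)
```

This is why `hv.Spikes(xs)` in the tutorial cell needs no explicit x-values, while a 1D `Curve` still gains an implicit index dimension.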
44 changes: 42 additions & 2 deletions doc/Tutorials/Options.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -228,7 +228,8 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
"collapsed": false,
"scrolled": false
},
"outputs": [],
"source": [
Expand Down Expand Up @@ -523,6 +524,45 @@
"\n",
"Here parentheses indicate style options, square brackets indicate plot options, and curly brackets indicate norm options (with ``+axiswise`` and ``+framewise`` indicating True for those values, and ``-axiswise`` and ``-framewise`` indicating False). Additional *target-specification*s and associated options of each type for that *target-specification* can be supplied at the end of this line. This ultra-concise syntax is used throughout the other tutorials, because it helps minimize the code needed to specify the plotting options, and helps make it very clear that these options are handled separately from the actual data.\n",
"\n",
"Here we demonstrate the concise syntax by customizing the style and plot options of the ``Curve`` in the layout:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%%opts Curve (color='r') [fontsize={'xlabel':15, 'ticks':8}] \n",
"layout"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The color of the curve has been changed to red, and the font sizes of the x-axis label and all the tick labels have been modified. ``fontsize`` is an important plot option; see the ``fontsize`` documentation above for more information about the available options.\n",
"\n",
"The ``%%opts`` magic is designed to allow incremental customization which explains why the curve in the cell above has retained the increased thickness specified earlier. To reset all the customizations that have been applied to an object, you can create a fresh, uncustomized copy as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"layout()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ``%opts`` \"line\" magic (with one ``%``) works just the same as the ``%%opts`` \"cell\" magic, but it changes the global default options for all future cells, allowing you to choose a new default colormap, line width, etc.\n",
"\n",
"Apart from its brevity, a big benefit of using the IPython magic syntax ``%%opts`` or ``%opts`` is that it is fully tab-completable. Each currently available option will be listed if you press ``<TAB>``, making it much easier to find the right parameter. You will still need to consult the full ``holoviews.help`` documentation (described above) for the type, allowable values, and documentation of each option, but tab completion should get you started and helps you see which options are available.\n",
Expand Down Expand Up @@ -600,7 +640,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.10"
"version": "2.7.11"
}
},
"nbformat": 4,
Expand Down
4 changes: 2 additions & 2 deletions doc/Tutorials/Pandas_Seaborn.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -408,8 +408,8 @@
"outputs": [],
"source": [
"%%opts Regression [apply_databounds=True]\n",
"tips.regression('total_bill', 'tip', mdims=['smoker','sex'],\n",
" extents=(0, 0, 50, 10), reduce_fn=np.mean).overlay('smoker').layout('sex')"
"tips.regression(['total_bill'], ['tip'], mdims=['smoker','sex'],\n",
" extents=(0, 0, 50, 10)).overlay('smoker').layout('sex')"
]
},
{
Expand Down
2 changes: 1 addition & 1 deletion doc/reference_data
Submodule reference_data updated 186 files
82 changes: 70 additions & 12 deletions holoviews/core/data.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@

import param

from .dimension import Dimension
from .dimension import Dimension, Dimensioned
from .element import Element, NdElement
from .dimension import OrderedDict as cyODict
from .ndmapping import NdMapping, item_check, sorted_context
Expand Down Expand Up @@ -52,6 +52,10 @@ class Columns(Element):
format listed will be used until a suitable format is found (or
the data fails to be understood).""")

# In the 1D case the interfaces should not automatically add x-values
# to supplied data
_1d = False

def __init__(self, data, **kwargs):
if isinstance(data, Element):
pvals = util.get_param_values(data)
Expand Down Expand Up @@ -82,6 +86,7 @@ def __setstate__(self, state):
elif util.is_dataframe(self.data):
self.interface = DFColumns

super(Columns, self).__setstate__(state)

def closest(self, coords):
"""
Expand Down Expand Up @@ -343,6 +348,18 @@ def dimension_values(self, dim, unique=False):
return dim_vals


def get_dimension_type(self, dim):
"""
        Returns the Dimension type if the Dimension declares one,
        otherwise the type inferred from the dimension values by
        the interface; None is returned if no type can be determined.
"""
dim_obj = self.get_dimension(dim)
if dim_obj and dim_obj.type is not None:
return dim_obj.type
return self.interface.dimension_type(self, dim)
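The per-interface ``dimension_type`` methods added below all reduce to a numpy dtype lookup on the backing data. A small sketch of the underlying idea (variable names are illustrative, not part of the API):

```python
import numpy as np

# A homogeneous numpy array reports one scalar type for every dimension...
array_data = np.array([[0.0, 0.5], [1.0, 1.5]])
array_type = array_data.dtype.type          # numpy.float64

# ...while dictionary-backed data can report a type per column.
dict_data = {'Mass': np.array([10, 20]),
             'Intensity': np.array([0.1, 0.9])}
mass_type = dict_data['Mass'].dtype.type        # an integer type, e.g. numpy.int64
intensity_type = dict_data['Intensity'].dtype.type  # numpy.float64
```

The dataframe interface works the same way, except it looks up the column's entry in ``DataFrame.dtypes`` first.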


def dframe(self, dimensions=None):
"""
Returns the data in the form of a DataFrame.
Expand All @@ -351,6 +368,7 @@ def dframe(self, dimensions=None):
dimensions = [self.get_dimension(d).name for d in dimensions]
return self.interface.dframe(self, dimensions)


def columns(self, dimensions=None):
if dimensions is None: dimensions = self.dimensions()
dimensions = [self.get_dimension(d) for d in dimensions]
Expand Down Expand Up @@ -417,7 +435,8 @@ def initialize(cls, eltype, data, kdims, vdims, datatype=None):
# Set interface priority order
if datatype is None:
datatype = eltype.datatype
prioritized = [cls.interfaces[p] for p in datatype]
prioritized = [cls.interfaces[p] for p in datatype
if p in cls.interfaces]

head = [intfc for intfc in prioritized if type(data) in intfc.types]
if head:
Expand Down Expand Up @@ -571,7 +590,10 @@ def reshape(cls, eltype, data, kdims, vdims):
data = tuple(data.get(d) for d in dimensions)
elif isinstance(data, np.ndarray):
if data.ndim == 1:
data = (np.arange(len(data)), data)
if eltype._1d:
data = np.atleast_2d(data).T
else:
data = (np.arange(len(data)), data)
else:
data = tuple(data[:, i] for i in range(data.shape[1]))
elif isinstance(data, list) and np.isscalar(data[0]):
Expand All @@ -593,6 +615,9 @@ def reshape(cls, eltype, data, kdims, vdims):
            raise ValueError("NdColumns interface couldn't convert data.")
return data, kdims, vdims

@classmethod
def dimension_type(cls, columns, dim):
return Dimensioned.get_dimension_type(columns, dim)

@classmethod
def shape(cls, columns):
Expand Down Expand Up @@ -661,6 +686,12 @@ class DFColumns(DataColumns):

datatype = 'dataframe'

@classmethod
def dimension_type(cls, columns, dim):
name = columns.get_dimension(dim).name
idx = list(columns.data.columns).index(name)
return columns.data.dtypes[idx].type

@classmethod
def reshape(cls, eltype, data, kdims, vdims):
element_params = eltype.params()
Expand Down Expand Up @@ -693,7 +724,10 @@ def reshape(cls, eltype, data, kdims, vdims):
data = OrderedDict(((c, col) for c, col in zip(columns, column_data)))
elif isinstance(data, np.ndarray):
if data.ndim == 1:
data = (range(len(data)), data)
if eltype._1d:
data = np.atleast_2d(data).T
else:
data = (range(len(data)), data)
else:
data = tuple(data[:, i] for i in range(data.shape[1]))

Expand Down Expand Up @@ -812,11 +846,13 @@ def values(cls, columns, dim):
@classmethod
def sample(cls, columns, samples=[]):
data = columns.data
mask = np.zeros(cls.length(columns), dtype=bool)
mask = False
for sample in samples:
sample_mask = True
if np.isscalar(sample): sample = [sample]
for i, v in enumerate(sample):
mask = np.logical_or(mask, data.iloc[:, i]==v)
sample_mask = np.logical_and(sample_mask, data.iloc[:, i]==v)
mask |= sample_mask
return data[mask]
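The change to ``sample`` above fixes the mask logic: all conditions within a single sample must hold together (logical AND), while any matching sample selects a row (logical OR). Previously each value was OR'd directly into the global mask, so multi-column samples matched too many rows. A standalone sketch of the corrected behavior (``sample_mask`` is a hypothetical helper, not HoloViews API):

```python
import numpy as np

def sample_mask(data, samples):
    """Rows match if ALL values of ANY sample match (AND within, OR across)."""
    mask = np.zeros(len(data), dtype=bool)
    for sample in samples:
        if np.isscalar(sample):
            sample = [sample]
        smask = np.ones(len(data), dtype=bool)
        for i, v in enumerate(sample):
            smask &= data[:, i] == v  # AND conditions within one sample
        mask |= smask                 # OR across samples
    return mask

data = np.array([[0, 10], [1, 20], [0, 30]])
# Select rows where col0==0 AND col1==30, or where col0==1:
m = sample_mask(data, [(0, 30), (1,)])
# data[m] keeps rows [1, 20] and [0, 30]
```

The same fix is applied to all three interfaces (``DFColumns``, ``ArrayColumns``, and ``DictColumns``) in this commit.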


Expand All @@ -842,6 +878,10 @@ class ArrayColumns(DataColumns):

datatype = 'array'

@classmethod
def dimension_type(cls, columns, dim):
return columns.data.dtype.type

@classmethod
def reshape(cls, eltype, data, kdims, vdims):
if kdims is None:
Expand Down Expand Up @@ -874,7 +914,10 @@ def reshape(cls, eltype, data, kdims, vdims):
if data is None or data.ndim > 2 or data.dtype.kind in ['S', 'U', 'O']:
raise ValueError("ArrayColumns interface could not handle input type.")
elif data.ndim == 1:
data = np.column_stack([np.arange(len(data)), data])
if eltype._1d:
data = np.atleast_2d(data).T
else:
data = np.column_stack([np.arange(len(data)), data])

if kdims is None:
kdims = eltype.kdims
Expand Down Expand Up @@ -995,9 +1038,12 @@ def sample(cls, columns, samples=[]):
data = columns.data
mask = False
for sample in samples:
sample_mask = True
if np.isscalar(sample): sample = [sample]
for i, v in enumerate(sample):
mask |= data[:, i]==v
sample_mask &= data[:, i]==v
mask |= sample_mask

return data[mask]


Expand Down Expand Up @@ -1040,6 +1086,11 @@ class DictColumns(DataColumns):

datatype = 'dictionary'

@classmethod
def dimension_type(cls, columns, dim):
name = columns.get_dimension(dim).name
return columns.data[name].dtype.type

@classmethod
def reshape(cls, eltype, data, kdims, vdims):
odict_types = (OrderedDict, cyODict)
Expand All @@ -1057,7 +1108,10 @@ def reshape(cls, eltype, data, kdims, vdims):
data = {d: data[d] for d in dimensions}
elif isinstance(data, np.ndarray):
if data.ndim == 1:
data = np.column_stack([np.arange(len(data)), data])
if eltype._1d:
data = np.atleast_2d(data).T
else:
data = np.column_stack([np.arange(len(data)), data])
data = {k: data[:,i] for i,k in enumerate(dimensions)}
elif isinstance(data, list) and np.isscalar(data[0]):
data = {dimensions[0]: np.arange(len(data)), dimensions[1]: data}
Expand Down Expand Up @@ -1195,13 +1249,17 @@ def select(cls, columns, selection_mask=None, **selection):

@classmethod
def sample(cls, columns, samples=[]):
mask = np.zeros(len(columns), dtype=np.bool)
mask = False
for sample in samples:
sample_mask = True
if np.isscalar(sample): sample = [sample]
for i, v in enumerate(sample):
name = columns.get_dimension(i).name
mask |= (np.array(columns.data[name])==v)
return {k:np.array(col)[mask] for k, col in columns.data.items()}
sample_mask &= (np.array(columns.data[name])==v)
mask |= sample_mask
return {k: np.array(col)[mask]
for k, col in columns.data.items()}


@classmethod
def aggregate(cls, columns, kdims, function, **kwargs):
Expand Down