{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using agate in a Jupyter notebook\n",
    "\n",
    "First we import agate. Then we create an agate Table by loading data from a CSV file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<agate.table.Table object at 0x10f091b00>\n"
     ]
    }
   ],
   "source": [
    "import agate\n",
    "\n",
    "table = agate.Table.from_csv('examples/realdata/ks_1033_data.csv')\n",
    "\n",
    "table"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Question 1: What was the total cost to Kansas City area counties?\n",
    "\n",
    "To answer this question, we first must filter the table to only those rows which refer to a Kansas City area `county`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1575\n",
      "15\n"
     ]
    }
   ],
   "source": [
    "kansas_city = table.where(lambda r: r['county'] in ('JACKSON', 'CLAY', 'CASS', 'PLATTE'))\n",
    "\n",
    "print(len(table.rows))\n",
    "print(len(kansas_city.rows))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can then print the Sum of the costs of all those rows. (The cost column is named `total_cost`.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "$3716\n"
     ]
    }
   ],
   "source": [
    "print('$%d' % kansas_city.aggregate(agate.Sum('total_cost')))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Question 2: Which counties spent the most?\n",
    "\n",
    "This question is more complicated. First we group the data by `county`, which gives us a TableSet named `counties`. A TableSet is a group of tables with the same columns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('ALLEN', 'ANDERSON', 'BARTON', 'BOURBON', 'BROWN', 'BUTLER', 'CHAUTAUQUA', 'CHEROKEE', 'CHEYENNE', 'CLAY', 'COFFEY', 'COMANCHE', 'COWLEY', 'CRAWFORD', 'DECATUR', 'DICKINSON', 'DOUGLAS', 'ELLIS', 'ELLSWORTH', 'FINNEY', 'FORD', 'FRANKLIN', 'GEARY', 'PRATT', 'GRAHAM', 'GRANT', 'GRAY', 'GREELEY', 'GREENWOOD', 'HAMILTON', 'HARVEY', 'HASKELL', 'JACKSON', 'JEFFERSON', 'JOHNSON', 'KIOWA', 'LABETTE', 'LEAVENWORTH', 'LINN', 'LOGAN', 'LYON', 'MARION', 'MARSHALL', 'MCPHERSON', 'MEADE', 'MIAMI', 'MONTGOMERY', 'NEMAHA', 'NEOSHO', 'NORTON', 'OSAGE', 'OTTAWA', 'PAWNEE', 'POTTAWATOMIE', 'RAWLINS', 'RENO', 'RICE', 'RILEY', 'ROOKS', 'SALINE', 'SCOTT', 'SEDGWICK', 'SHAWNEE', 'SHERMAN', 'SMITH', 'STAFFORD', 'SUMNER', 'THOMAS', 'WABAUNSEE', 'WICHITA', 'WILSON', 'WOODSON', 'WYANDOTTE')\n"
     ]
    }
   ],
   "source": [
    "# Group by county\n",
    "counties = table.group_by('county')\n",
    "\n",
    "print(counties.keys())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We then use the `aggregate` function to sum the `total_cost` column for each table in the group. The resulting values are collapsed into a new table, `totals`, which has a row for each county and a column named `total_cost_sum` containing the new total."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('county', 'total_cost_sum')\n"
     ]
    }
   ],
   "source": [
    "# Aggregate totals for all counties\n",
    "totals = counties.aggregate([\n",
    "    ('total_cost_sum', agate.Sum('total_cost'),)\n",
    "])\n",
    "\n",
    "print(totals.column_names)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we sort the counties by their total cost, limit the results to the top 10 and then print the results as a text bar chart."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "county     total_cost_sum\n",
      "SEDGWICK       977,174.45 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  \n",
      "COFFEY         691,749.03 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                       \n",
      "MONTGOMERY     447,581.20 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                                        \n",
      "JOHNSON        420,628.00 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                                          \n",
      "SALINE         245,450.24 ▓░░░░░░░░░░░░░░░░░░                                                       \n",
      "FINNEY         171,862.20 ▓░░░░░░░░░░░░░                                                            \n",
      "BROWN          145,254.96 ▓░░░░░░░░░░░                                                              \n",
      "KIOWA           97,974.00 ▓░░░░░░░                                                                  \n",
      "WILSON          74,747.10 ▓░░░░░                                                                    \n",
      "FORD            70,780.00 ▓░░░░░                                                                    \n",
      "GREENWOOD       69,722.00 ▓░░░░░                                                                    \n",
      "DOUGLAS         68,069.42 ▓░░░░░                                                                    \n",
      "MIAMI           64,691.09 ▓░░░░░                                                                    \n",
      "LYON            51,236.00 ▓░░░░                                                                     \n",
      "HAMILTON        47,989.00 ▓░░░░                                                                     \n",
      "WYANDOTTE       45,259.50 ▓░░░                                                                      \n",
      "CHEYENNE        35,970.92 ▓░░░                                                                      \n",
      "PRATT           35,745.60 ▓░░░                                                                      \n",
      "FRANKLIN        34,309.00 ▓░░░                                                                      \n",
      "HARVEY          27,387.42 ▓░░                                                                       \n",
      "                          +-----------------+-----------------+------------------+-----------------+\n",
      "                          0              250,000           500,000            750,000      1,000,000\n"
     ]
    }
   ],
   "source": [
    "totals.order_by('total_cost_sum', reverse=True).limit(20).print_bars('county', 'total_cost_sum', width=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}