-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathinstructors.html
152 lines (132 loc) · 9.46 KB
/
instructors.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<title>Software Carpentry: Lesson Title</title>
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap.css" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap-theme.css" />
<link rel="stylesheet" type="text/css" href="css/swc.css" />
<link rel="alternate" type="application/rss+xml" title="Software Carpentry Blog" href="http://software-carpentry.org/feed.xml"/>
<meta charset="UTF-8" />
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
</head>
<body class="lesson">
<div class="container card">
<div class="banner">
<a href="http://software-carpentry.org" title="Software Carpentry">
<img alt="Software Carpentry banner" src="img/software-carpentry-banner.png" />
</a>
</div>
<article>
<div class="row">
<div class="col-md-10 col-md-offset-1">
<h1 class="title">Lesson Title</h1>
<h2 class="subtitle">Instructor’s Guide</h2>
<h2 id="legend">Legend</h2>
<p>Relational databases are not as widely used in science as in business, but they are still a common way to store large data sets with complex structure. Even when the data itself isn’t in a database, the metadata could be: for example, meteorological data might be stored in files on disk, but data about when and where observations were made, data ranges, and so on could be in a database to make it easier for scientists to find what they want to.</p>
<h2 id="overall">Overall</h2>
<ul>
<li><p>The first few sections (up to “Missing Data”) usually go very quickly. The pace usually slows down a bit when null values are discussed mostly because learners have a lot of details to keep straight by this point. Things <em>really</em> slow down during the discussion of joins, but this is the key idea in the whole lesson: important ideas like primary keys and referential integrity only make sense once learners have seen how they’re used in joins. It’s worth going over things a couple of times if necessary (with lots of examples).</p></li>
<li><p>The sections on creating and modifying data, and programming with databases, can be dropped if time is short. Of the two, people seem to care most about how to add data (which only takes a few minutes to demonstrate).</p></li>
<li><p>Simple calculations are actually easier to do in a spreadsheet; the advantages of using a database become clear as soon as filtering and joins are needed. Instructors may therefore want to show a spreadsheet with the information from the four database tables consolidated into a single sheet, and demonstrate what’s needed in both systems to answer questions like, “What was the average radiation reading in 1931?”</p></li>
<li>Some advanced learners may have heard that NoSQL databases (i.e., ones that don’t use the relational model) are the next big thing, and ask why we’re not teaching those. The answers are:
<ol style="list-style-type: decimal">
<li>Relational databases are far more widely used than NoSQL databases.</li>
<li>We have far more experience with relational databases than with any other kind, so we have a better idea of what to teach and how to teach it.</li>
<li>NoSQL databases are as different from each other as they are from relational databases. Until a leader emerges, it isn’t clear <em>which</em> NoSQL database we should teach.</li>
</ol></li>
</ul>
<!-- No specific notes to add. Save these headers as place-holders for now
## [Selecting Data](01-select.html)
## [Sorting and Removing Duplicates](02-sort-dup.html)
## [Filtering](03-filter.html)
## [Calculating New Values](04-calc.html)
## [Missing Data](05-null.html)
## [Aggregation](06-agg.html)
## [Combining Data](07-join.html)
## [Data Hygiene](08-hygiene.html)
## [Creating and Modifying Data](09-create.html)
## [Programming with Databases](10-prog.html)
-->
<h2 id="time-estimates">Time Estimates</h2>
<ul>
<li><span class="citation">@tomwright01</span>: 3 hours</li>
<li><span class="citation">@mckays630</span>: 3 hrs (up to Aggregation, using only shell interface)</li>
</ul>
<h2 id="resources">Resources</h2>
<ul>
<li><code>code/gen-survey-database.sql</code>: re-generate survey database used in examples.</li>
<li><code>code/sqlitemagic.py</code>: IPython Notebook plugin to handle SQLite databases.</li>
<li><code>data/*.csv</code>: CSV versions of data in sample survey database.</li>
</ul>
<h2 id="sqlite-setup">SQLite Setup</h2>
<p>In order to execute the following lessons interactively, please install SQLite as mentioned in the setup instructions for your workshop, then create a directory <code>swc_sql</code>:</p>
<pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">mkdir</span> swc_sql
$ <span class="kw">cd</span> swc_sql</code></pre>
<p>Next, download the SQL file <code>survey.sql</code> from <a href="http://files.software-carpentry.org/survey.sql" class="uri">http://files.software-carpentry.org/survey.sql</a> and the database file <code>survey.db</code> from <a href="http://files.software-carpentry.org/survey.db" class="uri">http://files.software-carpentry.org/survey.db</a> and move both files to your <code>swc_sql</code> directory. When you are done, <code>ls</code> should show that they are present:</p>
<pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">ls</span>
<span class="kw">survey.db</span> survey.sql</code></pre>
<p>To test that your version of SQLite can read the database, run this command:</p>
<pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">sqlite3</span> survey.db .schema</code></pre>
<p>(You <em>must</em> include the period at the start of <code>.schema</code>.) The output should look like this:</p>
<pre class="sourceCode bash"><code class="sourceCode bash"><span class="kw">CREATE</span> TABLE Person(
<span class="kw">ident</span> text,
<span class="kw">personal</span> text,
<span class="kw">family</span> text
);
<span class="kw">CREATE</span> TABLE Site(
<span class="kw">name</span> text,
<span class="kw">lat</span> real,
<span class="kw">long</span> real
);
<span class="kw">CREATE</span> TABLE Visited(
<span class="kw">ident</span> integer,
<span class="kw">site</span> text,
<span class="kw">dated</span> text
);
<span class="kw">CREATE</span> TABLE Survey(
<span class="kw">taken</span> integer,
<span class="kw">person</span> text,
<span class="kw">quant</span> text,
<span class="kw">reading</span> real
);</code></pre>
<p>If there is no output, the most likely cause is that the database <code>survey.db</code> isn’t in the directory where you’re running the command. In this case, SQLite creates a new, empty database for you (just as a text editor creates a new, empty document if you ask it to open a file that doesn’t exist yet).</p>
<p>To run commands interactively, run SQLite on <code>survey.db</code> <em>without</em> the <code>.schema</code> command:</p>
<pre><code>$ sqlite3 survey.db
SQLite version 3.8.5 2014-08-15 22:37:57
Enter ".help" for usage hints.
sqlite></code></pre>
<p>As shown above, this should give you a prompt that looks like <code>sqlite></code>. You can now start typing commands, or, as the startup line suggests, type <code>.help</code> (again, with a period at the start) to get help.</p>
<p>Finally, you can check the names and files of attached databases with the command <code>.databases</code>:</p>
<pre class="sourceCode bash"><code class="sourceCode bash"><span class="kw">sqlite></span> .databases
<span class="kw">seq</span> name file
<span class="kw">---</span> --------------- ----------------------------------------------------------
<span class="kw">0</span> main /Users/alan_turing/swc_sql/survey.db</code></pre>
<p>and check that the necessary tables “Person”, “Survey”, “Site” and “Visited” exist by typing:</p>
<pre class="sourceCode bash"><code class="sourceCode bash"><span class="kw">sqlite></span> .tables
<span class="kw">Person</span> Site Survey Visited<span class="kw"><</span>/code<span class="kw">><</span>/pre<span class="kw">></span></code></pre>
<p>To exit SQLite and return to the Bash shell, use <code>.quit</code>:</p>
<pre class="sourceCode bash"><code class="sourceCode bash"><span class="kw">sqlite></span> .quit
$</code></pre>
<p>Note: There also instructions targeted at participants in the general <a href="discussion.html">discussion page</a></p>
</div>
</div>
</article>
<div class="footer">
<a class="label swc-blue-bg" href="http://software-carpentry.org">Software Carpentry</a>
<a class="label swc-blue-bg" href="https://github.com/swcarpentry/sql-novice-survey">Source</a>
<a class="label swc-blue-bg" href="mailto:[email protected]">Contact</a>
<a class="label swc-blue-bg" href="LICENSE.html">License</a>
</div>
</div>
<!-- Javascript placed at the end of the document so the pages load faster -->
<script src="http://software-carpentry.org/v5/js/jquery-1.9.1.min.js"></script>
<script src="css/bootstrap/bootstrap-js/bootstrap.js"></script>
</body>
</html>