-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathindex.html
145 lines (121 loc) · 9.72 KB
/
index.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="POTD Analysis">
<meta name="author" content="Yoav Zimmerman">
<title>Pizza of the Day</title>
<link href='http://fonts.googleapis.com/css?family=Maven+Pro:400,700' rel='stylesheet' type='text/css'>
<link rel="stylesheet" type="text/css" href="css/style.css">
</head>
<body>
<div class="container">
<h1>Pizza of the Day</h1>
<div class="section">
<h2>Introduction</h2>
<p>Everybody loves pizza. Recently, a few friends and I have been eating more pizza than we ever have in our lives. Let me explain: <a href="http://www.800degreespizza.com/">800 Degrees Pizza</a> is Westwood's beloved local "build-your-own" pizza spot. It is regarded by UCLA students as "our best pizza", which in a school with <a href="http://dailybruin.com/2015/01/28/pieology-to-take-a-slice-of-westwoods-competitive-pizza-market/">extremely stiff pizza competition</a>, is not an opinion to be taken lightly. Of course, the superior quality of ingredients will cost you more than your average Domino's pizza, anywhere from $10-15 for a meal.</p>
<p>Now here's where it gets interesting: Since July 2014, 800 Degrees has been offering a deal they call the <i>Instagram Special</i>. Every day, their <a href="https://instagram.com/800degreespizza/">instagram account</a> posts a picture of a pizza which will be <strong>50% off</strong> for the day, which usually ends up being somewhere in the range of $4-5. It didn't take me long to become a habitual checker of instagram every day between the hours of 10 AM to 12 PM, when the <i>POTD</i> was usually posted by the 800 Degrees instagram account. Some days you'd get your average brocollini and mushrooms on margherita. Other days, you'd hit the jackpot with bacon, rosemary ham, or my personal favorite, the legendary <a href="http://en.wikipedia.org/wiki/Peppadew">peppadew pepper</a> on Bianca.</p>
<p>I quickly became interested in how the POTD was picked every day. Who decided on the daily ingredients and what was their criteria for choosing them? Was a pattern behind the ingredients? Most importantly, was there some way to predict the pizza of the day before it was posted?</p>
<p><i>Disclaimer: I am not affiliated with 800 Degree's Pizza</i></p>
</div>
<div class="section">
<h2>Getting the Data</h2>
<p>Since the entire history of POTD is documented on social networks, getting the data consists of a few calls to the instagram API. From there, one can tokenize the description of each post and search for matches to the ingredients publicized on the <a href="http://www.800degreespizza.com/menu/">800 Degrees Menu</a>. A few special considerations have to be taken into account, such as ingredients that have identical words ("red onion" and "carmelized onion") and pluralization. This technique isn't perfect, but after some tweaking I was able to come up with a reasonably accurate history model of POTD.</p>
</div>
<div class="section">
<h2>Base Frequency</h2>
<p>Now that I had a full history of the pizza of the day, I could start generating some graphs to visualize the data. Using the <a href="http://www.chartjs.org/">ChartJS</a> library, it's easier than ever to generate nice looking graphs! First, let's examine the "base"- the cheese and sauce type that forms the core of the pizza. Only one can be chosen per pizza.</p>
<div id="base-overall">
<canvas class="graph"></canvas>
<div class="legend">
<div class="legend-scale"></div>
</div>
<p>Not too surprising. Marinara is the least favorite base amongst most 800 Degrees fans I know due to it's alarming lack of cheese.</p>
</div>
<div id="base-by-month">
<h3>Base Frequency (by month)</h3>
<canvas class="graph"></canvas>
<div class="legend">
<div class="legend-scale"></div>
</div>
<p>Most months have a similar base frequency distribution to the overall base frequency distribution. Note that since the instagram special deal has been around for less than a year, some months are missing and July has scarce data.</p>
</div>
<div id="base-by-weekday">
<h3>Base Frequency (by weekday)</h3>
<canvas class="graph"></canvas>
<div class="legend">
<div class="legend-scale"></div>
</div>
<p>Breaking down the base frequency by weekdays is a little more interesting. Tuesday has featured a Bianca base 50% of the time, as opposed to an overall ~26% frequency for Bianca.</p>
</div>
</div>
<div class="section">
<h2>Toppings</h2>
<p>Toppings make the pizza. Historically, a <i>POTD</i> can feature anywhere from one to four toppings. On the 800 Degree menu they're grouped into three categories: Proteins, Cheeses, and Vegetables. For now, let's lump all of those together.</p>
<div id="toppings-overall">
<h3>Topping Frequency</h3>
<canvas class="graph"></canvas>
<p>The above graph rules out a completely random algorithm to select the <i>POTD</i>, as clearly some ingredients are favored over others. My guess would be that these toppings are less popular among regular customers and 800 Degree's locations usually have a surplus.</p>
<p>Another interesting metric is to look at how often two ingredients are paired together or how often an ingredient is paired with a specific base. The following two graphs take a look at topping-topping and topping-base frequencies, respectively.</p>
</div>
<div id="topping-pairings">
<h3>Topping-Topping Pair Frequency</h3>
<canvas class="graph"></canvas>
<p>Mmmmm, I dream about chicken, carmelized onion, and sundried tomatoes on Bianca.</p>
</div>
<div id="base-topping-pairings">
<h3>Topping-Base Pair Frequency</h3>
<canvas class="graph"></canvas>
</div>
</div>
<div class="section">
<h2>Prediction</h2>
<p>Theoretically, one could take all this data and plug it into a classification algorithm to try and predict the pizza given the "context" of the day. The "context" for each day could consist of features like ingredients/base of the previous several POTD's, day of the week, and month.</p>
<p>But after experimenting with a few different machine learning algorithms, I wasn't able to find any prediction models of significant accuracy myself. My gut tells me there's not enough data points and selection is too random, but I welcome any experts to experiment with the <a href="data/pizzas.json">dataset</a>! If you find any success, make sure to <a href="http://yoavz.com">contact me</a>, I'd love to know.</p>
</div>
<div class="section">
<h2>Social Data</h2>
<p>I wasn't able to come up with a prediction model, but there's a few more visualizations that could be interesting to take a look at. Since all of the data is coming directly from instagram, we easily have access the likes/comments on each <i>POTD</i> instagram post. It is important to note that likes and comments may depend on many factors; for example, the instagram account of 800 degree's is continually growing in popularity, which leads to more recent pizza's achieving a higher like/comment count on average. Still, likes and comments may be a good metric for estimating general satisfaction with of the <i>POTD</i>.
<div id="base-likes">
<h3>Average Like/Comment Count by Base</h3>
<canvas class="graph"></canvas>
<div class="legend">
<div class="legend-scale"></div>
</div>
<p>The public has spoken: Verde is the "most-liked" base. I consider myself a Bianca man, but hey, I can enjoy a good Verde.</p>
</div>
<div id="ingredients-likes">
<h3>Average Like/Comment Count by Ingredient</h3>
<canvas class="graph"></canvas>
<div class="legend">
<div class="legend-scale"></div>
</div>
</div>
<div id="likes-by-weekday">
<h3>Average Likes/Comments by Weekday</h3>
<canvas class="graph"></canvas>
<div class="legend">
<div class="legend-scale"></div>
</div>
<p>It's interesting to look at how the day of the week affects the amount of likes and comments 800 Degree's gets on each post. The above graph implies that people are slightly more active earlier in the week, but is that because people use Instagram more during the week or because people actually are more interested in 800 Degrees pizza's during the week? Perhaps a little bit of both.
</div>
</div>
<div id="likes-by-time">
<h3>Time Posted (Minutes since 9 AM) against Likes/Comments</h3>
<canvas class="graph"></canvas>
<div class="legend">
<div class="legend-scale"></div>
</div>
<p>The final trend I was interested in was the time the post was made to the instagram amount. The above graph plots minutes since 9 AM on the X axis against Like/Comment count on the Y axis. 10 AM (60 on the X axis) seems to be go time for the 800 Degree's PR team, but sometimes they put it off until after noon. Those are the worst days.
</div>
<p><a href="http://github.com/yoavz/potd">Source</a></p>
<!-- Bootstrap core JavaScript
================================================== -->
<!-- Placed at the end of the document so the pages load faster -->
<script src="js/jquery-1.11.2.min.js"></script>
<script src="js/Chart.min.js"></script>
<script src="js/Chart.Scatter.js"></script>
<script src="js/main.js"></script>
</body>
</html>