-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathCodecademyNotes
579 lines (481 loc) · 18.8 KB
/
CodecademyNotes
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
Data modeling as a practice is the process of developing and choosing a way to structure our data and its relationships. A data model is like a blueprint for our data.
Embedded documents
One way to establish a relationship in MongoDB is to use embedded documents. This method allows us to nest data related to a document directly inside of
it! These nested documents are called sub-documents. We already saw an example of this style when we looked at Model B of the photography database
(feel free to pause and take a look again). Here is our car and engine example, modeled with an embedded document:
// Car Document
{
car_id: 48273
model_name: "Corvette",
engine: {
engine_power: 490,
engine_type: "V8",
acceleration: "High"
}
}
In the above example, notice how the engine data is nested inside the car document. This type of data model where we find related data lumped
together into a single collection is known as a denormalized data model.
Additionally, the following scenarios are good use-cases for embedded documents:
Modeling relationships where one entity contains another, also known as a one-to-one relationship. For example, we can think of a database
storing data with a relationship between a car and its unique license plate.
Each record of a car has only one license plate. Modeling relationships that map one entity to many sub-entities, also known as an
one-to-many relationship. For example, we can think of a database storing data with a relationship between a car owner and their multiple-owned cars.
Each record of a car owner can own multiple instances of a car.
References
Links between data. Using references, we can split our data into multiple documents and maintain their relationships.
//Car Document
{
car_id: 48273
model_name: "Corvette",
engine_id: 2165
}
// Engine Document
{
id: 2165
engine_power: 490,
engine_type: "V8",
acceleration: "High"
}
This type of data model where we find related data via a link is known as a normalized data model and typically mimics how a
relational database creates relationships between data.
Additionally, the following scenario is a good use case for references:
Modeling relationships where many instances of one entity can be mapped to many instances of another entity, also
known as many-to-many relationships. For example, we can think of the relationship between car rentals and individuals
renting the cars. A car can be rented by multiple individuals, and an individual can rent multiple cars.
Choosing The Right Model
Choosing between references and embedded documents can be tricky. Let’s consider three cases where we have to choose between an
embedded or a reference-based model. For each case, try to first consider whether we would use references or embedded documents to
model the relationships between the data.
Case A: A time management application that stores one user per task. We want to store details about the task, such as the task name,
the task due date, and the user assigned to the task (and their associated details). There can only be one person assigned to each task.
In this case, since we can only have one person assigned to each task, we are modeling a one-to-one relationship. Assuming we will often
need the data for the user in addition to the task details, it might be preferable to use an embedded document since the information can be
retrieved together in one query. Here is what our document might look using an embedded document:
{
task: "Review Rough Outline",
due: "2022/09/03 09:00",
assignee: {
name: "Alex",
role: "SME",
contact: "[email protected]"
}
}
Case B: A contact information management application that can store multiple addresses per user. The application would store
important details for the person such as their name, as well as their associated addresses.
In this case, since a single user can have multiple associated addresses, we are modeling a one-to-many relationship.
There are two distinct ways we can implement the data model.
First, we could use embedded documents. Our documents might look something like this:
{
first_name: "Alex",
last_name: "Smith",
addresses: [
{ label: "home", value: { address: "123 4th Street", city: "Toronto", region: "ON", country: "Canada", postcode: "M1S 2J8" } },
{ label: "work", value: { address: "1633 Broadway 38th floor", city: "New York", region: "NY", country: "USA", postcode: "10019" } },
"
]
}
The embedded documents model above would be preferable if we are making many queries on the addresses field since we would only
need to query a single document to receive our results.
where we notice that embedded documents are starting to repeat, we may consider using references instead. By using references, our
collection could look like this instead:
// Addresses Collection
{
_id: 123456789
address: {
street: "123 4th Street",
city: "Toronto",
region: "ON",
country: "Canada",
postcode: "M1S 2J8"
}
},
{
_id: 9292944,
address: {
street: "1633 Broadway 38th floor",
city: "New York",
region: "NY",
country: "USA",
postcode: "10019"
}
}
// User Collection
{
first_name: "Alex",
last_name: "Smith"
addresses: [ 123456789, 9292944 ]
},
{
first_name: "Josh",
last_name: "Gold",
addresses: [ 123456789 ]
},
{
first_name: "Timbo",
last_name: "Gray",
addresses: [ 123456789 ]
}
Notice how the user documents include an array field named addresses that stores the unique id of the address document thus avoiding the
repetition of each individual address. While the above example shows one case where we might use references, its important to note that
we need to consider how frequently we are accessing the address information and how large it is before making a data modeling decision.
Case C: A school registration application that manages multiple students. Each student can be in multiple classes. Each class record
can easily identify which students are registered and each student record can quickly find any associated classes.
In this application, we have many students sharing a relationship with many classes. Here, we need to model a many-to-many relationship.
In this case, references might be preferred. Our collections would look similar to this:
// Students Collection
{
_id: 1,
name: "Alex",
average_grade: 3.9,
course_ids: [ 1, 2, 4 ]
},
{
_id: 2,
name: "Bob",
average_grade: 2.4,
course_ids: [ 3, 4 ]
}
// Classes Collection
{
_id: 1,
name: "Intro to MongoDB",
student_ids: [ 1 ]
},
{
_id: 2,
name: "Programming 101",
student_ids: [ 1 ]
},
{
_id: 3,
name: "Networking Concepts",
student_ids: [ 2 ]
},
{
_id: 4,
name: "Understanding Distributed Systems",
student_ids: [ 1, 2 ]
}
BROWSING AND SELECTING COLLECTIONS
command: show dbs
Will output a list of all the databases in our current instance and the disk space each takes up
command: use <db>
example: use online_plant_shop
This would place us inside of the database we have chosen. If the database we specify does not exist MongoDB will create it and place us inside
of it.
command: db
Will output the name of the database we are currently in
command: show collections
Will show you all of the collections that exist
QUERYING
Persistence refers to a database's ability to store data that is stable and enduring. Four essential functions that a persistent database must be able to
perform:
1.) create new data entries
2.) read data entries
3.) update data entries
4.) delete data entries
Querying is the process by which we request data from the database. The most common way to query data in MongoDB is to use the .find() method
.find()
SYNTAX: db.<collection>.find()
MongoDB will return up to the first set of matching documents. If we want to see the next batch of documents, we use the it keyword, short for iterate
command: it
QUERYING COLLECTIONS
If we are looking for a specific document or set of documents we can pass a query to the .find() method as its first argument inside of the
parenthesis. With the query argument, we can list selection criteria and only return documents in the collection that match those specifications
The query argument is formatted as a document with field-value pairs that we want to match
Example syntax:
db.<collection>.find(
{
<field>: <value>,
<second_field>: <value>
...
}
);
{
maker: "Honda",
country: "Japan",
models: [
{ name: "Accord" },
{ name: "Civic" },
{ name: "Pilot },
…
]
},
{
maker: "Toyota",
country: "Japan",
models: [
{ name: "4Runner" },
{ name: "Corolla" },
{ name: "Rav4" },
…
]
},
{
maker: "Ford",
country: "USA",
models: [
{ name: "F-150" },
{ name: "Bronco"},
{ name: "Escape"},
…
]
}
Imagine we wanted to query this collection to find all of the vehicles that are manufactured in "Japan".
We could use the .find() command with a query, like so:
db.auto_makers.find({ country: "Japan" });
This would output the following documents from our collection:
{
maker: "Honda",
country: "Japan",
models: [
{ name: "Accord" },
{ name: "Civic" },
{ name: "Pilot },
…
]
},
{
maker: "Toyota",
country: "Japan",
models: [
{ name: "4Runner" },
{ name: "Corolla" },
{ name: "Rav4" },
…
]
}
Note: Query fields and their associated values are case and space sensitive.
So, a query for a value "Corolla" would not be valid for a lowercase version like "corolla".
This also applies if we accidentally included spaces. So, " corolla" would also not be valid if the value was "corolla".
Under the hood, .find() is actually using an operator to find matches to our query. Operators are special syntax that specifies some logical action
we want to perform when our method executes.
.find() uses the implicit equality operator, $eq
$eq matches documents that include the specified field and value
If we wanted to explicitly include it, we could with the following field-value pair:
{
<field>: { $eq: <value> }
}
This is the equivalent of using the format seen in the first example:
{
<field>: <value>
}
Fortunately, MongoDB handles implicit equality for us, so we can simply use the shorthand syntax for basic queries.
In the upcoming exercises, we’ll learn about other operators that we can use to specify ranges and other criteria for matching documents in our queries.
db.listingsAndReviews.find({ borough: "Brooklyn", cuisine: "Caribbean" })
db.listingsAndReviews.find({ borough: "Brooklyn" })
QUERYING EMBEDDED DOCUMENTS
MongoDB lets us embed documents directly within a parent document. These nested documents are known as sub-documents and help us establish relationships
between our data
{
maker: "Honda",
country: "Japan",
models: [
{ name: "Accord" },
{ name: "Civic" },
{ name: "Pilot" },
…
]
},
models would be considered embedded
We can use the .find() method to query these by using dot notation(.) to access the embedded field.
Looking at the syntax:
db.<collection>.find(
{
"<parent_field>.<embedded_field>": <value>
}
)
Note two important parts of the syntax:
To query embedded documents, we must use a parent field (the name of the field wrapping the embedded document), followed by the dot (.) notation, and the embedded field we are looking for.
To query embedded documents, we must wrap the parent and embedded fields in quotation marks.
If we wanted to find the document with "Pilot" listed as a model, we would write the following command:
db.auto_makers.find({ "models.name" : "Pilot" })
This query would return the following document from our collection:
{
maker: "Honda",
country: "Japan",
models: [
{ name: "Accord" },
{ name: "Civic" },
{ name: "Pilot" },
…
]
}
restaurants> db.listingsAndReviews.find({ "address.zipcode" : "11231" })
When querying numbers you need to put them in double quotes too
COMPARISON OPERATORS: $gt and $lt
The greater than operator, $gt, is used in queries to match documents where the value for a particular field is greater than a specified value. Let’s have a look at the syntax for the $gt operator:
db.<collection>.find( { <field>: { $gt: <value> } } )
To see the $gt operator in action, consider the following collection of US National Parks:
{
name: "Yosemite National Park",
state: "California",
founded: 1890
},
{
name: Crater Lake National Park,
state: "Oregon",
founded: 1902
},
{
name: "Mesa Verde National Park",
state: "Colorado",
founded: 1906
},
{
name: "Olympic National Park",
state: "Washington",
founded: 1909
},
…
To find all parks that were founded after the year 1900, we could execute the following query:
db.national_parks.find({ founded: { $gt: 1900 }});
This would return documents where founded is 1901 or greater:
{
name: Crater Lake National Park,
state: "Oregon",
founded: 1902
},
{
name: "Mesa Verde National Park",
state: "Colorado",
founded: 1906
},
{
name: "Olympic National Park",
state: "Washington",
founded: 1909
}
Note: If we wanted to include the year 1900 in our query, we could use the $gte,
which will match all values that are greater than or equal to the specified value.
We can also match documents that are less than a given value, by using the less than operator, $lt.
For example, if we reference our national_parks example above, we could select all parks that were founded before 1900 with the following query:
db.national_parks.find({ founded: { $lt: 1900 }});
This would return all documents where founded is 1899 or lower.
{
name: "Yosemite National Park",
state: "California",
founded: 1890
}
Note: Similar to the $gte operator, we can use $lte if we want to match all values that are less than or equal to the specified value.
While the examples we examined in this exercise match numerical values, it’s worth noting that these comparison operators can be
used with any data type (e.g., letters).
SORTING DOCUMENTS
MongoDB allows us to sort our query before returning the results to us.
To sort documents we must append the .sort() method to our query. It takes one argument, a document specifying the fields we want to sort by, where their
respective value is the sort order.
SYNTAX
db.<collection>.find().sort(
{
<field>: <value>,
<second_field>: <value>,
...
}
)
There are two values that we can provide for the fields, 1 or -1 (negative one). Specifying a 1 sorts the field in ascending order and -1 sorts it
in descending order. For datetime and string values, a value of 1 would sort the fields, and their corresponding documents in chronological and
alphabetical order while -1 would sort those fields in reverse order.
Example:
db.records.find().sort({ "release_year": 1 });
Results:
{
_id: ObjectId(...),
artist: "The Beatles",
album: "Abbey Road",
release_year: 1969
},
{
_id: ObjectId(...),
artist: "Talking Heads",
album: "Stop Making Sense",
release_year: 1984
},
{
_id: ObjectId(...),
artist: "Prince",
album: "Purple Rain",
release_year: 1984
},
{
_id: ObjectId(...),
artist: "Tracy Chapman",
album: "Tracy Chapman",
release_year: 1988
}
…
when we sort on fields that have duplicate values, documents that have those values may be returned in any order.
Notice in our example above, that we have two documents with the release_year 1984. If we were to run this exact query multiple times,
documents would get returned in numerical order by release_year, but the two documents that have 1984 as their release_year value, might be
returned in a different order each time.
We can also specify additional fields to sort on to receive more consistent results. For example,
we can execute the following query to sort first by release_year and then by artist.
db.records.find().sort({ "release_year": 1, "artist": 1 });
This would return a list of matching documents that were sorted first by the release_year field in ascending order. Then, within each release_year value,
documents would be sorted by the artist field in ascending order. Our query result would look like this:
{
_id: ObjectId(...),
artist: "The Beatles",
album: "Abbey Road",
release_year: 1969
},
{
_id: ObjectId(...),
artist: "Prince",
album: "Purple Rain",
release_year: 1984
},
{
_id: ObjectId(...),
artist: "Talking Heads",
album: "Stop Making Sense",
release_year: 1984
},
{
_id: ObjectId(...),
artist: "Tracy Chapman",
album: "Tracy Chapman",
release_year: 1988
}
…
QUERY PROJECTIONS
MongoDB returns whole documents to us by default but allows us to use projections in our queries to specify the exact fields we want to return
from our matching documents.
To include a projection, we can pass a second argument to the .find() method, a projection document that specifies the fields we want to include, or
exclude, in our returned documents. Fields can have a value of 1 to include that field or 0 to exclude it.
SYNTAX
db.<collection>.find(
<query>,
{
<projection_field_1>: <0 or 1>,
<projection_field_2>: <0 or 1>,
...
}
)
just interested in viewing the address and name fields, we could run the following query that includes a projection:
db.listingsAndReviews.find( {}, {address: 1, name: 1} )
This would return the address and name fields for any documents that match our query. Our output for each document would look as follows:
{
_id: ObjectId("5eb3d668b31de5d588f4292a"),
address: {
building: '2780',
coord: [ -73.98241999999999, 40.579505 ],
street: 'Stillwell Avenue',
zipcode: '11224'
},
name: 'Riviera Caterer'
}
by default, the _id field is included in our projection, even if we do not specify to include it.
But what if we’re not interested in seeing the _id field? We can omit it from our results by specifying the _id field in our projection document.
Instead of setting its value to 1, we’d set it to a value of 0 to exclude it from our return documents.
db.listingsAndReviews.find( {}, {address: 1, name: 1, _id: 0} )
Rather than listing the fields we want to return, we can use a projection to define which fields we want to exclude from our
matching documents by assigning the fields a value of 0. For example, if we wanted to query our collection and see all fields
but the grades field, we could run the following command:
db.restaurants.find( {}, {grades: 0} )
In addition to the methods and operators we’ve covered in this lesson, MongoDB provides us with even more syntax that can be useful to us when performing queries:
The .count() method returns the number of documents that match a query.
The .limit() method can be chained to the .find() method, and specifies the maximum number of documents a query will output.
The $exists operator can be included in a query filter to only match documents that contain the given field.
The $ne operator helps check if a field is not equal to a specified value.
The $and and $or operators help perform AND or OR logic operators.
Lastly, if you are looking for a way to make query outputs look a bit more “pretty”, you can use the .pretty() method!