-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathQueryingArrayFields
286 lines (248 loc) · 10.7 KB
/
QueryingArrayFields
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
QUERYING FOR AN ENTIRE ARRAY
You can use the .find() method to do this
example:
db.books.find({ genres: ["young adult", "fantasy", "adventure"]})
this query would only return documents where the array field contains precisely the values included in the query in the specified order.
Matching Individual Array Elements
db.books.find({ genres: "young adult" })
This would return all documents that have a genres field that is an array that contains the value "young adult",
in addition to any other genres. Here is what our query might return:
{
_id: ObjectId(...),
title: "Children of Blood and Bone",
author: "Tomi Adeyemi",
year_published: 2018,
genres: ["fantasy", "young adult", "adventure"]
},
{
_id: ObjectId(...),
title: "The Hunger Games",
author: "Suzanne Collins",
year_published: 2008,
genres: ["young adult", "dystopian", "science fiction"]
},
{
_id: ObjectId(...),
title: "The Black Flamingo",
author: "Dean Atta",
year_published: 2019,
genres: ["young adult", "coming of age", "LGBTQ"]
},
…
Note that every document in our result has a genres field that contains "young adult" as one of its values, in no particular order.
Matching Multiple Array Elements with $all
We can use the $all operator to match documents for an array field that includes all the specified elements, without regard for
the order of the elements or additional elements in the array.
We could use the $all operator to perform this query, like so:
db.books.find({ genres: { $all: [ "science fiction", "adventure" ] } })
This query might return the following documents:
{
_id: ObjectId(...),
title: "Jurassic Park",
author: "Michael Crichton",
year_published: 1990,
genres: ["science fiction", "adventure", "fantasy", "thriller"]
},
{
_id: ObjectId(...),
title: "A Wrinkle in Time",
author: "Madeleine L'Engle",
year_published: 1962,
genres: ["young adult", "fantasy", "science fiction", "adventure"]
},
{
_id: ObjectId(...),
title: "Dune",
author: "Frank Herbert",
year_published: 1965,
genres: ["science fiction", "fantasy", "adventure"]
},
…
Notice that using the $all operator will match documents where the given array field contains
all the specified elements in any order, not necessarily the order provided in the query. Furthermore, our search returns documents
where the genres array includes other elements, in addition to the ones specified in our query.
Querying on Compound Filter Conditions
In addition to querying array fields for exact matches and individual elements, we can use comparison operators to search for documents where elements in an array field meet some condition or range of conditions.
For example, imagine we have a collection of tennis athletes, called tennis_players, with each document having a similar structure:
{
_id: ObjectId(...),
name: "Serena Williams",
country: "United States",
wimbledon_singles_wins: [2002, 2003, 2009, 2010, 2012, 2015, 2016]
}
If we wanted to search this collection for all athletes who have been Wimbledon Singles Champions from the year 2000 onward, we could run the following query:
db.tennis_players.find({ wimbledon_singles_wins: { $gte: 2000 } });
This would return all documents where the wimbledon_singles_wins array has at least one element with a value of 2000 or greater. Our query result might look something like this:
{
_id: ObjectId(...),
name: "Serena Williams",
country: "United States",
wimbledon_singles_wins: [2002, 2003, 2009, 2010, 2012, 2015, 2016]
},
{
_id: ObjectId(...),
name: "Venus Williams",
country: "United States",
wimbledon_singles_wins: [2000, 2001, 2005, 2007, 2008]
},
{
_id: ObjectId(...),
name: "Roger Federer",
country: "Switzerland",
wimbledon_singles_wins: [2003, 2004, 2005, 2006, 2007, 2009, 2012, 2017]
},
We can also query based on compound conditions. Let’s consider that we want to search our tennis_players collection to find all athletes who won a Wimbledon Singles Championship either before 1935, in the first 50 years of the championship, or after 1971, in the 50 most recent years of the tournament. We could achieve this with the following query:
db.tennis_players.find({ wimbledon_singles_wins: { $gt: 1971, $lt: 1935 } })
This might return the following set of documents:
{
_id: ObjectId(...),
name: "Suzanna Lenglen",
country: "United States",
wimbledon_singles_wins: [1919, 1920, 1921, 1922, 1923, 1925]
},
{
_id: ObjectId(...),
name: "Roger Federer",
country: "Switzerland",
wimbledon_singles_wins: [2003, 2004, 2005, 2006, 2007, 2009, 2012, 2017]
},
…
Note that this query would match documents where the array contains
elements that satisfy the query conditions in some combination. One element could satisfy the greater than 1971 condition, while another
could satisfy the less than 1935 condition. And if the ranges overlapped, a single element could satisfy both conditions. However,
using this syntax, it is not necessary that a single element satisfies all conditions.
In the next exercise we’ll learn how to filter our queries such that the matching documents have at least one array element that satisfies all the specified criteria.
Querying for all conditions with $elemMatch
More often than not, when we specify multiple query conditions for an array field, we’ll want to match at least one array element that meets all the filter criteria. We can accomplish this by using another operator, $elemMatch.
The $elemMatch operator is used in queries to specify multiple criteria on the elements of an array field, such that any returned documents have at least one array element that satisfies all the specified criteria.
Let’s reconsider our previous example about professional tennis players to see $elemMatch in action. Recall that documents from this collection have the following structure:
{
_id: ObjectId(...),
name: "Serena Williams",
country: "United States",
wimbledon_singles_wins: [2002, 2003, 2009, 2010, 2012, 2015, 2016]
}
Imagine that we want to search our collection again, this time, for any athletes who have won the Wimbledon Singles Championship in the current millennium, between the years 2000 and 2019.
Our query would look something like this:
db.tennis_players.find({
wimbledon_singles_wins: { $elemMatch: { $gte: 2000, $lt: 2020 } }
})
This would only return documents whose wimbledon_singles_wins field is an
array containing at least one element that is both greater than or equal to 2000 and less than 2020.
Our resulting cursor might contain the following documents:
{
_id: ObjectId(...),
name: "Pete Sampras",
country: "United States",
wimbledon_singles_wins: [1993, 1994, 1995, 1997, 1998, 1999, 2000]
},
{
_id: ObjectId(...),
name: "Serena Williams",
country: "United States",
wimbledon_singles_wins: [2002, 2003, 2009, 2010, 2012, 2015, 2016]
},
{
_id: ObjectId(...),
name: "Roger Federer",
country: "Switzerland",
wimbledon_singles_wins: [2003, 2004, 2005, 2006, 2007, 2009, 2012, 2017]
}
Note that while any matching documents must contain at least one value in the wimbledon_singles_wins array that is between the
range of 2000 and 2020, this array can also include values that fall outside that range.
Querying an Array of Embedded Documents
Now that we have a grasp on the basics of querying array fields, we can tackle one more common scenario - querying embedded documents. It’s common for a collection to have an array of documents rather than individual values. For instance, take our tennis_players collection again, but now with a slightly different structure:
{
_id: ObjectId(...),
name: "Miyu Kato",
country: "Japan",
wimbledon_doubles_placements:
[{
year: 2016,
place: 2
},
{
year: 2017,
place: 1
},
{
year: 2018,
place: 1
},
{
year: 2019,
place: 1
}]
}
In the above example, we have an array field named wimbledon_doubles_placements that contains documents inside of it. There are two primary
ways we can query the above collection:
a match on an entire embedded document or a match based on a single field.
First, let’s see how we can do an exact match on the entire embedded document. For example, if we wanted to query our tennis_players
collection for players who placed 2nd in 2019:
db.tennis_players.find( { "wimbledon_doubles_placements": { year: 2019, place: 2 } } )
In the above query, the field order must be exactly the order we are looking for, with the exact field values. This query would
match the below document because the order and values are exactly the same as the one in the query:
{
_id: ObjectId(...),
name: "Gabriela Dabrowski",
country: "Canada",
wimbledon_doubles_placements:
[{
year: 2019,
place: 2
}]
},
{
_id: ObjectId(...),
name: "Yifan Xu",
country: “China”,
wimbledon_doubles_placements:
[{
year: 2019,
place: 2
}]
}…
However, a query like this:
db.tennis_players.find( { "wimbledon_doubles_placements": {place: 2, year: 2019 } } )
Would not return any results since there would be no documents with that specific ordering.
We can also query based on a single field. For example, if we just wanted to query for any Wimbledon
doubles winners in the year 2016, we can do the following:
db.tennis_players.find( { "wimbledon_doubles_placements.year": 2016 )
Notice that the syntax is exactly the same as when we were querying for non-array fields. The embedded document field and parent
document field must be wrapped in quotation marks (single or double) and use the dot (.) notation. This query would return
results like the following:
{
_id: ObjectId(...),
name: "Tímea Babos",
country: "Hungary",
wimbledon_doubles_placements:
[{
year: 2015,
place: 4
},
{
year: 2016,
place: 2
}]
{
_id: ObjectId(...),
name: "Yaroslava Shvedova",
country: "Kazakhstan",
wimbledon_doubles_placements:
[{
year: 2010,
place: 1
},
{
year: 2016,
place: 2
}]
}…
Here, our query result has all the documents that have an embedded document with a year field with the value of 2016.
It’s important to note we can even combine these queries with query operators and even do multiple query conditions using $elemMatch.
For more examples, take a look at the official MongoDB documentation on querying embedded documents in arrays.
In addition to the syntax we’ve covered in this lesson, MongoDB provides us with even more operators that can be useful to us
when querying on array fields:
The $size operator is used with .find() to match any array with the specified number of elements.
The $in operator can be included in queries to match documents where the field is an array that contains at least one element in the specified array.
The $nin operator can be included in queries to match documents where the field is an array that contains no elements mentioned in the given array.