Extending the glossary #7732

Merged
merged 23 commits into from
Aug 18, 2023
f055342
added align, broadcast,merge, concatenate, combine
harshitha1201 Apr 6, 2023
3c0c0a2
examples added
harshitha1201 Apr 11, 2023
59b9b18
Update doc/user-guide/terminology.rst
harshitha1201 Jul 25, 2023
f8da298
Update doc/user-guide/terminology.rst
harshitha1201 Jul 25, 2023
cee6dad
Update doc/user-guide/terminology.rst
harshitha1201 Jul 25, 2023
22141c8
Merge branch 'main' into add-terminology
harshitha1201 Jul 25, 2023
b11d72f
changes made
harshitha1201 Aug 10, 2023
6e5aa82
add changes
harshitha1201 Aug 10, 2023
5bd705c
.
harshitha1201 Aug 15, 2023
6f9cd9e
Merge branch 'main' into add-terminology
harshitha1201 Aug 15, 2023
4e7a475
Merge branch 'add-terminology' of https://github.com/harshitha1201/xa…
harshitha1201 Aug 15, 2023
fba6824
.
harshitha1201 Aug 16, 2023
2faa23e
Update doc/user-guide/terminology.rst
harshitha1201 Aug 17, 2023
0b3a66a
Update doc/user-guide/terminology.rst
harshitha1201 Aug 17, 2023
3cc357c
Update doc/user-guide/terminology.rst
harshitha1201 Aug 17, 2023
16f7ea8
Merge branch 'main' into add-terminology
harshitha1201 Aug 17, 2023
d344641
changes done
harshitha1201 Aug 18, 2023
c1a0bca
Merge branch 'main' into add-terminology
harshitha1201 Aug 18, 2023
613b544
Update doc/user-guide/terminology.rst
harshitha1201 Aug 18, 2023
da3bce5
Update doc/user-guide/terminology.rst
harshitha1201 Aug 18, 2023
cbf78fd
Update doc/user-guide/terminology.rst
harshitha1201 Aug 18, 2023
2ffe4c9
Update doc/user-guide/terminology.rst
harshitha1201 Aug 18, 2023
34a75f6
Update doc/user-guide/terminology.rst
harshitha1201 Aug 18, 2023
126 changes: 126 additions & 0 deletions doc/user-guide/terminology.rst
@@ -131,3 +131,129 @@ complete examples, please consult the relevant documentation.*
``__array_ufunc__`` and ``__array_function__`` protocols are also required.

__ https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html

.. ipython:: python
:suppress:

import numpy as np
import pandas as pd
import xarray as xr

Aligning
Aligning refers to the process of ensuring that two or more DataArrays or Datasets
have the same dimensions and coordinates, so that they can be combined or compared properly.

.. ipython:: python

x = xr.DataArray(
[[25, 35], [10, 24]],
dims=("lat", "lon"),
coords={"lat": [35.0, 40.0], "lon": [100.0, 120.0]},
)
y = xr.DataArray(
[[20, 5], [7, 13]],
dims=("lat", "lon"),
coords={"lat": [35.0, 42.0], "lon": [100.0, 120.0]},
)
x
y
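
The arrays ``x`` and ``y`` above share ``lon`` labels but differ in ``lat``. As a minimal sketch of what aligning does to them, ``xr.align`` with ``join="inner"`` keeps only the labels both arrays have, while ``join="outer"`` keeps the union and fills the gaps with NaN. The arrays are redefined here so the block runs on its own.

```python
import xarray as xr

x = xr.DataArray(
    [[25, 35], [10, 24]],
    dims=("lat", "lon"),
    coords={"lat": [35.0, 40.0], "lon": [100.0, 120.0]},
)
y = xr.DataArray(
    [[20, 5], [7, 13]],
    dims=("lat", "lon"),
    coords={"lat": [35.0, 42.0], "lon": [100.0, 120.0]},
)

# inner join: keep only the coordinate labels present in both arrays
x_inner, y_inner = xr.align(x, y, join="inner")

# outer join: keep the union of labels; missing entries become NaN
x_outer, y_outer = xr.align(x, y, join="outer")
```

After the inner join both results cover only ``lat=35.0``; after the outer join both cover three ``lat`` labels.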

Broadcasting
A technique that allows operations to be performed on arrays with different shapes and dimensions.
When performing such operations, xarray automatically broadcasts the arrays to a common shape,
matching dimensions by name, before the operation is applied.

.. ipython:: python

# 'a' has shape (3,) and 'b' has shape (4,)
a = xr.DataArray(np.array([1, 2, 3]), dims=["x"])
b = xr.DataArray(np.array([4, 5, 6, 7]), dims=["y"])

# 2D array with shape (3, 4)
a + b
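
To inspect the broadcast arrays without doing arithmetic, ``xr.broadcast`` can be called explicitly. This is a small sketch that reuses the same ``a`` and ``b``, redefined so the block is self-contained.

```python
import numpy as np
import xarray as xr

a = xr.DataArray(np.array([1, 2, 3]), dims=["x"])
b = xr.DataArray(np.array([4, 5, 6, 7]), dims=["y"])

# expand both arrays to the common dimensions "x" and "y"
a2, b2 = xr.broadcast(a, b)

# arithmetic broadcasts implicitly, matching dimensions by name
total = a + b
```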

Merging
Merging is used to combine two or more Datasets or DataArrays that have different variables or coordinates along
the same dimensions. When merging, xarray aligns the variables and coordinates of the different objects along
their shared dimensions and creates a new ``Dataset`` containing all the variables and coordinates.

.. ipython:: python

# create two 1D arrays with names
arr1 = xr.DataArray(
[1, 2, 3], dims=["x"], coords={"x": [10, 20, 30]}, name="arr1"
)
arr2 = xr.DataArray(
[4, 5, 6], dims=["x"], coords={"x": [20, 30, 40]}, name="arr2"
)

# merge the two arrays into a new dataset
merged_ds = xr.Dataset({"arr1": arr1, "arr2": arr2})
merged_ds
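
The example above builds the result with the ``xr.Dataset`` constructor, which also aligns its inputs. As a sketch of the same operation through ``xr.merge``, the ``join`` argument is passed explicitly here so the behaviour does not depend on library defaults; the variable names ``merged`` and ``overlap`` are only illustrative.

```python
import xarray as xr

arr1 = xr.DataArray([1, 2, 3], dims=["x"], coords={"x": [10, 20, 30]}, name="arr1")
arr2 = xr.DataArray([4, 5, 6], dims=["x"], coords={"x": [20, 30, 40]}, name="arr2")

# outer join: keep the union of x labels; gaps are filled with NaN
merged = xr.merge([arr1, arr2], join="outer")

# inner join: keep only the x labels that both arrays share
overlap = xr.merge([arr1, arr2], join="inner")
```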

Concatenating
Concatenating is used to combine two or more Datasets or DataArrays along a dimension. When concatenating,
xarray arranges the Datasets or DataArrays along an existing or new dimension, and the resulting ``Dataset`` or
``DataArray`` will have the same variables and coordinates along the other dimensions.

.. ipython:: python

a = xr.DataArray([[1, 2], [3, 4]], dims=("x", "y"))
b = xr.DataArray([[5, 6], [7, 8]], dims=("x", "y"))
c = xr.concat([a, b], dim="c")
c
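
The example above concatenates along a new dimension ``c``. For comparison, this sketch concatenates the same two arrays along the existing dimension ``x`` as well; the result names are illustrative.

```python
import xarray as xr

a = xr.DataArray([[1, 2], [3, 4]], dims=("x", "y"))
b = xr.DataArray([[5, 6], [7, 8]], dims=("x", "y"))

# existing dimension: the arrays are placed end to end along "x"
along_x = xr.concat([a, b], dim="x")

# new dimension: a dimension "c" of length 2 is added
along_c = xr.concat([a, b], dim="c")
```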

Combining
Combining in xarray is a general term used to describe the process of arranging two or more DataArrays or Datasets
into a single ``DataArray`` or ``Dataset`` using some combination of merging and concatenation operations.

.. ipython:: python

ds1 = xr.Dataset(
{"data": xr.DataArray([[1, 2], [3, 4]], dims=("x", "y"))},
coords={"x": [1, 2], "y": [3, 4]},
)
ds2 = xr.Dataset(
{"data": xr.DataArray([[5, 6], [7, 8]], dims=("x", "y"))},
coords={"x": [2, 3], "y": [4, 5]},
)

# combine the datasets
combined_ds = xr.combine_by_coords([ds1, ds2])
combined_ds

lazy
When working with xarray, you often deal with large datasets. Instead of performing
calculations right away, xarray lets you plan the calculations you want to do, such as finding the
average temperature in a dataset. This planning is called "lazy evaluation". Later, when
you are ready to see the final result, you tell xarray to carry out those calculations.
Only then does xarray work through the steps you planned and return the answer you wanted. This
lazy approach saves time and memory because xarray only does the work when you actually need the
results.
Member

Lazy evaluation is provided alternately by dask or by hidden xarray internals, depending on whether dask is installed. I'm wondering whether it's worth mentioning that here or not. @headtr1ck what do you think? I've added it to #7991 so we could link to that page?

Member

I think we can leave this clarification for later.


labeled
Labeled refers to the way data is named with meaningful labels or coordinates. Instead of just having
numerical indices to locate values, xarray allows you to attach labels to each dimension. These labels
provide context and meaning to the data, making it easier to understand and work with. For example, if you
have temperature data for different cities over time, you can label the dimensions: one for
cities and another for time.
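
A minimal sketch of the cities-over-time example, assuming made-up temperature values and city names chosen only for illustration:

```python
import numpy as np
import pandas as pd
import xarray as xr

# hypothetical temperatures for two cities over three days
temps = xr.DataArray(
    np.array([[71.0, 74.0, 78.0], [85.0, 88.0, 90.0]]),
    dims=("city", "time"),
    coords={
        "city": ["New York", "Phoenix"],
        "time": pd.date_range("2023-07-14", periods=3),
    },
    name="temperature",
)

# select by label instead of by integer position
ny_july_15 = temps.sel(city="New York", time="2023-07-15")
```

Because the dimensions carry labels, the selection reads like the question being asked rather than a pair of array offsets.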

serialization
Serialization is the process of putting your data into a format that makes it easy to save and share.
When you serialize data in xarray, you take all your temperature measurements, along with their
labels and other information, and turn them into a format that can be stored in a file or sent over
the internet.
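
As a rough illustration, the sketch below round-trips a small ``DataArray`` through JSON text using ``to_dict`` and ``from_dict``. This choice is an assumption made to keep the example dependency-free: in practice xarray data is more commonly written to netCDF or Zarr, which need optional I/O libraries. The values and the ``units`` attribute are invented.

```python
import json

import xarray as xr

# hypothetical measurements with labels and an attribute
temps = xr.DataArray(
    [71.0, 85.0],
    dims=("city",),
    coords={"city": ["New York", "Phoenix"]},
    name="temperature",
    attrs={"units": "degF"},
)

# serialize: values, labels and attributes become plain Python objects, then text
payload = json.dumps(temps.to_dict())

# deserialize: rebuild an equivalent DataArray from the text
restored = xr.DataArray.from_dict(json.loads(payload))
```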

indexing
Indexing is a way to quickly find and grab the specific pieces of data you're interested in from your
dataset.

- Label-based Indexing: You can use labels to specify what you want, like "Give me the temperature for New York on July 15th."
- Positional Indexing: You can use numbers to refer to positions in the data, like "Give me the third temperature in the list." This is useful when you know the order of your data but don't need to remember the exact labels.
- Slicing: You can take a "slice" of your data; for example, you might want all temperatures from July 1st to July 10th.
- Boolean Indexing: You can use true/false statements to filter your data. It's like saying "Show me temperatures where it was above 80 degrees."
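
The four kinds of indexing listed above can be sketched on one small array. The temperatures, dates and city names are hypothetical, and ``where`` is only one of several ways to express a boolean filter.

```python
import numpy as np
import pandas as pd
import xarray as xr

# hypothetical daily temperatures for two cities
temps = xr.DataArray(
    np.array([[70.0, 75.0, 82.0, 79.0], [88.0, 91.0, 86.0, 93.0]]),
    dims=("city", "time"),
    coords={
        "city": ["New York", "Phoenix"],
        "time": pd.date_range("2023-07-01", periods=4),
    },
)

# label-based indexing: ask for a city and a date by name
by_label = temps.sel(city="New York", time="2023-07-03")

# positional indexing: third value along time for the first city
by_position = temps.isel(city=0, time=2)

# slicing by label: both endpoints are included
july_1_to_2 = temps.sel(time=slice("2023-07-01", "2023-07-02"))

# boolean indexing: keep values above 80 and mask the rest with NaN
above_80 = temps.where(temps > 80)
```

Here ``by_label`` and ``by_position`` pick out the same element, once by name and once by offset.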

backend
"Backend" refers to the way xarray stores and manages your data behind the scenes. Suppose you have
temperature measurements from different cities and want to use xarray to organize and analyze this
data. The backend is how xarray decides to store this information in memory so that you can easily
access and manipulate it.