You can run this notebook in a live session or view it on Github.
Introduction to Xoak¶
This notebook briefly shows how to use Xoak with Xarray’s advanced indexing to perform point-wise selection of irrelgularly spaced data encoded in coordinates with an arbitrary number of dimensions (1, 2, …, n-d).
[1]:
import numpy as np
import xarray as xr
import xoak
xr.set_options(display_style='text');
Let’s first create an xarray.Dataset
of latitude / longitude points located randomly on the sphere, forming a 2-dimensional (x, y) model mesh (note that Xoak supports indexing coordinates with an arbitrary number of dimensions).
[2]:
shape = (100, 100)
lat = np.random.uniform(-90, 90, size=shape)
lon = np.random.uniform(-180, 180, size=shape)
field = lat + lon
[3]:
ds_mesh = xr.Dataset(
coords={'lat': (('x', 'y'), lat), 'lon': (('x', 'y'), lon)},
data_vars={'field': (('x', 'y'), field)},
)
ds_mesh
[3]:
<xarray.Dataset> Dimensions: (x: 100, y: 100) Coordinates: lat (x, y) float64 31.3 -50.54 -10.09 19.51 ... 69.67 83.0 -57.93 64.47 lon (x, y) float64 -25.17 96.38 95.65 -47.0 ... -138.4 -140.6 -118.3 Dimensions without coordinates: x, y Data variables: field (x, y) float64 6.136 45.84 85.56 -27.49 ... -55.4 -198.6 -53.86
We first need to build an index to allow fast point-wise selection. Xoak supports several indexes that can be used depending on the data. Here we use the sklearn_geo_balltree
index, a wrapper around sklearn.BallTree using the Haversine distance metric that is suited for indexing latitude / longitude points.
With this index, it is important to specify lat
and lon
in this specific order. Both the lat
and lon
coordinates must have exactly the same dimensions in the same order, here ('x', 'y')
.
[4]:
ds_mesh.xoak.set_index(['lat', 'lon'], 'sklearn_geo_balltree')
Let’s create another xarray.Dataset
of latitude / longitude points that here correspond to a trajectory on the sphere.
[5]:
ds_trajectory = xr.Dataset({
'latitude': ('trajectory', np.linspace(-10, 40, 30)),
'longitude': ('trajectory', np.linspace(-150, 150, 30))
})
ds_trajectory
[5]:
<xarray.Dataset> Dimensions: (trajectory: 30) Dimensions without coordinates: trajectory Data variables: latitude (trajectory) float64 -10.0 -8.276 -6.552 ... 36.55 38.28 40.0 longitude (trajectory) float64 -150.0 -139.7 -129.3 ... 129.3 139.7 150.0
We can use xarray.Dataset.xoak.sel()
to select the mesh points that are the closest to the trajectory vertices. It works very much like xarray.Dataset.sel()
and returns another Dataset with the selection.
Like for xarray.Dataset.xoak.set_index()
, it is important here that all indexer coordinates (latitude
and longitude
in this example) have the exact same dimensions (here 'trajectory'
). Indexers must be given for each coordinate used to build the index above, (here latitude
for lat
and longitude
for lon
).
[6]:
ds_selection = ds_mesh.xoak.sel(
lat=ds_trajectory.latitude,
lon=ds_trajectory.longitude
)
ds_selection
[6]:
<xarray.Dataset> Dimensions: (trajectory: 30) Coordinates: lat (trajectory) float64 -9.628 -6.911 -6.392 ... 35.23 37.84 39.56 lon (trajectory) float64 -150.6 -139.5 -131.1 ... 129.4 140.2 148.9 Dimensions without coordinates: trajectory Data variables: field (trajectory) float64 -160.3 -146.5 -137.5 ... 164.6 178.0 188.5
Let’s plot the trajectory vertices (black dots) and the resulting selection (dots colored by the field
values):
[7]:
ds_trajectory.plot.scatter(x='longitude', y='latitude', c='k', alpha=0.7);
ds_selection.plot.scatter(x='lon', y='lat', hue='field', alpha=0.9);
Xoak also supports providing coordinates with an arbitrary number of dimensions as indexers, like in the example below with vertices of another mesh on the sphere.
[8]:
ds_mesh2 = xr.Dataset({
'latitude': (('x', 'y'), np.random.uniform(-90, 90, size=(10, 10))),
'longitude': (('x', 'y'), np.random.uniform(-180, 180, size=(10, 10)))
})
ds_selection = ds_mesh.xoak.sel(
lat=ds_mesh2.latitude,
lon=ds_mesh2.longitude
)
ds_selection
[8]:
<xarray.Dataset> Dimensions: (x: 10, y: 10) Coordinates: lat (x, y) float64 -15.63 -52.0 72.63 34.72 ... -42.67 -32.9 77.04 lon (x, y) float64 -93.21 -31.11 -116.0 176.0 ... -33.66 -151.9 132.1 Dimensions without coordinates: x, y Data variables: field (x, y) float64 -108.8 -83.12 -43.34 210.8 ... -76.33 -184.8 209.2
[9]:
ds_mesh2.plot.scatter(x='longitude', y='latitude', c='k', alpha=0.7);
ds_selection.plot.scatter(x='lon', y='lat', hue='field', alpha=0.9);
[ ]: