Binder

You can run this notebook in a live session or view it on Github.

Introduction to Xoak

This notebook briefly shows how to use Xoak with Xarray’s advanced indexing to perform point-wise selection of irrelgularly spaced data encoded in coordinates with an arbitrary number of dimensions (1, 2, …, n-d).

[1]:
import numpy as np
import xarray as xr
import xoak

xr.set_options(display_style='text');

Let’s first create an xarray.Dataset of latitude / longitude points located randomly on the sphere, forming a 2-dimensional (x, y) model mesh (note that Xoak supports indexing coordinates with an arbitrary number of dimensions).

[2]:
shape = (100, 100)
lat = np.random.uniform(-90, 90, size=shape)
lon = np.random.uniform(-180, 180, size=shape)

field = lat + lon
[3]:
ds_mesh = xr.Dataset(
    coords={'lat': (('x', 'y'), lat), 'lon': (('x', 'y'), lon)},
    data_vars={'field': (('x', 'y'), field)},
)

ds_mesh
[3]:
<xarray.Dataset>
Dimensions:  (x: 100, y: 100)
Coordinates:
    lat      (x, y) float64 31.3 -50.54 -10.09 19.51 ... 69.67 83.0 -57.93 64.47
    lon      (x, y) float64 -25.17 96.38 95.65 -47.0 ... -138.4 -140.6 -118.3
Dimensions without coordinates: x, y
Data variables:
    field    (x, y) float64 6.136 45.84 85.56 -27.49 ... -55.4 -198.6 -53.86

We first need to build an index to allow fast point-wise selection. Xoak supports several indexes that can be used depending on the data. Here we use the sklearn_geo_balltree index, a wrapper around sklearn.BallTree using the Haversine distance metric that is suited for indexing latitude / longitude points.

With this index, it is important to specify lat and lon in this specific order. Both the lat and lon coordinates must have exactly the same dimensions in the same order, here ('x', 'y').

[4]:
ds_mesh.xoak.set_index(['lat', 'lon'], 'sklearn_geo_balltree')

Let’s create another xarray.Dataset of latitude / longitude points that here correspond to a trajectory on the sphere.

[5]:
ds_trajectory = xr.Dataset({
    'latitude': ('trajectory', np.linspace(-10, 40, 30)),
    'longitude': ('trajectory', np.linspace(-150, 150, 30))
})

ds_trajectory
[5]:
<xarray.Dataset>
Dimensions:    (trajectory: 30)
Dimensions without coordinates: trajectory
Data variables:
    latitude   (trajectory) float64 -10.0 -8.276 -6.552 ... 36.55 38.28 40.0
    longitude  (trajectory) float64 -150.0 -139.7 -129.3 ... 129.3 139.7 150.0

We can use xarray.Dataset.xoak.sel() to select the mesh points that are the closest to the trajectory vertices. It works very much like xarray.Dataset.sel() and returns another Dataset with the selection.

Like for xarray.Dataset.xoak.set_index(), it is important here that all indexer coordinates (latitude and longitude in this example) have the exact same dimensions (here 'trajectory'). Indexers must be given for each coordinate used to build the index above, (here latitude for lat and longitude for lon).

[6]:
ds_selection = ds_mesh.xoak.sel(
    lat=ds_trajectory.latitude,
    lon=ds_trajectory.longitude
)

ds_selection
[6]:
<xarray.Dataset>
Dimensions:  (trajectory: 30)
Coordinates:
    lat      (trajectory) float64 -9.628 -6.911 -6.392 ... 35.23 37.84 39.56
    lon      (trajectory) float64 -150.6 -139.5 -131.1 ... 129.4 140.2 148.9
Dimensions without coordinates: trajectory
Data variables:
    field    (trajectory) float64 -160.3 -146.5 -137.5 ... 164.6 178.0 188.5

Let’s plot the trajectory vertices (black dots) and the resulting selection (dots colored by the field values):

[7]:
ds_trajectory.plot.scatter(x='longitude', y='latitude', c='k', alpha=0.7);
ds_selection.plot.scatter(x='lon', y='lat', hue='field', alpha=0.9);
../_images/examples_introduction_12_0.png

Xoak also supports providing coordinates with an arbitrary number of dimensions as indexers, like in the example below with vertices of another mesh on the sphere.

[8]:
ds_mesh2 = xr.Dataset({
    'latitude': (('x', 'y'), np.random.uniform(-90, 90, size=(10, 10))),
    'longitude': (('x', 'y'), np.random.uniform(-180, 180, size=(10, 10)))
})

ds_selection = ds_mesh.xoak.sel(
    lat=ds_mesh2.latitude,
    lon=ds_mesh2.longitude
)

ds_selection
[8]:
<xarray.Dataset>
Dimensions:  (x: 10, y: 10)
Coordinates:
    lat      (x, y) float64 -15.63 -52.0 72.63 34.72 ... -42.67 -32.9 77.04
    lon      (x, y) float64 -93.21 -31.11 -116.0 176.0 ... -33.66 -151.9 132.1
Dimensions without coordinates: x, y
Data variables:
    field    (x, y) float64 -108.8 -83.12 -43.34 210.8 ... -76.33 -184.8 209.2
[9]:
ds_mesh2.plot.scatter(x='longitude', y='latitude', c='k', alpha=0.7);
ds_selection.plot.scatter(x='lon', y='lat', hue='field', alpha=0.9);
../_images/examples_introduction_15_0.png
[ ]: