odc.stac.load
- odc.stac.load(items, bands=None, *, groupby='time', resampling=None, dtype=None, chunks=None, pool=None, crs=<odc.geo.types.Unset object>, resolution=None, anchor=None, geobox=None, bbox=None, lon=None, lat=None, x=None, y=None, like=None, geopolygon=None, progress=None, fail_on_error=True, stac_cfg=None, patch_url=None, preserve_original_order=False, driver=None, **kw)[source]
STAC
Item
toxarray.Dataset
.Load several STAC
Item
objects (from the same or similar collections) as anxarray.Dataset
.This method can load pixel data directly on a local machine or construct a Dask graph that can be processed on a remote cluster.
catalog = pystac.Client.open(...) query = catalog.search(...) xx = odc.stac.load( query.get_items(), bands=["red", "green", "blue"], ) xx.red.plot.imshow(col="time")
- Parameters:
Common Options
- Parameters:
groupby (
Union
[str
,Callable
[[Item
,ParsedItem
,int
],Any
],None
]) –Controls what items get placed in to the same pixel plane.
Following have special meaning:
”time” items with exactly the same timestamp are grouped together
”solar_day” items captured on the same day adjusted for solar time
”id” every item is loaded separately
Any other string is assumed to be a key in Item’s properties dictionary.
You can also supply custom key function, it should take 3 arguments
(pystac.Item, ParsedItem, index:int) -> Any
preserve_original_order (
bool
) – By default items are sorted by time, id within each group to make pixel fusing order deterministic. Setting this flag toTrue
will instead keep items within each group in the same order as supplied, so that one can implement arbitrary priority for pixel overlap cases.resampling (
Union
[str
,Dict
[str
,str
],None
]) – Controls resampling strategy, can be specified per banddtype (
Union
[dtype
[Any
],None
,type
[Any
],_SupportsDType
[dtype
[Any
]],str
,tuple
[Any
,int
],tuple
[Any
,Union
[SupportsIndex
,Sequence
[SupportsIndex
]]],list
[Any
],_DTypeDict
,tuple
[Any
,Any
],Dict
[str
,Union
[dtype
[Any
],None
,type
[Any
],_SupportsDType
[dtype
[Any
]],str
,tuple
[Any
,int
],tuple
[Any
,Union
[SupportsIndex
,Sequence
[SupportsIndex
]]],list
[Any
],_DTypeDict
,tuple
[Any
,Any
]]]]) – Force output dtype, can be specified per bandchunks (
Optional
[Dict
[str
,Union
[int
,Literal
['auto'
]]]]) – Rather than loading pixel data directly, construct Dask backed arrays.chunks={'x': 2048, 'y': 2048}
progress (
Optional
[Any
]) – Pass intqdm
progress bar or similar, only used in non-Dask load.fail_on_error (
bool
) – Set this toFalse
to skip over load failures.pool (
Union
[ThreadPoolExecutor
,int
,None
]) – Use thread pool to perform load locally, only used in non-Dask load.
Control Pixel Grid of Output
There are many ways to control footprint and resolution of returned data. The most precise way is to use
GeoBox
,geobox=GeoBox(..)
. Similarly one can uselike=xx
to match pixel grid to previously loaded data (xx = odc.stac.load(...)
).Other common way is to configure crs and resolution only
xx = odc.stac.load(... crs="EPSG:3857", resolution=10) # resolution units must match CRS # here we assume 1 degree == 111km to load at roughly # the same 10m resolution as statement above. yy = odc.stac.load(... crs="EPSG:4326", resolution=0.00009009)
By default
odc.stac.load()
loads all available pixels in the requested projection and resolution. To limit extent of loaded data you have to supply bounds via eithergeobox=
orlike=
parameters (these also select projection and resolution). Alternatively use a pair ofx, y
orlon, lat
parameters.x, y
allows you to specify bounds in the output projection, whilelon, lat
operate in degrees. You can also usebbox
which is equivalent tolon, lat
.It should be noted that returned data is likely to reach outside of the requested bounds by fraction of a pixel when using
bbox
,x, y
orlon, lat
mechanisms. This is due to pixel grid “snapping”. Pixel edges will still start atN*pixel_size
whereN is int
regardless of the requested bounding box.- Parameters:
crs (
Union
[str
,int
,CRS
,CRS
,Dict
[str
,Any
],Unset
,None
]) – Load data in a given CRS. Special name of"utm"
is also understood, in which case an appropriate UTM projection will be picked based on the output bounding box.resolution (
Union
[float
,int
,Resolution
,None
]) –Set resolution of output in units of the output CRS. This can be a single float, in which case pixels are assumed to be square with
Y
axis flipped. To specify non-square or non-flipped pixels useodc.geo.resxy_()
orodc.geo.resyx_()
.resolution=10
is equivalent toresolution=odc.geo.resxy_(10, -10)
.Resolution must be supplied in the units of the output CRS. Units are commonly meters for Projected and degrees for Geographic CRSs.
bbox (
Optional
[Tuple
[float
,float
,float
,float
]]) – Specify bounding box in Lon/Lat.[min(lon), min(lat), max(lon), max(lat)]
lon (
Optional
[Tuple
[float
,float
]]) – Define output bounds in Lon/Latlat (
Optional
[Tuple
[float
,float
]]) – Define output bounds in Lon/Latx (
Optional
[Tuple
[float
,float
]]) – Define output bounds in output projection coordinate unitsy (
Optional
[Tuple
[float
,float
]]) – Define output bounds in output projection coordinate unitsanchor (
Union
[AnchorEnum
,XY
[float
],float
,Literal
['center'
],Literal
['centre'
],Literal
['edge'
],Literal
['floating'
],Literal
['default'
],None
]) – Controls pixel snapping, default is to align pixel grid toX
/Y
axis such that pixel edges align withx=0, y=0
. Other common option is to align pixel centers to0,0
rather than edges.geobox (
Optional
[GeoBox
]) – Allows to specify exact region/resolution/projection usingGeoBox
objectlike (
Optional
[Any
]) – Match output grid to the data loaded previously.geopolygon (
Optional
[Any
]) – Limit returned result to a bounding box of a given geometry. This could be an instance ofGeometry
, GeoJSON dictionary, GeoPandas DataFrame, or any object implementing__geo_interface__
. We assumeEPSG:4326
projection for dictionary and Shapely inputs. CRS information available on GeoPandas inputs should be understood correctly.
STAC Related Options
- Parameters:
- Return type:
- Returns:
xarray.Dataset
with requested bands populated
Complete Example Code
import planetary_computer as pc from pystac_client import Client from odc import stac catalog = Client.open("https://planetarycomputer.microsoft.com/api/stac/v1") query = catalog.search( collections=["sentinel-2-l2a"], datetime="2019-06-06", query={"s2:mgrs_tile": dict(eq="06VVN")}, ) xx = stac.load( query.get_items(), bands=["red", "green", "blue"], resolution=100, # 1/10 of the native 10m resolution patch_url=pc.sign, ) xx.red.plot.imshow(col="time", size=8, aspect=1)
Example Optional Configuration
Sometimes data source might be missing some optional STAC extensions. With
stac_cfg=
parameter one can supply that information at load time. Configuration is per collection per asset. You can provide information like pixel data type,nodata
value used,unit
attribute and band aliases you would like to use.Sample
stac_cfg={..}
parameter:sentinel-2-l2a: # < name of the collection, i.e. ``.collection_id`` assets: "*": # Band named "*" contains band info for "most" bands data_type: uint16 nodata: 0 unit: "1" SCL: # Those bands that are different than "most" data_type: uint8 nodata: 0 unit: "1" aliases: #< unique alias -> canonical map rededge: B05 rededge1: B05 rededge2: B06 rededge3: B07 some-other-collection: assets: #... "*": # Applies to all collections if not defined on a collection warnings: ignore # ignore|all (default all)