The databrowser command line interface#
This section introduces the usage of the freva-client databrowser sub command.
Please see the Installation and configuration section on how to install and
configure the command line interface.
After successful installation you can use the freva-client databrowser sub
command:
freva-client databrowser --help
Searching for data locations#
Getting the locations of the data is probably the most common use case of the
databrowser application. You can search for data locations by applying the
data-search sub-command:
freva-client databrowser data-search --help
Results
Usage: freva-client databrowser data-search [OPTIONS] [SEARCH_KEYS]...
Search the databrowser for datasets.
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ search_keys [SEARCH_KEYS]... Refine your data search with this │
│ `key=value` pair search parameters. The │
│ parameters could be, depending on the │
│ DRS standard, flavour product, project │
│ model etc. │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --facet TEXT If you are not sure │
│ about the correct search │
│ key's you can use the │
│ ``--facet`` flag to │
│ search of any matching │
│ entries. For example │
│ --facet 'era5' would │
│ allow you to search for │
│ any entries containing │
│ era5, regardless of │
│ project, product etc. │
│ --uniq-key -u [file|uri] The type of search │
│ result, which can be │
│ either “file” or “uri”. │
│ This parameter │
│ determines whether the │
│ search will be based on │
│ file paths or Uniform │
│ Resource Identifiers │
│ [default: file] │
│ --flavour -f [freva|cmip6|cmip5|corde The Data Reference │
│ x|nextgems|user] Syntax (DRS) standard │
│ specifying the type of │
│ climate datasets to │
│ query. │
│ [default: freva] │
│ --time-select -ts [strict|flexible|file] Operator that specifies │
│ how the time period is │
│ selected. Choose from │
│ flexible (default), │
│ strict or file. │
│ ``strict`` returns only │
│ those files that have │
│ the *entire* time period │
│ covered. The time search │
│ ``2000 to 2012`` will │
│ not select files │
│ containing data from │
│ 2010 to 2020 with the │
│ ``strict`` method. │
│ ``flexible`` will select │
│ those files as │
│ ``flexible`` returns │
│ those files that have │
│ start or end period │
│ covered. ``file`` will │
│ only return files where │
│ the entire time period │
│ is contained within *one │
│ single* file. │
│ [default: flexible] │
│ --zarr Create zarr stream │
│ files. │
│ --token-file -tf PATH Instead of │
│ authenticating via code │
│ based authentication │
│ flow you can set the │
│ path to the json file │
│ that contains a `refresh │
│ token` containing a │
│ refresh_token key. │
│ --time -t TEXT Special search facet to │
│ refine/subset search │
│ results by time. This │
│ can be a string │
│ representation of a time │
│ range or a single time │
│ step. The time steps │
│ have to follow ISO-8601. │
│ Valid strings are │
│ ``%Y-%m-%dT%H:%M`` to │
│ ``%Y-%m-%dT%H:%M`` for │
│ time ranges and │
│ ``%Y-%m-%dT%H:%M``. │
│ **Note**: You don't have │
│ to give the full string │
│ format to subset time │
│ steps ``%Y``, ``%Y-%m`` │
│ etc are also valid. │
│ --bbox -b <FLOAT FLOAT FLOAT Special search facet to │
│ FLOAT>... refine/subset search │
│ results by spatial │
│ extent. This can be a │
│ string representation of │
│ a bounding box. The │
│ bounding box has to │
│ follow the format │
│ ``min_lon max_lon │
│ min_lat max_lat``. Valid │
│ strings are ``-10 10 -10 │
│ 10`` to ``0 5 0 5``. │
│ --bbox-select -bs [strict|flexible|file] Operator that specifies │
│ how the spatial extent │
│ is selected. Choose from │
│ flexible (default), │
│ strict or file. │
│ ``strict`` returns only │
│ those files that have │
│ the *entire* spatial │
│ extent covered. The bbox │
│ search ``-10 10 -10 10`` │
│ will not select files │
│ containing data from 0 5 │
│ 0 5 with the ``strict`` │
│ method. ``flexible`` │
│ will select those files │
│ as ``flexible`` returns │
│ those files that have │
│ any part of the extent │
│ covered. ``file`` will │
│ only return files where │
│ the entire spatial │
│ extent is contained │
│ within *one single* │
│ file. │
│ [default: flexible] │
│ --json -j Parse output in json │
│ format. │
│ --host TEXT Set the hostname of the │
│ databrowser, if not set │
│ (default) the hostname │
│ is read from a config │
│ file │
│ -v INTEGER Increase verbosity │
│ [default: 0] │
│ --multi-version Select all versions and │
│ not just the latest │
│ version (default). │
│ --version -V Show version an exit │
│ --help Show this message and │
│ exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
The command expects a list of key=value pairs, and the order of the pairs does not matter. You do not need to split the search by the type of data you are looking for: observations, reanalysis and model data can be searched at the same time. All queries are case insensitive. You can also search for attributes themselves instead of file paths, for example the list of available variables that satisfy a certain constraint (e.g. sampled 6hr, from a certain model, etc.).
freva-client databrowser data-search project=observations variable=pr model='cp*'
There are many more options for defining a value for a given key:
| Attribute syntax | Meaning |
|---|---|
| attribute=value | Search for files containing exactly that attribute |
| attribute='val*' | Search for files containing a value for attribute that starts with the prefix val |
| attribute='*lue' | Search for files containing a value for attribute that ends with the suffix lue |
| attribute='*alu*' | Search for files containing a value for attribute that has alu somewhere |
| attribute='/.*alu.*/' | Search for files containing a value for attribute that matches the given regular expression (yes! you might use any regular expression to find what you want.) |
| attribute=value1 attribute=value2 OR: attribute={value1,value2} | Search for files containing either value1 OR value2 for the given attribute (note that's the same attribute twice!) |
| attribute1=value1 attribute2=value2 | Search for files containing value1 for attribute1 AND value2 for attribute2 |
| attribute_not_=value | Search for files NOT containing value |
| attribute_not_=value1 attribute_not_=value2 | Search for files containing neither value1 nor value2 |
Note
When using * remember that your shell might give it a different meaning (normally it will try to match files with that name). To turn that off, escape it with a backslash (key=\*) or use quotes (key='*').
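When the search is assembled from a script rather than typed by hand, the quoting can be delegated to a library. The following is a minimal sketch using Python's standard shlex module; the helper name build_search_command is hypothetical and not part of freva-client.

```python
import shlex


def build_search_command(**search_keys: str) -> str:
    """Return a shell-safe data-search command line (hypothetical helper)."""
    args = ["freva-client", "databrowser", "data-search"]
    # Each keyword argument becomes one key=value search pair.
    args += [f"{key}={value}" for key, value in search_keys.items()]
    # shlex.join quotes every argument the shell could misinterpret,
    # such as the wildcard in model=cp*.
    return shlex.join(args)


print(build_search_command(project="observations", variable="pr", model="cp*"))
```

The printed command keeps project=observations and variable=pr as they are and wraps only the wildcard argument in single quotes, so the shell passes cp* through to the databrowser unchanged.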
Searching multi-versioned datasets#
In datasets with multiple versions only the latest version (i.e. highest
version number) is returned by default. Querying a specific version of a
multi-versioned dataset requires the --multi-version flag in combination with
the version special attribute:
freva-client databrowser data-search dataset=cmip6-fs model=access-cm2 --multi-version version=v20191108
If --multi-version is set and no particular version is requested, all versions will be returned.
Streaming files via zarr#
Instead of getting the file locations on disk or tape, you can instruct the
system to register zarr streams. This means that instead of opening the
data directly you can open it via zarr from anywhere. To do so, simply add
the --zarr flag.
Note
Before you can use the --zarr flag you have to create an access token
and use that token to log on to the system.
See the Authentication & Authorization chapter for more details on token creation.
freva-client auth > .token.json
chmod 600 .token.json
freva-client databrowser data-search dataset=cmip6-fs --zarr --token-file .token.json
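Before passing a token file to --token-file it can be worth checking that the file is private and really holds a refresh token. The sketch below assumes, following the --token-file help text, that the file is a JSON object with a refresh_token key; check_token_file is a hypothetical helper and not part of freva-client.

```python
import json
import stat
from pathlib import Path


def check_token_file(path: str) -> list[str]:
    """Return a list of problems found with a token file (hypothetical helper)."""
    token_path = Path(path)
    problems = []
    # After chmod 600 no permission bits are set for group or others.
    mode = stat.S_IMODE(token_path.stat().st_mode)
    if mode & 0o077:
        problems.append(f"file is accessible by group or others (mode {mode:o})")
    # The help text states that the file has to contain a refresh_token key.
    try:
        content = json.loads(token_path.read_text())
    except json.JSONDecodeError:
        return problems + ["file is not valid JSON"]
    if not isinstance(content, dict) or "refresh_token" not in content:
        problems.append("no refresh_token key found")
    return problems
```

An empty list means that both checks passed, for example check_token_file(".token.json") right after the two commands above.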
Special cases: Searching for times#
For example, you may want to know how many files exist within a certain time range. To do so, use the time search key, which can subset single time steps as well as whole time ranges:
freva-client databrowser data-search project=observations -t '2016-09-02T22:15 to 2016-10'
The default method for selecting time periods is flexible, which means
that all files covering at least the start or the end date are selected. The
strict method implies that the entire search time period has to be
covered by the files. Using the strict method in the example above would
yield only one file because the first file contains time steps prior to the
start of the time period:
freva-client databrowser data-search project=observations -t '2016-09-02T22:15 to 2016-10' -ts strict
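The difference between the selection methods can be pictured as interval comparisons. The sketch below is one reading of the --time-select help text, not the server implementation. It assumes that flexible selects any file overlapping the search period, strict selects only files lying entirely within the search period, and file selects only files that span the whole search period on their own. The function name is_selected is hypothetical.

```python
from datetime import datetime


def is_selected(file_start, file_end, search_start, search_end, method="flexible"):
    """Decide whether a file's time range matches a search period (sketch)."""
    if method == "flexible":
        # Assumed meaning: the file overlaps the search period somewhere.
        return file_start <= search_end and file_end >= search_start
    if method == "strict":
        # Assumed meaning: the file does not reach outside the search period.
        return search_start <= file_start and file_end <= search_end
    if method == "file":
        # Assumed meaning: one single file contains the whole search period.
        return file_start <= search_start and search_end <= file_end
    raise ValueError(f"unknown method: {method}")


# Example from the help text: a file with data from 2010 to 2020,
# searched with the time period 2000 to 2012.
file_range = (datetime(2010, 1, 1), datetime(2020, 12, 31))
search = (datetime(2000, 1, 1), datetime(2012, 12, 31))
for method in ("flexible", "strict", "file"):
    print(method, is_selected(*file_range, *search, method=method))
```

Under these assumptions only flexible selects the file, which agrees with the example given in the help text.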
Giving single time steps is also possible:
freva-client databrowser data-search project=observations -t 2016-09-02T22:10
Note
The time format has to follow the
ISO-8601 standard. Time ranges
are indicated by the to keyword such as 2000 to 2100 or
2000-01 to 2100-12 and the like. Single time steps are given without the
to keyword.
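To check a time string before sending it to the databrowser, the accepted formats can be tried one after another. This is a minimal sketch and not the parser freva-client uses: it only knows the formats named in the help text plus %Y-%m-%d, which is assumed to be covered by the "etc", and it maps a shortened time step to the first instant of that period. The function names are hypothetical.

```python
from datetime import datetime

# Formats from the --time help text, most specific first.
# %Y-%m-%d is an assumption based on the "etc" in the help text.
FORMATS = ("%Y-%m-%dT%H:%M", "%Y-%m-%d", "%Y-%m", "%Y")


def parse_step(text: str) -> datetime:
    """Parse a single, possibly shortened, time step."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"not a supported time step: {text!r}")


def parse_time(text: str) -> tuple:
    """Split on the 'to' keyword and parse one or two time steps."""
    parts = text.split(" to ")
    if len(parts) > 2:
        raise ValueError("a time range needs exactly one 'to' keyword")
    return tuple(parse_step(part) for part in parts)


print(parse_time("2016-09-02T22:15 to 2016-10"))
print(parse_time("2016-09-02T22:10"))
```

A range yields a tuple with two entries and a single time step a tuple with one entry. How the server expands a shortened end date such as 2016-10 is not modelled here.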
Creating intake-esm catalogues#
The intake-catalogue sub command allows you to create an
intake-esm catalogue (https://intake-esm.readthedocs.io/en/stable/) from
the current search. This can be useful to share the catalogue with others
or merge datasets.
freva-client databrowser intake-catalogue --help
Results
Usage: freva-client databrowser intake-catalogue [OPTIONS] [SEARCH_KEYS]...
Create an intake catalogue from the search.
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ search_keys [SEARCH_KEYS]... Refine your data search with this │
│ `key=value` pair search parameters. The │
│ parameters could be, depending on the │
│ DRS standard, flavour product, project │
│ model etc. │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --facet TEXT If you are not sure │
│ about the correct search │
│ key's you can use the │
│ ``--facet`` flag to │
│ search of any matching │
│ entries. For example │
│ --facet 'era5' would │
│ allow you to search for │
│ any entries containing │
│ era5, regardless of │
│ project, product etc. │
│ --uniq-key -u [file|uri] The type of search │
│ result, which can be │
│ either “file” or “uri”. │
│ This parameter │
│ determines whether the │
│ search will be based on │
│ file paths or Uniform │
│ Resource Identifiers │
│ [default: file] │
│ --flavour -f [freva|cmip6|cmip5|corde The Data Reference │
│ x|nextgems|user] Syntax (DRS) standard │
│ specifying the type of │
│ climate datasets to │
│ query. │
│ [default: freva] │
│ --time-select -ts [strict|flexible|file] Operator that specifies │
│ how the time period is │
│ selected. Choose from │
│ flexible (default), │
│ strict or file. │
│ ``strict`` returns only │
│ those files that have │
│ the *entire* time period │
│ covered. The time search │
│ ``2000 to 2012`` will │
│ not select files │
│ containing data from │
│ 2010 to 2020 with the │
│ ``strict`` method. │
│ ``flexible`` will select │
│ those files as │
│ ``flexible`` returns │
│ those files that have │
│ start or end period │
│ covered. ``file`` will │
│ only return files where │
│ the entire time period │
│ is contained within *one │
│ single* file. │
│ [default: flexible] │
│ --time -t TEXT Special search facet to │
│ refine/subset search │
│ results by time. This │
│ can be a string │
│ representation of a time │
│ range or a single time │
│ step. The time steps │
│ have to follow ISO-8601. │
│ Valid strings are │
│ ``%Y-%m-%dT%H:%M`` to │
│ ``%Y-%m-%dT%H:%M`` for │
│ time ranges and │
│ ``%Y-%m-%dT%H:%M``. │
│ **Note**: You don't have │
│ to give the full string │
│ format to subset time │
│ steps ``%Y``, ``%Y-%m`` │
│ etc are also valid. │
│ --bbox -b <FLOAT FLOAT FLOAT Special search facet to │
│ FLOAT>... refine/subset search │
│ results by spatial │
│ extent. This can be a │
│ string representation of │
│ a bounding box. The │
│ bounding box has to │
│ follow the format │
│ ``min_lon max_lon │
│ min_lat max_lat``. Valid │
│ strings are ``-10 10 -10 │
│ 10`` to ``0 5 0 5``. │
│ --bbox-select -bs [strict|flexible|file] Operator that specifies │
│ how the spatial extent │
│ is selected. Choose from │
│ flexible (default), │
│ strict or file. │
│ ``strict`` returns only │
│ those files that have │
│ the *entire* spatial │
│ extent covered. The bbox │
│ search ``-10 10 -10 10`` │
│ will not select files │
│ containing data from 0 5 │
│ 0 5 with the ``strict`` │
│ method. ``flexible`` │
│ will select those files │
│ as ``flexible`` returns │
│ those files that have │
│ any part of the extent │
│ covered. ``file`` will │
│ only return files where │
│ the entire spatial │
│ extent is contained │
│ within *one single* │
│ file. │
│ [default: flexible] │
│ --zarr Create zarr stream │
│ files, as catalogue │
│ targets. │
│ --token-file -tf PATH Instead of │
│ authenticating via code │
│ based authentication │
│ flow you can set the │
│ path to the json file │
│ that contains a `refresh │
│ token` containing a │
│ refresh_token key. │
│ --filename -f PATH Path to the file where │
│ the catalogue, should be │
│ written to. if None │
│ given (default) the │
│ catalogue is parsed to │
│ stdout. │
│ --host TEXT Set the hostname of the │
│ databrowser, if not set │
│ (default) the hostname │
│ is read from a config │
│ file │
│ -v INTEGER Increase verbosity │
│ [default: 0] │
│ --multi-version Select all versions and │
│ not just the latest │
│ version (default). │
│ --version -V Show version an exit │
│ --help Show this message and │
│ exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
You can either set the --filename flag to save the catalogue to a .json
file or let the command print the catalogue to stdout (default). Just like for the data-search
sub command you can instruct the system to create zarr file streams to access
the data via zarr.
Creating STAC Catalogue#
The stac-catalogue sub command allows you to create a static
SpatioTemporal Asset Catalog (STAC) (https://stacspec.org/en/about/stac-spec/)
from the current search. This can be useful for creating, sharing and using
standardized geospatial data catalogs and enabling interoperability between
different data systems.
freva-client databrowser stac-catalogue --help
Results
Usage: freva-client databrowser stac-catalogue [OPTIONS] [SEARCH_KEYS]...
Create a static STAC catalogue from the search.
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ search_keys [SEARCH_KEYS]... Refine your data search with this │
│ `key=value` pair search parameters. The │
│ parameters could be, depending on the │
│ DRS standard, flavour product, project │
│ model etc. │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --facet TEXT If you are not sure │
│ about the correct search │
│ key's you can use the │
│ ``--facet`` flag to │
│ search of any matching │
│ entries. For example │
│ --facet 'era5' would │
│ allow you to search for │
│ any entries containing │
│ era5, regardless of │
│ project, product etc. │
│ --uniq-key -u [file|uri] The type of search │
│ result, which can be │
│ either “file” or “uri”. │
│ This parameter │
│ determines whether the │
│ search will be based on │
│ file paths or Uniform │
│ Resource Identifiers │
│ [default: file] │
│ --flavour -f [freva|cmip6|cmip5|corde The Data Reference │
│ x|nextgems|user] Syntax (DRS) standard │
│ specifying the type of │
│ climate datasets to │
│ query. │
│ [default: freva] │
│ --time-select -ts [strict|flexible|file] Operator that specifies │
│ how the time period is │
│ selected. Choose from │
│ flexible (default), │
│ strict or file. │
│ ``strict`` returns only │
│ those files that have │
│ the *entire* time period │
│ covered. The time search │
│ ``2000 to 2012`` will │
│ not select files │
│ containing data from │
│ 2010 to 2020 with the │
│ ``strict`` method. │
│ ``flexible`` will select │
│ those files as │
│ ``flexible`` returns │
│ those files that have │
│ start or end period │
│ covered. ``file`` will │
│ only return files where │
│ the entire time period │
│ is contained within *one │
│ single* file. │
│ [default: flexible] │
│ --time -t TEXT Special search facet to │
│ refine/subset search │
│ results by time. This │
│ can be a string │
│ representation of a time │
│ range or a single time │
│ step. The time steps │
│ have to follow ISO-8601. │
│ Valid strings are │
│ ``%Y-%m-%dT%H:%M`` to │
│ ``%Y-%m-%dT%H:%M`` for │
│ time ranges and │
│ ``%Y-%m-%dT%H:%M``. │
│ **Note**: You don't have │
│ to give the full string │
│ format to subset time │
│ steps ``%Y``, ``%Y-%m`` │
│ etc are also valid. │
│ --bbox -b <FLOAT FLOAT FLOAT Special search facet to │
│ FLOAT>... refine/subset search │
│ results by spatial │
│ extent. This can be a │
│ string representation of │
│ a bounding box. The │
│ bounding box has to │
│ follow the format │
│ ``min_lon max_lon │
│ min_lat max_lat``. Valid │
│ strings are ``-10 10 -10 │
│ 10`` to ``0 5 0 5``. │
│ --bbox-select -bs [strict|flexible|file] Operator that specifies │
│ how the spatial extent │
│ is selected. Choose from │
│ flexible (default), │
│ strict or file. │
│ ``strict`` returns only │
│ those files that have │
│ the *entire* spatial │
│ extent covered. The bbox │
│ search ``-10 10 -10 10`` │
│ will not select files │
│ containing data from 0 5 │
│ 0 5 with the ``strict`` │
│ method. ``flexible`` │
│ will select those files │
│ as ``flexible`` returns │
│ those files that have │
│ any part of the extent │
│ covered. ``file`` will │
│ only return files where │
│ the entire spatial │
│ extent is contained │
│ within *one single* │
│ file. │
│ [default: flexible] │
│ --host TEXT Set the hostname of the │
│ databrowser, if not set │
│ (default) the hostname │
│ is read from a config │
│ file │
│ -v INTEGER Increase verbosity │
│ [default: 0] │
│ --multi-version Select all versions and │
│ not just the latest │
│ version (default). │
│ --version -V Show version an exit │
│ --filename -o PATH Path to the file where │
│ the static STAC │
│ catalogue, should be │
│ written to. If you don't │
│ specify or the path does │
│ not exist, the file will │
│ be created in the │
│ current working │
│ directory. │
│ --help Show this message and │
│ exit. │
╰──────────────────────────────────────────────────────────────────────────────╯
To get a static STAC catalogue you can use the following command:
freva-client databrowser stac-catalogue --filename /path/to/output
If --filename is not provided, or the given path does not exist, the STAC catalogue is saved in the current working directory. The value can be either a directory or a fully qualified filename.
The STAC Catalogue provides multiple ways to access and interact with the data:
Access your climate data through the intake-esm data catalog specification
Access search results as Zarr files, available as STAC Assets at both collection and item levels
Browse and explore your search results directly through the Freva DataBrowser web interface
Each of these access methods is encoded as STAC Assets, making them easily discoverable and accessible through any STAC-compatible tool.
Query the number of occurrences#
In some cases it might be useful to know how many files are found in the
databrowser for certain search constraints. In such cases you can use the
data-count sub command to count the number of found files instead of getting
the files themselves.
freva-client databrowser data-count --help
By default the data-count sub command will display the total number of items
matching your search query. For example:
freva-client databrowser data-count project=observations
If you want to group the number of occurrences by search categories (facets)
use the -d or --detail flag:
freva-client databrowser data-count -d project=observations
Retrieving the available metadata#
Sometimes it is useful to know which values a search attribute can take.
For this you can use the metadata-search sub command:
freva-client databrowser metadata-search --help
Just like with any other databrowser command you can apply different search constraints when acquiring metadata:
freva-client databrowser metadata-search project=observations
By default the command will display only the most commonly used metadata
keys. To retrieve all metadata you can use the -e or --extended-search
flag.
freva-client databrowser metadata-search -e project=observations
Sometimes you don't know the exact names of the search keys and
want to retrieve all file objects that match a certain category. For example,
to get all ocean reanalysis datasets you can apply the --facet flag:
freva-client databrowser metadata-search -e realm=ocean --facet 'rean*'
Expert tip: Getting metadata for certain files#
In some cases it might be useful to retrieve metadata for certain file or object store locations. For example, to retrieve the metadata of the files stored on tape:
freva-client databrowser metadata-search -e file="/arch/*"
Parsing the command output#
You might have already noticed that most commands have a --json flag.
Enabling this flag makes the command print its output as JSON.
Following on from the example above, we can request the output of the reverse search as JSON and hand it to the command line JSON processor jq:
freva-client databrowser metadata-search -e file="/arch/*" --json
By using the pipe operator | the JSON output of the freva-client
commands can be piped and processed by jq:
freva-client databrowser metadata-search -e file="/arch/*" --json | jq -r '.ensemble[0]'
The above example will select only the first entry of the ensembles that are associated with files on the tape archive.
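The same post-processing can be done in Python instead of jq. The following sketch assumes that the JSON output is an object mapping each metadata key to a list of values, as the jq filter .ensemble[0] suggests. The function names and the sample values are hypothetical; only the command line options are taken from the examples above.

```python
import json
import subprocess


def metadata_search(*search_keys: str) -> dict:
    """Run metadata-search with -e and --json and decode the output."""
    cmd = ["freva-client", "databrowser", "metadata-search", "-e", "--json"]
    result = subprocess.run(
        cmd + list(search_keys), check=True, capture_output=True, text=True
    )
    return json.loads(result.stdout)


def first_value(metadata: dict, key: str):
    """Mimic the jq filter .key[0]: first entry of a key, or None."""
    values = metadata.get(key) or []
    return values[0] if values else None


# Hypothetical output for illustration; real values depend on the server.
sample = json.loads('{"ensemble": ["r1i1p1", "r2i1p1"], "project": ["observations"]}')
print(first_value(sample, "ensemble"))
```

With a configured client, first_value(metadata_search("file=/arch/*"), "ensemble") would correspond to the jq pipeline above. No shell is involved in the subprocess call, so the wildcard needs no extra quoting there.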
Adding and Deleting User Data#
You can manage your personal datasets within the databrowser by adding or deleting user-specific data. This functionality allows you to include your own data files into the databrowser, making them accessible for analysis and retrieval alongside other datasets.
Before using the user-data commands, you need to create an access token and authenticate with the system. Please refer to the Authentication & Authorization chapter for more details on token creation and authentication.
Adding User Data#
To add your data to the databrowser, use the user-data add command. You’ll need to provide the paths to your data files, and any metadata you’d like to associate with your data.
token=$(freva-client auth -u janedoe | jq -r .access_token)
freva-client databrowser user-data add \
--path freva-rest/src/freva_rest/databrowser_api/mock/ \
--facet project=cordex \
--facet experiment=rcp85 \
--facet model=mpi-m-mpi-esm-lr-clmcom-cclm4-8-17-v1 \
--facet variable=tas \
--access-token "$token"
This command adds the specified data files to the databrowser and tags them with the provided metadata. These facets help in indexing and searching your data within the system.
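When the facets already live in a script, for example as a dictionary, the argument list for user-data add can be generated instead of typed out. This is a minimal sketch; the helper user_data_add_args is hypothetical, and it uses only the options shown in the command above.

```python
def user_data_add_args(path: str, facets: dict, access_token: str) -> list:
    """Build the argument list for 'user-data add' (hypothetical helper)."""
    args = ["freva-client", "databrowser", "user-data", "add", "--path", path]
    # Every facet is passed as its own --facet key=value option.
    for key, value in facets.items():
        args += ["--facet", f"{key}={value}"]
    args += ["--access-token", access_token]
    return args


args = user_data_add_args(
    "freva-rest/src/freva_rest/databrowser_api/mock/",
    {"project": "cordex", "experiment": "rcp85", "variable": "tas"},
    access_token="<token>",  # placeholder, not a real token
)
print(" ".join(args))
```

The resulting list can be handed to subprocess.run(args, check=True). Because the arguments are passed as a list, no shell quoting is required.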
Deleting User Data#
If you need to remove your data from the databrowser, use the user-data delete command. Provide the search keys (facets) that identify the user data you wish to delete.
token=$(freva-client auth -u janedoe | jq -r .access_token)
freva-client databrowser user-data delete \
--search-key project=cordex \
--search-key experiment=rcp85 \
--access-token "$token"
This command deletes all data entries that match the specified search keys from the databrowser.