Open-EO · m-mohr · Jan 30, 2023 · Feb 24, 2022 · Feb 25, 2022 · Feb 25, 2022
diff --git a/documentation/1.0/datacubes.md b/documentation/1.0/datacubes.md
@@ -2,53 +2,79 @@
 
 ## What are Datacubes?
 
-Datacubes are multidimensional arrays with one or more spatial or temporal dimension(s). They are the way in which data is represented in OpenEO. They provide a nice and tidy interface for spatiotemporal data as well as the operations you may want to execute on it. As they are arrays, it might be easiest to look at raster data as an example, even though datacubes can hold vector data as well. Our example data however consists of a 6x7 raster with 4 bands [`blue`, `green`, `red`, `near-infrared`] and 3 timesteps [`2020-10-01`, `2020-10-13`, `2020-10-25`], displayed here in an orderly, timeseries-like manner:
+Data is represented as datacubes in openEO, which are multi-dimensional arrays with additional information about their dimensionality. Datacubes can provide a nice and tidy interface for spatiotemporal data as well as for the operations you may want to execute on them. As they are arrays, it might be easiest to look at raster data as an example, even though datacubes can hold vector data as well. Our example data however consists of a 6x7 raster with 4 bands [`blue`, `green`, `red`, `near-infrared`] and 3 timesteps [`2020-10-01`, `2020-10-13`, `2020-10-25`], displayed here in an orderly, timeseries-like manner:
 
 <figure>
-    <img src="./datacubes/dc_timeseries.png" alt="Datacube timeseries: 12 imagery tiles are depicted, grouped by 3 dates along a timeline (time dimension). Each date has a blue, green, red and near-infrared band (bands dimension). Each single tile has the dimensions x and y (spatial dimensions).">
-    <figcaption>An exemplary datacube with 4 dimensions: x, y, bands and time.</figcaption>
+    <img src="./datacubes/dc_timeseries.png" alt="Raster datacube timeseries: 12 imagery tiles are depicted, grouped by 3 dates along a timeline (time dimension). Each date has a blue, green, red and near-infrared band (bands dimension). Each single tile has the dimensions x and y (spatial dimensions).">
+    <figcaption>An exemplary raster datacube with 4 dimensions: x, y, bands and time.</figcaption>
 </figure>
 
 It is important to understand that datacubes are designed to make things easier for us, and are not literally a cube, meaning that the above plot is just as good a representation as any other. That is why we can switch the dimensions around and display them in whatever way we want, including the view below:
 
 <figure>
-    <img src="./datacubes/dc_flat.png" alt="Datacube flat representation: The 12 imagery tiles are now laid out flat as a 4 by 3 grid (bands by timesteps). All dimension labels are depicted (The timestamps, the band names and the x, y coordinates).">
+    <img src="./datacubes/dc_flat.png" alt="Raster datacube flat representation: The 12 imagery tiles are now laid out flat as a 4 by 3 grid (bands by timesteps). All dimension labels are depicted (The timestamps, the band names and the x, y coordinates).">
     <figcaption>This is the 'raw' data collection that is our example datacube. The grayscale images are colored for understandability, and dimension labels are displayed.</figcaption>
 </figure>
 
+A vector cube on the other hand could look like this:
+
+<figure>
+    <img src="./datacubes/vector.png" alt="Vector datacube: 2 geometries are depicted for the vector dimension, along with 3 timesteps along the time dimension and 4 bands.">
+    <figcaption>An exemplary vector datacube with 3 dimensions: 2 geometries are given for the vector dimension, along with 3 timesteps for the time dimension and 4 bands.</figcaption>
+</figure>
+
+[Vector data cubes](https://r-spatial.org/r/2022/09/12/vdc.html) and raster data cubes are common cases of data cubes in the EO domain.
+A raster data cube has at least two spatial dimensions (e.g. `x` and `y`) and a vector data cube has at least a vector dimension (e.g. `geometry`).
+These distinctions are just made so that it is easier to describe "special" cases of data cubes, but you can also define other types such as a temporal data cube that has at least a temporal dimension (e.g. `t`).
+
 ## Dimensions
+
 A dimension refers to a certain axis of a datacube. This includes all variables (e.g. bands), which are represented as dimensions. Our exemplary raster datacube has the spatial dimensions `x` and `y`, and the temporal dimension `t`. Furthermore, it has a `bands` dimension, extending into the realm of _what kind of information_ is contained in the cube.
 
 The following properties are usually available for dimensions:
 
 * name
+* type (`spatial`, `temporal`, `bands`, `vector` or `other`)
 * axis / number
-* type (spatial/temporal/bands/other)
-* extents _or_ nominal dimension labels
-* reference system / projections
-* resolution
+* labels (usually exposed in metadata as nominal values _or_ extents)
+* reference system / projection
+* resolution / step size
+* unit (either explicitly specified or implicitly given by the reference system)
 
-Here is an overview of the dimensions contained in our example datacube above:
+Here is an overview of the dimensions contained in our example raster datacube above:
 
-| # | dimension name | dimension labels | resolution |
-|---|----------------|------------------| ---------- |
-| 1 | `x`              | `466380`, `466580`, `466780`, `466980`, `467180`, `467380` | 10m |
-| 2 | `y`             | `7167130`, `7166930`, `7166730`, `7166530`, `7166330`, `7166130`, `7165930` | 10m |
-| 3 | `bands`          | `blue`, `green`, `red`, `nir` | 4 bands |
-| 4 | `t`              | `2020-10-01`, `2020-10-13`, `2020-10-25` | 12 days |
+| # | name    | type     | labels                                                                      | resolution | reference system                    |
+| - | ------- | -------- | --------------------------------------------------------------------------- | ---------- | ----------------------------------- |
+| 1 | `x`     | spatial  | `466380`, `466580`, `466780`, `466980`, `467180`, `467380`                  | 200m       | [EPSG:32627](https://epsg.io/32627) |
+| 2 | `y`     | spatial  | `7167130`, `7166930`, `7166730`, `7166530`, `7166330`, `7166130`, `7165930` | 200m       | [EPSG:32627](https://epsg.io/32627) |
+| 3 | `bands` | bands    | `blue`, `green`, `red`, `nir`                                               | 4 bands    | -                                   |
+| 4 | `t`     | temporal | `2020-10-01`, `2020-10-13`, `2020-10-25`                                    | 12 days    | Gregorian calendar / UTC            |
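As a cross-check of the table above, the following Python sketch derives each dimension's step size from its labels. It uses only the standard library; the `DIMENSIONS` dictionary and the `steps` helper are illustrative names and not part of any openEO API.

```python
from datetime import date

# Labels copied from the table above; the dict layout is an assumption for illustration.
DIMENSIONS = {
    "x": [466380, 466580, 466780, 466980, 467180, 467380],
    "y": [7167130, 7166930, 7166730, 7166530, 7166330, 7166130, 7165930],
    "bands": ["blue", "green", "red", "nir"],
    "t": [date(2020, 10, 1), date(2020, 10, 13), date(2020, 10, 25)],
}


def steps(labels):
    """Return the set of absolute differences between neighbouring labels."""
    return {abs(b - a) for a, b in zip(labels, labels[1:])}


x_steps = steps(DIMENSIONS["x"])  # spacing in the units of the reference system
y_steps = steps(DIMENSIONS["y"])
t_steps = {delta.days for delta in steps(DIMENSIONS["t"])}  # spacing in days
shape = tuple(len(labels) for labels in DIMENSIONS.values())

print(x_steps, y_steps, t_steps, shape)
```

Under these assumptions, both spatial axes step by 200 units and the temporal axis by 12 days, which agrees with the resolution column, and the shape reflects the 6x7 raster with 4 bands and 3 timesteps.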
 
-Dimension labels are either numerical or text (also known as "strings"), which also includes textual representations of timestamps for example. Dimensions with a natural/inherent order are always sorted. These are usually all spatial and temporal dimensions. Dimensions without inherent order, `bands` in openEO for example, retain the order in which they have been defined in metadata or processes (e.g. through [`filter_bands`](https://processes.openeo.org/#filter_bands)), with new labels simply being appended to the existing labels.
+Dimension labels are usually either numerical or text (also known as "strings"), which also includes textual representations of timestamps or vectors for example.
+For example, geometries (i.e. the labels of a geometry dimension) can be encoded in [Well-known Text (WKT)](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry) or GeoJSON, while temporal labels are usually encoded as [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) compatible dates and/or times.
 
-OpenEO datacubes contain scalar values (e.g. strings, numbers or boolean values), with all other associated attributes stored in dimensions (e.g. coordinates or timestamps). Attributes such as the CRS or the sensor can also be turned into dimensions. Be advised that in such a case, the uniqueness of pixel coordinates may be affected. When usually, `(x, y)` refers to a unique location, that changes to `(x, y, CRS)` when `(x, y)` values are reused in other coordinate reference systems (e.g. two neighboring UTM zones).
+Dimensions with a natural/inherent order are always sorted. These are usually all spatial and temporal dimensions. Dimensions without inherent order, such as `bands` in openEO, retain the order in which they have been defined in metadata or processes (e.g. through [`filter_bands`](https://processes.openeo.org/#filter_bands)), with new labels simply being appended to the existing labels.
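The ordering rule can be sketched in a few lines of Python. The `add_label` helper is hypothetical, and sorting in ascending order is an assumption made here; the text only states that ordered dimensions are kept sorted.

```python
def add_label(labels, new_label, has_inherent_order):
    """Return a new label list with new_label added.

    Dimensions with an inherent order (e.g. spatial, temporal) are re-sorted;
    dimensions without one (e.g. bands) keep their order and append.
    """
    result = list(labels) + [new_label]
    return sorted(result) if has_inherent_order else result


# ISO 8601 date strings sort chronologically when compared as text.
t_labels = add_label(["2020-10-01", "2020-10-25"], "2020-10-13", has_inherent_order=True)
band_labels = add_label(["red", "nir"], "blue", has_inherent_order=False)

print(t_labels)
print(band_labels)
```

The new timestamp ends up between the existing ones, whereas the new band is placed last regardless of its name.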
+
+A vector dimension is not included in the example raster datacube above and is not used in the following examples, but a vector dimension with two polygons could look like this:
+
+| name       | type   | labels | reference system |
+| ---------- | ------ | ------ | ---------------- |
+| `geometry` | vector | `POLYGON((-122.4 37.6,-122.35 37.6,-122.35 37.64,-122.4 37.64,-122.4 37.6))`, `POLYGON((-122.51 37.5,-122.48 37.5,-122.48 37.52,-122.51 37.52,-122.51 37.5))` | [EPSG:4326](https://epsg.io/4326) |
+
+Vector dimensions can consist of points, linestrings, polygons, multi points, multi linestrings and multi polygons or a mixture of those. Empty geometries (including GeoJSON `null` geometries) are not allowed.
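A minimal Python sketch of this rule for GeoJSON-like geometries is given below. The `is_valid_vector_label` helper and the `ALLOWED_TYPES` set are hypothetical names. Rejecting types that the text does not list (such as `GeometryCollection`) and treating an empty `coordinates` array as an empty geometry are assumptions of this sketch.

```python
# GeoJSON type names for the six geometry types listed above.
ALLOWED_TYPES = {
    "Point", "LineString", "Polygon",
    "MultiPoint", "MultiLineString", "MultiPolygon",
}


def is_valid_vector_label(geometry):
    """Check a GeoJSON-like geometry (dict or None) against the rule above."""
    if geometry is None:  # GeoJSON null geometry
        return False
    if geometry.get("type") not in ALLOWED_TYPES:  # assumption: unlisted types are rejected
        return False
    return len(geometry.get("coordinates", [])) > 0  # reject empty geometries


point = {"type": "Point", "coordinates": [-122.4, 37.6]}
empty_polygon = {"type": "Polygon", "coordinates": []}

print(is_valid_vector_label(point))
print(is_valid_vector_label(empty_polygon))
print(is_valid_vector_label(None))
```

A back-end would likely validate more than this (ring closure, coordinate counts), so the check is only a starting point.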
+
+openEO datacubes contain scalar values (e.g. strings, numbers or boolean values), with all other associated attributes stored in dimensions (e.g. coordinates or timestamps). Attributes such as the CRS or the sensor can also be turned into dimensions. Be advised that in such a case, the uniqueness of pixel coordinates may be affected. While `(x, y)` usually refers to a unique location, this changes to `(x, y, CRS)` when `(x, y)` values are reused in other coordinate reference systems (e.g. two neighboring UTM zones).
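The effect on uniqueness can be illustrated with a small Python sketch. The coordinates and values are made up; the two EPSG codes are the neighboring UTM zones 32N and 33N.

```python
# Two observations that share (x, y) values but live in neighboring UTM zones.
# Coordinates and values are invented for illustration.
observations = [
    {"x": 500000, "y": 5000000, "crs": "EPSG:32632", "value": 0.21},
    {"x": 500000, "y": 5000000, "crs": "EPSG:32633", "value": 0.47},
]

# Keyed by (x, y) only: the second observation overwrites the first.
by_xy = {(o["x"], o["y"]): o["value"] for o in observations}

# Keyed by (x, y, CRS): both observations are kept.
by_xy_crs = {(o["x"], o["y"], o["crs"]): o["value"] for o in observations}

print(len(by_xy), len(by_xy_crs))
```

Once the CRS becomes a dimension, the CRS label has to be part of any key that is meant to identify a location.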
 
 ::: tip Be Careful with Data Types
 As stated above, datacubes only contain scalar values. However, implementations may differ in their ability to handle or convert them. Implementations may also not allow mixing data types in a datacube. For example, returning a boolean value for a reducer on a numerical datacube may result in an error on some back-ends. The recommendation is to not change the data type of values in a datacube unless the back-end supports it explicitly.
 :::
 
 ### Applying Processes on Dimensions
+
 Some processes are typically applied "along a dimension". You can imagine said dimension as an arrow and whatever is happening as a parallel process to that arrow. It simply means: "we focus on _this_ dimension right now".
 
 ### Resolution
+
 The resolution of a dimension gives information about what interval lies between observations. This is most obvious with the temporal resolution, where the intervals depict how often observations were made. Spatial resolution gives information about the pixel spacing, meaning how many 'real world meters' are contained in a pixel. The number of bands and their wavelength intervals give information about the spectral resolution.
 
 ### Coordinate Reference System as a Dimension

diff --git a/documentation/1.0/datacubes/.scripts/datacube_plots.R b/documentation/1.0/datacubes/.scripts/datacube_plots.R
@@ -491,8 +491,8 @@ pl(b, 46.5, -3.5, m = vecM, pal = alpha("white", 0.9), border = 0)
 print_vector_content(52.5, -1.5)
 pl(b, 45, -2, m = vecM, pal = alpha("white", 0.9), border = 0)
 print_vector_content(51, 0)
-text(51.5, 15, "Line_1")
-text(63, 15, "Polygon_1")
+text(51.5, 15, "LINESTRING(...)") # e.g. LINESTRING(24.6 19, 24.6 17.4, 25.8 16.4, 27.9 16.1)
+text(63, 15, "POLYGON(...)") # e.g. POLYGON((30 18.2, 32.3 17.6, 32.6 19.2, 31.9 19.7, 30 18.2))
 text(57, 17.5, "Geometries", cex = 1.1)
 text(42, 12, "blue")
 text(42,  8, "green")

diff --git a/documentation/1.0/datacubes/dc_aggregate_space.png b/documentation/1.0/datacubes/dc_aggregate_space.png
diff --git a/documentation/1.0/datacubes/vector.png b/documentation/1.0/datacubes/vector.png
diff --git a/documentation/1.0/glossary.md b/documentation/1.0/glossary.md
@@ -33,7 +33,29 @@ In openEO, a back-end offers a set of collections to be processed. All collectio
 
 ## Spatial datacubes
 
-A spatiotemporal datacube is a multidimensional array with one or more spatial or temporal dimensions. In the EO domain, it is common to be implicit about the temporal dimension and just refer to them as spatial datacubes in short. Special cases are raster and vector datacubes. Learn more about datacubes in the [datacube documentation](https://openeo.org/documentation/1.0/datacubes.html).
+A spatiotemporal datacube is a multidimensional array with one or more spatial or temporal dimensions.
+In the EO domain, it is common to be implicit about the temporal dimension and just refer to them as spatial datacubes in short.
+Special cases are raster and [vector datacubes](https://r-spatial.org/r/2022/09/12/vdc.html).
+Learn more about datacubes in the [datacube documentation](https://openeo.org/documentation/1.0/datacubes.html).
+
+## Vector data
+
+In general, **vector data** represent specific things (also called "features") in a space, e.g. on the surface of the Earth.
+
+A **coordinate** represents a specific point in space.
+
+A **feature** is a thing that has a geometry (e.g. the outline of an agricultural field, a forest or an urban area) and it may have additional properties assigned (e.g. a name, a color or a population).
+
+**Geometries** consist of one or more coordinates that may be connected to form a specific type of geometry, e.g. two points can be connected to form a straight line and four straight lines can be connected to form a rectangle.
+
+Commonly used types of geometries are:
+- Point
+- LineString (connected straight line pieces)
+- Polygon (connected straight line pieces forming a closed ring, possibly with holes - for example a triangle or rectangle)
+
+Multiple geometries of the same type can be combined into a group of geometries, e.g. a Multi Point or a Multi Polygon.
+
+Features and geometries are specified by the OGC in the [Simple Feature Access specification](https://www.ogc.org/standards/sfa) (and ISO 19125). See the specification for more details.
 
 ## User-defined function (UDF)