HOWTO: describe dependency management (elastic#388)
* Update build command

* HOWTO: dependency management
mtojek authored Jun 28, 2021
1 parent eb14c86 commit 97d86d4
Showing 3 changed files with 91 additions and 1 deletion.
2 changes: 2 additions & 0 deletions README.md
@@ -68,6 +68,8 @@ Built packages are served up by the Elastic Package Registry running locally (se

Built packages can also be published to the global package registry service.

For details on how to enable dependency management, see the [HOWTO guide](https://github.com/elastic/elastic-package/blob/master/docs/howto/dependency_management.md).

### `elastic-package check`

_Context: package_
4 changes: 3 additions & 1 deletion cmd/build.go
@@ -22,7 +22,9 @@ Built packages are stored in the "build/" folder located at the root folder of t
Built packages are served up by the Elastic Package Registry running locally (see "elastic-package stack"). If you want a local package to be served up by the local Elastic Package Registry, make sure to build that package first using "elastic-package build".
Built packages can also be published to the global package registry service.`
Built packages can also be published to the global package registry service.
For details on how to enable dependency management, see the [HOWTO guide](https://github.com/elastic/elastic-package/blob/master/docs/howto/dependency_management.md).`

func setupBuildCommand() *cobraext.Command {
cmd := &cobra.Command{
86 changes: 86 additions & 0 deletions docs/howto/dependency_management.md
@@ -0,0 +1,86 @@
# HOWTO: Enable dependency management

## Motivation

As the package universe keeps growing, fields are increasingly reused by different integrations, especially
those based on the [Elastic Common Schema](https://github.com/elastic/ecs) (ECS). Without dependency management in place,
developers tended to copy the same field definitions (mostly ECS related) from one integration to another, which
increased the repository size and accidentally introduced inconsistencies. As there was no single source of truth defining
which field definition was correct, maintenance and typo correction were expensive.

This situation made simple dependency management a requirement for maintaining all fields in use, especially
those imported from external sources.

## Principles of operation

Currently, Elastic Packages support build-time dependencies that can be used as external field sources. They use a flat
dependency model represented by an additional build manifest, stored in an optional YAML file - `_dev/build/build.yml`:

```yaml
dependencies:
  ecs:
    reference: git@<commit SHA or Git tag>
```

When `elastic-package` builds the package, it uses the build manifest to construct a map of dependencies and their references.

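
As a rough illustration of what such a dependencies map could hold, the sketch below splits a `<source>@<revision>` reference into its two parts. The `dependencyRef` type and the `parseReference` function are hypothetical names chosen for this example and are not taken from the `elastic-package` source; the manifest is assumed to be already decoded into a plain map.

```go
package main

import (
	"fmt"
	"strings"
)

// dependencyRef is a hypothetical type for this sketch; it is not part of
// the elastic-package code base.
type dependencyRef struct {
	Source   string // e.g. "git"
	Revision string // commit SHA or Git tag
}

// parseReference splits a "<source>@<revision>" reference into its parts and
// rejects references where either part is missing.
func parseReference(reference string) (dependencyRef, error) {
	parts := strings.SplitN(reference, "@", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return dependencyRef{}, fmt.Errorf("invalid reference %q, expected <source>@<revision>", reference)
	}
	return dependencyRef{Source: parts[0], Revision: parts[1]}, nil
}

func main() {
	// Assumption: the build manifest was already decoded into a
	// dependency name -> reference map.
	manifest := map[string]string{
		"ecs": "git@0b8b7d6121340e99a1eb463c91fd1bc7c9eb2e41",
	}
	for name, reference := range manifest {
		ref, err := parseReference(reference)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: source=%s revision=%s\n", name, ref.Source, ref.Revision)
	}
}
```

Validating the reference early means a malformed manifest can be reported before any download is attempted.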

## External fields

When the builder processes fields files and encounters references to external sources, for example:

```yaml
- name: event.category
  external: ecs
- name: event.created
  external: ecs
- name: user_agent.os.full
  external: ecs
```

... it will try to resolve them using the prepared dependencies map and replace them with the actual definitions (importing).
The tool will try to download the referenced schemas (e.g. `git@0b8b7d6121340e99a1eb463c91fd1bc7c9eb2e41` or `git@<Git tag>`) and cache them locally.
Cached files are stored in a dedicated directory - `~/.elastic-package/cache/fields/`. Versioned schema files are assumed
not to change.
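
The import step can be pictured with a minimal sketch like the one below. It assumes the schema has already been downloaded and indexed by field name; the `field` type and the `resolveExternal` function are hypothetical and only model the idea of replacing an `external` reference with the definition from the referenced source.

```go
package main

import "fmt"

// field is a simplified, hypothetical model of one entry in a fields file.
type field struct {
	Name        string
	External    string
	Type        string
	Description string
}

// resolveExternal replaces every field that sets External with the definition
// of the same name from the matching schema. Fields without an external
// reference are kept unchanged.
func resolveExternal(fields []field, schemas map[string]map[string]field) ([]field, error) {
	resolved := make([]field, 0, len(fields))
	for _, f := range fields {
		if f.External == "" {
			resolved = append(resolved, f)
			continue
		}
		schema, ok := schemas[f.External]
		if !ok {
			return nil, fmt.Errorf("field %q: unknown dependency %q", f.Name, f.External)
		}
		definition, ok := schema[f.Name]
		if !ok {
			return nil, fmt.Errorf("field %q: not defined in %q", f.Name, f.External)
		}
		resolved = append(resolved, definition)
	}
	return resolved, nil
}

func main() {
	// Assumption: the ECS schema was loaded from the local cache beforehand.
	schemas := map[string]map[string]field{
		"ecs": {
			"event.created": {Name: "event.created", Type: "date"},
			"user_agent.os.full": {
				Name:        "user_agent.os.full",
				Type:        "keyword",
				Description: "Operating system name, including the version or code name.",
			},
		},
	}
	fields := []field{
		{Name: "event.created", External: "ecs"},
		{Name: "user_agent.os.full", External: "ecs"},
	}
	resolved, err := resolveExternal(fields, schemas)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range resolved {
		fmt.Printf("%s (%s)\n", f.Name, f.Type)
	}
}
```

Returning an error for an unknown dependency or field, rather than silently skipping it, keeps a typo in a fields file from producing an incomplete package.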

To verify that the build went well, open the `build` directory and inspect the generated fields (e.g. `./build/integrations/nginx/1.2.3/access/fields/ecs.yml`):

```yaml
- description: |-
    This is one of four ECS Categorization Fields, and indicates the second level in the ECS category hierarchy.
    `event.category` represents the "big buckets" of ECS categories. For example, filtering on `event.category:process` yields all events relating to process activity. This field is closely related to `event.type`, which is used as a subcategory.
    This field is an array. This will allow proper categorization of some events that fall in multiple categories.
  name: event.category
  type: keyword
- description: |-
    event.created contains the date/time when the event was first read by an agent, or by your pipeline.
    This field is distinct from @timestamp in that @timestamp typically contain the time extracted from the original event.
    In most situations, these two timestamps will be slightly different. The difference can be used to calculate the delay between your source generating an event, and the time when your agent first processed it. This can be used to monitor your agent's or pipeline's ability to keep up with your event source.
    In case the two timestamps are identical, @timestamp should be used.
  name: event.created
  type: date
- description: Operating system name, including the version or code name.
  name: user_agent.os.full
  type: keyword
```

Fields in the output fields files are sorted in alphabetical order.

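
A minimal sketch of such an ordering step, assuming the sort key is the field name (as the sample output above suggests); the `field` type and `sortFields` function are hypothetical names for this example:

```go
package main

import (
	"fmt"
	"sort"
)

// field is a simplified, hypothetical model of a resolved field definition.
type field struct {
	Name string
	Type string
}

// sortFields orders the fields in place by name, in ascending byte order.
func sortFields(fields []field) {
	sort.Slice(fields, func(i, j int) bool {
		return fields[i].Name < fields[j].Name
	})
}

func main() {
	fields := []field{
		{Name: "user_agent.os.full", Type: "keyword"},
		{Name: "event.created", Type: "date"},
		{Name: "event.category", Type: "keyword"},
	}
	sortFields(fields)
	for _, f := range fields {
		fmt.Println(f.Name)
	}
}
```

A stable, deterministic order keeps the generated files reproducible between builds, so diffs only show real changes.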

### ECS repository

This dependency type refers to the ECS repository and allows importing fields (name, type, description) from the common schema.
The schema is imported from the generated artifact (`generated/beats/fields.ecs.yml`) at the revision given by a Git tag or a commit SHA.

To import fields from ECS v1.9, prepare the following `build.yml` file:

```yaml
dependencies:
  ecs:
    reference: git@<Git tag for ECS v1.9>
```

and use the following field definition:

```yaml
- name: event.category
  external: ecs
```
