diff --git a/Data-Aggregation-API/design/index.html b/Data-Aggregation-API/design/index.html index bbdbc44..8e5b372 100644 --- a/Data-Aggregation-API/design/index.html +++ b/Data-Aggregation-API/design/index.html @@ -1479,6 +1479,10 @@
/api/build/trigger
/metrics
At Criteo, we have decided to fully open source our network automation framework.
We have called it AFK, which stands for \"Automation Framework for networK\" (yes we are cheating a bit with the \"k\", but why not). It fits with the idea of being Away From Keyboard while the network configuration is being deployed or updated automatically.
It is based on NetBox, OpenConfig, SaltStack, and supports Juniper JunOS, Arista EOS and SONiC.
Note
If you are using an ad-blocker, this documentation might not work properly as \"Criteo\" is in some links.
"},{"location":"#repositories","title":"Repositories","text":"Repository Description Latest commit Network CMDB Network CMDB plugin for Netbox Data aggregation API Aggregate data from CMDB and convert to OpenConfig SONiC Salt Deployer Tool to deploy and configure salt-minion on SONiC devices SONiC SaltStack States/execution modules for SONiC SONiC utilities SONiC scripts used by some SONiC SaltStack modules"},{"location":"#global-design","title":"Global design","text":"Note
Our approach to automation is opinionated. There are tons of ways of doing network configuration, and choices must be made.
This diagram shows the components of our framework:
flowchart TD\n CMDB[Network CMDB]\n DAAPI[Data Aggregation API]\n DEV[Network_Devices]\n DATASOURCE[Other data source*]\n\n CMDB -->|raw data| DAAPI\n DATASOURCE -->|raw data| DAAPI\n DAAPI -->|openconfig| SaltStack\n SaltStack -->|configuration| DEV[Network_Devices]
* The Data Aggregation API will be able to get and merge data from other data sources once a plugin system is in place.
"},{"location":"#network-cmdb","title":"Network CMDB","text":"The Network CMDB contains data relative to the business and is completely agnostic to the network OS.
The models are designed to describe the objects themselves rather than the configuration from device perspective. The idea is also to avoid any data duplication which could lead to configuration mismatches.
For instance, we represent the BGP session itself with two joined tables describing peers: DeviceBGPSession
<==> BGPSession
<==> DeviceBGPSession
DeviceBGPSession
contains the local-as
but not the peer-as
, avoiding data duplication: the peer-as
is the local-as
of the other neighbor. BGPSession
contains all information peers have in common, like state (in production
, maintenance
etc.) or MD5 password
. The Data Aggregation API aggregates data from its sources of truth: the Network CMDB or possibly any other data source you may have.
Then, it processes this data to produce OpenConfig JSON for each device.
ygot is used to validate the output against the OpenConfig YANG models.
"},{"location":"#saltstack-modules","title":"SaltStack modules","text":"Our AFK Salt modules take OpenConfig data and convert it into network configuration using templates.
The end goal is to simply forward this OpenConfig data to the Network OS to apply the configuration. Currently, OpenConfig is, at best, partially implemented in Network Operating Systems.
"},{"location":"FAQ/","title":"FAQ","text":""},{"location":"FAQ/#how-to-decommission-an-entire-device","title":"How to decommission an entire device?","text":"For now, the mantra is: create your own Salt State to decommission.
We do have an optional removal feature, but with some safeguards.
As of today, we think decommissioning should be handled in different Salt States and modules than provisioning.
"},{"location":"FAQ/#what-part-is-device-agnostic-and-what-is-not","title":"What part is device-agnostic, and what is not?","text":"The only part of the stack which is not agnostic is SaltStack.
The job of Salt is to convert agnostic data coming from Data Aggregation API to a configuration the devices understand. So, the data in the NetBox CMDB and the Data Aggregation API must be completely agnostic, with nothing specific to a Network OS.
If we did not do that, Salt could try to push a configuration which would not be appropriate for the device.
Note
In the future, the Data Aggregation API might no longer be device-agnostic because of interface naming. More info to come.
"},{"location":"FAQ/#what-if-i-have-a-use-case-which-is-not-covered-by-openconfig","title":"What if I have a use case which is not covered by OpenConfig?","text":"You should follow these general rules:
See also Data-Aggregation-API.
"},{"location":"FAQ/#some-network-os-requires-extra-configuration-for-a-service-where-do-i-put-that","title":"Some Network OS requires extra configuration for a service, where do I put that?","text":"You are perfectly allowed to duplicate/convert an object if needed.
For instance, depending on the Network OS, PBR sometimes requires an ACL and sometimes a route-map. If it is doable, do that conversion in the Salt module and/or in the template.
"},{"location":"FAQs/","title":"FAQs","text":"The FAQs are divided per section to make them easier to digest.
For now, AFK provides a partial configuration removal feature. It is not enabled by default.
We do not advise enabling it, as it has not been battle-tested yet. If you still want to use it, please be careful and run tests in your lab first!
"},{"location":"configuration-cleaning/#target","title":"Target","text":"The target is to manage the configuration like any Configuration Management Software, e.g. Chef.
The system maintains the desired state but will not remove extra configuration. For instance, when you maintain the configuration of a server with Chef, it will never remove an extra package installed by a user. The same applies here: if you manually add a protocol which is not supported by AFK, like OSPF, AFK will not remove it. Meanwhile, BGP sessions are maintained by AFK, so extra BGP sessions can be removed.
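To illustrate the idea, here is a minimal Python sketch. It is not the actual AFK implementation: the names MANAGED_FEATURES and plan_removals are hypothetical, and the data layout (one set of items per feature) is an assumption. Only items belonging to a feature managed by AFK are candidates for removal; anything else (OSPF here) is left untouched.

```python
# Hypothetical set of features whose extra items AFK may remove.
MANAGED_FEATURES = {'bgp'}


def plan_removals(running, desired):
    '''Return (feature, item) pairs to remove: only extras of managed features.'''
    removals = []
    for feature, items in running.items():
        if feature not in MANAGED_FEATURES:
            # Unmanaged configuration is never touched.
            continue
        extras = items - desired.get(feature, set())
        for item in sorted(extras):
            removals.append((feature, item))
    return removals


# What is on the device: one extra BGP session and a manually added OSPF area.
running = {'bgp': {'192.0.2.1', '192.0.2.3'}, 'ospf': {'area0'}}
# What the CMDB describes.
desired = {'bgp': {'192.0.2.1'}}

print(plan_removals(running, desired))
```

In this sketch the extra BGP session is planned for removal while the OSPF area is ignored. Safeguards would apply on top of such a plan.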
We do not want to give too much power to the CMDB, which hosts the source of truth of what should be on the device. We do not want to remove all BGP sessions if someone removes the wrong device in NetBox.
This is why we have already implemented safeguards and will continue to do so.
Some examples:
Info
SSH support will probably not be added because we consider it to be part of a minimal configuration which must be configured on the device during bootstrap (via ZTP or manually).
"},{"location":"installation/","title":"Installation","text":"Component How to install NetBox https://docs.netbox.dev/en/stable/installation/ Network CMDB Network CMDB plugin for NetBox Data Aggregation API see our Installation guide SaltStack https://docs.saltproject.io/salt/install-guide/en/latest/ SaltStack for JunOS/EOS proxy-minion with napalm driver SaltStack for SONiC see our SONiC Salt Deployer and our AFK modules for SaltStack
Note
If you are not familiar with NetBox and SaltStack, you should look at their awesome documentation prior to trying AFK:
Info
We plan to provide a script to deploy a development environment. You will be able to use it as a starting point for your production deployment.
"},{"location":"monitoring/","title":"Monitoring","text":"Warning
This page is under construction.
"},{"location":"provisioning/","title":"Provisioning","text":""},{"location":"provisioning/#bootstrap-your-devices","title":"Bootstrap your devices","text":"For now, AFK does not provide any way to generate bootstrap configuration.
The advised workflow is the following:
The minimal configuration cannot be generated by Salt as it requires a salt-minion for the device.
You could use Python scripts for instance to generate them, host the minimal configuration files on a server and serve them via your ZTP process.
One alternative would be to rely on the default configuration of your devices. But this depends on the Network OS used.
Note
We think this minimal configuration should not be maintained by AFK, to avoid losing access to the device in case of misconfiguration in the CMDB.
"},{"location":"provisioning/#use-provisioners","title":"Use Provisioners","text":"The provisioner is where your business logic is located.
It can take many forms. At Criteo we have two kinds:
one shot provisioner
provisions the CMDB to have a fully operational datacenter.
service provisioner
provisions the CMDB on demand. It allows your internal clients to dynamically benefit from network services, like BGP as a service.
Here you have two options:
not so AFK
: just run the SaltStack State manually via the salt <device> state.apply afk
command
real AFK
: create a schedule to automatically apply the configuration
NetBox is a great tool to manage your DCIM and IPAM.
Our CMDB models directly use the DCIM and IPAM models, like Device
and IP prefixes
.
NetBox also includes several models which could be considered as CMDB such as VLAN
, VRF
and L2VPN
. We do not want to use these models to avoid confusion and design mismatches with our CMDB.
OpenConfig models are designed from a device perspective.
Our CMDB models are designed from a service/asset perspective. It allows us to apply common parameters once, without worrying about applying them at both ends.
Example: we can directly set a maintenance status on a BGP session.
It also allows us to limit redefinition of values that can lead to misconfiguration. For instance, in OpenConfig nothing prevents us from having a mismatch of ASN in a BGP session. In our CMDB, it cannot happen as the BGP session is the central object which links two devices, hence the neighbor A remote-as is the neighbor B local-as.
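As an illustration, here is a minimal Python sketch of that idea. The classes are simplified stand-ins for the CMDB models, and neighbor_view is a hypothetical helper: each side stores only its own local-as, the remote-as is derived from the other peer, and a shared attribute such as the status is set once on the session.

```python
from dataclasses import dataclass


@dataclass
class DeviceBGPSession:
    # One side of the session: only local attributes are stored.
    device: str
    local_address: str
    local_asn: int


@dataclass
class BGPSession:
    # Central object linking both peers and holding shared attributes.
    peer_a: DeviceBGPSession
    peer_b: DeviceBGPSession
    status: str = 'active'


def neighbor_view(session, device):
    '''Return the session as seen from one device; remote-as is derived.'''
    if session.peer_a.device == device:
        local, remote = session.peer_a, session.peer_b
    elif session.peer_b.device == device:
        local, remote = session.peer_b, session.peer_a
    else:
        raise ValueError(f'{device} is not a peer of this session')
    return {
        'local_as': local.local_asn,
        'remote_as': remote.local_asn,
        'status': session.status,
    }


session = BGPSession(
    peer_a=DeviceBGPSession('tor1', '192.0.2.16/31', 65000),
    peer_b=DeviceBGPSession('spine1', '192.0.2.17/31', 65001),
)
# The maintenance status is set once and is visible from both ends.
session.status = 'maintenance'
print(neighbor_view(session, 'tor1'))
print(neighbor_view(session, 'spine1'))
```

Since no remote-as is stored anywhere, there is nothing that could drift out of sync between the two ends.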
"},{"location":"CMDB/FAQ/#is-it-perfect-and-completely-generic","title":"Is it perfect and completely generic?","text":"No
It will not cover all use cases, but we will bring more features.
The CMDB models come from opinionated choices aiming to simplify the implementation.
Also, we do not aim to have a factorized configuration.
"},{"location":"CMDB/FAQ/#why-is-max-prefixes-configured-at-the-top-level-of-the-neighbor-and-not-in-a-safi","title":"Why is max-prefixes configured at the top level of the neighbor and not in a SAFI?","text":"This is due to an EOS limitation. It cannot set the max prefix at SAFI level:
device.lab(config)#router bgp 65000\n\ndevice.lab(config-router-bgp)#neighbor 100.64.1.0 maximum-routes ?\n <0-4294967294> Maximum number of routes (0 means unlimited)\n\ndevice.lab(config-router-bgp)#address-family ipv4\n\ndevice.lab(config-router-bgp-af)#neighbor 100.64.1.0 ?\n activate Activate neighbor in the address family\n additional-paths BGP additional-paths commands\n default-originate Advertise a default route to this peer\n graceful-restart Enable graceful restart mode\n next-hop Next-hop address-family configuration\n next-hop-unchanged Preserve original nexthop while advertising routes to eBGP peers\n prefix-list Prefix list reference\n route-map Name a route map\n weight Assign weight for routes learnt from this peer\n
"},{"location":"CMDB/FAQ/#why-is-there-no-permitdeny-field-in-community-lists-and-prefix-lists","title":"Why is there no permit/deny field in community-lists and prefix-lists?","text":"While some devices provide a permit/deny
field in community-lists and prefix-lists, other Network OSes like JunOS do not. More importantly, OpenConfig does not have it either.
If you are currently using permit/deny
, we suggest adapting your route-map instead.
Example
You should migrate from:
ip prefix-list FANCY_PREFIX_LIST 10 deny 10.0.0.0/24\nip prefix-list FANCY_PREFIX_LIST 20 permit 10.0.0.0/8 le 32\n\nroute-map RM-TEST-IN 5 permit\n match ip prefix-list FANCY_PREFIX_LIST\n
to:
ip prefix-list FANCY_PREFIX_LIST 10 permit 10.0.0.0/24\nip prefix-list ANOTHER_FANCY_PREFIX_LIST 20 permit 10.0.0.0/8 le 32\n\nroute-map RM-TEST-IN 5 deny\n match ip prefix-list FANCY_PREFIX_LIST\n\nroute-map RM-TEST-IN 10 permit\n match ip prefix-list ANOTHER_FANCY_PREFIX_LIST\n
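The equivalence of the two forms can be checked with a small Python sketch. It is a simplified model under stated assumptions (first match wins, implicit deny at the end, an entry without le matches only the exact prefix), not the behavior of any specific Network OS. All function names are hypothetical.

```python
import ipaddress


def entry_matches(entry, route):
    '''Check one prefix-list entry: exact match unless le widens it.'''
    net = ipaddress.ip_network(entry['prefix'])
    route = ipaddress.ip_network(route)
    max_len = entry.get('le', net.prefixlen)
    return route.subnet_of(net) and route.prefixlen <= max_len


def old_style(route):
    '''One permit term matching a prefix-list that carries permit/deny.'''
    entries = [
        {'prefix': '10.0.0.0/24', 'decision': 'deny'},
        {'prefix': '10.0.0.0/8', 'le': 32, 'decision': 'permit'},
    ]
    for entry in entries:
        if entry_matches(entry, route):
            return entry['decision']
    return 'deny'  # implicit deny


def new_style(route):
    '''Decision moved to the route-map: prefix-lists only describe prefixes.'''
    fancy = [{'prefix': '10.0.0.0/24'}]
    another = [{'prefix': '10.0.0.0/8', 'le': 32}]
    terms = [('deny', fancy), ('permit', another)]
    for decision, prefix_list in terms:
        if any(entry_matches(entry, route) for entry in prefix_list):
            return decision
    return 'deny'  # implicit deny


for route in ['10.0.0.0/24', '10.0.0.128/25', '10.1.0.0/16', '192.0.2.0/24']:
    print(route, old_style(route), new_style(route))
```

For every route in the loop, both functions return the same decision, which is the property the migration relies on.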
"},{"location":"CMDB/FAQ/#why-cannot-a-route-map-filter-on-safi-while-openconfig-permits-it","title":"Why cannot a route-map filter on SAFI while OpenConfig permits it?","text":"The issue here is coming from implementation differences between JunOS and EOS/FRR:
In JunOS, this should be done using afi-safi-in
in routing-policy (applying a route-map in a specific SAFI in BGP configuration is not possible): route-map->term->from->safi
In EOS/FRR, this should be done in BGP config (afi-safi-in
in routing-policy is not supported): bgp->neighbor->safi->route-map
Authorizing both methods could lead to conflicts.
Example
If we set the following configuration in the CMDB:
route-map RM-TEST\n term 10\n from afi-safi-in ipv4_unicast\n\nrouter bgp\n neighbor toto\n address-family ipv6 unicast\n route-map out RM-TEST\n
We would end up with a route-map with different meanings depending on the network OS.
We decided to not support afi-safi-in
in routing-policy to simplify and reduce risk.
Yes.
JunOS example
# we create duplicates per SAFI, adding \"from family <safi>\" in each term:\nset policy-options policy-statement AUTOGENERATED::RM-TEST:IPV4_UNICAST term 10 from family inet\n...\n\nset policy-options policy-statement AUTOGENERATED::RM-TEST:IPV6_UNICAST term 10 from family inet6\n...\n\n# then we apply the one matching the SAFI used in the CMDB neighbor configuration (bgp->neighbor->address-family->route-map)\nset routing-instance prod protocols bgp group TEST neighbor 192.0.2.1 export [AUTOGENERATED::RM-TEST:IPV4_UNICAST]\n
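The duplication shown above can be sketched in a few lines of Python. This is only an illustration, not the actual Salt template: the function names are hypothetical, and the mapping from CMDB SAFI names to JunOS families is an assumption derived from the example.

```python
# Assumed mapping from CMDB SAFI names to JunOS families.
JUNOS_FAMILY = {'ipv4-unicast': 'inet', 'ipv6-unicast': 'inet6'}


def per_safi_policy_name(route_map, safi):
    '''Build the name of the per-SAFI duplicate of a route-map.'''
    suffix = safi.upper().replace('-', '_')
    return f'AUTOGENERATED::{route_map}:{suffix}'


def family_statements(route_map, term_sequences):
    '''Yield one family match statement per SAFI and per term.'''
    for safi, family in JUNOS_FAMILY.items():
        name = per_safi_policy_name(route_map, safi)
        for seq in term_sequences:
            yield (
                f'set policy-options policy-statement {name} '
                f'term {seq} from family {family}'
            )


for line in family_statements('RM-TEST', [10]):
    print(line)
```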
"},{"location":"CMDB/FAQ/#why-is-send-community-not-implemented","title":"Why is send-community not implemented?","text":"For two reasons:
Adding it would make the Salt templates more complex because sometimes it would be directly in the BGP configuration, sometimes it would imply the creation of a route-map.
Additionally, it would make route-policies and BGP states more tightly coupled.
"},{"location":"CMDB/FAQ/#why-are-peer-groups-deprecated-not-supported","title":"Why are peer-groups deprecated / not supported?","text":"Because peer-groups bring a lot of complexity and risks, especially because of FRR.
In FRR, for sessions being in a peer-group:
some of the issues we found in FRR
When the neighbor is in a peer-group and the remote-as is set at neighbor level only:
error: \u201cCannot change the peer-group. Deconfigure first\u201d
When the neighbor is in a peer-group and the remote-as is set at the peer-group level only:
There are other issues, and we did not even talk about the difference with the other Network OS...
"},{"location":"CMDB/FAQ/#questions-about-deprecated-features","title":"Questions about deprecated features","text":"expand..."},{"location":"CMDB/FAQ/#why-do-peer-groups-not-have-maximum-prefixes","title":"Why do peer-groups not have maximum-prefixes?","text":"maximum-prefixes
can only be set at the SAFI level
Because of EOS, the maximum-prefixes
field in the CMDB is set at the peer/peer-group level too. Then, the Data Aggregation API duplicates the value for each maximum-prefixes
set.
We could force the maximum-prefixes
for each SAFI whether they are enabled or not. But in JunOS it would mean enabling all the SAFIs for the peer-group.
So it is easier to enable this option only at the peer level, as the peer must be in a SAFI to work.
"},{"location":"CMDB/FAQ/#why-dont-we-support-safi-in-peer-groups","title":"Why don't we support SAFI in peer-groups?","text":"Because of FRR:
For retro-compatibility purposes, we do not manage the statement no neighbor <peer-group> activate
. Meaning, if it is added manually, it will not be removed.
Warning
To ease migration, our Salt template for SONiC supports peer-groups without remote-as.
All these restrictions come from FRR. In FRR, we cannot set the remote-as at both peer-group and neighbor level.
It means that the peer-group remote-as MUST match the remote-as of all its neighbors. Otherwise, the CMDB would not reflect what would really be applied to the device.
TL;DR
Note: for local-as the behavior is different:
router bgp 65000\n neighbor PG-L3_RA local-as 65000\n% Cannot have local-as same as BGP AS number\n\n neighbor 192.0.2.1 local-as 65001\n neighbor 192.0.2.1 local-as 65000\n% Cannot have local-as same as BGP AS number\n neighbor 192.0.2.1 local-as 65002\n
"},{"location":"CMDB/endpoints/","title":"Endpoints","text":""},{"location":"CMDB/endpoints/#asns","title":"ASNs","text":"Endpoint: /api/plugins/cmdb/asns/
Provides all available ASNs.
Example{\n \"results\": [\n {\n \"id\": 1,\n \"created\": \"2022-09-28T12:28:52.112949Z\",\n \"last_updated\": \"2022-09-28T12:28:52.112978Z\",\n \"organization_name\": \"Paris\",\n \"number\": 65000\n }\n ]\n}\n
"},{"location":"CMDB/endpoints/#bgp-global","title":"BGP Global","text":"Endpoint: /api/plugins/cmdb/bgp-global/
Provides the BGP global configuration for each device.
ExampleTODO\n
"},{"location":"CMDB/endpoints/#bgp-sessions","title":"BGP sessions","text":"Endpoint: /api/plugins/cmdb/bgp-sessions/
Provides all BGP sessions.
Example{\n \"results\": [\n {\n \"id\": 1,\n \"peer_a\": {\n \"id\": 1,\n \"local_address\": {\n \"id\": 1234,\n \"url\": \"https://netbox.local/api/ipam/ip-addresses/1234/\",\n \"display\": \"192.0.2.16/31\",\n \"family\": 4,\n \"address\": \"192.0.2.16/31\"\n },\n \"device\": {\n \"id\": 123,\n \"name\": \"tor1\"\n },\n \"local_asn\": {\n \"id\": 1,\n \"number\": 65000,\n \"organization_name\": \"Criteo-65000\"\n },\n \"afi_safis\": [\n {\n \"id\": 217,\n \"route_policy_in\": null,\n \"route_policy_out\": null,\n \"afi_safi_name\": \"ipv4-unicast\"\n }\n ],\n \"route_policy_in\": null,\n \"route_policy_out\": null,\n \"created\": \"2022-11-07T16:57:33.848779Z\",\n \"last_updated\": \"2022-11-07T16:57:33.848795Z\",\n \"description\": \"to-spine1\",\n \"enforce_first_as\": true,\n \"maximum_prefixes\": 10000\n },\n \"peer_b\": {\n \"id\": 218,\n \"local_address\": {\n \"id\": 22,\n \"url\": \"https://netbox.local/api/ipam/ip-addresses/22/\",\n \"display\": \"192.0.2.17/31\",\n \"family\": 4,\n \"address\": \"192.0.2.17/31\"\n },\n \"device\": {\n \"id\": 345,\n \"name\": \"spine1\"\n },\n \"local_asn\": {\n \"id\": 10,\n \"number\": 65001,\n \"organization_name\": \"Criteo-65001\"\n },\n \"afi_safis\": [\n {\n \"id\": 2,\n \"route_policy_in\": null,\n \"route_policy_out\": null,\n \"afi_safi_name\": \"ipv4-unicast\"\n }\n ],\n \"route_policy_in\": null,\n \"route_policy_out\": null,\n \"created\": \"2022-11-07T16:57:33.853487Z\",\n \"last_updated\": \"2022-11-07T16:57:33.853500Z\",\n \"description\": \"to-tor1\",\n \"enforce_first_as\": true,\n \"maximum_prefixes\": 10000\n },\n \"tenant\": null,\n \"created\": \"2022-11-07T16:57:33.856145Z\",\n \"last_updated\": \"2022-11-07T16:57:33.856156Z\",\n \"status\": \"active\",\n \"password\": \"\",\n \"circuit\": null\n }\n ]\n}\n
"},{"location":"CMDB/endpoints/#prefix-lists","title":"Prefix lists","text":"Endpoint: /api/plugins/cmdb/prefix-lists/
Provides the prefix lists for each device.
Example{\n \"results\": [\n {\n \"id\": 129,\n \"name\": \"PF-ANY_IPV6\",\n \"device\": {\n \"id\": 1,\n \"name\": \"tor1\"\n },\n \"ip_version\": \"ipv6\",\n \"terms\": [\n {\n \"sequence\": 10,\n \"decision\": \"permit\",\n \"prefix\": \"::/0\",\n \"le\": 128,\n \"ge\": null\n }\n ]\n }\n ]\n}\n
"},{"location":"CMDB/endpoints/#bgp-community-lists","title":"BGP community lists","text":"Endpoint: /api/plugins/cmdb/bgp-community-lists/
Provides the BGP community lists for each device.
Example{\n \"results\": [\n {\n \"id\": 1,\n \"device\": {\n \"id\": 1,\n \"name\": \"tor1\"\n },\n \"terms\": [\n {\n \"sequence\": 10,\n \"decision\": \"permit\",\n \"community\": \"65000:10000\"\n }\n ],\n \"created\": \"2022-07-28T13:37:05.699938Z\",\n \"last_updated\": \"2022-07-28T13:37:05.699968Z\",\n \"name\": \"CL-SERVER\"\n }\n ]\n}\n
"},{"location":"CMDB/endpoints/#route-policies","title":"Route policies","text":"Endpoint: /api/plugins/cmdb/route-policies/
Provides the route maps for each device.
Example{\n \"results\": [\n {\n \"id\": 129,\n \"name\": \"RM-UPLINK-IN\",\n \"device\": {\n \"id\": 123,\n \"name\": \"tor1\"\n },\n \"description\": \"tor1:uplink-in\",\n \"terms\": [\n {\n \"description\": \"\",\n \"sequence\": 10,\n \"decision\": \"permit\",\n \"from_bgp_community\": \"\",\n \"from_bgp_community_list\": null,\n \"from_prefix_list\": {\n \"id\": 129,\n \"device\": 123,\n \"name\": \"PF-ANY_IPV6\"\n },\n \"from_source_protocol\": \"\",\n \"from_route_type\": \"\",\n \"from_local_pref\": null,\n \"set_local_pref\": null,\n \"set_community\": \"\",\n \"set_origin\": \"\",\n \"set_metric\": null,\n \"set_large_community\": \"\",\n \"set_as_path_prepend\": \"\",\n \"set_next_hop\": null\n },\n {\n \"description\": \"\",\n \"sequence\": 20,\n \"decision\": \"permit\",\n \"from_bgp_community\": \"\",\n \"from_bgp_community_list\": {\n \"id\": 1,\n \"device\": 123,\n \"name\": \"CL-SERVER\"\n },\n \"from_prefix_list\": null,\n \"from_source_protocol\": \"\",\n \"from_route_type\": \"\",\n \"from_local_pref\": null,\n \"set_local_pref\": null,\n \"set_community\": \"\",\n \"set_origin\": \"\",\n \"set_metric\": null,\n \"set_large_community\": \"\",\n \"set_as_path_prepend\": \"\",\n \"set_next_hop\": null\n },\n {\n \"description\": \"\",\n \"sequence\": 30,\n \"decision\": \"deny\",\n \"from_bgp_community\": \"\",\n \"from_bgp_community_list\": null,\n \"from_prefix_list\": null,\n \"from_source_protocol\": \"\",\n \"from_route_type\": \"\",\n \"from_local_pref\": null,\n \"set_local_pref\": null,\n \"set_community\": \"\",\n \"set_origin\": \"\",\n \"set_metric\": null,\n \"set_large_community\": \"\",\n \"set_as_path_prepend\": \"\",\n \"set_next_hop\": null\n }\n ]\n }\n ]\n}\n
"},{"location":"CMDB/installation/","title":"Installation","text":""},{"location":"CMDB/installation/#existing-netbox-instance","title":"Existing NetBox instance","text":"This is simple:
Install the network_cmdb
plugin from the criteo/netbox-network-cmdb repository.
Edit the netbox/configuration.py
file and add netbox_cmdb
in the PLUGINS
list.
You can have a look at the docker-compose.yml
and the other scripts in the develop directory.
Run make start
from the Network CMDB repository.
For now, there are no CMDB components in the NetBox UI. They will be added later.
In the meantime you can access the CMDB in the Django Admin UI: http://127.0.0.1:8000/admin/
"},{"location":"CMDB/models/BGP/","title":"BGP","text":""},{"location":"CMDB/models/BGP/#source-code","title":"Source code","text":"Models location: netbox_cmdb/models/bgp.py
"},{"location":"CMDB/models/BGP/#table-relations","title":"Table relations","text":"Info
To simplify the diagram, not all relations are displayed (example: DeviceBGPSession
=> IPAM.IPAddress
)
erDiagram\n\n dcim_Device 1--1 BGPGlobal: \"\"\n dcim_Device 1--0+ BGPSession: \"\"\n BGPSession 1--1+ DeviceBGPSession: \"\"\n DeviceBGPSession 1--0+ AfiSafi: \"\"\n DeviceBGPSession 1--0+ RoutePolicy: \"\"\n AfiSafi 1--0+ RoutePolicy: \"\"
"},{"location":"CMDB/models/BGP/#bgp-global-configuration","title":"BGP global configuration","text":"BGPGlobal:\n device: dcim.Device\n local_asn: cmdb.ASN\n router_id: string\n ebgp_administrative_distance: integer\n ibgp_administrative_distance: integer\n graceful_restart: boolean\n graceful_restart_time: integer\n ecmp: boolean\n ecmp_maximum_paths: integer\n
ASN:\n organization_name: string\n number: integer\n
"},{"location":"CMDB/models/BGP/#bgp-sessions","title":"BGP sessions","text":"The implementation for BGP sessions is more complex.
The main idea is to have a BGPSession
linked to two DeviceBGPSession
to avoid data duplication such as local-asn
vs remote-asn
.
Info
As we have deprecated the usage of peer-groups, we do not document the model here.
BGPSession:\n state: string choice\n monitoring_state: string choice\n peer_a: cmdb.DeviceBGPSession\n peer_b: cmdb.DeviceBGPSession\n password: string\n circuit: cmdb.Circuit\n tenant: dcim.Tenant\n
DeviceBGPSession:\n device: dcim.Device\n enabled: boolean\n description: string\n local_address: netbox.ipam.IPAddress\n local_asn: cmdb.ASN\n peer_group: cmdb.PeerGroup (deprecated)\n maximum_prefixes: integer\n route_policy_in: cmdb.RoutePolicy\n route_policy_out: cmdb.RoutePolicy\n enforce_first_as: boolean (not used yet)\n
AfiSafi:\n device: dcim.Device\n route_policy_in: cmdb.RoutePolicy\n route_policy_out: cmdb.RoutePolicy\n device_bgp_session: cmdb.DeviceBGPSession\n
"},{"location":"CMDB/models/Routing-Policies/","title":"Routing Policies","text":""},{"location":"CMDB/models/Routing-Policies/#source-code","title":"Source code","text":"Models locations:
erDiagram\n\n dcim_Device 1--0+ RoutePolicy: \"\"\n RoutePolicy 1--0+ RoutePolicyTerm: \"\"\n RoutePolicyTerm 0+--o| BgpCommunityList: \"\"\n RoutePolicyTerm 0+--o| PrefixList: \"\"\n BgpCommunityList 1--0+ BgpCommunityListTerm: \"\"\n PrefixList 1--0+ PrefixListTerm: \"\"\n
"},{"location":"CMDB/models/Routing-Policies/#prefix-lists","title":"Prefix lists","text":"PrefixList:\n name: string\n device: dcim.Device\n ip_version: string choice\n
PrefixListTerm:\n prefix_list: cmdb.PrefixList\n sequence: integer\n decision: string choice\n prefix: IPNetwork\n le: integer\n ge: integer\n
"},{"location":"CMDB/models/Routing-Policies/#community-lists","title":"Community lists","text":""},{"location":"CMDB/models/Routing-Policies/#_1","title":"Routing Policies","text":"BGPCommunityList:\n name: string\n device: dcim.Device\n
BGPCommunityListTerm:\n bgp_community_list: cmdb.BGPCommunityList\n sequence: integer\n decision: string choice\n community: string\n
"},{"location":"CMDB/models/Routing-Policies/#route-policies","title":"Route policies","text":"RoutePolicy:\n name: string\n device: dcim.Device\n description: string\n
RoutePolicyTerm:\n route_policy: cmdb.RoutePolicy\n description: string\n sequence: integer\n decision: string choice\n\n # match\n from_bgp_community: string\n from_bgp_community_list: cmdb.BGPCommunityList\n from_prefix_list: cmdb.PrefixList\n from_source_protocol: string\n from_route_type: string\n from_local_pref: integer\n\n # set\n set_local_pref: integer\n set_community: string\n set_origin: string\n set_metric: integer\n set_large_community: string\n set_as_path_prepend: string\n set_next_hop: IPaddress\n
"},{"location":"Data-Aggregation-API/configuration/","title":"Configuration","text":""},{"location":"Data-Aggregation-API/configuration/#configuration-file","title":"Configuration file","text":"The Data Aggregation API can be configured with flags, environment variables, and a configuration file.
The precedence order for the different methods is:
Info
Datacenter: \"europe\"\n\nLog:\n Level: \"error\"\n Pretty: true\n\nAPI:\n ListenAddress: \"127.0.0.1\"\n ListenPort: 1234\n\nAuthentication:\n LDAP:\n URL: \"ldaps://URL.local\"\n BindDN: \"cn=<user>,OU=<ou>,DC=<local>\"\n BaseDN: \"DC=<local>\"\n Password: \"<some_password>\"\n WorkersCount: 10\n Timeout: 5s\n MaxConnectionLifetime: 1m\n InsecureSkipVerify: false\n\nNetBox:\n URL: \"https://netbox.local\"\n APIKey: \"<some_key>\"\n DatacenterFilterKey: \"site\"\n LimitPerPage: 500\n\nBuild:\n Interval: \"30m\"\n AllDevicesMustBuild: false\n
"},{"location":"Data-Aggregation-API/configuration/#global-settings","title":"Global settings","text":"Parameter Default Description Datacenter `` Value used to filter devices. The key is defined by DatacenterFilterKey
."},{"location":"Data-Aggregation-API/configuration/#log-settings","title":"Log settings","text":"All parameters below are in the Log
section of the configuration. This section is optional.
info
Log level. Pretty false
If enabled: human readable logs (with colors). If disabled: structured logs."},{"location":"Data-Aggregation-API/configuration/#api-settings","title":"API settings","text":"All parameters below are in the API
section of the configuration. This section is optional.
0.0.0.0
Listening address of the web API. ListenPort 8080
Listening port of the web API."},{"location":"Data-Aggregation-API/configuration/#authentication-settings","title":"Authentication settings","text":"All parameters below are in the Authentication
->LDAP
section of the configuration.
LDAP authentication is only used to authenticate users when they query the Web API, i.e. when they want to retrieve the built config.
This section is optional.
Parameter Default Description InsecureSkipVerify false
Ignore LDAP TLS warnings. URL URL of the LDAP server. BindDN Bind used to query the LDAP server. Password Password to query the LDAP server. BaseDN Only users matching the BaseDN are authorized to query the web API. WorkersCount 10
Number of workers to authenticate users concurrently. Timeout 10s
Time to wait before considering an LDAP request timed out. MaxConnectionLifetime 1m
Lifetime of worker connection to LDAP. Useful to re-use existing LDAP connection."},{"location":"Data-Aggregation-API/configuration/#netbox-settings","title":"NetBox settings","text":"All parameters below are in the NetBox
section of the configuration.
site
Key used to filter devices, using Datacenter
as a value. LimitPerPage 100
Number of elements per page when getting data from NetBox API."},{"location":"Data-Aggregation-API/configuration/#build-settings","title":"Build settings","text":"All parameters below are in the Build
section of the configuration. This section is optional.
1m
Build interval, e.g.: 10m
, 1h
AllDevicesMustBuild false
The build fails if one device has not been built."},{"location":"Data-Aggregation-API/configuration/#alternative-methods","title":"Alternative methods","text":""},{"location":"Data-Aggregation-API/configuration/#environment-variables","title":"Environment variables","text":"All settings available in the configuration file can be set as environment variables, but:
DAAPI_
_
is the level separator
For example, the equivalent of this config file:
Datacenter: \"europe\"\nNetBox:\n URL: \"https://netbox.local\"\n APIKey: \"<some_key>\"\n
is:
DAAPI_DATACENTER=\"europe\"\nDAAPI_NETBOX_URL=\"https://netbox.local\"\nDAAPI_NETBOX_APIKEY=\"<some_key>\"\n
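The naming convention can be sketched in Python as follows. This is only an illustration of the mapping shown in the example, not code from the Data Aggregation API. The function name to_env_vars is hypothetical, and the upper-casing of keys is inferred from the example above.

```python
def to_env_vars(config, prefix='DAAPI'):
    '''Flatten a nested settings dict into DAAPI_* environment variable names.'''
    flat = {}
    for key, value in config.items():
        # Levels are joined with an underscore; names are upper-cased.
        name = f'{prefix}_{key.upper()}'
        if isinstance(value, dict):
            flat.update(to_env_vars(value, prefix=name))
        else:
            flat[name] = str(value)
    return flat


settings = {
    'Datacenter': 'europe',
    'NetBox': {
        'URL': 'https://netbox.local',
        'APIKey': '<some_key>',
    },
}

for name, value in to_env_vars(settings).items():
    print(f'{name}={value}')
```

Running it on the settings above yields the same three variable names as the example.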
"},{"location":"Data-Aggregation-API/design/","title":"Design","text":""},{"location":"Data-Aggregation-API/design/#ingestors","title":"Ingestors","text":"An ingestor is a component responsible for fetching data from a single source of truth. Most of the time it is querying a single endpoint.
Examples of ingestor with their associated endpoints:
/api/plugins/cmdb/bgp-sessions/
/api/plugins/cmdb/route-policies/
The precompute step is responsible for extracting raw data per device. The goal is to be able to have all needed data to compute one device configuration.
For instance, a BGPSession
has two peers: peer_a
and peer_b
. The devices matching the peers will all have this BGPSession
in their respective raw_data
.
Here is a simplified output of the bgp-sessions
endpoint.
{\n \"results\": [\n {\n \"peer_a\": {\n \"local_address\": {\n \"address\": \"192.0.2.16/31\"\n },\n \"device\": {\n \"name\": \"tor1\"\n },\n \"local_asn\": {\n \"number\": 65000,\n \"organization_name\": \"Criteo-65000\"\n },\n \"description\": \"to-spine1\"\n },\n \"peer_b\": {\n \"local_address\": {\n \"address\": \"192.0.2.17/31\"\n },\n \"device\": {\n \"name\": \"spine1\"\n },\n \"local_asn\": {\n \"number\": 65001\n },\n \"description\": \"to-tor1\"\n },\n \"status\": \"active\",\n \"password\": \"thisisanincredibleandcomplexpassword:)\"\n }\n ]\n }\n
The precompute will \"copy\" this structure for both devices:
tor1.raw_data[\"bgp-session\"] = results[0]
spine1.raw_data[\"bgp-session\"] = results[0]
Thanks to this simple precompute part, tor1
OpenConfig can be generated independently:
tor1.neighbor[0].local_as = tor1.raw_data[\"bgp-session\"].peer_a.local_asn\ntor1.neighbor[0].remote_as = tor1.raw_data[\"bgp-session\"].peer_b.local_asn\n...\n
This is pseudo-code, just to explain the idea.
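A runnable version of the same idea could look like the Python sketch below. The Data Aggregation API itself is not written this way: the function names precompute and neighbors_for are hypothetical, and the input only mimics the simplified bgp-sessions output.

```python
from collections import defaultdict


def precompute(results):
    '''Attach each BGP session to the raw data of both of its peers.'''
    raw_data = defaultdict(lambda: {'bgp-sessions': []})
    for session in results:
        for side in ('peer_a', 'peer_b'):
            hostname = session[side]['device']['name']
            raw_data[hostname]['bgp-sessions'].append(session)
    return raw_data


def neighbors_for(hostname, raw_data):
    '''Compute the neighbors of one device from its own raw data only.'''
    neighbors = []
    for session in raw_data[hostname]['bgp-sessions']:
        if session['peer_a']['device']['name'] == hostname:
            local, remote = session['peer_a'], session['peer_b']
        else:
            local, remote = session['peer_b'], session['peer_a']
        neighbors.append({
            'local_as': local['local_asn']['number'],
            'remote_as': remote['local_asn']['number'],
            'description': local['description'],
        })
    return neighbors


results = [{
    'peer_a': {
        'device': {'name': 'tor1'},
        'local_asn': {'number': 65000},
        'description': 'to-spine1',
    },
    'peer_b': {
        'device': {'name': 'spine1'},
        'local_asn': {'number': 65001},
        'description': 'to-tor1',
    },
    'status': 'active',
}]

raw_data = precompute(results)
print(neighbors_for('tor1', raw_data))
print(neighbors_for('spine1', raw_data))
```

Each device can then be converted on its own, since everything it needs is already in its raw data.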
"},{"location":"Data-Aggregation-API/design/#convertors","title":"Convertors","text":"This is where the magic happens. From the precomputed data, the Data Aggregation API generates OpenConfig configuration for each device.
Thanks to ygot and OpenConfig YANG models, the Data Aggregation API has all the OpenConfig Go structures.
/v1/report/last
Last or ongoing build report /v1/report/last/complete
Report of the last complete build (whether it failed or not) /v1/report/last/successful
Report of the last successful build The reports contain: * The status of the build. * When it started and finished. * Statistics like performance. * Logs.
"},{"location":"Data-Aggregation-API/design/#openconfig","title":"OpenConfig","text":"Endpoint Description/v1/devices/[hostname]/openconfig
Get OpenConfig data for a specific device /v1/devices/*/openconfig
Get OpenConfig data for all devices /v1/devices/[hostname]/afk_enabled
Indicates if a device has the tag AFK enabled in NetBox /v1/devices/*/afk_enabled
Same as above, but for all devices Note: afk_enabled
can be used to enable or disable AFK schedule in Salt via afk-enabled
tag in NetBox DCIM.
/metrics
Prometheus metrics /api/version
Details about the running version /api/health
Dummy endpoint for basic healthcheck of the app"},{"location":"Data-Aggregation-API/installation/","title":"Installation","text":""},{"location":"Data-Aggregation-API/installation/#quickstart","title":"Quickstart","text":"Example of basic configuration (settings.yml
):
Datacenter: \"europe\"\n\nAPI:\n ListenAddress: \"127.0.0.1\"\n ListenPort: 1234\n\nLog:\n Level: \"info\"\n Pretty: true\n\nNetBox:\n URL: \"https://netbox.local\"\n APIKey: \"<some_key>\"\n DatacenterFilterKey: \"site\"\n\nBuild:\n Interval: \"30m\"\n
See Configuration for more details.
"},{"location":"Data-Aggregation-API/installation/#dependencies","title":"Dependencies","text":"The only dependency is our Network CMDB NetBox plugin.
The installation guide of this plugin is available in the CMDB section here.
"},{"location":"Data-Aggregation-API/missing-features-in-openconfig/","title":"Missing features in OpenConfig","text":""},{"location":"Data-Aggregation-API/missing-features-in-openconfig/#context-and-approaches","title":"Context and approaches","text":"OpenConfig models do not provide all existing network features.
There are three approaches to this:
Warning
We want to stick with OpenConfig standard models.
"},{"location":"Data-Aggregation-API/missing-features-in-openconfig/#missing-features","title":"Missing features","text":""},{"location":"Data-Aggregation-API/missing-features-in-openconfig/#bgp","title":"BGP","text":"Feature Decisionenforce first-as
Migrate to a route-map like match as-path <asn>
being the AS of the neighbor. network
To be defined, maybe via route-maps
soft reconfiguration inbound
To be defined
"},{"location":"Data-Aggregation-API/missing-features-in-openconfig/#routing-policies","title":"Routing policies","text":"Feature Decision large communities
Do a PR in OpenConfig repo decision in prefix-list: permit
/ deny
All prefixes in prefix-lists are hardcoded as permit
. The deny
must be done in the route-map. prefix-list sequence Hardcoded in the Jinja loop"},{"location":"SONiC-support/Criteo-SONiC-utilities/","title":"Criteo SONiC utilities","text":""},{"location":"SONiC-support/Criteo-SONiC-utilities/#installation","title":"Installation","text":"Our SONiC modules require some custom scripts to be installed:
/opt/salt/scripts/criteo_fdbshow
/opt/salt/scripts/criteo_intf_information
These scripts are available in the Criteo SONiC utilities repository.
You can use the provided Salt state to deploy them automatically. This state assumes some grains are properly set for each SONiC device:
hwsku: some-hardware\nnos: sonic\nsonic_asic_type: some-asic\nsonic_build_date: some-date\nsonic_build_version: 202205\nsonic_built_by: someone\nsonic_commit_id: some-commit-id\n
Tip
The needed grains are automatically set by our SONiC Salt Deployer.
"},{"location":"SONiC-support/SONiC-Salt-Deployer/","title":"SONiC Salt deployer","text":"This tool deploys and configures Salt-minions on SONiC devices.
It includes:
You can run it regularly. There will be no impact on already deployed devices. Only the needed changes will be made.
"},{"location":"SONiC-support/SONiC-Salt-Deployer/#prepare-your-environment","title":"Prepare your environment","text":"python3 -m venv .venv\nsource .venv/bin/activate\npip install -r requirements/base.txt\n
"},{"location":"SONiC-support/SONiC-Salt-Deployer/#how-to-use-it","title":"How to use it","text":""},{"location":"SONiC-support/SONiC-Salt-Deployer/#settings","title":"Settings","text":"See settings.env.
There is also an example provided here.
"},{"location":"SONiC-support/SONiC-Salt-Deployer/#usage","title":"Usage","text":"pip install -r requirements/base.txt\npython ./start.py\n
Or build the PEX via tox -e bundle
and run the executable.
You can use this systemd service and its timer.
"},{"location":"SONiC-support/SONiC-modules/","title":"SONiC modules","text":""},{"location":"SONiC-support/SONiC-modules/#how-to-install","title":"How to install","text":"The installation process is similar to SaltStack modules installation.
file_roots
section in the Salt-master configuration.Example of salt-master configuration:
file_roots:\n base:\n # ... other paths ...\n # SONiC codebase:\n - /srv/salt/base/sonic/\n
Important
Make sure to synchronize the modules with your minions:
salt <device> saltutil.sync_all\n
"},{"location":"SONiC-support/overview/","title":"Overview","text":""},{"location":"SONiC-support/overview/#supported-versions","title":"Supported versions","text":"SONiC version Support 201911* 202205 Legend
salt-minion
on your SONiC devices SONiC Salt Deployer section 2 Deploy Criteo SONiC utilities
Criteo SONiC utilities section 3 Deploy our SONiC modules SONiC modules section 4 Deploy our Saltstack modules SaltStack-modules section Important
To benefit from all AFK features, you need to change FRR integration in SONiC.
By default, the files in /etc/frr
of the BGP container are generated from an embedded template combined with metadata from the config_db
.
AFK requires directly mounting /etc/sonic/frr
from the host to /etc/frr
on the BGP container.
The change has been upstreamed starting with 202205. To configure SONiC this way:
The idea is to have most of the logic in Salt modules.
We want to avoid making Jinja templates more complex because:
As another rule, we only authorize two levels of indentation in the template. If you need more, split your templates.
"},{"location":"SaltStack-modules/FAQ/#state-modules-are-supposed-to-be-agnostic-why-not-use-execution-modules-to-handle-network-os-specificities","title":"State modules are supposed to be agnostic. Why not use Execution modules to handle Network OS specificities?","text":"For now, it brings more complexity to create Execution modules to handle configuration differences.
However, we do use Execution modules to get information from the device and to apply the configuration.
"},{"location":"SaltStack-modules/FAQ/#eossonic-why-negate-all-configuration-statements-instead-of-pushing-diffs-or-override-everything","title":"EOS/SONiC: why negate all configuration statements instead of pushing diffs or override everything?","text":"Because not all Network OS provide a way to obtain a diff, e.g. SONiC/FRR.
We cannot override the entire configuration because a section of the configuration might be managed by multiple Salt States. For instance one State manages the global BGP configuration, another manages the BGP sessions and a third one manages EVPN.
Sometimes we do negate the entire section. We remove a route-map to recreate it entirely because it is safer and easier to do that: we ensure the route-map terms do not contain extra configuration we do not support.
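The remove-then-recreate approach for a route-map can be sketched as follows. This is a minimal illustration, not the AFK Salt module: the function name `replace_route_map` and the tuple layout are hypothetical, and the generated lines assume FRR-style syntax.

```python
def replace_route_map(name, terms):
    """Build commands that drop a route-map and recreate it from scratch.

    terms is a list of (sequence, decision, statements) tuples.
    """
    # Negate the whole section first, so no unsupported leftover survives.
    commands = [f'no route-map {name}']
    for sequence, decision, statements in sorted(terms, key=lambda term: term[0]):
        commands.append(f'route-map {name} {decision} {sequence}')
        commands.extend(f' {statement}' for statement in statements)
    return commands


commands = replace_route_map(
    'RM-CLOS-IN',
    [
        (20, 'deny', []),
        (10, 'permit', ['match ip address prefix-list PF-LOOPBACK']),
    ],
)
```

Because the section is rebuilt from the desired state only, any term added by hand disappears on the next run.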
"},{"location":"SaltStack-modules/FAQ/#what-is-the-behavior-of-sonic-bgp-configuration-if-something-bad-happens","title":"What is the behavior of SONiC BGP configuration if something bad happens?","text":"If a command is incorrect, it is ignored by FRR and the rest of the configuration is applied. It returns errors in the output.
Example
line 11: Failure to communicate[13] to zebra, line: ip prefix-list PF-LOOPBACK seq 10 permit 10.252.200.0/22 ge 22\n% Invalid prefix range for 10.252.200.0/22, make sure: len < ge-value <= le-value\n
If we attempt to remove a statement which is not present, it returns an error. But it still applies the rest of the configuration.
Example
% Could not find route-map RM-CLOS-IN\nline 13: Failure to communicate[13] to zebra, line: no route-map RM-CLOS-IN\nThese are not caught by the dry-run feature of vtysh, which only checks if the config is semantically correct.\n
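Since these errors only show up in the returned output, one option is to scan that output after applying the configuration. The sketch below is an assumption-laden example: the function name `find_errors` is hypothetical, and the patterns are derived only from the two outputs shown in this section, so other FRR messages may not be detected.

```python
import re

# Matches lines such as:
# line 13: Failure to communicate[13] to zebra, line: no route-map RM-CLOS-IN
FAILURE_RE = re.compile(r'^line (\d+): Failure to communicate\[\d+\] to \w+, line: (.*)$')


def find_errors(output):
    """Return the errors found in the text returned when applying a configuration."""
    errors = []
    for line in output.splitlines():
        line = line.strip()
        match = FAILURE_RE.match(line)
        if match:
            errors.append({'line_number': int(match.group(1)), 'command': match.group(2)})
        elif line.startswith('%'):
            # Lines starting with % carry the human-readable reason.
            errors.append({'message': line.lstrip('% ')})
    return errors


sample = (
    '% Could not find route-map RM-CLOS-IN\n'
    'line 13: Failure to communicate[13] to zebra, line: no route-map RM-CLOS-IN\n'
)
errors = find_errors(sample)
```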
"},{"location":"SaltStack-modules/FAQ/#frr-is-not-offering-a-commit-based-configuration-yet-how-do-we-handle-this","title":"FRR is not offering a commit-based configuration yet, how do we handle this?","text":"This part is complex. The future of FRR is clear: the Northbound API. We plan to use the future incremental feature via CLI or gRPC.
In the meantime, AFK applies all changes in real time:
bgp route-map delay-timer 0
.Everything happens in a really short time as FRR pushes all lines in a simple for loop
in C.
Be careful
file_roots
section in the Salt-master configuration.Example of salt-master configuration:
file_roots:\n base:\n # your own codebase:\n - /srv/salt/base/your-code-base/\n # OpenConfig codebase:\n - /srv/salt/base/openconfig/\n
Important
Make sure to synchronize the modules with your minions:
salt <device> saltutil.sync_all\n
"},{"location":"SaltStack-modules/installation/#dependencies","title":"Dependencies","text":"Depending on the Network OS you want to support, you will need:
To apply the configuration, you need the input data in OpenConfig format (JSON RFC7951).
If you are using our Data Aggregation API, you can create the following pillar in {SALT_PILLAR_PATH}/data-aggregation-api-openconfig.sls
:
#!py\nimport logging\n\nimport requests\n\nDATACENTER = \"paris\"\nENVIRONMENT = \"production\"\nUSER = \"salt-master\"\nPASSWORD = \"awesomepassword\"\n\nDATA_API = f\"https://data-aggregation-api.{DATACENTER}.{ENVIRONMENT}.local\"\n\n\ndef run():\n    \"\"\"Get OpenConfig data for the device of the current minion.\"\"\"\n    # __grains__ is injected by the Salt Python renderer.\n    device = __grains__[\"id\"]\n    # Path taken from the endpoint list of the Data Aggregation API.\n    openconfig_endpoint = f\"{DATA_API}/v1/devices/{device}/openconfig\"\n\n    try:\n        response = requests.get(openconfig_endpoint, auth=(USER, PASSWORD), timeout=10)\n        # Treat HTTP errors (4xx/5xx) as failures instead of parsing their body.\n        response.raise_for_status()\n        return {\"openconfig\": response.json()}\n    except Exception as error:\n        logging.error(\"data-aggregation-api: failed to query '%s' because %s\", openconfig_endpoint, error)\n        return {}\n
"},{"location":"SaltStack-modules/usage/#apply-the-configuration-manually","title":"Apply the configuration manually","text":"salt <device> state.apply afk test=True
salt <device> state.apply afk
Info
These are examples. Make sure to adapt them to your infrastructure.
Simple schedule in {SALT_PILLAR_PATH}/schedule_simple_afk.sls
schedule:\n afk:\n function: state.sls\n args:\n - afk\n minutes: 30\n range:\n start: 8am\n end: 7pm\n
Schedule only for devices having an afk-enabled
tag in NetBox in {SALT_PILLAR_PATH}/schedule_smart_afk.sls
#!py\nimport logging\n\nimport requests\n\nDATACENTER = \"paris\"\nENVIRONMENT = \"production\"\nUSER = \"salt-master\"\nPASSWORD = \"awesomepassword\"\n\nDATA_API = f\"https://data-aggregation-api.{DATACENTER}.{ENVIRONMENT}.local\"\n\n\ndef run():\n    \"\"\"Schedule AFK only if the device of the current minion is AFK enabled.\"\"\"\n    # __grains__ is injected by the Salt Python renderer.\n    device = __grains__[\"id\"]\n    # Path taken from the endpoint list of the Data Aggregation API.\n    endpoint = f\"{DATA_API}/v1/devices/{device}/afk_enabled\"\n\n    try:\n        response = requests.get(endpoint, auth=(USER, PASSWORD), timeout=10)\n        response.raise_for_status()\n        # Assumption: the response key has the same name as the endpoint.\n        if response.json().get(\"afk_enabled\") is True:\n            return {\n                \"schedule\": {\n                    \"afk\": {\n                        \"function\": \"state.sls\",\n                        \"args\": [\"afk\"],\n                        \"minutes\": 30,\n                        \"range\": {\"start\": \"8am\", \"end\": \"7pm\"},\n                    }\n                }\n            }\n    except Exception as error:\n        logging.error(\"data-aggregation-api: failed to query '%s' because %s\", endpoint, error)\n\n    return {}\n
Attention
If you are putting secrets directly in the pillar file, make sure to apply the appropriate permissions to the file. Something like chmod 600
.
At Criteo, we have decided to fully open source our network automation framework.
We have called it AFK, which stands for \"Automation Framework for networK\" (yes, we are cheating a bit with the \"k\", but why not). It fits with the idea of being Away From Keyboard while the network configuration is being deployed or updated automatically.
It is based on NetBox, OpenConfig, SaltStack, and supports Juniper JunOS, Arista EOS and SONiC.
Note
If you are using an ad-blocker, this documentation might not work properly as \"Criteo\" is in some links.
"},{"location":"#repositories","title":"Repositories","text":"Repository Description Latest commit Network CMDB Network CMDB plugin for Netbox Data aggregation API Aggregate data from CMDB and convert to OpenConfig SONiC Salt Deployer Tool to deploy and configure salt-minion on SONiC devices SONiC SaltStack States/execution modules for SONiC SONiC utilities SONiC scripts used by some SONiC SaltStack modules"},{"location":"#global-design","title":"Global design","text":"Note
Our approach to automation is opinionated. There are tons of ways of doing network configuration, and choices must be made.
This diagram shows the components of our framework:
flowchart TD\n CMDB[Network CMDB]\n DAAPI[Data Aggregation API]\n DEV[Network_Devices]\n DATASOURCE[Other data source*]\n\n CMDB -->|raw data| DAAPI\n DATASOURCE -->|raw data| DAAPI\n DAAPI -->|openconfig| SaltStack\n SaltStack -->|configuration| DEV[Network_Devices]
* The Data Aggregation API will be able to get and merge data from other data sources once a plugin system is in place.
"},{"location":"#network-cmdb","title":"Network CMDB","text":"The Network CMDB contains data relative to the business and is completely agnostic to the network OS.
The models are designed to describe the objects themselves rather than the configuration from device perspective. The idea is also to avoid any data duplication which could lead to configuration mismatches.
For instance, we represent the BGP session itself with two joined tables describing peers: DeviceBGPSession
<==> BGPSession
<==> DeviceBGPSession
DeviceBGPSession
contains the local-as
but not the peer-as
, avoiding data duplication. The peer-as
being the local-as
of the other neighbor.BGPSession
contains all information peers have in common, like state (in production
, maintenance
etc...) or MD5 password
.This API aggregates data from their sources of truth: the Network CMDB or possibly any other data source you may have.
Then, it computes this data to provide OpenConfig JSON for each device as an output.
ygot is used to validate the output against the OpenConfig YANG models.
"},{"location":"#saltstack-modules","title":"SaltStack modules","text":"Our AFK Salt modules take OpenConfig data and convert it into network configuration. We are using templates to do that.
The end goal is to simply forward this OpenConfig data to the Network OS to apply the configuration. Currently, OpenConfig is, at best, partially implemented in Network Operating Systems.
"},{"location":"FAQ/","title":"FAQ","text":""},{"location":"FAQ/#how-to-decommission-an-entire-device","title":"How to decommission an entire device?","text":"For now, the mantra is: create your own Salt State to decommission.
We do have an optional removal feature, but with some safeguards.
As of today, we think decommissioning should be handled in different Salt States and modules than provisioning.
"},{"location":"FAQ/#what-part-is-device-agnostic-and-what-is-not","title":"What part is device-agnostic, and what is not?","text":"The only part of the stack which is not agnostic is SaltStack.
The job of Salt is to convert agnostic data coming from Data Aggregation API to a configuration the devices understand. So, the data in the NetBox CMDB and the Data Aggregation API must be completely agnostic, with nothing specific to a Network OS.
If we did not do that, Salt could try to push a configuration which would not be appropriate for the device.
Note
In the future, the Data Aggregation API might no longer be device-agnostic because of interface naming. More info to come.
"},{"location":"FAQ/#what-if-i-have-a-use-case-which-is-not-covered-by-openconfig","title":"What if I have a use case which is not covered by OpenConfig?","text":"You should follow these general rules:
See also Data-Aggregation-API.
"},{"location":"FAQ/#some-network-os-requires-extra-configuration-for-a-service-where-do-i-put-that","title":"Some Network OS requires extra configuration for a service, where do I put that?","text":"You are perfectly allowed to duplicate/convert an object if needed.
For instance, depending on the Network OS, PBR sometimes requires an ACL and sometimes a route-map. If it is doable, do that conversion in the Salt module and/or in the template.
"},{"location":"FAQs/","title":"FAQs","text":"The FAQs are divided per section to make them easier to digest.
For now, AFK provides a partial configuration removal feature. It is not enabled by default.
We do not advise enabling it, as it has not been battle-tested yet. If you still want to use it, please be careful and run tests in your lab first!
"},{"location":"configuration-cleaning/#target","title":"Target","text":"The target is to manage the configuration like any Configuration Management Software, e.g. Chef.
The system maintains the desired state but will not remove extra configuration. For instance, when you maintain the configuration of a server with Chef, it will never remove an extra package installed by a user. This is the same here. If you manually add a protocol which is not supported by AFK, like OSPF, it will not be removed. Meanwhile, BGP sessions are maintained by AFK, so extra BGP sessions can be removed.
We do not want to give too much power to the CMDB, which hosts the source of truth of what should be on the device. We do not want to remove all BGP sessions if someone removes the wrong device in NetBox.
This is why we have already implemented safeguards and will continue to do so.
Some examples:
Info
SSH support will probably not be added because we consider it to be part of a minimal configuration which must be configured on the device during bootstrap (via ZTP or manual).
"},{"location":"installation/","title":"Installation","text":"Component How to install NetBox https://docs.netbox.dev/en/stable/installation/ Network CMDB Network CMDB plugin for Netbox Data Aggregation API see our Installation guide SaltStack https://docs.saltproject.io/salt/install-guide/en/latest/ SaltStack for JunOS/EOS proxy-minion with napalm driver SaltStack for SONiC see our SONiC Salt Deployer and our AFK modules for SaltStackNote
If you are not familiar with NetBox and SaltStack, you should look at their awesome documentation prior to trying AFK:
Info
We plan to provide a script to deploy a development environment. You will be able to use it as inspiration for your production deployment.
"},{"location":"monitoring/","title":"Monitoring","text":"Warning
This page is under construction.
"},{"location":"provisioning/","title":"Provisioning","text":""},{"location":"provisioning/#bootstrap-your-devices","title":"Bootstrap your devices","text":"For now, AFK does not provide any way to generate bootstrap configuration.
The advised workflow is the following:
The minimal configuration cannot be generated by Salt as it requires a salt-minion for the device.
You could use Python scripts for instance to generate them, host the minimal configuration files on a server and serve them via your ZTP process.
One alternative would be to rely on the default configuration of your devices. But this depends on the Network OS used.
Note
We think this minimal configuration should not be maintained by AFK, to avoid losing access to the device in case of misconfiguration in the CMDB.
"},{"location":"provisioning/#use-provisioners","title":"Use Provisioners","text":"The provisioner is where your business logic is located.
It can take many forms. At Criteo we have two kinds:
one shot provisioner
provisions the CMDB to have a fully operational datacenter.service provisioner
provisions the CMDB on demand. It allows your internal clients to dynamically benefit from network services, like BGP as a service.Here you have two options:
not so AFK
: just run the SaltStack State manually via the salt <device> state.apply afk
commandreal AFK
: create a schedule to automatically apply the configurationNetBox is a great tool to manage your DCIM and IPAM.
Our CMDB models directly use the DCIM and IPAM models, like Device
and IP prefixes
.
NetBox also includes several models which could be considered as CMDB such as VLAN
, VRF
and L2VPN
. We do not want to use these models to avoid confusion and design mismatches with our CMDB.
OpenConfig models are designed from a device perspective.
Our CMDB models are designed from a service/asset perspective. It allows us to apply common parameters without worrying about applying them at both ends.
Example: we can directly set a maintenance status on a BGP session.
It also allows us to limit redefinition of values that can lead to misconfiguration. For instance, in OpenConfig nothing prevents us from having a mismatch of ASN in a BGP session. In our CMDB, it cannot happen as the BGP session is the central object which links two devices, hence the neighbor A remote-as is the neighbor B local-as.
"},{"location":"CMDB/FAQ/#is-it-perfect-and-completely-generic","title":"Is it perfect and completely generic?","text":"No
It will not cover all use cases, but we will bring more features.
The CMDB models come from opinionated choices aiming to simplify the implementation.
Also, we do not aim to have a factorized configuration.
"},{"location":"CMDB/FAQ/#why-is-max-prefixes-configured-at-the-top-level-of-the-neighbor-and-not-in-a-safi","title":"Why is max-prefixes configured at the top level of the neighbor and not in a SAFI?","text":"This is due to an EOS limitation. It cannot set the max prefix at SAFI level:
device.lab(config)#router bgp 65000\n\ndevice.lab(config-router-bgp)#neighbor 100.64.1.0 maximum-routes ?\n <0-4294967294> Maximum number of routes (0 means unlimited)\n\ndevice.lab(config-router-bgp)#address-family ipv4\n\ndevice.lab(config-router-bgp-af)#neighbor 100.64.1.0 ?\n activate Activate neighbor in the address family\n additional-paths BGP additional-paths commands\n default-originate Advertise a default route to this peer\n graceful-restart Enable graceful restart mode\n next-hop Next-hop address-family configuration\n next-hop-unchanged Preserve original nexthop while advertising routes to eBGP peers\n prefix-list Prefix list reference\n route-map Name a route map\n weight Assign weight for routes learnt from this peer\n
"},{"location":"CMDB/FAQ/#why-is-there-no-permitdeny-field-in-community-lists-and-prefix-lists","title":"Why is there no permit/deny field in community-lists and prefix-lists?","text":"While some devices provide a permit/deny
field in a community/prefix-list, other Network OSes, like JunOS, do not. More importantly, OpenConfig does not have it either.
If you are currently using permit/deny
, we suggest adapting your route-map instead.
Example
You should migrate from:
ip prefix-list FANCY_PREFIX_LIST 10 deny 10.0.0.0/24\nip prefix-list FANCY_PREFIX_LIST 20 permit 10.0.0.0/8 le 32\n\nroute-map RM-TEST-IN 5 permit\n match ip prefix-list FANCY_PREFIX_LIST\n
to:
ip prefix-list FANCY_PREFIX_LIST 10 permit 10.0.0.0/24\nip prefix-list ANOTHER_FANCY_PREFIX_LIST 20 permit 10.0.0.0/8 le 32\n\nroute-map RM-TEST-IN 5 deny\n match ip prefix-list FANCY_PREFIX_LIST\n\nroute-map RM-TEST-IN 10 permit\n match ip prefix-list ANOTHER_FANCY_PREFIX_LIST\n
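The migration above can be automated with a small script. The sketch below is a hypothetical helper, not part of AFK: the function name `split_prefix_list` and the numeric suffix used for the new prefix-list names are assumptions. It also assumes, like the example above, that no later route-map term is meant to match the denied prefixes.

```python
from itertools import groupby


def split_prefix_list(name, entries):
    """Split a permit/deny prefix-list into permit-only lists plus route-map terms.

    entries is a list of (sequence, decision, prefix) tuples. Consecutive
    entries sharing a decision are grouped, so the evaluation order is kept.
    """
    prefix_lists = {}
    route_map_terms = []
    ordered = sorted(entries, key=lambda entry: entry[0])
    grouped = groupby(ordered, key=lambda entry: entry[1])
    for index, (decision, group) in enumerate(grouped, start=1):
        list_name = f'{name}-{index}'
        # Every prefix becomes a permit entry of the new prefix-list.
        prefix_lists[list_name] = [prefix for _, _, prefix in group]
        # The decision moves to the route-map term matching that list.
        route_map_terms.append((index * 10, decision, list_name))
    return prefix_lists, route_map_terms


prefix_lists, terms = split_prefix_list(
    'FANCY_PREFIX_LIST',
    [(10, 'deny', '10.0.0.0/24'), (20, 'permit', '10.0.0.0/8 le 32')],
)
```

Each returned term is a (sequence, decision, prefix-list name) tuple that can be rendered as a route-map entry.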
"},{"location":"CMDB/FAQ/#why-cannot-a-route-map-filter-on-safi-while-openconfig-permits-it","title":"Why cannot a route-map filter on SAFI while OpenConfig permits it?","text":"The issue here is coming from implementation differences between JunOS and EOS/FRR:
In JunOS, this should be done using afi-safi-in
in routing-policy (applying a route-map in a specific SAFI in BGP configuration is not possible): route-map->term->from->safi
In EOS/FRR, this should be done in BGP config (afi-safi-in
in routing-policy is not supported): bgp->neighbor->safi->route-map
Authorizing both methods could lead to conflicts.
Example
If we set the following configuration in the CMDB:
route-map RM-TEST\n term 10\n from afi-safi-in ipv4_unicast\n\nrouter bgp\n neighbor toto\n address-family ipv6 unicast\n route-map out RM-TEST\n
We would end up with a route-map with different meanings depending on the network OS.
We decided to not support afi-safi-in
in routing-policy to simplify and reduce risk.
Yes.
JunOS example
# we create duplicates per SAFI, adding \"from family <safi>\" in each term:\nset policy-options policy-statement AUTOGENERATED::RM-TEST:IPV4_UNICAST term 10 from family inet\n...\n\nset policy-options policy-statement AUTOGENERATED::RM-TEST:IPV6_UNICAST term 10 from family inet6\n...\n\n# then we apply the one matching the SAFI used in the CMDB neighbor configuration (bgp->neighbor->address-family->route-map)\nset routing-instance prod protocols bgp group TEST neighbor 192.0.2.1 export [AUTOGENERATED::RM-TEST:IPV4_UNICAST]\n
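The per-SAFI duplication can be sketched in Python. This is an illustration only: in AFK this logic lives in the Salt modules and templates, and the names `policy_name`, `per_safi_terms` and `JUNOS_FAMILY` are hypothetical. The naming scheme follows the JunOS example above.

```python
# Assumed mapping between the SAFI names used in the CMDB and JunOS families.
JUNOS_FAMILY = {'ipv4_unicast': 'inet', 'ipv6_unicast': 'inet6'}


def policy_name(route_map, safi):
    """Name of the duplicated policy for one SAFI."""
    return f'AUTOGENERATED::{route_map}:{safi.upper()}'


def per_safi_terms(route_map, term_sequences, safis):
    """Return one 'from family' statement per SAFI and per term."""
    lines = []
    for safi in safis:
        for sequence in term_sequences:
            lines.append(
                f'set policy-options policy-statement {policy_name(route_map, safi)} '
                f'term {sequence} from family {JUNOS_FAMILY[safi]}'
            )
    return lines


lines = per_safi_terms('RM-TEST', [10], ['ipv4_unicast', 'ipv6_unicast'])
```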
"},{"location":"CMDB/FAQ/#why-is-send-community-not-implemented","title":"Why is send-community not implemented?","text":"For two reasons:
Adding it would make the Salt templates more complex because sometimes it would be directly in the BGP configuration, sometimes it would imply the creation of a route-map.
Additionally, it would make route-policies and BGP states more tightly coupled.
"},{"location":"CMDB/FAQ/#why-are-peer-groups-deprecated-not-supported","title":"Why are peer-groups deprecated / not supported?","text":"Because peer-groups bring a lot of complexity and risks, especially because of FRR.
In FRR, for sessions being in a peer-group:
some of the issues we found in FRR
When the neighbor is in a peer-group and the remote-as is set at neighbor level only:
error: \u201cCannot change the peer-group. Deconfigure first\u201d
When the neighbor is in a peer-group and the remote-as is set at the peer-group level only:
There are other issues, and we did not even talk about the difference with the other Network OS...
"},{"location":"CMDB/FAQ/#questions-about-deprecated-features","title":"Questions about deprecated features","text":"expand..."},{"location":"CMDB/FAQ/#why-do-peer-groups-not-have-maximum-prefixes","title":"Why do peer-groups not have maximum-prefixes?","text":"maximum-prefixes
can only be set at the SAFI levelBecause of EOS, the maximum-prefixes
field in the CMDB is set at the peer/peer-group level too. Then, the Data Aggregation API duplicates the value for each maximum-prefixes
set.
We could force the maximum-prefixes
for each SAFI whether they are enabled or not. But in JunOS it would mean enabling all the SAFIs for the peer-group.
So it is easier to enable this option only at the peer level, as the peer must be in a SAFI to work.
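The duplication of the peer-level value to each SAFI can be sketched as follows. This is a hedged example, not the Data Aggregation API implementation: the function name `max_prefixes_per_safi` is hypothetical, and the field names are taken from the bgp-sessions endpoint output.

```python
def max_prefixes_per_safi(peer):
    """Copy the peer-level maximum_prefixes value to each enabled AFI/SAFI."""
    limit = peer.get('maximum_prefixes')
    if limit is None:
        # Nothing to duplicate when no limit is set on the peer.
        return {}
    return {safi['afi_safi_name']: limit for safi in peer.get('afi_safis', [])}


peer = {
    'maximum_prefixes': 10000,
    'afi_safis': [
        {'afi_safi_name': 'ipv4-unicast'},
        {'afi_safi_name': 'ipv6-unicast'},
    ],
}
limits = max_prefixes_per_safi(peer)
```

Only the SAFIs listed for the peer receive the limit, so no extra SAFI gets enabled as a side effect.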
"},{"location":"CMDB/FAQ/#why-dont-we-support-safi-in-peer-groups","title":"Why don't we support SAFI in peer-groups?","text":"Because of FRR:
For retro-compatibility purposes, we do not manage the statement no neighbor <peer-group> activate
. Meaning, if it is added manually, it will not be removed.
Warning
To ease migration, our Salt template for SONiC supports peer-groups without remote-as.
All these restrictions come from FRR. In FRR, we cannot set up the remote-as at both peer-group and neighbor level.
It means that the peer-group remote-as must match the remote-as of all its neighbors. Otherwise, the CMDB would not reflect what would really be applied to the device.
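This constraint can be checked before anything is pushed to a device. The sketch below is a hypothetical validation, not the CMDB code: the function name `mismatched_neighbors` and the input layout are assumptions.

```python
def mismatched_neighbors(peer_group_remote_as, neighbors):
    """Return the neighbors whose remote-as differs from the peer-group one.

    neighbors maps a neighbor address to its remote-as.
    """
    return sorted(
        address
        for address, remote_as in neighbors.items()
        if remote_as != peer_group_remote_as
    )


invalid = mismatched_neighbors(65001, {'192.0.2.1': 65001, '192.0.2.3': 65002})
```

An empty result means the peer-group and all its neighbors agree on the remote-as.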
TL;DR
Note: for local-as the behavior is different:
router bgp 65000\n neighbor PG-L3_RA local-as 65000\n% Cannot have local-as same as BGP AS number\n\n neighbor 192.0.2.1 local-as 65001\n neighbor 192.0.2.1 local-as 65000\n% Cannot have local-as same as BGP AS number\n neighbor 192.0.2.1 local-as 65002\n
"},{"location":"CMDB/endpoints/","title":"Endpoints","text":""},{"location":"CMDB/endpoints/#asns","title":"ASNs","text":"Endpoint: /api/plugins/cmdb/asns/
Provides all available ASNs.
Example{\n \"results\": [\n {\n \"id\": 1,\n \"created\": \"2022-09-28T12:28:52.112949Z\",\n \"last_updated\": \"2022-09-28T12:28:52.112978Z\",\n \"organization_name\": \"Paris\",\n \"number\": 65000\n }\n ]\n}\n
"},{"location":"CMDB/endpoints/#bgp-global","title":"BGP Global","text":"Endpoint: /api/plugins/cmdb/bgp-global/
Provides the BGP global configuration for each device.
ExampleTODO\n
"},{"location":"CMDB/endpoints/#bgp-sessions","title":"BGP sessions","text":"Endpoint: /api/plugins/cmdb/bgp-sessions/
Provides all BGP sessions.
Example{\n \"results\": [\n {\n \"id\": 1,\n \"peer_a\": {\n \"id\": 1,\n \"local_address\": {\n \"id\": 1234,\n \"url\": \"https://netbox.local/api/ipam/ip-addresses/1234/\",\n \"display\": \"192.0.2.16/31\",\n \"family\": 4,\n \"address\": \"192.0.2.16/31\"\n },\n \"device\": {\n \"id\": 123,\n \"name\": \"tor1\"\n },\n \"local_asn\": {\n \"id\": 1,\n \"number\": 65000,\n \"organization_name\": \"Criteo-65000\"\n },\n \"afi_safis\": [\n {\n \"id\": 217,\n \"route_policy_in\": null,\n \"route_policy_out\": null,\n \"afi_safi_name\": \"ipv4-unicast\"\n }\n ],\n \"route_policy_in\": null,\n \"route_policy_out\": null,\n \"created\": \"2022-11-07T16:57:33.848779Z\",\n \"last_updated\": \"2022-11-07T16:57:33.848795Z\",\n \"description\": \"to-spine1\",\n \"enforce_first_as\": true,\n \"maximum_prefixes\": 10000\n },\n \"peer_b\": {\n \"id\": 218,\n \"local_address\": {\n \"id\": 22,\n \"url\": \"https://netbox.local/api/ipam/ip-addresses/22/\",\n \"display\": \"192.0.2.17/31\",\n \"family\": 4,\n \"address\": \"192.0.2.17/31\"\n },\n \"device\": {\n \"id\": 345,\n \"name\": \"spine1\"\n },\n \"local_asn\": {\n \"id\": 10,\n \"number\": 65001,\n \"organization_name\": \"Criteo-65001\"\n },\n \"afi_safis\": [\n {\n \"id\": 2,\n \"route_policy_in\": null,\n \"route_policy_out\": null,\n \"afi_safi_name\": \"ipv4-unicast\"\n }\n ],\n \"route_policy_in\": null,\n \"route_policy_out\": null,\n \"created\": \"2022-11-07T16:57:33.853487Z\",\n \"last_updated\": \"2022-11-07T16:57:33.853500Z\",\n \"description\": \"to-tor1\",\n \"enforce_first_as\": true,\n \"maximum_prefixes\": 10000\n },\n \"tenant\": null,\n \"created\": \"2022-11-07T16:57:33.856145Z\",\n \"last_updated\": \"2022-11-07T16:57:33.856156Z\",\n \"status\": \"active\",\n \"password\": \"\",\n \"circuit\": null\n }\n ]\n}\n
"},{"location":"CMDB/endpoints/#prefix-lists","title":"Prefix lists","text":"Endpoint: /api/plugins/cmdb/prefix-lists/
Provides the prefix lists for each device.
Example{\n \"results\": [\n {\n \"id\": 129,\n \"name\": \"PF-ANY_IPV6\",\n \"device\": {\n \"id\": 1,\n \"name\": \"tor1\"\n },\n \"ip_version\": \"ipv6\",\n \"terms\": [\n {\n \"sequence\": 10,\n \"decision\": \"permit\",\n \"prefix\": \"::/0\",\n \"le\": 128,\n \"ge\": null\n }\n ]\n }\n ]\n}\n
"},{"location":"CMDB/endpoints/#bgp-community-lists","title":"BGP community lists","text":"Endpoint: /api/plugins/cmdb/bgp-community-lists/
Provides the BGP community lists for each device.
Example{\n \"results\": [\n {\n \"id\": 1,\n \"device\": {\n \"id\": 1,\n \"name\": \"tor1\"\n },\n \"terms\": [\n {\n \"sequence\": 10,\n \"decision\": \"permit\",\n \"community\": \"65000:10000\"\n }\n ],\n \"created\": \"2022-07-28T13:37:05.699938Z\",\n \"last_updated\": \"2022-07-28T13:37:05.699968Z\",\n \"name\": \"CL-SERVER\"\n }\n ]\n}\n
"},{"location":"CMDB/endpoints/#route-policies","title":"Route policies","text":"Endpoint: /api/plugins/cmdb/route-policies/
Provides the route maps for each device.
Example{\n \"results\": [\n {\n \"id\": 129,\n \"name\": \"RM-UPLINK-IN\",\n \"device\": {\n \"id\": 123,\n \"name\": \"tor1\"\n },\n \"description\": \"tor1:uplink-in\",\n \"terms\": [\n {\n \"description\": \"\",\n \"sequence\": 10,\n \"decision\": \"permit\",\n \"from_bgp_community\": \"\",\n \"from_bgp_community_list\": null,\n \"from_prefix_list\": {\n \"id\": 129,\n \"device\": 123,\n \"name\": \"PF-ANY_IPV6\"\n },\n \"from_source_protocol\": \"\",\n \"from_route_type\": \"\",\n \"from_local_pref\": null,\n \"set_local_pref\": null,\n \"set_community\": \"\",\n \"set_origin\": \"\",\n \"set_metric\": null,\n \"set_large_community\": \"\",\n \"set_as_path_prepend\": \"\",\n \"set_next_hop\": null\n },\n {\n \"description\": \"\",\n \"sequence\": 20,\n \"decision\": \"permit\",\n \"from_bgp_community\": \"\",\n \"from_bgp_community_list\": {\n \"id\": 1,\n \"device\": 123,\n \"name\": \"CL-SERVER\"\n },\n \"from_prefix_list\": null,\n \"from_source_protocol\": \"\",\n \"from_route_type\": \"\",\n \"from_local_pref\": null,\n \"set_local_pref\": null,\n \"set_community\": \"\",\n \"set_origin\": \"\",\n \"set_metric\": null,\n \"set_large_community\": \"\",\n \"set_as_path_prepend\": \"\",\n \"set_next_hop\": null\n },\n {\n \"description\": \"\",\n \"sequence\": 30,\n \"decision\": \"deny\",\n \"from_bgp_community\": \"\",\n \"from_bgp_community_list\": null,\n \"from_prefix_list\": null,\n \"from_source_protocol\": \"\",\n \"from_route_type\": \"\",\n \"from_local_pref\": null,\n \"set_local_pref\": null,\n \"set_community\": \"\",\n \"set_origin\": \"\",\n \"set_metric\": null,\n \"set_large_community\": \"\",\n \"set_as_path_prepend\": \"\",\n \"set_next_hop\": null\n }\n ]\n }\n ]\n}\n
"},{"location":"CMDB/installation/","title":"Installation","text":""},{"location":"CMDB/installation/#existing-netbox-instance","title":"Existing NetBox instance","text":"This is simple:
network_cmdb
plugin from the criteo/netbox-network-cmdb repository.netbox/configuration.py
file and add netbox_cmdb
in the PLUGINS
list.You can have a look at the docker-compose.yml
and the other scripts in the develop directory.
make start
from the Network CMDB repository. For now, there are no CMDB components in the NetBox UI. They will be added later.
In the meantime you can access the CMDB in the Django Admin UI: http://127.0.0.1:8000/admin/
"},{"location":"CMDB/models/BGP/","title":"BGP","text":""},{"location":"CMDB/models/BGP/#source-code","title":"Source code","text":"Models location: netbox_cmdb/models/bgp.py
"},{"location":"CMDB/models/BGP/#table-relations","title":"Table relations","text":"Info
To simplify the diagram, not all relations are displayed (example: DeviceBGPSession
=> IPAM.IPAddress
)
erDiagram\n\n dcim_Device 1--1 BGPGlobal: \"\"\n dcim_Device 1--0+ BGPSession: \"\"\n BGPSession 1--1+ DeviceBGPSession: \"\"\n DeviceBGPSession 1--0+ AfiSafi: \"\"\n DeviceBGPSession 1--0+ RoutePolicy: \"\"\n AfiSafi 1--0+ RoutePolicy: \"\"
"},{"location":"CMDB/models/BGP/#bgp-global-configuration","title":"BGP global configuration","text":"BGPGlobal:\n device: dcim.Device\n local_asn: cmdb.ASN\n router_id: string\n ebgp_administrative_distance: integer\n ibgp_administrative_distance: integer\n graceful_restart: boolean\n graceful_restart_time: integer\n ecmp: boolean\n ecmp_maximum_paths: integer\n
ASN:\n organization_name: string\n number: integer\n
"},{"location":"CMDB/models/BGP/#bgp-sessions","title":"BGP sessions","text":"The implementation for BGP sessions is more complex.
The main idea is to have a BGPSession
linked to two DeviceBGPSession
to avoid data duplication such as local-asn
vs remote-asn
.
Info
As we have deprecated the usage of peer-groups, we do not document the model here.
BGPSession:\n state: string choice\n monitoring_state: string choice\n peer_a: cmdb.DeviceBGPSession\n peer_b: cmdb.DeviceBGPSession\n password: string\n circuit: cmdb.Circuit\n tenant: dcim.Tenant\n
DeviceBGPSession:\n device: dcim.Device\n enabled: boolean\n description: string\n local_address: netbox.ipam.IPAddress\n remote_asn: cmdb.ASN\n peer_group: cmdb.PeerGroup (deprecated)\n maximum_prefixes: integer\n route_policy_in: cmdb.RoutePolicy\n route_policy_out: cmdb.RoutePolicy\n enforce_first_as: boolean (not used yet)\n
AfiSafi:\n device: dcim.Device\n route_policy_in: cmdb.RoutePolicy\n route_policy_out: cmdb.RoutePolicy\n device_bgp_session: cmdb.DeviceBGPSession\n
"},{"location":"CMDB/models/Routing-Policies/","title":"Routing Policies","text":""},{"location":"CMDB/models/Routing-Policies/#source-code","title":"Source code","text":"Models locations:
erDiagram\n\n dcim_Device 1--0+ RoutePolicy: \"\"\n RoutePolicy 1--0+ RoutePolicyTerm: \"\"\n RoutePolicyTerm 0+--o| BgpCommunityList: \"\"\n RoutePolicyTerm 0+--o| PrefixList: \"\"\n BgpCommunityList 1--0+ BgpCommunityListTerm: \"\"\n PrefixList 1--0+ PrefixListTerm: \"\"\n
"},{"location":"CMDB/models/Routing-Policies/#prefix-lists","title":"Prefix lists","text":"PrefixList:\n name: string\n device: dcim.Device\n ip_version: string choice\n
PrefixListTerm:\n prefix_list: cmdb.PrefixList\n sequence: integer\n decision: string choice\n prefix: IPNetwork\n le: integer\n ge: integer\n
"},{"location":"CMDB/models/Routing-Policies/#community-lists","title":"Community lists","text":""},{"location":"CMDB/models/Routing-Policies/#_1","title":"Routing Policies","text":"BGPCommunityList:\n name: string\n device: dcim.Device\n
BGPCommunityListTerm:\n bgp_community_list: cmdb.BGPCommunityList\n sequence: integer\n decision: string choice\n community: string\n
"},{"location":"CMDB/models/Routing-Policies/#route-policies","title":"Route policies","text":"RoutePolicy:\n name: string\n device: dcim.Device\n description: string\n
RoutePolicyTerm:\n route_policy: cmdb.RoutePolicy\n description: string\n sequence: integer\n decision: string choice\n\n # match\n from_bgp_community: string\n from_bgp_community_list: cmdb.BgpCommunityList\n from_prefix_list: cmdb.PrefixList\n from_source_protocol: string\n from_route_type: string\n from_local_pref: integer\n\n # set\n set_local_pref: integer\n set_community: string\n set_origin: string\n set_metric: integer\n set_large_community: string\n set_as_path_prepend: string\n set_next_hop: IPaddress\n
"},{"location":"Data-Aggregation-API/configuration/","title":"Configuration","text":""},{"location":"Data-Aggregation-API/configuration/#configuration-file","title":"Configuration file","text":"The Data Aggregation API can be configured with flags, environments variables and configuration file.
The precedence order for the different methods is:
Info
Datacenter: \"europe\"\n\nLog:\n  Level: \"error\"\n  Pretty: true\n\nAPI:\n  ListenAddress: \"127.0.0.1\"\n  ListenPort: 1234\n\nAuthentication:\n  LDAP:\n    URL: \"ldaps://URL.local\"\n    BindDN: \"cn=<user>,OU=<ou>,DC=<local>\"\n    BaseDN: \"DC=<local>\"\n    Password: \"<some_password>\"\n    WorkersCount: 10\n    Timeout: 5s\n    MaxConnectionLifetime: 1m\n    InsecureSkipVerify: false\n\nNetBox:\n  URL: \"https://netbox.local\"\n  APIKey: \"<some_key>\"\n  DatacenterFilterKey: \"site\"\n  LimitPerPage: 500\n\nBuild:\n  Interval: \"30m\"\n  AllDevicesMustBuild: false\n
"},{"location":"Data-Aggregation-API/configuration/#global-settings","title":"Global settings","text":"Parameter Default Description Datacenter `` Value used to filter devices. The key is defined by DatacenterFilterKey
."},{"location":"Data-Aggregation-API/configuration/#log-settings","title":"Log settings","text":"All parameters below are in the Log
section of the configuration. This section is optional.
info
Log level. Pretty false
If enabled: human readable logs (with colors). If disabled: structured logs."},{"location":"Data-Aggregation-API/configuration/#api-settings","title":"API settings","text":"All parameters below are in the API
section of the configuration. This section is optional.
0.0.0.0
Listening address of the web API. ListenPort 8080
Listening port of the web API."},{"location":"Data-Aggregation-API/configuration/#authentication-settings","title":"Authentication settings","text":"All parameters below are in the Authentication
->LDAP
section of the configuration.
LDAP authentication is only used to authenticate users when they query the web API, i.e. when they want to retrieve the built configuration.
This section is optional.
Parameter Default Description InsecureSkipVerifyfalse
Ignore LDAP TLS warnings. URL URL of the LDAP server. BindDN Bind used to query the LDAP server. Password Password to query the LDAP server. BaseDN Only users matching the BaseDN are authorized to query the web API. WorkersCount 10
Number of workers to authenticate users concurrently. Timeout 10s
Time to wait before considering an LDAP request timed out. MaxConnectionLifetime 1m
Lifetime of worker connection to LDAP. Useful to re-use existing LDAP connection."},{"location":"Data-Aggregation-API/configuration/#netbox-settings","title":"NetBox settings","text":"All parameters below are in the NetBox
section of the configuration.
site
Key used to filter devices, using Datacenter
as a value. LimitPerPage 100
Number of elements per page when getting data from NetBox API."},{"location":"Data-Aggregation-API/configuration/#build-settings","title":"Build settings","text":"All parameters below are in the Build
section of the configuration. This section is optional.
1m
Build interval, e.g.: 10m
, 1h
AllDevicesMustBuild false
The build fails if one device has not been built."},{"location":"Data-Aggregation-API/configuration/#alternative-methods","title":"Alternative methods","text":""},{"location":"Data-Aggregation-API/configuration/#environment-variables","title":"Environment variables","text":"All settings available in the configuration file can be set as environment variables, but:
DAAPI_
_
is the level separatorFor example, the equivalent of this config file:
Datacenter: \"europe\"\nNetBox:\n URL: \"https://netbox.local\"\n APIKey: \"<some_key>\"\n
is:
DAAPI_DATACENTER=\"europe\"\nDAAPI_NETBOX_URL=\"https://netbox.local\"\nDAAPI_NETBOX_APIKEY=\"<some_key>\"\n
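The mapping from configuration keys to environment variable names can be sketched in a few lines of Python. This is only an illustration of the naming rule: the helper `to_env_vars` is hypothetical, and upper-casing the keys is an assumption inferred from the example above, not something taken from the Data Aggregation API code.

```python
def to_env_vars(config, prefix="DAAPI"):
    """Flatten a nested configuration into environment variable names.

    Assumptions (inferred from the example, not from the actual code):
    names start with the DAAPI prefix, "_" separates levels, keys are upper-cased.
    """
    env = {}
    for key, value in config.items():
        name = f"{prefix}_{key.upper()}"
        if isinstance(value, dict):
            # Nested section: recurse, using the current name as the new prefix.
            env.update(to_env_vars(value, name))
        else:
            env[name] = str(value)
    return env


config = {
    "Datacenter": "europe",
    "NetBox": {"URL": "https://netbox.local", "APIKey": "<some_key>"},
}
for name, value in to_env_vars(config).items():
    print(f'{name}="{value}"')
```

Running it on the configuration above prints the same three variables as the example.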
"},{"location":"Data-Aggregation-API/design/","title":"Design","text":""},{"location":"Data-Aggregation-API/design/#ingestors","title":"Ingestors","text":"An ingestor is a component responsible for fetching data from a single source of truth. Most of the time it is querying a single endpoint.
Examples of ingestor with their associated endpoints:
/api/plugins/cmdb/bgp-sessions/
/api/plugins/cmdb/route-policies/
The precompute step is responsible for extracting raw data per device. The goal is to be able to have all needed data to compute one device configuration.
For instance, a BGPSession
has two peers: peer_a
and peer_b
. The devices matching the peers will all have this BGPSession
in their respective raw_data
.
Here is a simplified output of the bgp-sessions
endpoint.
{\n \"results\": [\n {\n \"peer_a\": {\n \"local_address\": {\n \"address\": \"192.0.2.16/31\"\n },\n \"device\": {\n \"name\": \"tor1\"\n },\n \"local_asn\": {\n \"number\": 65000,\n \"organization_name\": \"Criteo-65000\"\n },\n \"description\": \"to-spine1\"\n },\n \"peer_b\": {\n \"local_address\": {\n \"address\": \"192.0.2.17/31\"\n },\n \"device\": {\n \"name\": \"spine1\"\n },\n \"local_asn\": {\n \"number\": 65001\n },\n \"description\": \"to-tor1\"\n },\n \"status\": \"active\",\n \"password\": \"thisisanincredibleandcomplexpassword:)\"\n }\n ]\n }\n
The precompute will \"copy\" this structure for both devices:
tor1.raw_data[\"bgp-session\"] = results[0]
spine1.raw_data[\"bgp-session\"] = results[0]
Thanks to this simple precompute part, tor1
OpenConfig can be generated independently:
tor1.neighbor[0].local_as = tor1.raw_data[\"bgp-session\"].peer_a.local_asn\ntor1.neighbor[0].remote_as = tor1.raw_data[\"bgp-session\"].peer_b.local_asn\n...\n
This is pseudo-code, only meant to explain the idea.
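The same idea can be written as a small runnable Python sketch. It is not the Data Aggregation API implementation (which is written in Go): the function names `precompute` and `neighbors_for`, the `bgp-sessions` key, and the output field names are all hypothetical, chosen only to mirror the pseudo-code above.

```python
from collections import defaultdict


def precompute(results):
    """Attach each BGP session to the raw data of both of its peers."""
    raw_data = defaultdict(lambda: {"bgp-sessions": []})
    for session in results:
        for peer in ("peer_a", "peer_b"):
            hostname = session[peer]["device"]["name"]
            raw_data[hostname]["bgp-sessions"].append(session)
    return dict(raw_data)


def neighbors_for(hostname, raw_data):
    """Build the neighbors of one device using only its own raw data."""
    neighbors = []
    for session in raw_data[hostname]["bgp-sessions"]:
        peer_a, peer_b = session["peer_a"], session["peer_b"]
        # The local side is whichever peer belongs to this device.
        if peer_a["device"]["name"] == hostname:
            local, remote = peer_a, peer_b
        else:
            local, remote = peer_b, peer_a
        neighbors.append(
            {
                "local_as": local["local_asn"]["number"],
                "peer_as": remote["local_asn"]["number"],
                # Strip the prefix length: "192.0.2.17/31" -> "192.0.2.17"
                "neighbor_address": remote["local_address"]["address"].split("/")[0],
            }
        )
    return neighbors
```

Because every device carries a full copy of the sessions it takes part in, `neighbors_for("tor1", ...)` and `neighbors_for("spine1", ...)` can run independently of each other.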
"},{"location":"Data-Aggregation-API/design/#convertors","title":"Convertors","text":"This is where the magic happens. From the precomputed data, the Data Aggregation API generates OpenConfig configuration for each device.
Thanks to ygot and OpenConfig YANG models, the Data Aggregation API has all the OpenConfig Go structures.
/v1/report/last
Last or ongoing build report /v1/report/last/complete
Report of the last complete build (whether it failed or not) /v1/report/last/successful
Report of the last successful build
The reports contain:
* The status of the build.
* When it started and finished.
* Statistics like performance.
* Logs.
"},{"location":"Data-Aggregation-API/design/#openconfig","title":"OpenConfig","text":"Endpoint Description/v1/devices/[hostname]/openconfig
Get OpenConfig data for a specific device /v1/devices/*/openconfig
Get OpenConfig data for all devices /v1/devices/[hostname]/afk_enabled
Indicates if a device has the tag AFK enabled in NetBox /v1/devices/*/afk_enabled
Same as above, but for all devices Note: afk_enabled
can be used to enable or disable the AFK schedule in Salt via the afk-enabled
tag in NetBox DCIM.
/api/build/trigger
Trigger a new build, only one at a time /metrics
Prometheus metrics /api/version
Details about the running version /api/health
Dummy endpoint for basic healthcheck of the app"},{"location":"Data-Aggregation-API/installation/","title":"Installation","text":""},{"location":"Data-Aggregation-API/installation/#quickstart","title":"Quickstart","text":"Example of basic configuration (settings.yml
):
Datacenter: \"europe\"\n\nLog:\n  Level: \"error\"\n  Pretty: true\n\nAPI:\n  ListenAddress: \"127.0.0.1\"\n  ListenPort: 1234\n\nNetBox:\n  URL: \"https://netbox.local\"\n  APIKey: \"<some_key>\"\n  DatacenterFilterKey: \"site\"\n\nBuild:\n  Interval: \"30m\"\n
See Configuration for more details.
"},{"location":"Data-Aggregation-API/installation/#dependencies","title":"Dependencies","text":"The only dependency is our Network CMDB NetBox plugin.
The installation guide of this plugin is available in the CMDB section here.
"},{"location":"Data-Aggregation-API/missing-features-in-openconfig/","title":"Missing features in OpenConfig","text":""},{"location":"Data-Aggregation-API/missing-features-in-openconfig/#context-and-approaches","title":"Context and approaches","text":"OpenConfig models do not provide all existing network features.
There are three approaches to this:
Warning
We want to stick with OpenConfig standard models.
"},{"location":"Data-Aggregation-API/missing-features-in-openconfig/#missing-features","title":"Missing features","text":""},{"location":"Data-Aggregation-API/missing-features-in-openconfig/#bgp","title":"BGP","text":"Feature Decisionenforce first-as
Migrate to a route-map like match as-path <asn>
being the AS of the neighbor. network
To be defined, maybe via route-maps
soft reconfiguration inbound
To be defined
"},{"location":"Data-Aggregation-API/missing-features-in-openconfig/#routing-policies","title":"Routing policies","text":"Feature Decision large communities
Do a PR in OpenConfig repo decision in prefix-list: permit
/ deny
All prefixes in prefix-lists are hardcoded as permit
. The deny
must be done in the route-map. prefix-list sequence Hardcoded in the Jinja loop"},{"location":"SONiC-support/Criteo-SONiC-utilities/","title":"Criteo SONiC utilities","text":""},{"location":"SONiC-support/Criteo-SONiC-utilities/#installation","title":"Installation","text":"Our SONiC modules require some custom scripts to be installed:
/opt/salt/scripts/criteo_fdbshow
/opt/salt/scripts/criteo_intf_information
These scripts are available in the Criteo SONiC utilities repository.
You can use the provided Salt state to deploy them automatically. This state assumes some grains are properly set for each SONiC device:
hwsku: some-hardware\nnos: sonic\nsonic_asic_type: some-asic\nsonic_build_date: some-date\nsonic_build_version: 202205\nsonic_built_by: someone\nsonic_commit_id: some-commit-id\n
Tip
The needed grains are automatically set by our SONiC Salt Deployer.
"},{"location":"SONiC-support/SONiC-Salt-Deployer/","title":"SONiC Salt deployer","text":"This tool deploys and configures Salt-minions on SONiC devices.
It includes:
You can run it regularly. There will be no impact on already deployed devices. Only the needed changes will be made.
"},{"location":"SONiC-support/SONiC-Salt-Deployer/#prepare-your-environment","title":"Prepare your environment","text":"python3 -m venv .venv\nsource .venv/bin/activate\npip install -r requirements/base.txt\n
"},{"location":"SONiC-support/SONiC-Salt-Deployer/#how-to-use-it","title":"How to use it","text":""},{"location":"SONiC-support/SONiC-Salt-Deployer/#settings","title":"Settings","text":"See settings.env.
There is also an example provided here.
"},{"location":"SONiC-support/SONiC-Salt-Deployer/#usage","title":"Usage","text":"pip install -r requirements/base.txt\npython ./start.py\n
Or build the PEX via tox -e bundle
and run the executable.
You can use this systemd service and its timer.
"},{"location":"SONiC-support/SONiC-modules/","title":"SONiC modules","text":""},{"location":"SONiC-support/SONiC-modules/#how-to-install","title":"How to install","text":"The installation process is similar to SaltStack modules installation.
file_roots
section in the Salt-master configuration.Example of salt-master configuration:
file_roots:\n  base:\n    # ... other paths ...\n    # SONiC codebase:\n    - /srv/salt/base/sonic/\n
Important
Make sure to synchronize the modules with your minions:
salt <device> saltutil.sync_all\n
"},{"location":"SONiC-support/overview/","title":"Overview","text":""},{"location":"SONiC-support/overview/#supported-versions","title":"Supported versions","text":"SONiC version Support 201911* 202205 Legend
salt-minion
on your SONiC devices SONiC Salt Deployer section 2 Deploy Criteo SONiC utilities
Criteo SONiC utilities section 3 Deploy our SONiC modules SONiC modules section 4 Deploy our Saltstack modules SaltStack-modules section Important
To benefit from all AFK features, you need to change FRR integration in SONiC.
By default, the files in /etc/frr
of the BGP container are generated from an embedded template combined with metadata from the config_db
.
AFK requires directly mounting /etc/sonic/frr
from the host to /etc/frr
on the BGP container.
The change has been upstreamed starting with 202205. To configure SONiC this way:
The idea is to have most of the logic in Salt modules.
We want to avoid making Jinja templates more complex because:
As another rule, we only authorize two levels of indentation in the template. If you need more, split your templates.
"},{"location":"SaltStack-modules/FAQ/#state-modules-are-supposed-to-be-agnostic-why-not-use-execution-modules-to-handle-network-os-specificities","title":"State modules are supposed to be agnostic. Why not use Execution modules to handle Network OS specificities?","text":"For now, it brings more complexity to create Execution modules to handle configuration differences.
However, we do use Execution modules to get information from the device and to apply the configuration.
"},{"location":"SaltStack-modules/FAQ/#eossonic-why-negate-all-configuration-statements-instead-of-pushing-diffs-or-override-everything","title":"EOS/SONiC: why negate all configuration statements instead of pushing diffs or override everything?","text":"Because not all Network OS provide a way to obtain a diff, e.g. SONiC/FRR.
We cannot override the entire configuration because a section of the configuration might be managed by multiple Salt States. For instance one State manages the global BGP configuration, another manages the BGP sessions and a third one manages EVPN.
Sometimes we do negate the entire section. We remove a route-map to recreate it entirely because it is safer and easier to do that: we ensure the route-map terms do not contain extra configuration we do not support.
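As a rough illustration of the "remove and recreate" approach, the sketch below builds the command list for one route-map. It is not the AFK template logic: the helper `rebuild_route_map` and the shape of the `terms` dictionaries are hypothetical, and only two term attributes are handled to keep the example short.

```python
def rebuild_route_map(name, terms):
    """Return CLI lines that delete a route-map and recreate it from scratch."""
    # Negate first, so no unsupported leftover statement survives in any term.
    commands = [f"no route-map {name}"]
    for term in sorted(terms, key=lambda t: t["sequence"]):
        commands.append(f"route-map {name} {term['decision']} {term['sequence']}")
        community_list = term.get("from_bgp_community_list")
        if community_list:
            commands.append(f" match community {community_list}")
        local_pref = term.get("set_local_pref")
        if local_pref is not None:
            commands.append(f" set local-preference {local_pref}")
    return commands


terms = [
    {"sequence": 30, "decision": "deny"},
    {"sequence": 20, "decision": "permit", "from_bgp_community_list": "CL-SERVER"},
]
print("\n".join(rebuild_route_map("RM-UPLINK-IN", terms)))
```

Starting from `no route-map` means the resulting route-map contains exactly the terms that were rendered, whatever was configured on the device before.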
"},{"location":"SaltStack-modules/FAQ/#what-is-the-behavior-of-sonic-bgp-configuration-if-something-bad-happens","title":"What is the behavior of SONiC BGP configuration if something bad happens?","text":"If a command is incorrect, it is ignored by FRR and the rest of the configuration is applied. It returns errors in the output.
Example
line 11: Failure to communicate[13] to zebra, line: ip prefix-list PF-LOOPBACK seq 10 permit 10.252.200.0/22 ge 22\n% Invalid prefix range for 10.252.200.0/22, make sure: len < ge-value <= le-value\n
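This kind of error can be caught before the configuration is pushed. The following sketch checks the rule quoted in the message above (`len < ge-value <= le-value`); the helper name `is_valid_prefix_range` is hypothetical, and the check deliberately ignores the case where only `le` is set, since the message does not describe it.

```python
import ipaddress


def is_valid_prefix_range(prefix, ge=None, le=None):
    """Check the rule quoted by FRR: len < ge-value <= le-value."""
    length = ipaddress.ip_network(prefix).prefixlen
    if ge is not None and not length < ge:
        return False
    if ge is not None and le is not None and not ge <= le:
        return False
    return True


# The prefix-list entry from the example: a /22 prefix with "ge 22" is rejected.
print(is_valid_prefix_range("10.252.200.0/22", ge=22))
```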
If we attempt to remove a statement which is not present, it returns an error. But it still applies the rest of the configuration.
Example
% Could not find route-map RM-CLOS-IN\nline 13: Failure to communicate[13] to zebra, line: no route-map RM-CLOS-IN\nThese are not caught by the dry-run feature of vtysh, which only checks if the config is semantically correct.\n
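Since FRR keeps applying the remaining lines, the output has to be inspected to detect such failures. A minimal sketch, assuming errors always look like the two examples above (lines starting with `% ` or `line <n>: Failure`); the helper `find_frr_errors` is hypothetical and the patterns are derived only from these samples, not from an exhaustive list of FRR messages.

```python
import re

# Patterns derived from the sample outputs only (assumption).
ERROR_PATTERNS = (
    re.compile(r"^% "),
    re.compile(r"^line \d+: Failure"),
)


def find_frr_errors(output):
    """Return the lines of the command output that look like FRR errors."""
    return [
        line
        for line in output.splitlines()
        if any(pattern.match(line) for pattern in ERROR_PATTERNS)
    ]


output = (
    "% Could not find route-map RM-CLOS-IN\n"
    "line 13: Failure to communicate[13] to zebra, line: no route-map RM-CLOS-IN\n"
)
for error in find_frr_errors(output):
    print(error)
```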
"},{"location":"SaltStack-modules/FAQ/#frr-is-not-offering-a-commit-based-configuration-yet-how-do-we-handle-this","title":"FRR is not offering a commit-based configuration yet, how do we handle this?","text":"This part is complex. The future of FRR is clear: the Northbound API. We plan to use the future incremental feature via CLI or gRPC.
In the meantime, AFK applies all changes in real time:
bgp route-map delay-timer 0
.Everything happens in a really short time as FRR pushes all lines in a simple for loop
in C.
Be careful
file_roots
section in the Salt-master configuration.Example of salt-master configuration:
file_roots:\n  base:\n    # your own codebase:\n    - /srv/salt/base/your-code-base/\n    # OpenConfig codebase:\n    - /srv/salt/base/openconfig/\n
Important
Make sure to synchronize the modules with your minions:
salt <device> saltutil.sync_all\n
"},{"location":"SaltStack-modules/installation/#dependencies","title":"Dependencies","text":"Depending on the Network OS you want to support, you will need:
To apply the configuration, you need the input data in OpenConfig format (JSON RFC7951).
If you are using our Data Aggregation API, you can create the following pillar in {SALT_PILLAR_PATH}/data-aggregation-api-openconfig.sls
:
#!py\nimport logging\nimport requests\n\nDATACENTER = \"paris\"\nENVIRONMENT = \"production\"\nUSER = \"salt-master\"\nPASSWORD = \"awesomepassword\"\n\nDATA_API = f\"https://data-aggregation-api.{DATACENTER}.{ENVIRONMENT}.local\"\n\n\ndef run():\n    \"\"\"Get OpenConfig data for the current minion.\"\"\"\n    device = __grains__[\"id\"]\n    openconfig_endpoint = f\"{DATA_API}/devices/{device}/openconfig\"\n\n    try:\n        # A timeout avoids blocking pillar rendering if the API is unreachable.\n        response = requests.get(openconfig_endpoint, auth=(USER, PASSWORD), timeout=10)\n        # Treat HTTP errors as failures instead of parsing their body.\n        response.raise_for_status()\n        return {\"openconfig\": response.json()}\n    except Exception as error:\n        logging.error(\"data-aggregation-api: failed to query '%s' because %s\", openconfig_endpoint, error)\n        return {}\n
"},{"location":"SaltStack-modules/usage/#apply-the-configuration-manually","title":"Apply the configuration manually","text":"salt <device> state.apply afk test=True
salt <device> state.apply afk
Info
These are examples. Make sure to adapt them to your infrastructure.
Simple schedule in {SALT_PILLAR_PATH}/schedule_simple_afk.sls
schedule:\n  afk:\n    function: state.sls\n    args:\n      - afk\n    minutes: 30\n    range:\n      start: 8am\n      end: 7pm\n
Schedule only for devices having an afk-enabled
tag in NetBox in {SALT_PILLAR_PATH}/schedule_smart_afk.sls
#!py\nimport logging\nimport requests\n\nDATACENTER = \"paris\"\nENVIRONMENT = \"production\"\nUSER = \"salt-master\"\nPASSWORD = \"awesomepassword\"\n\nDATA_API = f\"https://data-aggregation-api.{DATACENTER}.{ENVIRONMENT}.local\"\n\n\ndef run():\n    \"\"\"Return the AFK schedule only if it is enabled for the current minion.\"\"\"\n    device = __grains__[\"id\"]\n    endpoint = f\"{DATA_API}/devices/{device}/salt_enabled\"\n\n    try:\n        # A timeout avoids blocking pillar rendering if the API is unreachable.\n        result = requests.get(endpoint, auth=(USER, PASSWORD), timeout=10).json()\n        if result.get(\"salt_enabled\") is True:\n            return {\n                \"schedule\": {\n                    \"afk\": {\n                        \"function\": \"state.sls\",\n                        \"args\": [\"afk\"],\n                        \"minutes\": 30,\n                        \"range\": {\"start\": \"8am\", \"end\": \"7pm\"},\n                    }\n                }\n            }\n    except Exception as error:\n        logging.error(\"data-aggregation-api: failed to query '%s' because %s\", endpoint, error)\n\n    # No schedule when the device is not enabled or the query failed.\n    return {}\n
Attention
If you are putting secrets directly in the pillar file, make sure to apply the appropriate permissions to the file. Something like chmod 600
.