Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Policy backports 1.31 #1134

Merged
merged 34 commits into from
Jan 21, 2025
Merged

Policy backports 1.31 #1134

merged 34 commits into from
Jan 21, 2025

Conversation

jrajahalme and others added 17 commits January 21, 2025 12:15
[ upstream commit c2c560c ]

This allows debugging with local toolchain without specifying a release
build.

Signed-off-by: Jarno Rajahalme <[email protected]>
[ upstream commit f4006ab ]

Decouple also the first policy update from listener init manager. This
allows listener to start listening even if the first policy update takes
a long time, e.g., due to a missing SDS secret.

TODO: Change the SDS wait time from the initial 15 seconds to maybe
80ms. It is better to have TLS policies fail than have the policy updates
stall for the whole node.

Signed-off-by: Jarno Rajahalme <[email protected]>
[ upstream commit 7eac600 ]

Add new helper functions:

- updateInitManager(): Update the init manager before parsing policy
- removeInitManager(): Remove the update-specific init manager after parsing
- executeUpdate(): Execute successfully parsed policy in each thread

Signed-off-by: Jarno Rajahalme <[email protected]>
[ upstream commit 670d2fb ]

Avoid stalling network policy updates for extended periods due to missing
SDS secrets by reducing the initial fetch timeout to 50 milliseconds.

Signed-off-by: Jarno Rajahalme <[email protected]>
[ upstream commit 7dff3c6 ]

This commit combines the nested conditions in the method `getPolicy`
that are handling the special case for N/S L7 Loadbalancing.
(Allow all traffic if no explicit Ingress policy is set)

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit d72c6fe ]

This commit simplifies the variable naming for the N/S L7 LB
case in getMetadata.

* Rename `ip` to `ingress_ip`
* use `ingress_ip` instead of `pod_ip` to avoid confusion

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 3bb8ccd ]

This commit removes an outdated comment about Ingress IP handling.

Currently, the existence of an Ingress source IP with the right IP
family is validated in the config.

See

```
    const Network::Address::Ip* ingress_ip = selectIPVersion(sip->version(), source_addresses);

    if (!ingress_ip) {
      // IP family of the connection has no configured local ingress source address
      ENVOY_LOG(
          warn,
          "cilium.bpf_metadata (north/south L7 LB): No local Ingress IP source address configured "
          "for the family of {}",
          sip->addressAsString());
      return nullptr;
    }
```

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 974678b ]

Currently, the C++ files are either including unused files or
heavily rely on transtive includes and doesn't define the actual
required includes.

Therefore, this commit fixes all includes so that all files are
defining their includes according to their needs - following the
model "import what you use".

See https://clangd.llvm.org/guides/include-cleaner

Advantages:

* Get rid of warnings when using clangd as LSP server
* Remove unused imports
* Define all direct imports helps understanding the code and the
  dependencies

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 9a0e95d ]

To prevent upstream Envoy lint issues, this commit refactors the includes
to use `source/common/protobuf/protobuf.h` instead of the `google/protobuf/...`
includes.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 9cd0142 ]

In preparation to split up the existing socket options into multiple
socket options (including reuse of upstream Envoy socket options where
possible), this commit introduces a new intermediate struct
`SocketMetadata` that is used by the function `getMetadata`.

The socket metadata is responsible to provide builder functions that
help building the fine-grain socket options.

In the same step, the function `getMetadata` is renamed to
`extractSocketMetadata`.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit a8b1b7d ]

This commit uses the upstream socket option factory option
`buildTcpKeepaliveOption` to build the TCP keep alive socket i
options (`SO_KEEPALIVE`, `TCP_KEEPINTVL`, `TCP_KEEPIDLE`).

Note: Applying these keep-alive related options is no longer
best-effort - failing to apply results in an error.

Signed-off-by: Marco Hofstetter <[email protected]>
[ ustream commit c120a54 ]

This commit makes use of the upstream Envoy socket option factory
function `buildReusePortOptions` to register a `SocketOption` that
sets the socket option `SO_REUSEPORT`.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 8ccbdb3 ]

This commit extracts setting the socket option `SO_REUSEADDR` from
`socket_option.h` and sets the option in the `onAccept` of the bpf
metadata listener filter directly. This, by using the Envoy upstream
`SocketOptionImpl` - unfortunately there's no builder function yet.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 9dc4681 ]

Currently, setting the socket option `IP_TRANSPARENT` is part
of the `socket_option.h`. This commit extracts this logic into
its own file and class.

Note: It's not possible to reuse upstream Envoy socket options
as setting `IP_TRANSPARENT` requires the use of the Cilium
Privileged Service to be executd with the required capabilities.

Note: This commit also adds the possibility to apply privileged
socket options to the BPF metadata config. See function
`addPrivilegedSocketOptions`. The Test config can set this to `false`.
This way, the sockektoption itself no longer has to be responsible
for this.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 0711ea5 ]

This commit extracts the logic that sets the `SO_MARK` socket option
with the Cilium mark into its own file and class.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit af8ab52 ]

This commit extracts the source address handling into its
own file and socket option.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 126ab64 ]

After the extraction of all "technical" socket option aspects, the
only thing that is left in the `SocketOption` are the information that
are relevant for policy enforcement.

Therefore, this commit renames the socket option into
`CiliumPolicySocketOption` and only applies the this option
for the connection socktets - no longer the listener socket.

In a follow up PR, this socket option can be refactored to be
a Envoy filter state object.

Signed-off-by: Marco Hofstetter <[email protected]>
@jrajahalme jrajahalme requested a review from a team as a code owner January 21, 2025 13:45
@jrajahalme jrajahalme requested a review from sayboras January 21, 2025 13:45
jrajahalme and others added 10 commits January 21, 2025 14:52
[ upstream commit b6801bc ]

Currently, policy related information are passed as Envoy `SocketOption`
to the upstream filters that do Cilium Policy enforcement. But no actual
network socket option is set, the logic just uses Envoy' socket option
functionality to pass the information.

In the meantime filter state has been introduced in upstream Envoy.

Therefore, this commit converts the socket option into a filter state
object.

The BPF metadata listener filter puts the filter state object into the
streaminfo of the downstream connection when accepting a new connection.
It does this by setting the option `SharedWithUpstreamConnection` that
makes the filter state information to upstream functionality too (e.g.
Cilium TLS wrapper).

Note: Naming of the types etc will be cleaned up in following commits.

Signed-off-by: Marco Hofstetter <[email protected]>
…tSocketCallbacks

[ upstream commit 7e6ac76 ]

Currently, the Cilium TLS wrapper handles the dynamic aspects of configuring the
underlying transport socket (SSL or RAW) in the function `setTransportSocketCallbacks`.

At this point, the streamInfo of the connection is not yet available and crashes on the
attempt of accessing it.

Therefore, this commit is refactoring the function to just store the callback as
field and moves the actual logic into the function `onConnected` - at that point the
streamInfo is available.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 4c4a224 ]

The upstream Envoy `ConnectionImpl` calls the function `setSSLConnection`
on the connection right after calling `setTransportSocketCallbacks` on the
transport socket.

By moving the Cilium TLS wrapper logic from `setTransportSocketCallbacks`
to `onConnected`, the `setSSLConnection` function is called without knowing
whether SSL needs to be setup or not.

Therefore, this commit calls the function `setSSLConnection` on the callback,
AFTER it has configured the transport socket with the information from the
policy. This way, SSL is working as expected.

Signed-off-by: Marco Hofstetter <[email protected]>
…State

[ upstream commit a621f20 ]

This commit is renaming the struct `CiliumPolicySocketOption` into
`CiliumPolicyFilterState`.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit e40699d ]

This commit is renaming the file `socket_option_cilium_policy.h` to
`filter_state_cilium_policy.h`.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 95cb9a7 ]

This commit is removing the helper function `GetCiliumPolicyFilterState`
by refactoring the callers to directly retrieve the filter state
from the connection.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit e47a5be ]

This commit is introducing a separate bazel CC library for the
filter state and separates it from the one for the socket options.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit f30861a ]

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit c3ab457 ]

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 075f2ff ]

Signed-off-by: Marco Hofstetter <[email protected]>
jrajahalme and others added 7 commits January 21, 2025 14:52
[ upstream commit 4192a30 ]

IPCache needs to be reopened on the first message even if policy parsing
fails for any reason.

Signed-off-by: Jarno Rajahalme <[email protected]>
[ upstream commit fe62be6 ]

Move init manager related functions higher up in the file so that the
diffs for the next commit become more readable.

Signed-off-by: Jarno Rajahalme <[email protected]>
[ upstream commit 5aeb98c ]

Currently we must execute in all worker threads to be able to ACK the
update back to Cilium agent. This has become a problem as Envoy logs show
worker threads stalling for 10 seconds doing nothing, pausing the policy
update progress long enough to cause problems for FQDN policies.

Change to create a new policy map for each update, sharing it as a const
map for all worker threads using an atomic swap. Deletion of the old map
is coordinated with the worker threads in RCU fashion, waiting until each
worker thread has processed a new posted function, proving that they no
longer use the policy lookop results they may have obtained. This can be
done after ACK is already sent back to Cilium Agent.

There is a potentially significant functional change, we no longer wait
at all for SDS secrets to be fetched. Network policy update is a node
wide function, so it was questionable to stall it for a single secret
used by some endpoint’s policy to begin with.

Unless Envoy main thread is too busy, this change should eliminate
toFQDNs policy proxy wait timeouts sometimes seen in Cilium Agent.

Signed-off-by: Jarno Rajahalme <[email protected]>
[ upstream commit 92d6a5c ]

Discourage storing shared pointer references to policies by returning
them as references instead. Since we do not have copy constructors it is
not possible hold those references once the function doing the policy
lookup returns.

Signed-off-by: Jarno Rajahalme <[email protected]>
[ upstream commit 23103fb ]

Currently, the function `extractSocketMetadata` does not only extract relevant
information from the socket and enriches it with data retrieved from BPF maps.
It also has some side-effects: namely setting the proxylib protocol as application
protocol on the network socket.

Therefore, this commit splits this logic from the function `extractSocketMetadata`
and moves the part that is setting the proxylib l7 protocol as requested app protocol
on the network socket into a dedicated function on the SocketMetadata struct that
then is called from the BPF metadata listener filter in the onAccept function.

The part that is evaluating the ProxyLib L7 proto is still kept in the function
`extractSocketMetadata`.

This way, we can keep the function extractSocketMetadata free of side-effects and
bundle the logic that changes the socket in the BPF metadata listener filter (onAccept).

Note: socket.connectionInfoProvider().restoreLocalAddress(dst_address) - the last
modifying action in extractSocketMetadata) will be cleaned up in a follow up commit.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 5734920 ]

The refactoring of PR #1129 completely missed calling the newly
introduced function `configureProxyLibApplicationProtocol` in
the BPF metadata listener filter.

This commit fixes this.

Signed-off-by: Marco Hofstetter <[email protected]>
[ upstream commit 585ba14 ]

This commit adds an unit test that verifies that the proxylib
l7 protocol has been correctly set to the request application
protocols of the network socket.

Signed-off-by: Marco Hofstetter <[email protected]>
@jrajahalme jrajahalme force-pushed the policy-backports-1.31 branch from 690b822 to b1a8597 Compare January 21, 2025 13:52
@jrajahalme
Copy link
Member Author

fixed format

Copy link
Member

@sayboras sayboras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for a massive effort to backport all recent changes.

@jrajahalme jrajahalme merged commit 9905ed1 into v1.31 Jan 21, 2025
5 checks passed
@jrajahalme jrajahalme deleted the policy-backports-1.31 branch January 21, 2025 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants