Allow reuse of rego snippets #210
Comments
+1 to supporting better code reuse. We have the beginnings of that in the There is definitely a lot of tooling work to be done around automatically bundling common libraries into constraint templates as part of a build process. |
Only usage of
... so ideally |
Found some example setup here https://github.com/plexsystems/konstraint/tree/main/examples ... so I think the idea is that:
|
@grosser, yep, that's the idea currently. Anything slated for potential re-use would go into But as @maxsmythe points out, we're still going back and forth on the best way to more systematically import what you need, rather than import the world every time. |
FYI I ended up making small libs and then generating the policy by scanning for which imports are needed:
libs = rego.scan(/^import data\.lib\.([^.\s]+)/).flatten(1).map do |lib|
  File.read("policies/lib/#{lib}.rego")
end |
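For readers who want to try the scan-and-include approach from the comment above, here is a slightly fuller sketch. It is only an illustration under stated assumptions: the library sources live in a hypothetical in-memory `LIBS` hash rather than under `policies/lib/`, the helper names `required_libs` and `bundle` are made up, and the `"rego"`/`"libs"` keys are just labels for the output. Unlike the one-liner, it also follows imports made by the libs themselves.

```ruby
# Hypothetical stand-in for the files under policies/lib/ (name => Rego source).
LIBS = {
  "pods"    => "package lib.pods\n\nimport data.lib.strings\n\ncontainers[c] { c := input.review.object.spec.containers[_] }\n",
  "strings" => "package lib.strings\n\nhas_prefix(s, p) { startswith(s, p) }\n",
  "unused"  => "package lib.unused\n"
}.freeze

# Same pattern as in the comment: capture the lib name after `import data.lib.`.
IMPORT_RE = /^import data\.lib\.([^.\s]+)/

# Collect the names of every lib a piece of Rego needs, in discovery order,
# including libs that are only imported by other libs.
def required_libs(rego, libs = LIBS, seen = [])
  rego.scan(IMPORT_RE).flatten(1).each do |name|
    next if seen.include?(name)
    source = libs.fetch(name) { raise KeyError, "unknown lib: #{name}" }
    seen << name
    required_libs(source, libs, seen)
  end
  seen
end

# Pair the policy with only the lib sources it actually needs.
def bundle(rego, libs = LIBS)
  { "rego" => rego, "libs" => required_libs(rego, libs).map { |name| libs.fetch(name) } }
end

# Hypothetical policy that imports a single lib.
POLICY = <<~REGO
  package k8srequiredlabels

  import data.lib.pods

  violation[{"msg": msg}] {
    pods.containers[c]
    msg := "example"
  }
REGO

puts required_libs(POLICY).inspect # => ["pods", "strings"]
```

Only the libs reachable from the policy's imports end up in the bundle, so `unused` is left out; that is the "import what you need, not the world" behaviour discussed earlier in the thread.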
@grosser Konstraint does that as well. For every rego, it'll look at the import statements and add the imports to the lib section. |
I'd love to see this potentially be extended into either loading common libs from ConfigMaps as is kinda-sorta the case for vanilla OPA, or else see a CRD created for libs. If I define a lib in one ConstraintTemplate, can it be referenced by name from another ConstraintTemplate? |
Not currently. Gatekeeper currently requires each template be self-contained. The benefit of that is that you avoid dependency conflicts. Imagine one template relied on a function:
And another template relied on a newer version of that function that allowed fuzzy matching:
The example is a bit contrived, but there are plenty of examples of dependency conflicts in the wild.
There is also a potential security issue where one template would be able to maliciously extend another template's library if they had access to a shared namespace. |
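The conflict described above can be shown with a toy model. This is not Gatekeeper code and not the Rego from the original comment; the template names, the `matching` lib, and both helper functions are hypothetical. It only contrasts a shared namespace, where the last definition loaded wins, with per-template isolation.

```ruby
# Two hypothetical versions of a lib function with the same name.
EXACT = ->(a, b) { a == b }          # older behaviour: exact match
FUZZY = ->(a, b) { a.casecmp?(b) }   # newer behaviour: case-insensitive match

# Each hypothetical template ships its own copy of the "matching" lib.
TEMPLATES = {
  "template-a" => { "matching" => EXACT },
  "template-b" => { "matching" => FUZZY }
}.freeze

# Shared namespace: libs from all templates are merged, so a later
# definition silently replaces an earlier one with the same name.
def shared_namespace(templates)
  templates.values.reduce({}) { |namespace, libs| namespace.merge(libs) }
end

# Isolation: a template only ever resolves the libs it shipped itself.
def isolated_lookup(templates, template, lib)
  templates.fetch(template).fetch(lib)
end

# With a shared namespace, template-a would now get fuzzy matching.
puts shared_namespace(TEMPLATES)["matching"].call("Prod", "prod")                # => true
# With isolation, template-a keeps the exact matching it was written against.
puts isolated_lookup(TEMPLATES, "template-a", "matching").call("Prod", "prod")   # => false
```

The same overwrite is what would make the "maliciously extend" scenario possible: whoever loads last decides what every other template sees.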
This issue/PR has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions. |
this is similar to #206 |
It seems like this issue should be on the main Gatekeeper project, no?
Regardless, adding libs to existing templates using kustomize or scripting is one thing, but it doesn't seem close to an ideal solution, as it involves grossly expanding the size of the objects with lots of duplicate code. It seems like some kind of new CRD
As for conflicts, it seems like each template should have its own scope (otherwise, shouldn't those conflicts already exist? Say, two templates declaring a rule with the same name). So if a template tries to extend the library, it doesn't influence any other template.
The main issue I see (knowing nothing about the internals, really) is integration with |
I think this is about the ability to re-use Rego in the writing process, not necessarily avoiding duplicating compiled code.
There is definitely a tradeoff here: we spend space in exchange for not having to worry about dependency conflicts (plus security/speed gains, which I'll get to in a second). So far I haven't heard any complaints about memory footprint due to copied Rego, so I'm definitely curious whether that's a problem. By far the largest consumer of RAM is usually the informer cache for synced resources (at least on large clusters).
WRT the speed gain: each constraint template is compiled into a separate Rego environment, so templates are unable to share code. The security/speed benefits of this relative to compiling everything into the same Rego environment (which we used to do) are:
See this doc for more info about the performance improvements.
Aside from raising the need to resolve dependency conflicts (maybe solvable if you, say, require the users to reference their dependency via semver), there would also be a usability concern here. How would these common libraries be packaged and redistributed?
Currently each template has its own scope for its local code, but that doesn't change the problem of dependency conflicts, since we are talking about shared libraries, which implies each template would be referencing the same rego code. This is the same sort of problem as static vs. dynamic linking of libraries. The "maliciously extend" mentioned in an above comment was less about whether a template has its own scope and more about if they are also importing a common scope (e.g.
I think, at that point, the libraries could live next to, and be loaded with, the constraint templates (using the same methods a user is leveraging to load the templates into gator). Finally, there is the elephant-in-the-room that is the CEL KEP. When dealing with language-specific features, we should be careful to remain generic so that we don't accidentally bake something into the system that makes it hard to adapt to the changing landscape. The use of common libraries, if adopted, would be one of those things that should be generic. There are more pedestrian concerns such as code complexity (e.g. dealing with the edge cases of a multi-resource controller), that are surmountable, but definitely not trivial. In summary, I don't think it's impossible to have shared libraries, but I'd definitely want to see that there is a real need, that the benefits outweigh the costs and get a sense of how the CEL KEP reshapes the environment before treading too deep into a design. |
Just for more background, we use a COTS product called Styra as a replacement for Gatekeeper. It's the main feature missing in Gatekeeper that we use heavily in Styra (in our case, we have ~1000 lines of Rego code which includes all the library code plus the business logic, so you can see the benefit of abstracting rules into a library). Now granted, Styra doesn't store policy as CRDs in the cluster. So there's a slightly different architecture and different concerns. But that's where I'm coming from |
if duplicating them in CRDs is not getting too large you should be able to do that |
I'm curious if all policies use all library code, or if selective importing would limit the duplication? How many of those 1000 lines are library code?
Also, in terms of RAM usage, I'm curious how many constraint templates you'd be looking to use? Assuming an average of 100 characters/line, 1000 lines is around 98 kB (though I'm unsure what the compiled size would be, worth looking into), which would require ~100 templates for 10 MB of RAM usage for the string representation. Let's assume 50 MB of RAM to account for duplicate caching and the size of compiled code (it could be much better or worse than that; I haven't benchmarked shared Rego, as TBH the template library has only a few examples, and they are much smaller than this).
These are all obviously very rough rules of thumb, but I'm interested in figuring out when the management complexity vs. infrastructure spend cost/benefit analysis starts to tip for most people.
At a macro level, some amount of code duplication is implicit in most containerized/sandboxed applications. Docker containers share a kernel, but have differing binaries/libraries/runtimes, for example. IMO it's the same issue here: what are the tradeoffs between operational complexity, operational fragility, and infrastructure cost?
I'm also curious... it seems like you're looking to use multiple templates, which implies you see value in sharding your existing code across templates vs. creating one mega-template. I can make guesses, but don't want to assume: what benefits are you looking for in switching to constraints/templates? |
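The back-of-the-envelope numbers in the comment above can be reproduced with a few lines of arithmetic. This is a rough sketch only: it assumes one byte per character, the function name is made up, and the 5x overhead factor is simply the ratio between the comment's 10 MB string estimate and its 50 MB guess, not a measured value.

```ruby
BYTES_PER_KIB = 1024.0
BYTES_PER_MIB = 1024.0 * 1024

# Size of the Rego source text if every template carries its own full copy.
# Assumes one byte per character (plain ASCII source).
def duplicated_source_bytes(lines:, chars_per_line:, templates:)
  lines * chars_per_line * templates
end

ONE_TEMPLATE      = duplicated_source_bytes(lines: 1000, chars_per_line: 100, templates: 1)
HUNDRED_TEMPLATES = duplicated_source_bytes(lines: 1000, chars_per_line: 100, templates: 100)

# Assumed multiplier for caching and compiled code (50 MB guess / 10 MB of strings).
OVERHEAD_FACTOR = 5

puts format("one template:   %.1f KiB", ONE_TEMPLATE / BYTES_PER_KIB)                        # 97.7 KiB
puts format("100 templates:  %.1f MiB", HUNDRED_TEMPLATES / BYTES_PER_MIB)                   # 9.5 MiB
puts format("with overhead:  %.1f MiB", HUNDRED_TEMPLATES * OVERHEAD_FACTOR / BYTES_PER_MIB) # 47.7 MiB
```

Changing `templates:` or the line count shows how quickly (or slowly) duplication becomes noticeable next to something like the informer cache.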
That's what we'll do for now, perhaps using something like Helm templates to generate everything. We have a little bit of work to do to migrate everything though so I haven't been able to test it yet to see if we hit any limitations
That's a question I had as well. We have lots of use cases, not only for native Kubernetes objects, but lots of CRDs for different commercial and other products we use. Essentially, we use OPA to implement more secure multitenancy in our clusters where the controllers are lacking (typical examples are preventing cross-namespace references, validating schema where the CRDs are lacking, enforcing certain settings within object specs, deconflicting global settings like ingress hostnames, etc.)
They almost all use some kind of library code. But I don't know enough yet to say how much duplication we'll need to introduce. For example, a common complexity is objects that have similar shapes. Think all objects that include Pod somewhere in their data -- Job, Deployment, StatefulSet, CronJob. We use some products that have similar characteristics where we need multiple rules to select the different varieties -- object A has the data at spec.x.y.z, object B has it at spec.a[].b[].c, etc. -- in some cases the nesting goes very deep and there are many different varieties. Some of these products also nest things like Pod or other native objects we want to govern.
To be honest, mainly we would like to stop paying for Styra. We really only use it for Kubernetes policy and not any of its other features |
For the general model of "resources that wrap other resources", especially if those resources are created in the cluster, expansion templates may be helpful, letting you only worry about writing policies against pods. They also play a bit nicer with match criteria, which may improve the enforcement accuracy. Definitely curious how things go. Keep us updated! |
@maxsmythe Thanks, I'll look into that. Not sure we're at a Gatekeeper version that has that alpha feature, but I suppose we can upgrade to try it out |
We have this and some other chunks copied in 5+ policies. Any idea how to clean this up and make reuse work (other than by adding a new layer of templating)?
Some shared libraries, or defining and then calling out to Go libraries, would be great.