Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revamp schemagen logic #143

Merged
merged 11 commits into from
Sep 13, 2024
Merged

Revamp schemagen logic #143

merged 11 commits into from
Sep 13, 2024

Conversation

rquitales
Copy link
Member

@rquitales rquitales commented Sep 12, 2024

Proposed Changes:

This PR significantly enhances the underlying code generation logic by leveraging the Pulumi Kubernetes provider’s advanced schema generation capabilities. The Kubernetes provider’s schema generation has been thoroughly tested over multiple Kubernetes version releases, and used in production environments, making it a robust solution. By adopting this approach, we can efficiently resolve several codegen related bugs in crd2pulumi without needing numerous fragile targeted fixes within the current codebase.

Technical Changes

  • Removed manual parsing of CRD manifests.
  • Integrated Pulumi Kubernetes provider's schema generation logic:
    • CRD manifests are now converted to OpenAPI specs before being schematized.
    • Inline nested objects within CRDs are now treated as global objects in the OpenAPI layer, simplifying handling and reducing code complexity.
  • Updated mocked CRDs to be valid for testing, as valid CRDs are required for OpenAPI conversion.
  • Expanded e2e tests by adding additional CRDs to be code generated, and adding compilation steps for Go, Node, and Dotnet languages to ensure validity.

User Facing Changes

This PR maintains backward compatibility as much as possible, though some behavior changes are expected:

  1. CRDs provided to crd2pulumi must be valid and acceptable by a Kubernetes API server. Previous versions made best-effort conversions regardless of validity.
  2. Generated Go code now adheres to the --goPath flag if provided, ensuring code is generated at the specified path, aligning Go with other languages in terms of folder structure.
  3. While we’ve made efforts to maintain consistency, field names and resource names in generated Pulumi code may change slightly due to cross-language sanitization of field and resource names.

Related Issues

Resolves #142, resolves #141, resolves #115, resolves #113, resolves #110, resolves #104, resolves #103, resolves #100, resolves #97, resolves #89, resolves #70, resolves #49, resolves #34, resolves #30, resolves #26, resolves #21, resolves #111, resolves #50

@rquitales rquitales force-pushed the rquitales/switch-codegen branch 7 times, most recently from bad0c99 to 06dfe8c Compare September 12, 2024 19:59
Copy link

codecov bot commented Sep 12, 2024

Codecov Report

Attention: Patch coverage is 75.35211% with 35 lines in your changes missing coverage. Please review.

Project coverage is 78.10%. Comparing base (151a8b6) to head (ff0ef1b).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
pkg/codegen/customresourcegenerator.go 74.73% 12 Missing and 12 partials ⚠️
pkg/codegen/schema.go 70.58% 5 Missing and 5 partials ⚠️
internal/unstruct/unstruct.go 80.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #143      +/-   ##
==========================================
- Coverage   79.45%   78.10%   -1.36%     
==========================================
  Files          17       17              
  Lines         740      708      -32     
==========================================
- Hits          588      553      -35     
+ Misses         88       87       -1     
- Partials       64       68       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rquitales rquitales force-pushed the rquitales/switch-codegen branch 3 times, most recently from 11fff64 to 614c8a6 Compare September 12, 2024 21:47
@rquitales rquitales changed the title [WIP]: Revamp schemagen logic Revamp schemagen logic Sep 12, 2024
@rquitales rquitales requested review from blampe, EronWright and a team September 12, 2024 22:11
@rquitales rquitales marked this pull request as ready for review September 12, 2024 22:14
@@ -7,6 +7,7 @@ import (
"io"
"strings"

extensionv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in this file is related to switch from storing an unstructured object to a typed CRD object.

@@ -27,27 +26,15 @@ const nodejsMetaPath = "meta/v1.ts"
const nodejsMetaFile = `import * as k8s from "@pulumi/kubernetes";

export type ObjectMeta = k8s.types.input.meta.v1.ObjectMeta;
export type ObjectMetaPatch = k8s.types.input.meta.v1.ObjectMetaPatch;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now generate Patch variants so need to update our imports to address this.

@@ -16,18 +16,26 @@ package codegen

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the schemagen changes occur here. This is where we flatten OpenAPI specs.

pkg.Name = oldName
delete(pkg.Language, langName)

buffers = map[string]*bytes.Buffer{}
for path, code := range files {
newPath, _ := filepath.Rel(name, path)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing Go code generated in the wrong folder. See #89
Also needed to ensure tests can find the right folder to compile generated Go code.

@@ -15,21 +15,30 @@
package codegen
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integrating with p-k schemagen to generate the package spec.

@rquitales rquitales force-pushed the rquitales/switch-codegen branch from 614c8a6 to 2c843f7 Compare September 12, 2024 22:22
t.Run(lang, func(t *testing.T) {
t.Parallel()
execCrd2Pulumi(t, lang, tt.url, nil)
if lang == "dotnet" && (tt.name == "CertManager" || tt.name == "GKEManagedCerts") {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GKEManagedCerts is working for dotnet locally and in a separate docker container I have. I have also manually validated it, with a Pulumi program that can create these CRs. I am skipping for now to unblock this PR while I hunt down what's happening.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting - how many files does it generate? I wonder if it's got some nested types that are functionally identical but are now getting treated as unique top-level things.

Copy link
Contributor

@blampe blampe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is beautiful!

CHANGELOG.md Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
// Schemas represents a mapping from each version in the `spec.versions`
// list to its corresponding `openAPIV3Schema` field in the CRD YAML
Schemas map[string]map[string]any
Schemas map[string]spec.Swagger
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does k8s just happen to use older type names for back-compact? I guess it probably pre-dates the Swagger→OpenAPI change so it would make sense. I just want to double check we're using whatever the most recent/relevant logic is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Newer k8 versions provide both v2 and v3 specs, so it might be nice to upgrade to v3 specs in the future for p-k schemagen. Also should note that p-k merges schemas from a range of Kubernetes versions, so we may not want to do this in the short term unless we drop support for these older versions that do not serve a v3 spec.

Comment on lines +57 to +58
// flattenOpenAPI recursively finds all nested objects in the OpenAPI spec and flattens them into a single object as definitions.
func flattenOpenAPI(sw *spec.Swagger) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great -- the surface area for potential issues is signifcantly smaller. If we have bugs they'll almost certainly either live in this flattening logic or the upstream p-p or p-k codegen.

return currSpec, nil
}

func sanitizeReferenceName(fieldName string) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this basically responsible for ensuring the ref name is a valid token in all of our target languages?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Previously, in crd2pulumi, we need to do this in a number of locations which resulted in harder to maintain code, and the bugs we saw. For package name sanitization, that resides in the p-k schemagen logic.

pkg/codegen/customresourcegenerator.go Outdated Show resolved Hide resolved
pkg/codegen/customresourcegenerator.go Outdated Show resolved Hide resolved
tests/crds_test.go Outdated Show resolved Hide resolved
t.Run(lang, func(t *testing.T) {
t.Parallel()
execCrd2Pulumi(t, lang, tt.url, nil)
if lang == "dotnet" && (tt.name == "CertManager" || tt.name == "GKEManagedCerts") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting - how many files does it generate? I wonder if it's got some nested types that are functionally identical but are now getting treated as unique top-level things.

Pulumi Kubernetes provider has robust schemagen logic that is immune to the bugs
identified in crd2pulumi. To enable interoperability, we need to handle the conversion
of CRD manifests to their OpenAPI v2 equivalents.

remove extra generated kubernetes types

An OpenAPI spec generated from a CRD manifest will contain additional Kubernetes meta types (statuses etc). This commit removes those extra generated files/resources to debloat the final output.
Our mocked CRDs for testing purposes were very minimal and not completely valid since previous codegen logic only needed a subset of fields. As we have updated our implementation to do codegen on an OpenAPI spec, we need fully formed CRD manifests for this conversion to occur.
@rquitales rquitales force-pushed the rquitales/switch-codegen branch from e13c547 to c80d232 Compare September 13, 2024 22:03
@rquitales rquitales merged commit 74efa11 into master Sep 13, 2024
5 checks passed
@rquitales rquitales deleted the rquitales/switch-codegen branch September 13, 2024 23:08
@rquitales rquitales linked an issue Sep 13, 2024 that may be closed by this pull request
@pulumi-bot
Copy link

This PR has been shipped in release v1.5.0.

@blampe blampe mentioned this pull request Nov 12, 2024
blampe added a commit that referenced this pull request Nov 13, 2024
All of our languages except NodeJS use
`SchemaPackageWithObjectMetaType`. As a result, Node exposes only input
types for object metadata which makes it awkward to consume downstream.

I'm assuming #143 didn't change
this for backwards-compatibility reasons, or maybe it was overlooked. I
_think_ it is safe to continue generating the `ObjectMeta` type alias
(in case users are importing it) but use the input/output types
(matching the upstream k8s types) on resources.

Edit: 

> // schemaPackageWithObjectMetaType is the Pulumi schema package used to
> // generate code for languages that need an ObjectMeta type (Python, Go, and .NET)

So this seems intentional. For whatever reason, using this with Node has
the effect of generating input/output types matching upstream's types,
which seems desirable.

Fixes #158
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment