diff --git a/.vscode/settings.json b/.vscode/settings.json index 280400c..1873250 100644 --- a/.vscode/settings.json +++ b/.vscode/settings.json @@ -126,6 +126,7 @@ "/workspaces/devcontainer", "/workspaces/kargo" ], + "python.analysis.typeCheckingMode": "strict", "go.testTags": "all", "go.vetOnSave": "off", "go.buildTags": "all", diff --git a/README.md b/README.md index 0d4d0e6..94da109 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ ## Overview -This repository serves as a comprehensive template for starting new DevOps projects from scratch. It is designed to be cloned as a GitHub template repository, providing a fully-configured environment for deploying and managing cloud-native infrastructure using VSCode with Kubernetes and Pulumi boilerplate as a starting line. +This repository serves as a comprehensive template for starting new DevOps projects from scratch. It is designed to be cloned as a GitHub template repository, providing a fully-configured environment for deploying and managing cloud-native infrastructure using VSCode with Kubernetes and Pulumi boilerplate as a starting point. Whether you're building for on-premises, cloud, or local environments, this template streamlines the setup and deployment processes, enabling you to focus on building and innovating. @@ -12,9 +12,11 @@ Join the community in the [ContainerCraft Community Discord](https://discord.gg/ - **AWS LandingZone**: Automated Kubernetes cluster setup using Talos. - **Kubernetes Deployment**: Automated Kubernetes cluster setup using Talos. -- **Pulumi IaC Integration**: Infrastructure as Code management with Pulumi. +- **Pulumi IaC Integration**: Infrastructure as Code management with Pulumi, using Python and best practices. - **Runme Integration**: Execute documented tasks directly from the README.md. - **GitHub Actions Support**: CI/CD pipelines configured for automated testing and deployment. 
+- **Dependency Management with Poetry**: Manage Python dependencies and environments using Poetry. +- **Strict Type Checking with Pyright**: Enforce code quality through strict type checking. ## Using This Template to Start a New Project @@ -36,58 +38,143 @@ This repository is designed as a template, allowing you to quickly bootstrap new 1. **Clone the Repository to Your Local Machine:** -Once your new repository is created, clone it to your local machine using Git. + Once your new repository is created, clone it to your local machine using Git. -```bash -git clone https://github.com/YourUsername/YourNewRepoName.git -cd YourNewRepoName -``` + ```bash + git clone https://github.com/YourUsername/YourNewRepoName.git + cd YourNewRepoName + ``` 2. **Initialize the Development Environment:** - If you're using [GitHub Codespaces](https://github.com/features/codespaces) or a local development environment with Docker, you can launch directly into the pre-configured environment. + We use [Poetry](https://python-poetry.org/) for dependency management and packaging. Poetry ensures that our development environment is consistent, dependencies are properly managed, and collaboration is streamlined. + + **Install Poetry:** + + Ensure that `poetry` is added to your system's PATH. Refer to the [official installation guide](https://python-poetry.org/docs/#installation) for detailed instructions. + + **Install Dependencies and Create Virtual Environment:** + + ```bash + poetry install + ``` + + **Activate the Virtual Environment:** + + ```bash + poetry shell + ``` + + Alternatively, you can prefix commands with `poetry run`. + +3. **Configure Pulumi to Use Poetry:** + + Ensure that `Pulumi.yaml` specifies Poetry as the toolchain: + + ```yaml + name: your-pulumi-project + runtime: + name: python + options: + toolchain: poetry + ``` - - **GitHub Codespaces:** Click the `Code` button and select `Open with Codespaces`, or follow the instructions in the Quickstart section of this README. 
- - **Local Development:** Follow the instructions in the `Getting Started` section to set up your local environment. +4. **Install Pulumi Dependencies:** + + ```bash + pulumi install + ``` + + This command ensures that Pulumi recognizes and utilizes the Poetry-managed environment. ### Step 3: Customize the Configuration 1. **Update Configuration Files:** - Customize the `.env` file with your project-specific environment variables. - - Modify the `Taskfile.yaml` to include tasks specific to your project. - Adjust the Pulumi configuration files under `.pulumi` to match your cloud and infrastructure setup. 2. **Set Up Your Pulumi Stack:** - Configure your Pulumi stack settings to match your project environment by following the steps in the `Getting Started` section. + Configure your Pulumi stack settings to match your project environment by running: + + ```bash + pulumi stack init dev + ``` + + Replace `dev` with your desired stack name. + +3. **Install Pyright for Type Checking:** + + We enforce strict type checking using [Pyright](https://github.com/microsoft/pyright). + + **Add Pyright to the Development Dependencies:** + + ```bash + poetry add --group dev pyright + ``` + + **Configure Pyright:** + + Create a `pyrightconfig.json` in the project root to define Pyright settings: + + ```json + { + "include": ["**/*.py"], + "exclude": ["**/__pycache__/**"], + "reportMissingImports": true, + "pythonVersion": "3.8", + "typeCheckingMode": "strict" + } + ``` + + **Verify Type Checking:** + + ```bash + poetry run pyright + ``` + +4. **Editor Integration (Optional):** + + For real-time type checking and an enhanced development experience, integrate Pyright with your editor. If you use Visual Studio Code, install the [Pylance extension](https://marketplace.visualstudio.com/items?itemName=ms-python.vscode-pylance) and set `"python.analysis.typeCheckingMode": "strict"` in your settings. ### Step 4: Start Developing 1. 
**Deploy the Infrastructure:** -Use the pre-configured tasks to deploy your infrastructure, as detailed in the Quickstart section. + Before deploying, ensure that your code passes type checking to maintain code quality. -```bash -task kubernetes -task deploy -``` + **Run Type Checking:** + + ```bash + poetry run pyright + ``` -2. **Build and Iterate:** + **Deploy with Pulumi:** - With your infrastructure deployed, you can now focus on developing your application, iterating on your DevOps processes, and refining your setup. + ```bash + pulumi up + ``` + + If the type check above reports errors, resolve them before deploying. + +2. **Implement Best Practices:** + + - Use `TypedDict` to define resource inputs with type hints, enhancing code readability and type safety. + - Follow the code standards and guidelines outlined in [PULUMI_PYTHON.md](docs/PULUMI_PYTHON.md), including naming conventions, type annotations, and error handling. + - Organize your code into logical modules and packages for better maintainability. ### Step 5: Push Your Changes 1. **Commit and Push:** -After making changes, commit them to your repository. + After making changes, commit them to your repository. -```bash -git add . -git commit -m "Initial setup and configuration" -git push origin main -``` + ```bash + git add . + git commit -m "Initial setup and configuration" + git push origin main + ``` 2. **Collaborate and Contribute:** @@ -95,9 +182,10 @@ git push origin main ### Tips for Success -- **Keep your dependencies up to date:** Regularly update the tools and libraries used in your project. -- **Document your changes:** Update the README and other documentation as your project evolves. -- **Engage with the community:** Join the [ContainerCraft Community Discord](https://discord.gg/Jb5jgDCksX) to get support and share your experiences. +- **Keep Your Dependencies Up to Date:** Regularly update the tools and libraries used in your project. 
+- **Enforce Type Checking:** Use Pyright to catch type errors early in the development process. +- **Document Your Changes:** Update the README and other documentation as your project evolves. +- **Engage with the Community:** Join the [ContainerCraft Community Discord](https://discord.gg/Jb5jgDCksX) to get support and share your experiences. ## How-To (Boilerplate Instructions) @@ -110,6 +198,7 @@ Ensure you have the following tools and accounts: 1. [GitHub](https://github.com) 2. [Pulumi Cloud](https://app.pulumi.com/signup) 3. [Microsoft Edge](https://www.microsoft.com/en-us/edge) or [Google Chrome](https://www.google.com/chrome) +4. [Poetry](https://python-poetry.org/docs/#installation) ### Quickstart @@ -119,60 +208,76 @@ Follow these steps to get your environment up and running: Clone this repository to your GitHub account using the "Use this template" button. -2. **Launch in GitHub Codespaces:** +2. **Launch in GitHub Codespaces or Local Development Environment:** + + - **GitHub Codespaces:** Start a new Codespace with the following options: - Start a new GitHub Codespace with the following options: + - **Branch:** `main` + - **Dev Container Configuration:** `konductor` + - **Region:** Your choice + - **Machine Type:** 4 cores, 16 GB RAM, or better - - **Branch:** `main` - - **Dev Container Configuration:** `konductor` - - **Region:** Your choice - - **Machine Type:** 4 cores, 16 GB RAM, or better + - **Local Development:** Set up your local environment using Docker or your preferred method. 3. **Open the Integrated Terminal:** Use `Ctrl + `` to open the VSCode integrated terminal. -4. **Authenticate Credentials:** +4. **Initialize the Development Environment:** -Login to Pulumi Cloud and other required services. + **Install Dependencies:** -```bash {"name":"login"} -task login -``` + ```bash + poetry install + ``` -5. **Configure the Pulumi Stack:** + **Activate the Virtual Environment:** -Set up Pulumi stack parameters. 
+ ```bash + poetry shell + ``` -```bash {"name":"configure"} -export ORGANIZATION="${GITHUB_USER:-${GITHUB_REPOSITORY_OWNER:-}}" -export DEPLOYMENT="${RepositoryName:-}" -task configure -``` +5. **Authenticate Credentials:** -6. **Deploy Kubernetes:** + Log in to Pulumi Cloud and other required services. -Deploy Kubernetes using Talos. + ```bash + pulumi login + ``` -```bash {"excludeFromRunAll":"true","name":"kubernetes"} -task kubernetes -``` +6. **Configure the Pulumi Stack:** -7. **Deploy the Platform:** + Set up Pulumi stack parameters. -Deploy the KubeVirt PaaS infrastructure. + ```bash + pulumi stack init dev + ``` -```bash {"excludeFromRunAll":"true","name":"deploy"} -task deploy -``` + Replace `dev` with your desired stack name. -8. **Cleanup:** +7. **Run Type Checking:** -Clean up all Kubernetes and Pulumi resources when you're done. + Ensure your code passes type checking before deployment. -```bash {"excludeFromRunAll":"true","name":"clean"} -task clean-all -``` + ```bash + poetry run pyright + ``` + +8. **Deploy the Infrastructure:** + + **Deploy with Pulumi:** + + ```bash + pulumi up + ``` + +9. **Cleanup:** + + Clean up all Kubernetes and Pulumi resources when you're done. + + ```bash + pulumi destroy + ``` ## Contributing @@ -182,8 +287,8 @@ Contributions are welcome! This template is intended to evolve with the needs of Use the `act` tool to test GitHub Actions locally before pushing your changes. -```bash {"excludeFromRunAll":"true"} -task act +```bash +act ``` ## Community and Support diff --git a/ROADMAP.md b/ROADMAP.md deleted file mode 100644 index 89cc8bf..0000000 --- a/ROADMAP.md +++ /dev/null @@ -1,269 +0,0 @@ -# Next-Generation Platform Engineering Roadmap - -## Table of Contents - -1. [Introduction](#introduction) -2. [Objectives](#objectives) -3. [Architecture Overview](#architecture-overview) -4. 
[Key Components](#key-components) - - [Account Structure](#account-structure) - - [Identity and Access Management (IAM)](#identity-and-access-management-iam) - - [Infrastructure as Code (IaC)](#infrastructure-as-code-iac) - - [Compliance and Governance](#compliance-and-governance) - - [Logging, Monitoring, and Alerting](#logging-monitoring-and-alerting) - - [Networking](#networking) - - [Cost Management](#cost-management) - -5. [Design Principles](#design-principles) -6. [Implementation Roadmap](#implementation-roadmap) - - [Phase 1: Foundations](#phase-1-foundations) - - [Phase 2: Core Infrastructure](#phase-2-core-infrastructure) - - [Phase 3: Compliance and Governance Integration](#phase-3-compliance-and-governance-integration) - - [Phase 4: Application Onboarding](#phase-4-application-onboarding) - - [Phase 5: Multi-Cloud Expansion](#phase-5-multi-cloud-expansion) - - [Phase 6: Optimization and Scaling](#phase-6-optimization-and-scaling) - -7. [Roles and Responsibilities](#roles-and-responsibilities) -8. [Risks and Mitigation Strategies](#risks-and-mitigation-strategies) -9. [Conclusion](#conclusion) -10. [Appendices](#appendices) - - [A. Glossary](#a-glossary) - - [B. References](#b-references) - ---- - -## Introduction - -This roadmap outlines the development of a next-generation, cloud-agnostic platform engineering environment. -The goal is to establish a robust, scalable, and secure multi-cloud infrastructure that automates provisioning, enforces compliance, and centralizes operational data. -This environment will empower application teams to operate safely within their own isolated accounts, supported by a streamlined, code-driven landing zone setup. - -The architecture consists of a hierarchical organization with multiple organizational units (OUs) and accounts across AWS, Azure, and GCP. -Infrastructure provisioning and configuration are fully automated using Infrastructure as Code (IaC) practices. 
-Compliance controls are embedded within the configuration code, ensuring consistent policy enforcement. -Centralized governance is achieved through policy propagation and centralized services for logging, monitoring, and cost management. - ---- - -## Objectives - -- **Cloud-Agnostic Design**: Support AWS, Azure, and Google Cloud Platform (GCP) to prevent vendor lock-in. -- **Infrastructure as Code (IaC)**: Utilize code to automate infrastructure provisioning, configuration, and management. -- **Compliance Integration**: Embed compliance controls (e.g., FISMA, NIST) within configurations to ensure consistent enforcement. -- **Automation and GitOps**: Implement full automation with continuous integration and continuous deployment (CI/CD) pipelines, adopting GitOps practices. -- **Centralized Governance**: Maintain centralized policies, secrets, and configurations for consistent management across all environments. -- **Scalability and Modularity**: Design for horizontal scalability and modularity to accommodate growth and technological changes. - -## Key Components - -### Account Structure - -#### Organizational Hierarchy - -- **Root Organization**: The top-level entity for each cloud provider. - - **Security OU**: - - **Log Archive Account**: Central repository for logs. - - **Security Tools Account**: Hosts security tools and services. - - - **Infrastructure OU**: - - **Networking Account**: Manages shared networking resources. - - **Shared Services Account**: Houses services shared across the organization. - - - **Applications OU**: - - **Development Accounts**: Environments for development teams. - - **Testing Accounts**: Isolated testing environments. - - **Production Accounts**: Live environments for production workloads. - -#### Account Provisioning - -- **Automated Provisioning**: Use IaC to programmatically create and manage organizations, OUs, and accounts across multiple cloud providers. 
-- **Standardized Configuration**: Ensure all accounts adhere to baseline security and configuration standards. - -### Identity and Access Management (IAM) - -- **Centralized IAM**: Implement a unified IAM strategy across all cloud providers. -- **Roles and Policies**: - - Define IAM roles with the principle of least privilege. - - Manage IAM policies and role assignments programmatically. - -- **User and Group Management**: - - Integrate with centralized identity providers (e.g., Azure AD, Okta). - - Group users by function and assign appropriate permissions. - -### Infrastructure as Code (IaC) - -- **Tooling**: Utilize a programming language (e.g., Python) with an IaC framework that supports multi-cloud provisioning. -- **Repository Structure**: - - **Modular Design**: Create reusable modules for common infrastructure components. - - **Environment Separation**: Maintain separate configurations for development, testing, and production environments. - -- **CI/CD Integration**: - - Automate deployment pipelines with tools like Jenkins, GitHub Actions, or GitLab CI. - - Implement GitOps practices to ensure that Git is the single source of truth. - -### Compliance and Governance - -- **Policy as Code**: - - Define compliance controls within the IaC configurations. - - Embed policies for standards like FISMA and NIST directly into code. - -- **Automated Enforcement**: - - Use tagging and labeling to propagate compliance metadata to all resources. - - Implement automated checks during deployment to enforce compliance. - -- **Auditability**: - - Maintain detailed logs of infrastructure changes. - - Utilize version control history for audit trails. - -### Logging, Monitoring, and Alerting - -- **Centralized Logging**: - - Aggregate logs from all resources into centralized logging services. - - Ensure logs are stored securely and comply with data retention policies. - -- **Monitoring Tools**: - - Deploy monitoring solutions (e.g., Prometheus, Grafana) to collect metrics. 
- -- **Alerting Mechanisms**: - - Configure alerts for performance issues, security incidents, and compliance violations. - - Integrate with incident management systems for timely response. - -### Networking - -- **Standardized Network Topology**: - - Define network architectures using IaC for consistency. - - Include components like virtual networks, subnets, and routing configurations. - -- **Security Controls**: - - Manage security groups, network access control lists (ACLs), and firewall rules programmatically. - -- **Cross-Cloud Connectivity**: - - Implement VPNs or cloud interconnects for secure communication between different cloud environments. - -### Cost Management - -- **Cost Monitoring**: - - Implement tools to aggregate and analyze cost data across all cloud providers. - -- **Tagging for Cost Allocation**: - - Enforce tagging standards to facilitate cost tracking by project, environment, and department. - -- **Budgeting and Alerts**: - - Set up cost thresholds and receive alerts to prevent budget overruns. - ---- - -## Design Principles - -- **Modularity**: Design reusable and interchangeable components to simplify maintenance and scaling. -- **Scalability**: Ensure the architecture can accommodate growth in users, data, and services without compromising performance. -- **Security by Design**: Incorporate security measures at every layer, following best practices and compliance requirements. -- **Automation**: Automate repetitive tasks to reduce errors and increase efficiency. -- **Observability**: Build systems that are easy to monitor and debug, with comprehensive logging and metrics. -- **Immutability**: Treat infrastructure components as immutable; changes result in new deployments rather than modifications. - ---- - -## Implementation Roadmap - -### Phase 1: Foundations - -- **Milestone 1**: Set up the IaC framework and configure state and secrets management. 
-- **Milestone 2**: Establish the Git repository with the initial directory structure and enforce code quality standards. -- **Milestone 3**: Define and deploy the root organizations and OUs for each cloud provider. -- **Milestone 4**: Configure CI/CD pipelines integrating the IaC tool for automated deployments and validations. - -### Phase 2: Core Infrastructure - -- **Milestone 5**: Develop reusable modules for networking, compute, and storage resources. -- **Milestone 6**: Implement centralized IAM roles, policies, and integrate with identity providers. -- **Milestone 7**: Set up centralized logging and monitoring solutions, ensuring they comply with security standards. - -### Phase 3: Compliance and Governance Integration - -- **Milestone 8**: Define compliance controls within IaC configurations for standards like FISMA and NIST. -- **Milestone 9**: Automate tagging and labeling of resources with compliance and cost metadata. -- **Milestone 10**: Establish audit logging, reporting mechanisms, and integrate with governance tools. - -### Phase 4: Application Onboarding - -- **Milestone 11**: Provision isolated tenant accounts for application teams using automated processes. -- **Milestone 12**: Deploy sample applications to validate the infrastructure, security, and compliance setups. -- **Milestone 13**: Develop documentation and provide training sessions for application teams. - -### Phase 5: Multi-Cloud Expansion - -- **Milestone 14**: Extend infrastructure provisioning and compliance enforcement to Azure and GCP. -- **Milestone 15**: Implement cross-cloud networking solutions and identity federation mechanisms. -- **Milestone 16**: Consolidate cost management and reporting across all cloud providers. - -### Phase 6: Optimization and Scaling - -- **Milestone 17**: Review and optimize infrastructure for performance improvements and cost efficiency. -- **Milestone 18**: Scale the infrastructure to support additional workloads, users, and application teams. 
-- **Milestone 19**: Enhance security with advanced threat detection, response capabilities, and regular compliance audits. - ---- - -## Appendices - -### A. Glossary - -- **IaC (Infrastructure as Code)**: The process of managing and provisioning computing infrastructure through machine-readable definition files. -- **CI/CD (Continuous Integration/Continuous Deployment)**: A method to frequently deliver apps to customers by introducing automation into the stages of app development. -- **GitOps**: A practice that uses Git pull requests to manage infrastructure provisioning and deployment. -- **FISMA (Federal Information Security Management Act)**: A United States federal law that defines a comprehensive framework to protect government information, operations, and assets. -- **NIST (National Institute of Standards and Technology)**: A physical sciences laboratory and non-regulatory agency of the United States Department of Commerce. - -### B. References - -- **FISMA Compliance**: [https://www.cisa.gov/fisma](https://www.cisa.gov/fisma) -- **NIST Compliance**: [https://www.nist.gov/cyberframework](https://www.nist.gov/cyberframework) -- **GitOps Principles**: [https://www.gitops.tech/](https://www.gitops.tech/) -- **Open Policy Agent (OPA)**: [https://www.openpolicyagent.org/](https://www.openpolicyagent.org/) -- **Pulumi Documentation**: [https://www.pulumi.com/docs/](https://www.pulumi.com/docs/) -- **Pulumi Python SDK**: [https://www.pulumi.com/docs/intro/languages/python/](https://www.pulumi.com/docs/intro/languages/python/) -- **Pulumi Cloud Features**: [https://www.pulumi.com/product/cloud/](https://www.pulumi.com/product/cloud/) - -## C. What is Pulumi? - -**Pulumi** is an open-source IaC platform that enables developers and infrastructure teams to define, deploy, and manage cloud infrastructure using familiar programming languages such as Python, TypeScript, Go, and C#. 
Unlike traditional IaC tools that use domain-specific languages (DSLs), Pulumi allows the use of real programming languages, offering greater flexibility and power. - -**Key Features:** - -- **Multi-Cloud Support**: Manage resources across AWS, Azure, GCP, Kubernetes, and other providers. -- **Programming Language Flexibility**: Use existing programming skills to define infrastructure. -- **State Management**: Choose between self-managed state or use Pulumi Cloud for managed state storage. -- **Policy as Code**: Embed compliance and governance policies directly into your codebase. -- **Automation API**: Integrate Pulumi into CI/CD pipelines and other automation workflows. - -### 1.1 Infrastructure Provisioning - -- **Resource Management**: Define and manage cloud resources using code. -- **Complex Logic Handling**: Utilize programming constructs for loops, conditionals, and abstractions. -- **Reusable Components**: Create modules and packages for shared infrastructure code. - -### 1.2 Multi-Cloud Capabilities - -- **Unified Interface**: Manage different cloud providers using the same codebase. -- **Cross-Cloud Abstractions**: Develop higher-level components that abstract away provider specifics. - -### 1.3 State Management - -- **State Persistence**: Track infrastructure state for accurate deployments. -- **Backend Options**: Use local files, cloud storage, or Pulumi Cloud for state management. - -### 1.4 Policy and Compliance - -- **Policy as Code**: Define and enforce policies within your infrastructure code. -- **Compliance Integration**: Include compliance controls (e.g., FISMA, NIST) in configurations. - -### 1.5 CI/CD Integration - -- **Automation Support**: Seamlessly integrate with CI/CD pipelines for automated deployments. -- **GitOps Workflow**: Adopt GitOps practices with Pulumi for infrastructure changes. - -### 1.6 Collaboration and Secrets Management - -- **Team Collaboration**: Use Pulumi Cloud for role-based access control and collaboration. 
-- **Secure Secrets Management**: Handle secrets securely with Pulumi's Federated OIDC, and Secrets Federation suppport. diff --git a/docs/DOCUMENTATION.md b/docs/DOCUMENTATION.md new file mode 100644 index 0000000..fbaf8c6 --- /dev/null +++ b/docs/DOCUMENTATION.md @@ -0,0 +1,163 @@ +# Documentation Guidelines for Module Maintainers and Documentation Developers + +## Purpose + +This document provides comprehensive guidelines for module maintainers and documentation developers contributing to the **Konductor Infrastructure as Code (IaC) project**. The goal is to produce high-quality, informative, and accessible documentation that serves a diverse audience—from novice homelab enthusiasts to senior DevOps professionals and principal platform engineers. + +By adhering to these guidelines, you can ensure that all documentation is consistent, comprehensive, and aligned with industry best practices, enhancing its value to developers, users, and practitioners. + +--- + +## General Principles + +### 1. **Alignment with Project Standards** + +- **Consistency**: Ensure all documentation aligns with the project's established standards and guidelines. +- **Best Practices**: Incorporate documentation best practices from leading cloud-native projects like Kubernetes, Docker, and others. + +### 2. **Audience Awareness** + +- **Inclusivity**: Write documentation that is accessible and informative for a wide range of readers, from beginners to experts. +- **Clarity**: Use clear, concise language and avoid unnecessary jargon. When technical terms are necessary, provide definitions or explanations. +- **Depth and Detail**: Provide sufficient detail to help advanced users while ensuring that beginners can understand and follow along. + +### 3. **Content Quality** + +- **Accuracy**: Ensure all information is accurate and up-to-date with the latest project developments. 
+- **Completeness**: Cover all necessary topics, including setup instructions, usage examples, troubleshooting, and best practices. +- **Relevance**: Focus on content that adds value to the user experience and aids in the understanding of the module or feature. + +### 4. **Structural Organization** + +- **Logical Flow**: Organize content in a logical sequence, starting from general concepts and progressing to specific details. +- **Predictable Layout**: Use a consistent structure across all documents to help users know where to find information. +- **Modularity**: Break down documentation into manageable sections or documents, each focusing on a specific aspect or feature. + +### 5. **Stylistic Consistency** + +- **Formatting Standards**: Adhere to consistent formatting throughout all documents, including headings, code blocks, lists, and emphasis. +- **Tone and Voice**: Maintain a professional and approachable tone. Encourage engagement and learning by being supportive and inclusive. +- **Terminology**: Use consistent terminology across all documents to prevent confusion. + +--- + +## Guidelines for Authoring and Revising Documentation + +### A. **Document Structure and Sections** + +- **Introduction**: Begin with an introduction that outlines the purpose, scope, and audience of the document. +- **Table of Contents**: Provide a table of contents for easy navigation, especially for longer documents. +- **Sections and Subsections**: Organize content into clear sections with descriptive headings. +- **Conclusion and Next Steps**: End with a summary of key points and suggestions for further reading or actions. + +### B. **Content Development** + +- **Use Case Examples**: Include practical examples and real-world use cases to illustrate concepts. +- **Code Samples**: Provide code snippets where applicable, ensuring they are tested and functional. +- **Visual Aids**: Use diagrams, charts, or tables to explain complex ideas or workflows. 
+- **FAQs and Troubleshooting**: Anticipate common questions or issues and address them proactively. + +### C. **Style and Formatting** + +- **Markdown Standards**: Use proper Markdown syntax for formatting documents, including appropriate heading levels, bold and italics, code blocks, and lists. +- **Code Blocks**: Use fenced code blocks with language identifiers for syntax highlighting (e.g., ```python). +- **Inline Code and Commands**: Use backticks for inline code references or commands (e.g., `kubectl get pods`). +- **Links and References**: Include hyperlinks to relevant sections, documents, or external resources. + +### D. **Clarity and Accessibility** + +- **Plain Language**: Write in plain language to make content accessible to non-native English speakers and those less familiar with the subject matter. +- **Definitions and Glossaries**: Provide definitions for specialized terms and consider including a glossary for complex topics. +- **Accessibility Standards**: Follow accessibility guidelines, such as using alt text for images and ensuring sufficient color contrast. + +### E. **Review and Quality Assurance** + +- **Proofreading**: Carefully proofread documents to correct grammatical errors, typos, and inconsistencies. +- **Peer Review**: Engage other team members to review documentation for accuracy, clarity, and completeness. +- **Continuous Updates**: Regularly update documentation to reflect changes in the project, deprecations, or new features. + +--- + +## Module Maintainers Specific Guidelines + +### A. **Module Documentation Structure** + +- **README.md**: Provide an overview of the module, including its purpose, features, and how it fits into the larger project. +- **Installation Guide**: Include step-by-step instructions on how to install and configure the module. +- **Usage Instructions**: Offer detailed usage examples, including common use cases and advanced configurations. 
+- **API References**: If applicable, provide API documentation with explanations of available functions, classes, or methods. +- **Changelog**: Maintain a changelog that documents significant changes, enhancements, and bug fixes. + +### B. **Consistency Across Modules** + +- **Standardized Templates**: Use standardized templates for module documentation to ensure consistency. +- **Naming Conventions**: Follow consistent naming conventions for files, directories, and headings. +- **Cross-Module References**: Reference related modules where appropriate and explain how they interact. + +### C. **Versioning and Compatibility** + +- **Version Information**: Clearly indicate the module version and compatible versions of dependencies. +- **Deprecation Notices**: Provide advance notice of deprecated features and guidance on migration paths. + +### D. **Contribution Guidelines** + +- **How to Contribute**: Include clear instructions for contributing to the module, such as coding standards, testing requirements, and submission processes. +- **Issue Reporting**: Explain how users can report bugs or request features, including any templates or guidelines to follow. + +--- + +## Documentation Developers Specific Guidelines + +### A. **Documentation Contribution Process** + +- **Style Guide Compliance**: Familiarize yourself with and adhere to the project's documentation style guide. +- **Documentation Planning**: Collaborate with developers and maintainers to plan documentation updates alongside code changes. +- **Use of Tools**: Utilize documentation tools and generators where appropriate (e.g., Sphinx, MkDocs). + +### B. **Content Maintenance** + +- **Content Audits**: Regularly review existing documentation to identify outdated information or gaps. +- **User Feedback**: Incorporate feedback from users to improve documentation clarity and usefulness. +- **Localization**: If applicable, support localization efforts by preparing documentation for translation. + +### C. 
**Collaboration with Development Teams** + +- **Integration with Development Workflows**: Align documentation updates with development cycles and release schedules. +- **Documentation in Pull Requests**: Encourage the inclusion of documentation changes in code pull requests when features are added or modified. + +--- + +## Drawing Inspiration from Leading Projects + +To meet or exceed the standards of mainstream cloud-native projects like Kubernetes, consider the following practices: + +- **Adopt a Documentation Style Guide**: Use established style guides like the [Kubernetes Documentation Style Guide](https://kubernetes.io/docs/contribute/style/style-guide/) as a reference. +- **Use Structured Formats**: Implement documentation structures similar to those found in Kubernetes or Docker, which often include concepts, tasks, tutorials, and reference sections. +- **Implement Documentation Tests**: Utilize tools that can test the validity of links, code snippets, and formatting within documentation. +- **Community Engagement**: Foster a community around documentation by recognizing contributors and encouraging participation through documentation sprints or hackathons. + +--- + +## Additional Considerations + +- **Accessibility Compliance**: Ensure documentation meets accessibility standards, such as WCAG 2.1, to make content usable by everyone. +- **Searchability**: Optimize documentation for search engines and include metadata to improve discoverability. +- **Documentation Metrics**: Track metrics like page views, time on page, and user feedback to assess the effectiveness of documentation and identify areas for improvement. +- **Disaster Recovery and Archiving**: Implement processes for backing up documentation and maintaining version history. + +--- + +## Conclusion + +By following these guidelines, module maintainers and documentation developers can create high-quality documentation that enhances the Konductor IaC project's usability and adoption. 
Consistent, clear, and comprehensive documentation is essential for empowering users and fostering a collaborative community. + +Your efforts are crucial to the project's success, and your contributions are highly valued. Together, we can build an exceptional knowledge base that serves the needs of all users and contributors. + +--- + +## Getting Help + +If you have questions or need assistance with documentation, please reach out through the following channels: + +- **Discord Channel**: Join our project's Discord channel for real-time discussions. +- **Issue Tracker**: Open an issue in the project's repository with the label `documentation`. diff --git a/docs/DOCUMENTATION_PLAN.md b/docs/DOCUMENTATION_PLAN.md new file mode 100644 index 0000000..7458f60 --- /dev/null +++ b/docs/DOCUMENTATION_PLAN.md @@ -0,0 +1,319 @@ +# Documentation Analysis and Reorganization Plan + +## Introduction + +This analysis examines the current state of the documentation after recent refactoring efforts. Our goal is to identify issues across various dimensions—functional, informational, architectural, stylistic, compliance, context, cohesion, and omissions. Based on this analysis, we propose a reorganized documentation structure that enhances reliability, intuitiveness, and accessibility for maintainers, developers, and end users. + +By aligning with the project's documentation guidelines, we aim to produce high-quality, informative, and accessible documentation that serves a diverse audience. + +--- + +## Current Documentation Analysis + +### Overview of Existing Documents + +1. **PULUMI_PYTHON.md** + - **Purpose**: Outlines development practices, techniques, and requirements for working on the Pulumi project. + - **Content**: Project setup, dependency management with Poetry, enforcing type checking with Pyright, using `TypedDict`, best practices, and code standards. + +2. 
**Konductor User Guide** + - **Purpose**: Provides an in-depth overview of the design principles, code structure, and best practices for module development within the Konductor IaC codebase. + - **Content**: Introduction, design principles, code structure, configuration management, module development guide, example module (Cert Manager), conclusion, and next steps. + +3. **Konductor Developer Guide** + - **Purpose**: Intended for developers contributing to the Konductor IaC codebase. + - **Content**: Code structure, development best practices, contribution workflow, adding enhancements, testing and validation, documentation standards, support, and resources. + +4. **AWS Module Implementation Roadmap** + - **Purpose**: Provides a comprehensive guide for implementing a scalable and modular AWS infrastructure using Pulumi and Python. + - **Content**: Introduction, goals and objectives, code structure, configuration management, AWS organization setup, IAM management, deploying workloads, secrets management, main execution flow, testing and validation, documentation best practices, conclusion, and additional resources. + +5. **Module-Specific Documents** + - **eks_donor_template.md** + - **Purpose**: Detailed guide and code walkthrough for setting up an Amazon EKS cluster with supporting AWS infrastructure. + - **eks_opentelemetry_docs.md** + - **Purpose**: Documentation on integrating AWS Distro for OpenTelemetry (ADOT) with Amazon EKS. + +6. **Other Documents** + - **CALL_TO_ACTION.md**: Purpose unclear without content. + - **COMPLIANCE.md**: Likely covers compliance requirements and standards. + - **TypeDict.md**: Possibly explains the usage of `TypedDict` in the codebase. + - **ROADMAP.md** and **ROADMAP_Addendum.md**: Provide project roadmaps and additional planning details. + - **Multiple README.md and DEVELOPER.md Files**: Present in various directories, potentially leading to confusion. 
+ +### Identified Issues + +#### Functional + +- **Redundancy and Overlap**: Multiple `README.md` and `DEVELOPER.md` files scattered across directories can cause confusion. +- **Scattered Information**: Important information is spread across different locations, making it difficult to find. +- **Clarity of Purpose**: Some documents lack clear descriptions of their intended audience or purpose. + +#### Informational + +- **Duplication**: Similar content may be repeated in different documents. +- **Gaps**: Some advanced topics, FAQs, or troubleshooting guides are missing. + +#### Architectural + +- **Inconsistent Organization**: Documentation is not centralized, leading to a fragmented experience. +- **Predictability**: Users may not know where to look for specific information due to inconsistent placement of documents. + +#### Stylistic + +- **Inconsistency**: Variations in writing style, formatting, and terminology across documents. +- **Formatting Issues**: Lack of standardized formatting guidelines may affect readability. + +#### Compliance + +- **Adherence to Standards**: Some documents may not fully align with the standards outlined in `PULUMI_PYTHON.md`. + +#### Context + +- **Audience Targeting**: Unclear distinctions between documents intended for end users, developers, or maintainers. +- **Assumed Knowledge**: Some documents may assume prior knowledge not shared by all readers. + +#### Cohesion + +- **Disconnected Sections**: Lack of cross-referencing between related documents. +- **Flow**: The progression from introductory to advanced topics may not be logical. + +#### Omissions + +- **Central Index or Overview**: Absence of a central document that maps out the entire documentation structure. +- **Contribution Templates**: Missing templates or examples for contributions, issues, or pull requests. +- **FAQs and Troubleshooting**: Lack of dedicated sections for common issues and their resolutions. 
+- **Accessibility Compliance**: No mention of accessibility standards. + +--- + +## Proposed Documentation Reorganization + +To address the identified issues and align with the documentation guidelines, we'll reorganize the documentation into a centralized and predictable structure. This new layout enhances accessibility, reduces redundancy, and improves the overall user experience. + +### New Documentation Structure + +``` +docs/ +├── README.md +├── getting_started.md +├── user_guide/ +│ ├── README.md +│ ├── konductor_user_guide.md +│ └── faq_and_troubleshooting.md +├── developer_guide/ +│ ├── README.md +│ ├── konductor_developer_guide.md +│ ├── contribution_guidelines.md +│ └── modules/ +│ ├── aws/ +│ │ ├── README.md +│ │ ├── implementation_roadmap.md +│ │ ├── developer_guide.md +│ │ ├── eks_donor_template.md +│ │ ├── eks_opentelemetry_docs.md +│ │ └── changelog.md +│ └── cert_manager/ +│ ├── README.md +│ ├── developer_guide.md +│ └── changelog.md +├── modules/ +│ ├── aws/ +│ │ ├── README.md +│ │ ├── usage_guide.md +│ │ ├── installation_guide.md +│ │ └── faq_and_troubleshooting.md +│ └── cert_manager/ +│ ├── README.md +│ ├── usage_guide.md +│ ├── installation_guide.md +│ └── faq_and_troubleshooting.md +├── reference/ +│ ├── PULUMI_PYTHON.md +│ ├── TypedDict.md +│ └── style_guide.md +├── compliance/ +│ └── COMPLIANCE.md +├── roadmaps/ +│ ├── ROADMAP.md +│ └── ROADMAP_Addendum.md +├── contribution_templates/ +│ ├── issue_template.md +│ ├── pull_request_template.md +│ └── feature_request_template.md +├── call_to_action.md +``` + +### Explanation of the New Structure + +#### 1. **docs/** (Root Documentation Directory) + +- **Purpose**: Centralizes all documentation, making it the go-to place for any information related to the project. +- **Contents**: + - `README.md`: Provides an overview of the documentation structure, accessibility considerations, and guides users to specific sections. 
+ - `getting_started.md`: A quick-start guide for new users to set up and begin using the project. + +#### 2. **user_guide/** + +- **Purpose**: Contains all user-focused documentation, helping end users understand how to use the project. +- **Contents**: + - `README.md`: Introduces the user guides and provides a table of contents. + - `konductor_user_guide.md`: The main user guide for the Konductor IaC platform. + - `faq_and_troubleshooting.md`: Addresses common questions and issues. + +#### 3. **developer_guide/** + +- **Purpose**: Houses developer-focused documentation, including guidelines for contributing and extending the project. +- **Contents**: + - `README.md`: Overview of the developer guides. + - `konductor_developer_guide.md`: Detailed guide for developers working on the Konductor codebase. + - `contribution_guidelines.md`: Centralized document outlining the contribution workflow, including templates. + - `modules/`: Subdirectory containing module-specific developer guides. + +#### 4. **developer_guide/modules/** + +- **Purpose**: Organizes developer documentation for individual modules. +- **Contents**: + - **aws/**: + - `README.md`: Introduction to the AWS module. + - `implementation_roadmap.md`: The AWS Module Implementation Roadmap. + - `developer_guide.md`: Developer guide specific to the AWS module. + - `eks_donor_template.md`: Documentation for the EKS donor template. + - `eks_opentelemetry_docs.md`: Guide on integrating ADOT with EKS. + - `changelog.md`: Documents significant changes, enhancements, and bug fixes. + - **cert_manager/**: + - `README.md`: Introduction to the Cert Manager module. + - `developer_guide.md`: Developer guide specific to the Cert Manager module. + - `changelog.md`: Documents significant changes, enhancements, and bug fixes. + +#### 5. **modules/** + +- **Purpose**: Contains user-facing documentation for individual modules. +- **Contents**: + - **aws/**: + - `README.md`: Overview and basic usage of the AWS module. 
+ - `installation_guide.md`: Step-by-step instructions on how to install and configure the module. + - `usage_guide.md`: Detailed instructions on using the AWS module. + - `faq_and_troubleshooting.md`: Addresses common questions and issues. + - **cert_manager/**: + - `README.md`: Overview and basic usage of the Cert Manager module. + - `installation_guide.md`: Step-by-step instructions on how to install and configure the module. + - `usage_guide.md`: Detailed instructions on using the Cert Manager module. + - `faq_and_troubleshooting.md`: Addresses common questions and issues. + +#### 6. **reference/** + +- **Purpose**: Stores reference materials and standards applicable across the project. +- **Contents**: + - `PULUMI_PYTHON.md`: Pulumi Python development standards. + - `TypedDict.md`: Explanation and usage guidelines for `TypedDict`. + - `style_guide.md`: Documentation style guide outlining formatting, tone, and terminology standards. + +#### 7. **compliance/** + +- **Purpose**: Contains compliance requirements and related documentation. +- **Contents**: + - `COMPLIANCE.md`: Details on compliance standards and how the project adheres to them. + +#### 8. **roadmaps/** + +- **Purpose**: Provides planning documents and future development roadmaps. +- **Contents**: + - `ROADMAP.md`: The overall project roadmap. + - `ROADMAP_Addendum.md`: Additional details or updates to the roadmap. + +#### 9. **contribution_templates/** + +- **Purpose**: Contains templates for issues, pull requests, and feature requests to standardize contributions. +- **Contents**: + - `issue_template.md`: Template for reporting issues. + - `pull_request_template.md`: Template for submitting pull requests. + - `feature_request_template.md`: Template for proposing new features. + +#### 10. **call_to_action.md** + +- **Purpose**: A document highlighting ways for the community to contribute or engage with the project. 
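
As a rough way to check the layout described above, the sketch below reports which of the proposed paths are still missing from a working tree. `EXPECTED_PATHS` and `missing_paths` are illustrative names, not part of the Konductor codebase, and the list covers only a handful of entries from the proposed tree; it assumes the script is run from the repository root.

```python
from pathlib import Path
from typing import Iterable, List

# A subset of the proposed layout, relative to the docs/ directory.
# Extend this list to cover the rest of the tree as it is adopted.
EXPECTED_PATHS = [
    "README.md",
    "getting_started.md",
    "user_guide/README.md",
    "developer_guide/contribution_guidelines.md",
    "reference/style_guide.md",
    "contribution_templates/issue_template.md",
]


def missing_paths(docs_root: Path, expected: Iterable[str]) -> List[str]:
    """Return the expected relative paths that are not files under docs_root."""
    return [rel for rel in expected if not (docs_root / rel).is_file()]


if __name__ == "__main__":
    # Assumes the current directory is the repository root.
    for rel in missing_paths(Path("docs"), EXPECTED_PATHS):
        print(f"missing: docs/{rel}")
```

A check like this could run in CI during the migration, so that a file moved to the wrong directory is noticed early.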
+ +### Benefits of the New Structure + +- **Centralization**: All documentation is located under the `docs/` directory, making it easy to find. +- **Predictability**: Users can intuitively navigate to the appropriate section based on their needs. +- **Audience Clarity**: Clear separation between user guides and developer guides ensures that each audience can find relevant information quickly. +- **Reduced Redundancy**: Consolidates duplicate documents and organizes content logically. +- **Consistency**: Enables uniform formatting, style, and terminology across all documents by including a `style_guide.md`. +- **Improved Navigation**: README files in each directory provide overviews and link to sub-documents. +- **Enhanced Cohesion**: Related documents are grouped together, facilitating a logical flow of information. +- **Ease of Maintenance**: A structured layout simplifies updates and the addition of new documentation. +- **Accessibility Compliance**: By including accessibility considerations in the `docs/README.md` and following guidelines in the `style_guide.md`, we ensure documentation is accessible to all users. + +--- + +## Implementation Plan + +To transition to the new documentation structure without losing any content or value, we propose the following steps: + +1. **Create the `docs/` Directory**: + - Move all existing documentation files into the `docs/` directory. + - Update any relative links within documents to reflect the new paths. + +2. **Organize Documentation into Subdirectories**: + - Categorize documents based on their audience and purpose. + - For example, move `Konductor User Guide` into `docs/user_guide/konductor_user_guide.md`. + +3. **Consolidate Duplicate Documents**: + - Merge multiple `README.md` and `DEVELOPER.md` files where appropriate. + - Ensure that module-specific guides are placed in their respective directories under `developer_guide/modules/`. + +4. 
**Standardize Document Formatting**: + - Apply consistent styling, headings, and formatting across all documents according to the `style_guide.md`. + - Use a markdown linter or formatter to enforce style guidelines. + +5. **Update the Root `README.md`**: + - Provide an introduction to the project. + - Include links to the main sections of the documentation under `docs/`. + - Highlight accessibility features and compliance. + +6. **Create Indexes and Tables of Contents**: + - In `docs/README.md`, include a high-level table of contents for the entire documentation. + - In each subdirectory's `README.md`, provide an overview of the contents and purpose. + +7. **Add Missing Sections**: + - **FAQs and Troubleshooting**: Create dedicated sections to address common issues. + - **Accessibility Compliance**: Ensure documents meet accessibility standards, as outlined in `DOCUMENTATION.md`. + - **Glossary**: Consider adding a glossary for complex terms. + +8. **Review for Omissions and Gaps**: + - Ensure that all critical topics are covered. + - Incorporate user feedback to identify areas needing improvement. + +9. **Ensure Compliance with Standards**: + - Cross-reference documents with `PULUMI_PYTHON.md` and `style_guide.md` to ensure adherence to coding and documentation standards. + +10. **Update Contribution Guidelines**: + - In `docs/developer_guide/contribution_guidelines.md`, outline the new documentation structure. + - Provide instructions for adding or updating documentation within the new layout. + - Include templates from `contribution_templates/` for consistency. + +11. **Implement Accessibility and Searchability Enhancements**: + - Add metadata to documents to improve search engine optimization (SEO). + - Use headings and alt text appropriately for accessibility. + +12. **Communicate the Changes**: + - Inform the team and community about the reorganization through the `call_to_action.md`. + - Update any external links or references to documentation. + +13. 
**Set Up Documentation Metrics**: + - Implement tools to track documentation usage and effectiveness. + - Use metrics to guide future improvements. + +14. **Backup and Version Control**: + - Ensure all documentation changes are committed to version control. + - Implement a backup strategy for documentation. + +--- + +## Conclusion + +By reorganizing the documentation into a centralized and predictable structure, we enhance the usability and accessibility of information for all stakeholders. This new layout addresses the issues identified in the analysis by providing clear pathways to information, reducing redundancy, and ensuring consistency across all documents. + +Maintainers, developers, and end users will benefit from the improved organization, making it easier to find the information they need and contribute effectively to the project. This reorganization sets a strong foundation for future growth and scalability of the documentation as the project evolves. diff --git a/docs/HOW_TO.md b/docs/HOW_TO.md new file mode 100644 index 0000000..5bbe120 --- /dev/null +++ b/docs/HOW_TO.md @@ -0,0 +1,260 @@ +# Konductor DevOps Template + +## Introduction + +Welcome to the Konductor DevOps Template. + +This repository includes baseline dependencies and boilerplate artifacts for operating and developing cloud infrastructure automation. + +Follow the steps below to configure AWS credentials, set up your development environment, and verify access to AWS and EKS resources. + +## Prerequisites + +Before you begin, ensure you have the following installed: + +- **Python**: Version 3.8 or higher. +- **Poetry**: For dependency management and packaging. [Install Poetry](https://python-poetry.org/docs/#installation). +- **AWS CLI**: Version 2.x or higher. +- **Pulumi CLI**: For managing infrastructure as code. +- **Git**: For cloning repositories. +- **Kubectl**: For interacting with Kubernetes clusters. +- **sudo**: For executing administrative commands. 
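
The prerequisites listed above can be checked with a short script. This is a minimal sketch that only tests whether each executable is on `PATH` and whether the running interpreter is Python 3.8 or newer; it does not verify the AWS CLI version. `REQUIRED_TOOLS` and `missing_tools` are hypothetical names used for illustration, and the executable names assume default installations.

```python
import shutil
import sys
from typing import Callable, Iterable, List, Optional

# Executable names for the tools listed above (assumed default names).
REQUIRED_TOOLS = ["poetry", "aws", "pulumi", "git", "kubectl", "sudo"]


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # The guide asks for Python 3.8 or higher.
    if sys.version_info < (3, 8):
        print("Python 3.8 or higher is required")
    for tool in missing_tools(REQUIRED_TOOLS):
        print(f"not found on PATH: {tool}")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is sufficient.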
+ +> **Note:** All dependencies are automatically supplied in the [ghcr.io/containercraft/devcontainer](https://github.com/containercraft/devcontainer) image powering the VSCode Dev Container included in this repository by the [.devcontainer/devcontainer.json](.devcontainer/devcontainer.json) and [.devcontainer/Dockerfile](.devcontainer/Dockerfile). + +## Steps to Recreate + +Follow the steps below to set up your environment: + +### 1. Initialize the Development Environment + +#### a. Clone the Repository + +Clone this repository to your local machine: + +```bash +git clone https://github.com/containercraft/konductor.git +cd konductor +``` + +#### b. Install Dependencies with Poetry + +We use [Poetry](https://python-poetry.org/) for dependency management and packaging. Poetry ensures that our development environment is consistent, dependencies are properly managed, and collaboration is streamlined. + +Install the project dependencies: + +```bash +poetry install +``` + +This command will create a virtual environment and install all dependencies specified in `pyproject.toml`. + +#### c. Activate the Virtual Environment + +Activate the virtual environment: + +```bash +poetry shell +``` + +Alternatively, you can prefix commands with `poetry run`. + +#### d. Configure Pulumi to Use Poetry + +Ensure that `Pulumi.yaml` specifies Poetry as the toolchain: + +```yaml +name: your-pulumi-project +runtime: + name: python + options: + toolchain: poetry +``` + +#### e. Install Pulumi Dependencies + +Install Pulumi dependencies: + +```bash +pulumi install +``` + +This command ensures that Pulumi recognizes and utilizes the Poetry-managed environment. + +### 2. Enforce Type Checking with Pyright + +Type checking enhances code reliability and maintainability. We enforce strict type checking using [Pyright](https://github.com/microsoft/pyright). + +#### a. 
Verify Type Checking + +Run Pyright to check for type errors: + +```bash +poetry run pyright +``` + +Ensure that there are no type errors before proceeding. If type errors are detected, fix them according to the standards outlined in [PULUMI_PYTHON.md](../docs/PULUMI_PYTHON.md). + +### 3. Authenticate with Pulumi + +Login to Pulumi Cloud: + +```bash +pulumi login +``` + +### 4. Load Environment Variables and AWS Credentials + +Use Pulumi to load environment variables, configuration files, and credentials. + +*Note:* Replace ``, ``, and `` with your Pulumi organization, project name, and stack name. + +```bash +export ENVIRONMENT="//" +eval $(pulumi env open --format=shell $ENVIRONMENT | tee .tmpenv; direnv allow) +echo "Loaded environment $ENVIRONMENT" + +alias aws='aws --profile smdc-cba' +``` + +### 5. Validate AWS CLI Access + +Get Caller Identity to verify your AWS identity: + +```bash +aws --profile smdc-cba sts get-caller-identity +``` + +### 6. Deploy Infrastructure as Code (IaC) + +Before deploying, ensure that your code passes type checking to maintain code quality. + +#### a. Run Type Checking + +```bash +poetry run pyright +``` + +If type errors are detected, the deployment will halt, and errors will be displayed. + +#### b. Deploy with Pulumi + +Deploy the infrastructure using Pulumi: + +```bash +pulumi up --yes --stack // --skip-preview --refresh +``` + +Replace `//` with your specific Pulumi stack information. + +### 7. Install the SMCE CLI Tool + +Clone the SMCE CLI repository and set up the `smce` CLI: + +```bash +cd ~ +rm -rf ~/smce-cli ~/.local/bin/smce +ln -sf $GIT_CONFIG ~/.gitconfig + +git clone https://git.smce.nasa.gov/smce-administration/smce-cli.git ~/smce-cli +cd ~/smce-cli + +mkdir -p ~/.local/bin +cp -f ~/smce-cli/smce ~/.local/bin/smce +chmod +x ~/.local/bin/smce + +smce --help || true +``` + +### 8. 
Configure AWS and Kubernetes Using SMCE CLI + +Set up AWS Multi-Factor Authentication (MFA): + +> **Note:** Enhance `smce-cli` to auto-export MFA environment variables. + +```bash +smce awsconfig mfa +``` + +### 9. Test AWS CLI Access + +List S3 buckets to confirm AWS access and verify your AWS identity: + +```bash +aws s3 ls +aws sts get-caller-identity +``` + +### 10. Update Kubernetes Configuration for EKS Cluster + +Update your kubeconfig file to interact with your EKS cluster: + +```bash +aws eks update-kubeconfig --profile main --region us-east-1 --name smce-gitops +``` + +Generate a new Kubernetes configuration: + +```bash +smce kubeconfig generate +``` + +Generate an authentication token for the EKS cluster: + +```bash +aws eks get-token --region us-east-1 --cluster-name smce-gitops --output json +``` + +Replace `` with your MFA device's ARN and provide your MFA token code in place of `$MFA_TOKEN`. + +### 11. Configure Kubectl Alias and Verify Kubernetes Access + +List available Kubernetes contexts: + +```bash +kubectl --kubeconfig ~/.kube/smce config get-contexts +``` + +Retrieve the list of nodes in your Kubernetes cluster: + +```bash +kubectl --kubeconfig ~/.kube/smce get nodes +``` + +Check the Kubernetes client and server versions with verbose output: + +```bash +kubectl version -v=8 +``` + +## Conclusion + +By following these steps, you've set up your environment to interact with AWS services and your EKS cluster. This setup is essential for deploying and managing applications using the Konductor DevOps Template. 
+ +## Resources + +- [AWS CLI Documentation](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html) +- [Pulumi Documentation](https://www.pulumi.com/docs/) +- [SMCE CLI Repository](https://git.smce.nasa.gov/smce-administration/smce-cli) +- [Kubernetes Documentation](https://kubernetes.io/docs/home/) +- [Poetry Documentation](https://python-poetry.org/docs/) +- [Pyright Documentation](https://github.com/microsoft/pyright) + +## Troubleshooting + +**Note:** If you encounter authentication issues due to MFA requirements, test temporary session credentials using the following command: + +```bash +aws sts get-session-token \ + --duration-seconds 129600 \ + --profile default \ + --serial-number \ + --token-code $MFA_TOKEN +``` + +## Bonus: Launch Kubernetes in Docker + +```bash +cd .. +task kubernetes +``` diff --git a/docs/README.md b/docs/README.md index 900c587..3cf708e 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,190 +1,90 @@ -# Konductor DevOps Template +# Konductor Documentation -## Introduction - -Welcome to the Konductor DevOps Template. - -This repository includes baseline dependencies and boilerplate artifacts for operating and developing cloud infrastructure automation. - -Follow the steps below to configure AWS credentials, Kubernetes `kubectl` configuration, and verify access to AWS and EKS resources. - -## Prerequisites - -Before you begin, ensure you have the following installed: - -- **AWS CLI**: Version 2.x or higher. -- **Pulumi CLI**: For managing infrastructure as code. -- **Git**: For cloning repositories. -- **Kubectl**: For interacting with Kubernetes clusters. -- **sudo**: For executing administrative commands. 
- -> NOTE: All dependencies are automatically supplied in the [ghcr.io/containercraft/devcontainer](https://github.com/containercraft/devcontainer) image powering the VSCode Dev Container included in this repository by the [.devcontainer/devcontainer.json](.devcontainer/devcontainer.json) and [.devcontainer/Dockerfile](.devcontainer/Dockerfile). - -## Steps to Recreate - -Follow the steps below to set up your environment: - -### 1. Cloud Account Logins - -Authenticate with your Pulumi account: - -```bash {"name":"login","tag":"setup"} -pulumi login && pulumi install - -``` - -### 2. Load Environment Variables and AWS Credentials - -Use Pulumi to load environment variables, configuration files, and credentials: - -* NOTE: Replace ``, ``, and `` with your Pulumi organization, project name, and stack name. - -```bash {"name":"load-environments-and-secrets","tag":"setup"} -export ENVIRONMENT="containercraft/NavtecaAwsCredentialsConfigSmce/navteca-aws-credentials-config-smce" -eval $(pulumi env open --format=shell $ENVIRONMENT | tee .tmpenv; direnv allow) -echo "Loaded environment $ENVIRONMENT" - -alias aws='aws --profile smdc-cba' - -``` - -### 3. Validate AWS CLI Access - -Get Caller Identity to verify your AWS identity: - -```bash {"excludeFromRunAll":"true","name":"validate-aws-identity","tag":"validate-aws"} -aws --profile smdc-cba sts get-caller-identity - -``` - -### 3. Deploy IaC +Welcome to the Konductor documentation! This comprehensive guide covers everything you need to know about using, developing for, and contributing to the Konductor Infrastructure as Code (IaC) platform. -Deploy the infrastructure as code (IaC) using Pulumi: - -```bash {"name":"deploy-iac","tag":"setup"} -git remote add origin https://github.com/containercraft/konductor || true -git config remote.origin.url https://github.com/containercraft/konductor || true -pulumi up --yes --stack containercraft/scip-ops-prod --skip-preview=true --refresh=true - -``` - -### 3. 
Install the SMCE CLI Tool - -Clone the SMCE CLI repository && Symlink `smce` cli: - -```bash {"name":"install-smce-cli","tag":"setup"} -cd ~ -rm -rf ~/smce-cli ~/.local/bin/smce -ln -sf $GIT_CONFIG ~/.gitconfig - -git config remote.origin.url https://git.smce.nasa.gov/smce-administration/smce-cli.git -git clone https://git.smce.nasa.gov/smce-administration/smce-cli.git ~/smce-cli && cd ~/smce-cli && ls - -mkdir -p ~/.local/bin -cp -f ~/smce-cli/smce ~/.local/bin/smce -chmod +x ~/.local/bin/smce - -smce --help; true - -``` - -### 4. Configure AWS and Kubernetes Using SMCE CLI - -Set up AWS Multi-Factor Authentication (MFA): - -> TODO: enhance smce-cli to auto-export mfa env vars - -```bash {"excludeFromRunAll":"true","name":"smce-aws-mfa","tag":"aws"} -smce awsconfig mfa - -``` - -### 7. Test AWS CLI Access - -List S3 buckets to confirm AWS access && Verify your AWS identity: - -```bash {"excludeFromRunAll":"true","name":"validate-aws-s3-ls","tag":"validate-aws"} -aws s3 ls -aws sts get-caller-identity - -``` - -### 8. Update Kubernetes Configuration for EKS Cluster - -Update your kubeconfig file to interact with your EKS cluster: - -```bash {"excludeFromRunAll":"true","name":"aws-get-ops-kubeconfig","tag":"kubeconfig"} -aws eks update-kubeconfig --profile main --region us-east-1 --name smce-gitops - -``` - -Generate a new Kubernetes configuration: - -```bash {"excludeFromRunAll":"true","name":"generate-smce-kubeconfig","tag":"kubeconfig"} -smce kubeconfig generate +## Introduction -``` +Konductor is a powerful Infrastructure as Code (IaC) platform built on Pulumi and Python, designed to streamline DevOps workflows and Platform Engineering practices. Whether you're a platform user deploying infrastructure, a developer contributing modules, or a maintainer managing the project, you'll find the information you need in these docs. 
-Generate an authentication token for the EKS cluster: +## Documentation Structure -```bash {"excludeFromRunAll":"true","name":"generate-eks-auth-token","tag":"kubeconfig"} -aws eks get-token --region us-east-1 --cluster-name smce-gitops --output json +Our documentation is organized into several main sections, each targeting specific user needs: -``` +### 🚀 [Getting Started](./getting_started.md) +Quick-start guide to help you begin using Konductor, including installation, basic setup, and your first deployment. -Replace `` with your MFA device's ARN and provide your MFA token code in place of `$MFA_TOKEN`. +### 📚 User Documentation +- [User Guide](./user_guide/README.md): Complete guide for platform users +- [Module Documentation](./modules/README.md): Detailed guides for individual modules +- [FAQ & Troubleshooting](./user_guide/faq_and_troubleshooting.md): Common issues and solutions -### 9. Configure Kubectl Alias and Verify Kubernetes Access +### 💻 Developer Documentation +- [Developer Guide](./developer_guide/README.md): Guide for contributing to Konductor +- [Module Development](./developer_guide/modules/README.md): Creating and maintaining modules +- [Contribution Guidelines](./developer_guide/contribution_guidelines.md): How to contribute -List available Kubernetes contexts: +### 📖 Reference Documentation +- [Pulumi Python Standards](./reference/PULUMI_PYTHON.md): Development standards and practices +- [TypedDict Guide](./reference/TypedDict.md): Using TypedDict for configurations +- [Style Guide](./reference/style_guide.md): Documentation and code style standards -```bash {"excludeFromRunAll":"true","name":"validate-kubeconfig-context-list","tag":"kubeconfig"} -kubectl --kubeconfig ~/.kube/smce config get-contexts +### ⚖️ Compliance & Planning +- [Compliance Guide](./compliance/COMPLIANCE.md): Compliance standards and implementation +- [Project Roadmap](./roadmaps/ROADMAP.md): Future development plans +- [Roadmap Addendum](./roadmaps/ROADMAP_Addendum.md): 
Additional planning details -``` +## Available Modules -Retrieve the list of nodes in your Kubernetes cluster: +Konductor includes several core modules, each with comprehensive documentation: -```bash {"excludeFromRunAll":"true","name":"validate-kube-get-nodes","tag":"kubeconfig"} -kubectl --kubeconfig ~/.kube/smce get nodes +### AWS Module +- [User Guide](./modules/aws/README.md) +- [Developer Guide](./developer_guide/modules/aws/README.md) +- [Implementation Roadmap](./developer_guide/modules/aws/implementation_roadmap.md) +- [EKS Setup Guide](./developer_guide/modules/aws/eks_donor_template.md) +- [OpenTelemetry Integration](./developer_guide/modules/aws/eks_opentelemetry_docs.md) -``` +### Cert Manager Module +- [User Guide](./modules/cert_manager/README.md) +- [Developer Guide](./developer_guide/modules/cert_manager/README.md) +- [Installation Guide](./modules/cert_manager/installation_guide.md) -Check the Kubernetes client and server versions with verbose output: +## Contributing -```bash {"excludeFromRunAll":"true","name":"validate-kube-get-version","tag":"kubeconfig"} -kubectl version -v=8 +We welcome contributions from the community! To get started: -``` +1. Read our [Contribution Guidelines](./developer_guide/contribution_guidelines.md) +2. Check our [Issue Template](./contribution_templates/issue_template.md) +3. Review our [Pull Request Template](./contribution_templates/pull_request_template.md) +4. See our [Feature Request Template](./contribution_templates/feature_request_template.md) -## Conclusion +## Accessibility -By following these steps, you've set up your environment to interact with AWS services and your EKS cluster. This setup is essential for deploying and managing applications using the Konductor DevOps Template. 
+This documentation adheres to web accessibility guidelines to ensure it's usable by everyone: -## Resources +- Clear heading hierarchy for easy navigation +- Alt text for all images and diagrams +- High contrast text and proper font sizing +- Keyboard-navigable interface +- Screen reader compatibility -- [AWS CLI Documentation](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html) -- [Pulumi Documentation](https://www.pulumi.com/docs/) -- [SMCE CLI Repository](https://git.smce.nasa.gov/smce-administration/smce-cli) -- [Kubernetes Documentation](https://kubernetes.io/docs/home/) +## Getting Help -## Troubleshooting +If you need assistance: -**Note:** If you encounter authentication issues due to MFA requirements, test temporary session credentials using the following command: +1. Check the [FAQ & Troubleshooting](./user_guide/faq_and_troubleshooting.md) guide +2. Search existing [GitHub Issues](https://github.com/containercraft/konductor/issues) +3. Join our [Community Discord](https://discord.gg/Jb5jgDCksX) +4. Create a new issue using our [Issue Template](./contribution_templates/issue_template.md) -```bash {"excludeFromRunAll":"true","name":"aws-sts-get-session-token","tag":"dbg"} -aws sts get-session-token \ - --duration-seconds 129600 \ - --profile default \ - --serial-number \ - --token-code $MFA_TOKEN +## Documentation Updates -``` +This documentation is continuously improved based on user feedback and project evolution. To suggest improvements: -## Bonus: Launch Kubernetes in Docker +1. Create an issue using our documentation issue template +2. Submit a pull request with your proposed changes +3. Join the documentation discussions in our community channels -```bash {"excludeFromRunAll":"true","name":"task-run-kubernetes","tag":"tind"} -cd .. -task kubernetes +## License -``` +This documentation and the Konductor project are licensed under [LICENSE]. See the [LICENSE](../LICENSE) file for details. 
diff --git a/docs/call_to_action.md b/docs/call_to_action.md new file mode 100644 index 0000000..da74cb1 --- /dev/null +++ b/docs/call_to_action.md @@ -0,0 +1,248 @@ +# Call to Action: Enhancing the Konductor Platform + +## Introduction + +This document outlines our vision for enhancing and maintaining the Konductor Infrastructure as Code (IaC) platform. It serves as both a guide for current maintainers and an invitation to potential contributors, emphasizing our commitment to quality, maintainability, and community-driven development. + +## Prime Directive + +> "Features are nice. Quality is paramount." + +Quality extends beyond code to encompass the entire developer and user experience. At Konductor, we believe that the success of open-source projects depends on the satisfaction and engagement of both community developers and users. + +## Core Principles + +### 1. Developer Experience (DX) + +- **Code Quality**: Maintain high standards through: + - Static type checking with Pyright + - Comprehensive documentation + - Automated testing + - Clear error messages + +- **Development Workflow**: + ```python + # Example of type-safe, well-documented code + from typing import TypedDict, Optional + + class ModuleConfig(TypedDict): + """Configuration for module deployment. + + Attributes: + enabled: Whether the module is enabled + version: Optional version string + namespace: Kubernetes namespace + """ + enabled: bool + version: Optional[str] + namespace: str + ``` + +### 2. User Experience (UX) + +- **Clear Documentation**: Maintain comprehensive, accessible documentation +- **Intuitive Interfaces**: Design APIs and configurations for clarity +- **Error Handling**: Provide actionable error messages +- **Progressive Disclosure**: Layer complexity appropriately + +### 3. 
Code Maintainability + +- **Modular Design**: + - Separate concerns clearly + - Create reusable components + - Implement consistent interfaces + +- **Type Safety**: + - Use TypedDict for configurations + - Implement strict type checking + - Provide clear type annotations + +### 4. Community Focus + +- **Open Communication**: + - Active Discord community + - Responsive issue management + - Regular updates and roadmap sharing + +- **Inclusive Development**: + - Welcome contributions of all sizes + - Provide mentorship opportunities + - Maintain helpful documentation + +## Areas for Enhancement + +### 1. Modular Design Improvements + +Current: + +```python +# Before: Mixed responsibilities +class AwsResources: + def create_vpc(self): pass + def create_database(self): pass +``` + +Target: + +```python +# After: Single responsibility +class AwsNetworking: + """Manages AWS networking resources.""" + def create_vpc(self): pass + +class AwsDatabase: + """Manages AWS database resources.""" + def create_database(self): pass +``` + + +### 2. Configuration Management + +- **Standardize Configurations**: + - Use TypedDict consistently + - Implement validation + - Provide clear defaults + +### 3. Documentation Improvements + +- **Structure**: Follow the new documentation organization +- **Accessibility**: Ensure documentation is accessible to all +- **Examples**: Provide clear, runnable examples + +### 4. Testing Enhancements + +- **Unit Tests**: Improve coverage +- **Integration Tests**: Add end-to-end scenarios +- **Type Checking**: Enforce strict mode + +## How to Contribute + +### 1. Code Contributions + +- Follow type safety practices +- Include comprehensive tests +- Update documentation +- Add examples where appropriate + +### 2. Documentation Contributions + +- Follow the style guide +- Include code examples +- Consider accessibility +- Update related docs + +### 3.
Review Process + +- Engage in constructive review +- Test thoroughly +- Verify documentation +- Check type safety + +## Development Standards + +### Code Organization + +```bash +# Example of well-organized module structure +modules/ +├── aws/ +│ ├── __init__.py +│ ├── types.py # TypedDict definitions +│ ├── deploy.py # Deployment logic +│ └── README.md # Module documentation +``` + +### Type Safety Requirements + +- Use TypedDict for configurations +- Enable strict type checking +- Implement proper error handling + +### Documentation Requirements + +- Clear docstrings +- Type annotations +- Usage examples +- Architecture diagrams + +## Future Vision + +### Short-term Goals + +1. **Enhanced Type Safety** + - Complete TypedDict migration + - Implement strict checking + - Add validation layers + +2. **Improved Testing** + - Increase test coverage + - Add integration tests + - Implement property testing + +### Long-term Goals + +1. **Platform Evolution** + - Multi-cloud support + - Advanced compliance + - Enhanced automation + +2. **Community Growth** + - Expand contributor base + - Improve documentation + - Regular workshops + +## Call to Action + +We invite you to join us in improving the Konductor platform: + +1. **For Developers**: + - Review our [Developer Guide](./developer_guide/README.md) + - Check our [Good First Issues](https://github.com/containercraft/konductor/issues?q=is:issue+is:open+label:"good+first+issue") + - Join our [Discord](https://discord.gg/Jb5jgDCksX) + +2. **For Users**: + - Share your use cases + - Report issues + - Suggest improvements + +3. **For Documentation**: + - Help improve clarity + - Add examples + - Fix errors + +## Getting Started + +1. **Read the Documentation**: + - [Getting Started Guide](./getting_started.md) + - [Developer Guide](./developer_guide/README.md) + - [User Guide](./user_guide/README.md) + +2.
**Set Up Your Environment**: + ```bash + git clone https://github.com/containercraft/konductor.git + cd konductor + poetry install + poetry shell + ``` + +3. **Start Contributing**: + - Pick an issue + - Fork the repository + - Submit a pull request + +## Community Support + +- **Discord**: Join our [Community Discord](https://discord.gg/Jb5jgDCksX) +- **GitHub**: Open issues and discussions +- **Documentation**: Contribute to our docs + +## Conclusion + +The Konductor platform thrives on community involvement and maintains high standards for code quality, documentation, and user experience. We welcome contributions that align with our vision of creating a robust, maintainable, and user-friendly Infrastructure as Code platform. + +Remember: "Features are nice. Quality is paramount." + +--- + +**Note**: This document is actively maintained. For updates and changes, refer to our [changelog](./changelog.md). diff --git a/docs/compliance/COMPLIANCE.md b/docs/compliance/COMPLIANCE.md new file mode 100644 index 0000000..be7ba5f --- /dev/null +++ b/docs/compliance/COMPLIANCE.md @@ -0,0 +1,498 @@ +# Compliance Standards and Implementation Guide + +## Introduction + +This document outlines the comprehensive compliance strategy implemented in the Konductor Infrastructure as Code (IaC) platform. It details how the codebase is designed to reduce the time necessary to achieve production-ready compliance and authority to operate, while also minimizing the overhead associated with compliance maintenance and renewal audits. + +## Table of Contents + +1. [Overview](#overview) + - [Objectives](#objectives) + - [Scope](#scope) +2. [Compliance Framework](#compliance-framework) + - [Supported Standards](#supported-standards) + - [Implementation Strategy](#implementation-strategy) +3. [Development Standards](#development-standards) +4. [Implementation Guidelines](#implementation-guidelines) + - [Resource Tagging](#resource-tagging) + - [Access Control](#access-control) +5. 
[Validation and Testing](#validation-and-testing) +6. [Documentation Requirements](#documentation-requirements) +7. [Auditing and Reporting](#auditing-and-reporting) +8. [Security Controls](#security-controls) +9. [Compliance Automation](#compliance-automation) +10. [Maintenance and Updates](#maintenance-and-updates) +11. [Conclusion](#conclusion) + +## Overview + +### Objectives + +- Automate compliance controls within IaC workflows +- Ensure consistent policy enforcement across all environments +- Reduce manual compliance tasks and human error +- Provide clear audit trails and documentation +- Support multiple compliance frameworks (NIST, FISMA, ISO 27001) + +### Scope + +This document covers: + +- Development standards and practices +- Security controls and implementation +- Documentation requirements +- Testing and validation procedures +- Audit preparation and reporting + +## Compliance Framework + +### Supported Standards + +#### NIST Framework + +- NIST SP 800-53 +- NIST Cybersecurity Framework +- NIST Cloud Computing Standards + +#### FISMA Compliance + +- Federal Information Security Management Act requirements +- Authority to Operate (ATO) prerequisites +- Continuous monitoring requirements + +#### ISO Standards + +- ISO 27001 Information Security Management +- ISO 27017 Cloud Security +- ISO 27018 Cloud Privacy + +### Implementation Strategy + +#### Configuration Schema + +```python +from typing import TypedDict, List, Dict + +class ComplianceConfig(TypedDict): + """Compliance configuration structure. 
+ + Attributes: + nist_controls: List of NIST control identifiers + fisma_level: FISMA impact level + iso_controls: List of ISO control identifiers + audit_logging: Audit logging configuration + encryption: Encryption requirements + """ + nist_controls: List[str] + fisma_level: str + iso_controls: List[str] + audit_logging: Dict[str, bool] + encryption: Dict[str, str] + +# Default compliance configuration +default_compliance: ComplianceConfig = { + "nist_controls": ["AC-2", "AC-3", "AU-2"], + "fisma_level": "moderate", + "iso_controls": ["A.9.2.3", "A.10.1.1"], + "audit_logging": {"enabled": True, "encrypted": True}, + "encryption": {"algorithm": "AES-256", "key_rotation": "90days"} +} +``` + +## Development Standards + +### Code Quality Requirements + +#### Type Safety + +- Use type hints for all functions and variables +- Implement TypedDict for configurations +- Enable strict type checking with Pyright + +#### Documentation + +- Comprehensive docstrings for all public APIs +- Clear inline comments for complex logic +- Up-to-date README files and user guides + +#### Testing + +- Unit tests for all functionality +- Integration tests for compliance controls +- Security testing and validation + +### Python Standards + +```python +from typing import Optional, Dict, Any +import pulumi + +def create_compliant_resource( + name: str, + config: Dict[str, Any], + compliance_tags: Optional[Dict[str, str]] = None +) -> pulumi.Resource: + """Create a resource with compliance controls. 
+ + Args: + name: Resource name + config: Resource configuration + compliance_tags: Compliance-related tags + + Returns: + pulumi.Resource: Created resource with compliance controls + """ + if not compliance_tags: + compliance_tags = {} + + # Add required compliance tags + compliance_tags.update({ + "compliance:framework": "nist", + "compliance:control": "AC-2", + "compliance:validated": "true" + }) + + # Resource creation with compliance controls + resource = pulumi.Resource( + name, + props={**config, "tags": compliance_tags}, + opts=pulumi.ResourceOptions(protect=True) + ) + + return resource +``` + +## Implementation Guidelines + +### Resource Tagging + +#### Required Tags + +- `compliance:framework` +- `compliance:control` +- `compliance:validated` +- `compliance:owner` +- `compliance:expiration` + +#### Tag Implementation + +```python +def apply_compliance_tags( + resource: pulumi.Resource, + tags: Dict[str, str] +) -> None: + """Apply compliance tags to a resource. + + Args: + resource: The resource to tag + tags: Compliance tags to apply + """ + required_tags = { + "compliance:framework": "nist", + "compliance:control": "AC-2", + "compliance:validated": "true", + "compliance:owner": "platform-team", + "compliance:expiration": "2024-12-31" + } + + # Merge required tags with provided tags + final_tags = {**required_tags, **tags} + + # Apply tags to resource + resource.tags.apply(lambda x: {**x, **final_tags}) +``` + +### Access Control + +#### IAM Configuration + +- Implement least privilege access +- Regular access review procedures +- Role-based access control (RBAC) + +#### Authentication Requirements + +- Multi-factor authentication (MFA) +- Strong password policies +- Regular credential rotation + +## Validation and Testing + +### Compliance Testing + +```python +import pytest +from typing import Dict, Any + +def test_resource_compliance( + resource_config: Dict[str, Any], + compliance_requirements: Dict[str, Any] +) -> None: + """Test resource 
compliance with requirements. + + Args: + resource_config: Resource configuration to test + compliance_requirements: Compliance requirements to validate + """ + # Verify required tags + assert "compliance:framework" in resource_config["tags"] + assert "compliance:control" in resource_config["tags"] + + # Verify encryption settings + assert resource_config["encryption"]["enabled"] is True + assert resource_config["encryption"]["algorithm"] == "AES-256" + + # Verify access controls + assert resource_config["access"]["mfa_enabled"] is True + assert resource_config["access"]["minimum_permissions"] is True +``` + +### Automated Validation + +#### Pre-deployment Checks + +- Configuration validation +- Policy compliance verification +- Security control validation + +#### Continuous Monitoring + +- Real-time compliance monitoring +- Automated remediation +- Compliance reporting + +## Documentation Requirements + +### Required Documentation + +#### System Documentation + +- Architecture diagrams +- Data flow documentation +- Security controls documentation + +#### Operational Procedures + +- Incident response plans +- Change management procedures +- Backup and recovery procedures + +#### Compliance Evidence + +- Control implementation evidence +- Test results and validations +- Audit logs and reports + +#### Documentation Format + +```markdown +# Component Documentation Template + +## Overview +[Component description and purpose] + +## Compliance Controls +- NIST Controls: [List applicable controls] +- FISMA Requirements: [List FISMA requirements] +- ISO Controls: [List ISO controls] + +## Implementation Details +[Technical implementation details] + +## Security Controls +[Security measures and controls] + +## Testing and Validation +[Testing procedures and results] + +## Maintenance Procedures +[Routine maintenance requirements] +``` + +## Auditing and Reporting + +### Audit Logging + +```python +from typing import Dict, Any +import logging + +def setup_compliance_logging(
config: Dict[str, Any] +) -> None: + """Configure compliance audit logging. + + Args: + config: Logging configuration + """ + logging.basicConfig( + format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', + level=logging.INFO + ) + + # Add compliance-specific handlers + handler = logging.FileHandler('compliance_audit.log') + handler.setFormatter(logging.Formatter( + '%(asctime)s - %(name)s - %(levelname)s - %(message)s' + )) + logging.getLogger('compliance').addHandler(handler) +``` + +### Reporting Requirements + +#### Regular Reports + +- Monthly compliance status +- Quarterly security assessments +- Annual compliance reviews + +#### Incident Reporting + +- Security incident reports +- Compliance violation reports +- Remediation action reports + +## Security Controls + +### Encryption Requirements + +#### Data at Rest + +- AES-256 encryption +- Key management procedures +- Regular key rotation + +#### Data in Transit + +- TLS 1.2 or higher +- Certificate management +- Secure key exchange + +### Access Controls + +```python +from typing import Any, Dict, List + +def validate_access_controls( + resource: Dict[str, Any], + required_controls: List[str] +) -> bool: + """Validate resource access controls.
+ + Args: + resource: Resource configuration + required_controls: Required access controls + + Returns: + bool: True if all required controls are implemented + """ + implemented_controls = resource.get("access_controls", []) + return all(control in implemented_controls for control in required_controls) +``` + +## Compliance Automation + +### Automated Controls + +#### Resource Provisioning + +- Compliant resource templates +- Automated configuration +- Validation checks + +#### Monitoring and Alerts + +- Continuous compliance monitoring +- Automated alerts +- Remediation workflows + +### Implementation Example + +```python +from typing import Dict, Any +import pulumi + +class ComplianceAutomation: + """Automated compliance control implementation.""" + + def __init__(self, config: Dict[str, Any]): + self.config = config + self.setup_monitoring() + self.setup_alerts() + + def setup_monitoring(self) -> None: + """Configure compliance monitoring.""" + # Implementation details + + def setup_alerts(self) -> None: + """Configure compliance alerts.""" + # Implementation details + + def validate_resource(self, resource: pulumi.Resource) -> bool: + """Validate resource compliance. + + Args: + resource: Resource to validate + + Returns: + bool: True if resource is compliant + """ + # Validation implementation + return True +``` + +## Maintenance and Updates + +### Regular Reviews + +#### Monthly Reviews + +- Control effectiveness +- Policy compliance +- Security posture + +#### Quarterly Assessments + +- Comprehensive audits +- Control updates +- Documentation reviews + +### Update Procedures + +```python +from typing import Dict, Any +import datetime + +def update_compliance_controls( + current_controls: Dict[str, Any], + new_requirements: Dict[str, Any] +) -> Dict[str, Any]: + """Update compliance controls with new requirements.
+ + Args: + current_controls: Current compliance controls + new_requirements: New compliance requirements + + Returns: + Dict[str, Any]: Updated compliance controls + """ + updated_controls = current_controls.copy() + + # Update controls + for control, requirement in new_requirements.items(): + updated_controls[control] = requirement + + # Add update metadata + updated_controls["last_updated"] = datetime.datetime.now().isoformat() + updated_controls["update_version"] = str( + int(current_controls.get("update_version", "0")) + 1 + ) + + return updated_controls +``` + +## Conclusion + +This compliance framework provides a comprehensive approach to maintaining security and regulatory compliance within the Konductor IaC platform. By following these guidelines and implementing the provided controls, organizations can achieve and maintain compliance while minimizing operational overhead. Regular reviews and updates of this document ensure that it remains current with evolving compliance requirements and best practices. For questions or clarification, please contact the compliance team or refer to the project documentation. diff --git a/docs/developer_guide/README.md b/docs/developer_guide/README.md new file mode 100644 index 0000000..81c0578 --- /dev/null +++ b/docs/developer_guide/README.md @@ -0,0 +1,247 @@ +# Konductor Developer Guide + +Welcome to the Konductor Developer Guide! This comprehensive guide is designed for developers who want to contribute to or extend the Konductor Infrastructure as Code (IaC) platform. Whether you're fixing bugs, adding features, or creating new modules, this guide will help you understand our development practices and standards. + +## Table of Contents + +1. [Introduction](#introduction) + - [What is Konductor?](#what-is-konductor) + - [Development Philosophy](#development-philosophy) + - [How to Use This Guide](#how-to-use-this-guide) + +2. 
[Getting Started](#getting-started) + - [Development Environment Setup](#development-environment-setup) + - [Repository Structure](#repository-structure) + - [Core Technologies](#core-technologies) + +3. [Development Standards](#development-standards) + - [Code Quality Requirements](#code-quality-requirements) + - [Type Safety with TypedDict](#type-safety-with-typeddict) + - [Documentation Requirements](#documentation-requirements) + - [Testing Standards](#testing-standards) + +4. [Module Development](#module-development) + - [Module Architecture](#module-architecture) + - [Creating New Modules](#creating-new-modules) + - [Module Testing](#module-testing) + - [Module Documentation](#module-documentation) + +5. [Contributing](#contributing) + - [Contribution Workflow](#contribution-workflow) + - [Pull Request Guidelines](#pull-request-guidelines) + - [Code Review Process](#code-review-process) + +6. [Available Modules](#available-modules) + - [AWS Module](#aws-module) + - [Cert Manager Module](#cert-manager-module) + - [Other Modules](#other-modules) + +7. [Additional Resources](#additional-resources) + - [Reference Documentation](#reference-documentation) + - [Community Resources](#community-resources) + - [Getting Help](#getting-help) + +## Introduction + +### What is Konductor? + +Konductor is a modern Infrastructure as Code (IaC) platform built on Pulumi and Python, designed to streamline DevOps workflows and Platform Engineering practices. It provides a robust framework for managing cloud infrastructure with emphasis on type safety, modularity, and maintainability. + +### Development Philosophy + +Our development approach is guided by several key principles: + +- **Type Safety**: We use Python's type system and TypedDict to catch errors early. +- **Modularity**: Code is organized into reusable, self-contained modules. +- **Documentation**: Comprehensive documentation is treated as a first-class citizen. 
+- **Testing**: Thorough testing ensures reliability and maintainability. +- **Accessibility**: Code and documentation should be accessible to developers of all skill levels. + +### How to Use This Guide + +This guide is organized to support different development activities: + +- **New Contributors**: Start with [Getting Started](#getting-started) and [Contributing](#contributing). +- **Module Developers**: Focus on [Module Development](#module-development). +- **Core Contributors**: Review all sections, particularly [Development Standards](#development-standards). + +## Getting Started + +### Development Environment Setup + +1. **Prerequisites**: + - Python 3.8+ + - Poetry for dependency management + - Pulumi CLI + - Git + - VS Code (recommended) + +2. **Initial Setup**: + ```bash + # Clone the repository + git clone https://github.com/containercraft/konductor.git + cd konductor + + # Install dependencies + poetry install + + # Activate virtual environment + poetry shell + + # Initialize Pulumi + pulumi login + ``` + +3. **VS Code Configuration**: + - Install recommended extensions + - Configure Pylance for type checking + - Set up the Dev Container (optional but recommended) + +### Repository Structure + +```bash +konductor/ +├── pulumi/ +│ ├── main.py # Main entry point +│ ├── core/ # Core functionality +│ │ ├── config.py +│ │ ├── deployment.py +│ │ └── utils.py +│ └── modules/ # Individual modules +│ ├── aws/ +│ └── cert_manager/ +├── docs/ # Documentation +└── tests/ # Test suite +``` + + +### Core Technologies + +- **Pulumi**: Infrastructure as Code framework +- **Poetry**: Dependency management +- **TypedDict**: Type-safe configuration management +- **Pyright**: Static type checking + +## Development Standards + +For detailed standards, refer to our [Python Development Standards](../reference/PULUMI_PYTHON.md). 
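The type-safety principle this guide describes (TypedDict configurations checked by Pyright) can be sketched as follows. This is a minimal illustration, not Konductor's actual implementation: the `ModuleConfig` fields mirror the example in `docs/call_to_action.md`, while `DEFAULT_MODULE_CONFIG` and `merge_with_defaults` are hypothetical names introduced here. Because TypedDict performs no validation at runtime, the sketch adds an explicit check for unknown keys.

```python
from typing import Mapping, Optional, TypedDict, cast


class ModuleConfig(TypedDict):
    """Configuration for a module deployment."""

    enabled: bool
    version: Optional[str]
    namespace: str


# Hypothetical defaults, for illustration only.
DEFAULT_MODULE_CONFIG: ModuleConfig = {
    "enabled": False,
    "version": None,
    "namespace": "default",
}


def merge_with_defaults(overrides: Mapping[str, object]) -> ModuleConfig:
    """Return the defaults updated with user-supplied overrides.

    Raises ValueError on keys that ModuleConfig does not define, since
    TypedDict alone does not reject them at runtime.
    """
    unknown = set(overrides) - set(DEFAULT_MODULE_CONFIG)
    if unknown:
        raise ValueError(f"Unknown configuration keys: {sorted(unknown)}")
    # cast() only informs the type checker; it performs no runtime conversion.
    return cast(ModuleConfig, {**DEFAULT_MODULE_CONFIG, **overrides})


config = merge_with_defaults({"enabled": True, "namespace": "cert-manager"})
```

With strict type checking enabled, Pyright would flag a misspelled key or a wrongly typed value in a `ModuleConfig` literal before any deployment runs; the runtime check covers values that arrive from untyped sources such as stack configuration files.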
+ +### Code Quality Requirements + +- Static type checking with Pyright +- PEP 8 compliance +- Documentation for all public APIs +- Unit tests for new functionality + +### Type Safety with TypedDict + +We use TypedDict for configuration management. See our [TypedDict Guide](../reference/TypedDict.md) for details. + +### Documentation Requirements + +All code contributions must include: + +- Docstrings for modules, classes, and functions +- Updated README files +- Changelog entries +- Type annotations + +### Testing Standards + +- Unit tests for new functionality +- Integration tests for modules +- Type checking passes without errors +- Test coverage requirements met + +## Module Development + +### Module Architecture + +Modules follow a standard structure: + +```bash +modules// +├── __init__.py +├── types.py # TypedDict definitions +├── deploy.py # Deployment logic +└── README.md # Module documentation +``` + + +### Creating New Modules + +See our detailed [Module Development Guide](./modules/README.md). + +### Module Testing + +Refer to our [Testing Guide](./contribution_guidelines.md#testing-requirements) for module testing requirements. + +### Module Documentation + +Each module must include: + +- README with usage instructions +- Configuration documentation +- Example configurations +- Troubleshooting guide + +## Contributing + +### Contribution Workflow + +1. Fork the repository +2. Create a feature branch +3. Make changes +4. Run tests and type checking +5. Submit a pull request + +### Pull Request Guidelines + +See our [Pull Request Template](../contribution_templates/pull_request_template.md).
+ +### Code Review Process + +All contributions undergo review for: + +- Code quality +- Documentation completeness +- Test coverage +- Type safety + +## Available Modules + +### AWS Module + +- [Developer Guide](./modules/aws/developer_guide.md) +- [Implementation Roadmap](./modules/aws/implementation_roadmap.md) + +### Cert Manager Module + +- [Developer Guide](./modules/cert_manager/developer_guide.md) + +### Other Modules + +See our [Modules Directory](./modules/README.md) for a complete list. + +## Additional Resources + +### Reference Documentation + +- [Pulumi Python Standards](../reference/PULUMI_PYTHON.md) +- [TypedDict Guide](../reference/TypedDict.md) +- [Style Guide](../reference/style_guide.md) + +### Community Resources + +- [Discord Community](https://discord.gg/Jb5jgDCksX) +- [GitHub Discussions](https://github.com/containercraft/konductor/discussions) + +### Getting Help + +- Join our [Discord](https://discord.gg/Jb5jgDCksX) +- Open an [issue](https://github.com/containercraft/konductor/issues) +- Check our [FAQ](../user_guide/faq_and_troubleshooting.md) + +--- + +**Next Steps**: Review our [Contribution Guidelines](./contribution_guidelines.md) to start contributing to Konductor. diff --git a/docs/developer_guide/contribution_guidelines.md b/docs/developer_guide/contribution_guidelines.md new file mode 100644 index 0000000..a935b0d --- /dev/null +++ b/docs/developer_guide/contribution_guidelines.md @@ -0,0 +1,378 @@ +# Contribution Guidelines + +## Introduction + +Welcome to the Konductor contribution guidelines! This document provides detailed instructions for contributing to the Konductor project, whether you're fixing bugs, adding features, improving documentation, or creating new modules. We value all contributions and want to make the process as transparent and straightforward as possible. + +## Table of Contents + +1. [Code of Conduct](#code-of-conduct) +2. [Getting Started](#getting-started) +3. 
[Development Environment](#development-environment) +4. [Contribution Workflow](#contribution-workflow) +5. [Documentation Guidelines](#documentation-guidelines) +6. [Testing Requirements](#testing-requirements) +7. [Code Style and Standards](#code-style-and-standards) +8. [Pull Request Process](#pull-request-process) +9. [Issue Guidelines](#issue-guidelines) +10. [Community Engagement](#community-engagement) + +## Code of Conduct + +Our project adheres to a Code of Conduct that establishes expected behavior for all contributors and community members. Please read and follow our [Code of Conduct](../CODE_OF_CONDUCT.md). + +## Getting Started + +### Prerequisites + +Before contributing, ensure you have: + +- Python 3.8 or higher +- Poetry for dependency management +- Pulumi CLI +- Git +- A code editor (VS Code recommended) +- AWS CLI (for AWS module development) +- kubectl (for Kubernetes development) + +### Initial Setup + +1. **Fork the Repository** + ```bash + # Fork via GitHub UI, then clone your fork + git clone https://github.com/YOUR_USERNAME/konductor.git + cd konductor + ``` + +2. **Set Up Development Environment** + ```bash + # Install dependencies + poetry install + + # Activate virtual environment + poetry shell + + # Install pre-commit hooks + pre-commit install + ``` + +## Development Environment + +### Required Tools + +- **VS Code Extensions**: + - Pylance for Python language support + - Python extension for debugging + - YAML extension for configuration files + - Docker extension for container management + +### Configuration Files + +1. **Pyright Configuration** (`pyrightconfig.json`): + ```json + { + "include": ["**/*.py"], + "exclude": ["**/__pycache__/**"], + "reportMissingImports": true, + "pythonVersion": "3.8", + "typeCheckingMode": "strict" + } + ``` + +2. 
**Poetry Configuration** (`pyproject.toml`): + ```toml + [tool.poetry] + name = "konductor" + version = "0.1.0" + description = "Infrastructure as Code platform" + + [tool.poetry.dependencies] + python = "^3.8" + pulumi = "^3.0.0" + ``` + +## Contribution Workflow + +### 1. Create an Issue + +Before starting work: +- Check existing issues and discussions +- Create a new issue using the appropriate template: + - [Bug Report Template](../contribution_templates/issue_template.md) + - [Feature Request Template](../contribution_templates/feature_request_template.md) + +### 2. Branch Creation + +```bash +# Create a feature branch +git checkout -b feature/issue-number-brief-description + +# For bug fixes +git checkout -b fix/issue-number-brief-description +``` + + +### 3. Development Process + +1. **Write Code** + - Follow [Python Development Standards](../reference/PULUMI_PYTHON.md) + - Use type hints and TypedDict (see [TypedDict Guide](../reference/TypedDict.md)) + - Add tests for new functionality + +2. **Local Testing** + ```bash + # Run type checking + poetry run pyright + + # Run tests + poetry run pytest + + # Run formatters (these rewrite files in place) and the linter + poetry run black . + poetry run isort . + poetry run flake8 . + ``` + +3. **Commit Changes** + ```bash + # Stage changes + git add . + + # Commit with conventional commit message + git commit -m "type(scope): description" + ``` + +### 4. Documentation Updates + +All contributions must include appropriate documentation updates: + +1. **Code Documentation** + - Docstrings for all public functions/classes + - Type annotations + - Inline comments for complex logic + +2. **User Documentation** + - Update relevant user guides + - Add examples if applicable + - Update FAQs if needed + +3.
**Developer Documentation** + - Update technical documentation + - Add architecture diagrams if needed + - Update module documentation + +## Documentation Guidelines + +### File Organization + +Follow the documentation structure: + +```bash +docs/ +├── user_guide/ # End-user documentation +├── developer_guide/ # Developer documentation +├── modules/ # Module-specific guides +├── reference/ # Technical references +└── contribution_templates/ # Contribution templates +``` + +### Documentation Standards + +1. **Markdown Formatting** + - Use ATX-style headers (`#` for headers) + - Include table of contents for long documents + - Use code blocks with language identifiers + - Include alt text for images + +2. **Content Guidelines** + - Write in clear, concise language + - Include examples and use cases + - Link to related documentation + - Keep technical accuracy + +3. **Accessibility** + - Use proper heading hierarchy + - Provide alt text for images + - Ensure sufficient color contrast + - Use descriptive link text + +### Example Documentation + +```python +from typing import Dict, Optional + +def update_resource_tags( + resource_id: str, + tags: Dict[str, str], + region: Optional[str] = None +) -> Dict[str, str]: + """Update tags for an AWS resource. + + Args: + resource_id: The ID of the resource to update + tags: Dictionary of tags to apply + region: Optional AWS region (defaults to current) + + Returns: + Dictionary of applied tags + + Raises: + ResourceNotFoundError: If resource doesn't exist + InvalidTagError: If tags are invalid + + Example: + >>> tags = update_resource_tags("vpc-123", {"Environment": "prod"}) + >>> assert tags["Environment"] == "prod" + """ +``` + + +## Testing Requirements + +### Required Tests + +1. **Unit Tests** + - Test individual components + - Use pytest fixtures + - Mock external dependencies + +2. **Integration Tests** + - Test module interactions + - Verify resource creation + - Test configuration handling + +3. 
**Type Checking** + - Use strict type checking + - Verify all type annotations + +### Example Test Structure + +```python +import pytest +from pulumi import automation as auto + +def test_vpc_creation(): + """Test VPC creation with default configuration.""" + stack = auto.create_stack(...) + + # Deploy resources + result = stack.up() + + # Verify outputs + assert "vpc_id" in result.outputs + assert result.outputs["vpc_id"].value != "" +``` + + +## Code Style and Standards + +### Python Standards + +1. **Type Safety** + - Use type hints for all functions + - Implement TypedDict for configurations + - Enable strict type checking + +2. **Code Organization** + - Follow single responsibility principle + - Use meaningful names + - Keep functions focused and small + +3. **Error Handling** + - Use custom exceptions + - Provide meaningful error messages + - Handle edge cases + +### Example Code Style + +```python +from typing import TypedDict, List + +class NetworkConfig(TypedDict): + vpc_cidr: str + subnet_cidrs: List[str] + +class NetworkManager: + """Manages AWS networking resources.""" + def __init__(self, config: NetworkConfig): + self.config = config + self.validate_config() + + def validate_config(self) -> None: + """Validate network configuration.""" + if not self.is_valid_cidr(self.config["vpc_cidr"]): + raise ValueError(f"Invalid VPC CIDR: {self.config['vpc_cidr']}") +``` + + +## Pull Request Process + +### 1. Prepare Your PR + +- Update your branch with main +- Ensure all tests pass +- Update documentation +- Add changelog entry + +### 2. Submit PR + +Use the [Pull Request Template](../contribution_templates/pull_request_template.md): + +- Link related issues +- Describe changes +- List testing performed +- Note documentation updates + +### 3. Review Process + +- Address reviewer feedback +- Keep PR focused and small +- Maintain clear communication + +### 4. 
Merge Requirements + +- Passing CI/CD checks +- Approved reviews +- Updated documentation +- Changelog entry + +## Issue Guidelines + +### Creating Issues + +Use appropriate templates: +- [Bug Report Template](../contribution_templates/issue_template.md) +- [Feature Request Template](../contribution_templates/feature_request_template.md) + +### Issue Labels + +- `bug`: Bug reports +- `enhancement`: Feature requests +- `documentation`: Documentation updates +- `good first issue`: Beginner-friendly +- `help wanted`: Community input needed + +## Community Engagement + +### Communication Channels + +- GitHub Issues and Discussions +- Discord Community + +### Getting Help + +1. Check documentation +2. Search existing issues +3. Ask in Discord +4. Create a new issue + +## Conclusion + +Thank you for contributing to Konductor! Your efforts help make the project better for everyone. Remember to: + +- Follow the guidelines +- Write clear documentation +- Test thoroughly +- Engage with the community + +For updates and new features, watch our [GitHub repository](https://github.com/containercraft/konductor). diff --git a/docs/developer_guide/konductor_developer_guide.md b/docs/developer_guide/konductor_developer_guide.md new file mode 100644 index 0000000..3076af7 --- /dev/null +++ b/docs/developer_guide/konductor_developer_guide.md @@ -0,0 +1,739 @@ +# Konductor Developer Guide + +## Introduction + +Welcome to the comprehensive developer guide for Konductor! This guide provides detailed information for developers who want to contribute to or extend the Konductor Infrastructure as Code (IaC) platform. Whether you're fixing bugs, adding features, or creating new modules, you'll find everything you need to understand our development practices and standards. + +## Table of Contents + +1. [Development Environment](#development-environment) + - [Prerequisites](#prerequisites) + - [Environment Setup](#environment-setup) + - [Development Tools](#development-tools) + +2. 
[Core Technologies](#core-technologies) + - [Pulumi Overview](#pulumi-overview) + - [Poetry for Dependency Management](#poetry-for-dependency-management) + - [Type Safety with TypedDict](#type-safety-with-typeddict) + - [Static Type Checking](#static-type-checking) + +3. [Project Structure](#project-structure) + - [Directory Layout](#directory-layout) + - [Module Organization](#module-organization) + - [Configuration Management](#configuration-management) + +4. [Development Standards](#development-standards) + - [Code Style Guidelines](#code-style-guidelines) + - [Type Annotations](#type-annotations) + - [Documentation Requirements](#documentation-requirements) + - [Testing Requirements](#testing-requirements) + +5. [Module Development](#module-development) + - [Module Architecture](#module-architecture) + - [Creating New Modules](#creating-new-modules) + - [Module Testing](#module-testing) + - [Module Documentation](#module-documentation) + +6. [Testing and Validation](#testing-and-validation) + - [Unit Testing](#unit-testing) + - [Integration Testing](#integration-testing) + - [Type Checking](#type-checking) + - [Continuous Integration](#continuous-integration) + +7. [Documentation Guidelines](#documentation-guidelines) + - [Code Documentation](#code-documentation) + - [Module Documentation](#module-documentation-1) + - [API Documentation](#api-documentation) + - [Example Documentation](#example-documentation) + +8. [Best Practices](#best-practices) + - [Code Organization](#code-organization) + - [Error Handling](#error-handling) + - [Configuration Management](#configuration-management-1) + - [Resource Management](#resource-management) + +9. 
[Troubleshooting and Support](#troubleshooting-and-support) + - [Common Issues](#common-issues) + - [Getting Help](#getting-help) + - [Community Resources](#community-resources) + +## Development Environment + +### Prerequisites + +Before starting development, ensure you have: + +- Python 3.8 or higher +- Poetry for dependency management +- Pulumi CLI +- Git +- A code editor (VS Code recommended) +- AWS CLI (for AWS module development) +- kubectl (for Kubernetes development) + +### Environment Setup + +1. **Clone the Repository**: + ```bash + git clone https://github.com/containercraft/konductor.git + cd konductor + ``` + +2. **Install Dependencies**: + ```bash + poetry install + ``` + +3. **Activate Virtual Environment**: + ```bash + poetry shell + ``` + +4. **Configure Development Tools**: + ```bash + # Configure Pulumi + pulumi login + + # Set up pre-commit hooks + pre-commit install + ``` + +### Development Tools + +- **VS Code Extensions**: + - Pylance for Python language support + - Python extension for debugging + - YAML extension for configuration files + - Docker extension for container management + +- **Configuration Files**: + ```json:pyrightconfig.json + { + "include": ["**/*.py"], + "exclude": ["**/__pycache__/**"], + "reportMissingImports": true, + "pythonVersion": "3.8", + "typeCheckingMode": "strict" + } + ``` + +## Core Technologies + +### Pulumi Overview + +Pulumi is our primary IaC framework, chosen for its: +- Native Python support +- Strong type system integration +- Multi-cloud capabilities +- State management features + +### Poetry for Dependency Management + +We use Poetry to: +- Manage project dependencies +- Create reproducible builds +- Handle virtual environments +- Package distribution + +Example `pyproject.toml`: +```toml +[tool.poetry] +name = "konductor" +version = "0.1.0" +description = "Infrastructure as Code platform" + +[tool.poetry.dependencies] +python = "^3.8" +pulumi = "^3.0.0" +pulumi-aws = "^5.0.0" +``` + +### Type Safety 
with TypedDict + +TypedDict is central to our configuration management: + +```python +from typing import TypedDict, List + +class ContainerPort(TypedDict): + containerPort: int + protocol: str + +class Container(TypedDict): + name: str + image: str + ports: List[ContainerPort] +``` + +### Static Type Checking + +We enforce strict type checking using Pyright: + +```bash +# Run type checking +poetry run pyright + +# For real-time type checking in VS Code, add to .vscode/settings.json: +# { +# "python.analysis.typeCheckingMode": "strict" +# } +``` + +## Project Structure + +### Directory Layout + +``` +konductor/ +├── pulumi/ +│ ├── main.py +│ ├── core/ +│ │ ├── config.py +│ │ ├── deployment.py +│ │ └── utils.py +│ └── modules/ +│ ├── aws/ +│ └── cert_manager/ +├── docs/ +└── tests/ +``` + +### Module Organization + +Each module follows a standard structure: + +``` +modules// +├── __init__.py +├── types.py +├── deploy.py +├── config.py +└── README.md +``` + +### Configuration Management + +Configuration hierarchy: +1. Default module configurations +2. User-provided configurations +3. Environment-specific overrides + +## Development Standards + +### Code Style Guidelines + +We follow PEP 8 with additional requirements: + +```python +# Good +def create_deployment( + name: str, + replicas: int, + container: Container +) -> Deployment: + """Create a Kubernetes deployment. + + Args: + name: The deployment name + replicas: Number of replicas + container: Container configuration + + Returns: + The created deployment + """ + return Deployment(...) + +# Bad - Missing type hints and docstring +def create_deployment(name, replicas, container): + return Deployment(...)
+``` + +### Type Annotations + +All code must use type hints: + +```python +from typing import Dict, List, Optional + +def get_resource_tags( + environment: str, + additional_tags: Optional[Dict[str, str]] = None +) -> Dict[str, str]: + tags = {"Environment": environment} + if additional_tags: + tags.update(additional_tags) + return tags +``` + +### Documentation Requirements + +Required documentation elements: +- Module overview +- Function/class docstrings +- Type annotations +- Usage examples +- Configuration options + +### Testing Requirements + +All code must include: +- Unit tests +- Integration tests (where applicable) +- Type checking validation +- Documentation tests + +## Module Development + +### Module Architecture + +Modules should follow the Single Responsibility Principle: + +```python +# Good - Single responsibility +class AwsNetworking: + """Manages AWS networking resources.""" + + def create_vpc(self) -> pulumi.Output[str]: + """Create a VPC.""" + pass + + def create_subnet(self) -> pulumi.Output[str]: + """Create a subnet.""" + pass + +# Bad - Mixed responsibilities +class AwsResources: + """Manages various AWS resources.""" + + def create_vpc(self) -> str: + pass + + def create_database(self) -> str: + pass +``` + +### Creating New Modules + +Follow these steps to create a new module: + +1. Create module directory structure +2. Define TypedDict configurations +3. Implement core functionality +4. Add documentation +5. Write tests +6. Submit for review + +### Module Testing + +Example test structure: + +```python +import pytest +from pulumi import automation as auto + +def test_vpc_creation(): + """Test VPC creation with default configuration.""" + stack = auto.create_stack(...) + + # Deploy resources + result = stack.up() + + # Verify outputs + assert "vpc_id" in result.outputs + assert result.outputs["vpc_id"].value != "" +``` + +### Module Documentation + +Required module documentation: + +1. Overview and purpose +2. 
Installation instructions +3. Configuration options +4. Usage examples +5. API reference +6. Troubleshooting guide + +## Testing and Validation + +### Unit Testing + +We use pytest for unit testing: + +```python +import pytest +from konductor.core.utils import merge_configurations + +def test_merge_configurations(): + """Test configuration merging logic.""" + base = {"name": "test", "replicas": 1} + override = {"replicas": 2} + + result = merge_configurations(base, override) + + assert result["name"] == "test" + assert result["replicas"] == 2 +``` + +### Integration Testing + +Integration tests verify module interactions: + +```python +from pulumi import automation as auto +from typing import Generator +import pytest + +@pytest.fixture +def pulumi_stack() -> Generator[auto.Stack, None, None]: + """Create a test stack for integration testing.""" + stack = auto.create_stack( + stack_name="integration-test", + project_name="konductor-test" + ) + yield stack + stack.destroy() + stack.workspace.remove_stack("integration-test") + +def test_aws_module_integration(pulumi_stack: auto.Stack): + """Test AWS module integration with core components.""" + # Deploy test infrastructure + result = pulumi_stack.up() + + # Verify resource creation + assert "vpc_id" in result.outputs + assert "subnet_ids" in result.outputs +``` + +### Type Checking + +Run type checking as part of validation: + +```bash +# Run type checking with detailed output +poetry run pyright --verbose + +# Check specific module +poetry run pyright pulumi/modules/aws +``` + +### Continuous Integration + +Our CI pipeline includes: + +1. **Code Quality Checks**: + ```yaml + steps: + - name: Code Quality + run: | + poetry run black --check . + poetry run isort --check-only . + poetry run flake8 . + ``` + +2. **Type Checking**: + ```yaml + steps: + - name: Type Check + run: poetry run pyright + ``` + +3. 
**Tests**: + ```yaml + steps: + - name: Run Tests + run: poetry run pytest --cov + ``` + +## Documentation Guidelines + +### Code Documentation + +Follow these docstring conventions: + +```python +from typing import Dict, Optional + +def update_resource_tags( + resource_id: str, + tags: Dict[str, str], + region: Optional[str] = None +) -> Dict[str, str]: + """Update tags for an AWS resource. + + Args: + resource_id: The ID of the resource to update + tags: Dictionary of tags to apply + region: Optional AWS region (defaults to current) + + Returns: + Dictionary of applied tags + + Raises: + ResourceNotFoundError: If resource doesn't exist + InvalidTagError: If tags are invalid + + Example: + >>> tags = update_resource_tags("vpc-123", {"Environment": "prod"}) + >>> assert tags["Environment"] == "prod" + """ + # Implementation +``` + +### Module Documentation + +Each module must include: + +1. **README.md**: + ````markdown + # AWS Module + + ## Overview + Manages AWS infrastructure resources. + + ## Installation + ```bash + poetry add konductor-aws + ``` + + ## Usage + ```python + from konductor.modules.aws import AwsNetworking + + networking = AwsNetworking(...) + vpc = networking.create_vpc(...) + ``` + + ## Configuration + | Parameter | Type | Description | Default | + |-----------|------|-------------|---------| + | region | str | AWS region | us-west-2 | + ```` + +2. **API Documentation**: + ```python + class AwsNetworking: + """AWS networking resource management. + + Provides functionality for creating and managing AWS networking + resources including VPCs, subnets, and security groups.
+ + Attributes: + region: The AWS region for resource creation + tags: Default tags for all resources + """ + ``` + +### Example Documentation + +Provide clear, runnable examples: + +```python +from konductor.modules.aws import AwsNetworking +from konductor.core.config import NetworkConfig + +# Configuration +network_config = NetworkConfig( + vpc_cidr="10.0.0.0/16", + subnet_cidrs=["10.0.1.0/24", "10.0.2.0/24"], + availability_zones=["us-west-2a", "us-west-2b"] +) + +# Create networking resources +networking = AwsNetworking(network_config) +vpc = networking.create_vpc() +subnets = networking.create_subnets() +``` + +## Best Practices + +### Code Organization + +1. **Separation of Concerns**: + ```python + # Good - Separate configuration and implementation + class NetworkConfig(TypedDict): + vpc_cidr: str + subnet_cidrs: List[str] + + class NetworkManager: + def __init__(self, config: NetworkConfig): + self.config = config + + # Bad - Mixed configuration and implementation + class Network: + def __init__(self, vpc_cidr: str, subnet_cidrs: List[str]): + self.create_vpc(vpc_cidr) + self.create_subnets(subnet_cidrs) + ``` + +2. **Resource Organization**: + ```python + # Good - Logical resource grouping + class AwsNetworking: + def __init__(self, config: NetworkConfig): + self.vpc = self._create_vpc() + self.subnets = self._create_subnets() + self.security_groups = self._create_security_groups() + + # Bad - No clear organization + class AwsResources: + def create_stuff(self): + self.vpc = create_vpc() + self.database = create_database() + self.subnets = create_subnets() + ``` + +### Error Handling + +1. 
**Custom Exceptions**: + ```python + class ResourceError(Exception): + """Base exception for resource operations.""" + pass + + class ResourceNotFoundError(ResourceError): + """Raised when a resource cannot be found.""" + pass + + def get_resource(resource_id: str) -> Resource: + try: + return fetch_resource(resource_id) + except ApiError as e: + raise ResourceNotFoundError(f"Resource {resource_id} not found") from e + ``` + +2. **Graceful Degradation**: + ```python + from typing import Dict, Optional + + def get_resource_tags( + resource_id: str, + default: Optional[Dict[str, str]] = None + ) -> Dict[str, str]: + """Get resource tags with fallback to defaults.""" + try: + return fetch_resource_tags(resource_id) + except ResourceError: + return default or {} + ``` + +### Configuration Management + +1. **Configuration Validation**: + ```python + from pydantic import BaseModel, Field + + class VpcConfig(BaseModel): + cidr_block: str = Field(..., regex=r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/\d{1,2}$") + enable_dns: bool = True + tags: Dict[str, str] = Field(default_factory=dict) + ``` + +2. **Environment-Specific Configuration**: + ```python + def load_config(environment: str) -> Dict[str, Any]: + """Load environment-specific configuration.""" + base_config = load_yaml("config/base.yaml") + env_config = load_yaml(f"config/{environment}.yaml") + return deep_merge(base_config, env_config) + ``` + +### Resource Management + +1. **Resource Cleanup**: + ```python + class ResourceManager: + def __init__(self): + self.resources: List[Resource] = [] + + def add_resource(self, resource: Resource): + self.resources.append(resource) + + def cleanup(self): + """Clean up all managed resources.""" + for resource in reversed(self.resources): + try: + resource.destroy() + except ResourceError: + logger.exception(f"Failed to clean up {resource.id}") + ``` + +2.
**Resource Dependencies**: + ```python + class NetworkStack: + def __init__(self): + self.vpc = self._create_vpc() + # Subnets depend on VPC + self.subnets = self._create_subnets(self.vpc.id) + # Route tables depend on VPC and subnets + self.route_tables = self._create_route_tables( + self.vpc.id, + self.subnets + ) + ``` + +## Troubleshooting and Support + +### Common Issues + +1. **Type Checking Errors**: + ```python + # Error: Parameter "config" missing required TypedDict key "region" + def deploy(config: AwsConfig): # Missing region in config + pass + + # Fix: Provide all required keys + config: AwsConfig = { + "region": "us-west-2", + "tags": {"Environment": "prod"} + } + ``` + +2. **Resource Creation Failures**: + ```python + try: + vpc = create_vpc(config) + except pulumi.ResourceError as e: + # Check for common issues + if "InvalidCidrBlock" in str(e): + logger.error("Invalid CIDR block specified") + elif "QuotaExceeded" in str(e): + logger.error("AWS account quota exceeded") + raise + ``` + +### Getting Help + +1. **Community Support**: + - GitHub Issues: [Konductor Issues](https://github.com/containercraft/konductor/issues) + - Discord Community: [Join Discord](https://discord.gg/Jb5jgDCksX) + - Stack Overflow: Tag `konductor` + +2. **Documentation Resources**: + - [User Guide](../user_guide/README.md) + - [API Reference](../reference/README.md) + - [Module Documentation](../modules/README.md) + +### Community Resources + +1. **Contributing**: + - [Contribution Guidelines](../contribution_guidelines.md) + - [Code of Conduct](../CODE_OF_CONDUCT.md) + - [Development Setup](../getting_started.md) + +2. **Learning Resources**: + - [Tutorial Series](../tutorials/README.md) + - [Example Projects](../examples/README.md) + - [Best Practices Guide](../reference/best_practices.md) + +## Conclusion + +This developer guide provides a comprehensive overview of developing with Konductor. 
Remember to: + +- Follow type safety practices +- Write comprehensive tests +- Document your code +- Contribute to the community + +For updates and new features, watch our [GitHub repository](https://github.com/containercraft/konductor) diff --git a/docs/developer_guide/modules/aws/README.md b/docs/developer_guide/modules/aws/README.md new file mode 100644 index 0000000..7ea9ec2 --- /dev/null +++ b/docs/developer_guide/modules/aws/README.md @@ -0,0 +1,295 @@ +# AWS Module Developer Guide + +## Table of Contents + +1. [Introduction](#introduction) +2. [Module Overview](#module-overview) +3. [Architecture](#architecture) +4. [Implementation Guide](#implementation-guide) +5. [Configuration Management](#configuration-management) +6. [Development Standards](#development-standards) +7. [Testing and Validation](#testing-and-validation) +8. [Security and Compliance](#security-and-compliance) +9. [Deployment Workflows](#deployment-workflows) +10. [Roadmap and Future Plans](#roadmap-and-future-plans) +11. [Troubleshooting](#troubleshooting) +12. [References](#references) + +## Introduction + +The AWS module for Konductor provides a comprehensive framework for implementing scalable, secure, and compliant AWS infrastructure using Pulumi and Python. This guide is intended for developers contributing to or extending the AWS module functionality. + +### Prerequisites + +- Python 3.8+ +- Poetry for dependency management +- Pulumi CLI +- AWS CLI configured with appropriate credentials +- Understanding of TypedDict and static type checking + +## Module Overview + +### Purpose + +The AWS module enables: +- Automated AWS Organizations and Control Tower setup +- Standardized landing zone implementation +- Secure IAM management +- EKS cluster deployment with best practices +- Integration with AWS services (OpenTelemetry, etc.) 
+ +### Key Features + +- Multi-account strategy support +- Compliance-ready infrastructure +- Automated security controls +- Scalable resource management +- Type-safe configuration handling + +## Architecture + +### Core Components + +```python +from typing import TypedDict, Optional, List, Dict + +class LandingZone(TypedDict): + name: str + email: str + ou_path: str + tags: Dict[str, str] + +class AWSOrganizationConfig(TypedDict): + enabled: bool + feature_set: str + accounts: List[LandingZone] + default_tags: Dict[str, str] + region: str +``` + +### Directory Structure + +``` +aws/ +├── __init__.py +├── types.py # TypedDict definitions +├── deploy.py # Deployment logic +├── config.py # Configuration management +├── iam/ # IAM management +├── organizations/ # AWS Organizations +├── eks/ # EKS implementation +└── security/ # Security controls +``` + +## Implementation Guide + +### Setting Up AWS Organizations + +```python +def create_organization(config: AWSOrganizationConfig) -> aws.organizations.Organization: + """Creates an AWS Organization with all features enabled.""" + organization = aws.organizations.Organization( + "aws_organization", + feature_set=config.get("feature_set", "ALL"), + opts=pulumi.ResourceOptions(protect=True) + ) + return organization +``` + +### Creating Organizational Units + +```python +def create_organizational_units( + organization: aws.organizations.Organization, + ou_names: List[str] +) -> Dict[str, aws.organizations.OrganizationalUnit]: + """Creates Organizational Units (OUs) under the AWS Organization.""" + organizational_units = {} + for ou_name in ou_names: + ou = aws.organizations.OrganizationalUnit( + f"ou_{ou_name.lower()}", + name=ou_name, + parent_id=organization.roots[0].id, + opts=pulumi.ResourceOptions(parent=organization) + ) + organizational_units[ou_name] = ou + return organizational_units +``` + +## Configuration Management + +### TypedDict Configuration + +```python +# Default configuration values
+aws_organization_defaults: AWSOrganizationConfig = { + "enabled": True, + "feature_set": "ALL", + "accounts": [], + "default_tags": {}, + "region": "us-west-2" +} +``` + +### Configuration Validation + +```python +def validate_config(config: AWSOrganizationConfig) -> None: + """Validates AWS configuration.""" + if not config["region"]: + raise ValueError("AWS region must be specified") + + for account in config["accounts"]: + if not account["email"]: + raise ValueError("Account email is required") +``` + +## Development Standards + +### Code Quality Requirements + +- Static type checking with Pyright +- Documentation for all public APIs +- Unit tests for new functionality +- Compliance with `PULUMI_PYTHON.md` standards + +### Example Implementation + +```python +from typing import Optional, Dict, Any +import pulumi + +def create_compliant_resource( + name: str, + config: Dict[str, Any], + compliance_tags: Optional[Dict[str, str]] = None +) -> pulumi.Resource: + """Create a resource with compliance controls.""" + if not compliance_tags: + compliance_tags = {} + + compliance_tags.update({ + "compliance:framework": "nist", + "compliance:control": "AC-2", + "compliance:validated": "true" + }) + + return pulumi.Resource( + name, + props={**config, "tags": compliance_tags}, + opts=pulumi.ResourceOptions(protect=True) + ) +``` + +## Testing and Validation + +### Unit Testing + +```python +import pytest +from pulumi import automation as auto + +def test_vpc_creation(): + """Test VPC creation with default configuration.""" + stack = auto.create_stack(...) 
+ result = stack.up() + assert "vpc_id" in result.outputs +``` + +### Integration Testing + +```python +def test_aws_module_integration(pulumi_stack: auto.Stack): + """Test AWS module integration with core components.""" + result = pulumi_stack.up() + assert "vpc_id" in result.outputs + assert "subnet_ids" in result.outputs +``` + +## Security and Compliance + +### NIST Controls Implementation + +- AC-2: Account Management +- AC-3: Access Enforcement +- AU-2: Audit Events +- CM-6: Configuration Settings + +### Security Best Practices + +- Enable AWS Organizations SCP +- Implement least privilege access +- Enable CloudTrail logging +- Configure AWS Config rules + +## Deployment Workflows + +### CI/CD Integration + +```yaml +name: AWS Module Deployment +on: [push, pull_request] + +jobs: + deploy: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v2 + - name: Setup Python + uses: actions/setup-python@v2 + - name: Deploy Infrastructure + run: pulumi up --yes +``` + +## Roadmap and Future Plans + +- Complete AWS Organizations integration +- Implement Control Tower automation +- Deploy baseline security controls +- Enhance EKS integration +- Implement cross-account access patterns +- Add advanced monitoring capabilities +- Implement advanced compliance controls +- Add support for AWS Landing Zone +- Enhance security posture +- Multi-region support +- Disaster recovery automation +- Advanced cost optimization + +## Troubleshooting + +### Common Issues + +1. **Organizations Access Denied** + - Ensure proper IAM permissions + - Verify Organization access + +2. **EKS Deployment Failures** + - Check VPC configuration + - Verify IAM roles + - Review security group settings + +### Best Practices + +1. **Resource Management** + - Use consistent naming conventions + - Implement proper tagging + - Enable detailed monitoring + +2. 
**Security** + - Regular security assessments + - Automated compliance checks + - Continuous monitoring + +## References + +- [AWS Organizations Documentation](https://docs.aws.amazon.com/organizations/) +- [AWS Control Tower](https://docs.aws.amazon.com/controltower/) +- [EKS Best Practices](https://aws.github.io/aws-eks-best-practices/) +- [AWS Well-Architected Framework](https://aws.amazon.com/architecture/well-architected/) +- [NIST Compliance](https://aws.amazon.com/compliance/nist/) +- [Pulumi AWS Provider](https://www.pulumi.com/registry/packages/aws/) + +--- + +**Note**: This module is under active development. For the latest updates, refer to the [changelog](./changelog.md) and [implementation roadmap](./implementation_roadmap.md). diff --git a/docs/developer_guide/modules/aws/developer_guide.md b/docs/developer_guide/modules/aws/developer_guide.md new file mode 100644 index 0000000..c0abd62 --- /dev/null +++ b/docs/developer_guide/modules/aws/developer_guide.md @@ -0,0 +1,446 @@ +# AWS Module Developer Guide + +## Table of Contents + +1. [Introduction](#introduction) +2. [Module Architecture](#module-architecture) +3. [Development Standards](#development-standards) +4. [Implementation Guide](#implementation-guide) +5. [Configuration Management](#configuration-management) +6. [Security and Compliance](#security-and-compliance) +7. [Testing and Validation](#testing-and-validation) +8. [Deployment Workflows](#deployment-workflows) +9. [Advanced Topics](#advanced-topics) +10. [Troubleshooting](#troubleshooting) +11. [Future Development](#future-development) +12. [References](#references) + +## Introduction + +The AWS module for Konductor provides a comprehensive framework for implementing scalable, secure, and compliant AWS infrastructure using Pulumi and Python. This guide details the development standards, implementation patterns, and best practices for contributing to or extending the AWS module. 
+ +### Prerequisites + +- Python 3.8+ +- Poetry for dependency management +- Pulumi CLI +- AWS CLI configured with appropriate credentials +- Understanding of TypedDict and static type checking + +### Module Scope + +The AWS module handles: +- AWS Organizations and Control Tower setup +- Landing zone implementation +- IAM management and security controls +- EKS cluster deployment +- Integration with AWS services (OpenTelemetry, etc.) + +## Module Architecture + +### Directory Structure + +``` +aws/ +├── __init__.py +├── types.py # TypedDict definitions +├── deploy.py # Deployment logic +├── config.py # Configuration management +├── iam/ # IAM management +├── organizations/ # AWS Organizations +├── eks/ # EKS implementation +└── security/ # Security controls +``` + +### Core Components + +```python +from typing import TypedDict, Optional, List, Dict + +class LandingZone(TypedDict): + """Landing zone configuration. + + Attributes: + name: Landing zone name + email: Account email + ou_path: Organizational unit path + tags: Resource tags + """ + name: str + email: str + ou_path: str + tags: Dict[str, str] + +class AWSOrganizationConfig(TypedDict): + """AWS Organization configuration. + + Attributes: + enabled: Whether the organization is enabled + feature_set: Organization feature set + accounts: List of landing zones + default_tags: Default resource tags + region: AWS region + """ + enabled: bool + feature_set: str + accounts: List[LandingZone] + default_tags: Dict[str, str] + region: str +``` + +## Development Standards + +### Code Quality Requirements + +1. **Type Safety** + - Use TypedDict for configurations + - Enable strict type checking with Pyright + - Implement proper type annotations + +2. **Documentation** + - Comprehensive docstrings + - Clear inline comments + - Up-to-date README files + +3. 
**Testing** + - Unit tests for all functionality + - Integration tests for workflows + - Compliance validation tests + +### Example Implementation + +```python +from typing import Optional, Dict, Any +import pulumi +import pulumi_aws as aws + +def create_compliant_resource( + name: str, + config: Dict[str, Any], + compliance_tags: Optional[Dict[str, str]] = None +) -> pulumi.Resource: + """Create a resource with compliance controls. + + Args: + name: Resource name + config: Resource configuration + compliance_tags: Compliance-related tags + + Returns: + pulumi.Resource: Created resource with compliance controls + """ + if not compliance_tags: + compliance_tags = {} + + # Add required compliance tags + compliance_tags.update({ + "compliance:framework": "nist", + "compliance:control": "AC-2", + "compliance:validated": "true" + }) + + # Create resource with compliance controls + resource = pulumi.Resource( + name, + props={**config, "tags": compliance_tags}, + opts=pulumi.ResourceOptions(protect=True) + ) + + return resource +``` + +## Implementation Guide + +### AWS Organizations Setup + +```python +def create_organization( + config: AWSOrganizationConfig +) -> aws.organizations.Organization: + """Creates an AWS Organization with all features enabled. + + Args: + config: Organization configuration + + Returns: + aws.organizations.Organization: Created organization + """ + organization = aws.organizations.Organization( + "aws_organization", + feature_set=config.get("feature_set", "ALL"), + opts=pulumi.ResourceOptions(protect=True) + ) + return organization +``` + +### Landing Zone Implementation + +```python +def create_landing_zone( + config: LandingZone, + org_id: str +) -> aws.organizations.Account: + """Creates a landing zone account. 
+ + Args: + config: Landing zone configuration + org_id: Organization ID + + Returns: + aws.organizations.Account: Created account + """ + account = aws.organizations.Account( + f"account-{config['name']}", + email=config["email"], + parent_id=org_id, + tags=config["tags"], + opts=pulumi.ResourceOptions(protect=True) + ) + return account +``` + +## Configuration Management + +### TypedDict Configuration + +```python +# Default configuration values +aws_organization_defaults: AWSOrganizationConfig = { + "enabled": True, + "feature_set": "ALL", + "accounts": [], + "default_tags": {}, + "region": "us-west-2" +} +``` + +### Configuration Validation + +```python +def validate_config(config: AWSOrganizationConfig) -> None: + """Validates AWS configuration. + + Args: + config: Configuration to validate + + Raises: + ValueError: If configuration is invalid + """ + if not config["region"]: + raise ValueError("AWS region must be specified") + + for account in config["accounts"]: + if not account["email"]: + raise ValueError("Account email is required") +``` + +## Security and Compliance + +### NIST Controls Implementation + +The AWS module implements the following NIST controls: + +- AC-2: Account Management +- AC-3: Access Enforcement +- AU-2: Audit Events +- CM-6: Configuration Settings + +### Security Best Practices + +1. **Organizations Security** + - Enable AWS Organizations SCP + - Implement least privilege access + - Enable CloudTrail logging + +2. **Resource Protection** + - Enable encryption at rest + - Implement backup policies + - Configure AWS Config rules + +### Example Security Implementation + +```python +def configure_security_controls( + account: aws.organizations.Account +) -> None: + """Configure security controls for an account. 
+ + Args: + account: AWS account to configure + """ + # Enable CloudTrail + trail = aws.cloudtrail.Trail( + "audit-trail", + is_multi_region_trail=True, + include_global_service_events=True, + enable_logging=True, + opts=pulumi.ResourceOptions(parent=account) + ) + + # Configure AWS Config + aws_config = aws.config.Configuration( + "aws-config", + recording_group={ + "all_supported": True, + "include_global_resources": True + }, + opts=pulumi.ResourceOptions(parent=account) + ) +``` + +## Testing and Validation + +### Unit Testing + +```python +import pytest +from pulumi import automation as auto + +def test_organization_creation(): + """Test AWS Organization creation.""" + stack = auto.create_stack(...) + result = stack.up() + + assert "organization_id" in result.outputs + assert result.outputs["organization_id"].value != "" +``` + +### Integration Testing + +```python +def test_landing_zone_deployment( + pulumi_stack: auto.Stack +): + """Test landing zone deployment workflow.""" + result = pulumi_stack.up() + + assert "account_id" in result.outputs + assert "ou_id" in result.outputs +``` + +## Deployment Workflows + +### CI/CD Integration + +```yaml +name: AWS Module Deployment +on: [push, pull_request] + +jobs: + deploy: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v2 + - name: Setup Python + uses: actions/setup-python@v2 + - name: Install Dependencies + run: | + pip install poetry + poetry install + - name: Deploy Infrastructure + run: pulumi up --yes +``` + +## Advanced Topics + +### Cross-Account Access Patterns + +```python +def setup_cross_account_access( + source_account: str, + target_account: str, + role_name: str +) -> aws.iam.Role: + """Set up cross-account access role. 
+ + Args: + source_account: Source AWS account ID + target_account: Target AWS account ID + role_name: Name of the role to create + + Returns: + aws.iam.Role: Created IAM role + """ + assume_role_policy = { + "Version": "2012-10-17", + "Statement": [{ + "Effect": "Allow", + "Principal": { + "AWS": f"arn:aws:iam::{source_account}:root" + }, + "Action": "sts:AssumeRole" + }] + } + + # assume_role_policy must be a JSON string, so serialize the dict (requires import json) + return aws.iam.Role( + role_name, + assume_role_policy=json.dumps(assume_role_policy), + opts=pulumi.ResourceOptions( + provider=aws.Provider(f"provider-{target_account}") + ) + ) +``` + +## Troubleshooting + +### Common Issues + +1. **Organizations Access Denied** + - Ensure proper IAM permissions + - Verify Organization access + - Check SCP configurations + +2. **Landing Zone Deployment Failures** + - Validate email uniqueness + - Check OU path existence + - Verify quota limits + +### Best Practices + +1. **Resource Management** + - Use consistent naming conventions + - Implement proper tagging + - Enable detailed monitoring + +2. **Error Handling** + - Implement proper error handling + - Provide meaningful error messages + - Log deployment failures + +## Future Development + +### Planned Enhancements + +- Complete AWS Organizations integration +- Implement Control Tower automation +- Deploy baseline security controls +- Enhance EKS integration +- Implement cross-account access patterns +- Add advanced monitoring capabilities +- Implement advanced compliance controls +- Add support for AWS Landing Zone + +### Long-term Roadmap + +1. **Infrastructure Optimization** + - Multi-region support + - Disaster recovery automation + - Advanced cost optimization + +2. 
**Security Enhancements** + - Zero-trust architecture + - Advanced threat detection + - Automated incident response + +## References + +- [AWS Organizations Documentation](https://docs.aws.amazon.com/organizations/) +- [AWS Control Tower](https://docs.aws.amazon.com/controltower/) +- [EKS Best Practices](https://aws.github.io/aws-eks-best-practices/) +- [AWS Well-Architected Framework](https://aws.amazon.com/architecture/well-architected/) +- [NIST Compliance](https://aws.amazon.com/compliance/nist/) +- [Pulumi AWS Provider](https://www.pulumi.com/registry/packages/aws/) + +--- + +**Note**: This guide is actively maintained and updated. For the latest changes, refer to the [changelog](./changelog.md) and [implementation roadmap](./implementation_roadmap.md). diff --git a/docs/developer_guide/modules/aws/eks_donor_template.md b/docs/developer_guide/modules/aws/eks_donor_template.md new file mode 100644 index 0000000..ae806e9 --- /dev/null +++ b/docs/developer_guide/modules/aws/eks_donor_template.md @@ -0,0 +1,391 @@ +# Amazon EKS Cluster Template Guide + +## Table of Contents + +1. [Introduction](#introduction) +2. [Architecture Overview](#architecture-overview) +3. [Prerequisites](#prerequisites) +4. [Implementation Guide](#implementation-guide) +5. [Configuration Management](#configuration-management) +6. [Security Controls](#security-controls) +7. [Observability Integration](#observability-integration) +8. [Deployment Workflow](#deployment-workflow) +9. [Testing and Validation](#testing-and-validation) +10. [Best Practices](#best-practices) +11. [Future Enhancements](#future-enhancements) +12. [Troubleshooting](#troubleshooting) +13. [References](#references) + +## Introduction + +This guide provides a comprehensive template for deploying production-ready Amazon EKS clusters using Pulumi and Python. It implements AWS best practices, security controls, and observability patterns while maintaining compliance with organizational standards. 
+ +### Purpose + +- Provide a standardized EKS deployment template +- Implement security best practices +- Enable comprehensive observability +- Ensure compliance with standards +- Support multi-tenant workloads + +### Scope + +- EKS cluster deployment +- VPC and networking setup +- IAM and security configuration +- Logging and monitoring +- Add-on integration (ADOT, Fluent Bit) + +## Architecture Overview + +### Core Components + +```python +from typing import Any, Dict, List, Optional, TypedDict + +class EksClusterConfig(TypedDict): + """EKS cluster configuration structure. + + Attributes: + name: Cluster name + version: Kubernetes version + vpc_config: VPC configuration + node_groups: Node group configurations + addons: Cluster add-ons + """ + name: str + version: str + vpc_config: Dict[str, Any] + node_groups: List[Dict[str, Any]] + addons: Optional[Dict[str, Any]] +``` + +### Network Architecture + +- VPC with public and private subnets +- NAT Gateways for outbound traffic +- VPC Flow Logs for network monitoring +- Security group configuration + +### Security Architecture + +- Private API endpoint +- KMS encryption for secrets +- IAM roles and policies +- Network policies + +## Prerequisites + +- Python 3.8+ +- Pulumi CLI +- AWS CLI configured +- Required permissions: + - EKS cluster creation + - VPC management + - IAM role creation + - KMS key management + +## Implementation Guide + +### VPC Setup + +```python +def create_vpc( + stack_name: str, + cidr_block: str, + azs: List[str] +) -> aws_native.ec2.VPC: + """Create VPC with required components.""" + vpc = aws_native.ec2.VPC( + f"vpc-{stack_name}", + cidr_block=cidr_block, + enable_dns_support=True, + enable_dns_hostnames=True, + tags=[aws_native.ec2.TagArgs( + key="Name", + value=f"vpc-{stack_name}" + )] + ) + return vpc +``` + +### Security Group Configuration + +```python +def create_security_groups( + stack_name: str, + vpc_id: pulumi.Output[str] +) -> Dict[str, aws_native.ec2.SecurityGroup]: + """Create required 
security groups.""" + cluster_sg = aws_native.ec2.SecurityGroup( + f"cluster-sg-{stack_name}", + vpc_id=vpc_id, + description="EKS cluster security group", + tags=[aws_native.ec2.TagArgs( + key="Name", + value=f"cluster-sg-{stack_name}" + )] + ) + return {"cluster": cluster_sg} +``` + +### EKS Cluster Deployment + +```python +def create_eks_cluster( + config: EksClusterConfig, + vpc_id: pulumi.Output[str], + subnet_ids: List[pulumi.Output[str]], + security_groups: Dict[str, aws_native.ec2.SecurityGroup] +) -> aws_native.eks.Cluster: + """Create EKS cluster with configuration.""" + cluster = aws_native.eks.Cluster( + config["name"], + role_arn=config["role_arn"], + version=config["version"], + vpc_config=aws_native.eks.ClusterVpcConfigArgs( + subnet_ids=subnet_ids, + security_group_ids=[security_groups["cluster"].id], + endpoint_private_access=True, + endpoint_public_access=False + ), + encryption_config=[ + aws_native.eks.ClusterEncryptionConfigArgs( + provider=aws_native.eks.ProviderArgs( + key_arn=config["kms_key_arn"] + ), + resources=["secrets"] + ) + ] + ) + return cluster +``` + +## Configuration Management + +### TypedDict Configurations + +```python +class NodeGroupConfig(TypedDict): + name: str + instance_types: List[str] + desired_size: int + min_size: int + max_size: int + disk_size: int + labels: Dict[str, str] + taints: Optional[List[Dict[str, str]]] + +class VpcConfig(TypedDict): + cidr_block: str + public_subnet_cidrs: List[str] + private_subnet_cidrs: List[str] + availability_zones: List[str] +``` + +### Default Values + +```python +eks_defaults: EksClusterConfig = { + "version": "1.26", + "vpc_config": { + "cidr_block": "10.0.0.0/16", + "public_subnet_cidrs": [ + "10.0.0.0/24", + "10.0.1.0/24" + ], + "private_subnet_cidrs": [ + "10.0.2.0/24", + "10.0.3.0/24" + ] + }, + "node_groups": [{ + "name": "default", + "instance_types": ["t3.medium"], + "desired_size": 2, + "min_size": 1, + "max_size": 4 + }] +} +``` + +## Security Controls + +### IAM 
Configuration + +```python +def create_cluster_role( + stack_name: str +) -> aws_native.iam.Role: + """Create IAM role for EKS cluster.""" + return aws_native.iam.Role( + f"eks-cluster-role-{stack_name}", + assume_role_policy_document=json.dumps({ + "Version": "2012-10-17", + "Statement": [{ + "Effect": "Allow", + "Principal": { + "Service": "eks.amazonaws.com" + }, + "Action": "sts:AssumeRole" + }] + }) + ) +``` + +### KMS Configuration + +```python +def create_kms_key( + stack_name: str +) -> aws_native.kms.Key: + """Create KMS key for EKS secrets encryption.""" + return aws_native.kms.Key( + f"eks-kms-key-{stack_name}", + description="KMS key for EKS secrets encryption", + enable_key_rotation=True + ) +``` + +## Observability Integration + +### Fluent Bit Setup + +```python +def deploy_fluent_bit( + provider: k8s.Provider, + namespace: str = "logging" +) -> None: + """Deploy Fluent Bit for log collection.""" + # Implementation details in the full code example +``` + +### ADOT Integration + +```python +def deploy_adot( + provider: k8s.Provider, + namespace: str = "adot-system" +) -> None: + """Deploy AWS Distro for OpenTelemetry.""" + # Implementation details in the full code example +``` + +## Deployment Workflow + +1. Create VPC and networking components +2. Set up security groups and IAM roles +3. Deploy EKS cluster +4. Configure node groups +5. Install cluster add-ons +6. Deploy observability components + +## Testing and Validation + +### Health Checks + +```python +def validate_cluster_health( + cluster_name: str +) -> bool: + """Validate EKS cluster health.""" + try: + cluster = aws.eks.get_cluster(name=cluster_name) + return cluster.status == "ACTIVE" + except Exception as e: + log.error(f"Cluster validation failed: {str(e)}") + return False +``` + +### Integration Tests + +```python +def test_cluster_deployment( + stack: auto.Stack +) -> None: + """Test full cluster deployment.""" + # Implementation details +``` + +## Best Practices + +1. 
**Security** + - Enable private endpoints + - Implement least privilege + - Use KMS encryption + - Enable audit logging + +2. **Networking** + - Use private subnets for nodes + - Implement proper CIDR planning + - Enable VPC Flow Logs + +3. **Observability** + - Deploy ADOT collector + - Configure Fluent Bit + - Enable CloudWatch Container Insights + +## Future Enhancements + +### Short-term + +1. **Advanced Networking** + - VPC CNI customization + - Network policy implementation + - Service mesh integration + +2. **Security Enhancements** + - Pod security policies + - Runtime security + - Image scanning + +### Long-term + +1. **Multi-cluster Management** + - Cluster federation + - Cross-cluster networking + - Centralized operations + +2. **Advanced Observability** + - Custom metrics pipeline + - Automated alerting + - Performance optimization + +## Troubleshooting + +### Common Issues + +1. **Cluster Creation Failures** + - Check IAM permissions + - Verify VPC configuration + - Review security group rules + +2. **Node Group Issues** + - Validate instance types + - Check capacity constraints + - Review launch template + +### Logging and Debugging + +```python +def configure_debug_logging( + level: str = "DEBUG" +) -> None: + """Configure debug logging for troubleshooting.""" + logging.basicConfig( + level=level, + format='%(asctime)s - %(name)s - %(levelname)s - %(message)s' + ) +``` + +## References + +- [EKS Best Practices Guide](https://aws.github.io/aws-eks-best-practices/) +- [AWS Well-Architected Framework](https://aws.amazon.com/architecture/well-architected/) +- [Pulumi AWS Native Provider](https://www.pulumi.com/registry/packages/aws-native/) +- [ADOT Documentation](https://aws-otel.github.io/) +- [Fluent Bit Documentation](https://docs.fluentbit.io/) + +--- + +**Note**: This template is actively maintained and updated. For the latest changes, refer to the [changelog](./changelog.md). 
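The "Implement proper CIDR planning" practice listed under Best Practices can be checked before any resources are created. The following is a minimal sketch using only the Python standard library; `validate_vpc_cidrs` is a hypothetical helper introduced for illustration and is not part of the template, and the CIDR values are copied from the default VPC configuration shown earlier in this guide.

```python
import ipaddress
from typing import List


def validate_vpc_cidrs(vpc_cidr: str, subnet_cidrs: List[str]) -> None:
    """Raise ValueError if a subnet is outside the VPC or overlaps another subnet.

    Hypothetical helper for illustration; not part of the template's code.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]

    # Every subnet must be fully contained in the VPC range.
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            raise ValueError(f"Subnet {subnet} is not inside VPC {vpc}")

    # No two subnets may share addresses.
    for index, first in enumerate(subnets):
        for second in subnets[index + 1:]:
            if first.overlaps(second):
                raise ValueError(f"Subnets {first} and {second} overlap")


# Values copied from the template's default VPC configuration.
validate_vpc_cidrs(
    "10.0.0.0/16",
    ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"],
)
```

Running a check like this ahead of `pulumi up` is one way to surface a bad subnet plan as a local error instead of a failed deployment.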
diff --git a/docs/developer_guide/modules/aws/eks_opentelemetry_docs.md b/docs/developer_guide/modules/aws/eks_opentelemetry_docs.md new file mode 100644 index 0000000..cdc0915 --- /dev/null +++ b/docs/developer_guide/modules/aws/eks_opentelemetry_docs.md @@ -0,0 +1,395 @@ +# AWS Distro for OpenTelemetry (ADOT) Integration Guide + +## Table of Contents + +1. [Introduction](#introduction) +2. [Prerequisites](#prerequisites) +3. [Architecture Overview](#architecture-overview) +4. [Implementation Guide](#implementation-guide) +5. [Configuration Management](#configuration-management) +6. [Deployment Steps](#deployment-steps) +7. [Validation and Testing](#validation-and-testing) +8. [Troubleshooting](#troubleshooting) +9. [Best Practices](#best-practices) +10. [Future Enhancements](#future-enhancements) +11. [References](#references) + +## Introduction + +This guide provides comprehensive documentation for integrating AWS Distro for OpenTelemetry (ADOT) with Amazon EKS clusters in the Konductor platform. ADOT provides a secure, production-ready distribution of the OpenTelemetry project, enabling the collection and export of telemetry data from applications running on Amazon EKS. + +### Purpose + +- Enable comprehensive observability for EKS workloads +- Implement standardized telemetry collection +- Support multiple monitoring backends +- Ensure compliance with observability requirements + +### Scope + +This document covers: +- ADOT Operator installation and configuration +- Collector setup and customization +- Application instrumentation +- Integration with AWS services +- Monitoring and troubleshooting + +## Prerequisites + +- Existing EKS cluster (1.21+) +- Helm 3.x +- kubectl configured for cluster access +- AWS CLI with appropriate permissions +- cert-manager installed (for TLS certificate management) + +### Required Permissions + +```python +class AdotIamConfig(TypedDict): + """IAM configuration for ADOT deployment. 
+ + Attributes: + role_name: Name of the IAM role + namespace: Kubernetes namespace + service_account: Service account name + """ + role_name: str + namespace: str + service_account: str + +adot_iam_defaults: AdotIamConfig = { + "role_name": "adot-collector", + "namespace": "adot-system", + "service_account": "adot-collector" +} +``` + +## Architecture Overview + +### Components + +1. **ADOT Operator** + - Manages collector lifecycle + - Handles configuration updates + - Ensures high availability + +2. **Collector** + - Receives telemetry data + - Processes and transforms data + - Exports to destinations + +3. **Instrumentation** + - Auto-instrumentation injection + - Manual instrumentation support + - Custom instrumentation options + +### Integration Points + +```python +class AdotCollectorConfig(TypedDict): + """ADOT Collector configuration structure. + + Attributes: + mode: Deployment mode (sidecar, daemon, deployment) + replicas: Number of collector replicas + resources: Resource requirements + config: Collector configuration + """ + mode: str + replicas: int + resources: Dict[str, Any] + config: Dict[str, Any] +``` + +## Implementation Guide + +### ADOT Operator Installation + +1. **Create Namespace** + +```python +def create_adot_namespace( + name: str = "adot-system", + labels: Optional[Dict[str, str]] = None +) -> k8s.core.v1.Namespace: + """Create namespace for ADOT components.""" + if not labels: + labels = {"name": name} + + return k8s.core.v1.Namespace( + name, + metadata={ + "name": name, + "labels": labels + } + ) +``` + +2. 
**Deploy Operator** + +```python +def deploy_adot_operator( + config: AdotOperatorConfig, + namespace: str +) -> k8s.helm.v3.Release: + """Deploy ADOT Operator using Helm.""" + return k8s.helm.v3.Release( + "adot-operator", + chart="adot-operator", + namespace=namespace, + repository_opts=k8s.helm.v3.RepositoryOptsArgs( + repo="https://aws.github.io/eks-charts" + ), + version=config["version"], + values=config["values"] + ) +``` + +### Collector Configuration + +Example collector configuration: + +```yaml +apiVersion: opentelemetry.io/v1alpha1 +kind: OpenTelemetryCollector +metadata: + name: adot-collector +spec: + mode: deployment + serviceAccount: adot-collector + config: | + receivers: + otlp: + protocols: + grpc: + endpoint: 0.0.0.0:4317 + http: + endpoint: 0.0.0.0:4318 + + processors: + batch: + timeout: 1s + send_batch_size: 1024 + + exporters: + awsxray: + region: ${AWS_REGION} + awsemf: + region: ${AWS_REGION} + log_group_name: "/aws/containerinsights/${CLUSTER_NAME}/performance" + log_stream_name: "${HOST_NAME}" + + service: + pipelines: + traces: + receivers: [otlp] + processors: [batch] + exporters: [awsxray] + metrics: + receivers: [otlp] + processors: [batch] + exporters: [awsemf] +``` + +### Implementation in Pulumi + +```python +def deploy_adot_collector( + config: AdotCollectorConfig, + namespace: str, + depends_on: Optional[List[pulumi.Resource]] = None +) -> pulumi.CustomResource: + """Deploy ADOT Collector with configuration.""" + return k8s.apiextensions.CustomResource( + "adot-collector", + api_version="opentelemetry.io/v1alpha1", + kind="OpenTelemetryCollector", + metadata={ + "name": "adot-collector", + "namespace": namespace + }, + spec={ + "mode": config["mode"], + "serviceAccount": config["service_account"], + "config": config["config"] + }, + opts=pulumi.ResourceOptions(depends_on=depends_on) + ) +``` + +## Configuration Management + +### TypedDict Configurations + +```python +class AdotConfig(TypedDict): + """Main ADOT configuration structure.""" + operator: AdotOperatorConfig + 
collector: AdotCollectorConfig + iam: AdotIamConfig + +# Default configuration +adot_defaults: AdotConfig = { + "operator": { + "version": "v0.24.0", + "values": { + "serviceAccount": { + "create": True, + "annotations": {} + } + } + }, + "collector": { + "mode": "deployment", + "replicas": 2, + "resources": { + "limits": { + "cpu": "1", + "memory": "2Gi" + }, + "requests": { + "cpu": "200m", + "memory": "400Mi" + } + } + }, + "iam": adot_iam_defaults +} +``` + +## Deployment Steps + +1. **Prepare Environment** + - Create namespace + - Configure IAM roles + - Set up service accounts + +2. **Deploy Components** + - Install ADOT Operator + - Deploy Collector + - Configure auto-instrumentation + +3. **Validate Installation** + - Check component status + - Verify telemetry flow + - Test instrumentation + +## Validation and Testing + +### Health Checks + +```python +def validate_adot_deployment( + namespace: str +) -> bool: + """Validate ADOT deployment status.""" + try: + # Check operator status + operator_status = k8s.core.v1.list_namespaced_pod( + namespace, + label_selector="app=adot-operator" + ) + + # Check collector status + collector_status = k8s.core.v1.list_namespaced_pod( + namespace, + label_selector="app=adot-collector" + ) + + return all([ + operator_status.items[0].status.phase == "Running", + collector_status.items[0].status.phase == "Running" + ]) + except Exception as e: + log.error(f"Validation failed: {str(e)}") + return False +``` + +## Troubleshooting + +### Common Issues + +1. **Collector Not Starting** + - Check IAM roles and permissions + - Verify resource requirements + - Review collector configuration + +2. 
**Data Not Flowing** + - Validate endpoint configuration + - Check network policies + - Review exporter settings + +### Logging + +```python +def configure_adot_logging( + namespace: str, + log_level: str = "info" +) -> None: + """Configure ADOT component logging.""" + k8s.core.v1.ConfigMap( + "adot-logging", + metadata={"namespace": namespace}, + data={ + "collector.yaml": f""" + logging: + level: {log_level} + development: false + encoding: json + """ + } + ) +``` + +## Best Practices + +1. **Resource Management** + - Size collectors appropriately + - Use horizontal scaling + - Implement resource limits + +2. **Security** + - Enable TLS encryption + - Use service accounts + - Implement network policies + +3. **Monitoring** + - Monitor collector health + - Track telemetry pipeline + - Set up alerts + +## Future Enhancements + +### Planned Features + +1. **Advanced Configuration** + - Custom processors + - Additional exporters + - Enhanced filtering + +2. **Integration Improvements** + - Additional AWS services + - Third-party systems + - Custom instrumentation + +### Roadmap Items + +1. **Short-term** + - Performance optimization + - Enhanced auto-instrumentation + - Additional metric support + +2. **Long-term** + - Multi-cluster support + - Advanced sampling + - Custom processors + +## References + +- [AWS Distro for OpenTelemetry Documentation](https://aws-otel.github.io/) +- [OpenTelemetry Documentation](https://opentelemetry.io/docs/) +- [EKS Best Practices](https://aws.github.io/aws-eks-best-practices/) +- [AWS X-Ray Documentation](https://docs.aws.amazon.com/xray/latest/devguide/aws-xray.html) +- [CloudWatch Container Insights](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ContainerInsights.html) + +--- + +**Note**: This document is actively maintained. For updates and changes, refer to the [changelog](./changelog.md). 
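The defaults shown under Configuration Management are most useful when a stack can override individual values without restating the whole structure. The following is a minimal sketch of one way to do that with a recursive merge; `merge_config` is a hypothetical helper introduced for illustration (it is not an ADOT or Pulumi API), and `collector_defaults` copies only the collector portion of the defaults from this guide.

```python
import copy
from typing import Any, Dict


def merge_config(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with overrides applied recursively on top of defaults.

    Hypothetical helper for illustration; neither input is modified.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested mappings key by key instead of replacing them.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Collector portion of the defaults shown in this guide.
collector_defaults: Dict[str, Any] = {
    "mode": "deployment",
    "replicas": 2,
    "resources": {
        "limits": {"cpu": "1", "memory": "2Gi"},
        "requests": {"cpu": "200m", "memory": "400Mi"},
    },
}

# Example override: more replicas and a higher memory limit.
collector = merge_config(
    collector_defaults,
    {"replicas": 3, "resources": {"limits": {"memory": "4Gi"}}},
)
```

Because the merge works on copies, the shared defaults stay unchanged and can be reused across stacks.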
diff --git a/docs/developer_guide/modules/aws/implementation_roadmap.md b/docs/developer_guide/modules/aws/implementation_roadmap.md new file mode 100644 index 0000000..22dc8f8 --- /dev/null +++ b/docs/developer_guide/modules/aws/implementation_roadmap.md @@ -0,0 +1,280 @@ +# AWS Module Implementation Roadmap + +## Introduction + +This comprehensive roadmap outlines the development strategy for implementing a scalable, secure, and compliant AWS infrastructure using the Konductor Infrastructure as Code (IaC) platform. It serves as a technical blueprint for developers working on the AWS module, providing detailed guidance on implementation patterns, compliance integration, and future development plans. + +## Table of Contents + +1. [Overview](#overview) +2. [Technical Architecture](#technical-architecture) +3. [Implementation Phases](#implementation-phases) +4. [Development Standards](#development-standards) +5. [Core Components](#core-components) +6. [Security and Compliance](#security-and-compliance) +7. [Testing Strategy](#testing-strategy) +8. [Future Enhancements](#future-enhancements) +9. 
[References](#references) + +## Overview + +### Objectives + +- Implement cloud-agnostic infrastructure patterns +- Automate AWS Organizations and Control Tower setup +- Enable automated compliance controls +- Provide secure, scalable EKS deployments +- Integrate advanced observability with ADOT + +### Key Features + +- Multi-account strategy +- Landing zone automation +- Compliance-ready infrastructure +- EKS cluster management +- Automated security controls + +## Technical Architecture + +### Core Components + +```python +from typing import Dict, List, TypedDict + +class LandingZone(TypedDict): + name: str + email: str + ou_path: str + tags: Dict[str, str] + +class AWSOrganizationConfig(TypedDict): + enabled: bool + feature_set: str + accounts: List[LandingZone] + default_tags: Dict[str, str] + region: str +``` + +### Directory Structure + +``` +aws/ +├── __init__.py +├── types.py # TypedDict definitions +├── deploy.py # Deployment logic +├── config.py # Configuration management +├── iam/ # IAM management +├── organizations/ # AWS Organizations +├── eks/ # EKS implementation +└── security/ # Security controls +``` + +## Implementation Phases + +### Phase 1: Foundation (Months 1-3) + +#### 1.1 AWS Organizations Setup +- Implement organization creation +- Configure organizational units +- Set up service control policies +- Enable AWS Control Tower (where applicable) + +#### 1.2 IAM Framework +- Implement role management +- Configure permission boundaries +- Set up cross-account access +- Enable identity federation + +#### 1.3 Base Infrastructure +- VPC architecture +- Network segmentation +- Security groups +- Route tables + +### Phase 2: Security and Compliance (Months 4-6) + +#### 2.1 Security Controls +- Enable CloudTrail +- Configure AWS Config +- Implement GuardDuty +- Set up Security Hub + +#### 2.2 Compliance Framework +- Implement NIST controls +- Configure FISMA requirements +- Enable compliance reporting +- Automate security assessments + +### Phase 
3: EKS Implementation (Months 7-9) + +#### 3.1 Cluster Setup +- Base cluster configuration +- Node group management +- Add-on integration +- Network policies + +#### 3.2 Observability +- ADOT integration +- Prometheus setup +- Grafana deployment +- Logging pipeline + +### Phase 4: Advanced Features (Months 10-12) + +#### 4.1 Cost Optimization +- Budget controls +- Resource tagging +- Usage monitoring +- Cost allocation + +#### 4.2 Disaster Recovery +- Backup strategies +- Cross-region replication +- Recovery procedures +- Failover automation + +## Development Standards + +### Code Organization + +```python +# Example module structure +def create_organization( + config: AWSOrganizationConfig +) -> aws.organizations.Organization: + """Creates an AWS Organization with all features enabled.""" + organization = aws.organizations.Organization( + "aws_organization", + feature_set=config.get("feature_set", "ALL"), + opts=pulumi.ResourceOptions(protect=True) + ) + return organization +``` + +### Type Safety + +```python +from typing import TypedDict, Optional + +class SecurityConfig(TypedDict): + """Security configuration structure.""" + encryption_key_rotation: int + backup_retention_days: int + log_retention_days: int +``` + +## Core Components + +### AWS Organizations + +- Multi-account strategy +- Organizational units +- Service control policies +- Account management + +### Landing Zone + +- Account baseline +- Network architecture +- Security controls +- Compliance framework + +### EKS Platform + +- Cluster management +- Node groups +- Add-ons +- ADOT integration + +## Security and Compliance + +### NIST Controls + +- AC-2: Account Management +- AC-3: Access Enforcement +- AU-2: Audit Events +- CM-6: Configuration Settings + +### Implementation Example + +```python +def configure_security_controls( + account: aws.organizations.Account +) -> None: + """Configure security controls for an account.""" + # Enable CloudTrail + trail = aws.cloudtrail.Trail( + "audit-trail", + 
is_multi_region_trail=True, + include_global_service_events=True, + enable_logging=True, + opts=pulumi.ResourceOptions(parent=account) + ) + + # Configure AWS Config + aws_config = aws.config.Configuration( + "aws-config", + recording_group={ + "all_supported": True, + "include_global_resources": True + }, + opts=pulumi.ResourceOptions(parent=account) + ) +``` + +## Testing Strategy + +### Unit Testing + +```python +def test_organization_creation(): + """Test AWS Organization creation.""" + stack = auto.create_stack(...) + result = stack.up() + assert "organization_id" in result.outputs +``` + +### Integration Testing + +```python +def test_landing_zone_deployment( + pulumi_stack: auto.Stack +): + """Test landing zone deployment workflow.""" + result = pulumi_stack.up() + assert "account_id" in result.outputs +``` + +## Future Enhancements + +### Short-term + +1. **Advanced Security Features** + - Zero-trust architecture + - Advanced threat detection + - Automated incident response + +2. **Cost Optimization** + - Resource scheduling + - Spot instance integration + - Reserved capacity management + +### Long-term + +1. **Multi-Region Support** + - Global load balancing + - Cross-region replication + - Disaster recovery automation + +2. **Advanced Compliance** + - FedRAMP integration + - SOC 2 compliance + - PCI DSS controls + +## References + +- [AWS Organizations Documentation](https://docs.aws.amazon.com/organizations/) +- [AWS Control Tower](https://docs.aws.amazon.com/controltower/) +- [EKS Best Practices](https://aws.github.io/aws-eks-best-practices/) +- [NIST Compliance](https://aws.amazon.com/compliance/nist/) +- [Pulumi AWS Provider](https://www.pulumi.com/registry/packages/aws/) diff --git a/docs/getting_started.md b/docs/getting_started.md new file mode 100644 index 0000000..6bd1067 --- /dev/null +++ b/docs/getting_started.md @@ -0,0 +1,165 @@ +# Getting Started with Konductor + +Welcome to Konductor! 
This guide will help you quickly set up and begin using the Konductor Infrastructure as Code (IaC) platform. Whether you're new to DevOps or an experienced platform engineer, this guide provides everything you need to get started. + +## Table of Contents + +1. [Prerequisites](#prerequisites) +2. [Quick Setup](#quick-setup) +3. [Initial Configuration](#initial-configuration) +4. [First Deployment](#first-deployment) +5. [Next Steps](#next-steps) +6. [Troubleshooting](#troubleshooting) +7. [Getting Help](#getting-help) + +## Prerequisites + +Before you begin, ensure you have the following installed: + +- **Python**: Version 3.8 or higher +- **Poetry**: For dependency management ([Install Poetry](https://python-poetry.org/docs/#installation)) +- **Pulumi CLI**: For infrastructure management ([Install Pulumi](https://www.pulumi.com/docs/get-started/install/)) +- **Git**: For version control +- **AWS CLI** (optional): If working with AWS resources +- **kubectl** (optional): If working with Kubernetes resources + +> **Note**: All dependencies are automatically supplied if you use the provided Dev Container with VS Code. + +## Quick Setup + +### 1. Create Your Project + +Choose one of these methods to create your project: + +#### Option A: Use GitHub Template (Recommended) + +1. Visit the [Konductor Template Repository](https://github.com/containercraft/konductor). +2. Click "Use this template." +3. Fill in your repository details. +4. Clone your new repository: + ```bash + git clone https://github.com/your-username/your-project-name.git + cd your-project-name + ``` + +#### Option B: Clone Directly + +```bash +git clone https://github.com/containercraft/konductor.git +cd konductor +``` + +### 2. Set Up Development Environment + +#### Using Dev Container (Recommended) + +1. Install [VS Code](https://code.visualstudio.com/) and the [Remote - Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers). +2. 
Open the project in VS Code. +3. When prompted, click "Reopen in Container." + +#### Manual Setup + +1. Install dependencies with Poetry: + ```bash + poetry install + ``` + +2. Activate the virtual environment: + ```bash + poetry shell + ``` + +## Initial Configuration + +### 1. Configure Pulumi + +1. Log in to Pulumi: + ```bash + pulumi login + ``` + +2. Create a new stack: + ```bash + pulumi stack init dev + ``` + +3. Configure your cloud provider (example for AWS): + ```bash + pulumi config set aws:region us-west-2 + ``` + +### 2. Set Up Cloud Provider Credentials + +#### For AWS: + +```bash +aws configure +``` + +Follow the prompts to enter your AWS credentials. + +> **Security Note**: Always follow your organization's security policies when handling credentials. + +## First Deployment + +1. **Preview Changes**: + ```bash + pulumi preview + ``` + +2. **Deploy Infrastructure**: + ```bash + pulumi up + ``` + +3. **Verify Deployment**: + ```bash + pulumi stack output + ``` + +## Next Steps + +After completing your initial setup, explore these resources: + +1. **[User Guide](./user_guide/README.md)** + - Learn about Konductor's features and capabilities. + - Understand best practices for using the platform. + +2. **[Module Documentation](./modules/README.md)** + - Explore available modules. + - Learn how to use specific modules like AWS or cert-manager. + +3. **[Developer Guide](./developer_guide/README.md)** + - Contribute to Konductor. + - Create custom modules. + +## Troubleshooting + +Common issues and their solutions: + +### Poetry Installation Issues + +**Problem**: Poetry installation fails. +**Solution**: Try installing with pip: + + ```bash + pip install --user poetry + ``` + +### Pulumi Login Issues + +**Problem**: Cannot log in to Pulumi. +**Solution**: Ensure you have created an account at [app.pulumi.com](https://app.pulumi.com) and create a new Personal Access Token (PAT). 
+ +For more troubleshooting help, see our [FAQ and Troubleshooting Guide](./user_guide/faq_and_troubleshooting.md). + +## Getting Help + +- **Documentation**: Browse our [comprehensive documentation](./README.md). +- **Community**: Join our [Discord community](https://discord.gg/Jb5jgDCksX). +- **Issues**: Report problems on our [GitHub Issues](https://github.com/containercraft/konductor/issues). +- **Discussions**: Start a discussion in our [GitHub Discussions](https://github.com/containercraft/konductor/discussions). + +--- + +**Next**: Explore the [User Guide](./user_guide/README.md) to learn more about using Konductor effectively. diff --git a/docs/reference/TypeDict.md b/docs/reference/TypeDict.md new file mode 100644 index 0000000..f859c8a --- /dev/null +++ b/docs/reference/TypeDict.md @@ -0,0 +1,368 @@ +# TypedDict Reference Guide for Konductor + +## Introduction + +This reference guide provides comprehensive documentation on using `TypedDict` within the Konductor Infrastructure as Code (IaC) platform. `TypedDict`, introduced in PEP 589, is central to Konductor's configuration management strategy, providing type-safe dictionary structures that enhance code reliability and maintainability. + +## Table of Contents + +1. [Overview](#overview) + - [Why TypedDict?](#why-typeddict) + - [Benefits in Konductor](#benefits-in-konductor) + - [Integration with Pulumi](#integration-with-pulumi) + +2. [Using TypedDict in Konductor](#using-typeddict-in-konductor) + - [Basic Usage](#basic-usage) + - [Advanced Patterns](#advanced-patterns) + - [Best Practices](#best-practices) + +3. [Configuration Management](#configuration-management) + - [Module Configurations](#module-configurations) + - [Default Values](#default-values) + - [Configuration Validation](#configuration-validation) + +4. 
[Type Checking](#type-checking) + - [Static Analysis with Pyright](#static-analysis-with-pyright) + - [Runtime Considerations](#runtime-considerations) + - [Common Type Issues](#common-type-issues) + +5. [Examples](#examples) + - [Basic Examples](#basic-examples) + - [Module Examples](#module-examples) + - [Complex Configurations](#complex-configurations) + +6. [Troubleshooting](#troubleshooting) + - [Common Issues](#common-issues) + - [Best Practices](#best-practices-1) + - [FAQs](#faqs) + +## Overview + +### Why TypedDict? + +`TypedDict` provides several key advantages for configuration management in Konductor: + +- **Type Safety**: Catch configuration errors at development time +- **IDE Support**: Enhanced autocompletion and type hints +- **Documentation**: Self-documenting configuration structures +- **Validation**: Static type checking for configurations +- **Maintainability**: Clear contract for configuration objects + +### Benefits in Konductor + +In Konductor, `TypedDict` is used to: + +1. Define module configurations +2. Specify resource properties +3. Structure complex data hierarchies +4. Ensure configuration consistency +5. 
Facilitate static analysis + +### Integration with Pulumi + +`TypedDict` integrates seamlessly with Pulumi's Python SDK: + +```python +from typing import TypedDict, List +from pulumi import ResourceOptions + +class ResourceConfig(TypedDict): + name: str + tags: dict + opts: ResourceOptions +``` + +## Using TypedDict in Konductor + +### Basic Usage + +Define configuration schemas using `TypedDict`: + +```python +from typing import TypedDict, Optional + +class ModuleConfig(TypedDict): + enabled: bool + version: Optional[str] + namespace: str + labels: dict + +# Default configuration +module_defaults: ModuleConfig = { + "enabled": False, + "version": None, + "namespace": "default", + "labels": {} +} +``` + +### Advanced Patterns + +#### Nested Configurations + +```python +from typing import List, TypedDict + +class ContainerPort(TypedDict): + containerPort: int + protocol: str + +class Container(TypedDict): + name: str + image: str + ports: List[ContainerPort] + +class PodSpec(TypedDict): + containers: List[Container] +``` + +#### Optional Fields + +```python +from typing import TypedDict, Optional + +class AwsConfig(TypedDict, total=False): + # total=False makes every key in this class optional (it may be omitted) + region: str + profile: Optional[str] # May be omitted, or present with the value None + tags: dict +``` + +### Best Practices + +1. **Use Clear Names**: + ```python + # Good + class NetworkConfig(TypedDict): + vpc_cidr: str + subnet_cidrs: List[str] + + # Avoid + class Config(TypedDict): + cidr: str + subnets: List[str] + ``` + +2. **Document Fields**: + ```python + class DeploymentConfig(TypedDict): + """Configuration for Kubernetes deployments. + + Attributes: + name: Name of the deployment + replicas: Number of desired replicas + image: Container image to deploy + """ + name: str + replicas: int + image: str + ``` + +3. 
**Provide Defaults**: + ```python + deployment_defaults: DeploymentConfig = { + "name": "app", + "replicas": 1, + "image": "nginx:latest" + } + ``` + +## Configuration Management + +### Module Configurations + +Each module defines its configuration schema: + +```python +# modules/aws/types.py +from typing import Dict, Optional, TypedDict + +class AwsModuleConfig(TypedDict): + enabled: bool + region: str + profile: Optional[str] + tags: Dict[str, str] + +# Default configuration +aws_defaults: AwsModuleConfig = { + "enabled": False, + "region": "us-west-2", + "profile": None, + "tags": {} +} +``` + +### Default Values + +Implement default value handling: + +```python +from typing import Dict, Mapping + +def merge_with_defaults( + user_config: Mapping[str, object], + defaults: Mapping[str, object] +) -> Dict[str, object]: + """Merge user configuration with defaults (shallow merge; user values win).""" + # Bare TypedDict is not a valid annotation; any TypedDict is compatible with Mapping[str, object] + config = dict(defaults) + config.update(user_config) + return config +``` + +### Configuration Validation + +Use Pyright for static validation. Configure it in `pyrightconfig.json`: + +```json +{ + "typeCheckingMode": "strict", + "reportUnknownMemberType": true +} +``` + +## Type Checking + +### Static Analysis with Pyright + +Enable strict type checking: + +```json +{ + "include": ["**/*.py"], + "exclude": ["**/__pycache__/**"], + "reportMissingImports": true, + "pythonVersion": "3.8", + "typeCheckingMode": "strict" +} +``` + +### Runtime Considerations + +`TypedDict` performs no runtime validation: + +```python +from typing import cast + +def load_config(config_dict: dict) -> ModuleConfig: + """Load and validate configuration. + Note: TypedDict casting provides no runtime checks. + Add explicit validation if needed. + """ + config = cast(ModuleConfig, config_dict) + validate_config(config) # Add custom validation if needed + return config +``` + +### Common Type Issues + +1. 
**Missing Required Fields**: + ```python + # Error: Missing required 'version', 'namespace', and 'labels' fields + config: ModuleConfig = { + "enabled": True + } + ``` + +2. 
**Incorrect Types**: + ```python + # Error: 'enabled' must be bool + config: ModuleConfig = { + "enabled": "true", # Should be True + "version": None, + "namespace": "default", + "labels": {} + } + ``` + +## Examples + +### Basic Examples + +```python +from typing import TypedDict, List + +class ServiceConfig(TypedDict): + name: str + port: int + replicas: int + +# Usage +service_config: ServiceConfig = { + "name": "web", + "port": 80, + "replicas": 3 +} +``` + +### Module Examples + +```python +from typing import Dict, List, TypedDict + +# AWS VPC Configuration +class VpcConfig(TypedDict): + cidr_block: str + enable_dns: bool + tags: Dict[str, str] + +# Kubernetes Deployment +class DeploymentConfig(TypedDict): + name: str + namespace: str + replicas: int + image: str + ports: List[int] +``` + +### Complex Configurations + +```python +from typing import Dict, TypedDict + +class DatabaseConfig(TypedDict): + engine: str + version: str + size: str + backup: bool + +class ApplicationConfig(TypedDict): + name: str + environment: str + replicas: int + database: DatabaseConfig + features: Dict[str, bool] +``` + +## Troubleshooting + +### Common Issues + +1. **Type Mismatch Errors**: + ```python + # Error: Type mismatch + config: NetworkConfig = { + "vpc_cidr": 10, # Should be str + "subnet_cidrs": ["10.0.0.0/24"] + } + ``` + +2. **Missing Fields**: + ```python + # Error: Missing required fields + config: ServiceConfig = { + "name": "api" # Missing required 'port' and 'replicas' fields + } + ``` + +### Best Practices + +1. Use explicit type annotations +2. Provide default values +3. Document configuration schemas +4. Use static type checking +5. Implement custom validation when needed + +### FAQs + +**Q: When should I use TypedDict vs. dataclass?** +A: Use TypedDict when working with dictionary-like structures, especially for configurations. Use dataclasses for more complex objects with methods. 
+ +**Q: How do I handle optional fields?** +A: Use `total=False` to let keys be omitted. Wrapping a type in `Optional[]` only allows the value `None`; the key itself is still required unless the class is declared with `total=False`. 
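
The two ways of marking a field optional can be sketched as follows. This is a minimal illustration, not Konductor's actual schema: the class names `AwsRequired` and `AwsSettings` are hypothetical, and the pattern shown (a required base class extended by a `total=False` subclass) is one common way to mix required and omittable keys on Python 3.8.

```python
from typing import Dict, Optional, TypedDict


class AwsRequired(TypedDict):
    """Keys that must always be present (hypothetical example schema)."""
    region: str
    profile: Optional[str]  # key is required, but its value may be None


class AwsSettings(AwsRequired, total=False):
    """Inherits the required keys and adds keys that may be omitted."""
    tags: Dict[str, str]


# "tags" is omitted entirely; "profile" must be present even when it is None.
minimal: AwsSettings = {"region": "us-west-2", "profile": None}
tagged: AwsSettings = {
    "region": "us-west-2",
    "profile": "dev",
    "tags": {"team": "platform"},
}

# Read possibly-absent keys with .get() so a missing key does not raise KeyError.
minimal_tags = minimal.get("tags", {})
tagged_tags = tagged.get("tags", {})
print(minimal_tags, tagged_tags)
```

A type checker should accept both dictionaries above, and should reject one that omits `region` or `profile`. At runtime nothing is enforced, so explicit validation is still needed for untrusted input.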
+ +**Q: Can I use TypedDict with Pulumi outputs?** +A: Yes, but be aware that Pulumi outputs are handled differently than regular values. + +## Related Documentation + +- [Python Development Standards](./PULUMI_PYTHON.md) +- [Style Guide](./style_guide.md) +- [Developer Guide](../developer_guide/konductor_developer_guide.md) diff --git a/docs/reference/pulumi_python.md b/docs/reference/pulumi_python.md new file mode 100644 index 0000000..b94ad41 --- /dev/null +++ b/docs/reference/pulumi_python.md @@ -0,0 +1,475 @@ +# Pulumi Python Development Standards + +## Table of Contents + +1. [Introduction](#introduction) +2. [Development Philosophy](#development-philosophy) +3. [Environment Setup](#environment-setup) +4. [Code Organization](#code-organization) +5. [Type Safety](#type-safety) +6. [Configuration Management](#configuration-management) +7. [Development Standards](#development-standards) +8. [Testing Requirements](#testing-requirements) +9. [Documentation Requirements](#documentation-requirements) +10. [Best Practices](#best-practices) +11. [Appendices](#appendices) + +## Introduction + +This document establishes the development standards and best practices for Pulumi Python projects within the Konductor framework. It serves as the authoritative reference for code quality, maintainability, and consistency across all modules and components. + +### Core Principles + +- **Type Safety**: Enforce static type checking for reliability +- **Maintainability**: Write clear, documented, and testable code +- **Consistency**: Follow established patterns and standards +- **Quality**: Prioritize code quality over feature quantity + +## Development Philosophy + +> "Features are nice. Quality is paramount." + +Our development approach emphasizes: + +1. **Code Quality**: + - Comprehensive type checking + - Clear documentation + - Thorough testing + - Consistent style + +2. 
**Developer Experience**: + - Intuitive interfaces + - Clear error messages + - Helpful documentation + - Streamlined workflows + +3. **Maintainability**: + - Modular design + - Single responsibility + - Clear dependencies + - Consistent patterns + +## Environment Setup + +### Prerequisites + +- Python 3.8+ +- Poetry for dependency management +- Pulumi CLI +- Pyright for type checking +- Git for version control + +### Initial Setup + +1. **Install Poetry**: + ```bash + curl -sSL https://install.python-poetry.org | python3 - + ``` + +2. **Configure Poetry**: + ```bash + poetry config virtualenvs.in-project true + ``` + +3. **Initialize Project**: + ```bash + poetry install + poetry shell + ``` + +4. **Configure Pulumi**: + ```yaml + # Pulumi.yaml + name: your-project + runtime: + name: python + options: + toolchain: poetry + typechecker: pyright + ``` + +## Code Organization + +### Directory Structure + +``` +project/ +├── pulumi/ +│ ├── __main__.py +│ ├── core/ +│ │ ├── __init__.py +│ │ ├── config.py +│ │ └── utils.py +│ └── modules/ +│ ├── aws/ +│ │ ├── __init__.py +│ │ ├── types.py +│ │ └── deploy.py +│ └── kubernetes/ +│ ├── __init__.py +│ ├── types.py +│ └── deploy.py +├── tests/ +├── pyproject.toml +├── poetry.lock +└── pyrightconfig.json +``` + +### Module Structure + +Each module should follow this structure: + +```python +# types.py +from typing import TypedDict + +class ModuleConfig(TypedDict): + """Configuration schema for the module.""" + enabled: bool + version: str + parameters: Dict[str, Any] + +# deploy.py +def deploy_module( + config: ModuleConfig, + dependencies: List[Resource] +) -> Resource: + """Deploy module resources.""" + pass +``` + +## Type Safety + +### TypedDict Usage + +1. **Configuration Schemas**: + ```python + from typing import TypedDict, Optional + + class NetworkConfig(TypedDict): + """Network configuration schema.""" + vpc_cidr: str + subnet_cidrs: List[str] + enable_nat: bool + tags: Optional[Dict[str, str]] + ``` + +2. 
**Default Values**: + ```python + network_defaults: NetworkConfig = { + "vpc_cidr": "10.0.0.0/16", + "subnet_cidrs": [], + "enable_nat": True, + "tags": None + } + ``` + +### Type Checking + +1. **Configure Pyright**: + ```json + { + "include": ["**/*.py"], + "exclude": ["**/__pycache__/**"], + "reportMissingImports": true, + "pythonVersion": "3.8", + "typeCheckingMode": "strict" + } + ``` + +2. **Run Type Checking**: + ```bash + poetry run pyright + ``` + +## Configuration Management + +### Configuration Structure + +1. **Module Configurations**: + ```python + class ModuleConfig(TypedDict): + enabled: bool + version: str + parameters: Dict[str, Any] + + def load_config( + module_name: str, + config: pulumi.Config + ) -> ModuleConfig: + """Load and validate module configuration.""" + pass + ``` + +2. **Validation**: + ```python + def validate_config(config: ModuleConfig) -> None: + """Validate configuration values.""" + if not isinstance(config["enabled"], bool): + raise TypeError("enabled must be a boolean") + ``` + +## Development Standards + +### Code Style + +1. **Naming Conventions**: + - Classes: `PascalCase` + - Functions/Variables: `snake_case` + - Constants: `UPPER_SNAKE_CASE` + - Private members: `_leading_underscore` + +2. **Documentation**: + ```python + def create_resource( + name: str, + config: ResourceConfig + ) -> Resource: + """Create a new resource. + + Args: + name: Resource name + config: Resource configuration + + Returns: + Created resource + + Raises: + ValueError: If configuration is invalid + """ + pass + ``` + +3. **Error Handling**: + ```python + try: + resource = create_resource(name, config) + except ValueError as e: + pulumi.log.error(f"Failed to create resource: {e}") + raise + ``` + +### Function Signatures + +1. 
**Type Annotations**: + ```python + from typing import Optional, List, Dict, Any + + def deploy_resources( + configs: List[ResourceConfig], + dependencies: Optional[List[Resource]] = None, + **kwargs: Any + ) -> List[Resource]: + """Deploy multiple resources.""" + pass + ``` + +2. **Return Types**: + ```python + def get_resource_status( + resource_id: str + ) -> Optional[Dict[str, Any]]: + """Get resource status or None if not found.""" + pass + ``` + +## Testing Requirements + +### Unit Tests + +1. **Test Structure**: + ```python + import pytest + from typing import Generator + + @pytest.fixture + def resource_config() -> Generator[ResourceConfig, None, None]: + """Provide test configuration.""" + config = create_test_config() + yield config + cleanup_test_config(config) + + def test_resource_creation( + resource_config: ResourceConfig + ) -> None: + """Test resource creation.""" + result = create_resource("test", resource_config) + assert result.id is not None + ``` + +2. **Mocking**: + ```python + @pytest.fixture + def mock_aws_client(mocker): + """Mock AWS client.""" + return mocker.patch("boto3.client") + ``` + +### Integration Tests + +```python +def test_module_deployment( + pulumi_stack: auto.Stack +) -> None: + """Test full module deployment.""" + result = pulumi_stack.up() + assert result.summary.resource_changes["create"] > 0 +``` + +## Documentation Requirements + +### Code Documentation + +1. **Module Documentation**: + ```python + """AWS networking module. + + This module manages AWS networking resources including VPCs, + subnets, and security groups. + + Example: + ```python + config = NetworkConfig(...) + vpc = create_vpc(config) + ``` + """ + ``` + +2. **Function Documentation**: + ```python + def create_vpc( + config: NetworkConfig, + tags: Optional[Dict[str, str]] = None + ) -> Resource: + """Create a VPC with the specified configuration. 
+ + Args: + config: VPC configuration + tags: Optional resource tags + + Returns: + Created VPC resource + + Raises: + ValueError: If CIDR is invalid + """ + pass + ``` + +## Best Practices + +1. **Resource Management**: + ```python + def create_resources( + configs: List[ResourceConfig] + ) -> List[Resource]: + """Create multiple resources with proper cleanup.""" + resources = [] + try: + for config in configs: + resource = create_resource(config) + resources.append(resource) + return resources + except Exception: + cleanup_resources(resources) + raise + ``` + +2. **Configuration Handling**: + ```python + def load_config( + path: str, + defaults: Dict[str, Any] + ) -> Dict[str, Any]: + """Load configuration with defaults.""" + config = load_yaml(path) + return deep_merge(defaults, config) + ``` + +3. **Error Handling**: + ```python + class ResourceError(Exception): + """Base exception for resource operations.""" + pass + + class ResourceNotFoundError(ResourceError): + """Raised when a resource cannot be found.""" + pass + ``` + +## Appendices + +### A. Common Patterns + +1. **Resource Tags**: + ```python + def get_resource_tags( + name: str, + environment: str, + additional_tags: Optional[Dict[str, str]] = None + ) -> Dict[str, str]: + """Generate standard resource tags.""" + tags = { + "Name": name, + "Environment": environment, + "ManagedBy": "pulumi" + } + if additional_tags: + tags.update(additional_tags) + return tags + ``` + +2. **Resource Names**: + ```python + def generate_resource_name( + base_name: str, + suffix: Optional[str] = None + ) -> str: + """Generate consistent resource names.""" + name = f"{base_name}-{pulumi.get_stack()}" + if suffix: + name = f"{name}-{suffix}" + return name.lower() + ``` + +### B. 
Type Checking Examples + +```python +from typing import TypedDict, Optional, List, Dict, Any + +class ResourceConfig(TypedDict): + """Resource configuration.""" + name: str + type: str + parameters: Dict[str, Any] + tags: Optional[Dict[str, str]] + +def create_resource( + config: ResourceConfig, + dependencies: Optional[List[Resource]] = None +) -> Resource: + """Create a resource with type checking.""" + validate_config(config) + return Resource( + config["name"], + config["parameters"], + opts=ResourceOptions(depends_on=dependencies) + ) +``` + +### C. Testing Patterns + +```python +import pytest +from pulumi import automation as auto + +def test_infrastructure_deployment(): + """Test full infrastructure deployment.""" + stack = auto.create_stack(...) + try: + result = stack.up() + assert result.summary.result == "succeeded" + finally: + stack.destroy() + stack.workspace.remove_stack(stack.name) +``` diff --git a/docs/reference/style_guide.md b/docs/reference/style_guide.md new file mode 100644 index 0000000..4694bf6 --- /dev/null +++ b/docs/reference/style_guide.md @@ -0,0 +1,267 @@ +# Konductor Documentation Style Guide + +## Introduction + +This style guide establishes standards for creating and maintaining documentation within the Konductor project. It ensures consistency, clarity, and accessibility across all documentation while aligning with the project's technical standards outlined in `PULUMI_PYTHON.md`. + +## Table of Contents + +1. [General Principles](#general-principles) +2. [Document Structure](#document-structure) +3. [Writing Style](#writing-style) +4. [Formatting Standards](#formatting-standards) +5. [Code Examples](#code-examples) +6. [Links and References](#links-and-references) +7. [Images and Diagrams](#images-and-diagrams) +8. [Accessibility Guidelines](#accessibility-guidelines) +9. [Version Control](#version-control) +10. 
[File Organization](#file-organization) + +## General Principles + +### Clarity +- Write for your audience's knowledge level +- Define technical terms on first use +- Use consistent terminology throughout +- Avoid jargon unless necessary + +### Completeness +- Include all necessary information +- Provide context for technical concepts +- Link to related documentation +- Include troubleshooting guidance + +### Maintainability +- Keep content modular +- Use relative links +- Follow the DRY (Don't Repeat Yourself) principle +- Regular reviews and updates + +## Document Structure + +### Required Sections + +1. **Title**: Clear, descriptive title using H1 (`#`) +2. **Introduction**: Brief overview of the document's purpose +3. **Table of Contents**: For documents longer than 3 sections +4. **Prerequisites** (if applicable): Required knowledge or setup +5. **Main Content**: Organized in logical sections +6. **Conclusion** (if applicable): Summary or next steps +7. **Related Resources**: Links to related documentation + +### Header Hierarchy + +```markdown +# Document Title (H1) +## Section Title (H2) +### Sub-section Title (H3) +#### Minor Sub-section Title (H4) +``` + +### Metadata Block (Optional) + +```yaml +--- +title: Document Title +description: Brief description of the document +authors: [Author Name] +date: YYYY-MM-DD +version: 0.0.1 +--- +``` + +## Writing Style + +### Voice and Tone +- Use active voice +- Be direct and concise +- Maintain a professional, friendly tone +- Write in present tense + +### Paragraphs +- Keep paragraphs focused on a single topic +- Use short paragraphs (3-5 sentences) +- Include transition sentences between sections + +### Lists +- Use bullet points for unordered lists +- Use numbered lists for sequential steps +- Maintain parallel structure in list items + +## Formatting Standards + +### Text Formatting + +- **Bold**: Use for emphasis and UI elements +- *Italic*: Use for introducing new terms +- `Code`: Use for code snippets, file 
names, and commands +- ***Bold Italic***: Avoid unless absolutely necessary + +### Code Blocks + +- Use triple backticks with language identifier +- Include description of code's purpose +- Add line numbers for longer snippets + +```python +# Example code block +def example_function(): + """Docstring describing the purpose of the function.""" + pass +``` + +### Tables + +- Use tables for structured data +- Include header row +- Align columns consistently + +| Header 1 | Header 2 | Header 3 | +|----------|----------|----------| +| Data | Data | Data | + +## Code Examples + +### General Guidelines + +- Keep examples simple and focused +- Include comments explaining key concepts +- Use meaningful variable and function names +- Follow `PULUMI_PYTHON.md` coding standards + +### Example Structure + +```python +from typing import TypedDict, Optional + +class ConfigExample(TypedDict): + """Example configuration structure. + Attributes: + name: Resource name + enabled: Whether the resource is enabled + """ + name: str + enabled: bool + +def example_function(config: ConfigExample) -> None: + """Example function with type annotations. 
+ + Args: + config: Configuration dictionary + """ + if config.get("enabled", True): + print(f"Resource {config['name']} is enabled.") +``` + +## Links and References + +### Internal Links +- Use relative paths +- Link to specific sections where applicable +- Check links regularly for validity + +### External Links +- Include link text that makes sense out of context +- Add notes for external dependencies +- Consider link stability + +### Cross-References +- Use consistent terminology +- Link to glossary terms +- Reference related documentation + +## Images and Diagrams + +### Requirements +- Use SVG format when possible +- Include alt text for accessibility +- Provide high-resolution versions +- Keep file sizes reasonable + +### Captions +- Include descriptive captions +- Number figures sequentially +- Reference figures in text + +## Accessibility Guidelines + +### Text Content +- Use sufficient color contrast +- Avoid relying solely on color +- Provide text alternatives for media +- Use semantic markup + +### Navigation +- Logical heading structure +- Descriptive link text +- Keyboard navigation support +- Skip navigation links + +### Media +- Alt text for images +- Transcripts for audio +- Captions for video +- Accessible data tables + +## Version Control + +### Commit Messages +- Use clear, descriptive messages +- Reference issue numbers +- Indicate documentation changes +- Follow conventional commits + +Example: + +``` +docs(aws): update EKS cluster setup guide + +* Add troubleshooting section +* Update configuration examples +* Fix broken links + +Fixes: +- #123: Fixed broken link to EKS documentation +``` + +### Branching +- Create feature branches for substantial changes +- Use `docs/` prefix for documentation branches +- Review changes before merging +- Keep documentation in sync with code + +## File Organization + +### Naming Conventions +- Use lowercase with underscores +- Be descriptive but concise +- Include relevant prefixes +- Maintain consistent 
extensions + +### Directory Structure +- Follow the established hierarchy +- Group related documents +- Use README files for navigation +- Maintain clean organization + +### File Templates +- Use standard templates +- Include required sections +- Maintain consistent structure +- Update templates as needed + +--- + +## Implementation Notes + +This style guide should be: +- Referenced during documentation creation +- Updated based on team feedback +- Reviewed quarterly +- Enforced through automation where possible + +## Related Documents + +- [PULUMI_PYTHON.md](./PULUMI_PYTHON.md) +- [TypedDict.md](./TypedDict.md) +- [Contribution Guidelines](../developer_guide/contribution_guidelines.md) diff --git a/docs/roadmaps/ROADMAP.md b/docs/roadmaps/ROADMAP.md new file mode 100644 index 0000000..20539d5 --- /dev/null +++ b/docs/roadmaps/ROADMAP.md @@ -0,0 +1,218 @@ +# Next-Generation Platform Engineering Roadmap + +## Table of Contents + +1. [Introduction](#introduction) +2. [Vision and Goals](#vision-and-goals) +3. [Strategic Pillars](#strategic-pillars) +4. [Technical Architecture](#technical-architecture) +5. [Implementation Phases](#implementation-phases) +6. [Development Standards](#development-standards) +7. [Key Components](#key-components) +8. [Success Metrics](#success-metrics) +9. [Timeline and Milestones](#timeline-and-milestones) +10. [Appendices](#appendices) + +## Introduction + +The Konductor Platform Engineering initiative aims to establish a cloud-agnostic, compliance-ready infrastructure platform that accelerates development while maintaining security and governance. This roadmap outlines our journey from initial setup through full multi-cloud deployment. 
+ +### Overview +- Architecture focuses on modular design with TypedDict-based configurations +- Emphasis on static type checking and automated compliance controls +- Integration points for future cloud provider expansion + +### Objectives +- Clear implementation patterns for new modules +- Standardized approach to configuration management +- Automated testing and validation frameworks + +### Standards +- Well-documented setup procedures +- Example-driven development guides +- Clear path for contribution + +## Vision and Goals + +### Primary Objectives +- Create a cloud-agnostic infrastructure platform +- Automate compliance and security controls +- Enable self-service for application teams +- Reduce time-to-production for new services + +### Key Outcomes +- Reduced manual compliance tasks +- Faster application deployment +- 99.99% automated infrastructure orchestration +- Zero-touch compliance validation + +## Strategic Pillars + +### 1. Cloud-Agnostic Architecture + +```python +# Example Configuration Structure +class CloudProviderConfig(TypedDict): + provider: str # aws, azure, or gcp + region: str + credentials: Dict[str, str] + compliance_level: str +``` + +### 2. Compliance Automation + +```python +# Example Compliance Integration +class ComplianceConfig(TypedDict): + nist_controls: List[str] + fisma_level: str + audit_logging: bool + encryption: Dict[str, str] +``` + +### 3. Developer Experience + +```text +# Example Module Structure +modules/ +├── aws/ +│ ├── types.py # TypedDict definitions +│ ├── deploy.py # Deployment logic +│ └── README.md # Module documentation +``` + +## Technical Architecture + +### Core Components + +1. **Configuration Management** + - TypedDict-based schemas + - Static type checking with Pyright + - Centralized configuration validation + +2. **Module System** + - Pluggable architecture + - Standard interfaces + - Automated testing framework + +3.
**Compliance Framework** + - Policy as Code implementation + - Automated controls + - Audit logging and reporting + +## Implementation Phases + +### Phase 1: Foundations (Months 1-3) +- Set up IaC framework with Pulumi +- Implement core TypedDict configurations +- Establish CI/CD pipelines + +### Phase 2: Core Infrastructure (Months 4-6) +- Deploy base AWS infrastructure +- Implement networking modules +- Set up monitoring and logging + +### Phase 3: Compliance Integration (Months 7-9) +- Implement NIST controls +- Set up FISMA compliance +- Automate compliance reporting + +### Phase 4: Multi-Cloud Expansion (Months 10-12) +- Add Azure support +- Implement GCP integration +- Cross-cloud networking + +## Development Standards + +### Code Organization +```python +# Standard Module Structure +class ModuleConfig(TypedDict): + enabled: bool + version: str + parameters: Dict[str, Any] + +def deploy_module( + config: ModuleConfig, + dependencies: List[Resource] +) -> Resource: + """Deploy module with standard interface.""" + pass +``` + +### Testing Requirements +- Unit tests for all modules +- Integration tests for workflows +- Compliance validation tests + +## Key Components + +### 1. Account Structure +- Multi-account strategy +- Role-based access control +- Resource organization + +### 2. Infrastructure Components +- Networking +- Compute resources +- Storage solutions +- Security controls + +### 3. 
Compliance Framework +- Policy definitions +- Control mapping +- Audit mechanisms + +## Success Metrics + +### Technical Metrics +- Deployment success rate +- Infrastructure drift percentage +- Test coverage +- Type checking compliance + +### Business Metrics +- Time to deployment +- Cost optimization +- Compliance achievement +- Developer satisfaction + +## Timeline and Milestones + +- Complete Phase 1: Foundations +- Establish development standards +- Initial AWS implementation + +- Complete Phase 2: Core Infrastructure +- Deploy first production workload +- Achieve initial compliance targets + +- Complete Phase 3: Compliance Integration +- Full NIST compliance +- Automated security controls + +- Complete Phase 4: Multi-Cloud Expansion +- Azure and GCP integration +- Cross-cloud operations + +## Appendices + +### A. Technical Specifications +- Python 3.10+ +- Pulumi latest version +- AWS/Azure/GCP SDKs + +### B. Compliance Requirements +- NIST 800-53 +- FISMA Moderate +- ISO 27001 + +### C. Reference Architectures +- AWS Landing Zone +- Azure Landing Zone +- GCP Organization + +### D. Development Tools +- Poetry for dependency management +- Pyright for type checking +- pytest for testing diff --git a/docs/roadmaps/ROADMAP_Addendum.md b/docs/roadmaps/ROADMAP_Addendum.md new file mode 100644 index 0000000..bc1359b --- /dev/null +++ b/docs/roadmaps/ROADMAP_Addendum.md @@ -0,0 +1,417 @@ +# Konductor IaC Template Repository Refactor and Enhancement Roadmap + +> **Technical Blueprint Addendum** + +## Table of Contents + +1. [Executive Summary](#executive-summary) +2. [Technical Implementation Details](#technical-implementation-details) +3. [Module Development Guidelines](#module-development-guidelines) +4. [Infrastructure Components](#infrastructure-components) +5. [Security and Compliance](#security-and-compliance) +6. [Testing Strategy](#testing-strategy) +7. [Documentation Requirements](#documentation-requirements) +8. 
[Deployment Workflows](#deployment-workflows) +9. [Monitoring and Observability](#monitoring-and-observability) +10. [Risk Management](#risk-management) + +## Executive Summary + +This technical addendum provides detailed implementation guidance for the Konductor Platform Engineering initiative. It serves as a comprehensive reference for engineers at all levels, with a specific focus on technical implementation details and best practices. + +### Audience-Specific Guidance + +#### Principal Engineers +- Architectural decision records +- System design considerations +- Performance optimization strategies +- Scalability patterns + +#### Senior Engineers +- Implementation patterns +- Code quality standards +- Testing strategies +- Module development guidelines + +#### Junior Engineers +- Setup procedures +- Development workflows +- Testing practices +- Documentation requirements + +## Technical Implementation Details + +### Core Architecture Components + +#### 1. Configuration Management + +```python +from typing import Any, Dict, List, Optional, TypedDict + +class BaseConfig(TypedDict): + """Base configuration for all modules.""" + enabled: bool + version: Optional[str] + parameters: Dict[str, Any] + tags: Dict[str, str] + +class ModuleConfig(BaseConfig): + """Extended configuration for specific modules.""" + dependencies: List[str] + providers: Dict[str, Any] + compliance: Dict[str, Any] + +# Example implementation +def load_module_config( + module_name: str, + config: Dict[str, Any] +) -> ModuleConfig: + """Load and validate module configuration.""" + base_config = get_base_config(module_name) + return merge_configs(base_config, config) +``` + +#### 2.
Resource Management + +```python +class ResourceManager: + """Manages infrastructure resources.""" + + def __init__(self, config: ModuleConfig): + self.config = config + self.resources: List[Resource] = [] + + def create_resource( + self, + resource_type: str, + resource_config: Dict[str, Any] + ) -> Resource: + """Create and track a new resource.""" + resource = self._create_resource_internal( + resource_type, + resource_config + ) + self.resources.append(resource) + return resource +``` + +### Module Integration Framework + +#### 1. Standard Module Interface + +```python +from abc import ABC, abstractmethod +from typing import Any, Dict, List, Optional + +class ModuleInterface(ABC): + """Base interface for all modules.""" + + @abstractmethod + def deploy( + self, + config: ModuleConfig, + dependencies: Optional[List[Resource]] = None + ) -> Resource: + """Deploy module resources.""" + pass + + @abstractmethod + def validate(self) -> bool: + """Validate module configuration.""" + pass + + @abstractmethod + def cleanup(self) -> None: + """Clean up module resources.""" + pass +``` + +## Module Development Guidelines + +### Module Structure + +``` +module_name/ +├── __init__.py +├── types.py +├── deploy.py +├── config.py +├── utils.py +├── tests/ +│ ├── __init__.py +│ ├── test_deploy.py +│ └── test_config.py +└── README.md +``` + +### Implementation Standards + +#### 1. Type Definitions + +```python +from typing import Any, Dict, List, Optional, TypedDict + +class ModuleResourceConfig(TypedDict): + """Resource configuration for the module.""" + name: str + type: str + parameters: Dict[str, Any] + tags: Optional[Dict[str, str]] + +class ModuleDeploymentConfig(TypedDict): + """Deployment configuration for the module.""" + resources: List[ModuleResourceConfig] + dependencies: Optional[List[str]] + providers: Dict[str, Any] +``` + +#### 2.
Deployment Logic + +```python +def deploy_module( + config: ModuleDeploymentConfig, + dependencies: Optional[List[Resource]] = None +) -> Resource: + """Deploy module resources with proper error handling.""" + try: + validate_config(config) + resources = create_resources(config) + return resources + except Exception as e: + handle_deployment_error(e) + raise +``` + +## Infrastructure Components + +### AWS Infrastructure + +#### 1. Network Architecture + +```python +class NetworkConfig(TypedDict): + """Network configuration structure.""" + vpc_cidr: str + subnet_cidrs: List[str] + availability_zones: List[str] + nat_gateways: int + +def create_network_stack( + config: NetworkConfig, + tags: Dict[str, str] +) -> NetworkStack: + """Create VPC and associated networking components.""" + vpc = create_vpc(config["vpc_cidr"], tags) + subnets = create_subnets(vpc, config["subnet_cidrs"]) + return NetworkStack(vpc, subnets) +``` + +#### 2. Security Components + +```python +class SecurityConfig(TypedDict): + """Security configuration structure.""" + encryption_key_rotation: int + backup_retention_days: int + log_retention_days: int + +def configure_security( + config: SecurityConfig, + resources: List[Resource] +) -> None: + """Apply security configurations to resources.""" + for resource in resources: + apply_encryption(resource, config) + configure_backup_retention(resource, config) + setup_logging(resource, config) +``` + +## Security and Compliance + +### Compliance Framework + +#### 1. NIST Controls Implementation + +```python +class NistControl(TypedDict): + """NIST control implementation structure.""" + control_id: str + implementation: str + validation: str + monitoring: str + +def implement_nist_controls( + controls: List[NistControl], + resources: List[Resource] +) -> None: + """Implement NIST controls on resources.""" + for control in controls: + apply_control(control, resources) + validate_control(control, resources) +``` + +#### 2.
Audit Logging + +```python +class AuditConfig(TypedDict): + """Audit logging configuration.""" + log_level: str + retention_period: int + encryption_enabled: bool + +def setup_audit_logging( + config: AuditConfig, + resources: List[Resource] +) -> None: + """Configure audit logging for resources.""" + logger = create_audit_logger(config) + for resource in resources: + enable_resource_logging(resource, logger) +``` + +## Testing Strategy + +### Automated Testing Framework + +#### 1. Unit Tests + +```python +import pytest +from typing import Generator + +@pytest.fixture +def module_config() -> Generator[ModuleConfig, None, None]: + """Provide test configuration.""" + config = create_test_config() + yield config + cleanup_test_config(config) + +def test_module_deployment( + module_config: ModuleConfig +) -> None: + """Test module deployment process.""" + result = deploy_module(module_config) + assert result.status == "SUCCESS" + validate_deployment(result) +``` + +#### 2. Integration Tests + +```python +def test_cross_module_integration( + module_a_config: ModuleConfig, + module_b_config: ModuleConfig +) -> None: + """Test integration between modules.""" + module_a = deploy_module(module_a_config) + module_b = deploy_module(module_b_config) + validate_integration(module_a, module_b) +``` + +## Documentation Requirements + +### Technical Documentation + +#### 1. Module Documentation Template + +````markdown +# Module Name + +## Overview +[Brief description of the module's purpose] + +## Configuration +```python +class ModuleConfig(TypedDict): + # Configuration structure + pass +``` + +## Usage Examples +[Code examples showing common use cases] + +## Implementation Details +[Technical details about the implementation] + +## Testing +[Instructions for testing the module] +```` + +## Deployment Workflows + +### CI/CD Pipeline + +#### 1.
Build and Test + +```yaml +# GitHub Actions workflow +name: Build and Test +on: [push, pull_request] + +jobs: + build: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v2 + - name: Set up Python + uses: actions/setup-python@v2 + with: + python-version: '3.10' + - name: Install dependencies + run: | + python -m pip install --upgrade pip + pip install poetry + poetry install + - name: Run tests + run: poetry run pytest +``` + +## Monitoring and Observability + +### Telemetry Implementation + +#### 1. Metrics Collection + +```python +class MetricsConfig(TypedDict): + """Metrics collection configuration.""" + namespace: str + dimensions: Dict[str, str] + period: int + +def setup_metrics( + config: MetricsConfig, + resources: List[Resource] +) -> None: + """Configure metrics collection for resources.""" + metrics_client = create_metrics_client(config) + for resource in resources: + enable_resource_metrics(resource, metrics_client) +``` + +## Risk Management + +### Risk Mitigation Strategies + +#### 1. Deployment Safeguards + +```python +class DeploymentSafeguards(TypedDict): + """Deployment safety configuration.""" + rollback_enabled: bool + validation_timeout: int + max_retry_attempts: int + +def apply_deployment_safeguards( + config: DeploymentSafeguards, + deployment: Deployment +) -> None: + """Apply safety measures to deployment.""" + configure_rollback(deployment, config) + set_validation_checks(deployment, config) + configure_retry_policy(deployment, config) +``` + +This technical addendum provides detailed implementation guidance while maintaining alignment with the main roadmap. It serves as a comprehensive reference for engineers at all levels, ensuring consistent implementation across the platform.
diff --git a/pulumi/CALL_TO_ACTION.md b/pulumi/CALL_TO_ACTION.md deleted file mode 100644 index 93dc579..0000000 --- a/pulumi/CALL_TO_ACTION.md +++ /dev/null @@ -1,169 +0,0 @@ -### Refactoring Enhancements to Consider - -**Modular Design**: - -- Core functionalities are segregated into distinct files/modules, such as `config.py`, `deployment.py`, `resource_helpers.py`, etc. -- Each module follows a clear pattern with separate `types.py` and `deploy.py` files. - -**Configuration Management**: - -- Centralize configuration management using `config.py` to handle global settings. -- Use data classes for module configurations to ensure type safety and defaults. - -**Global Metadata Handling**: - -- Implementation of a singleton pattern for managing global metadata (labels and annotations). -- Functions to generate and apply global metadata. - -**Consistency and Readability**: - -- The existing TODO comments highlight areas needing reorganization and refactoring. -- Some modules including `kubevirt`, `cert_manager`, `hostpath_provisioner` and others deploy differently in terms of resource creation and dependency management, look for ways to improve consistency. - -**Centralized Configuration Loading**: - -- Configuration loading and merging logic vary across modules. -- There is redundancy in fetching the latest versions for modules (e.g., `kubevirt`, `cert_manager`). Look for ways to reduce version fetching redundancy. - -**Exception Handling**: - -- Exception handling is partially implemented in some places, consistent and detailed error handling across all modules will improve reliability. - -**Resource Helper Centralization**: - -- Several helper functions like `create_namespace`, `create_custom_resource`, etc., provide common functionality but could be standardized further. -- Handling dependencies and resource transformations could be more DRY (Don't Repeat Yourself). 
- -**Standardize Configuration Management**: - -- Refactor configuration management to ensure consistency across all modules. -- Implement a common pattern for fetching the latest versions and configuration merging. - -__Refactor `initialize_pulumi` Function__: - -- Use data classes or named tuples instead of dictionaries for initialization components. -- Centralize and streamline initialization logic to reduce redundancy. - -**Enhance Error Handling and Logging**: - -- Implement structured logging and consistent error handling across all the modules. -- Ensure all relevant operations are logged, and errors are informative. - -**Simplify Function Signatures and Improve Type Safety**: - -- Refactor function signatures to use data classes and named tuples. This will improve readability and maintainability. - -**Centralize Shared Logic**: - -- Standardize and centralize shared logic like version fetching, resource transformation, and compliance metadata generation. -- Use utility functions from `utils.py` and refactor repetitive logic across `deploy.py` files. - -### Implementation Examples - -#### Centralize Configuration Handling - -Refactor the `load_default_versions` function and adopt it across all modules: - -```python -# centralize logic in core/config.py -def load_default_versions(config: pulumi.Config, force_refresh=False) -> dict: - ... 
- # reuse this function for fetching specific versions in modules - return default_versions - -# example usage in kubevirt/types.py -@staticmethod -def get_latest_version() -> str: - return load_default_versions(pulumi.Config()).get('kubevirt', 'latest') -``` - -#### Standardize Initialization Method - -Refactor `initialize_pulumi` in `deployment.py`: - -```python -from typing import NamedTuple - -class PulumiInit(NamedTuple): - config: pulumi.Config - stack_name: str - project_name: str - default_versions: Dict[str, str] - versions: Dict[str, str] - configurations: Dict[str, Dict[str, Any]] - global_depends_on: List[pulumi.Resource] - k8s_provider: k8s.Provider - git_info: Dict[str, str] - compliance_config: ComplianceConfig - global_labels: Dict[str, str] - global_annotations: Dict[str, str] - -def initialize_pulumi() -> PulumiInit: - ... - # use PulumiInit named tuple for returning components - return PulumiInit(...) -``` - -Update main entry (`__main__.py`) to use the tuple: - -```python -def main(): - try: - init = initialize_pulumi() - - # Use named tuple instead of dictionary - config = init.config - k8s_provider = init.k8s_provider - versions = init.versions - configurations = init.configurations - default_versions = init.default_versions - global_depends_on = init.global_depends_on - - ... - except Exception as e: - log.error(f"Deployment failed: {str(e)}") - raise -``` - -#### Enhance Logging and Error Handling - -Standardize logging across the modules: - -```python -import logging - -logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s') - -def deploy_module(...): - try: - ... 
- except ValueError as ve: - log.error(f"Value error during deployment: {ve}") - raise - except Exception as e: - log.error(f"General error during deployment: {e}") - raise -``` - -#### Improve Reusability of Helper Functions - -Refactor `resource_helpers.py` to adopt utility functions for setting metadata and transformations: - -```python -def universal_resource_transform(resource_args: pulumi.ResourceTransformationArgs): - props = resource_args.props - set_resource_metadata(props.get('metadata', {}), get_global_labels(), get_global_annotations()) - return pulumi.ResourceTransformationResult(props, resource_args.opts) -``` - -### Adopt universal transforms in more places: - -```python -def create_custom_resource(..., transformations: Optional[List] = None, ...): - ... - opts = pulumi.ResourceOptions.merge( - ... # include universal transformations - transformations=[universal_resource_transform] + (transformations or []) - ) -``` - diff --git a/pulumi/COMPLIANCE.md b/pulumi/COMPLIANCE.md deleted file mode 100644 index f71dc63..0000000 --- a/pulumi/COMPLIANCE.md +++ /dev/null @@ -1,392 +0,0 @@ -# COMPLIANCE.md - ---- - -## Table of Contents - -1. [Introduction](#introduction) -2. [Motivation](#motivation) -3. [Compliance Strategy](#compliance-strategy) - - [Objectives](#objectives) - - [Key Innovations](#key-innovations) - -4. [Implementation Details](#implementation-details) - - [Compliance Metadata Management](#compliance-metadata-management) - - [Automatic Propagation Mechanism](#automatic-propagation-mechanism) - - [Resource Tagging and Labeling](#resource-tagging-and-labeling) - - [Configuration Schema](#configuration-schema) - - [Version Management](#version-management) - - [Stack Outputs and Reporting](#stack-outputs-and-reporting) - -5. 
[Developer Expectations and Best Practices](#developer-expectations-and-best-practices) - - [Module Autonomy](#module-autonomy) - - [Integration with Compliance Framework](#integration-with-compliance-framework) - - [Coding Standards](#coding-standards) - -6. [User Experience (UX)](#user-experience-ux) - - [Simplified Configuration](#simplified-configuration) - - [Deployment Workflow](#deployment-workflow) - - [Validation and Error Handling](#validation-and-error-handling) - -7. [Business Stakeholder Value](#business-stakeholder-value) - - [Accelerated Time-to-Compliance](#accelerated-time-to-compliance) - - [Auditability and Transparency](#auditability-and-transparency) - - [Risk Reduction](#risk-reduction) - -8. [Conclusion](#conclusion) -9. [Appendix](#appendix) - - [Glossary](#glossary) - - [References](#references) - ---- - -## Introduction - -This document outlines the comprehensive compliance strategy implemented in the **Konductor Infrastructure as Code (IaC) Template Repository**. It details how the codebase is designed to **reduce the time necessary to achieve production-ready compliance and authority to operate**, while also minimizing the overhead associated with compliance maintenance and renewal audits. The document serves as a benchmark and guiding framework for developing and maintaining compliance features within the Konductor IaC codebase. - ---- - -## Motivation - -In the modern regulatory landscape, organizations face increasing pressure to comply with various standards such as **NIST**, **FISMA**, **ISO 27001**, and industry-specific regulations like **HIPAA** and **PCI DSS**. Achieving and maintaining compliance is often a resource-intensive process due to: - -- **Complexity of Compliance Requirements**: Navigating overlapping and evolving regulations. -- **Dynamic Infrastructure**: Rapid changes in cloud environments and deployment practices. 
-- **High Audit Costs**: Significant time and financial resources required for compliance audits and renewals. -- **Human Error Risk**: Manual processes increase the likelihood of misconfigurations leading to non-compliance. -- **Muddy Compliance Traceability**: Difficulty in tracking compliance status across multiple environments and resources. - -**Our Goal**: To **innovate** within the IaC domain by creating a codebase that automates compliance tasks, reduces human error, and provides a clear path to achieving and maintaining compliance with minimal overhead. - ---- - -## Compliance Strategy - -### Objectives - -1. **Automate Compliance Integration**: Embed compliance controls and metadata into the IaC workflow. -2. **Modular Autonomy**: Allow module maintainers to define and manage compliance aspects within their specialty areas. -3. **Centralized Compliance Metadata**: Collect and manage compliance-related information centrally for consistency and ease of access. -4. **Simplify Auditing Processes**: Provide comprehensive outputs that facilitate easy auditing and reporting. -5. **Enhance Developer and User Experience**: Reduce the complexity and mental overhead associated with compliance tasks. - -### Key Innovations - -- **Pydantic-Based Configuration Models**: Utilizing Pydantic for type-safe, validated configurations that allow for complex, nested compliance metadata. -- **Automatic Propagation of Compliance Metadata**: Implementing mechanisms to automatically propagate compliance intent through resource tags, labels, and annotations across all providers. -- **Dynamic Module Integration**: Enabling modules to autonomously define their compliance requirements while integrating seamlessly with the core compliance framework. -- **Comprehensive Stack Outputs**: Aggregating compliance data into easily consumable outputs for both technical and non-technical stakeholders. 
- ---- - -## Implementation Details - -### Compliance Metadata Management - -**Centralized Metadata Collection**: - -- **Configuration Dictionaries**: Compliance-related metadata is defined in a central configuration file (e.g., `Pulumi..yaml`) under a dedicated `compliance` section. -- **Metadata Types**: - - **Regulatory Controls**: NIST, FISMA, ISO control identifiers. - - **Component Versions**: Versions of deployed components and dependencies. - - **Source Control Information**: Git repository URLs, commit hashes, branches. - - **Identity Information**: Cloud provider identities (e.g., AWS STS `GetCallerIdentity`, Kubernetes User and Service Account, etc. outputs). - - **Organizational Metadata**: Owner information, environment tags, project identifiers. - -**Example Configuration**: - -```yaml -config: - compliance: - nist: - controls: - - "AC-2" - - "IA-5" - exceptions: - - "AC-2(1)" - - "IA-5(1)" - fisma: - level: "moderate" - iso_27001: - controls: - - "A.9.2.3" - - "A.11.2.1" - controls: [] - organization: "magic-science-team" - environment: "production" - owner: "compliance-team@nasa.com" - project_id: "proj-12345" -``` - -### Automatic Propagation Mechanism - -**Mechanism Overview**: - -- **Metadata Injection**: Compliance metadata is injected into resources during their creation within each module. -- **Uniform Interfaces**: Modules interact with compliance metadata through standardized interfaces provided by the core codebase. -- **Dynamic Discovery**: Modules dynamically discover compliance configurations relevant to them, promoting autonomy and flexibility. - -**Technical Implementation**: - -- **Core Compliance Module** (`core/compliance.py`): - - Provides functions to access compliance metadata. - - Supplies utility functions to format and apply metadata to resources. - -- **Module Integration**: - - Modules import the core compliance utilities. - - Apply compliance metadata during resource instantiation. 
- -**Example in Module Deployment**: - -```python -from core.compliance import get_compliance_tags - -def deploy_aws_module( - config: AWSConfig, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], - ) -> pulumi.Resource: - aws_provider = providers.get('aws') - compliance_tags = get_compliance_tags() - s3_bucket = aws.s3.Bucket( - resource_name='my_bucket', - bucket='my-unique-bucket-name', - tags=compliance_tags, - opts=pulumi.ResourceOptions( - provider=aws_provider, - depends_on=global_depends_on, - ) - ) - return s3_bucket -``` - -### Resource Tagging and Labeling - -**AWS Resources**: - -- Tags are applied to resources using the `tags` argument. -- Supports tagging for resources like EC2 instances, S3 buckets, Lambda functions, etc. - -**Kubernetes Resources**: - -- Labels and annotations are applied to resources via the `metadata` field. -- Useful for compliance-related labeling in deployments, services, pods, etc. - -**Example Tags and Labels**: - -- **Tags**: - - `compliance:nist-controls=AC-2,IA-5` - - `compliance:owner=compliance-team@example.com` - - `compliance:project-id=proj-12345` - -- **Labels**: - - `compliance/nist-controls: "AC-2,IA-5"` - - `compliance/owner: "compliance-team@example.com"` - - `compliance/project-id: "proj-12345"` - -### Configuration Schema - -**Pydantic Models**: - -- Each module defines its configuration schema using Pydantic models (`types.py`), which include compliance-related fields as needed. -- Validation ensures that compliance metadata adheres to expected formats and values. 
- -**Example Pydantic Model**: - -```python -from pydantic import BaseModel, Field - -class ComplianceConfig(BaseModel): - nist_controls: List[str] = Field(default_factory=list) - fisma_moderate: bool = False - iso_27001_controls: List[str] = Field(default_factory=list) - organization: str - environment: str - owner: str - project_id: str -``` - -### Version Management - -**Module Versioning**: - -- **Kubernetes Modules**: Utilize version locking mechanisms to ensure specific versions of Helm charts and resources are deployed. -- **Cloud Provider Modules**: Rely on SDK versions specified in `requirements.txt`; internal version management within configurations is unnecessary. - -**Version Reporting**: - -- Component versions are collected and included in compliance metadata. -- Enables traceability and auditing of deployed component versions. - -### Stack Outputs and Reporting - -**Aggregated Outputs**: - -- Compliance metadata, configuration details, and versions are aggregated and exposed via stack outputs. -- Outputs are structured in a way that is consumable by both technical tools and non-technical stakeholders. 
- -**Example Stack Output**: - -```bash -pulumi stack output compliance_report -``` - -```json -{ - "nist_controls": ["AC-2", "IA-5"], - "fisma": { - "moderate": true, - }, - "iso_27001_controls": ["A.9.2.3", "A.11.2.1"], - "organization": "magic-science-team", - "environment": "production", - "owner": "compliance-team@example.com", - "project_id": "proj-12345", - "component_versions": { - "aws": "3.40.0", - "kubernetes": "2.8.0", - "cert_manager": "v1.5.3" - }, - "source_control": { - "repository": "https://github.com/magic-science-team/fork-konductor-template-to-make-new-proj", - "branch": "main", - "commit": "abc123def456" - }, - "identity": { - "aws_account_id": "123456789012", - "aws_user_arn": "arn:aws:iam::123456789012:user/DeployUser" - } -} -``` - -**Reporting Tools Integration**: - -- Stack outputs can be consumed by reporting tools, compliance dashboards, or exported to formats like CSV, JSON, or integrated with SIEM systems. -- Facilitates automated compliance checks and monitoring. - ---- - -## Developer Expectations and Best Practices - -### Module Autonomy - -- **Specialty Ownership**: Module maintainers have full authority over their module's configuration structure and implementation. -- **Compliance Integration**: Modules are expected to integrate compliance metadata according to the standards provided by the core compliance framework. - -### Integration with Compliance Framework - -- **Accessing Compliance Metadata**: Modules should use the core compliance utilities to access and apply compliance metadata. -- **Consistent Application**: Ensure that compliance metadata is applied uniformly across all resources within the module. -- **Validation**: Leverage Pydantic models to validate compliance-related configurations. 
- -**Example**: - -```python -from core.compliance import get_compliance_annotations - -def deploy_kubernetes_module( - config: KubernetesConfig, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], - ) -> pulumi.Resource: - k8s_provider = providers.get('k8s') - compliance_annotations = get_compliance_annotations() - deployment = k8s.apps.v1.Deployment( - resource_name='my_app', - metadata={ - 'name': 'my-app', - 'annotations': compliance_annotations, - }, - spec={ - # Deployment spec... - }, - opts=pulumi.ResourceOptions( - provider=k8s_provider, - depends_on=global_depends_on, - ) - ) - return deployment -``` - -### Coding Standards - -- **Type Annotations**: Use type hints and Pydantic models for configurations. -- **Documentation**: Document compliance integration points in module `README.md` files. -- **Error Handling**: Provide clear error messages for compliance-related validation errors. -- **Avoid Hardcoding**: Do not hardcode compliance metadata; always use the centralized configurations. - ---- - -## User Experience (UX) - -### Simplified Configuration - -- **Single Source of Truth**: Users define compliance configurations in one place, reducing complexity. -- **Default Values**: Sensible defaults are provided, minimizing the required input from users. -- **Validation Feedback**: Immediate validation feedback helps users correct configurations before deployment. - -### Deployment Workflow - -- **Seamless Integration**: Compliance features are integrated into the deployment workflow without additional steps. -- **Visibility**: Users can view applied compliance metadata through resource tags, labels, and stack outputs. -- **Customization**: Users can customize compliance settings to fit their organizational requirements. - -### Validation and Error Handling - -- **Pydantic Validation**: Configurations are validated using Pydantic, ensuring type safety and correctness. 
-- **Clear Error Messages**: Users receive detailed error messages that pinpoint configuration issues. -- **Examples and Documentation**: Modules provide examples and documentation to guide users in configuring compliance settings. - ---- - -## Business Stakeholder Value - -### Accelerated Time-to-Compliance - -- **Reduced Implementation Time**: Automation of compliance tasks speeds up the deployment process. -- **Pre-Built Compliance Controls**: Out-of-the-box compliance integrations reduce the need for custom development. -- **Quick Adaptation**: Ability to quickly adapt to new compliance requirements by updating configurations. - -### Auditability and Transparency - -- **Comprehensive Reporting**: Stack outputs provide a complete picture of compliance status. -- **Traceability**: Source control and version metadata enable tracking of changes over time. -- **Evidence for Audits**: Resource tags and labels serve as evidence of compliance measures in place. - -### Risk Reduction - -- **Minimized Human Error**: Automated compliance reduces the risk of misconfigurations leading to non-compliance. -- **Consistent Application**: Ensures compliance controls are applied uniformly across all infrastructure components. -- **Regulatory Alignment**: Simplifies alignment with regulatory requirements, reducing the risk of penalties or sanctions. - ---- - -## Conclusion - -The Konductor IaC codebase embodies a strategic approach to compliance management by integrating compliance considerations into every layer of the infrastructure provisioning process. By leveraging innovative techniques such as Pydantic for configuration validation and automatic propagation of compliance metadata, we have significantly reduced the time and effort required to achieve and maintain compliance. This not only accelerates the path to production-ready deployments but also provides business stakeholders with the tools and transparency needed for effective compliance governance. 
- ---- - -## Appendix - -### Glossary - -- **IaC (Infrastructure as Code)**: The practice of managing and provisioning infrastructure through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. -- **NIST (National Institute of Standards and Technology)**: A U.S. federal agency that develops and promotes measurement, standards, and technology. -- **FISMA (Federal Information Security Management Act)**: U.S. legislation that defines a comprehensive framework to protect government information, operations, and assets. -- **ISO 27001**: An international standard for managing information security. -- **Pydantic**: A Python library for data parsing and validation using Python type annotations. -- **SDK (Software Development Kit)**: A collection of software development tools in one installable package. - -### References - -- **NIST Cybersecurity Framework**: [https://www.nist.gov/cyberframework](https://www.nist.gov/cyberframework) -- **FISMA Compliance**: [https://csrc.nist.gov/topics/laws-and-regulations/laws/fisma](https://csrc.nist.gov/topics/laws-and-regulations/laws/fisma) -- **ISO/IEC 27001 Information Security Management**: [https://www.iso.org/isoiec-27001-information-security.html](https://www.iso.org/isoiec-27001-information-security.html) -- **Pulumi Documentation**: [https://www.pulumi.com/docs/](https://www.pulumi.com/docs/) -- **Pydantic Documentation**: [https://pydantic-docs.helpmanual.io/](https://pydantic-docs.helpmanual.io/) -- __AWS STS GetCallerIdentity__: [https://docs.aws.amazon.com/STS/latest/APIReference/API_GetCallerIdentity.html](https://docs.aws.amazon.com/STS/latest/APIReference/API_GetCallerIdentity.html) - ---- - -**Note**: This document serves as a living reference for the compliance strategy within the Konductor IaC codebase. 
It should be updated as new compliance requirements emerge and as the codebase evolves to meet the changing needs of the organization and regulatory landscape. diff --git a/pulumi/PYDANTIC.md b/pulumi/PYDANTIC.md deleted file mode 100644 index 1a9de73..0000000 --- a/pulumi/PYDANTIC.md +++ /dev/null @@ -1,242 +0,0 @@ -# **Pydantic in Konductor** - -**Pydantic** is a Python library used for **data validation** and **settings management** using Python type annotations. In the Konductor IaC codebase, Pydantic plays a crucial role in ensuring that module configurations are type-safe, valid, and easy to manage. - ---- - -## Table of Contents - -1. [Introduction](#introduction) -2. [Why Use Pydantic in Konductor?](#why-use-pydantic-in-konductor) -3. [Getting Started with Pydantic](#getting-started-with-pydantic) -4. [Defining Configuration Models](#defining-configuration-models) -5. [Validation and Error Handling](#validation-and-error-handling) -6. [Advanced Features](#advanced-features) -7. [Practical Example in Konductor](#practical-example-in-konductor) -8. [Best Practices](#best-practices) -9. [Conclusion](#conclusion) - ---- - -## Introduction - -In automation and cloud operations, handling configurations and data from various sources is a common task. Ensuring that this data is accurate and correctly structured prevents errors and makes applications more robust. Pydantic leverages Python's type hints to provide data validation and parsing, making it an ideal choice for managing module configurations in Konductor. - ---- - -## Why Use Pydantic in Konductor? - -- **Type Safety**: Enforces data types, reducing runtime errors due to type mismatches. -- **Data Validation**: Automatically validates configuration data, ensuring it meets the required criteria. -- **Ease of Use**: Integrates seamlessly with Python's type annotations and has a simple syntax. -- **Customization**: Allows for complex nested configurations and custom validation logic. 
-- **Error Reporting**: Provides clear and informative error messages, improving the developer and user experience. - ---- - -## Getting Started with Pydantic - -### Installation - -Ensure Pydantic is included in your project's `requirements.txt`: - -```bash -pydantic>=1.8.2 -``` - -Install the dependencies: - -```bash -pip install -r requirements.txt -``` - ---- - -## Defining Configuration Models - -Each module in Konductor defines its configuration schema using Pydantic models in `types.py`. - -**Example**: - -```python -# modules/aws/types.py - -from pydantic import BaseModel, Field, validator -from typing import Optional, List, Dict, Any - -class AWSConfig(BaseModel): - enabled: bool = False - profile: Optional[str] = None - region: str - account_id: Optional[str] = None - landingzones: List[Dict[str, Any]] = Field(default_factory=list) - # ... other fields ... - - @validator('region') - def validate_region(cls, v): - if v not in ['us-east-1', 'us-west-2', 'eu-central-1']: - raise ValueError('Unsupported region specified') - return v -``` - -**Explanation**: - -- **Fields**: Defined with type annotations and optional default values. -- **Validators**: Custom validation logic to enforce specific rules. - ---- - -## Validation and Error Handling - -When configurations are loaded, Pydantic validates the data and raises `ValidationError` if any issues are found. - -**Example**: - -```python -import pulumi -from pydantic import ValidationError - -try: - config_obj = AWSConfig(**module_config_dict) -except ValidationError as e: - pulumi.log.error(f"Configuration error in AWS module:\n{e}") - raise -``` - -**Error Output**: - -``` -Configuration error in AWS module: -1 validation error for AWSConfig -region - Unsupported region specified (type=value_error) -``` - ---- - -## Advanced Features - -### Nested Models - -Modules can define complex configurations using nested models.
- -```python -# modules/cert_manager/types.py - -from pydantic import BaseModel - -class IssuerConfig(BaseModel): - name: str - email: str - -class CertManagerConfig(BaseModel): - enabled: bool = False - version: str = "latest" - namespace: str = "cert-manager" - issuer: IssuerConfig -``` - -### Custom Validators - -Custom validators can enforce complex validation logic. - -```python -from pydantic import validator - -class CertManagerConfig(BaseModel): - # ... fields ... - - @validator('version') - def check_version_format(cls, v): - if not v.startswith('v'): - raise ValueError('Version must start with "v"') - return v -``` - -### Environment Variables - -Pydantic models can read values from environment variables. - -```python -from pydantic import BaseSettings - -class Settings(BaseSettings): - debug: bool = False - aws_region: str - - class Config: - env_prefix = 'KONDUCTOR_' # Environment variables start with KONDUCTOR_ - -settings = Settings() -``` - ---- - -## Practical Example in Konductor - -**Scenario**: Validate the configuration for the `kubevirt` module. - -**Configuration Schema (Pulumi YAML)**: - -```yaml -kubevirt: - enabled: true - version: "v0.34.0" - namespace: "kubevirt" - features: - liveMigration: true - cpuManager: false -``` - -**Configuration Model**: - -```python -# modules/kubevirt/types.py - -from pydantic import BaseModel - -class FeaturesConfig(BaseModel): - liveMigration: bool = False - cpuManager: bool = False - -class KubeVirtConfig(BaseModel): - enabled: bool = False - version: str = "latest" - namespace: str = "kubevirt" - features: FeaturesConfig -``` - -**Usage in Deployment Function**: - -```python -def deploy_kubevirt( - config: KubeVirtConfig, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], -) -> pulumi.Resource: - # Access configuration values directly - version = config.version - namespace = config.namespace - live_migration_enabled = config.features.liveMigration - # Deployment logic... 
-``` - ---- - -## Best Practices - -1. **Define Clear Schemas**: Use Pydantic models to define clear and explicit configuration schemas. -2. **Provide Defaults**: Set default values for optional fields to simplify user configurations. -3. **Validate Early**: Perform validation as soon as configurations are loaded to catch errors early. -4. **Use Custom Validators**: Implement custom validators for complex validation rules. -5. **Document Configurations**: Clearly document all configuration fields and their expected values in module `README.md` files. -6. **Handle Errors Gracefully**: Provide informative error messages to assist users in correcting configuration issues. - ---- - -## Conclusion - -Integrating Pydantic into the Konductor IaC codebase enhances both developer and user experiences by providing robust configuration management. It ensures configurations are validated and type-safe, reducing runtime errors and simplifying debugging. By following the best practices outlined in this document, developers can create modules that are reliable, maintainable, and user-friendly. - ---- - -For further reading and advanced usage of Pydantic, refer to the [official Pydantic documentation](https://pydantic-docs.helpmanual.io/). diff --git a/pulumi/README.md b/pulumi/README.md deleted file mode 100644 index 4fbef03..0000000 --- a/pulumi/README.md +++ /dev/null @@ -1,485 +0,0 @@ -# Kargo KubeVirt Kubernetes PaaS - Pulumi Python Infrastructure as Code (IaC) - -Welcome to the **Konductor DevOps Template Infrastructure as Code (IaC) new project template**! This guide is designed to help both newcomers to DevOps and experienced module developers navigate, contribute to, and get the most out of the Kargo platform. Whether you're setting up your environment for the first time or looking to develop new modules, this guide provides comprehensive instructions and best practices. - -# TODO: Convert this template docs from Kargo origins to general to Konductor template repo docs. 
- ---- - -## Table of Contents - -- [Introduction](#introduction) -- [Developer & Architecture Ethos](#developer--architecture-ethos) - - [Prime Directive](#prime-directive) - - [Developer Directives](#developer-directives) - -- [Getting Started](#getting-started) - - [Prerequisites](#prerequisites) - - [Setting Up Your Environment](#setting-up-your-environment) - -- [Developer Imperatives](#developer-imperatives) - - [Detailed Breakdown](#detailed-breakdown) - -- [Developing New Modules](#developing-new-modules) - - [Directory Structure](#directory-structure) - - [Creating a New Module](#creating-a-new-module) - -- [Common Utilities](#common-utilities) -- [Version Control](#version-control) -- [Contributing to the Project](#contributing-to-the-project) -- [Additional Resources](#additional-resources) -- [Conclusion](#conclusion) - ---- - -## Introduction - -The Kargo KubeVirt Kubernetes PaaS project leverages Pulumi and Python to manage your Kubernetes infrastructure as code. Our goal is to provide an enjoyable developer experience (DX) and user experience (UX) by simplifying the deployment and management of Kubernetes resources, including KubeVirt virtual machines and other essential components. - -This guide aims to make core concepts accessible to everyone, regardless of their experience level in DevOps. - ---- - -## Developer & Architecture Ethos - -### Prime Directive - -> **"Features are nice. Quality is paramount."** - -Quality is not just about the product or code; it's about creating an enjoyable developer and user experience. At ContainerCraft, we believe that the success of open-source projects depends on the happiness and satisfaction of the community developers and users. - -### Developer Directives - -1. **Improve Code Maintainability**: Write code that is structured, organized, and easy to understand. Prioritize readability, reusability, and extensibility. -2. 
**Optimize Performance**: Ensure that the code performs efficiently and respects configurations. Avoid executing inactive or unnecessary code. -3. **Establish Standard Practices**: Develop consistent approaches to configuration handling, module deployment, and code organization to guide future development. - ---- - -## Getting Started - -### Prerequisites - -Before you begin, make sure you have the following installed: - -- **Pulumi CLI**: [Install Pulumi](https://www.pulumi.com/docs/get-started/) -- **Python 3.6+**: Ensure you have Python installed on your system. -- **Python Dependencies**: Install required Python packages using `pip install -r requirements.txt` -- **Kubernetes Cluster**: Access to a Kubernetes cluster with `kubectl` configured. -- **Helm CLI**: [Install Helm](https://helm.sh/docs/intro/install/) if you plan to work with Helm charts. - -### Setting Up Your Environment - -Follow these steps to set up your environment: - -1. **Clone the Repository** - -```bash -git clone https://github.com/ContainerCraft/Kargo.git -cd Kargo/pulumi -``` - -2. **Install Python Dependencies** - -```bash -pip install -r requirements.txt -``` - -3. **Initialize Pulumi Stack** - -```bash -pulumi stack init dev -``` - -4. **Configure Pulumi** - -Set your Kubernetes context and any necessary configuration options. - -```bash -pulumi config set --path kubernetes.kubeconfig -# Set other configuration options as needed -``` - -5. **Deploy the Stack** - -Preview and deploy your changes. - -```bash -pulumi up -``` - -Follow the prompts to confirm the deployment. - ---- - -## Developer Imperatives - -### Detailed Breakdown - -1. **User Experience (UX)** - -- **Clear Error Messages**: Provide meaningful error messages to help users resolve issues. - -- **Uniform Logging**: Use consistent logging practices to make debugging easier. - -```python -pulumi.log.info(f"Deploying module: {module_name}") -``` - -2. 
**Developer Experience (DX)** - -- **Documentation**: Include comprehensive docstrings and comments in your code. - -```python -def deploy_module(...): - """ - Deploys a module based on configuration. - - Args: - module_name (str): Name of the module. - config (pulumi.Config): Pulumi configuration object. - ... - - Returns: - None - """ -``` - -- **Examples**: Provide example configurations and usage in the documentation to help others understand how to use your code. - -3. **Configurable Modules** - -- **Pulumi Stack Configuration**: Use the Pulumi config object to allow users to customize module configurations. - -```python -module_config = config.get_object("module_name") or {} -``` - -4. **Module Data Classes** - -- **Typed Data Classes**: Use `dataclass` to encapsulate configurations clearly. - -```python -from dataclasses import dataclass - -@dataclass -class KubeVirtConfig: - namespace: str = "default" -``` - -5. **Sane Defaults in Data Classes** - -- **Sensible Defaults**: Set reasonable default values to minimize the need for user configuration. - -```python -@dataclass -class CertManagerConfig: - namespace: str = "cert-manager" - install_crds: bool = True -``` - -6. **User Configuration Handling** - -- **Merge Configurations**: Combine user-provided configurations with defaults to ensure all necessary parameters are set. - -```python -@staticmethod -def merge(user_config: Dict[str, Any]) -> 'CertManagerConfig': - default_config = CertManagerConfig() - for key, value in user_config.items(): - if hasattr(default_config, key): - setattr(default_config, key, value) - else: - pulumi.log.warn(f"Unknown configuration key '{key}' in cert_manager config.") - return default_config -``` - -7. **Simple Function Signatures** - -- **Reduce Parameters**: Keep function signatures minimal by encapsulating configurations within data classes. - -```python -def deploy_module(config_module: ModuleConfig, ...) -``` - -8. 
**Type Annotations** - -- **Enhance Readability**: Use type annotations to clarify expected parameter types and return values. - -```python -def deploy_module(module_name: str, config: pulumi.Config) -> None: -``` - -9. **Safe Function Signatures** - -- **Type Safety**: Use consistent type checks and raise meaningful errors when types don't match expectations. - -```python -if not isinstance(module_name, str): - raise TypeError("module_name must be a string") -``` - -10. **Streamlined Entrypoint** - -- **Encapsulate Logic**: Keep the top-level code minimal and encapsulate logic within functions. - -```python -if __name__ == "__main__": - main() -``` - -11. **Reuse and Deduplicate Code** - -- **Central Utilities**: Place reusable code in the `core` module to maintain consistency and reduce duplication. - -```python -from core.utils import sanitize_label_value, extract_repo_name -``` - -12. **Version Control Dependencies** - -- **Manage Versions**: Control component versions within configuration files to maintain consistency across deployments. - -```python -default_versions = load_default_versions(config) -``` - -13. **Transparency** - -- **Informative Outputs**: Export configuration and version information for visibility and auditing. - -```python -pulumi.export("versions", versions) -``` - -14. **Conditional Execution** - -- **Avoid Unnecessary Execution**: Only load and execute modules that are enabled in the configuration. - -```python -if module_enabled: - deploy_func(...) -``` - -15. **Remove Deprecated Code** - - - **Maintain a Clean Codebase**: Remove obsolete features and update code to align with current best practices. - ---- - -## Developing New Modules - -### Directory Structure - -Maintain a consistent directory structure for new modules: - -``` -kargo/ - pulumi/ - __main__.py - requirements.txt - core/ - __init__.py - utils.py - ... - modules/ - / - __init__.py - deploy.py - types.py - README.md - ... -``` - -### Creating a New Module - -1. 
**Define Configuration** - -Create a `types.py` file in your module directory to define the configuration data class: - -```python -from dataclasses import dataclass, field -from typing import Optional, Dict, Any - -import pulumi - -@dataclass -class NewModuleConfig: - version: Optional[str] = None - namespace: str = "default" - labels: Dict[str, str] = field(default_factory=dict) - annotations: Dict[str, Any] = field(default_factory=dict) - - @staticmethod - def merge(user_config: Dict[str, Any]) -> 'NewModuleConfig': - default_config = NewModuleConfig() - for key, value in user_config.items(): - if hasattr(default_config, key): - setattr(default_config, key, value) - else: - pulumi.log.warn(f"Unknown configuration key '{key}' in new_module config.") - return default_config -``` - -2. **Implement Deployment Logic** - -Define the deployment logic in `deploy.py`: - -```python -import pulumi -import pulumi_kubernetes as k8s -from typing import List, Dict, Any, Tuple, Optional - -from core.metadata import get_global_labels, get_global_annotations -from core.resource_helpers import create_namespace -from .types import NewModuleConfig - -def deploy_new_module( - config_new_module: NewModuleConfig, - global_depends_on: List[pulumi.Resource], - k8s_provider: k8s.Provider, - ) -> Tuple[Optional[str], Optional[pulumi.Resource]]: - # Create Namespace - namespace_resource = create_namespace( - name=config_new_module.namespace, - labels=config_new_module.labels, - annotations=config_new_module.annotations, - k8s_provider=k8s_provider, - depends_on=global_depends_on, - ) - - # Implement specific resource creation logic - # ... - - return config_new_module.version, namespace_resource -``` - -3.
__Update `__main__.py`__ - -Include your module in the main deployment script: - -```python -from typing import List, Dict, Any -import pulumi -from pulumi_kubernetes import Provider - -from core.deployment import initialize_pulumi, deploy_modules -from core.config import export_results - -def main(): - try: - init = initialize_pulumi() - - config = init["config"] - k8s_provider = init["k8s_provider"] - versions = init["versions"] - configurations = init["configurations"] - default_versions = init["default_versions"] - global_depends_on = init["global_depends_on"] - - modules_to_deploy = ["cert_manager", "kubevirt", "new_module"] # Add your module here - - deploy_modules( - modules=modules_to_deploy, - config=config, - default_versions=default_versions, - global_depends_on=global_depends_on, - k8s_provider=k8s_provider, - versions=versions, - configurations=configurations, - ) - - compliance_config = init.get("compliance_config", {}) - export_results(versions, configurations, compliance_config) - - except Exception as e: - pulumi.log.error(f"Deployment failed: {str(e)}") - raise - -if __name__ == "__main__": - main() -``` - -4. **Document Your Module** - -Create a `README.md` file in your module directory to document its purpose, configuration options, and usage instructions. - -```markdown -# New Module - -Description of your module. - -## Configuration - -- **version** *(string)*: The version of the module to deploy. -- **namespace** *(string)*: The Kubernetes namespace where the module will be deployed. -- **labels** *(dict)*: Custom labels to apply to resources. -- **annotations** *(dict)*: Custom annotations to apply to resources. - -## Usage - -Example of how to configure and deploy the module. - -## Additional Information - -Any additional details or resources. -``` - ---- - -## Common Utilities - -Refer to `core/utils.py` for common helper functions, such as applying global labels and annotations to resources.
- -```python -import re -import pulumi -import pulumi_kubernetes as k8s -from typing import Dict, Any - -def set_resource_metadata(metadata: Any, global_labels: Dict[str, str], global_annotations: Dict[str, str]): - if isinstance(metadata, dict): - metadata.setdefault('labels', {}).update(global_labels) - metadata.setdefault('annotations', {}).update(global_annotations) - elif isinstance(metadata, k8s.meta.v1.ObjectMetaArgs): - metadata.labels = {**metadata.labels or {}, **global_labels} - metadata.annotations = {**metadata.annotations or {}, **global_annotations} - -def sanitize_label_value(value: str) -> str: - value = value.lower() - sanitized = re.sub(r'[^a-z0-9_.-]', '-', value) - sanitized = re.sub(r'^[^a-z0-9]+', '', sanitized) - sanitized = re.sub(r'[^a-z0-9]+$', '', sanitized) - return sanitized[:63] -``` - ---- - -## Version Control - -Manage module versions and dependencies within configuration files, such as `default_versions.json`, to ensure consistency across deployments. - -```json -{ - "cert_manager": "1.15.3", - "kubevirt": "1.3.1", - "new_module": "0.1.0" -} -``` - ---- - -## Contributing to the Project - -We welcome contributions from the community! Here's how you can help: - -- **Report Issues**: If you encounter any bugs or have feature requests, please open an issue on GitHub. -- **Submit Pull Requests**: If you'd like to contribute code, fork the repository and submit a pull request. -- **Improve Documentation**: Help us enhance this guide and other documentation to make it more accessible. 
- ---- - -## Additional Resources - -- **Kargo Project Repository**: [ContainerCraft Kargo on GitHub](https://github.com/ContainerCraft/Kargo) -- **Pulumi Documentation**: [Pulumi Official Docs](https://www.pulumi.com/docs/) -- **Kubernetes Documentation**: [Kubernetes Official Docs](https://kubernetes.io/docs/home/) -- **KubeVirt Documentation**: [KubeVirt Official Docs](https://kubevirt.io/docs/) diff --git a/pulumi/ROADMAP.md b/pulumi/ROADMAP.md deleted file mode 100644 index 8365582..0000000 --- a/pulumi/ROADMAP.md +++ /dev/null @@ -1,806 +0,0 @@ -# Comprehensive Konductor IaC Template Repository Refactor and Enhancement Roadmap - -## Table of Contents - -1. [Introduction](#introduction) -2. [Objectives](#objectives) -3. [Overview of the Current Codebase](#overview-of-the-current-codebase) -4. [Proposed Solution](#proposed-solution) - - [Part 1: Refactoring AWS Module to Align with Kubernetes Modules](#part-1-refactoring-aws-module-to-align-with-kubernetes-modules) - - [Part 2: Adjusting Version Handling to Be Exclusive to Kubernetes Modules](#part-2-adjusting-version-handling-to-be-exclusive-to-kubernetes-modules) - - [Part 3: Improving Configuration Schema and Type System Using Pydantic](#part-3-improving-configuration-schema-and-type-system-using-pydantic) - -5. [Detailed Implementation Steps](#detailed-implementation-steps) - - [Part 1 Steps](#part-1-steps) - - [Part 2 Steps](#part-2-steps) - - [Part 3 Steps](#part-3-steps) - -6. [Additional Considerations](#additional-considerations) -7. [Conclusion](#conclusion) -8. [Appendix: Understanding Pydantic](#appendix-understanding-pydantic) - ---- - -## Introduction - -This document serves as a comprehensive roadmap and educational resource for refactoring and enhancing the Konductor Infrastructure as Code (IaC) codebase. 
It is designed to guide junior developers through the process of aligning the AWS module with existing Kubernetes modules, adjusting version handling mechanisms, and improving the configuration schema using Pydantic. Each step is thoroughly explained to ensure a deep understanding of the reasoning behind the changes. - ---- - -## Objectives - -- **Consistency**: Align the AWS module structure and deployment logic with that of the Kubernetes modules. -- **Modular Version Handling**: Ensure that version locking mechanisms are exclusive to Kubernetes modules. -- **Enhanced Configuration Management**: Implement Pydantic for configuration models to improve type safety and validation. -- **Extensibility**: Prepare the codebase for future support of other cloud providers like GCP and Azure. -- **Educational Resource**: Provide detailed explanations to serve as a learning tool for junior developers. - ---- - -## Overview of the Current Codebase - -### Current Situation - -- **Kubernetes Modules**: - - - Each module resides under `pulumi/modules//`. - - Contains `types.py` for configuration, `deploy.py` for deployment logic, and `README.md` for documentation. - - Deployment functions have a consistent signature and return values. - - Dynamic module discovery and deployment are handled via `core/deployment.py` and `__main__.py`. - -- **AWS Module**: - - - Does not conform to the structure of Kubernetes modules. - - Has a different code organization and deployment function signature. - - Integration with the core module and dynamic discovery is inconsistent. - - Version handling is not aligned with the strategy used for Kubernetes modules. - -### Issues Identified - -- **Inconsistency**: The AWS module's structure and deployment logic differ from the standard pattern established by the Kubernetes modules. 
-- **Version Handling**: Version interfaces and locking mechanisms are present in the AWS module but are unnecessary since cloud provider modules rely on SDK versions specified in `requirements.txt`. -- **Configuration Complexity**: The configuration schema lacks flexibility and type safety, making it difficult for module maintainers to define complex configurations. - ---- - -## Proposed Solution - -### Part 1: Refactoring AWS Module to Align with Kubernetes Modules - -- **Goal**: Restructure the AWS module to match the directory and code structure of Kubernetes modules, ensuring consistency across the codebase. -- **Actions**: - - Move AWS module files under `pulumi/modules/aws/`. - - Define configuration data classes in `types.py` without unnecessary version attributes. - - Update the deployment logic in `deploy.py` to match the standard function signature. - - Modify `__main__.py` and core modules to include and deploy the AWS module consistently. - -### Part 2: Adjusting Version Handling to Be Exclusive to Kubernetes Modules - -- **Goal**: Ensure that version locking mechanisms are exclusive to Kubernetes modules and remove unnecessary version interfaces from cloud provider modules like AWS. -- **Actions**: - - Update `core/config.py` and `core/deployment.py` to handle versioning exclusively for Kubernetes modules. - - Adjust module discovery functions to accommodate modules with and without versioning. - - Remove version handling code from cloud provider modules. - - Optionally, implement a utility to extract versions from `requirements.txt` for logging purposes. - -### Part 3: Improving Configuration Schema and Type System Using Pydantic - -- **Goal**: Enhance the configuration schema by using Pydantic models for type safety, validation, and flexibility, allowing module maintainers to define complex configurations independently. -- **Actions**: - - Integrate Pydantic into the project. 
- - Each module defines its own Pydantic configuration model in `types.py`. - - Centralize configuration loading and validation in `core/config.py`. - - Update deployment functions to use validated configuration objects. - - Provide clear error reporting and documentation. - ---- - -## Detailed Implementation Steps - -### Part 1 Steps - -#### Step 1: Restructure the AWS Module Directory - -**Action**: Move all AWS module files under `pulumi/modules/aws/` to mirror the structure of Kubernetes modules. - -**Reasoning**: This ensures consistency in the codebase, making it easier for developers to navigate and maintain modules. - -**Implementation**: - -- Create the directory `pulumi/modules/aws/` if it doesn't exist. -- Move existing AWS module files into this directory: - - `types.py`: Defines configuration data classes. - - `deploy.py`: Contains deployment logic. - - `README.md`: Provides module documentation. - -- Ensure the directory contains an `__init__.py` file to make it a Python package. - -#### Step 2: Define Configuration Data Classes in `types.py` - -**Action**: Create a configuration data class `AWSConfig` in `pulumi/modules/aws/types.py` without a `version` attribute. - -**Reasoning**: Cloud provider modules do not require version locking in the configuration since their versions are managed via `requirements.txt`. 
- -**Implementation**: - -```python -# pulumi/modules/aws/types.py - -from pydantic import BaseModel, root_validator -from typing import Optional, List, Dict, Any - -class AWSConfig(BaseModel): - enabled: bool = False - profile: Optional[str] = None - region: str - account_id: Optional[str] = None - landingzones: List[Dict[str, Any]] = [] - # Add other fields as needed - - # Optional: Custom validation methods - @root_validator - def check_region(cls, values): - region = values.get('region') - if not region: - raise ValueError('region must be specified for AWS module') - return values -``` - -**Explanation**: - -- We use Pydantic's `BaseModel` to define the configuration schema. -- The `region` field is required, and a custom validator ensures it is provided. -- The `enabled` field allows the module to be toggled on or off. - -#### Step 3: Update Deployment Logic in `deploy.py` - -__Action__: Update the AWS module's `deploy.py` to contain a deployment function `deploy_aws_module` with a consistent signature and remove any version handling. - -**Reasoning**: Aligning the function signature with other modules ensures consistent deployment patterns and simplifies the core deployment logic. - -**Implementation**: - -```python -# pulumi/modules/aws/deploy.py - -from typing import List, Dict, Any -import pulumi -import pulumi_aws as aws -from .types import AWSConfig - -def deploy_aws_module( - config: AWSConfig, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], - ) -> pulumi.Resource: - """ - Deploys the AWS module and returns the primary resource.
- """ - aws_provider = providers.get('aws') - if not aws_provider: - raise ValueError("AWS provider not found") - - # Example AWS resource creation - s3_bucket = aws.s3.Bucket( - 'my_bucket', - bucket='my-unique-bucket-name', - opts=pulumi.ResourceOptions( - provider=aws_provider, - depends_on=global_depends_on, - ) - ) - - # Return the primary resource - return s3_bucket -``` - -**Explanation**: - -- The deployment function accepts the validated `config` object, `global_depends_on`, and `providers`. -- It retrieves the AWS provider from the `providers` dictionary. -- An example AWS resource (S3 bucket) is created using the provider. -- The function returns the primary resource without version information. - -#### Step 4: Modify `__main__.py` to Include AWS Module - -__Action__: Update `pulumi/__main__.py` to include and deploy the AWS module consistently with other modules. - -**Reasoning**: This ensures that the AWS module is integrated into the deployment process in the same way as Kubernetes modules. 
- -**Implementation**: - -```python -# pulumi/__main__.py - -from pulumi import log -from core.config import export_results, get_module_config -from core.deployment import initialize_pulumi, deploy_modules - -def main(): - try: - # Initialize Pulumi - init = initialize_pulumi() - - # Extract components - config = init["config"] - k8s_provider = init["k8s_provider"] - versions = init["versions"] # For Kubernetes modules - configurations = init["configurations"] - default_versions = init["default_versions"] - global_depends_on = init["global_depends_on"] - compliance_config = init.get("compliance_config", {}) - - # Initialize AWS provider if AWS module is enabled - aws_config_obj, aws_enabled = get_module_config('aws', config) - aws_provider = None - if aws_enabled: - from pulumi_aws import Provider as AWSProvider - aws_provider = AWSProvider( - 'aws_provider', - profile=aws_config_obj.profile, - region=aws_config_obj.region, - ) - - # Prepare providers dictionary - providers = { - 'k8s': k8s_provider, - 'aws': aws_provider, - # Add other providers as needed - } - - # Modules to deploy - modules_to_deploy = [ - "aws", - # Add other modules as needed - ] - - # Deploy modules - deploy_modules( - modules=modules_to_deploy, - config=config, - global_depends_on=global_depends_on, - providers=providers, - versions=versions, # Kubernetes modules will update this - configurations=configurations, - ) - - # Export stack outputs - export_results(versions, configurations, compliance_config) - - except Exception as e: - log.error(f"Deployment failed: {str(e)}") - raise - -if __name__ == "__main__": - main() -``` - -**Explanation**: - -- The AWS module is included in `modules_to_deploy`. -- The AWS provider is initialized if the module is enabled and added to the `providers` dictionary. -- Version handling for the AWS module is omitted, as it is unnecessary. 
- -#### Step 5: Adjust `core/deployment.py` to Handle Providers and Versioning - -__Action__: Modify `deploy_module` in `pulumi/core/deployment.py` to handle modules with and without versioning. - -**Reasoning**: The core deployment function needs to accommodate both Kubernetes modules (which use versioning) and cloud provider modules (which do not). - -**Implementation**: - -```python -# pulumi/core/deployment.py - -def deploy_module( - module_name: str, - config: pulumi.Config, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], - versions: Dict[str, str], - configurations: Dict[str, Dict[str, Any]] -) -> None: - # Retrieve module configuration and enabled status - config_obj, module_enabled = get_module_config(module_name, config) - - if module_enabled: - # Discover module's deploy function - deploy_func = discover_deploy_function(module_name) - - # Deploy the module - result = deploy_func( - config=config_obj, - global_depends_on=global_depends_on, - providers=providers, - ) - - # Handle result based on module type - if module_name in KUBERNETES_MODULES: - # Modules with versioning - version, primary_resource = result - versions[module_name] = version # Update versions dictionary - else: - # Modules without versioning - primary_resource = result - - configurations[module_name] = {"enabled": module_enabled} - global_depends_on.append(primary_resource) - - else: - log.info(f"Module {module_name} is not enabled.") -``` - -**Explanation**: - -- The function checks if the module is enabled. -- For Kubernetes modules, it expects the deployment function to return a tuple `(version, primary_resource)`. -- For cloud provider modules like AWS, it expects the deployment function to return just the primary resource. -- The `KUBERNETES_MODULES` list defines which modules require version handling. 
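The two return shapes described above can be normalized in one small helper before `deploy_module` updates `versions` and `global_depends_on`. The following is a minimal sketch under stated assumptions: `unpack_deploy_result` is a hypothetical helper that is not part of the roadmap, the `KUBERNETES_MODULES` list is a shortened stand-in, and plain strings stand in for Pulumi resources so the example runs without Pulumi installed.

```python
from typing import Any, Dict, List, Optional, Tuple

# Shortened stand-in for the KUBERNETES_MODULES list in core/config.py.
KUBERNETES_MODULES = ["cert_manager", "kubevirt"]


def unpack_deploy_result(module_name: str, result: Any) -> Tuple[Optional[str], Any]:
    """Normalize a deploy function's return value to (version, primary_resource).

    Hypothetical helper: the name and error message are illustrative only.
    """
    if module_name in KUBERNETES_MODULES:
        # Versioned modules must return exactly (version, primary_resource).
        if not (isinstance(result, tuple) and len(result) == 2):
            raise TypeError(
                f"Module '{module_name}' must return (version, primary_resource)"
            )
        version, primary_resource = result
        return version, primary_resource
    # Unversioned (cloud provider) modules return the primary resource alone.
    return None, result


versions: Dict[str, str] = {}
global_depends_on: List[Any] = []

# Placeholder results; the version string and resource names are made up.
sample_results = [
    ("cert_manager", ("1.15.3", "cert-manager-release")),
    ("aws", "s3-bucket"),
]

for name, result in sample_results:
    version, resource = unpack_deploy_result(name, result)
    if version is not None:
        versions[name] = version  # Only versioned modules are recorded.
    global_depends_on.append(resource)
```

Failing fast with a `TypeError` when a Kubernetes module returns the wrong shape may be easier to debug than the tuple-unpacking error that would otherwise surface inside `deploy_module`.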
- -### Part 2 Steps - -#### Step 6: Update `core/config.py` to Handle Versioning Exclusively for Kubernetes Modules - -__Action__: Adjust `get_module_config` to include version information only for Kubernetes modules. - -**Reasoning**: Cloud provider modules do not need version information in their configuration. - -**Implementation**: - -```python -# pulumi/core/config.py - -KUBERNETES_MODULES = [ - 'cert_manager', - 'kubevirt', - 'multus', - 'hostpath_provisioner', - 'containerized_data_importer', - 'prometheus', - # Add other Kubernetes modules as needed -] - -def get_module_config( - module_name: str, - config: pulumi.Config, - ) -> Tuple[Any, bool]: - module_config_dict = config.get_object(module_name) or {} - module_enabled = module_config_dict.get('enabled', False) - - # Import the module's configuration class - types_module = importlib.import_module(f"modules.{module_name}.types") - ModuleConfigClass = getattr(types_module, f"{module_name.capitalize()}Config") - - if module_name in KUBERNETES_MODULES: - # Inject version information for Kubernetes modules - default_versions = load_default_versions() - module_config_dict['version'] = module_config_dict.get('version', default_versions.get(module_name)) - - try: - # Create an instance of the configuration model - config_obj = ModuleConfigClass(**module_config_dict) - except ValidationError as e: - # Handle validation errors - pulumi.log.error(f"Configuration error in module '{module_name}':\n{e}") - raise - - return config_obj, module_enabled -``` - -**Explanation**: - -- The function checks if the module is a Kubernetes module. -- If so, it injects the version information from the default versions. -- This ensures that only Kubernetes modules have version data in their configurations. - -#### Step 7: Adjust Module Discovery Functions - -__Action__: Ensure `discover_config_class` and `discover_deploy_function` work correctly for all modules. 
- -**Reasoning**: These functions need to dynamically import the appropriate classes and functions for each module, regardless of whether they handle versioning. - -**Implementation**: - -```python -# pulumi/core/deployment.py - -def discover_config_class(module_name: str) -> Type: - types_module = importlib.import_module(f"modules.{module_name}.types") - for name, obj in inspect.getmembers(types_module): - if inspect.isclass(obj) and issubclass(obj, BaseModel): - return obj - raise ValueError(f"No Pydantic BaseModel found in modules.{module_name}.types") - -def discover_deploy_function(module_name: str) -> Callable: - deploy_module = importlib.import_module(f"modules.{module_name}.deploy") - function_name = f"deploy_{module_name}_module" - deploy_function = getattr(deploy_module, function_name, None) - if not deploy_function: - raise ValueError(f"No deploy function named '{function_name}' found in modules.{module_name}.deploy") - return deploy_function -``` - -**Explanation**: - -- The `discover_config_class` function looks for classes inheriting from `BaseModel`, indicating a Pydantic model. -- The `discover_deploy_function` dynamically imports the deploy function based on the module name. - -#### Step 8: Remove Version Handling from Cloud Provider Modules - -**Action**: Review cloud provider modules (AWS, GCP, Azure) and remove any version-related code. - -**Reasoning**: Since versioning is managed via `requirements.txt` for these modules, internal version handling is unnecessary. - -**Implementation**: - -- **In `types.py`**: - - Ensure no `version` field is present in the configuration models. - -- **In `deploy.py`**: - - Ensure deployment functions do not return version information. - - Remove any logic that deals with versioning. - -**Explanation**: - -- This simplifies the modules and prevents confusion regarding version handling. 
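One way to keep Step 8 from regressing is a small check that no cloud provider configuration model declares a `version` field. This is a sketch under stated assumptions: the classes below are plain-Python stand-ins for the Pydantic models (so it runs without Pydantic), and `declares_version` and `CLOUD_PROVIDER_CONFIGS` are hypothetical names that do not appear in the roadmap.

```python
from typing import List, Optional, get_type_hints


class BaseConfig:
    """Stand-in for the shared base configuration class."""
    enabled: bool = False


class AWSConfig(BaseConfig):
    """Stand-in for the AWS model: deliberately has no version field."""
    profile: Optional[str] = None
    region: str = "us-west-2"


class CertManagerConfig(BaseConfig):
    """Stand-in for a Kubernetes model, which keeps its version field."""
    version: str = "latest"


def declares_version(config_class: type) -> bool:
    """Return True if the class or any of its bases annotates a 'version' field."""
    # get_type_hints walks the MRO, so inherited annotations are included.
    return "version" in get_type_hints(config_class)


# Hypothetical registry of cloud provider models to audit.
CLOUD_PROVIDER_CONFIGS: List[type] = [AWSConfig]

offenders = [cls.__name__ for cls in CLOUD_PROVIDER_CONFIGS if declares_version(cls)]
```

A unit test could assert that `offenders` is empty, which turns the review item into something that runs on every change.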
- -#### Step 9: Optional - Load Versions from `requirements.txt` - -**Action**: Implement a utility function to extract module versions from `requirements.txt` for logging or documentation purposes. - -**Reasoning**: This provides transparency on the versions of cloud provider SDKs being used. - -**Implementation**: - -```python -# pulumi/core/utils.py - -from typing import Optional - -import pulumi - -def get_module_version_from_requirements(module_name: str) -> Optional[str]: - try: - with open('requirements.txt', 'r') as f: - for line in f: - if module_name in line: - # Only pinned entries ('name==version') carry a version. - parts = line.strip().split('==') - if len(parts) == 2: - return parts[1] - except Exception as e: - pulumi.log.warn(f"Error reading requirements.txt: {e}") - return None -``` - -**Explanation**: - -- The function parses `requirements.txt` to find the version of the specified module and returns `None` when the module is absent or not pinned with `==`. -- This can be used for logging purposes but should not affect module configuration. - -### Part 3 Steps - -#### Step 10: Integrate Pydantic into the Project - -**Action**: Add Pydantic to the project dependencies and update `requirements.txt`. - -**Reasoning**: Pydantic provides robust data validation and type safety for configurations. - -**Implementation**: - -- Install Pydantic: - -```bash -pip install "pydantic>=1.8.2,<2" -``` - -- Add to `requirements.txt` (capped below 2.0 because the models in this roadmap use the bare `@root_validator` decorator from the Pydantic 1.x API, which Pydantic 2 rejects): - -``` -pydantic>=1.8.2,<2 -``` - -#### Step 11: Define Base Configuration Classes - -**Action**: Create a base configuration class that other modules can inherit from if needed. - -**Reasoning**: Provides a common structure and default fields for all modules. - -**Implementation**: - -```python -# pulumi/core/base_config.py - -from pydantic import BaseModel - -class BaseConfig(BaseModel): - enabled: bool = False -``` - -#### Step 12: Update Module Configuration Models to Use Pydantic - -**Action**: For each module, define a Pydantic model in `types.py`. - -**Reasoning**: This gives each module autonomy over its configuration schema and ensures type safety. 
- -**Implementation**: - -- **For AWS Module**: - -```python -# pulumi/modules/aws/types.py - -from pydantic import BaseModel, root_validator -from typing import Optional, List, Dict, Any - -class AWSConfig(BaseModel): - enabled: bool = False - profile: Optional[str] = None - region: str - account_id: Optional[str] = None - landingzones: List[Dict[str, Any]] = [] - - @root_validator - def check_region(cls, values): - region = values.get('region') - if not region: - raise ValueError('region must be specified for AWS module') - return values -``` - -- **For Kubernetes Module**: - -```python -# pulumi/modules/cert_manager/types.py - -from pydantic import BaseModel - -class CertManagerConfig(BaseModel): - enabled: bool = False - version: str = "latest" - namespace: str = "cert-manager" - install_crds: bool = True -``` - -**Explanation**: - -- Each module defines its own configuration model, which can include any fields and validation logic needed. -- This allows for complex and nested configurations as required. - -#### Step 13: Centralize Configuration Loading and Validation - -**Action**: Update `core/config.py` to load configurations using Pydantic models. - -**Reasoning**: Centralizing configuration loading ensures consistency and reduces duplication. 
- -**Implementation**: - -```python -# pulumi/core/config.py - -from typing import Any, Tuple -from pydantic import ValidationError -import importlib - -def get_module_config( - module_name: str, - config: pulumi.Config, - ) -> Tuple[Any, bool]: - module_config_dict = config.get_object(module_name) or {} - module_enabled = module_config_dict.get('enabled', False) - - # Import the module's configuration class - types_module = importlib.import_module(f"modules.{module_name}.types") - ModuleConfigClass = getattr(types_module, f"{module_name.capitalize()}Config") - - try: - # Create an instance of the configuration model - config_obj = ModuleConfigClass(**module_config_dict) - except ValidationError as e: - # Handle validation errors - pulumi.log.error(f"Configuration error in module '{module_name}':\n{e}") - raise - - return config_obj, module_enabled -``` - -**Explanation**: - -- The function dynamically imports the module's configuration class. -- It creates an instance of the configuration model, which automatically validates the data. -- Any validation errors are caught and reported. - -#### Step 14: Update Deployment Functions to Use Validated Configurations - -**Action**: Modify deployment functions to accept the validated configuration objects. - -**Reasoning**: Ensures that deployment logic operates on valid data, simplifying error handling and code complexity. - -**Implementation**: - -```python -# pulumi/modules/aws/deploy.py - -def deploy_aws_module( - config: AWSConfig, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], - ) -> pulumi.Resource: - aws_provider = providers.get('aws') - if not aws_provider: - raise ValueError("AWS provider not found") - - # Use configuration values directly - region = config.region - profile = config.profile - # ... - - # Implement AWS resource creation using aws_provider - # ... 
-``` - -**Explanation**: - -- The deployment function uses the validated `config` object, eliminating the need for additional validation within the function. -- Configuration values are accessed directly from the `config` object. - -#### Step 15: Provide Clear Error Reporting - -**Action**: Ensure that validation errors are clearly reported to the user. - -**Reasoning**: Improves user experience by helping users quickly identify and fix configuration issues. - -**Implementation**: - -- Errors are caught in `get_module_config` and logged with detailed information. - -- Example error message: - -``` -Configuration error in module 'aws': -1 validation error for AWSConfig -region - field required (type=value_error.missing) -``` - -#### Step 16: Document Configuration Schemas - -**Action**: Update module `README.md` files to include configuration schemas and field explanations. - -**Reasoning**: Provides users with a clear reference for configuring modules, reducing errors and support requests. - -**Implementation**: - -- **Example for AWS Module**: - -````markdown -# AWS Module Configuration - -## Configuration Schema - -```yaml -aws: - enabled: true - profile: "default" - region: "us-west-2" - account_id: "123456789012" - landingzones: - - name: "tenant1" - email: "tenant1@example.com" - # Add other fields as needed -``` - -## Configuration Fields - -- **enabled** *(bool)*: Enable or disable the AWS module. -- **profile** *(string)*: AWS CLI profile to use. -- **region** *(string, required)*: AWS region. -- **account_id** *(string)*: AWS account ID. -- **landingzones** *(list)*: List of landing zone configurations. - - **name** *(string)*: Name of the landing zone. - - **email** *(string)*: Email associated with the landing zone. - - *...* -```` - -**Explanation**: - -- Users can refer to the documentation to understand how to configure the module. -- This reduces the learning curve and potential configuration errors. 
- ---- - -## Additional Considerations - -### Consistent Function Signatures - -Ensure all deployment functions follow the standard signature: - -```python -def deploy_<module_name>_module( - config: <ModuleConfigClass>, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], - ) -> Union[Tuple[str, pulumi.Resource], pulumi.Resource]: - # Deployment logic -``` - -- **Kubernetes Modules**: Return a tuple `(version, primary_resource)`. -- **Cloud Provider Modules**: Return the `primary_resource`. - -### Code Comments and Docstrings - -- Add comments and docstrings to explain complex logic and important implementation details. -- Use standard documentation practices to improve code readability. - -### Testing and Validation - -- **Unit Tests**: Write tests for configuration models and deployment functions. -- **Integration Tests**: Test module deployment in a controlled environment. -- **Pulumi Preview**: Use `pulumi preview` to validate infrastructure changes before deployment. - ---- - -## Conclusion - -By following this comprehensive roadmap, the Konductor IaC codebase will be refactored and enhanced to: - -- **Achieve Consistency**: Align module structures and deployment patterns across the codebase. -- **Improve Configuration Management**: Use Pydantic for type-safe and validated configurations. -- **Optimize Version Handling**: Apply version locking exclusively to Kubernetes modules where it is needed. -- **Enhance Developer and User Experience**: Provide clear documentation, error messages, and consistent patterns. -- **Prepare for Future Extensibility**: Facilitate the addition of new modules and support for other cloud providers. - -This document serves as both a plan of action and an educational resource for junior developers, providing detailed explanations and reasoning for each step. 
- ---- - -## Appendix: Understanding Pydantic - -To ensure that all developers are comfortable with Pydantic and its usage in the codebase, please refer to the following detailed explainer on Pydantic: - -[**Pydantic in Konductor**](#pydantic-in-konductor) - ---- - -### **Pydantic in Konductor** - -**Pydantic** is a Python library used for **data validation** and **settings management** using Python type annotations. In the Konductor IaC codebase, Pydantic plays a crucial role in ensuring that module configurations are type-safe, valid, and easy to manage. - -#### Why Use Pydantic in Konductor? - -- **Type Safety**: Enforces data types, reducing runtime errors due to type mismatches. -- **Data Validation**: Automatically validates configuration data, ensuring it meets the required criteria. -- **Ease of Use**: Integrates seamlessly with Python's type annotations and has a simple syntax. -- **Customization**: Allows for complex nested configurations and custom validation logic. -- **Error Reporting**: Provides clear and informative error messages, improving the developer and user experience. - -#### Getting Started with Pydantic - -1. **Installation**: - -```bash -pip install pydantic -``` - -2. **Defining a Model**: - -```python -from pydantic import BaseModel - -class ModuleConfig(BaseModel): - enabled: bool = False - name: str - version: str = "latest" -``` - -3. **Validation and Usage**: - -```python -try: - config = ModuleConfig(**user_input) -except ValidationError as e: - print(e) -``` - -#### Advanced Features - -- **Nested Models**: Define complex configurations with nested models. -- **Custom Validators**: Implement custom validation logic for specific fields. -- **Settings Management**: Use `BaseSettings` for managing environment variables and configuration files. 
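The nested-model and custom-validator features listed above can be sketched together in one small example. It assumes Pydantic is installed and uses the `validator` decorator from the Pydantic 1.x API that the roadmap's `pydantic>=1.8.2` requirement implies; `LandingZone` and `region_must_not_be_blank` are hypothetical names chosen for illustration, not part of the Konductor codebase.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError, validator


class LandingZone(BaseModel):
    """Nested model: each landing zone entry is validated on its own."""
    name: str
    email: str


class AWSConfig(BaseModel):
    enabled: bool = False
    profile: Optional[str] = None
    region: str  # Required: no default value.
    landingzones: List[LandingZone] = Field(default_factory=list)

    @validator("region")
    def region_must_not_be_blank(cls, value: str) -> str:
        # Custom validator: runs only when a region value was supplied.
        if not value.strip():
            raise ValueError("region must not be blank")
        return value


# Valid input: nested dictionaries are converted into LandingZone instances.
config = AWSConfig(
    region="us-west-2",
    landingzones=[{"name": "tenant1", "email": "tenant1@example.com"}],
)

# Invalid input: 'region' is missing and the landing zone lacks 'email'.
try:
    AWSConfig(landingzones=[{"name": "tenant2"}])
except ValidationError as exc:
    error_count = len(exc.errors())
```

Typing `landingzones` as `List[LandingZone]` rather than `List[Dict[str, Any]]` means a malformed entry is reported with its index and field name instead of passing through unchecked.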
diff --git a/pulumi/ROADMAP_Addendum.md b/pulumi/ROADMAP_Addendum.md deleted file mode 100644 index f7b89a6..0000000 --- a/pulumi/ROADMAP_Addendum.md +++ /dev/null @@ -1,764 +0,0 @@ -# Konductor IaC Template Repository Refactor and Enhancement Roadmap - -> **Technical Blueprint Addendum** - ---- - -## Table of Contents - -1. [Executive Summary](#executive-summary) -2. [Introduction](#introduction) -3. [Objectives](#objectives) -4. [Current State Analysis](#current-state-analysis) -5. [Proposed Solution](#proposed-solution) - - [Part 1: Aligning AWS Module with Kubernetes Modules](#part-1-aligning-aws-module-with-kubernetes-modules) - - [Part 2: Modular Version Handling](#part-2-modular-version-handling) - - [Part 3: Enhancing Configuration Management with Pydantic](#part-3-enhancing-configuration-management-with-pydantic) - -6. [Detailed Implementation Plan](#detailed-implementation-plan) - - [Part 1 Implementation Steps](#part-1-implementation-steps) - - [Part 2 Implementation Steps](#part-2-implementation-steps) - - [Part 3 Implementation Steps](#part-3-implementation-steps) - -7. [Technical Considerations](#technical-considerations) - - [Dependency Management](#dependency-management) - - [Error Handling and Logging](#error-handling-and-logging) - - [Testing Strategy](#testing-strategy) - - [Documentation Standards](#documentation-standards) - - [Security Implications](#security-implications) - -8. [Risks and Mitigations](#risks-and-mitigations) -9. [Timeline and Milestones](#timeline-and-milestones) -10. [Conclusion](#conclusion) -11. [Appendices](#appendices) - - [Appendix A: Pydantic Overview](#appendix-a-pydantic-overview) - - [Appendix B: Code Samples](#appendix-b-code-samples) - - [Appendix C: Glossary](#appendix-c-glossary) - ---- - -## Executive Summary - -This technical document outlines a comprehensive plan to refactor and enhance the Konductor Infrastructure as Code (IaC) codebase. 
The primary goals are to align the AWS module with the existing Kubernetes modules, implement modular version handling, and improve configuration management using Pydantic. By executing this plan, we aim to achieve consistency, improve maintainability, and enhance the developer and user experience. This document is intended for the principal engineers leading the project and includes detailed technical explanations and code examples to guide the implementation. - ---- - -## Introduction - -Konductor is an IaC platform built using Pulumi and Python, designed to streamline DevOps workflows and Platform Engineering practices. As the codebase has evolved, inconsistencies have emerged, particularly between the AWS module and the Kubernetes modules. This document proposes a refactoring plan to address these inconsistencies and introduces Pydantic for robust configuration management. - ---- - -## Objectives - -- **Consistency**: Standardize the module structure and deployment logic across all modules, including AWS. -- **Modular Version Handling**: Limit version locking mechanisms to Kubernetes modules where necessary. -- **Enhanced Configuration Management**: Utilize Pydantic for type-safe, validated, and flexible configurations. -- **Extensibility**: Prepare the codebase for future support of additional cloud providers (e.g., GCP, Azure). -- **Maintainability**: Improve code readability and reduce technical debt. -- **Developer and User Experience**: Provide clear documentation, error messages, and consistent patterns. - ---- - -## Current State Analysis - -### Kubernetes Modules - -- **Structure**: - - Reside under `pulumi/modules/<module_name>/`. - - Include `types.py`, `deploy.py`, and `README.md`. - - Use `dataclasses` for configuration models. - - Deployment functions have consistent signatures and return types. - -- **Version Handling**: - - Version locking mechanisms are in place. - - Versions are managed via configuration. 
- -### AWS Module - -- **Structure**: - - Does not conform to the standard module structure. - - Lacks separation of concerns (configuration vs. deployment logic). - - Deployment function signatures differ from Kubernetes modules. - -- **Version Handling**: - - Unnecessary version interfaces are present. - - Versions are managed via `requirements.txt`, making internal version handling redundant. - -### Issues Identified - -- **Inconsistency** in module structures and deployment patterns. -- **Redundant Version Handling** in cloud provider modules. -- **Complex Configuration Management** lacking type safety and validation. -- **Integration Challenges** due to divergent module implementations. - ---- - -## Proposed Solution - -### Part 1: Aligning AWS Module with Kubernetes Modules - -- **Restructure the AWS module** to match the directory and code organization of Kubernetes modules. -- **Standardize deployment function signatures** and remove unnecessary version handling. -- **Integrate the AWS module into the core deployment process** using dynamic module discovery. - -### Part 2: Modular Version Handling - -- **Restrict version locking mechanisms** to Kubernetes modules where applicable. -- **Remove version interfaces** from cloud provider modules (AWS, GCP, Azure). -- **Adjust core configuration and deployment functions** to handle versioning exclusively for Kubernetes modules. - -### Part 3: Enhancing Configuration Management with Pydantic - -- **Integrate Pydantic** for robust configuration models with validation. -- **Allow each module to define its own configuration schema** independently. -- **Centralize configuration loading and validation** in the core module. -- **Improve error reporting and documentation** for configurations. - ---- - -## Detailed Implementation Plan - -### Part 1 Implementation Steps - -#### Step 1: Restructure AWS Module Directory - -**Action**: Move all AWS module files under `pulumi/modules/aws/`. 
- -**Technical Details**: - -- **Create Directory**: - - Ensure `pulumi/modules/aws/` exists. - -- **Organize Files**: - - Move existing AWS code files (`aws_deploy.py`, `aws_types.py`, etc.) into `pulumi/modules/aws/`. - - Rename files to `deploy.py` and `types.py` to match the convention. - -- **Initialize Module**: - - Add `__init__.py` to `pulumi/modules/aws/`. - -**Expected Outcome**: - -- The AWS module directory mirrors the structure of Kubernetes modules. -- All AWS-related code is encapsulated within the module directory. - -#### Step 2: Define Configuration Data Classes in `types.py` - -**Action**: Create `AWSConfig` class in `pulumi/modules/aws/types.py` without a `version` attribute. - -**Technical Details**: - -- **Use Pydantic**: - - - Import `BaseModel` from `pydantic`. - -- **Define Configuration Class**: - -```python -from pydantic import BaseModel, Field, root_validator -from typing import Optional, List, Dict, Any - -class AWSConfig(BaseModel): - enabled: bool = False - profile: Optional[str] = None - region: str - account_id: Optional[str] = None - landingzones: List[Dict[str, Any]] = Field(default_factory=list) - - @root_validator - def validate_region(cls, values): - if not values.get('region'): - raise ValueError('The "region" field is required for AWSConfig.') - return values -``` - -- **Explanation**: - - - `enabled`: Determines if the module is active. - - `profile`: AWS CLI profile. - - `region`: Required AWS region. - - `account_id`: Optional AWS account ID. - - `landingzones`: List of landing zone configurations. - - `validate_region`: Ensures `region` is provided. - -**Expected Outcome**: - -- A type-safe, validated configuration model for the AWS module. - -#### Step 3: Update Deployment Logic in `deploy.py` - -__Action__: Create `deploy_aws_module` function in `pulumi/modules/aws/deploy.py` with a consistent signature. 
- -**Technical Details**: - -- **Define Deployment Function**: - -```python -from typing import List, Dict, Any -import pulumi -import pulumi_aws as aws -from .types import AWSConfig - -def deploy_aws_module( - config: AWSConfig, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], - ) -> pulumi.Resource: - aws_provider = providers.get('aws') - if not aws_provider: - raise ValueError("AWS provider not found.") - - # Example AWS resource creation - s3_bucket = aws.s3.Bucket( - resource_name='my_bucket', - bucket='my-unique-bucket-name', - opts=pulumi.ResourceOptions( - provider=aws_provider, - depends_on=global_depends_on, - ) - ) - - return s3_bucket -``` - -- **Explanation**: - - - The function signature aligns with other modules. - - It accepts the validated `config` object. - - Uses the `aws_provider` from the `providers` dictionary. - - Returns the primary resource without version information. - -**Expected Outcome**: - -- AWS module deployment logic conforms to the standard pattern. -- Version handling is removed from the AWS module. - -#### Step 4: Modify `__main__.py` to Include AWS Module - -__Action__: Update `pulumi/__main__.py` to deploy the AWS module consistently. - -**Technical Details**: - -- **Initialize AWS Provider**: - -```python -# Inside main function -aws_config, aws_enabled = get_module_config('aws', config) -aws_provider = None -if aws_enabled: - from pulumi_aws import Provider as AWSProvider - aws_provider = AWSProvider( - 'aws_provider', - profile=aws_config.profile, - region=aws_config.region, - ) -``` - -- **Update Providers Dictionary**: - -```python -providers = { - 'k8s': k8s_provider, - 'aws': aws_provider, - # Add other providers as needed -} -``` - -- **Include AWS Module in Deployment**: - -```python -modules_to_deploy = [ - 'aws', - # Other modules... -] -``` - -**Expected Outcome**: - -- `__main__.py` treats the AWS module like other modules. 
-- AWS provider is initialized and passed to the deployment function. - -#### Step 5: Adjust `core/deployment.py` to Handle Providers and Versioning - -__Action__: Modify `deploy_module` to handle modules with and without versioning. - -**Technical Details**: - -- __Update `deploy_module` Function__: - -```python -def deploy_module( - module_name: str, - config: pulumi.Config, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], - versions: Dict[str, str], - configurations: Dict[str, Dict[str, Any]] -) -> None: - config_obj, module_enabled = get_module_config(module_name, config) - - if module_enabled: - deploy_func = discover_deploy_function(module_name) - result = deploy_func( - config=config_obj, - global_depends_on=global_depends_on, - providers=providers, - ) - - if module_name in KUBERNETES_MODULES: - version, primary_resource = result - versions[module_name] = version - else: - primary_resource = result - - configurations[module_name] = {"enabled": module_enabled} - global_depends_on.append(primary_resource) - else: - log.info(f"Module {module_name} is not enabled.") -``` - -- **Explanation**: - - - Checks if the module is enabled. - - For Kubernetes modules, expects a `(version, resource)` tuple. - - For other modules, expects a single resource. - - Updates `versions` only for Kubernetes modules. - -**Expected Outcome**: - -- `deploy_module` can handle both versioned and non-versioned modules. - -### Part 2 Implementation Steps - -#### Step 6: Update `core/config.py` to Handle Versioning Exclusively for Kubernetes Modules - -__Action__: Adjust `get_module_config` to inject version information only for Kubernetes modules. - -**Technical Details**: - -- **Define Kubernetes Modules List**: - -```python -KUBERNETES_MODULES = [ - 'cert_manager', - 'kubevirt', - 'multus', - # Other Kubernetes modules... 
-] -``` - -- __Update `get_module_config` Function__: - -```python -def get_module_config( - module_name: str, - config: pulumi.Config, - ) -> Tuple[Any, bool]: - module_config_dict = config.get_object(module_name) or {} - module_enabled = module_config_dict.get('enabled', False) - - # Inject version for Kubernetes modules - if module_name in KUBERNETES_MODULES: - default_versions = load_default_versions() - module_config_dict['version'] = module_config_dict.get('version', default_versions.get(module_name)) - - # Import and instantiate the module's configuration class - # ... (same as before) -``` - -- **Explanation**: - - - Only Kubernetes modules receive a `version` in their configuration. - - Non-Kubernetes modules are unaffected. - -**Expected Outcome**: - -- Version handling is exclusive to Kubernetes modules. - -#### Step 7: Adjust Module Discovery Functions - -__Action__: Ensure `discover_config_class` and `discover_deploy_function` work for all modules. - -**Technical Details**: - -- __Update `discover_config_class`__: - -```python -def discover_config_class(module_name: str) -> Type[BaseModel]: - types_module = importlib.import_module(f"modules.{module_name}.types") - for name, obj in inspect.getmembers(types_module): - if inspect.isclass(obj) and issubclass(obj, BaseModel): - return obj - raise ValueError(f"No Pydantic BaseModel found in modules.{module_name}.types") -``` - -- __Update `discover_deploy_function`__: - -```python -def discover_deploy_function(module_name: str) -> Callable: - deploy_module = importlib.import_module(f"modules.{module_name}.deploy") - function_name = f"deploy_{module_name}_module" - deploy_function = getattr(deploy_module, function_name, None) - if not deploy_function: - raise ValueError(f"No deploy function named '{function_name}' found in modules.{module_name}.deploy") - return deploy_function -``` - -**Expected Outcome**: - -- Module discovery functions can handle any module following the standard structure. 
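Two details of config-class discovery are worth checking before relying on it. First, `inspect.getmembers` returns members sorted by name and includes classes that were merely imported, so "first `BaseModel` subclass found" can resolve to `BaseConfig` or `BaseModel` itself. Second, a name-based lookup such as `f"{module_name.capitalize()}Config"` yields `AwsConfig` and `Cert_managerConfig`, neither of which matches `AWSConfig` or `CertManagerConfig`. The sketch below shows one possible stricter lookup; `find_config_class` is a hypothetical helper, and the `BaseModel` and `BaseConfig` classes are stand-ins so the example runs without Pydantic.

```python
import inspect
import types
from typing import Type


class BaseModel:
    """Stand-in for pydantic.BaseModel so this sketch runs without Pydantic."""


class BaseConfig(BaseModel):
    """Stand-in for the shared base class from core/base_config.py."""


def find_config_class(types_module: types.ModuleType) -> Type[BaseModel]:
    """Return the one BaseModel subclass defined in types_module itself."""
    candidates = [
        obj
        for _, obj in inspect.getmembers(types_module, inspect.isclass)
        if issubclass(obj, BaseModel)
        and obj is not BaseModel
        # Skip classes that were only imported into the module.
        and obj.__module__ == types_module.__name__
    ]
    if len(candidates) != 1:
        raise ValueError(
            f"Expected exactly one config model in {types_module.__name__}, "
            f"found {len(candidates)}"
        )
    return candidates[0]


# Simulate modules/cert_manager/types.py, which imports the base classes
# and defines its own model.
fake_types = types.ModuleType("modules.cert_manager.types")
fake_types.BaseModel = BaseModel
fake_types.BaseConfig = BaseConfig
fake_types.CertManagerConfig = type(
    "CertManagerConfig", (BaseConfig,), {"__module__": fake_types.__name__}
)

# What a "first subclass found" scan would pick, for comparison.
first_by_name = [
    name
    for name, obj in inspect.getmembers(fake_types, inspect.isclass)
    if issubclass(obj, BaseModel)
][0]

selected = find_config_class(fake_types)
```

Here `first_by_name` is `BaseConfig`, while `find_config_class` selects `CertManagerConfig`. Requiring exactly one locally defined model is a design choice; a module that needs helper models could instead expose its entry point through an explicit, agreed attribute name.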
- -#### Step 8: Remove Version Handling from Cloud Provider Modules - -**Action**: Remove any version-related code from cloud provider modules. - -**Technical Details**: - -- **In `types.py`**: - - Ensure no `version` field is present. - -- **In `deploy.py`**: - - Ensure deployment functions do not return version information. - - Remove any logic related to version checking or handling. - -**Expected Outcome**: - -- Cloud provider modules (AWS, GCP, Azure) are free of unnecessary version handling code. - -#### Step 9: Optional - Load Versions from `requirements.txt` - -**Action**: Implement a utility to extract versions from `requirements.txt` for logging purposes. - -**Technical Details**: - -- **Utility Function**: - -```python -def get_module_version_from_requirements(module_name: str) -> Optional[str]: - try: - with open('requirements.txt', 'r') as f: - for line in f: - if module_name in line: - parts = line.strip().split('==') - if len(parts) == 2: - return parts[1] - except Exception as e: - pulumi.log.warn(f"Error reading requirements.txt: {e}") - return None -``` - -- **Usage**: - - - For logging or documentation, not for configuration. - -**Expected Outcome**: - -- Ability to report the versions of cloud provider SDKs in use. - -### Part 3 Implementation Steps - -#### Step 10: Integrate Pydantic into the Project - -**Action**: Add Pydantic to the project dependencies. - -**Technical Details**: - -- **Install Pydantic**: - -```bash -pip install pydantic -``` - -- **Update `requirements.txt`**: - -``` -pydantic>=1.8.2 -``` - -**Expected Outcome**: - -- Pydantic is available for use in the codebase. - -#### Step 11: Define Base Configuration Classes - -__Action__: Create `BaseConfig` in `pulumi/core/base_config.py` for common fields. - -**Technical Details**: - -- **Define `BaseConfig`**: - -```python -from pydantic import BaseModel - -class BaseConfig(BaseModel): - enabled: bool = False -``` - -- **Explanation**: - - - Provides a common `enabled` field. 
- - Modules can inherit from `BaseConfig`. - -**Expected Outcome**: - -- A foundational configuration class for modules to extend. - -#### Step 12: Update Module Configuration Models to Use Pydantic - -**Action**: Update `types.py` in each module to define Pydantic models. - -**Technical Details**: - -- **For AWS Module**: - -```python -from pydantic import BaseModel, Field, root_validator -from typing import Optional, List, Dict, Any - -class AWSConfig(BaseConfig): - profile: Optional[str] = None - region: str - account_id: Optional[str] = None - landingzones: List[Dict[str, Any]] = Field(default_factory=list) - - @root_validator - def validate_region(cls, values): - if not values.get('region'): - raise ValueError('The "region" field is required for AWSConfig.') - return values -``` - -- **For Kubernetes Module**: - -```python -class CertManagerConfig(BaseConfig): - version: str = "latest" - namespace: str = "cert-manager" - install_crds: bool = True -``` - -**Expected Outcome**: - -- Module configurations are now type-safe and validated using Pydantic. - -#### Step 13: Centralize Configuration Loading and Validation - -__Action__: Update `get_module_config` in `core/config.py` to use Pydantic models. 
- -**Technical Details**: - -- **Update Function**: - -```python -def get_module_config( - module_name: str, - config: pulumi.Config, - ) -> Tuple[Any, bool]: - module_config_dict = config.get_object(module_name) or {} - module_enabled = module_config_dict.get('enabled', False) - - if module_name in KUBERNETES_MODULES: - default_versions = load_default_versions() - module_config_dict['version'] = module_config_dict.get('version', default_versions.get(module_name)) - - types_module = importlib.import_module(f"modules.{module_name}.types") - ModuleConfigClass = getattr(types_module, f"{module_name.capitalize()}Config") - - try: - config_obj = ModuleConfigClass(**module_config_dict) - except ValidationError as e: - pulumi.log.error(f"Configuration error in module '{module_name}':\n{e}") - raise - - return config_obj, module_enabled -``` - -**Expected Outcome**: - -- Configurations are loaded and validated centrally, reducing duplication. - -#### Step 14: Update Deployment Functions to Use Validated Configurations - -**Action**: Modify deployment functions to accept the Pydantic `config` object. - -**Technical Details**: - -- **Example for AWS Module**: - -```python -def deploy_aws_module( - config: AWSConfig, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], - ) -> pulumi.Resource: - aws_provider = providers.get('aws') - if not aws_provider: - raise ValueError("AWS provider not found.") - - # Use configuration values directly - # Example: region = config.region - # ... - - # Deployment logic - # ... -``` - -**Expected Outcome**: - -- Deployment functions operate on validated, type-safe configurations. - -#### Step 15: Provide Clear Error Reporting - -**Action**: Ensure validation errors are clearly reported to the user. 
- -**Technical Details**: - -- __In `get_module_config`__: - -```python -except ValidationError as e: - pulumi.log.error(f"Configuration error in module '{module_name}':\n{e}") - raise -``` - -- **User Feedback**: - -``` -Configuration error in module 'aws': -1 validation error for AWSConfig -region - The "region" field is required for AWSConfig. (type=value_error) -``` - -**Expected Outcome**: - -- Users receive immediate and actionable feedback on configuration issues. - -#### Step 16: Document Configuration Schemas - -**Action**: Update `README.md` for each module to reflect the new configuration schemas. - -**Technical Details**: - -- **Include Configuration Examples**: - -```markdown -## Configuration Schema - -```yaml -aws: - enabled: true - profile: "default" - region: "us-west-2" - # Other fields... -``` - -``` - -``` - -- **Explain Each Field**: - - - **enabled**: Enables or disables the module. - - **profile**: AWS CLI profile name. - - **region**: AWS region (required). - - __account_id__: AWS account ID. - - **landingzones**: List of landing zone configurations. - -**Expected Outcome**: - -- Users have clear guidance on how to configure each module. - ---- - -## Technical Considerations - -### Dependency Management - -- **Pydantic Version**: Ensure compatibility with the Python version in use. -- **Pulumi Providers**: Keep cloud provider SDKs up to date via `requirements.txt`. - -### Error Handling and Logging - -- **Consistent Logging**: Use `pulumi.log` for logging messages. -- **Exception Handling**: Catch and handle exceptions at appropriate levels. - -### Testing Strategy - -- **Unit Tests**: Write tests for configuration models and deployment functions. -- **Integration Tests**: Test module deployments in sandbox environments. -- **Continuous Integration**: Implement CI pipelines to automate testing. - -### Documentation Standards - -- **Docstrings**: Use Google or NumPy style for functions and classes. 
-- **Inline Comments**: Explain complex logic within the code. -- **Module `README.md`**: Provide comprehensive documentation for each module. - -### Security Implications - -- **Sensitive Data Handling**: Ensure that sensitive information is not logged or exposed. -- **Configuration Validation**: Prevent invalid configurations that could lead to security vulnerabilities. - ---- - -## Risks and Mitigations - -- **Risk**: Breaking Changes During Refactoring - - **Mitigation**: Implement changes incrementally and test thoroughly. - -- **Risk**: Incompatibility with Existing Configurations - - **Mitigation**: Provide migration guides and support legacy configurations temporarily. - -- **Risk**: Learning Curve for Pydantic - - **Mitigation**: Provide training and resources for the development team. - ---- - -## Timeline and Milestones - -1. **Week 1**: - - Integrate Pydantic into the project. - - Update `core/config.py` and base classes. - -2. **Week 2**: - - Refactor AWS module to align with Kubernetes modules. - - Remove version handling from cloud provider modules. - -3. **Week 3**: - - Update Kubernetes modules to use Pydantic. - - Adjust core deployment functions. - -4. **Week 4**: - - Comprehensive testing and bug fixing. - - Update documentation and provide training. - -5. **Week 5**: - - Final review and deployment to production. - - Post-deployment monitoring and support. - ---- - -## Conclusion - -This technical roadmap provides a detailed plan for refactoring and enhancing the Konductor IaC codebase. By aligning module structures, implementing modular version handling, and integrating Pydantic for configuration management, we address current inconsistencies and set the foundation for future growth. The detailed steps and technical considerations outlined in this document are intended to guide the principal engineers leading the project through a successful implementation. 
- ---- - -## Appendices - -### Appendix A: Pydantic Overview - -Refer to the [Pydantic documentation](https://pydantic-docs.helpmanual.io/) for comprehensive information on features and usage. - -### Appendix B: Code Samples - -Detailed code samples are provided within the implementation steps above. - -### Appendix C: Glossary - -- **IaC (Infrastructure as Code)**: Managing and provisioning computing infrastructure through machine-readable definition files. -- **Pulumi**: An IaC tool that allows you to define cloud resources using programming languages. -- **Pydantic**: A Python library for data validation using type annotations. -- **BaseModel**: The base class in Pydantic for creating data models. -- **Provider**: In Pulumi, a provider is a plugin that interacts with a cloud service. diff --git a/pulumi/core/README.md b/pulumi/core/README.md deleted file mode 100644 index 9d4cbeb..0000000 --- a/pulumi/core/README.md +++ /dev/null @@ -1,266 +0,0 @@ -# Core Module Developer Guide - -Welcome to the **Core Module** of the Kargo KubeVirt Kubernetes PaaS project! This guide is designed to help both newcomers to DevOps and experienced module developers navigate and contribute to the core functionalities of the Kargo platform. Whether you're looking to understand the basics or dive deep into the module development, this guide has got you covered. 
- ---- - -## Table of Contents - -- [Introduction](#introduction) -- [Getting Started](#getting-started) -- [Core Module Overview](#core-module-overview) - - [Module Structure](#module-structure) - - [Key Components](#key-components) - -- [Detailed Explanation of Core Files](#detailed-explanation-of-core-files) - - [config.py](#configpy) - - [deployment.py](#deploymentpy) - - [metadata.py](#metadatapy) - - [resource_helpers.py](#resource_helperspy) - - [types.py](#typespy) - - [utils.py](#utilspy) - -- [Best Practices](#best-practices) -- [Troubleshooting and FAQs](#troubleshooting-and-faqs) -- [Contributing to the Core Module](#contributing-to-the-core-module) -- [Additional Resources](#additional-resources) - ---- - -## Introduction - -The Core Module is the heart of the Kargo KubeVirt Kubernetes PaaS project. It provides essential functionalities that facilitate the development, deployment, and management of modules within the Kargo ecosystem. This guide aims to make core concepts accessible to everyone, regardless of their experience level in DevOps. - ---- - -## Getting Started - -If you're new to Kargo or DevOps, start here! - -- **Prerequisites**: - - - Basic understanding of Python and Kubernetes. - - [Pulumi CLI](https://www.pulumi.com/docs/get-started/) installed. - - Access to a Kubernetes cluster (minikube, kind, or cloud-based). - -- **Setup Steps**: - -1. **Clone the Repository**: - -```bash -git clone https://github.com/ContainerCraft/Kargo.git -cd Kargo/pulumi -``` - -2. **Install Dependencies**: - -```bash -pip install -r requirements.txt -``` - -3. 
**Configure Pulumi**: - -```bash -pulumi login -pulumi stack init dev -``` - ---- - -## Core Module Overview - -### Module Structure - -The Core Module is organized as follows: - -``` -pulumi/core/ -├── __init__.py -├── README.md -├── config.py -├── deployment.py -├── metadata.py -├── resource_helpers.py -├── types.py -└── utils.py -``` - -### Key Components - -- **Configuration Management**: Handles loading and merging of user configurations. -- **Deployment Orchestration**: Manages the deployment of modules and resources. -- **Metadata Management**: Generates and applies global labels and annotations. -- **Utility Functions**: Provides helper functions for common tasks. -- **Type Definitions**: Contains shared data structures used across modules. - ---- - -## Detailed Explanation of Core Files - -### config.py - -**Purpose**: Manages configuration settings for modules, including loading defaults and exporting deployment results. - -**Key Functions**: - -- `get_module_config(module_name, config, default_versions)`: Retrieves and merges the configuration for a specific module. -- `load_default_versions(config, force_refresh=False)`: Loads default module versions, prioritizing user-specified sources. -- `export_results(versions, configurations, compliance)`: Exports deployment outputs for reporting and auditing. - -**Usage Example**: - -```python -from core.config import get_module_config - -module_config, is_enabled = get_module_config('cert_manager', config, default_versions) -if is_enabled: - # Proceed with deployment -``` - ---- - -### deployment.py - -**Purpose**: Orchestrates the deployment of modules, initializing providers and handling dependencies. - -**Key Functions**: - -- `initialize_pulumi()`: Sets up Pulumi configurations and Kubernetes provider. -- `deploy_module(module_name, config, ...)`: Deploys a specified module, handling its configuration and dependencies. 
- -**Usage Example**: - -```python -from core.deployment import initialize_pulumi, deploy_module - -init = initialize_pulumi() -deploy_module('kubevirt', init['config'], ...) -``` - ---- - -### metadata.py - -**Purpose**: Manages global metadata, such as labels and annotations, ensuring consistency across resources. - -**Key Components**: - -- **Singleton Pattern**: Ensures a single source of truth for metadata. -- **Metadata Functions**: - - `set_global_labels(labels)` - - `set_global_annotations(annotations)` - - `get_global_labels()` - - `get_global_annotations()` - -**Usage Example**: - -```python -from core.metadata import set_global_labels - -set_global_labels({'app': 'kargo', 'env': 'production'}) -``` - ---- - -### resource_helpers.py - -**Purpose**: Provides helper functions for creating Kubernetes resources with consistent metadata. - -**Key Functions**: - -- `create_namespace(name, labels, annotations, ...)` -- `create_custom_resource(name, args, ...)` -- `create_helm_release(name, args, ...)` - -**Usage Example**: - -```python -from core.resource_helpers import create_namespace - -namespace = create_namespace('kargo-system', labels={'app': 'kargo'}) -``` - ---- - -### types.py - -**Purpose**: Defines shared data structures and configurations used across modules. - -**Key Data Classes**: - -- `NamespaceConfig` -- `FismaConfig` -- `NistConfig` -- `ScipConfig` -- `ComplianceConfig` - -**Usage Example**: - -```python -from core.types import ComplianceConfig - -compliance_settings = ComplianceConfig(fisma=FismaConfig(enabled=True)) -``` - ---- - -### utils.py - -**Purpose**: Contains utility functions for common tasks such as version checking and resource transformations. 
- -**Key Functions**: - -- `set_resource_metadata(metadata, global_labels, global_annotations)` -- `get_latest_helm_chart_version(url, chart_name)` -- `is_stable_version(version_str)` - -**Usage Example**: - -```python -from core.utils import get_latest_helm_chart_version - -latest_version = get_latest_helm_chart_version('https://charts.jetstack.io', 'cert-manager') -``` - ---- - -## Best Practices - -- **Consistency**: Use the core functions and types to ensure consistency across modules. -- **Modularity**: Keep module-specific logic separate from core functionalities. -- **Documentation**: Document your code and configurations to aid future developers. -- **Error Handling**: Use appropriate error handling and logging for better debugging. - ---- - -## Troubleshooting and FAQs - -**Q1: I get a `ConnectionError` when deploying modules. What should I do?** - -- **A**: Ensure your Kubernetes context is correctly configured and that you have network access to the cluster. - -**Q2: How do I add a new module?** - -- **A**: Create a new directory under `pulumi/modules/`, define your `deploy.py` and `types.py`, and update the main deployment script. - -**Q3: The deployment hangs during resource creation.** - -- **A**: Check for resource conflicts or namespace issues. Use `kubectl` to inspect the current state. - ---- - -## Contributing to the Core Module - -We welcome contributions from the community! - -- **Reporting Issues**: Use the GitHub issues page to report bugs or request features. -- **Submitting Pull Requests**: Follow the project's coding standards and ensure all tests pass. -- **Code Reviews**: Participate in reviews to maintain high code quality. 
- ---- - -## Additional Resources - -- **Kargo Project Documentation**: [Kargo GitHub Repository](https://github.com/ContainerCraft/Kargo) -- **Pulumi Documentation**: [Pulumi Official Docs](https://www.pulumi.com/docs/) -- **Kubernetes API Reference**: [Kubernetes API](https://kubernetes.io/docs/reference/generated/kubernetes-api/) diff --git a/pulumi/modules/README.md b/pulumi/modules/README.md deleted file mode 100644 index 77be9d2..0000000 --- a/pulumi/modules/README.md +++ /dev/null @@ -1,280 +0,0 @@ -# Konductor User Guide - -Welcome to the **Konductor IaC Platform Engineering User Guide**. This document provides an in-depth overview of the design principles, code structure, and best practices for developing and maintaining modules within the Konductor IaC codebase. It is intended for both DevOps Template users and Konductor Template Developers to understand, contribute, and confidently work with the project's architecture and features. - ---- - -## Table of Contents - -- [Introduction](#introduction) -- [Design Principles](#design-principles) -- [Code Structure](#code-structure) -- [Configuration Management with Pydantic](#configuration-management-with-pydantic) - - [Why Pydantic?](#why-pydantic) - - [Integration Strategy](#integration-strategy) - -- [Module Development Guide](#module-development-guide) - - [1. Module Configuration](#1-module-configuration) - - [2. Defining Configuration Models](#2-defining-configuration-models) - - [3. Module Deployment Logic](#3-module-deployment-logic) - - [4. Updating `__main__.py`](#4-updating-__main__py) - - [5. 
Best Practices](#5-best-practices) - -- [Example Module: Cert Manager](#example-module-cert-manager) - - [Configuration Schema](#configuration-schema) - - [Configuration Model](#configuration-model) - - [Deployment Logic](#deployment-logic) - - [Integration in `__main__.py`](#integration-in-__main__py) - -- [Conclusion](#conclusion) - ---- - -## Introduction - -Konductor is a Pulumi-based Infrastructure as Code (IaC) platform designed to streamline DevOps workflows and Platform Engineering practices. It leverages Pulumi for IaC and uses Python for scripting and automation. This guide aims to standardize module development by centralizing configuration management using Pydantic, simplifying module code, and promoting consistency across the codebase. - ---- - -## Design Principles - -- **Modularity**: Each module should be self-contained, defining its own configuration schema and deployment logic. -- **Centralized Configuration**: Use a centralized mechanism for loading and validating configurations to reduce duplication. -- **Type Safety**: Employ Pydantic models for configuration schemas to ensure type safety and validation. -- **Consistency**: Establish clear patterns and standards for module development to ensure uniformity. -- **Developer Experience (DX)**: Simplify the development process with clear guidelines and reusable components. -- **User Experience (UX)**: Provide clear documentation and error messages to enhance user interaction with the platform. -- **Extensibility**: Allow modules to be easily added or modified without affecting the core system. - ---- - -## Code Structure - -- __`__main__.py`__: The entry point of the Pulumi program. Handles global configurations, provider setup, and module deployments. -- **`core/`**: Contains shared utilities and libraries, such as configuration management (`config.py`), deployment orchestration (`deployment.py`), and metadata handling (`metadata.py`). 
-- __`modules//`__: Each module resides in its own directory under `modules/`, containing its specific configuration models (`types.py`) and deployment logic (`deploy.py`). -- __`modules//types.py`__: Defines Pydantic models for module configurations with default values and validation logic. -- __`modules//deploy.py`__: Contains the module-specific deployment logic, taking in the validated configuration and returning relevant outputs. -- __`modules//*.py`__: Contains additional module-specific scripts or utilities, if needed. -- __`modules//README.md`__: Module-specific documentation with configuration options, features, and usage instructions. -- **`requirements.txt`**: Lists the dependencies for the project, including Pydantic and cloud provider SDKs. - ---- - -## Configuration Management with Pydantic - -### Why Pydantic? - -- **Type Safety and Validation**: Ensures configurations are type-safe and valid before deployment. -- **Flexibility**: Allows modules to define complex nested configurations and custom validation logic. -- **Error Reporting**: Provides clear and detailed error messages for invalid configurations. -- **Ease of Use**: Simplifies the configuration process for both developers and users. - -### Integration Strategy - -- **Module Autonomy**: Each module defines its own Pydantic configuration model in `types.py`. -- **Centralized Loading**: A core function handles loading configurations from Pulumi config and passes them to the modules after validation. -- **Consistency**: Modules follow a consistent pattern for defining configurations and deployment functions. - ---- - -## Module Development Guide - -### 1. Module Configuration - -- **Purpose**: Retrieve and validate the module's configuration using Pydantic models. -- __Implementation__: Use the `get_module_config` function in `core/config.py`. 
- -```python -# core/config.py - -from pydantic import ValidationError - -def get_module_config(module_name: str, config: pulumi.Config) -> Tuple[Any, bool]: - module_config_dict = config.get_object(module_name) or {} - module_enabled = module_config_dict.get('enabled', False) - - # Import the module's configuration class - types_module = importlib.import_module(f"modules.{module_name}.types") - ModuleConfigClass = getattr(types_module, f"{module_name.capitalize()}Config") - - try: - # Create an instance of the configuration model - config_obj = ModuleConfigClass(**module_config_dict) - except ValidationError as e: - # Handle validation errors - pulumi.log.error(f"Configuration error in module '{module_name}':\n{e}") - raise - - return config_obj, module_enabled -``` - -### 2. Defining Configuration Models - -- **Purpose**: Define a Pydantic model for the module's configuration with default values and validation logic. -- **Implementation**: Create a `types.py` in the module's directory. - -```python -# modules/module_name/types.py - -from pydantic import BaseModel, Field, validator - -class ModuleNameConfig(BaseModel): - enabled: bool = False - version: Optional[str] = "latest" # For Kubernetes modules - # ... other configuration fields ... - - @validator('version') - def check_version(cls, v): - if v not in ["latest", "stable", "edge"]: - raise ValueError("Invalid version specified") - return v -``` - -### 3. Module Deployment Logic - -- **Purpose**: Implement the module's deployment logic using the validated configuration. -- **Implementation**: Create a `deploy.py` in the module's directory. - -```python -# modules/module_name/deploy.py - -def deploy_module_name( - config: ModuleNameConfig, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], -) -> pulumi.Resource: - # Module-specific deployment logic - # Use configuration values directly, e.g., config.version - # Return the primary resource -``` - -### 4. 
Updating `__main__.py` - -- **Purpose**: Integrate the module into the main Pulumi program. -- **Implementation**: - -```python -# __main__.py - -from core.config import get_module_config - -# Initialize providers and other global configurations -providers = { - 'aws': aws_provider, - 'k8s': k8s_provider, - # Add other providers as needed -} - -# List of modules to deploy -modules_to_deploy = ["aws", "cert_manager", "kubevirt", "multus"] - -# Deploy modules -for module_name in modules_to_deploy: - config_obj, module_enabled = get_module_config(module_name, config) - if module_enabled: - deploy_func = discover_deploy_function(module_name) - primary_resource = deploy_func( - config=config_obj, - global_depends_on=global_depends_on, - providers=providers, - ) - global_depends_on.append(primary_resource) - configurations[module_name] = {"enabled": module_enabled} - else: - pulumi.log.info(f"Module {module_name} is not enabled.") -``` - -### 5. Best Practices - -- **Use Pydantic Models**: Define all configurations using Pydantic for validation and type safety. -- **Autonomy**: Modules control their own configuration schema and validation logic. -- **Error Handling**: Provide clear and informative error messages for configuration issues. -- **Documentation**: Document configuration options and usage in module `README.md` files. -- **Consistency**: Follow the established code structure and patterns for module development. -- **Avoid Global Variables**: Pass necessary objects as arguments to functions. 
- ---- - -## Example Module: Cert Manager - -### Configuration Schema - -```yaml -# Pulumi configuration (Pulumi..yaml) -cert_manager: - enabled: true - version: "v1.15.3" - namespace: "cert-manager" - install_crds: true -``` - -### Configuration Model - -```python -# modules/cert_manager/types.py - -from pydantic import BaseModel - -class CertManagerConfig(BaseModel): - enabled: bool = False - version: str = "latest" - namespace: str = "cert-manager" - install_crds: bool = True - # ... other fields and validators as needed ... -``` - -### Deployment Logic - -```python -# modules/cert_manager/deploy.py - -def deploy_cert_manager( - config: CertManagerConfig, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], -) -> pulumi.Resource: - k8s_provider = providers.get('k8s') - # Deployment logic for Cert Manager using config values - # Return the Helm release resource or primary resource -``` - -### Integration in `__main__.py` - -```python -# __main__.py - -config_cert_manager, cert_manager_enabled = get_module_config('cert_manager', config) -if cert_manager_enabled: - from modules.cert_manager.deploy import deploy_cert_manager - cert_manager_release = deploy_cert_manager( - config=config_cert_manager, - global_depends_on=global_depends_on, - providers=providers, - ) - global_depends_on.append(cert_manager_release) - configurations["cert_manager"] = {"enabled": cert_manager_enabled} -else: - pulumi.log.info("Cert Manager is not enabled.") -``` - ---- - -## Conclusion - -By following this guide and utilizing Pydantic for configuration management, developers can create modules that are: - -- **Robust**: Type-safe and validated configurations reduce runtime errors. -- **Maintainable**: Clear separation of concerns and consistent patterns simplify maintenance. -- **User-Friendly**: Clear documentation and error messages improve user experience. -- **Extensible**: New modules can be added with minimal changes to the core system. 
- -For any questions or further assistance, please refer to the `DEVELOPER.md` document or reach out to the Konductor development team. - ---- - -## Next Steps - -- **Developers**: Start updating or creating modules using the guidelines provided. -- **Users**: Refer to module `README.md` files for configuration options and usage instructions. -- **Contributors**: Follow the contribution guidelines in `DEVELOPER.md` to submit enhancements or bug fixes. diff --git a/pulumi/modules/aws/DEVELOPER.md b/pulumi/modules/aws/DEVELOPER.md deleted file mode 100644 index 72aa63b..0000000 --- a/pulumi/modules/aws/DEVELOPER.md +++ /dev/null @@ -1,258 +0,0 @@ -# Konductor Developer Guide - -## Introduction - -This document is intended for developers who want to contribute to the Konductor IaC codebase. It provides insights into the code structure, development best practices, and the contribution workflow. By adhering to these guidelines, developers can ensure that their contributions align with the project's standards and quality expectations. - ---- - -## Table of Contents - -1. [Code Structure](#code-structure) -2. [Development Best Practices](#development-best-practices) -3. [Contribution Workflow](#contribution-workflow) -4. [Adding Enhancements and Features](#adding-enhancements-and-features) -5. [Testing and Validation](#testing-and-validation) -6. [Documentation Standards](#documentation-standards) -7. [Support and Resources](#support-and-resources) - ---- - -## Code Structure - -The project is organized into modular components to ensure scalability and maintainability. Below is an overview of the key components: - -- __`__main__.py`__: The entry point of the Pulumi program. -- **`core/`**: Contains shared utilities and libraries. - - **`config.py`**: Handles configuration loading and validation using Pydantic. - - **`deployment.py`**: Manages deployment orchestration and module integration. - - **`metadata.py`**: Manages global metadata, labels, and annotations. 
- - **`utils.py`**: Provides generic utility functions. - -- **`modules/`**: Contains individual modules, each in its own directory. - - __`modules//types.py`__: Defines Pydantic models for module configurations. - - __`modules//deploy.py`__: Contains module-specific deployment logic. - - __`modules//*.py`__: Contains other module-specific functions and components. - - __`modules//README.md`__: Module-specific documentation. - -- **`requirements.txt`**: Lists the dependencies for the project. - -### Example Directory Structure - -``` -pulumi/ -├── __main__.py -├── core/ -│ ├── __init__.py -│ ├── config.py -│ ├── deployment.py -│ ├── metadata.py -│ └── utils.py -├── modules/ -│ ├── aws/ -│ │ ├── __init__.py -│ │ ├── types.py -│ │ ├── deploy.py -│ │ ├── # Other components... -│ │ └── README.md -│ ├── cert_manager/ -│ │ ├── __init__.py -│ │ ├── types.py -│ │ ├── deploy.py -│ │ ├── # Other components... -│ │ └── README.md -│ └── # Other modules... -├── requirements.txt -└── # Other files... -``` - ---- - -## Development Best Practices - -### Code Hygiene - -- **Modularity**: Break down functions and logic into small, reusable components. -- **Type Annotations**: Use type hints throughout the code for better readability and tooling support. -- **Docstrings and Comments**: Document code extensively using docstrings and inline comments. -- **Error Handling**: Implement robust error handling and logging for easier debugging. -- **Resource Options**: Use `ResourceOptions` to manage resource dependencies and parent-child relationships. - -### Naming Conventions - -- __Files and Directories__: Use `snake_case` for file and directory names. -- **Classes**: Use `PascalCase` for class names. -- __Variables and Functions__: Use `snake_case` for variable and function names. -- __Constants__: Use `UPPER_SNAKE_CASE` for constants. - -### Configuration Management - -- **Use Pydantic Models**: Define configurations using Pydantic models in `types.py`. 
-- **Validation**: Include validation logic within the models to ensure configurations are correct before deployment. -- **Default Values**: Provide sensible default values for configuration fields. - -### Module Integration - -- **Deployment Functions**: Follow the standard function signature for deployment functions. - -```python -def deploy_( - config: ModuleNameConfig, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], -) -> pulumi.Resource: - # Deployment logic -``` - -- __Configuration Loading__: Use the `get_module_config` function to load and validate configurations. - -- **Providers**: Access required providers from the `providers` dictionary passed to deployment functions. - ---- - -## Contribution Workflow - -### Fork and Clone the Repository - -1. **Fork the repository** on GitHub. - -2. **Clone the forked repository** to your local machine. - -```sh -git clone https://github.com/your-username/konductor.git -cd konductor/pulumi -``` - -### Set Up Development Environment - -1. **Create a virtual environment**: - -```sh -python -m venv venv -source venv/bin/activate -``` - -2. **Install dependencies**: - -```sh -pip install -r requirements.txt -``` - -3. **Configure Pulumi**: - -```sh -pulumi login -pulumi stack init dev -``` - -4. **Set Up Cloud Provider Credentials**: - -- For AWS: - -```sh -aws configure -``` - -- For other providers, follow their respective setup instructions. - -### Create a Feature Branch - -```sh -git checkout -b feature/new-module -``` - -### Implement Changes - -1. **Develop the module** following the guidelines in this document. -2. **Add or modify code** in the appropriate modules. -3. **Write tests** for your changes (if applicable). -4. **Ensure code quality** by running linters and formatters. - -### Commit and Push - -1. **Commit your changes** with clear and concise messages. - -```sh -git add . -git commit -m "Add new module: description" -``` - -2. 
**Push to your fork**: - -```sh -git push origin feature/new-module -``` - -### Create a Pull Request - -1. **Go to the original repository** on GitHub. -2. **Open a pull request** from your feature branch. -3. **Provide a detailed description** of the changes made and any relevant context or documentation. - ---- - -## Adding Enhancements and Features - -### Example: Adding a New Module - -1. **Create Module Directory**: - -```sh -mkdir modules/new_module -touch modules/new_module/__init__.py -``` - -2. **Define Configuration Model**: - -```python -# modules/new_module/types.py - -from pydantic import BaseModel - -class NewModuleConfig(BaseModel): - enabled: bool = False - # ... other configuration fields ... -``` - -3. **Implement Deployment Logic**: - -```python -# modules/new_module/deploy.py - -def deploy_new_module( - config: NewModuleConfig, - global_depends_on: List[pulumi.Resource], - providers: Dict[str, Any], -) -> pulumi.Resource: - # Deployment logic -``` - -4. __Update `__main__.py`__: - - Add the module to `modules_to_deploy`. - - Ensure the module is correctly loaded and deployed. - ---- - -## Testing and Validation - -- **Unit Tests**: Write unit tests for critical functions and logic. -- **Integration Tests**: Test module integration with the core system. -- **Pulumi Preview**: Use `pulumi preview` to validate infrastructure changes. -- **Code Review**: Request reviews from team members to ensure code quality. - ---- - -## Documentation Standards - -- **Module README**: Each module should have a `README.md` explaining its purpose, configuration options, and usage instructions. -- **Docstrings**: Use Google or NumPy style docstrings for functions and classes. -- **Inline Comments**: Add comments to explain complex logic or decisions. -- **Change Logs**: Maintain a `CHANGELOG.md` if applicable. - ---- - -## Support and Resources - -- **Slack Channel**: Join the project's Slack channel for real-time communication. 
-- **Issue Tracker**: Use GitHub Issues to report bugs or request features. -- **Wiki**: Refer to the project wiki for additional resources and guides. diff --git a/pulumi/modules/aws/ROADMAP.md b/pulumi/modules/aws/ROADMAP.md deleted file mode 100644 index 9a77b14..0000000 --- a/pulumi/modules/aws/ROADMAP.md +++ /dev/null @@ -1,289 +0,0 @@ -# Project Roadmap: Comprehensive AWS Organization and IAM Management with Pulumi - -## Introduction - -This project aims to develop a comprehensive AWS Organization and IAM Management program using Pulumi and Python. It will focus on demonstrating high-quality AWS infrastructure automation setups, including AWS Organizations, Control Tower, IAM policies, roles, users, groups, and associated permissions. The primary audience for this project is senior platform engineering teams who require real-world, scalable, and modular infrastructure automation solutions. - -## Goals and Objectives - -- Demonstrate the setup of an AWS Organization with nested Organizational Units (OUs). -- Implement AWS Control Tower for multi-account governance. -- Configure and manage IAM roles, policies, users, and groups. -- Adopt best practices for infrastructure as code (IaC), including modular code, thorough documentation, and type-safe configurations. - -## Plan of Action - -### 1. Code Structure and Modules - -1. **Core Module**: Handles configuration management, deployment orchestration, metadata management, and utility functions. -2. **Infrastructure Module**: Manages AWS-specific resources, including AWS Organization, Control Tower, IAM roles, policies, users, groups, and their permissions. -3. **Utils Module**: Provides helper functions for common tasks and resource transformations. - -### 2. Defining Configuration Structures - -1. **Configuration File (Pulumi.aws.yaml)**: - - - Centralize all configurations, including AWS Organization details, Control Tower settings, IAM configurations, and global tags. 
- - Use nested structures to define tenant accounts, workloads, and associated tags. - -2. **Type-Safe Data Classes**: - - - Define data classes for AWS configurations using Python's `dataclasses` module. - - Ensure type safety and clarity by using well-defined data structures. - -### 3. AWS Organization Setup - -1. **Creating AWS Organizations**: - - Define a function to create AWS Organizations with all features enabled. - -```python -def create_organization() -> aws.organizations.Organization: - """ - Creates an AWS Organization with all features enabled. - - Returns: - aws.organizations.Organization: The AWS Organization resource. - """ - organization = aws.organizations.Organization("my_organization", feature_set="ALL", opts=ResourceOptions(protect=True)) - return organization -``` - -2. **Creating Organizational Units (OUs)**: - - Define a function to create OUs under the specified AWS Organization. - -```python -def create_organizational_units(organization: aws.organizations.Organization, config: ControlTowerConfig) -> Dict[str, aws.organizations.OrganizationalUnit]: - """ - Creates Organizational Units (OUs) under the specified AWS Organization. - - Args: - organization: The AWS Organization resource. - config: ControlTowerConfig - Configuration parameters for Control Tower. - - Returns: - Dict[str, aws.organizations.OrganizationalUnit]: Dictionary of Organizational Unit resources. - """ - organizational_units = {} - for ou_name in config.managed_organizational_unit_names: - ou = aws.organizations.OrganizationalUnit(f"ou_{ou_name.lower()}", name=ou_name, parent_id=organization.roots[0].id, opts=ResourceOptions(parent=organization)) - organizational_units[ou_name] = ou - return organizational_units -``` - -### 4. AWS Control Tower Setup - -1. **Enabling AWS Control Tower**: - - Placeholder function to enable AWS Control Tower (hypothetical until AWS provides full programmatic support). - -### 5. IAM Management - -1. 
**Creating IAM Roles and Policies**: - - Define functions to create IAM roles for Control Tower and other specific needs. - - Attach necessary policies to these roles. - -```python -def create_iam_roles(config: ControlTowerConfig) -> Dict[str, aws.iam.Role]: - """ - Creates IAM roles required by AWS Control Tower. - - Args: - config: ControlTowerConfig - Configuration parameters for Control Tower. - - Returns: - Dict[str, aws.iam.Role]: Dictionary of IAM Role resources. - """ - iam_roles = {} - - # Control Tower Admin Role - admin_role = aws.iam.Role( - "control_tower_admin_role", - name=config.control_tower_admin_role_name, - assume_role_policy="""{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Principal": {"Service": "controltower.amazonaws.com"}, "Action": "sts:AssumeRole"}]}""", - tags=global_tags.__dict__ - ) - aws.iam.RolePolicyAttachment("admin_role_policy_attachment", role=admin_role.name, policy_arn="arn:aws:iam::aws:policy/AdministratorAccess") - - iam_roles["admin_role"] = admin_role - - # Control Tower Execution Role - execution_role = aws.iam.Role( - "control_tower_execution_role", - name=config.control_tower_execution_role_name, - assume_role_policy="""{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Principal": {"AWS": "*"}, "Action": "sts:AssumeRole"}]}""", - tags=global_tags.__dict__ - ) - aws.iam.RolePolicyAttachment("execution_role_policy_attachment", role=execution_role.name, policy_arn="arn:aws:iam::aws:policy/AWSControlTowerExecution") - - iam_roles["execution_role"] = execution_role - - return iam_roles -``` - -2. **Creating and Managing IAM Users**: - - Define functions to create IAM users, assign them to groups, and attach policies. - -```python -def create_iam_resources(tenant_provider: aws.Provider, tenant_account: aws.organizations.Account, config: TenantAccountConfig) -> None: - """ - Creates IAM users, groups, roles, and policies in the tenant account. 
- - Args: - tenant_provider: aws.Provider - The AWS Provider configured for the tenant account. - tenant_account: aws.organizations.Account - The AWS Account resource representing the tenant account. - config: TenantAccountConfig - Configuration parameters for tenant account. - """ - # Create an IAM Group - developers_group = aws.iam.Group(f"{tenant_account.name}_developers_group", name="Developers", path="/teams/", opts=ResourceOptions(provider=tenant_provider, parent=tenant_account)) - - # Create an IAM Policy - developers_policy = aws.iam.Policy(f"{tenant_account.name}_developers_policy", name="DevelopersPolicy", path="/policies/", description="Policy for developers group.", policy="""{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Action": ["ec2:Describe*", "s3:List*"], "Resource": "*"}]}""", opts=ResourceOptions(provider=tenant_provider, parent=developers_group)) - - # Attach the Policy to the Group - aws.iam.GroupPolicyAttachment(f"{tenant_account.name}_developers_policy_attachment", group=developers_group.name, policy_arn=developers_policy.arn, opts=ResourceOptions(provider=tenant_provider, parent=developers_group)) - - # Create IAM Users and add them to the Group - for user_name in ["alice", "bob"]: - iam_user = aws.iam.User(f"{tenant_account.name}_user_{user_name}", name=user_name, path="/users/", tags={"Name": user_name, "Department": "Engineering"}, opts=ResourceOptions(provider=tenant_provider, parent=tenant_account)) - aws.iam.UserGroupMembership(f"{tenant_account.name}_user_{user_name}_group_membership", user=iam_user.name, groups=[developers_group.name], opts=ResourceOptions(provider=tenant_provider, parent=iam_user)) -``` - -### 6. Deploying Workloads in Tenant Accounts - -1. **Deploying Various Workloads**: - - Define functions for deploying EKS clusters, RDS instances, Lambda functions, etc. - - Utilize tenant-specific AWS Providers for these deployments. 
- -```python -def deploy_workload(tenant_provider: aws.Provider, tenant_account: aws.organizations.Account, config: TenantAccountConfig) -> None: - """ - Deploys workloads in the tenant account based on the specified workload type. - - Args: - tenant_provider: aws.Provider - The AWS Provider configured for the tenant account. - tenant_account: aws.organizations.Account - The AWS Account resource representing the tenant account. - config: TenantAccountConfig - Configuration parameters for tenant account. - """ - workload_type = config.workload - if workload_type == "eks_cluster": - deploy_eks_cluster(tenant_provider, tenant_account, config) - elif workload_type == "rds_instance": - deploy_rds_instance(tenant_provider, tenant_account, config) - elif workload_type == "lambda_function": - deploy_lambda_function(tenant_provider, tenant_account, config) - else: - pulumi.log.warn(f"No workload defined for {tenant_account.name}") - -def deploy_eks_cluster(tenant_provider: aws.Provider, tenant_account: aws.organizations.Account, config: TenantAccountConfig) -> None: - # Implement deployment of EKS Cluster - ... - -def deploy_rds_instance(tenant_provider: aws.Provider, tenant_account: aws.organizations.Account, config: TenantAccountConfig) -> None: - # Implement deployment of RDS Instance - ... - -def deploy_lambda_function(tenant_provider: aws.Provider, tenant_account: aws.organizations.Account, config: TenantAccountConfig) -> None: - # Implement deployment of Lambda Function - ... -``` - -### 7. Secrets Management - -1. **Creating Secrets in AWS Secrets Manager**: - - Define a function to create secrets in AWS Secrets Manager for tenant accounts. - -```python -def create_secrets(tenant_provider: aws.Provider, tenant_account: aws.organizations.Account) -> None: - """ - Creates secrets in AWS Secrets Manager in the tenant account. - - Args: - tenant_provider: aws.Provider - The AWS Provider configured for the tenant account. 
- tenant_account: aws.organizations.Account - The AWS Account resource representing the tenant account. - """ - secret = aws.secretsmanager.Secret( - f"{tenant_account.name}_secret", - name=f"{tenant_account.name}-Secret", - description="A secret for demo purposes", - tags=global_tags.__dict__, - opts=ResourceOptions(provider=tenant_provider, parent=tenant_account) - ) - aws.secretsmanager.SecretVersion( - f"{tenant_account.name}_secret_version", - secret_id=secret.id, - secret_string="SuperSecretValue", # Replace with actual secret value - opts=ResourceOptions(provider=tenant_provider, parent=secret) - ) - pulumi.export(f"{tenant_account.name}_secret_arn", secret.arn) -``` - -### 8. Main Execution and Resource Management - -1. **Main Execution Function**: - - Define the main function to call all other functions in sequence, ensuring appropriate dependencies and order of execution. - -```python -def main(): - organization = create_organization() - - if control_tower_config.enable_control_tower: - enable_control_tower(control_tower_config) - - organizational_units = create_organizational_units(organization, control_tower_config) - iam_roles = create_iam_roles(control_tower_config) - tenant_accounts = create_tenant_accounts(organizational_units[control_tower_config.organizational_unit_name], tenant_account_configs) - - apply_control_tower_controls(tenant_accounts) - - for tenant_account in tenant_accounts: - tenant_provider = aws.Provider( - f"tenant_provider_{tenant_account.name}", - assume_role=aws.ProviderAssumeRoleArgs( - role_arn=tenant_account.arn.apply( - lambda arn: arn.replace("arn:aws:organizations::", "arn:aws:iam::").replace(":account/", ":role/OrganizationAccountAccessRole") - ), - session_name="PulumiSession" - ), - region=control_tower_config.region, - opts=ResourceOptions(parent=tenant_account) - ) - - config = tenant_account_configs[tenant_account.name.lower()] - create_iam_resources(tenant_provider, tenant_account, config) - 
deploy_workload(tenant_provider, tenant_account, config) - create_secrets(tenant_provider, tenant_account) - - pulumi.export(f"{tenant_account.name}_id", tenant_account.id) - - for ou_name, ou in organizational_units.items(): - pulumi.export(f"organizational_unit_{ou_name}_id", ou.id) - - pulumi.export("control_tower_admin_role_arn", iam_roles["admin_role"].arn) - pulumi.export("control_tower_execution_role_arn", iam_roles["execution_role"].arn) - -if __name__ == "__main__": - main() -``` - -### 9. Documentation and Best Practices - -1. **Comprehensive Docstrings and Comments**: - - - Ensure all functions and classes are well-documented with detailed docstrings and inline comments. - -2. **Developer Guide**: - - - Create a detailed developer guide to explain the project structure, codebase, and contribution guidelines. - -### 10. Testing and Validation - -1. **Unit Testing**: - - - Write unit tests for critical functions to ensure functionality and prevent regressions. - -2. **Integration Testing**: - - - Deploy the infrastructure in a test environment and validate its correctness and robustness. - -3. **Error Handling**: - - - Implement robust error handling and logging to aid in troubleshooting and debugging. diff --git a/pulumi/modules/aws/eks_donor_opentelemetry_docs.md b/pulumi/modules/aws/eks_donor_opentelemetry_docs.md deleted file mode 100644 index bb4746d..0000000 --- a/pulumi/modules/aws/eks_donor_opentelemetry_docs.md +++ /dev/null @@ -1,517 +0,0 @@ -# AWS Distro for OpenTelemetry (ADOT) with Amazon EKS - -## Overview - -The **AWS Distro for OpenTelemetry (ADOT)** provides a secure, production-ready, AWS-supported distribution of the **OpenTelemetry** project. It enables the collection and export of telemetry data (metrics and traces) from your applications running on Amazon EKS to AWS services like Amazon Managed Service for Prometheus (AMP), Amazon CloudWatch, and AWS X-Ray. 
- -By integrating ADOT with Amazon EKS, you can simplify the setup and management of observability pipelines, allowing you to gain deep insights into your applications and infrastructure with minimal overhead. - ---- - -## Table of Contents - -1. [Introduction](#introduction) -2. [Requirements and Prerequisites](#requirements-and-prerequisites) -3. [Installation](#installation) - - [TLS Certificate Configuration](#tls-certificate-configuration) - - [Installation via AWS Management Console](#installation-via-aws-management-console) - - [Installation via AWS CLI](#installation-via-aws-cli) -4. [Collector Configuration](#collector-configuration) - - [Deployment Modes](#deployment-modes) - - [Example Collector Configurations](#example-collector-configurations) - - [Advanced Configurations](#advanced-configurations) -5. [Advanced Configuration for Add-on Versions](#advanced-configuration-for-add-on-versions) -6. [Instrumenting Applications](#instrumenting-applications) - - [Auto-Instrumentation Injection](#auto-instrumentation-injection) - - [Configuring Auto-Instrumentation with Instrumentation CRD](#configuring-auto-instrumentation-with-instrumentation-crd) -7. [Kubernetes Attributes Processor](#kubernetes-attributes-processor) -8. [Target Allocator](#target-allocator) -9. [Monitoring and Verification](#monitoring-and-verification) -10. [Troubleshooting](#troubleshooting) -11. [Resources and Support](#resources-and-support) -12. [Stay Connected](#stay-connected) - ---- - -## Introduction - -The AWS Distro for OpenTelemetry (ADOT) simplifies the deployment and management of OpenTelemetry components in Amazon EKS. By using the ADOT Operator as an EKS add-on, you can streamline the installation, updates, and configurations of the OpenTelemetry Collector, enhancing observability without extensive manual setup. 
- -### Key Components - -- **ADOT Operator**: Manages the lifecycle of the OpenTelemetry Collector within the Kubernetes environment, using Custom Resource Definitions (CRDs). -- **ADOT Collector**: Receives, processes, and exports telemetry data for both metrics and traces to various AWS services. - -An end-to-end pipeline in ADOT consists of multiple telemetry data flows, including: - -- **Prometheus Metrics Collection**: Collects and sends Prometheus metrics to Amazon Managed Service for Prometheus (AMP). -- **Metrics Pipeline**: Receives OTLP metrics and forwards them to AMP and Amazon CloudWatch. -- **Tracing Pipeline**: Collects distributed traces and sends them to AWS X-Ray. - -![ADOT Operator EKS Pipeline Diagram](#) - ---- - -## Requirements and Prerequisites - -Before installing ADOT on Amazon EKS, ensure that you have the following: - -1. **Amazon EKS Cluster**: An EKS cluster running Kubernetes version 1.21 or higher. - ```bash - kubectl version | grep "Server Version" - ``` - -2. **kubectl**: Installed and configured for your EKS cluster. - ```bash - aws eks update-kubeconfig --name --region - ``` - -3. **eksctl**: Installed for managing EKS clusters. - -4. **AWS CLI v2**: Installed and configured. - -5. **IAM Permissions**: Sufficient IAM roles and policies for EKS and ADOT. - -6. **RBAC Permissions**: Required if installing an add-on version v0.62.1 or earlier. - ```bash - kubectl apply -f https://amazon-eks.s3.amazonaws.com/docs/addons-otel-permissions.yaml - ``` - -**Note**: Currently, ADOT does not support Windows nodes or connected clusters in EKS. - ---- - -## Installation - -### TLS Certificate Configuration - -The ADOT Operator uses admission webhooks to manage OpenTelemetry Collector configurations, which require a TLS certificate trusted by the Kubernetes API server. It is recommended to use **cert-manager** for managing TLS certificates. - -#### Steps to Install cert-manager - -1. 
**Install cert-manager**: - ```bash - kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.2/cert-manager.yaml - ``` - -2. **Verify cert-manager Deployment**: - Check that the cert-manager pods are running in the `cert-manager` namespace. - ```bash - kubectl get pods -w -n cert-manager - ``` - - **Expected Output**: - ``` - NAME READY STATUS RESTARTS AGE - cert-manager-5597cff495-mnb2p 1/1 Running 0 12s - cert-manager-cainjector-bd5f9c764-8jp5g 1/1 Running 0 12s - cert-manager-webhook-5f57f59fbc-h9st8 1/1 Running 0 12s - ``` - -For more details on certificate management, refer to the [cert-manager documentation](https://cert-manager.io/docs/). - -### Installation via AWS Management Console - -1. **Open Amazon EKS Console**: - - Navigate to [Amazon EKS Console](https://console.aws.amazon.com/eks/home#/clusters). - -2. **Select Cluster**: - - In the left pane, choose **Clusters** and select your EKS cluster. - -3. **Add ADOT Add-On**: - - Go to the **Add-ons** tab and click **Get more add-ons**. - - Under **Amazon EKS-addons**, select **AWS Distro for OpenTelemetry** and click **Next**. - -4. **Configure Add-On Settings**: - - Choose the desired **Version** for the ADOT add-on. - - For custom configurations, expand **Optional configuration settings** and input relevant values for the ADOT Collector. - -5. **Conflict Resolution**: - - If the cluster already has a service account without an IAM role, select **Override** under **Conflict resolution method**. - -6. **Review and Install**: - - Verify settings on the **Review and add** page, then click **Create**. - - After installation, ADOT will appear under installed add-ons. - -### Installation via AWS CLI - -1. **Enable ADOT Add-On for EKS**: - ```bash - aws eks create-addon --cluster-name --addon-name adot - ``` - -2. **Verify Add-On Installation**: - - Confirm that the ADOT Operator is running in the `opentelemetry-operator-system` namespace. 
- ```bash - kubectl get pods -n opentelemetry-operator-system - ``` - - - Check the status of the add-on: - ```bash - aws eks describe-addon --addon-name adot --cluster-name - ``` - The `status` should be `"ACTIVE"`. - ---- - -## Collector Configuration - -The ADOT Collector can be deployed in various modes and configured to collect and export telemetry data according to your needs. - -### Deployment Modes - -- **Deployment**: Ideal for centralized metrics and trace collection. -- **DaemonSet**: Suitable for node-level monitoring. -- **StatefulSet**: Supports stateful workloads. -- **Sidecar**: Used when a sidecar pattern is preferred for application pods. - -### Example Collector Configurations - -#### 1. Combined Metrics and Traces Collector - -This configuration collects Prometheus metrics and OTLP traces, exporting metrics to Amazon Managed Service for Prometheus and traces to AWS X-Ray. - -```yaml -apiVersion: opentelemetry.io/v1alpha1 -kind: OpenTelemetryCollector -metadata: - name: sample-adot-collector -spec: - mode: deployment - config: - receivers: - prometheus: - config: - scrape_configs: - - job_name: 'kubernetes-nodes' - kubernetes_sd_configs: - - role: node - otlp: - protocols: - grpc: - http: - processors: - batch: - exporters: - prometheusremotewrite: - endpoint: "" - auth: - authenticator: sigv4auth - awsxray: - extensions: - sigv4auth: - region: "" - service: "aps" - service: - extensions: [sigv4auth] - pipelines: - metrics: - receivers: [prometheus] - processors: [batch] - exporters: [prometheusremotewrite] - traces: - receivers: [otlp] - processors: [batch] - exporters: [awsxray] -``` - -#### 2. CloudWatch Metrics Collector Configuration - -Exports metrics to Amazon CloudWatch. 
- -```yaml -apiVersion: opentelemetry.io/v1alpha1 -kind: OpenTelemetryCollector -metadata: - name: adot-collector-cloudwatch -spec: - mode: deployment - config: - receivers: - prometheus: - config: - scrape_configs: - - job_name: 'cloudwatch-sample' - static_configs: - - targets: ['localhost:9090'] - processors: - batch: - timeout: 5s - exporters: - awsemf: - region: "" - service: - pipelines: - metrics: - receivers: [prometheus] - processors: [batch] - exporters: [awsemf] -``` - -#### 3. X-Ray Traces Collector Configuration - -Exports traces to AWS X-Ray. - -```yaml -apiVersion: opentelemetry.io/v1alpha1 -kind: OpenTelemetryCollector -metadata: - name: adot-collector-xray -spec: - mode: deployment - config: - receivers: - otlp: - protocols: - grpc: - endpoint: 0.0.0.0:55680 - processors: - batch: - timeout: 5s - exporters: - awsxray: - region: "" - service: - pipelines: - traces: - receivers: [otlp] - processors: [batch] - exporters: [awsxray] -``` - -### Advanced Configurations - -For complex observability needs, you can combine multiple receivers, processors, and exporters in a single ADOT Collector configuration. - -Examples include: - -- **Combining Prometheus and OTLP Receivers**: Collect both metrics and traces from your applications. -- **Multiple Exporters**: Send metrics to both Amazon Managed Prometheus and CloudWatch simultaneously. - ---- - -## Advanced Configuration for Add-on Versions - -### For Versions Pre-v0.88.0-eksbuild.1 - -For versions before `v0.88.0-eksbuild.1`, configurations are provided as a JSON string using the `--configuration-values` option during add-on creation or update. 
- -#### Example: Setting CPU Limits and Replica Count - -```json -{ - "manager": { - "resources": { - "limits": { - "cpu": "200m" - } - } - }, - "replicaCount": 2 -} -``` - -#### Applying the Configuration - -```bash -aws eks create-addon \ - --cluster-name \ - --addon-name adot \ - --configuration-values file://configuration-values.json \ - --resolve-conflicts OVERWRITE -``` - -### For Versions v0.88.0 and Above - -For `v0.88.0` and newer versions: - -- Use the `OpenTelemetryCollector` CRD to specify configurations. -- Enhanced customization options are available. -- Follow the AWS-provided migration guide when upgrading from earlier versions. - ---- - -## Instrumenting Applications - -### Auto-Instrumentation Injection - -ADOT supports **auto-instrumentation injection** for applications running on Amazon EKS, enabling automatic injection of OpenTelemetry libraries into workloads without modifying application code. - -#### Enabling Auto-Instrumentation - -Annotate your workloads: - -```yaml -instrumentation.opentelemetry.io/inject-: "true" -``` - -**Supported Languages**: - -- **Java** -- **Node.js** -- **Python** -- **.NET** - -#### Example: Java Application - -```yaml -apiVersion: apps/v1 -kind: Deployment -metadata: - name: my-java-app -spec: - replicas: 1 - template: - metadata: - labels: - app: my-java-app - annotations: - instrumentation.opentelemetry.io/inject-java: "true" - spec: - containers: - - name: app-container - image: my-java-app-image -``` - -### Configuring Auto-Instrumentation with Instrumentation CRD - -Use the **Instrumentation CRD** to customize auto-instrumentation settings. 
- -#### Example Configuration - -```yaml -apiVersion: opentelemetry.io/v1alpha1 -kind: Instrumentation -metadata: - name: my-instrumentation -spec: - exporter: - endpoint: http://adot-collector:4317 - java: - image: public.ecr.aws/aws-observability/adot-autoinstrumentation-java:v1.31.1 -``` - ---- - -## Kubernetes Attributes Processor - -The **Kubernetes Attributes Processor** enriches telemetry signals with Kubernetes metadata, enhancing observability by providing context for signals received from various Kubernetes resources. - -### Configuration Example - -```yaml -processors: - k8sattributes: - pod_association: - - sources: - - from: connection -``` - -Integrate the processor into your pipelines to automatically attach Kubernetes resource attributes like `k8s.pod.name`, `k8s.namespace.name`, and `k8s.node.name` to your telemetry data. - ---- - -## Target Allocator - -The **Target Allocator (TA)** enables flexible and scalable Prometheus service discovery and metrics collection by decoupling scrape target discovery from metrics collection. - -### Enabling the Target Allocator - -Set `OpenTelemetryCollector.spec.targetAllocator.enabled` to `true` in the OpenTelemetry Collector CRD. - -#### Example Configuration - -```yaml -apiVersion: opentelemetry.io/v1alpha1 -kind: OpenTelemetryCollector -metadata: - name: collector-with-ta -spec: - mode: statefulset - targetAllocator: - enabled: true - config: | - receivers: - prometheus: - config: - scrape_configs: - - job_name: 'otel-collector' - scrape_interval: 10s - static_configs: - - targets: ['0.0.0.0:8888'] - processors: - batch: - exporters: - prometheusremotewrite: - endpoint: "" - auth: - authenticator: sigv4auth - extensions: - sigv4auth: - region: "" - service: "aps" - service: - pipelines: - metrics: - receivers: [prometheus] - processors: [batch] - exporters: [prometheusremotewrite] -``` - ---- - -## Monitoring and Verification - -After deploying ADOT, monitor and verify your setup: - -1. 
**Verify Collector Logs**: - ```bash - kubectl logs -n opentelemetry-operator-system - ``` - -2. **Access Metrics in Amazon Managed Prometheus or CloudWatch**: - - Ensure metrics are being collected and exported as configured. - -3. **View Traces in AWS X-Ray**: - - Validate that traces from your applications are visible in AWS X-Ray. - ---- - -## Troubleshooting - -### Common Errors - -1. **Access Denied for `eks:addon-manager` Role**: - - **Error Message**: `"roles.rbac.authorization.k8s.io "opentelemetry-operator-leader-election-role" is forbidden"` - - **Solution**: Update IAM permissions for the `eks:addon-manager` role in the `opentelemetry-operator-system` namespace. - -2. **CREATE_FAILED or UPDATE_FAILED**: - - **Cause**: Conflict or unsupported architecture. - - **Solution**: Use `--resolve-conflicts OVERWRITE` in the EKS command or ensure architecture compatibility for your add-on version. - -3. **DELETE_FAILED**: - - **Cause**: EKS management conflicts. - - **Solution**: Add the `--preserve` flag when deleting the add-on: - ```bash - aws eks delete-addon --cluster-name --addon-name adot --preserve - ``` - ---- - -## Resources and Support - -For further reading and support, refer to: - -- **AWS Distro for OpenTelemetry Documentation**: [aws-otel.github.io](https://aws-otel.github.io/) -- **GitHub Repository**: [aws-observability/aws-otel-collector](https://github.com/aws-observability/aws-otel-collector) -- **OpenTelemetry Specification**: [opentelemetry.io](https://opentelemetry.io/) -- **cert-manager Documentation**: [cert-manager.io/docs](https://cert-manager.io/docs/) - ---- - -## Stay Connected - -- **GitHub Community**: [AWS Observability on GitHub](https://github.com/aws-observability) -- **Twitter**: Follow [@AWSOpenSource](https://twitter.com/AWSOpenSource) for updates on ADOT and other AWS observability tools. 
- -For issues or enhancement requests, file an issue on the [GitHub repository](https://github.com/aws-observability/aws-otel-collector/issues).