Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add lake formation support in catalog #810

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

vgkowski
Copy link
Contributor

Issue #, if available:

Description of changes:

Add the support for Lake Formation permission model in the DataCatalogDatabase and the DataLakeCatalog constructs. Supports both full and hybrid mode.

Checklist

Breaking change checklist

RFC issue #:

  • Migration process documented
  • Implement warnings (if it can live side by side)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@vgkowski vgkowski force-pushed the feat/lake-formation-catalog branch from 3f69301 to ce60908 Compare January 29, 2025 14:19
@vgkowski vgkowski marked this pull request as ready for review January 29, 2025 14:20
@vgkowski vgkowski requested a review from lmouhib January 29, 2025 15:42
* Only needed when permissionModel is set to Lake Formation or Hybrid
* @default - A new role is created
*/
readonly lakeFormationDataAccessRole?: IRole;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be more explicit here? Who will assume this role? What is its principal? Lakeformation, Glue, Cloudformation?

* Only needed when permissionModel is set to Lake Formation or Hybrid
* @default - A new role is created
*/
readonly lakeFormationConfigurationRole?: IRole;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be more explicit here? Who will assume this role? What is its principal? Lakeformation, Glue, Cloudformation?

import { Construct } from 'constructs';
import { DataCatalogDatabaseProps } from './data-catalog-database-props';
import { Context, TrackedConstruct, TrackedConstructProps, Utils } from '../../utils';
import { /*grantDataLakeLocation,*/ grantCrawler, grantDataLakeLocation, putDataLakeSettings, registerS3Location, revokeIamAllowedPrincipal } from './lake-formation-helpers';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you clean this grantDataLakeLocation its commented out.

/**
* The IAM Role used by Lake Formation to access data.
*/
readonly lfDataAccessRole?: IRole;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please be more explicit in the variable naming, I understand lf means lakeformation, but we need to reflect it in the variable name.

/**
* The Lake Formation grant on the data location for the Crawler when Lake Formation or Hybrid is used
*/
readonly crawlerLfLocationGrant?: CfnPrincipalPermissions;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same on Lf

/**
* The IAM Role used to revoke LakeFormation IAMAllowedPrincipals
*/
readonly lfRevokeRole?: IRole;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same on lf

/**
* The Lake Formation grant on the data location for the CDK role
*/
readonly cdkLfLocationGrant?: CfnPrincipalPermissions;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify this is only available when using lakeformation, is it also the case for hybrid?

@@ -59,6 +114,8 @@ export class DataCatalogDatabase extends TrackedConstruct {

super(scope, id, trackedConstructProps);
const catalogType = this.determineCatalogType(props);
this.permissionModel = props.permissionModel || DataCatalogDatabase.DEFAULT_PERMISSION_MODEL;
const useLakeFormation = props.permissionModel === PermissionModel.LAKE_FORMATION || props.permissionModel === PermissionModel.HYBRID;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you first if the value exist? instead of this being implicit. Meaning use props?.


if (catalogType === CatalogType.S3 && useLakeFormation) {

if (props.permissionModel === PermissionModel.LAKE_FORMATION) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a nested condition? please lets remove the nested condition and check this on line 180 too.

* Only needed when permissionModel is set to Lake Formation or Hybrid
* @default - A new role is created for the entire Data Lake
*/
readonly lakeFormationDataAccessRole?: IRole;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as the comment in the datalog datatabse props.

@@ -10,6 +10,17 @@
"@jridgewell/gen-mapping" "^0.3.5"
"@jridgewell/trace-mapping" "^0.3.24"

"@asamuzakjp/css-color@^2.8.2":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a chance we avoid changing the lockfile in the PR? Ideally we would this to be in a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

2 participants