Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add environment variable configuration for storage #42

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions internal/storages/builder/builder.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ package builder
import (
"context"
"errors"
"os"

"github.com/greenmaskio/greenmask/internal/domains"
"github.com/greenmaskio/greenmask/internal/storages"
Expand All @@ -27,9 +28,10 @@ import (
func GetStorage(ctx context.Context, stCfg *domains.StorageConfig, logCgf *domains.LogConfig) (
storages.Storager, error,
) {
if stCfg.Directory != nil {
envCfg := os.Getenv("STORAGE_TYPE")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to implement the configuration in the next way:

storage:
  type: "s3"

   s3_config: 
     bucket: "test"
  
   directory_config: 
     path: "/tmp"

We should introduce the storage.type parameter and depending on the type provided you can use either S3 config or Directory.

Keep in mind that the env vars mapping should be done via viper/cobra usage here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is important now, because AutomaticEnv does not work when we provide the variable STORAGE_DIRECTORY_PATH=/tmp. The default config file assembles the S3 storage (not the Directory). It would be fine to introduce the next keys in the config:

  • storage.type
  • storage.s3
  • storage.directory

Then the variables will be dynamically working using the next variable naming:

  • For path setting in Directory STORAGE_DIRECTORY_PATH=/tmp
  • For bucket setting in S3 STORAGE_S3_BUCKET=testbucket

if stCfg.Directory != nil || envCfg == "directory" {
return directory.NewStorage(stCfg.Directory)
} else if stCfg.S3 != nil {
} else if stCfg.S3 != nil || envCfg == "s3" {
return s3.NewStorage(ctx, stCfg.S3, logCgf.Level)
}
return nil, errors.New("no one storage was provided")
Expand Down
18 changes: 18 additions & 0 deletions internal/storages/directory/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,24 @@

package directory

import (
"errors"
"os"
)

type Config struct {
Path string `mapstructure:"path"`
}

func NewConfig() *Config {
return &Config{
Path: os.Getenv("STORAGE_DIRECTORY_PATH"),
}
}

func (c *Config) Validate() error {
if c.Path == "" {
return errors.New("path cannot be empty when using directory storage")
}
return nil
}
4 changes: 4 additions & 0 deletions internal/storages/s3/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ package s3

import (
"errors"
"os"
)

const (
Expand Down Expand Up @@ -50,6 +51,9 @@ func NewConfig() *Config {
ForcePathStyle: true,
MaxRetries: defaultMaxRetries,
MaxPartSize: defaultMaxPartSize,
Bucket: os.Getenv("STORAGE_S3_BUCKET_NAME"),
Region: os.Getenv("STORAGE_S3_BUCKET_REGION"),
Prefix: os.Getenv("STORAGE_S3_BUCKET_PREFIX"),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Greenmask employs Viper and Cobra frameworks for managing the configuration. This means you should not hardcode os.Getenv.

Have a look at an example

Copy link
Contributor Author

@joao-zanutto joao-zanutto Mar 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wwoytenko I see, while investigating the code further I realized that Viper would take care of the configuration. As I understand, I believe that it was supposed to load all the configuration from environment variables automatically since viper.AutomaticEnv() is being called in the initConfig() function here.

However, I can't seem to make it work using any environment variables following Viper's docummentation (like STORAGE_DIRECTORY_PATH or COMMON_TMP_DIR). I'm still investigating but I'll surely drop those modifications.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wwoytenko it seems I was able to fix it, it was simply the lack of the separator replacer. I opened a new PR #43 with the change.

On a side note I would like to propose a couple of other changes (and I'm unsure where would be the ideal location to communicate those, so I'm just adding them here):

  • make config.yaml not required -- this would make it easier to run greenmask as a container without needing to rebuild the image with a config file or mounting a volume with the file
  • set common.tmp_dir default value to linux /tmp
  • change Dockerfile entrypoint to greenmask instead of /bin/bash -- this is more of a matter of following convention of other dockerized tools
  • define a user greenmask on the container to avoid running it as root -- this is a security best practice I believe

I can work on all of those if you agree, I've tested a few of those changes locally already and they are really simple, the first one being the most impactful for my use case. To give you some context of my use case: I'm planning to run greenmask once a day as a ECS container inside a VPC in AWS, export dumps to S3 and also use it in Github actions to restore the dump to a postgres database so we can run database migration tests in our CI. So, having all the configurations as environment variables would be ideal, even the transformer configurations for the dump operation (which I still haven't tested yet).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joao-zanutto sure you can start doing it with all the listed fixes/features. Kindly remind you to review the documentation according to the changes will make.

Thank you!

}

Expand Down