Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warning on Duplicate keys in a map object. #28727

Open
techdragon opened this issue May 18, 2021 · 24 comments
Open

Warning on Duplicate keys in a map object. #28727

techdragon opened this issue May 18, 2021 · 24 comments
Labels
enhancement new new issue not yet triaged

Comments

@techdragon
Copy link

Current Terraform Version

Terraform v0.15.0
on darwin_amd64

Use-cases

Avoiding duplicate keys in maps.

Attempted Solutions

Linter scripts and third party tools.

Proposal

Duplicate keys in maps should raise some kind of warning. This would help prevent mistakes when working with larger and more complex maps. In particular dynamic maps and merged maps where the keys may not be visible in the terraform configuration as written, only being populated when terraform is run.

References

N.A.

@techdragon techdragon added enhancement new new issue not yet triaged labels May 18, 2021
@myronmeier
Copy link

Thanks for raising this, @techdragon, literally days before we ran into the same issue. Upvoted! I would even go so far as to propose it be an error and not just a warning.

Here is a simple demonstration.

terraform {
  required_version = "0.15.4"
}

locals {
  local_dbs = {
    db1 = { a = "1a" }
    db1 = { a = "2a" }    // notice the key here is the same as the previous (e.g. a mistake)
    db3 = { a = "3a" }
  }
}

output locals_map_out {
  value = local.local_dbs
}

The above produces these outputs

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

locals_map_out = {
  "db1" = {
    "a" = "2a"
  }
  "db3" = {
    "a" = "3a"
  }
}

Notice the loss of the 1st db1 map entry. There was no error and no warning. It just silently replaced the 1st map entry with the 2nd.

The same can be demonstrated with input variables as well

variable "var_dbs" {
  type = map(object({
    a = string
  }))
}

output variable_map_out {
  value = var.var_dbs
}

With this terraform.tfvars

var_dbs = {
  db1 = { a = "1a" }
  db1 = { a = "2a" }
  db3 = { a = "3a" }
}

Produces

Outputs:

variable_map_out = tomap({
  "db1" = {
    "a" = "2a"
  }
  "db3" = {
    "a" = "3a"
  }
})

Adding more outputs to the locals example above to see if the 1st map entry is just somehow hidden indicates that it was likely lost while constructing the map itself. This similar older issue #18573 (circa TF 11?) seems to confirm that, but it also makes a prediction about duplicate map keys being an error in future versions (TF 12?), which appears to have not come true.

output locals_map_length {
  value = length(local.local_dbs)
}

output locals_map_keys {
  value = keys(local.local_dbs)
}

output locals_map_values {
  value = values(local.local_dbs)
}

output locals_map_json {
  value = jsonencode(local.local_dbs)
}
Outputs:

locals_map_json = "{\"db1\":{\"a\":\"2a\"},\"db3\":{\"a\":\"3a\"}}"
locals_map_keys = [
  "db1",
  "db3",
]
locals_map_length = 2
locals_map_out = {
  "db1" = {
    "a" = "2a"
  }
  "db3" = {
    "a" = "3a"
  }
}
locals_map_values = [
  {
    "a" = "2a"
  },
  {
    "a" = "3a"
  },
]

Even reading from JSON silently drops the duplicate key without error or warning

locals {
  local_json = jsondecode( <<-EOT
    {
      "db1": { "a": "1a" },
      "db1": { "a": "2a" },
      "db3": { "a": "3a" }
    }
    EOT
  )
}

output locals_json_out {
  value = local.local_json
}
Outputs:

locals_json_out = {
  "db1" = {
    "a" = "2a"
  }
  "db3" = {
    "a" = "3a"
  }
}

@myronmeier
Copy link

@techdragon , I'm curious what linters you tried. I quickly tried https://github.com/terraform-linters/tflint but it didn't catch the duplicate keys in any of the examples above.

@techdragon
Copy link
Author

@myronmeier Thanks for writing out the excellent examples! And yes I had the same issue tflint not catching it, but since the issue goes beyond the maps as written statically in code, I just mentioned the linters in passing as I had tried them as a workaround, but I probably should have been more clear about what I meant.

@nwsparks
Copy link

@myronmeier I agree, this should be an error. Warning is not sufficient as it's easily missed and this can be very problematic.

From what I can tell this can't be caught by variable validation either.

nsparks@w10-vm:/mnt/c/test$ cat data.tf
variable "routing_rules" {
  default =   {
    "8000" = {
      host_headers = ["h1"]
    },
    "8000" = {
      host_headers = ["h2"]
    },
    "8001" = {
      host_headers = ["h3"]
    }
  }

  validation {
    condition = length(keys(var.routing_rules)) == 3
    error_message = "Length is not 3."
  }
}
nsparks@w10-vm:/mnt/c/test$ terraform console
╷
│ Warning: Due to the problems above, some expressions may produce unexpected results.
│
│
╵

╷
│ Error: Invalid value for variable
│
│   on data.tf line 1:
│    1: variable "routing_rules" {
│
│ Length is not 3.
│
│ This was checked by the validation rule at data.tf:14,3-13.
╵

@techdragon
Copy link
Author

techdragon commented Jul 10, 2021

@nwsparks I didn't expect validation to fail like that... Also I'm not 100% convinced that it should always be an error. I can imagine there are users out there who may be relying on this to power various workarounds for providing default variable values that may have been inherited from older terraform modules or when needing to pass in configuration and merge defaults with more complex map related values. While it might be good to make this an error in time, I'd rather have the warning added now, with perhaps some way to configure this warning as an error, than navigate the longer path of changing behaviour like this now that 1.0 released and the terraform team have more to worry about with respect to long term compatibility.

Edit: Just found another example of duplicated key names that may become a common pattern in future that could conflict with making duplicate keys an error #25088 has an example where the key is defined multiple times for running different validation checks against its value.

@philip-harvey
Copy link

I struck this today, and it's very dangerous. In terraform.tfvars it will happily let you define a map with the same key multiple times as per the example above and doesn't throw an error, but it discards all but one of the map objects with that key. This can lead to non-recoverable data loss. I tried to add a test to check for duplicate keys, but it seems to discard the data right when it reads it in so there doesn't seems to be any way to test for this inside Terraform. I suggest that Terraform should error if you have multiple occurrences of the same key in a map in terraform.tfvars.

@fernando-villalba
Copy link

fernando-villalba commented Feb 22, 2022

This one is really bizarre that you can have duplicate keys for objects and maps in terraform as it can be a considerable to debug headache and potentially catastrophic if not careful. Neither Python, Java or Golang accept duplicate keys for maps so TF accepting them is just strange

What's even stranger is how very few downvotes this has! I hope it gets addressed soon.

@ams1
Copy link

ams1 commented Oct 28, 2022

For anyone interested in workarounds for this, I opened a SO question:
https://stackoverflow.com/questions/74233973/terraform-map-alternative-when-duplicated-keys-should-raise-error/74234215#74234215

@naitmare01
Copy link

I managed to solve it by first creating a list of items containing my key-value pair in my map. Then I converted it to a map, the conversion itself will check for duplicates and error out if any exist.

locals{
tuple_items = [
{ "AB" = {
ID = "123"
} },
{ "CD" = {
ID = "456"
} }
]
tuple_to_map = { for item in local.tuple_items : keys(item)[0] => values(item)[0] } # Transform the tuple to a map to be able to iterate over it, output errors if not all keys are unique
}

@scottpiper-wiz
Copy link

This ticket was referenced recently in this blog post which identified a security issue that results from this functionality: https://securitylabs.datadoghq.com/articles/exploring-github-to-aws-keyless-authentication-flaws/

When you create an AWS IAM role in terraform with a trust policy that contains two StringEquals elements, this results in one of the element disappearing when the IAM role is created by terraform. These StringEquals elements are required to restrict who can access this role. In the example provided, the role is being created for a Github Action, but without both restrictions, the researcher identified that a number of these roles can be assumed by any Github Action in any repository. The researcher found a number of organizations were vulnerable to this issue, including a UK government org.

image

In AWS, it is not possible to have the two StringEquals elements, and they must instead be combined into a single element. There are unfortunately some guides online that show this incorrect terraform code (which I'm working to get changed). Because terraform silently swallows one of the elements with an important restriction, the users are not aware that they are deploying IAM roles which anyone can assume (thereby gaining access to their AWS environments).

It looks like there is more recent discussion of this problem in #33570, where @apparentlymart stated:

I don't see a path to get there within the Terraform v1.x Compatibility Promises

If that is the case, and this will not be fixed in terraform, then should a check for this be added to tflint instead or is there a better place? https://github.com/terraform-linters/tflint

I am additionally worried that beyond Github Actions and AWS IAM Roles, that the problem of silently swallowing security restrictions on resources has impact beyond just that use case.

@naitmare01
Copy link

I am additionally worried that beyond Github Actions and AWS IAM Roles, that the problem of silently swallowing security restrictions on resources has impact beyond just that use case.

I can only agree with the last paragraph @scottpiper-wiz after reading both your comment and also the datadog article you linked.

I highly doubt that this is limited to GitHub action and AWS IAM Roles.

I don’t think this belong in a linter, this should be caught directly in Terraform thus preventing anyone actually deploying this kind of code. Like my own work around in locals. See my previous comment in this thread.
The least Terraform should do is to catch and inform of which key in the map it disregards.

@scottpiper-wiz
Copy link

In tracking down how this could be implemented, my understanding is jsonencode calls directly to https://github.com/zclconf/go-cty as shown here:

"jsonencode": stdlib.JSONEncodeFunc,

This then calls into json.Marshall as shown here: https://github.com/zclconf/go-cty/blob/7dcbae46a6f247e983efb1fa774d2bb68781a333/cty/function/stdlib/json.go#L59C15-L59C27

In encoding/json, a proposal for Decoder.DisallowDuplicateFields has been accepted here: golang/go#48298 (comment)

But the functionality into golang has not yet been merged.

Implementing this could therefore be done by:

  • Motivating the Decoder.DisallowDuplicateFields to be implemented and using that.
  • Or, use a different library for json marshaling rather than the built-in one.

@apparentlymart
Copy link
Contributor

It's been a long time since I considered this issue and so I honestly don't really remember all of what I was considering when I responded before.

My interpretation of the original idea here is that it should be an error or warning if an object constructor expression (which we might also casually call an "object literal") can see that two of the key/value pairs have key expressions that evaluate to equal values. I have a faint memory of originally having implemented it that way in my own codebase "zcl" that later became HCL 2 after HashiCorp hired me, and then having to relax that rule in response to some compatibility problem relative to Terraform v0.11. I don't remember exactly what the deal was with that, though. (Object-constructing for expressions do still have this uniqueness restriction, leaving the language slightly inconsistent in this regard.)

Ultimately the handling of object constructor expressions belongs to HCL rather than to Terraform, and so if we are going to change something here I expect it would need to be an HCL change rather than a Terraform change. That presents two challenges which were perhaps what I was reacting to in earlier discussion:

  • It can't be an error because that would mean something that was previously valid would become invalid. That would violate both Terraform's explicit compatibility promises and HCL's implied semver compatibility.

  • There isn't really any precedent for HCL-level language features to generate warnings. The concept of warning diagnostics is primarily for application-level concerns.

    Although it would arguably be backward compatible to generate a warning here in principle, I think in practice that is likely to upset assumptions that other calling applications make about HCL. Most HCL callers are not as sophisticated as Terraform and in particular often cannot propagate fully-fledged diagnostics all the way to the UI, and at best those applications would probably just drop the warning altogether and at worst will fail in confusing ways due to naive treatment of the diagnostics as a funny sort of Go error.

I don't think it's possible to solve this at the application layer inside Terraform -- including via Terraform's built-in functions -- because by the time Terraform is holding the evaluated object result the duplicate attributes have already been discarded.

With all of that said then, I think treating this as a "lint"-like check is probably the most viable path given the existing constraints. The third-party tflint could potentially do it, although we don't have direct influence over that tool. It could also potentially live as part of #29661, but that idea seems to have attracted very little interest so I don't expect I'll be given permission to work on it further for the foreseeable future.

At this point I don't think there is any clear path to resolving this issue. We can keep it open to track the concern, but I don't expect anything further to happen here unless someone has a concrete proposal for what to change, how to change it, and in particular how to do it without breaking compatibility promises or impacting other callers of HCL.

@nwsparks
Copy link

nwsparks commented Aug 15, 2023

It's difficult to imagine that anyone has intentionally created a map with duplicate keys expecting that earlier entries will be discarded, but I guess anything is possible. This does seem to me as more of a bug or security issue rather than a feature which must be preserved.

https://developer.hashicorp.com/terraform/language/v1-compatibility-promises#pragmatic-exceptions the promise does have exceptions that I could see this classifying under.

If there is no way to resolve this in 1.x could it not just be targeted as a fix in 2.0, hopefully altering the behavior so that it returns an error?

@stefansedich
Copy link

Another +1 for this being an error, our team ran into this one today where a later definition overrode permissions from an earlier definition.

@nhooey
Copy link

nhooey commented Jan 8, 2024

@nwsparks said:

It's difficult to imagine that anyone has intentionally created a map with duplicate keys expecting that earlier entries will be discarded, but I guess anything is possible. This does seem to me as more of a bug or security issue rather than a feature which must be preserved.

I agree with the sentiment that this issue should be interpreted as a bug rather than a behaviour worth preserving.

This issue also emerges when converting JSON to HCL with json2hcl tool (which uses HCL v1 instead of v2), since the HCL library it uses is likely converting JSON key-to-list-of-maps assignments to duplicate HCL variable assignments, shown in this example:

$ echo '{"Simple List": [1, 2, 3], "List of Maps": [{"k1": 1, "k2": 2}, {"k3": 3}]}' | json2hcl -

"Simple List" = [1, 2, 3]

"List of Maps" = {
  "k1" = 1
  "k2" = 2
}

"List of Maps" = {
  "k3" = 3
}

This can hardly be desirable behavior, given that the last assignments of "List of Maps" will get discarded (as discussed in previous comments), and especially since Terraform has the ability to properly interpret lists with a perfect 1:1 mapping of JSON to HCL.

Specifically, Terraform can interpret these HCL lists without issue and encode it to JSON:

"Simple List" = [1, 2, 3]

# Notably: an actual list instead of duplicate HCL variable assignments:
"List of Maps" = [
  {
    "k1" = 1
    "k2" = 2
  },
  {
  "k3" = 3
  },
]

@rpuserh
Copy link

rpuserh commented Jan 10, 2024

Same story. I think this is a bug as no any other language accept this including terraform vars. for some reason locals in terraform do. How do we escalate this issue? As I see it was open since May 2021 ?

locals {
  azure_data_map = {
    development = {
      location = "West Europe"
      size     = "small"
    }
    production = {
      location = "Central US"
      size     = "medium"
    }
    production = {                      // Note duplicate key
      location = "Central US"
      size     = "medium1"
    }
  }
}

output "outtest" {
  value = local.azure_data_map
}
Outputs:

outtest = {
  "development" = {
    "location" = "West Europe"
    "size" = "small"
  }
  "production" = {
    "location" = "Central US"
    "size" = "medium1"
  }
}

@Shocktrooper
Copy link

Just ran into this where no warning or error was generated for duplicate map keys added by different developers and one entry into the map was hiding values from the true map entry. I am surprised that terraform acts this way and hides this error as map types almost universally across languages and even mathematics has the key value be unique in a singular map

@yermulnik
Copy link

TF 1.9.2 — just ran into this too and was quite surprised TF doesn't throw error.

@ericrichtert
Copy link

tflint-ruleset-terraform 0.9.0 and above has a rule for duplicate keys in a map. See terraform-linters/tflint-ruleset-terraform#194.

But this one doesn't seem to handle .tfvars files

@lra
Copy link

lra commented Aug 17, 2024

tflint-ruleset-terraform 0.9.0 and above has a rule for duplicate keys in a map. See terraform-linters/tflint-ruleset-terraform#194.

But this one doesn't seem to handle .tfvars files

This seems to have been reverted in v0.9.1 https://github.com/terraform-linters/tflint-ruleset-terraform/releases/tag/v0.9.1

@ericrichtert
Copy link

As far as I can see, the revert does not handle tfvars files

@jeff-carey
Copy link

Just ran across this problem myself. The duplicate was unintentional and undesirable, and only caught during a review of the resulting plan. I really can't see this as anything other than a bug which needs to be corrected.

@jdix531
Copy link

jdix531 commented Jan 5, 2025

For anyone interested in workarounds for this, I opened a SO question: https://stackoverflow.com/questions/74233973/terraform-map-alternative-when-duplicated-keys-should-raise-error/74234215#74234215

This was very helpful to find a few idiomatic-enough-ways to get by for now. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement new new issue not yet triaged
Projects
None yet
Development

No branches or pull requests