Skip to content

Commit

Permalink
Retry logic fix (#373)
Browse files Browse the repository at this point in the history
* Update retry logic

- look for specific error messages and do not retry the API call when they are present
- Exponential wait times
- wait time changed from nanoseconds to milliseconds

* Update ssid retry logic

- don't retry on specific error messages

* add sleep statement

- added sleep statement to get around errors caused by rapid creation

* feat: Enhanced Sweeper Logic

- separated sweeper api calls into individual funcs
- improved error handling and logging
- added organization sweeper
- consolidated sweeper init() definitions
- docs

* test: fix admin resource names

- updated admin test names/emails with `test_acc` prefix

* chore: sweeper logs

- separated provider.go test, sweeper, http code
- replaced test framework tflog with stdout print statements until context.Background() issue is identified.

* feat: retry logic enhancement

- use http status code range to select criteria for retry
- null reference checks

* fix: updates resource retry logic

- switchports, networks, ssids

* fix: Unrecognized remote plugin message

- removed print func from init

* fix: open roaming certificate err

- set payload value to nil if == 0

* chore: additional test cases for ssids

- more of a todo item

---------

Co-authored-by: Riley Jacobson <[email protected]>
  • Loading branch information
iamdexterpark and Rileyj53 authored Jul 5, 2024
1 parent c4f763e commit 6053d0e
Show file tree
Hide file tree
Showing 15 changed files with 1,081 additions and 451 deletions.
68 changes: 68 additions & 0 deletions TESTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,74 @@ To set up your integration testing environment, follow these steps:
- **Clean Up**: Write tests that clean up resources after they run to prevent unnecessary charges and clutter in your Meraki environment.
- **Monitoring**: Keep an eye on the Meraki dashboard while tests are running. This will help you understand the changes being made and troubleshoot any issues that arise.

## Sweepers

We use terraform sweepers to clean up build artifacts left in the Meraki dashboard from failed integration tests.
In normal conditions sweepers are run atomatically during testing due to TestMain func in the `provider_sweeper_test.go` file.

When writing tests it is important to label all resources with a `test_acc` prefix as this is what the sweepers use to identify artifacts for cleanup.

As specified below, you have the option to manually run the Terraform sweepers for cleaning up test artifacts in the Meraki Dashboard using CLI or IDE program arguments.

**Setting Environment Variables**

In your terminal, set the environment variables as follows:

```shell
export MERAKI_DASHBOARD_API_KEY=your_meraki_api_key
export TF_ACC_MERAKI_ORGANIZATION_ID=your_organization_id
```

**Commands**

To run the meraki_networks sweeper:
```shell
go test -v -sweep=123467890 -sweep-run='meraki_networks'
```

To run the meraki_admins sweeper:
```shell
go test -v -sweep=123467890 -sweep-run='meraki_admins'
```
To run the meraki_organization sweeper:
```shell
go test -v -sweep=123467890 -sweep-run='meraki_organization'
```

To run the meraki_organizations sweeper:
```shell
go test -v -sweep=123467890 -sweep-run='meraki_organizations'
```


**Running in an IDE**

If you are using an IDE like GoLand, you can configure the run configuration to include the necessary program arguments.

1. Open Run/Debug Configurations:
• Go to Run > Edit Configurations...
2. Add New Configuration:
• Click on the + icon and select Go Test.
3. Set Program Arguments:
• In the Program arguments field, add:

```shell
-v -sweep=123467890 -sweep-run='meraki_organization'
```
• Or for meraki_network:
```shell
-v -sweep=123467890 -sweep-run='meraki_network'
```
4. Set Environment Variables:
• In the Environment field, add:
```shell
MERAKI_DASHBOARD_API_KEY=your_meraki_api_key
TF_ACC_MERAKI_ORGANIZATION_ID=your_organization_id
```
5. Apply and Run:
• Apply the configuration and click Run.


## Troubleshooting

If you encounter any issues during testing, consider the following steps:
Expand Down
22 changes: 18 additions & 4 deletions internal/provider/devices_switch_port_resource.go
Original file line number Diff line number Diff line change
Expand Up @@ -445,21 +445,35 @@ func (r *DevicesSwitchPortResource) Read(ctx context.Context, req resource.ReadR

inlineResp, httpResp, err, tfDiags := tools.CustomHttpRequestRetryStronglyTyped(ctx, maxRetries, retryDelay, apiCall)
if err != nil {

if tfDiags.HasError() {
fmt.Printf(": %s\n", tfDiags.Errors())
resp.Diagnostics.AddError("Diagnostics Errors", fmt.Sprintf(" %s", tfDiags.Errors()))
}
fmt.Printf("Error reading device switch port: %s\n", err)
resp.Diagnostics.AddError("Error reading device switch port", fmt.Sprintf(" %s", err))

if httpResp != nil {
var responseBody string
if httpResp.Body != nil {
bodyBytes, readErr := io.ReadAll(httpResp.Body)
if readErr == nil {
responseBody = string(bodyBytes)
} else {
responseBody = fmt.Sprintf("Failed to read response body: %s", readErr)
}
} else {
responseBody = "No response body"
}
fmt.Printf("Failed to create resource. HTTP Status Code: %d, Response Body: %s\n", httpResp.StatusCode, responseBody)
resp.Diagnostics.AddError("Failed to create resource.",
fmt.Sprintf("HTTP Status Code: %d, Response Body: %s\n", httpResp.StatusCode, responseBody))
} else {
resp.Diagnostics.AddError("HTTP Response is nil", "")
}

return
}

// Ensure inlineResp is not nil before dereferencing it
if inlineResp == nil {
fmt.Printf("Received nil response for device switch port: %s, port ID: %s\n", data.Serial.ValueString(), data.PortId.ValueString())
return
}

Expand Down
3 changes: 2 additions & 1 deletion internal/provider/networks_group_policy_resource.go
Original file line number Diff line number Diff line change
Expand Up @@ -1974,10 +1974,11 @@ func (r *NetworksGroupPolicyResource) Create(ctx context.Context, req resource.C

// API call function to be passed to retryOn4xx
apiCall := func() (map[string]interface{}, *http.Response, error) {
time.Sleep(retryDelay * time.Millisecond)

inline, httpResp, err := r.client.NetworksApi.CreateNetworkGroupPolicy(ctx, plan.NetworkId.ValueString()).CreateNetworkGroupPolicyRequest(groupPolicy).Execute()
return inline, httpResp, err
}

// Use retryOn4xx for the API call as the meraki API backend returns HTTP 400 messages as a result of collision issues with rapid creation of postgres GroupPolicyIds.
inlineResp, httpResp, err := tools.CustomHttpRequestRetry(ctx, maxRetries, retryDelay, apiCall)
if err != nil {
Expand Down
94 changes: 50 additions & 44 deletions internal/provider/networks_network_resource.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@ import (
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"strings"
"time"

Expand Down Expand Up @@ -304,7 +306,6 @@ func (r *NetworkResource) Create(ctx context.Context, req resource.CreateRequest
return
}

// Note: Id is properly returned here, shouldnt be example
data.Id = data.NetworkId.StringValue

if data.CopyFromNetworkId.IsUnknown() {
Expand Down Expand Up @@ -590,62 +591,67 @@ func (r *NetworkResource) Delete(ctx context.Context, req resource.DeleteRequest
// Read Terraform state data
resp.Diagnostics.Append(req.State.Get(ctx, &data)...)

// HTTP DELETE METHOD does not leverage the retry-after header and throws 400 errors.
retries := 100 // This is the only way to ensure a scaled result.
wait := 1
var deletedFromMerakiPortal bool
deletedFromMerakiPortal = false
maxRetries := r.client.GetConfig().MaximumRetries
retryDelay := time.Duration(r.client.GetConfig().Retry4xxErrorWaitTime)

for retries > 0 {
// API call function to be passed to retryOn4xx
apiCall := func() (map[string]interface{}, *http.Response, error) {
time.Sleep(retryDelay * time.Millisecond)

// Initialize provider client and make API call
httpResp, err := r.client.NetworksApi.DeleteNetwork(context.Background(), data.NetworkId.ValueString()).Execute()
return nil, httpResp, err
}

if httpResp.StatusCode == 204 {
// check for HTTP errors
if err != nil {
resp.Diagnostics.AddError(
"HTTP Client Failure",
tools.HttpDiagnostics(httpResp),
)
return
}
// HTTP DELETE METHOD does not leverage the retry-after header and throws 400 errors.
_, httpResp, err := tools.CustomHttpRequestRetry(ctx, maxRetries, retryDelay, apiCall)
if err != nil {
resp.Diagnostics.AddError(
"Error creating group policy",
fmt.Sprintf("Could not create group policy, unexpected error: %s", err),
)

// Check for errors after diagnostics collected
if resp.Diagnostics.HasError() {
return
if httpResp != nil {
var responseBody string
if httpResp != nil && httpResp.Body != nil {
bodyBytes, readErr := io.ReadAll(httpResp.Body)
if readErr == nil {
responseBody = string(bodyBytes)
}
}

deletedFromMerakiPortal = true

// escape loop
break

} else {

// decrement retry counter
retries -= 1

// exponential wait
time.Sleep(time.Duration(wait) * time.Second)
wait += 1
tflog.Error(ctx, "Failed to create resource", map[string]interface{}{
"error": err.Error(),
"httpStatusCode": httpResp.StatusCode,
"responseBody": responseBody,
})
resp.Diagnostics.AddError(
"Error creating group policy",
fmt.Sprintf("HTTP Response: %v\nResponse Body: %s", httpResp, responseBody),
)
}
return
}

// Check if deleted from Meraki Portal was successful
if deletedFromMerakiPortal {
resp.State.RemoveResource(ctx)
// check for HTTP errors
if httpResp.StatusCode != 204 {
if err != nil {
resp.Diagnostics.AddError(
"HTTP Client Failure",
tools.HttpDiagnostics(httpResp),
)
return
}
}

// Write logs using the tflog package
tflog.Trace(ctx, "removed resource")
} else {
resp.Diagnostics.AddError(
"Failed to delete resource",
"Failed to delete resource",
)
// Check for errors after diagnostics collected
if resp.Diagnostics.HasError() {
return
}

resp.State.RemoveResource(ctx)

// Write logs using the tflog package
tflog.Trace(ctx, "removed resource")

}

func (r *NetworkResource) ImportState(ctx context.Context, req resource.ImportStateRequest, resp *resource.ImportStateResponse) {
Expand Down
9 changes: 0 additions & 9 deletions internal/provider/networks_network_resource_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -7,15 +7,6 @@ import (
"testing"
)

func init() {
resource.AddTestSweepers("meraki_network", &resource.Sweeper{
Name: "meraki_network",
F: func(organization string) error {
return sweepMerakiNetwork(organization)
},
})
}

func TestAccOrganizationsNetworkResource(t *testing.T) {

resource.Test(t, resource.TestCase{
Expand Down
51 changes: 50 additions & 1 deletion internal/provider/networks_wireless_ssids_resource.go
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ import (
"github.com/hashicorp/terraform-plugin-framework/types/basetypes"
"github.com/hashicorp/terraform-plugin-log/tflog"
openApiClient "github.com/meraki/dashboard-api-go/client"
"io"
"net/http"
"strconv"
"strings"
Expand Down Expand Up @@ -483,6 +484,9 @@ func NetworksWirelessSsidPayloadRadiusServers(input types.List) ([]openApiClient
diags = append(diags, diag.NewErrorDiagnostic("Error converting OpenRoamingCertificateID", fmt.Sprintf("%s", err.Errors())))
}

if *openRoamingCertificateId == 0 {
openRoamingCertificateId = nil
}
serversList = append(serversList, openApiClient.UpdateNetworkWirelessSsidRequestRadiusServersInner{
Host: server.Host.ValueString(),
Port: port,
Expand Down Expand Up @@ -3995,13 +3999,38 @@ func (r *NetworksWirelessSsidsResource) Create(ctx context.Context, req resource
inline, respHttp, err := r.client.WirelessApi.UpdateNetworkWirelessSsid(context.Background(), plan.NetworkId.ValueString(), fmt.Sprint(plan.Number.ValueInt64())).UpdateNetworkWirelessSsidRequest(payload).Execute()
return inline, respHttp, err
})

// Capture the response body for logging
var responseBody string
if httpResp != nil && httpResp.Body != nil {
bodyBytes, readErr := io.ReadAll(httpResp.Body)
if readErr == nil {
responseBody = string(bodyBytes)
}
}

// Check if the error matches a specific condition
if err != nil {
// Terminate early if specific error condition is met
if strings.Contains(responseBody, "Open Roaming certificate 0 not found") {
tflog.Error(ctx, "Terminating early due to specific error condition", map[string]interface{}{
"error": err.Error(),
"responseBody": responseBody,
})
resp.Diagnostics.AddError(
"HTTP Call Failed",
fmt.Sprintf("Details: %s", responseBody),
)
return
}

// Check for the specific unmarshaling error
if strings.Contains(err.Error(), "json: cannot unmarshal number") && strings.Contains(err.Error(), "GetNetworkWirelessSsids200ResponseInner.minBitrate") {
tflog.Warn(ctx, "Suppressing unmarshaling error: json: cannot unmarshal number into GetNetworkWirelessSsids200ResponseInner.minBitrate of type int32")
} else {
tflog.Error(ctx, "HTTP Call Failed", map[string]interface{}{
"error": err.Error(),
"error": err.Error(),
"responseBody": responseBody,
})
resp.Diagnostics.AddError(
"HTTP Call Failed",
Expand Down Expand Up @@ -4055,6 +4084,12 @@ func (r *NetworksWirelessSsidsResource) Read(ctx context.Context, req resource.R
tflog.Warn(ctx, "Suppressing unmarshaling error: json: cannot unmarshal number into GetNetworkWirelessSsids200ResponseInner.minBitrate of type int32")
err = nil
}

// Check for specific error
if strings.Contains(err.Error(), "Open Roaming certificate 0 not found") {
tflog.Warn(ctx, "Suppressing unmarshaling error: Open Roaming certificate 0 not found")
err = nil
}
}
return inline, respHttp, err
})
Expand Down Expand Up @@ -4122,6 +4157,13 @@ func (r *NetworksWirelessSsidsResource) Update(ctx context.Context, req resource
tflog.Warn(ctx, "Suppressing unmarshaling error: json: cannot unmarshal number into GetNetworkWirelessSsids200ResponseInner.minBitrate of type int32")
err = nil
}

// Check for specific error
if strings.Contains(err.Error(), "Open Roaming certificate 0 not found") {
tflog.Warn(ctx, "Suppressing unmarshaling error: Open Roaming certificate 0 not found")
err = nil
}

}
return inline, respHttp, err
})
Expand Down Expand Up @@ -4175,6 +4217,13 @@ func (r *NetworksWirelessSsidsResource) Delete(ctx context.Context, req resource
tflog.Warn(ctx, "Suppressing unmarshaling error: json: cannot unmarshal number into GetNetworkWirelessSsids200ResponseInner.minBitrate of type int32")
err = nil
}

// Check for specific error
if strings.Contains(err.Error(), "Open Roaming certificate 0 not found") {
tflog.Warn(ctx, "Suppressing unmarshaling error: Open Roaming certificate 0 not found")
err = nil
}

}
return inline, respHttp, err
})
Expand Down
Loading

0 comments on commit 6053d0e

Please sign in to comment.