eks: Changing securityGroup deletes entire cluster #28584

@gricey432

Describe the bug

Adding a custom securityGroup to eks.Cluster causes a full replacement of the stateful k8s cluster and data loss.

Expected Behavior

The securityGroup should be updated in place, as it is through the console and SDK, and as cdk diff indicates.

Current Behavior

The custom resource quietly created a new, empty EKS cluster and then deleted the existing cluster.

Reproduction Steps

Should be reproducible by:

  1. Create a barebones eks.Cluster with no securityGroup
  2. Deploy
  3. Add a custom securityGroup to the cluster
  4. Deploy

Possible Solution

Looking at the onEventHandler logs for the custom resource, it looks like changing the SG produced this update set:

{
    "updates": {
        "replaceName": false,
        "replaceVpc": true,
        "updateAccess": false,
        "replaceRole": false,
        "updateVersion": false,
        "updateEncryption": false,
        "updateLogging": false
    }
}

It looks like the check for replaceVpc is too coarse and far too trigger-happy; a security group change alone was enough to set it:

https://github.com/aws/aws-cdk/blob/main/packages/%40aws-cdk/custom-resource-handlers/lib/aws-eks/cluster-resource-handler/cluster.ts#L330

In fact, it looks like the API can replace subnets and security groups in place: https://docs.aws.amazon.com/eks/latest/APIReference/API_UpdateClusterConfig.html

The eks.Cluster resource takes a vpc prop. I think a change to that prop should be the only reason replaceVpc needs to be true.
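
To make the suggestion concrete, here is a rough sketch of what a narrower check might look like. It is not the handler's actual code: the `VpcConfig` shape, the `vpcId` field, the `updateVpc` flag, and the `analyzeVpcChange` and `sameMembers` names are all hypothetical and only illustrate separating "must replace" from "can update in place".

```typescript
// Hypothetical, simplified shape for illustration only. `vpcId` is an
// assumption: it stands in for "the vpc prop changed" and is not taken
// from the real handler's properties.
interface VpcConfig {
  vpcId?: string;
  subnetIds?: string[];
  securityGroupIds?: string[];
}

interface VpcChange {
  replaceVpc: boolean; // the cluster would have to be recreated
  updateVpc: boolean;  // subnets / security groups could be changed in place
}

// Order-insensitive comparison that does not mutate its inputs.
function sameMembers(a: string[] = [], b: string[] = []): boolean {
  if (a.length !== b.length) return false;
  const left = [...a].sort();
  const right = [...b].sort();
  return left.every((value, i) => value === right[i]);
}

function analyzeVpcChange(oldConfig: VpcConfig, newConfig: VpcConfig): VpcChange {
  // Only a different VPC forces a replacement in this sketch.
  const replaceVpc = oldConfig.vpcId !== newConfig.vpcId;
  const membersChanged =
    !sameMembers(oldConfig.subnetIds, newConfig.subnetIds) ||
    !sameMembers(oldConfig.securityGroupIds, newConfig.securityGroupIds);
  return { replaceVpc, updateVpc: !replaceVpc && membersChanged };
}

// Usage: adding a security group to an otherwise unchanged cluster.
const before: VpcConfig = { vpcId: 'vpc-1', subnetIds: ['subnet-b', 'subnet-a'] };
const after: VpcConfig = { ...before, securityGroupIds: ['sg-1'] };
const change = analyzeVpcChange(before, after);
```

With a split like this, the scenario from this issue (same VPC, one added security group) would land in the in-place branch rather than the replacement branch.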

Additional Information/Context

No response

CDK CLI Version

2.113.0 (build ccd534a)

Framework Version

No response

Node.js Version

v18.17.1

OS

Windows 11

Language

TypeScript

Language Version

4.9.3

Other information

#25544 may have aided disaster recovery here

I would like to have a Stack Policy on this stack, but because the cluster is implemented as a custom resource, CloudFormation can only tell me that the CR itself isn't going to be replaced. It can't stop the CR from being destructive or warn me that it will be.

Labels

@aws-cdk/aws-eks, bug, effort/medium, p1
