A company has stateful security appliances that are deployed to multiple Availability Zones in a centralized shared services VPC. The AWS environment includes a transit gateway that is attached to application VPCs and the shared services VPC. The application VPCs have workloads that are deployed in private subnets across multiple Availability Zones. The stateful appliances in the shared services VPC inspect all east-west (VPC-to-VPC) traffic.
Users report that inter-VPC traffic to different Availability Zones is dropping. A network engineer verified this claim by issuing Internet Control Message Protocol (ICMP) pings between workloads in different Availability Zones across the application VPCs. The network engineer has ruled out security groups, stateful device configurations, and network ACLs as the cause of the dropped traffic.
What is causing the traffic to drop?
The correct approach is to use AWS Systems Manager Session Manager, which allows you to manage your EC2 instances through a secure and browser-based interface. By deploying and configuring SSM Agent on each instance, you can enable Session Manager to communicate with the instances. By deploying VPC endpoints for Session Manager, you can enable the instances to connect to the AWS service without requiring an internet gateway, NAT device, or VPN connection. You can also use IAM policies and SSM documents to implement role-based access control for managing the instances. This approach has the least maintenance overhead, as it does not require any additional infrastructure or configuration.
Amalia
7 days ago