Enhancing AWS Security and Efficiency

Summary

This case study outlines a series of improvements aimed at enhancing the security, efficiency, and reliability of a customer’s AWS infrastructure. We were engaged to review the overall condition of the AWS account and all workloads running in one availability zone.

This document provides actionable solutions to address existing limitations within the customer’s setup. Our proposed changes cover AWS IAM user management, instance optimization for production workloads, database storage adjustments, Terraform code base consolidation, and the use of AWS Organizations for improved access control and security.

IAM Users and MFA Enforcement

Current Challenge: A single AWS account is in use, with IAM users managing access and permissions, and it does not currently mandate Multi-Factor Authentication (MFA).

Proposed Solution: Enforce MFA for all IAM users to reduce the risk of unauthorized access through stolen or guessed passwords. MFA adds a layer of security by requiring a second form of verification beyond the password.
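
One common way to enforce MFA for IAM users is an identity-based policy that denies almost everything until the session was authenticated with MFA. The sketch below builds such a policy document in Python. The function name and the list of actions left available for enrolling an MFA device are illustrative assumptions, not the exact policy applied for this customer.

```python
import json

# Actions that stay available without MFA so a user can still enrol a device.
# This allow-list is an illustrative assumption; adjust it before use.
BOOTSTRAP_ACTIONS = [
    "iam:ChangePassword",
    "iam:CreateVirtualMFADevice",
    "iam:EnableMFADevice",
    "iam:GetUser",
    "iam:ListMFADevices",
    "iam:ListVirtualMFADevices",
    "iam:ResyncMFADevice",
    "sts:GetSessionToken",
]


def build_mfa_enforcement_policy():
    """Return an IAM policy document that denies non-bootstrap actions without MFA."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAllExceptBootstrapWithoutMFA",
                "Effect": "Deny",
                # NotAction: the deny applies to everything except these actions.
                "NotAction": BOOTSTRAP_ACTIONS,
                "Resource": "*",
                # BoolIfExists also matches requests where the key is absent,
                # for example calls signed with long-term access keys.
                "Condition": {
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_mfa_enforcement_policy(), indent=2))
```

The resulting JSON would typically be attached to a group that contains all IAM users, so that new users inherit the restriction automatically.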

Reference:

Optimizing Production Instances

Current Challenge: Production environments run on T3 instances, subject to CPU credit limits, risking performance degradation and potential outages.

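Proposed Solution: Move production to fixed-performance instances, such as the compute-optimized (C) or general-purpose (M) families, or switch the T3 instances to unlimited credit mode so that they are not throttled to baseline when credits run out (sustained usage above baseline may then incur extra charges). This change aims to provide stable and predictable performance, avoiding disruptions due to exhausted CPU credits.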
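
To illustrate why credit limits matter, the following sketch estimates how long a burstable instance in standard mode can sustain a given load before its credit balance is empty. It relies on the definition that one CPU credit equals one vCPU running at 100% for one minute. The function name is hypothetical, and the example figures (2 vCPUs, 24 credits earned per hour, a maximum balance of 576 credits for a t3.medium) are assumptions that should be verified against current AWS documentation.

```python
def hours_until_credits_exhausted(balance, earned_per_hour, vcpus, utilization_pct):
    """Estimate hours until the CPU credit balance reaches zero.

    One credit = one vCPU at 100% utilization for one minute, so an instance
    spends vcpus * utilization * 60 credits per hour.
    Returns None when the load is at or below the baseline (no net drain).
    """
    spent_per_hour = vcpus * utilization_pct * 60 / 100
    net_drain = spent_per_hour - earned_per_hour
    if net_drain <= 0:
        return None
    return balance / net_drain


if __name__ == "__main__":
    # Assumed t3.medium figures: 2 vCPUs, 24 credits/hour, full balance of 576.
    print(hours_until_credits_exhausted(576, 24, 2, 50))
    print(hours_until_credits_exhausted(576, 24, 2, 20))
```

Under these assumed figures, a sustained 50% load drains a full balance in about 16 hours, after which a standard-mode instance is held to its baseline. A fixed-performance family avoids this calculation entirely.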

Reference:

Database Storage Enhancement

Current Challenge: Production RDS instances are equipped with small disks, limiting IOPS and potentially affecting performance.

Proposed Solution: Increase disk sizes to a minimum of 100 GB for customer-facing services and move to gp3 storage for better cost efficiency and performance. Set up IOPS monitoring to manage performance proactively.
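
The link between disk size and IOPS can be made concrete with a small calculation. The sketch below assumes the existing volumes are gp2 (the text only implies this by recommending a move to gp3) and uses the published EBS figures for gp2 baseline performance: 3 IOPS per GiB, with a floor of 100 and a ceiling of 16,000. The gp3 baseline of 3,000 IOPS is independent of size for smaller volumes. Burst behaviour is ignored, and the constants should be checked against current AWS documentation.

```python
GP2_IOPS_PER_GIB = 3       # gp2 baseline scales with volume size
GP2_MIN_IOPS = 100         # floor for very small gp2 volumes
GP2_MAX_IOPS = 16000       # ceiling for large gp2 volumes
GP3_BASELINE_IOPS = 3000   # gp3 baseline, not tied to size


def gp2_baseline_iops(size_gib):
    """Return the baseline IOPS of a gp2 volume of the given size in GiB."""
    return min(GP2_MAX_IOPS, max(GP2_MIN_IOPS, GP2_IOPS_PER_GIB * size_gib))


if __name__ == "__main__":
    for size in (20, 100, 500):
        print(f"{size} GiB: gp2 baseline {gp2_baseline_iops(size)} IOPS, "
              f"gp3 baseline {GP3_BASELINE_IOPS} IOPS")
```

For a 20 GiB gp2 volume this gives the 100 IOPS floor, and even at 100 GiB only 300 IOPS, which is why both the size increase and the switch to gp3 are recommended together.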

During this review, we found that one RDS instance was publicly accessible from the internet. We strongly recommended moving the database to a dedicated private subnet and thoroughly investigating the logs for any data leaks. The database did not contain any sensitive data, but the metadata present in this RDS instance could provide an attacker with valuable information.
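
A check like this can be automated. The sketch below filters a response shaped like the output of the RDS DescribeDBInstances API (as returned by boto3’s `describe_db_instances`) for instances whose `PubliclyAccessible` flag is set. The function name and the sample instance identifiers are hypothetical, and the sample response is trimmed to the two fields used here.

```python
def publicly_accessible_instances(describe_response):
    """Return identifiers of RDS instances flagged as publicly accessible."""
    return [
        db["DBInstanceIdentifier"]
        for db in describe_response.get("DBInstances", [])
        if db.get("PubliclyAccessible", False)
    ]


if __name__ == "__main__":
    # Hypothetical, trimmed response; a real one carries many more fields.
    sample_response = {
        "DBInstances": [
            {"DBInstanceIdentifier": "orders-prod", "PubliclyAccessible": True},
            {"DBInstanceIdentifier": "reports-prod", "PubliclyAccessible": False},
        ]
    }
    print(publicly_accessible_instances(sample_response))
```

The flag alone does not prove that an instance is reachable, since security groups and subnet routing also matter, but any instance it reports deserves a closer look.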

Reference:

Terraform Infrastructure as Code (IaC) Consolidation

Current Challenge: Terraform code is scattered across multiple repositories, complicating maintenance and disaster recovery efforts.

Proposed Solution: Centralize the Terraform code into a single repository, and create one or more additional repositories for reusable Terraform code (modules). This would improve manageability and reliability. We also suggested exploring Terraform Cloud or Atlantis to automate planning, testing, and applying changes.

Reference:

Adoption of AWS Organizations

Proposed Enhancement: Utilize AWS Organizations for enhanced security and access management, segregating environments to minimize the impact of security breaches and streamline user access control through roles and MFA.
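
In a multi-account setup, access through roles and MFA is usually expressed in each role’s trust policy. The sketch below builds a trust policy that lets principals from one trusted account assume a role only when they authenticated with MFA. The function name is hypothetical, and the account ID is the placeholder value used in AWS documentation examples, not a real customer account.

```python
import json


def build_mfa_trust_policy(trusted_account_id):
    """Return a role trust policy that requires MFA for sts:AssumeRole."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Principals in the trusted account may assume this role,
                # subject to their own IAM permissions.
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                # Only sessions that were authenticated with MFA are accepted.
                "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}},
            }
        ],
    }


if __name__ == "__main__":
    # 111122223333 is a documentation placeholder account ID.
    print(json.dumps(build_mfa_trust_policy("111122223333"), indent=2))
```

With one such role per environment account, a compromised credential without MFA cannot be used to reach production, which limits the impact of a breach.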

Reference:

Conclusion

The proposed enhancements are designed to bolster the security posture, efficiency, and resilience of the customer’s AWS infrastructure. By adopting these measures, we aim to create a robust environment that supports operational requirements while mitigating risks. Stakeholders are encouraged to review the proposed solutions and references to fully understand the impact and benefits of these changes.