Agile cloud threat modeling - the fractus model

If you are looking for a walkthrough of formal, industry-based threat models, this is not the right place to look. What I am going to share is an agile, lightweight "fractus" threat model for generating secure configurations for cloud services. The model is limited to 11 questions that can be delegated to cloud developers to accelerate secure service adoption.

It often feels like a never-ending race to keep up with the latest services and enhancements from the major cloud providers. Teams want and need to consume them. A competitive edge is necessary for business survival, and as security practitioners we must enable our businesses to consume, implement, and scale.

To scale this approach and delegate the development of secure defaults, cloud guardrails, and community-driven security by design, the fractus model checklist covers the majority of cloud service variations while tackling the components of greatest risk to the business. It also requires little to no specialized security expertise to identify hardening requirements.

Many of the examples within the threat model are AWS-specific, but the questions are applicable across other cloud providers.

If at this point you just want the code, the checklist files can be generated from a JSON object and a Python script at: https://github.com/brevityinmotion/fractus/.
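To give a feel for the approach, here is a minimal sketch of rendering a checklist from a JSON object. It is not the repository's actual script or schema: the field names (`id`, `question`, `risk`) and the `render_checklist` function are assumptions made for illustration only.

```python
import json

# Hypothetical schema: the real fractus repository may structure its JSON differently.
CHECKLIST_JSON = """
[
  {"id": 1, "question": "Does the service provide direct internet egress functionality?", "risk": "High"},
  {"id": 3, "question": "Can the service resources be made publicly accessible?", "risk": "Critical"}
]
"""


def render_checklist(service_name: str, raw_json: str) -> str:
    """Render a plain-text checklist for one cloud service."""
    items = json.loads(raw_json)
    lines = [f"Threat model checklist: {service_name}"]
    for item in items:
        # One unchecked box per question, with the risk level alongside.
        lines.append(f"[ ] {item['id']}.) {item['question']} (Risk level: {item['risk']})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_checklist("Example Service", CHECKLIST_JSON))
```

Keeping the questions as data means a developer can generate a fresh checklist per service without touching the rendering logic.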

Threat Modeling Questions

1.) Does the service provide direct internet egress functionality?

  • Risk level: High
  • Why: Direct internet access functionality may bypass centralized network logging, monitoring, and visibility, creating a path for data exfiltration.
  • Examples: A developer PaaS IDE environment may offer direct internet access for ease of software package updates. AWS SageMaker is an example of a service that has a checkbox to permit direct internet access, which could circumvent existing network controls.
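As a sketch of how this question could become a detective check for SageMaker, the function below inspects a notebook instance description. It assumes the input is shaped like the response of the SageMaker DescribeNotebookInstance API, which reports `DirectInternetAccess` as "Enabled" or "Disabled". The function name and the sample data are hypothetical.

```python
def has_direct_internet_access(notebook: dict) -> bool:
    """Return True when a notebook instance description reports direct internet access.

    `notebook` is assumed to look like a DescribeNotebookInstance response.
    """
    # Treat a missing field as enabled so that unknown states are flagged.
    return notebook.get("DirectInternetAccess", "Enabled") != "Disabled"


# Hypothetical sample data, not real resources.
notebooks = [
    {"NotebookInstanceName": "research-nb", "DirectInternetAccess": "Enabled"},
    {"NotebookInstanceName": "locked-nb", "DirectInternetAccess": "Disabled"},
]

flagged = [n["NotebookInstanceName"] for n in notebooks if has_direct_internet_access(n)]
print(flagged)
```

In practice the descriptions would come from the provider's API or a configuration inventory; the check itself stays the same.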

2.) Does the service introduce a network entry point (ingress access) from the internet?

  • Risk level: High
  • Why: Cloud services can often be configured to be accessible directly from the internet, resulting in a larger attack surface and increased risk of remote compromise.
  • Examples: Prior to enterprise enablement, guardrails may be necessary to ensure that services are only exposed to the internet through existing patterns, such as behind a content delivery network (CDN), an API gateway, an application load balancer, or a web application firewall. Services that expand the attack surface may include AWS CloudFront, AWS AppSync, AWS EC2, and AWS API Gateway.
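For EC2, one common signal of unintended ingress is a security group rule open to any source address. The sketch below assumes input shaped like a security group entry from the EC2 DescribeSecurityGroups API (`IpPermissions`, `IpRanges`, `Ipv6Ranges`). The function name and sample group are hypothetical, and an open rule is only a starting point for review, not proof of exposure.

```python
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def open_ingress_rules(security_group: dict) -> list:
    """Return the ingress rules of a security group that allow any source address."""
    findings = []
    for rule in security_group.get("IpPermissions", []):
        # Collect both IPv4 and IPv6 source ranges for this rule.
        cidrs = [r.get("CidrIp") for r in rule.get("IpRanges", [])]
        cidrs += [r.get("CidrIpv6") for r in rule.get("Ipv6Ranges", [])]
        if OPEN_CIDRS.intersection(cidrs):
            findings.append(rule)
    return findings


# Hypothetical sample data.
sample_group = {
    "GroupId": "sg-0123456789abcdef0",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}], "Ipv6Ranges": []},
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "10.0.0.0/8"}], "Ipv6Ranges": []},
    ],
}

for rule in open_ingress_rules(sample_group):
    print(f"{sample_group['GroupId']}: port {rule.get('FromPort')} is open to any source")
```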

3.) Can the service resources be made publicly accessible?

  • Risk level: Critical
  • Why: Accidental public exposure is among the most common cloud security headlines. Simple misconfigurations that make databases, object storage repositories, and snapshots public can cause extensive damage to a company.
  • Examples: Many services require a private-only configuration, which is often just a set of checkboxes. Example services to investigate are object storage repositories (e.g. S3), database services, backups, snapshots, server images (e.g. AMIs), and public marketplaces. Scott Piper has published a valuable list of AWS resources that can be publicly exposed at https://github.com/SummitRoute/aws_exposable_resources.
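For S3, the "set of checkboxes" corresponds to the four Block Public Access settings. The sketch below assumes a dictionary shaped like the `PublicAccessBlockConfiguration` returned by the S3 GetPublicAccessBlock API. The function name and sample configuration are hypothetical.

```python
REQUIRED_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_public_access_blocks(config: dict) -> list:
    """Return the Block Public Access settings that are absent or disabled."""
    return [flag for flag in REQUIRED_FLAGS if not config.get(flag, False)]


# Hypothetical sample: three of the four settings are enabled.
sample_config = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": False,
    "RestrictPublicBuckets": True,
}

print(missing_public_access_blocks(sample_config))
```

An empty result means all four settings are enabled; anything else is a candidate finding for the bucket or account being reviewed.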

4.) Does the service expand or elevate administrative/privileged permissions to other services?

  • Risk level: Medium
  • Why: Least privilege policies can be difficult to implement at scale. There are numerous tools and technologies that can assist with this. Often, enabling a single cloud service cascades into additional service dependencies and authorizations necessary to make the intended service function effectively.
  • Examples: Review the service dependencies and ensure that the authorization is appropriate across other services. For example, if the service must call a key management or secrets management service, is there a default configuration that limits read access to its specific usage requirements (e.g. tags, conditional access)?

5.) Does the service provide functionality to create native/local user or service accounts?

  • Risk level: Medium
  • Why: Services may not be fully integrated with the broader cloud provider's authentication and authorization schemes. Further due diligence and configuration may be necessary to secure the additional sprawl of service-specific accounts.
  • Examples: This is often the case with database services, where local database passwords can be generated; with cluster infrastructure, which may have a local login; and with other IaaS services.

6.) Does the service provide encryption at rest functionality?

  • Risk level: Low
  • Why: Encryption at rest primarily protects against physical hardware compromise and theft. Depending on where the encryption occurs, it may result in incremental blast radius reduction from best to worst (data record level, data platform level, infrastructure/disk level). Another determining factor is whether encryption at rest utilizes different keys per customer/tenant.
  • Examples: If not natively included, does it integrate with other key management services and is there an option to force this encryption by default?

7.) Does the service provide connectivity via unencrypted protocols?

  • Risk level: Medium
  • Why: Public cloud providers are accessed via the internet (or tunnels, VPNs, WAN, private fiber) and likely require network traffic to traverse multiple internet service providers (ISPs) and routing paths. There are opportunistic network points available for adversaries and entities to monitor, intercept, or modify this traffic. Cryptographic protections are an effective defense to ensure the confidentiality and integrity of network layer communications.
  • Examples: Is there a secure default to only permit HTTPS vs HTTP; SFTP vs FTP; SSH vs Telnet?
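For S3, one well-known way to enforce HTTPS-only access is a bucket policy that denies any request where the `aws:SecureTransport` condition key is false. The sketch below only builds the policy document; the function name and bucket name are hypothetical, and attaching the policy is left out.

```python
import json


def deny_insecure_transport_policy(bucket_name: str) -> dict:
    """Build a bucket policy that denies any S3 request not made over TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder name.
    print(json.dumps(deny_insecure_transport_policy("example-bucket"), indent=2))
```

Other services expose similar switches under different names, so the equivalent setting needs to be identified per service.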

8.) Does the service run within or attach to a private network?

  • Risk level: Medium
  • Why: Much of the risk and impact depends on the private network's routing and reachability across the backend service network. There are often tradeoffs to consider when deciding whether resources should be attached to a private network. Where attachment is necessary because of internal layer 3 access dependencies, the use case will need to be reviewed for appropriate network segmentation. If a service does not have a layer 3 dependency, evaluate whether the aggregate network visibility and connectivity gained by attaching it are worth giving up full layer 3 isolation and explicitly defined access models (a zero trust principle).
  • Examples: Investigate potential bridging situations where a service attached or connected to an internal network may compromise network trust zones or segmentation (e.g. Lambda functions can be attached to a VPC).

9.) Does the service have additional logging functionality that needs to be enabled?

10.) Does the service have its own resource level Identity and Access Management (IAM) policies?

  • Risk level: High
  • Why: Understanding the entirety of the access and authorization pattern for a cloud service is a foundational necessity. Cloud services may have resource-specific policies which, if overlooked and unmanaged, could result in a misconfiguration that provides unintended access.
  • Examples: In the case of AWS, in addition to the account-level IAM roles and policies, are there resource-specific policies that need to be factored into the overall access and authorization analysis?
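A simple first pass over a resource policy is to look for Allow statements with a wildcard principal and no condition. The sketch below operates on an AWS-style policy document that has already been parsed into a dictionary. It is a deliberate simplification with hypothetical function names: it does not evaluate whether a condition actually restricts access, and it ignores `NotPrincipal`.

```python
def _as_list(value):
    """Policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def is_wildcard_principal(principal) -> bool:
    """Return True for Principal "*" or {"AWS": "*"} (also inside a list)."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return "*" in _as_list(principal.get("AWS", []))
    return False


def unconditioned_public_statements(policy: dict) -> list:
    """Return Allow statements that grant access to anyone without a condition."""
    findings = []
    for stmt in _as_list(policy.get("Statement", [])):
        if (
            stmt.get("Effect") == "Allow"
            and is_wildcard_principal(stmt.get("Principal"))
            and not stmt.get("Condition")
        ):
            findings.append(stmt)
    return findings


# Hypothetical sample policy with one public and one scoped statement.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Sid": "AccountOnly", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
}

print([s["Sid"] for s in unconditioned_public_statements(sample_policy)])
```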

11.) Does the service have any 3rd party compliance attestations/certifications?

  • Risk level: Medium
  • Why: Cloud adoption includes a shared responsibility model. That responsibility is governed by a contract that is legally binding on both parties. The contract makes the cloud provider responsible for maintaining an adequate level of security. To quantify that "third-party trust", independent assessments by reputable third-party organizations can provide increased assurance that a cloud provider is meeting its service delivery and security expectations.
  • Examples: This may include certifications, accreditations, or attestations such as ISO 27001, PCI, HIPAA, SOC1/2, or HITRUST. It is important to consider that brand new service offerings may not have been included in annual audits or have enough historic data for certification. Services that have existed for more than a year and are not listed in a compliance matrix are the ones that I recommend investigating further with the cloud provider.

Mapping severity to implementation expectations

Upon triage and review of these service components, the consumption expectations are:

  • Critical risk - requires implementation of preventive, detective, and responsive controls.
  • High risk - requires implementation of preventive and detective controls.
  • Medium risk - requires implementation of detective controls.
  • Low risk - requires implementation of procedural controls.

All services have an expectation that secure configuration baselines (procedural controls) are published as these are the initial foundation of governance and measurement within a cloud ecosystem.
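The mapping above is small enough to encode directly, which can help when triage results are processed by a script. This is a minimal sketch; the `REQUIRED_CONTROLS` table and `controls_for` function are names I made up for illustration.

```python
# Control types required by each risk level, as listed above.
REQUIRED_CONTROLS = {
    "Critical": ["preventive", "detective", "responsive"],
    "High": ["preventive", "detective"],
    "Medium": ["detective"],
    "Low": ["procedural"],
}


def controls_for(risk_level: str) -> list:
    """Return the control types expected for a question at the given risk level."""
    # Procedural baselines are expected for every service, so always include them.
    controls = ["procedural"] + REQUIRED_CONTROLS[risk_level]
    # Remove duplicates while preserving order (relevant for the Low level).
    return list(dict.fromkeys(controls))


for level in REQUIRED_CONTROLS:
    print(f"{level}: {', '.join(controls_for(level))}")
```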

Control Implementation Models

Once secure baselines are established, governance can be matured through several types of security control models:

  • Procedural - Published baselines that can be consumed by cloud developers for secure implementation of cloud services.
  • Preventive - Secure design patterns, guardrails, and authorization boundaries that restrict cloud functionality or prevent insecure configurations.
  • Detective - Codified configuration management checks that can evaluate the security posture of the resources instantiated from the services and signal non-compliant configurations.
  • Responsive - Upon detection, the automated remediation or response actions initiated against a non-compliant resource.
  • Proactive - Pre-deployment checks, typically code linting within a developer IDE or a CI/CD code scan.

Conclusion

I hope this threat model can help you scale and enhance your overall cloud security program. Feel free to send me feedback on any gaps, additions, or modifications. I have found these questions to be easily adopted and quick to triage. They create an initial foundational assessment of base cloud service risk and result in tangible actions to incorporate into a security by design implementation of the service.

I fully intend to triage every cloud service through this threat model and publish the hardening guardrails across AWS, Azure, and GCP as a follow-up. Thank you for the time you spent on this article! If you enjoyed it, make sure to subscribe to this site for updates and new content!