Respond and Recover from a Data Breach

How to respond to a large-scale data breach for a global retail giant.

This project is a simulation of a real-world data breach for a global retail giant. In it, I take on the role of a Junior Cloud Security Analyst for Cymbal Retail, a global retail powerhouse with 170 physical stores and an online platform spanning 28 countries. In 2022, the company reported $15 billion in revenue and employed 80,400 people worldwide. With millions of transactions occurring daily, securing customer data, employee information, and operational assets is critical to maintaining internal and external compliance.

Role: Junior Cloud Security Analyst

Tools: Google Cloud Security Command Center, Google Cloud Console, Cloud Identity and Access Management, Firewall Rules Configuration, PCI DSS Compliance Report, and Cloud Logging and Monitoring

Deliverable(s): Executive Summary


Inciting Incident:

One morning, the security team detected unusual activity in their systems. Further investigation revealed that the company had suffered a massive security breach across its applications, networks, systems, and data repositories. Attackers gained unauthorized access to sensitive customer information, including credit card data and personal details. This incident required immediate attention and thorough investigation. The first step was to gather information and analyze the available data to understand the scope and impact of the breach.

I was tasked with supporting the incident response effort to address a massive data breach that Cymbal Retail recently suffered. As part of the security team, I took the following steps:

Identifying vulnerabilities related to the breach.
Isolating and containing the threat to prevent further access.
Recovering compromised systems.
Remediating compliance issues.
Verifying that remediation steps met security standards.

Task 1: Analyze the Data Breach and Gather Information

To begin investigating the data breach, I open up Google Cloud Security Command Center to analyze available data and gather key information. By reviewing active vulnerabilities, I can gain an overview of the current security issues affecting resources such as storage buckets, virtual machines, and firewalls. I will focus on findings by resource type to identify vulnerabilities tied to the attack. This initial analysis will allow me to understand how the attackers gained access, helping me prioritize and take the necessary remediation steps to secure the environment.

PCI DSS (Payment Card Industry Data Security Standard) is a set of security requirements that organizations must follow to protect cardholder data. Because Cymbal Retail accepts and processes credit card payments, it must comply with these requirements.

Clicking on "View Details" will open up a new screen that highlights where the cloud environment isn't compliant with PCI DSS and could be at risk.

These rules correspond to the bucket, VM, and firewall findings identified earlier.

This report highlights multiple critical security issues:

Lack of proper logging, making it hard to audit network access.
Open RDP and SSH ports expose systems to the internet, increasing attack risks.
Publicly accessible VMs and storage buckets make sensitive data easy targets for attackers.
Overly permissive default service accounts could allow attackers to perform high-level actions if compromised.

Each of these vulnerabilities needs to be addressed to secure the environment and meet PCI DSS compliance standards.

Google Cloud Storage bucket

For further examination and analysis of the vulnerabilities, go to Security > Findings.

Click on the "Google Cloud storage bucket" under the quick filters section.

The active findings for the storage bucket are listed here. They show that the bucket is configured with a combination of security settings that could expose the data to unauthorized access. To remediate this, remove the public access control list, disable public bucket access, and enable the uniform bucket-level access policy.

Google Compute Instance

Click on the "Google compute instance" under the quick filters section.

Under this filter, you can see that the VM was configured in a way that left the machine vulnerable to attack. To remediate this, I needed to shut down the original VM and create a new one using the clean snapshot of the original VM.

The Malware: bad domain finding indicates that a domain known to be associated with malware was accessed from the instance named cc-app-01.

The Compute Secure Boot Disabled finding is a medium severity finding that indicates that secure boot has been disabled for the virtual machine.

The Public IP Address finding is a high severity finding that is listed in the PCI DSS report. It indicates that the virtual machine has a public-facing IP address, allowing anyone on the internet to connect directly to the machine.

These findings indicate the virtual machine was configured in a way that left it very vulnerable to the attack. To remediate these findings, I had to shut down the original VM (cc-app-01) and create a new VM (cc-app-02) using a clean snapshot of the disk. The new VM will have the following settings in place:

No compute service account
Firewall rule tag for a new rule for controlled SSH access
Secure boot enabled
Public IP address set to None
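
The four target settings above can be treated as a checklist. The following is a minimal Python sketch of that idea, not part of the lab's tooling: the dictionary keys (`service_account`, `secure_boot`, `external_ip`, `tags`), the example IP address, and the values given for the original VM are hypothetical and chosen only for illustration.

```python
# Hypothetical, simplified view of a VM's configuration. The field names
# are illustrative and do not mirror the Compute Engine API.
def hardening_gaps(vm):
    """Return the settings that differ from the hardened baseline."""
    gaps = []
    if vm.get("service_account") is not None:
        gaps.append("service account attached")
    if not vm.get("secure_boot", False):
        gaps.append("secure boot disabled")
    if vm.get("external_ip") is not None:
        gaps.append("public IP address assigned")
    if not vm.get("tags"):
        gaps.append("no firewall rule tag")
    return gaps


# Assumed example values; 203.0.113.10 is a documentation-only address.
old_vm = {"name": "cc-app-01", "service_account": "default",
          "secure_boot": False, "external_ip": "203.0.113.10", "tags": []}
new_vm = {"name": "cc-app-02", "service_account": None,
          "secure_boot": True, "external_ip": None, "tags": ["cc"]}

print(hardening_gaps(old_vm))
print(hardening_gaps(new_vm))  # an empty list means the baseline is met
```

An empty result for cc-app-02 would mean all four settings are in place before the VM goes into service.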

Google Compute Firewall

Click on the "Google compute firewall" under the quick filters section.

Open SSH port is a high severity finding indicating that the firewall is configured to allow Secure Shell (SSH) traffic to all instances in the network from the whole internet. This is like leaving your front door unlocked, so that anyone from anywhere can walk in.
Open RDP port is another high severity finding. It indicates that the firewall is configured to allow Remote Desktop Protocol (RDP) traffic to all instances in the network from the whole internet. It is a similar vulnerability to the open SSH port: think of it as leaving a window wide open, giving anyone and anything access to your home.
Firewall rule logging disabled is a medium severity finding that indicates that firewall rule logging is turned off. This means that there is no record of which firewall rules are being applied and what traffic is being allowed or denied. That makes it harder to track or investigate suspicious activity, similar to not having security cameras to see who is coming in and out of a building.

These issues are listed in the PCI DSS report and show a major security weakness in how the network is set up. Because access to RDP and SSH isn’t restricted, and there’s no logging of firewall activity, the network is at high risk of hackers trying to break in and steal data. To fix this, you’ll need to remove the current firewall settings that allow too much access and replace them with a rule that only allows SSH connections from specific, trusted addresses used by Google Cloud's secure SSH service.
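
To decide what to fix first, the compute and firewall findings described above can be grouped by resource type and ordered by severity. The sketch below is one possible way to do that in Python. The severities come from this write-up; the malware finding has no severity stated in the text, so it is recorded as `None`, and the function and variable names are assumptions made for the example.

```python
from collections import defaultdict

# (resource type, finding, severity) as described in this write-up.
FINDINGS = [
    ("compute instance", "Malware: bad domain", None),
    ("compute instance", "Compute Secure Boot Disabled", "medium"),
    ("compute instance", "Public IP Address", "high"),
    ("firewall", "Open SSH port", "high"),
    ("firewall", "Open RDP port", "high"),
    ("firewall", "Firewall rule logging disabled", "medium"),
]

# Unknown severity sorts last only because the text does not state it.
SEVERITY_RANK = {"high": 0, "medium": 1, None: 2}


def triage(findings):
    """Group findings by resource type, highest severity first."""
    grouped = defaultdict(list)
    for resource, name, severity in findings:
        grouped[resource].append((name, severity))
    for items in grouped.values():
        items.sort(key=lambda item: SEVERITY_RANK[item[1]])
    return dict(grouped)


for resource, items in triage(FINDINGS).items():
    print(resource)
    for name, severity in items:
        print(f"  {severity or 'unspecified'}: {name}")
```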

Now that the vulnerabilities have been assessed, it is time to remediate the report findings.

Task 2: Fix the Compute Engine Vulnerabilities

The vulnerable virtual machine (VM), "cc-app-01", needs to be shut down, and we will create a new VM from a snapshot of the original VM taken prior to the malware infection. This restores the system to a clean slate, ensuring that the new VM is not infected with the same malware. To stop the original, select the checkbox next to the VM's name and click "Stop" when prompted.

Once it has stopped, go back to the VM instances homepage. Here you can begin to create the new instance that replaces the infected VM.

In this stage, we are prepping the new VM. The highlighted field is where we enter the "Name" for the new VM.

Once it is saved, go back to the previous page, where the new VM instance now appears. The next step is to delete the first VM (cc-app-01).

Once again, activate the checkbox next to the desired VM instance (cc-app-01) and click DELETE. Then confirm that you want to delete.
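
The console steps in this task have a command-line equivalent. The Python sketch below only builds and prints the `gcloud` commands, without running them, so they can be reviewed first. It is an illustration rather than the procedure used in the lab: the zone and the snapshot name are placeholders, and the `cc` tag is assumed from the SSH firewall rule described in Task 4.

```python
import shlex

ZONE = "us-central1-a"       # placeholder zone (assumption)
SNAPSHOT = "clean-snapshot"  # placeholder snapshot name (assumption)


def replacement_commands(old_vm, new_vm, zone=ZONE, snapshot=SNAPSHOT):
    """Build the stop / create / delete commands as argument lists."""
    return [
        ["gcloud", "compute", "instances", "stop", old_vm, f"--zone={zone}"],
        ["gcloud", "compute", "instances", "create", new_vm,
         f"--zone={zone}",
         f"--source-snapshot={snapshot}",   # boot disk from the clean snapshot
         "--no-service-account", "--no-scopes",
         "--shielded-secure-boot",          # secure boot enabled
         "--no-address",                    # no public IP address
         "--tags=cc"],                      # tag matched by the SSH rule
        ["gcloud", "compute", "instances", "delete", old_vm, f"--zone={zone}"],
    ]


for cmd in replacement_commands("cc-app-01", "cc-app-02"):
    print(shlex.join(cmd))
```

Keeping the commands as lists makes it straightforward to pass them to `subprocess.run` later, once the placeholders have been replaced with real values.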

Task 3: Fix Cloud Storage Permissions

In this task, I’ll secure the storage bucket by removing public access and applying uniform access control, which greatly reduces the chance of a data breach. By revoking the public (allUsers) permissions on the bucket, I ensure that only authorized individuals can access the sensitive data stored inside.

To do this, I’ll navigate to the Cloud Storage section, locate the storage bucket that contains a publicly accessible file (myfile.csv) with sensitive information exposed by a malicious actor, and remove public access. Then, I’ll switch to a uniform bucket-level access control policy, ensuring that permissions are consistent across the entire bucket and its contents.

It’s important to manage this carefully, so users with legitimate access aren’t accidentally locked out. After making the necessary changes, I’ll confirm that public access has been revoked and verify that all users who still need access retain their permissions.

This task helps tighten data security and addresses the issues of public access, bucket policy, and logging.
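
Removing public access while keeping legitimate users comes down to editing the bucket's IAM bindings. The sketch below shows that logic in Python on a policy shaped like the JSON that Cloud IAM returns (a list of `bindings`, each with a `role` and `members`). The example policy, including the user e-mail, is made up for illustration, and the uniform bucket-level access setting itself is a separate bucket option that is not modeled here.

```python
import copy

# Principals that represent public access in Cloud IAM.
PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}


def strip_public_access(policy):
    """Return a copy of an IAM policy with public principals removed."""
    cleaned = copy.deepcopy(policy)
    bindings = []
    for binding in cleaned.get("bindings", []):
        members = [m for m in binding["members"] if m not in PUBLIC_MEMBERS]
        if members:  # drop bindings that only granted public access
            bindings.append({**binding, "members": members})
    cleaned["bindings"] = bindings
    return cleaned


# Hypothetical policy for illustration only.
policy = {
    "bindings": [
        {"role": "roles/storage.objectViewer", "members": ["allUsers"]},
        {"role": "roles/storage.objectAdmin",
         "members": ["user:analyst@example.com", "allUsers"]},
    ]
}

cleaned = strip_public_access(policy)
print(cleaned["bindings"])
```

Working on a copy leaves the original policy available for comparison, which helps confirm that no legitimate member was dropped.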

Task 4: Limit Firewall Ports Access

In this task, I’ll focus on securing remote access to the system by limiting RDP and SSH ports to only authorized networks. This is important because reducing the number of entry points helps lower the risk of unauthorized access.

When adjusting firewall rules, it’s critical to ensure that legitimate traffic isn't accidentally blocked, as this could disrupt important operations. In this scenario, I’ll make sure that virtual machines labeled with the "cc" tag can still be accessed via SSH from Google Cloud’s Identity-Aware Proxy (IAP) address range (35.235.240.0/20). To ensure management access isn’t interrupted, I’ll first create a new firewall rule that restricts SSH access to only authorized IP addresses, before removing the old rule that allowed SSH connections from anywhere.
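
The effect of the new rule can be checked with Python's standard `ipaddress` module. This is a simplified model of the rule, assuming it matches on source range and on the "cc" tag only; the function name and the sample addresses are illustrative.

```python
import ipaddress

# Source range used by Identity-Aware Proxy, as given above.
IAP_RANGE = ipaddress.ip_network("35.235.240.0/20")


def ssh_allowed(source_ip, vm_tags, target_tag="cc"):
    """True if the restricted rule would admit SSH from source_ip."""
    in_range = ipaddress.ip_address(source_ip) in IAP_RANGE
    return target_tag in vm_tags and in_range


print(ssh_allowed("35.235.241.7", ["cc"]))    # True: IAP address, tagged VM
print(ssh_allowed("198.51.100.23", ["cc"]))   # False: outside the IAP range
print(ssh_allowed("35.235.241.7", ["web"]))   # False: VM lacks the "cc" tag
```

The /20 range covers 35.235.240.0 through 35.235.255.255, so any other source address is rejected by this rule.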

This step is about tightening security without affecting system functionality.

Task 5: Fix the Firewall Configuration

In this task, I’ll remove three firewall rules that are currently allowing open access to important network protocols, such as ICMP, RDP, and SSH, which can create security risks. After that, I’ll turn on logging for the remaining firewall rules to keep track of any activity going through the network. This ensures that only the necessary traffic is allowed while maintaining visibility into the system’s security.
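
The cleanup can be thought of as a two-way split: rules that expose ICMP, RDP, or SSH to the whole internet are deleted, and the rules that remain get logging turned on. The Python sketch below illustrates that split on a hypothetical rule list. The rule names, the internal range, and the dictionary fields are assumptions made for the example, not the names used in the project.

```python
OPEN_RANGE = "0.0.0.0/0"
RISKY = {"icmp", "tcp:3389", "tcp:22"}  # ICMP, RDP, SSH


def plan_cleanup(rules):
    """Split rules into names to delete and names that need logging."""
    to_delete, to_log = [], []
    for rule in rules:
        exposed = (OPEN_RANGE in rule["source_ranges"]
                   and bool(RISKY & set(rule["allowed"])))
        if exposed:
            to_delete.append(rule["name"])
        elif not rule["logging"]:
            to_log.append(rule["name"])
    return to_delete, to_log


# Hypothetical rules for illustration only.
rules = [
    {"name": "open-icmp", "source_ranges": ["0.0.0.0/0"],
     "allowed": ["icmp"], "logging": False},
    {"name": "open-rdp", "source_ranges": ["0.0.0.0/0"],
     "allowed": ["tcp:3389"], "logging": False},
    {"name": "open-ssh", "source_ranges": ["0.0.0.0/0"],
     "allowed": ["tcp:22"], "logging": False},
    {"name": "limit-ssh-iap", "source_ranges": ["35.235.240.0/20"],
     "allowed": ["tcp:22"], "logging": False},
    {"name": "internal-only", "source_ranges": ["10.128.0.0/9"],
     "allowed": ["tcp:0-65535"], "logging": False},
]

to_delete, to_log = plan_cleanup(rules)
print("delete:", to_delete)
print("enable logging:", to_log)
```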

Task 6: Verify Compliance

After diligently addressing the vulnerabilities I identified in the PCI DSS 3.2.1 report, it's essential to verify the effectiveness of my remediation efforts. In this step, I will rerun the report to ensure that the previously identified vulnerabilities have been successfully mitigated and no longer pose a security risk to the environment.
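
Verification amounts to comparing the findings in the first report with those in the rerun. A minimal Python sketch of that comparison follows. The "before" list uses findings named in this write-up, while the "after" list is a made-up example that deliberately leaves one finding open to show what a failed check looks like; it is not the project's actual result.

```python
def remaining_findings(before, after):
    """Findings from the first report that still appear in the rerun."""
    return sorted(set(before) & set(after))


before = [
    "Open SSH port",
    "Open RDP port",
    "Firewall rule logging disabled",
    "Public IP Address",
    "Compute Secure Boot Disabled",
]

# Hypothetical rerun output, for illustration only.
after = ["Firewall rule logging disabled"]

still_open = remaining_findings(before, after)
print(still_open if still_open else "All earlier findings are resolved")
```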

View the Executive Summary
