Analyze AWS VPC Flow Logs with Gigasheet
Several months ago, a client company reached out for help in addressing several security issues identified during their annual Payment Card Industry (PCI) Data Security Standard (DSS) audit. The auditor conducting the assessment discovered that the company had set up its firewalls to allow all network traffic by default and had not established a documented process to review firewall configurations periodically.
PCI DSS is a payment industry security standard overseen by the PCI Security Standards Council (PCI SSC). PCI SSC was founded in 2006 by American Express, Discover, JCB International, MasterCard, and Visa to enhance global payment account data security by developing standards and supporting services that drive education, awareness, and effective implementation by stakeholders. In general, merchants that accept card (credit or debit) payments from customers need to prove annual compliance with the security requirements outlined in PCI DSS. One such requirement is for merchants to deploy and maintain firewalls to protect the cardholder environment and to review firewall rule sets at least twice per year.
The client company hosted its IT infrastructure in Amazon Web Services (AWS), primarily on EC2 instances and a few serverless functions. Luckily for this company, they had enabled flow logs across all VPCs, which helped me identify all network flows and lock down the security groups. The project took four weeks to complete because of the large volume of logs I had to go through. At the time this project took place, Gigasheet did not exist. I had to follow a very manual and painful process to get the data I needed, combining Linux commands with slicing and dicing spreadsheets to keep my computer from crashing. Now that Gigasheet is here, I wanted to see how long it would take me to complete the same log analysis using a similar data set.
In this blog, we will analyze a large volume of AWS VPC flow logs in Gigasheet and use the output of the analysis to create AWS security groups. In the interest of time and simplicity, we will illustrate how to quickly identify all ingress and egress network flows for a single instance, but the process outlined in this blog can be easily applied to multiple instances.
Disclaimer: the VPC flow log data used here (AWS account number, ENIs, and private IP addresses) is fictitious.
An AWS VPC flow log record represents an Internet Protocol (IP) network flow between two systems and consists of a string of fields separated by spaces. The fields included in a record depend on the version of the VPC flow log format used. By default, VPC flow logs include the version 2 fields, as illustrated below:
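For readers who want to work with the raw records, here is a minimal Python sketch that splits one default-format (version 2) record into named fields. The field order follows the default format; the function name `parse_flow_log` and the sample record are made up for illustration.

```python
# Field names of the default (version 2) VPC flow log format, in order.
DEFAULT_FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_flow_log(line: str) -> dict:
    """Split one space-separated record into a field-name -> value dict."""
    values = line.split()
    if len(values) != len(DEFAULT_FIELDS):
        raise ValueError(
            f"expected {len(DEFAULT_FIELDS)} fields, got {len(values)}"
        )
    return dict(zip(DEFAULT_FIELDS, values))


# Fictitious record, for illustration only.
sample = (
    "2 123456789012 eni-0a1b2c3d 203.0.113.12 10.54.10.100 "
    "49152 443 6 10 840 1620000000 1620000060 ACCEPT OK"
)
record = parse_flow_log(sample)
print(record["srcaddr"], record["dstaddr"], record["dstport"], record["action"])
```

All values are kept as strings here because skipped or empty records use "-" as a placeholder in some fields; convert to integers only after checking for that.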
An AWS security group is a virtual firewall that wraps around an EC2 instance or serverless function to control ingress and egress network traffic. Unlike traditional network firewalls that operate at the network subnet level, AWS security groups operate at the instance level, requiring two separate sets of rules:
Rules that control network traffic to an instance
Rules that control network traffic from an instance
Step 1: Upload AWS VPC Flow Logs to Gigasheet
The file that I used in this analysis includes the default VPC flow log fields and contains roughly 2.7 million rows, as illustrated below.
I always start by decluttering the screen and removing information that is not relevant to the analysis. The following columns do not provide any useful data to this particular analysis, so I hid them from the screen:
Step 2: Identify all Ingress Network Connections
With the screen decluttered, we can begin analyzing the logs. We will start by identifying all connections to internal instances from other internal instances or external hosts. To do that, group the logs by the destination IP address column and filter out every address outside the 10.x.x.x range (all the instances were assigned an IP address in that range). The result shows fifteen unique private IP addresses in the destination IP address column.
Next, we can further group by source IP address, followed by destination port, and finally by protocol. It is important to pay close attention to the destination port column because VPC flow logs include records of both sent and received traffic. AWS security groups are stateful, meaning that they keep track of the state of TCP, UDP, and ICMP connections (with some exceptions). If an instance sends traffic to another instance or external host, the response traffic is allowed regardless of the inbound rules because it is considered part of the same flow or connection.
For TCP traffic, VPC flow logs can include a 'tcp-flags' field, available from version 3 of the format, which can be used to identify the connection originator. Unfortunately, the sample VPC flow logs used in this analysis do not include it. Therefore, to exclude logs associated with return traffic, we will apply a filter to the destination port column to display only values in the 1-1024 range (roughly the well-known ports). This approach is not ideal because you could miss legitimate network connections if applications are listening on ports greater than 1024. I therefore recommend that you include the 'tcp-flags' field in your VPC flow logs to increase the accuracy of the analysis.
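The same grouping and filtering can be approximated outside Gigasheet. The sketch below is one possible Python equivalent, not the Gigasheet implementation: it keeps records whose destination is in 10.0.0.0/8 and whose destination port is in the 1-1024 range, then counts them per destination, source, port, and protocol. The helper `is_originator` shows how the 'tcp-flags' bitmask could be used instead when that field is available (SYN is 2, and the ACK bit is 16, so SYN-ACK is 18). All names and sample records are hypothetical.

```python
import ipaddress
from collections import Counter

INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")  # the 10.x.x.x range
MAX_SERVICE_PORT = 1024  # same cut-off as the filter described above
SYN, ACK = 2, 16  # bits of the tcp-flags bitmask


def is_internal(addr: str) -> bool:
    """True if addr is a valid IP address inside the internal range."""
    try:
        return ipaddress.ip_address(addr) in INTERNAL_NET
    except ValueError:
        return False  # e.g. the "-" placeholder in empty records


def summarize_ingress(records):
    """Count flows per (dstaddr, srcaddr, dstport, protocol)."""
    flows = Counter()
    for r in records:
        if not is_internal(r["dstaddr"]):
            continue
        port = int(r["dstport"])
        if not 1 <= port <= MAX_SERVICE_PORT:
            continue  # probably return traffic to an ephemeral port
        flows[(r["dstaddr"], r["srcaddr"], port, int(r["protocol"]))] += 1
    return flows


def is_originator(tcp_flags: int) -> bool:
    """True if a record carries SYN without ACK, i.e. client-to-server."""
    return bool(tcp_flags & SYN) and not tcp_flags & ACK


# Fictitious records, for illustration only.
records = [
    {"srcaddr": "203.0.113.12", "dstaddr": "10.54.10.100", "dstport": "443", "protocol": "6"},
    {"srcaddr": "203.0.113.12", "dstaddr": "10.54.10.100", "dstport": "443", "protocol": "6"},
    {"srcaddr": "10.54.10.20", "dstaddr": "10.54.10.100", "dstport": "22", "protocol": "6"},
    # Return traffic to an ephemeral port: dropped by the port filter.
    {"srcaddr": "198.51.100.7", "dstaddr": "10.54.10.100", "dstport": "51514", "protocol": "6"},
    # External destination: not an ingress flow for an internal instance.
    {"srcaddr": "10.54.10.100", "dstaddr": "198.51.100.7", "dstport": "443", "protocol": "6"},
]
ingress = summarize_ingress(records)
for key, count in sorted(ingress.items()):
    print(key, count)
```

The port filter carries the same caveat as in the text: a service listening above port 1024 would be silently dropped, which is why the flag-based check is preferable when the field exists.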
In the screenshot below, we can see two unique connections to instance 10.54.10.100, summarized in the table below:
You can apply the same process to the remaining fourteen instances to identify all ingress network flows.
Step 3: Identify all Egress Network Connections
To identify the egress flows, you can apply the same steps as before, but instead of starting with the destination IP address column, group by the source IP address column and filter out every address outside the 10.x.x.x range. Then group further by the destination IP address, destination port, and protocol columns, and apply a filter to the destination port column to display only values in the 1-1024 range (roughly the well-known ports).
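As a rough Python counterpart to the egress steps, the sketch below mirrors the ingress logic with the source and destination roles swapped. It is an illustration under the same assumptions as the text (internal hosts in 10.0.0.0/8, destination ports 1-1024); the function name and the sample records are invented.

```python
import ipaddress
from collections import Counter

INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")  # the 10.x.x.x range
MAX_SERVICE_PORT = 1024


def summarize_egress(records):
    """Count flows per (srcaddr, dstaddr, dstport, protocol)."""
    flows = Counter()
    for r in records:
        try:
            src = ipaddress.ip_address(r["srcaddr"])
        except ValueError:
            continue  # placeholder values such as "-"
        if src not in INTERNAL_NET:
            continue
        port = int(r["dstport"])
        if not 1 <= port <= MAX_SERVICE_PORT:
            continue  # probably a reply sent back to a client's ephemeral port
        flows[(r["srcaddr"], r["dstaddr"], port, int(r["protocol"]))] += 1
    return flows


# Fictitious records, for illustration only.
records = [
    {"srcaddr": "10.54.10.100", "dstaddr": "198.51.100.7", "dstport": "443", "protocol": "6"},
    {"srcaddr": "10.54.10.100", "dstaddr": "198.51.100.7", "dstport": "443", "protocol": "6"},
    # Reply from the instance to an external client: dropped by the port filter.
    {"srcaddr": "10.54.10.100", "dstaddr": "203.0.113.12", "dstport": "49152", "protocol": "6"},
    # External source: not an egress flow from an internal instance.
    {"srcaddr": "203.0.113.12", "dstaddr": "10.54.10.100", "dstport": "443", "protocol": "6"},
]
egress = summarize_egress(records)
for key, count in sorted(egress.items()):
    print(key, count)
```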
In the screenshot below, we can see one unique connection from instance 10.54.10.100, which is summarized in the table below:
Step 4: Build the Security Group
With the ingress and egress connections identified, you can now start to build the security group. The ingress and egress rules for the security group applied to instance 10.54.10.100 would look like this:
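If you prefer to script this step, the flows identified earlier can be turned into rule definitions. The sketch below builds plain dictionaries in the shape that boto3's `authorize_security_group_ingress` and `authorize_security_group_egress` calls take in their `IpPermissions` parameter; it does not call AWS. The helper name, the /32 per-host CIDRs, and the sample flows are assumptions for illustration, not the rules from this analysis.

```python
# IANA protocol numbers as they appear in the flow log "protocol" field.
PROTOCOL_NAMES = {6: "tcp", 17: "udp"}


def to_ip_permission(peer_ip: str, port: int, protocol_number: int) -> dict:
    """Build one security group rule allowing a single peer on a single port."""
    if protocol_number not in PROTOCOL_NAMES:
        # ICMP and other protocols use the port fields differently; handle them separately.
        raise ValueError(f"unsupported protocol number: {protocol_number}")
    return {
        "IpProtocol": PROTOCOL_NAMES[protocol_number],
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": f"{peer_ip}/32"}],  # one host; widen deliberately if needed
    }


# Hypothetical flows as (peer address, destination port, protocol number).
ingress_flows = [("203.0.113.12", 443, 6), ("10.54.10.20", 22, 6)]
egress_flows = [("198.51.100.7", 443, 6)]

ingress_rules = [to_ip_permission(*flow) for flow in ingress_flows]
egress_rules = [to_ip_permission(*flow) for flow in egress_flows]

for rule in ingress_rules + egress_rules:
    print(rule)
```

Review the generated rules before applying them: a per-host /32 rule is the tightest option, but a subnet or a referenced security group is often easier to maintain.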
Using Gigasheet, this work takes about one-tenth of the time it takes in Excel! Try it for yourself: request access to the private Beta on the Gigasheet homepage.