Identifying Stolen Credential Use in AWS CloudTrail Logs with High Confidence using Categorical Anomaly Detection
Rob Malnati
The move to the cloud presents new challenges for enterprise security teams. Systems are more distributed, and the impact of credential theft is greater than ever. Running your services on a public cloud provider like AWS requires you to monitor for and detect attacks in real time, but how do you do that without drowning in the noise? Existing tools can highlight statistical anomalies, but they are limited to counts and thresholds and have been shown to produce an unacceptable rate of false positives.
About AWS CloudTrail
AWS CloudTrail monitors calls to AWS services and delivers detailed logs, providing a complete audit of management calls, with optional inclusion of data calls. To detect attacks effectively, you will need both, but the resulting high volume of log events creates a glut of data, making it very difficult to detect a behavioral attack such as one that leverages stolen credentials. Stolen credentials provide access to sensitive resources, and an attacker will tailor their activities to look like normal usage. In addition, attackers recognize the obfuscating effect of high traffic volume and use the complexity of the logs to hide their activity by moving slowly and carefully. Meeting this challenge requires anomaly detection that understands the shared characteristics of novel behaviors and does not depend on the simpler sequential and timestamp information that other methods often rely on.
In the example below, we have configured CloudTrail to monitor management, data, and Insights events. We have configured thatDot Anomaly Detector to read the “trail” of CloudTrail events as they are written to S3.
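As a rough illustration of what reading that trail involves: CloudTrail delivers log files to S3 as gzip-compressed JSON objects with a top-level `Records` array. The sketch below is not thatDot's ingestion code. It simply parses one such file from bytes and keeps a few categorical fields plus the event ID and time. The helper name `extract_observations`, the choice of fields, and the sample record are all assumptions made for illustration.

```python
import gzip
import json


def extract_observations(gz_bytes):
    """Parse one gzipped CloudTrail log file into small categorical records.

    Hypothetical helper: the selection of fields is an assumption, not a
    description of how thatDot Anomaly Detector is configured.
    """
    payload = json.loads(gzip.decompress(gz_bytes))
    observations = []
    for record in payload.get("Records", []):
        identity = record.get("userIdentity", {})
        observations.append({
            # IAM users carry userName; fall back to the ARN for other identity types.
            "user": identity.get("userName") or identity.get("arn", "unknown"),
            "service": record.get("eventSource"),
            "operation": record.get("eventName"),
            "event_id": record.get("eventID"),
            "event_time": record.get("eventTime"),
        })
    return observations


# Invented sample record, compressed in memory to stand in for an S3 object body.
sample = {"Records": [{
    "eventTime": "2021-03-01T22:14:07Z",
    "eventSource": "s3.amazonaws.com",
    "eventName": "ListBuckets",
    "eventID": "00000000-0000-0000-0000-000000000001",
    "userIdentity": {"type": "IAMUser", "userName": "raul"},
}]}
sample_bytes = gzip.compress(json.dumps(sample).encode("utf-8"))

for obs in extract_observations(sample_bytes):
    print(obs)
```

Keeping the event ID and event time alongside the categorical fields is what later lets an analyst jump from a scored observation back to the raw CloudTrail event.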
Stolen Credential Use—an Example
Let’s say that, through a dark web scan, you’ve become aware that many of your staff’s credentials may have been compromised. It might be through a company partner, a service provider, or the breach of a service popular in your industry. To address these newly exposed credentials, a common brute-force remediation is to force a logout and password reset for every employee. This is extremely disruptive and can create unintended interruptions and delays in business functions. It also changes only the passwords; it does not reveal the source, purpose, or target of a credential theft attack. A motivated adversary may have already installed keylogging or browser-subverting capabilities that will capture any new password and gather other authenticating credentials. If this has happened, then in spite of your efforts you will still have no idea whether an attack is coming.
In the case described below, the compromise is real and a hacker has begun to probe the AWS account. She finds that she can successfully scan S3 buckets, creates a script to try other high-value services such as Service Discovery and ELB, and runs the script later that evening.
If your security monitoring system is configured to monitor CloudTrail logs and uses thatDot Anomaly Detector, you can detect the attack quickly. Anomaly Detector generates an observation for each CloudTrail event, and each observation has a novelty score that indicates how relevant it is and how much it warrants your attention. Novelty doesn’t immediately trigger a security event, but the system identifies what makes the observation novel, and this insight speeds manual or automated categorization. The real-time plots in Anomaly Detector show it learning both system and user behavior: as it comes to see activity as normal, the novelty scores trace a down-and-to-the-right curve.
Looking at the data below, which shows the most recent observations, we see that Anomaly Detector detected an observation with a particularly high score. User raul accessed three different AWS resources for the first time, and then three other AWS resources a few minutes later. The activities are novel, but the critical insight is that raul’s execution of these actions is the element that is most novel. Anomaly Detector retains the CloudTrail event ID and event time, enabling us to navigate to the actual CloudTrail events and investigate the details. Most importantly, it forwards the observation and score to our security monitoring system for action.
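To make the idea of a per-observation novelty score more concrete, here is a deliberately simple toy scorer. It is not thatDot's algorithm, which this article does not describe. It treats each observation as a path of categorical values (service, then operation, then user) and scores each step by how rarely that exact path has been seen, reporting the step that contributes the most surprise. The class name `PathNovelty`, the field ordering, the scoring formula, and the training data are all assumptions.

```python
from collections import Counter

# Assumed ordering, most general field first, so that the user is judged
# in the context of the service and operation being called.
FIELDS = ("service", "operation", "user")


class PathNovelty:
    """Toy categorical novelty scorer (illustrative only)."""

    def __init__(self):
        self.seen = Counter()  # how often each path prefix has been observed

    def observe(self, values):
        """Return (score, most_novel_field) for one observation, then learn it."""
        paths = [tuple(values[:i + 1]) for i in range(len(values))]
        # A path never seen before scores 1.0; a path seen n times scores 1/(1+n).
        surprises = [1.0 / (1 + self.seen[p]) for p in paths]
        score = max(surprises)
        field = FIELDS[surprises.index(score)]
        self.seen.update(paths)
        return score, field


model = PathNovelty()
# Invented history: raul routinely lists S3 buckets, ana routinely inspects load balancers.
for _ in range(50):
    model.observe(("s3.amazonaws.com", "ListBuckets", "raul"))
    model.observe(("elasticloadbalancing.amazonaws.com", "DescribeLoadBalancers", "ana"))

familiar = model.observe(("s3.amazonaws.com", "ListBuckets", "raul"))
novel = model.observe(("elasticloadbalancing.amazonaws.com", "DescribeLoadBalancers", "raul"))
print("familiar:", familiar)
print("novel:", novel)
```

In this toy setup the load balancer call itself is routine; what scores highly is that raul is the one making it, which mirrors the kind of insight described above.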
Seeing this pattern and the multiple novel events generated by raul, it’s clear that raul’s credentials are being used in multiple highly unusual operations. In response, we can immediately force a logout and password reset for this one user. This remediation limits the disruption to the impacted user. Because we’ve identified the attacked services, we can follow up by reviewing those services’ access logs for other signs of suspicious activity.
How is thatDot Different?
Let’s compare this to watching for credential misuse through AWS CloudTrail Insights, which many teams use to find anomalies in CloudTrail events. CloudTrail Insights is based on traditional anomaly detection techniques, which, in this case, watch for a simple variance in the number of API calls. This method does not highlight the attack, because the attacker’s volume of events is too low. It did, however, highlight a number of other events that are clearly false positives.
As an example, one of the CloudTrail Insights events shows that API calls to create a network interface on EC2 increased over a period of time. But this event is neither new nor novel. An EC2 instance can have multiple network interfaces attached, up to 15 on some instance types, and each of these can have one or more separate IP addresses. This is necessary for deploying services with multiple SSL certificates, and for many other purposes.
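The gap between volume-based detection and a low-volume attack can be sketched with a simple z-score over hourly API call counts. This is a simplified stand-in for count-and-threshold methods in general, not the algorithm CloudTrail Insights actually uses. The function name `volume_anomalies`, the threshold, and all of the counts are invented for illustration.

```python
from statistics import mean, stdev


def volume_anomalies(history, current, z_threshold=3.0):
    """Flag API names whose current hourly count deviates strongly from history.

    Hypothetical helper: `history` maps an API name to past hourly counts,
    `current` maps an API name to the count in the hour being checked.
    """
    flagged = []
    for api, counts in history.items():
        mu, sigma = mean(counts), stdev(counts)
        z = (current.get(api, 0) - mu) / sigma if sigma > 0 else 0.0
        if abs(z) >= z_threshold:
            flagged.append(api)
    return flagged


# Invented hourly counts across the whole account.
history = {
    "CreateNetworkInterface": [4, 5, 6, 5, 4, 6],
    "DescribeLoadBalancers": [20, 22, 19, 21, 20, 18],
}
current = {
    "CreateNetworkInterface": 15,  # a routine deployment, but a large jump in volume
    "DescribeLoadBalancers": 23,   # includes three calls made with the stolen credentials
}

print(volume_anomalies(history, current))
```

With these numbers only the benign burst of network interface creation is flagged. The three extra load balancer calls stay well inside normal variation, because a count says nothing about who made the calls.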
These types of false positives create two problems for security analysts. First, the additional volume of data forces extra work onto the analysts, who must recategorize and eliminate the false positives. Second, real events that may be detected are buried within the stream of mixed information, creating the alert fatigue and missed events that characterize overworked security teams. Using thatDot Anomaly Detector produces a limited, high-confidence set of events, allowing analysts to identify, work, and resolve the most relevant ones. To find what’s truly novel, you will often need all of the context of an event. In this demonstration case, we’ve configured thatDot Anomaly Detector to focus on each user’s use of different operations on all services at various times of day. This context is also crucial for skills-based routing: getting the incident information to the right team, who can follow up with timely verification and remediation.
Alert on What Matters
Normally, you don’t watch Anomaly Detector in real time. Your security monitoring system has integrations that alert you to the anomalies that you care about. In this example, we set up a PagerDuty integration with a score threshold of >0.99, so we were paged only for this urgent anomaly. For Slack, we set the threshold to >0.95, so we received six other observations of interest that we can tackle as time permits.
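The two-tier thresholding described here can be sketched as a small routing function. The thresholds come from the example above; everything else is an assumption, including the function name `route`, the channel labels, the sample scores, and the choice to also post paged observations to Slack. The sketch only decides where an observation should go and makes no calls to the PagerDuty or Slack APIs.

```python
PAGER_THRESHOLD = 0.99  # page someone only for the most urgent anomalies
SLACK_THRESHOLD = 0.95  # lower bar for observations to review as time permits


def route(score):
    """Return the channels (hypothetical labels) that should receive an observation."""
    channels = []
    if score > SLACK_THRESHOLD:
        channels.append("slack")
    if score > PAGER_THRESHOLD:
        channels.append("pagerduty")
    return channels


# Invented novelty scores for a handful of observations.
for score in (0.42, 0.96, 0.995):
    print(score, route(score))
```

Keeping the paging threshold strictly higher than the chat threshold means the number of interruptions stays small even if the review queue grows.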
These benefits mean that a security analyst can focus on high-value, high-likelihood alerts, which can lead to a 10x increase in productivity. Good security analysts are hard to find, and their limited time and the burnout common in the industry make it crucial to use their time wisely.
thatDot’s categorical analysis delivers the capability to generate these high-value, high-confidence alerts, so that analysts can quickly find true anomalies, judge them for maliciousness, and filter out the noise. thatDot finds anomalies in a richer set of data, including categorical values like usernames, hostnames, file paths, URLs, process names, and more, not just numbers. Better tools for examining more kinds of data find more true anomalies and filter out results that are merely unsurprising numeric outliers. In CloudTrail, other anomaly detection solutions miss the fact that raul’s account was used to try to access several new services, because the context (the link between operation requests and a single user) went unexamined. This contextual awareness is the categorical difference.
If you’d like to see this in action, or learn more from the team that is pioneering categorical analysis at thatDot, click here.