Continuing our introductory series on ingesting logs into Q:CYBER and using our streaming rules engine, this post covers how to create windowed rules that address advanced threat detection challenges.
This is the fifth installment of our series on Q:CYBER. In Part 1, we covered how to configure NXLog to forward Windows Event Logs (WEL) into Q:CYBER. In Part 2, we described in more detail how our streaming rules engine works. In Part 3, we reviewed how to create simple rules using the Q:CYBER Rules Builder.
In Part 4, the previous installment, we discussed how to use Q:CYBER rule templates to detect Pass-the-Hash attacks based on Windows Event Logs. This basic type of streaming rule works well for writing many detections and can be quite powerful. However, some detections require more control over when their conditions count as met than the Pass-the-Hash detection did. In fact, some activities can only be deemed malicious when they occur as a series of events rather than as a single event in isolation.
For example, password spraying is a common variation on brute force attacks in which an attacker attempts to gain access by trying the same password against multiple accounts. If done correctly, this type of attack will not trigger the account lockout policies that protect against traditional brute force attacks. This post walks through how time-windowed rules can be applied to address these kinds of detection challenges.
Analyze the Data
One way to develop a detection is to execute the attack in a lab setting and look for the Indicators of Compromise (IOC). Tools such as DomainPasswordSpray are readily available on GitHub and can help with testing detections. Running the Invoke-DomainPasswordSpray command shown below attempts to validate the password Winter2016 against every user account on the domain.
Looking at the events generated on the Domain Controller, we can see 23 new 4625 events. Each 4625 event indicates that an account failed to log on, and each of these events is for a different user.
Thinking through this behavior further, we can make some assumptions about the attack:
- Attackers will likely spray one or more passwords against multiple users in the domain
- There are likely to be more failed logon attempts than successful attempts during the attack
- Attackers will likely be executing tools similar to DomainPasswordSpray from a single host where they have established a presence
The primary indicator of password spraying activity is a group of failed logon events that occur around the same time for many different accounts. However, this could be a fairly common occurrence on a large network where valid users often mistype or forget their passwords. Ideally, what we want is a way to group a series of events that come from the same host, based on the third assumption above. It turns out that the details in the 4625 event include a WorkStationName field that we can use for grouping events, as shown in Figure 3.
Based on what we know so far, we can summarize our example detection logic as:
“Alert me when we see excessive failed login attempts from the same workstation within a reasonably short window of time”
Of course, this is not a precise definition. We still need to decide how many failed attempts count as excessive and over what window of time. These values need to make sense for the environment if the rule is to work effectively as a detection or control.
Streaming View of Events
In Part 2 of this series we detailed how our streaming rules engine was designed as a multi-stage pipeline consisting of ingest, parse, evaluate, and store stages. Logs flow through each pipeline and are evaluated against all the rules applied to the data source. The visualization below shows a representation of two types of common events passing through the evaluate stage - 4624 and 4625.
If we created a simple rule to trigger when EventID equals 4625, the rule would trigger on every occurrence of event 4625 passing through the pipeline. The number of alerts generated by this rule would be unmanageable.
Even filtering further on error codes, so that the rule only triggers on “Unknown user name or bad password” failures, would still produce too many alerts to handle, because we would get an alert every time a user entered the wrong password. Instead, what we really need is to evaluate a series of events within a window of time.
Other SIEM products that use the scheduled query approach typically rely on time-bound queries to search for password spraying behavior. An example query written in Kusto that searches back over a day for clusters of 4625 events might look like:
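To make the alert-volume problem concrete, here is a minimal Python sketch of a per-event rule. It is an illustration only, not Q:CYBER's implementation: the event dictionaries, `simple_rule`, and `evaluate_stream` are hypothetical names invented for this example.

```python
from typing import Callable, Iterable

# Hypothetical parsed event, e.g. {"EventID": 4625, "WorkStationName": "WS01"}
Event = dict


def simple_rule(event: Event) -> bool:
    """Per-event condition: match every failed logon (EventID 4625)."""
    return event.get("EventID") == 4625


def evaluate_stream(events: Iterable[Event], rule: Callable[[Event], bool]) -> int:
    """Apply a per-event rule to each event and return how many alerts it raises."""
    return sum(1 for event in events if rule(event))


# Hypothetical stream: successful logons (4624) interleaved with failed ones (4625).
stream = [
    {"EventID": 4624, "WorkStationName": "WS01"},
    {"EventID": 4625, "WorkStationName": "WS02"},
    {"EventID": 4625, "WorkStationName": "WS03"},
    {"EventID": 4624, "WorkStationName": "WS02"},
    {"EventID": 4625, "WorkStationName": "WS01"},
    {"EventID": 4625, "WorkStationName": "WS04"},
    {"EventID": 4624, "WorkStationName": "WS03"},
    {"EventID": 4625, "WorkStationName": "WS05"},
]

alert_count = evaluate_stream(stream, simple_rule)
print(alert_count)  # one alert per 4625 event in the stream
```

Every mistyped password becomes its own alert here, regardless of which host it came from or how the failures cluster in time.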
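let timeframe = ago(1d);
let threshold = 8;
// Assumption: the source table is missing from the original listing; SecurityEvent is used here as a typical table for Windows security events
SecurityEvent
| where TimeGenerated > timeframe
| where EventID == 4625
| summarize Total = count() by bin(TimeGenerated, 5m), WorkstationName
| where Total > threshold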
But what schedule should you run this query on? Once a day? Once an hour? Every minute? How far back should you look? Q:CYBER can also support this kind of periodic query-based approach, and future posts will cover when it is appropriate and how to use it. It is less than ideal, however, for these common time-windowed detections, where the streaming, event-oriented architecture of the Q:CYBER rules engine provides better performance, timeliness, and control.
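For readers less familiar with Kusto, the following Python sketch approximates what the query above computes: it buckets failed logons into fixed 5-minute bins per workstation and keeps the buckets whose count exceeds the threshold. The function names and the event dictionary layout are assumptions made for this illustration, and `bin_floor` only mimics the rounding-down behavior of Kusto's `bin()`.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bin_floor(ts: datetime, width: timedelta) -> datetime:
    """Round a timezone-aware timestamp down to the start of its fixed-width bucket."""
    return ts - ((ts - EPOCH) % width)


def scheduled_query(events, now, lookback=timedelta(days=1),
                    width=timedelta(minutes=5), threshold=8):
    """Count 4625 events per (time bucket, workstation) within the lookback period
    and return only the buckets whose total is strictly greater than the threshold."""
    counts = Counter()
    for event in events:
        if event["TimeGenerated"] > now - lookback and event["EventID"] == 4625:
            bucket = bin_floor(event["TimeGenerated"], width)
            counts[(bucket, event["WorkstationName"])] += 1
    return {key: total for key, total in counts.items() if total > threshold}


# Hypothetical data: 10 failed logons from WS01 and 3 from WS02 around 12:01 UTC.
start = datetime(2020, 1, 1, 12, 1, tzinfo=timezone.utc)
events = [
    {"TimeGenerated": start + timedelta(seconds=i), "EventID": 4625,
     "WorkstationName": "WS01"}
    for i in range(10)
] + [
    {"TimeGenerated": start + timedelta(seconds=i), "EventID": 4625,
     "WorkstationName": "WS02"}
    for i in range(3)
]

now = datetime(2020, 1, 1, 13, 0, tzinfo=timezone.utc)
result = scheduled_query(events, now)
print(result)  # only the WS01 bucket starting at 12:00 exceeds the threshold
```

Note that nothing is reported until the function is actually run, which is the scheduling question raised above: results are only as timely as the interval at which the query executes.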
Q:CYBER Windowed Rules
Windowed rule evaluation is an inherent feature of the Q:CYBER streaming rules engine. Instead of worrying about scheduling queries, we can easily craft a rule to detect password spraying using the Q:CYBER rules builder. Rather than evaluating each event on its own, the rule does exactly what we need: it evaluates a group of events within a window of time.
To create a windowed rule, open the rules builder, select your data source, give the rule a name, and toggle the Windowed Rule switch to enable the additional configuration options shown in Figure 7.
The logical condition for this rule is very simple. We want to look for any Windows events with EventID equal to 4625.
If you have been following along in the series, you’ll notice that there are now more options available below the IF block in the rules builder. This is where you configure the additional logic for windowed evaluations. Several options can be configured here, but let’s focus on the Window every, Aggregation count, and GROUP BY options, shown in Figure 9.
The settings for windowed evaluations apply to the logic specified in the IF block above. In Figure 10, setting the Window every value to 5 means we’re looking for events that fall within a 5-minute window. Setting the Aggregation count value to 8 means we want to alert if we see 8 or more occurrences within our window.
The last windowed setting is GROUP BY. As mentioned above, we want to detect when a group of events comes from a single system, so we set this field to WorkStationName.
In summary, what we’ve set up is a detection that will:
“Alert me when 8 or more 4625 events occur from the same workstation within 5 minutes”
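The behavior described by this rule can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the Q:CYBER engine itself: the `WindowedRule` class and its parameters are hypothetical, the window is modeled as a sliding 5-minute span per group key, events are assumed to arrive in time order, and the counter is reset after each alert so that one burst does not alert on every additional event.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta, timezone


class WindowedRule:
    """Hypothetical windowed rule: alert when `count` matching events that share
    the same `group_by` value occur within `window` of each other."""

    def __init__(self, condition, group_by, window, count):
        self.condition = condition
        self.group_by = group_by
        self.window = window
        self.count = count
        self._seen = defaultdict(deque)  # group key -> timestamps of matching events

    def process(self, event) -> bool:
        """Feed one event through the rule; return True if it triggers an alert."""
        if not self.condition(event):
            return False
        timestamp = event["TimeGenerated"]
        times = self._seen[event.get(self.group_by)]
        times.append(timestamp)
        # Discard matches that have fallen out of the window.
        while times and timestamp - times[0] > self.window:
            times.popleft()
        if len(times) >= self.count:
            times.clear()  # assumption: start counting afresh after each alert
            return True
        return False


rule = WindowedRule(
    condition=lambda e: e.get("EventID") == 4625,
    group_by="WorkStationName",
    window=timedelta(minutes=5),
    count=8,
)

# Hypothetical replay: 23 failed logons for different users from WS01 (as in the
# lab test above) plus 3 ordinary failures from WS02, all one second apart.
start = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
events = [
    {"TimeGenerated": start + timedelta(seconds=i), "EventID": 4625,
     "WorkStationName": "WS01", "TargetUserName": f"user{i}"}
    for i in range(23)
] + [
    {"TimeGenerated": start + timedelta(seconds=i), "EventID": 4625,
     "WorkStationName": "WS02", "TargetUserName": "alice"}
    for i in range(3)
]
events.sort(key=lambda e: e["TimeGenerated"])

alerts = [e["WorkStationName"] for e in events if rule.process(e)]
print(alerts)
```

In this replay only WS01 produces alerts, while the handful of failures from WS02 stays below the threshold. Whether to reset after an alert, and whether windows slide or tumble, are design choices made for this sketch.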
Lastly, we set our severity and description as usual.
Now we can enable our rule and run Invoke-DomainPasswordSpray one more time to see the attack detected in Q:CYBER.
As with any heuristic, it may be necessary to tune these settings to your environment for the best coverage. As a final important note, any detection is only as good as the assumptions it is based on. Attackers are constantly evolving their techniques to avoid getting caught, and there is rarely a silver bullet that catches all variations of an attack. However, attackers only have to slip up once to trigger a detection and tip the odds in the defender’s favor.