Aggregate Risky Events

Description

Detect low and slow activities and complex insider threat patterns by finding users with concentrations of risky activities.


Use Case

Insider Threat, Advanced Threat Detection

Category

Insider Threat, Account Sharing

Security Impact

A constant dilemma in the SIEM space is how to handle events that are potentially risky, but not so risky that you want to send them directly to the SOC. There are a variety of methods for responding to these, including the high-powered capabilities of Splunk UBA (graph mining, machine learning, looking over all time? oh my!), but those just starting down this path will find success with simpler approaches built on the Splunk ES Risk Framework. Here we look at two methods. The first alerts on any user for whom the number of correlation searches firing in a short period exceeds a threshold, or for whom searches from multiple security domains fire in a short period of time. The second leverages just the numerical risk scores defined in the Risk Framework to surface users with bursty recent risk or consistently elevated long-term risk. Both detect users with a large volume of suspicious behavior even when no single event carries enough risk to generate an alert to the SOC, preventing a low and slow threat from slipping past your detections.

Alert Volume

Medium

SPL Difficulty

Medium

Journey

Stage 4

MITRE ATT&CK Tactics

Persistence
Privilege Escalation
Lateral Movement

MITRE ATT&CK Techniques

Valid Accounts
Remote Desktop Protocol
Windows Remote Management

MITRE Threat Groups

APT1
APT18
APT28
APT3
APT32
APT33
APT39
APT41
Axiom
Carbanak
Cobalt Group
Dragonfly 2.0
FIN10
FIN4
FIN5
FIN6
FIN8
Lazarus Group
Leviathan
Night Dragon
OilRig
Patchwork
PittyTiger
Soft Cell
Stolen Pencil
Suckfly
TEMP.Veles
Threat Group-1314
Threat Group-3390
menuPass

Data Sources

Ticket Management

How to Implement

You can use almost anything to generate a list of risky events -- in fact, that's the intent of most medium-or-higher alert volume searches in Splunk Security Essentials! Most customers who go down this path use the Enterprise Security Risk Framework to aggregate these events, though you can also 'roll your own' risk framework with summary indexing, as sketched below.
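For those without Enterprise Security, a minimal sketch of the summary-indexing approach might look like the following. The summary index name (risky_events) and the fields risk_object, risk_score, and search_name are illustrative assumptions rather than required names, and the summary index must exist before collect can write to it. At the end of each contributing detection search, tag and store the result:

    ... your detection search ...
    | eval risk_object=user, risk_score=20, search_name="Access - Example Detection - Rule"
    | collect index=risky_events

Then a scheduled search aggregates the collected events, for example:

    index=risky_events earliest=-30d
    | stats sum(risk_score) as total_risk dc(search_name) as search_count values(search_name) as search_names by risk_object
    | where total_risk > 100 OR search_count >= 5

The thresholds shown are placeholders to adjust for your environment.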

Known False Positives

This search can inherently generate false positives because it opens the door to alerting on low-confidence events. You will need to tune the underlying searches, or potentially filter on a minimum risk score, to manage false positives.

How To Respond

See the response guidance for the underlying searches that generate the original events.

Help

Aggregate Risky Events Help

This example leverages the Simple Search assistant. Our example dataset is a collection of events from the risk index of a demo environment for Splunk Enterprise Security. The risk index provides a way to track "risky" events that aren't necessarily significant enough to send to a SOC analyst (along with aggregating the overall volume of events for a given user). Our live search looks for the same behavior using the standard sourcetypes. See How to Implement if you do not use Enterprise Security.

SPL for Aggregate Risky Events

Demo Data - By Count

First we bring in our basic demo dataset: a list of events from the risk index of a demo Splunk Enterprise Security environment. We're using a macro called Load_Sample_Log_Data to wrap around | inputlookup, just so it is cleaner for the demo data.
Many people have asked for the ability to alert on events crossing different security domains. Out-of-the-box ES correlation searches follow the naming convention "Security Domain - Name of the Rule - Rule", so this regex extracts the security domain from the search name.
In order to do this analysis, we bucket by _time so we can look over many days. (This is more relevant for demo data.)
The heart of the search uses the stats command to calculate the number of distinct security domains and the number of distinct correlation searches that fire per risk_object. We also include the values of those fields for context.
Finally we can filter for risk_objects that have 3 or more different security domains or 5 or more searches firing. (Thresholds are somewhat arbitrary, but probably reasonable for many organizations.)
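Putting those steps together, a minimal sketch of the demo search might look like the following. The macro invocation, the regex, the field names (search_name, risk_object, security_domain), and the exact thresholds are assumptions based on the description above and may differ slightly from the shipped search:

    | `Load_Sample_Log_Data`
    | rex field=search_name "^(?<security_domain>.*?)\s+-\s+"
    | bin _time span=1d
    | stats dc(security_domain) as security_domain_count dc(search_name) as search_count values(security_domain) as security_domains values(search_name) as search_names by risk_object _time
    | where security_domain_count >= 3 OR search_count >= 5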

Live Data - By Count

First we bring in our dataset of events from the risk index of a Splunk Enterprise Security environment.
Many people have asked for the ability to alert on events crossing different security domains. Out-of-the-box ES correlation searches follow the naming convention "Security Domain - Name of the Rule - Rule", so this regex extracts the security domain from the search name.
The heart of the search uses the stats command to calculate the number of distinct security domains and the number of distinct correlation searches that fire per risk_object. We also include the values of those fields for context.
Finally we can filter for risk_objects that have 3 or more different security domains or 5 or more searches firing. (Thresholds are somewhat arbitrary, but probably reasonable for many organizations.)
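A minimal sketch of the live version, assuming the default risk index name (risk) and the standard search_name and risk_object fields; the 24-hour window and the thresholds are examples to tune:

    index=risk earliest=-24h
    | rex field=search_name "^(?<security_domain>.*?)\s+-\s+"
    | stats dc(security_domain) as security_domain_count dc(search_name) as search_count values(security_domain) as security_domains values(search_name) as search_names by risk_object
    | where security_domain_count >= 3 OR search_count >= 5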

Demo Data - By Score

First we bring in our basic demo dataset: a list of events from the risk index of a demo Splunk Enterprise Security environment. We're using a macro called Load_Sample_Log_Data to wrap around | inputlookup, just so it is cleaner for the demo data.
This line is only needed for demo data (which, like us, gets older and more out of date with each passing moment), so we use eventstats to calculate the most recent timestamp in the dataset instead of just using eval's now() function.
Next we add up the total risk per risk_object: over the entire lifetime of the dataset (30 days) in the field thirty_day_risk, and over just the most recent day in the field one_day_risk. The latter is possible because stats lets us embed eval functions directly in the aggregation.
While we could build a behavioral threshold, it's usually easier and more manageable to use a set threshold that defines how much risk we're willing to tolerate. These thresholds can be adjusted from company to company, but the basic idea is to set a different threshold for recent bursty risk activity versus long-term low and slow activity.
Here we filter for where either the one day risk level or the thirty day risk level are over their respective thresholds.
One final step: to be clear to the SOC analysts, we should indicate why we chose to surface this alert, so we use the same thresholds to add an explanatory message to each result.
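A minimal sketch of the demo by-score search, assuming the standard risk_score and risk_object fields; the one-day (100) and thirty-day (250) thresholds are illustrative only:

    | `Load_Sample_Log_Data`
    | eventstats max(_time) as most_recent_time
    | stats sum(risk_score) as thirty_day_risk sum(eval(if(_time > most_recent_time - 86400, risk_score, 0))) as one_day_risk by risk_object
    | eval one_day_threshold=100, thirty_day_threshold=250
    | where one_day_risk > one_day_threshold OR thirty_day_risk > thirty_day_threshold
    | eval reason=case(one_day_risk > one_day_threshold AND thirty_day_risk > thirty_day_threshold, "High recent and sustained risk", one_day_risk > one_day_threshold, "Bursty risk in the last day", thirty_day_risk > thirty_day_threshold, "Sustained risk over the last 30 days")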

Live Data - By Score

First we bring in our dataset of events from the risk index of a Splunk Enterprise Security environment.
Next we add up the total risk per risk_object: over the entire lifetime of the dataset (30 days) in the field thirty_day_risk, and over just the most recent day in the field one_day_risk. The latter is possible because stats lets us embed eval functions directly in the aggregation.
While we could build a behavioral threshold, it's usually easier and more manageable to use a set threshold that defines how much risk we're willing to tolerate. These thresholds can be adjusted from company to company, but the basic idea is to set a different threshold for recent bursty risk activity versus long-term low and slow activity.
Here we filter for where either the one day risk level or the thirty day risk level are over their respective thresholds.
One final step: to be clear to the SOC analysts, we should indicate why we chose to surface this alert, so we use the same thresholds to add an explanatory message to each result.
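A minimal sketch of the live by-score search, assuming the default risk index; because the data is current, now() replaces the eventstats step used for the demo data, and the same illustrative thresholds apply:

    index=risk earliest=-30d
    | stats sum(risk_score) as thirty_day_risk sum(eval(if(_time > relative_time(now(), "-1d"), risk_score, 0))) as one_day_risk by risk_object
    | eval one_day_threshold=100, thirty_day_threshold=250
    | where one_day_risk > one_day_threshold OR thirty_day_risk > thirty_day_threshold
    | eval reason=case(one_day_risk > one_day_threshold AND thirty_day_risk > thirty_day_threshold, "High recent and sustained risk", one_day_risk > one_day_threshold, "Bursty risk in the last day", thirty_day_risk > thirty_day_threshold, "Sustained risk over the last 30 days")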

Screenshot of Demo Data