User with Many DLP Events

Description

Detect users that have many DLP events in a short period of time.


Use Case

Insider Threat

Category

Insider Threat

Security Impact

DLP events track meaningful risks to your organization, but unfortunately in most environments the volume of DLP events creates so much noise that valid events are disregarded. While you should absolutely be correlating DLP events with other security events, you may also wish to surface users that have a high number of DLP events in a short period of time, so that you can prioritize investigation of those events.

Alert Volume

Medium

SPL Difficulty

Medium

Journey

Stage 3

MITRE ATT&CK Tactics

Exfiltration

MITRE ATT&CK Techniques

Exfiltration

Kill Chain Phases

Actions on Objectives

Data Sources

DLP

How to Implement

Ensure that your DLP events are ingested and tagged in accordance with the Common Information Model, and tune the threshold to the degree you desire.
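
As a quick check, you can confirm that DLP events are arriving with CIM-compliant fields before enabling the detection. This is a minimal sketch; it assumes your DLP add-ons apply the dlp tag and populate the user field.

    tag=dlp earliest=-24h
    | stats count by sourcetype, user

If the user column is empty or the search returns nothing, revisit your add-on configuration before relying on this detection.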

Known False Positives

DLP systems in most organizations will generate many false positives (or at least, lots of noise). The goal of this search is to take users that have many different DLP events on a given day and escalate the priority of those events. That means that inherently there will still be false positives from this search based on the underlying policy. Tune your DLP system or adjust the threshold to control the noise from this search.

How To Respond

Investigate the underlying DLP events for the flagged user and respond to each event as appropriate.

Help

User with Many DLP Events Help

This example leverages the simple search assistant. Our dataset is an anonymized collection of DLP events. For this analysis, we are looking for users with a large number of DLP events, based on a specific threshold.

SPL for User with Many DLP Events

Demo Data

First we pull in our demo dataset.
Bucket (aliased to bin) allows us to group events based on _time, effectively flattening the actual _time value to the same day.
Next, we count and aggregate per user, per day.
We then calculate the mean, standard deviation, and most recent value per user.
Finally, we calculate the bounds as a multiple of the standard deviation; a sketch of the full search follows below.
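
Here is a minimal sketch of that search. The lookup name demo_dlp_events.csv is a placeholder for the demo dataset (assumed to contain a user field and an epoch _time field), and the multiplier of 4 on the standard deviation is only an example threshold; adjust both to suit your environment.

    | inputlookup demo_dlp_events.csv
    | bin _time span=1d
    | stats count by user, _time
    | stats avg(count) as avg, stdev(count) as stdev, latest(count) as recent_count by user
    | eval upper_bound = avg + stdev * 4
    | where recent_count > upper_bound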

Live Data

First we pull in our DLP events, tagged via the TAs that are compliant with Splunk's Common Information Model. You can also point this at the index and sourcetype of your DLP logs directly.
Bucket (aliased to bin) allows us to group events based on _time, effectively flattening the actual _time value to the same day.
Next, we count and aggregate per user, per day.
We then calculate the mean, standard deviation, and most recent value per user.
Finally, we calculate the bounds as a multiple of the standard deviation; see the sketch after these steps.
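
A sketch of the live search follows, assuming your DLP add-ons apply the dlp tag and populate a CIM-compliant user field; the standard deviation multiplier of 4 is again just an example threshold.

    tag=dlp
    | bin _time span=1d
    | stats count by user, _time
    | stats avg(count) as avg, stdev(count) as stdev, latest(count) as recent_count by user
    | eval upper_bound = avg + stdev * 4
    | where recent_count > upper_bound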

Accelerated Data

Here, tstats gives us, in a single command, a very fast count of DLP alerts per user, per day.
We then calculate the mean, standard deviation, and most recent value per user.
Finally, we calculate the bounds as a multiple of the standard deviation; a sketch follows below.
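
The sketch below assumes the CIM DLP data model (DLP.DLP_Incidents) is accelerated in your environment and that its user field is populated; adjust the data model reference and the example multiplier of 4 as needed.

    | tstats summariesonly=true count from datamodel=DLP.DLP_Incidents by DLP_Incidents.user _time span=1d
    | rename DLP_Incidents.user as user
    | stats avg(count) as avg, stdev(count) as stdev, latest(count) as recent_count by user
    | eval upper_bound = avg + stdev * 4
    | where recent_count > upper_bound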

Screenshot of Demo Data