Spike in Exported Records from Salesforce.com

Description

For many organizations, Salesforce.com contains their most critical business information. This use case tracks the number of records exported per day, and is based on a real (anonymized) customer dataset.


Use Case: Compliance, Insider Threat
Category: Data Exfiltration, GDPR, SaaS
Alert Volume: Medium
SPL Difficulty: Hard
Journey: Stage 3
MITRE ATT&CK Tactics: Collection
MITRE ATT&CK Techniques: Data from Information Repositories
MITRE Threat Groups: APT28, Ke3chang
Kill Chain Phases: Actions on Objectives
Data Sources: Audit Trail

GDPR Relevance

Impact:

A sudden, high-volume increase in exported records can indicate unauthorized, non-compliant, and potentially malicious behavior.

Specific to GDPR, detecting and proving that individuals within the organization are not abusing or misusing legitimate access to assets that store and process personal data is an industry best practice and can be considered an effective security control, as required by Article 32. This applies to the processing of personal data by the controller, and must also be addressed when contractors or sub-processors from third countries or international organizations access and transfer personal data (Article 15).

How to Implement

Implementation of this example (or any of the Time Series Spike / Standard Deviation examples) is generally pretty simple.

  • Validate that you have the right data onboarded and that the fields you want to monitor are properly extracted; you can verify this by confirming that the base search shown in the box below returns results.
  • Save the search to run over a long period of time (recommended: at least 30 days).

For most environments, these searches can be run once a day, often overnight, without worrying too much about search performance. If you wish to run this search more frequently, or if it is too slow for your environment, we recommend using a summary index that first aggregates the data, as sketched below. We will have documentation for this process shortly; in the meantime, consult Splunk's documentation on summary indexing.
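As a rough sketch of the summary-index approach: a nightly scheduled search pre-aggregates per-user export counts, and the spike detection then runs against the (much smaller) summary. The index, sourcetype, and summary index names below are assumptions for illustration, not values shipped with this example.

``` nightly scheduled search: pre-aggregate per-user export counts ```
``` index=salesforce, the sourcetype, and the summary index name are assumptions ```
index=salesforce sourcetype=sfdc:logfile EVENT_TYPE=Export ROWS_PROCESSED>0
| bin _time span=1d
| stats sum(ROWS_PROCESSED) AS count BY USER_ID, _time
``` write the aggregated rows to a summary index for the spike search to read ```
| collect index=sfdc_export_summary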

Known False Positives

This is a strictly behavioral search, so we define "false positive" slightly differently. Every time this fires, it will accurately reflect a spike in the number we're monitoring... it's nearly impossible for the math to lie. But while there are really no "false positives" in a traditional sense, there is definitely a lot of noise.

How you handle these alerts depends on where you set the standard deviation multiplier. If you set a low multiplier (2 or 3), you are likely to get a lot of events that are useful only for contextual information. If you set a high multiplier (6 or 10), the amount of noise can be reduced enough to send alerts directly to analysts.
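Concretely, the multiplier lives in the bound calculation at the end of the search. A minimal sketch, assuming avg, stdev, and latest have already been computed per user (as in the searches further below):

``` raise the multiplier (6-10) to cut noise; lower it (2-3) for context ```
| eval upperBound=(avg + stdev * 4)
| where latest > upperBound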

How To Respond

When this search returns values, initiate your incident response process and identify the user demonstrating this behavior. Capture the time of the event, the user's role, and the number of rows exported. If possible, determine the system the user used to download the data and its location. Contact the user and their manager to determine whether the export was authorized; if so, document the authorization and who granted it. If not, the user's credentials may have been used by another party and additional investigation is warranted.

Help


This example leverages the Detect Spikes (standard deviation) search assistant. Our dataset is an anonymized data collection from an actual customer environment.

SPL for Spike in Exported Records from Salesforce.com

Demo Data

  • First we pull in our demo SFDC dataset.
  • Then we filter for what we're looking for in this use case, specifically export EVENT_TYPEs with at least one ROWS_PROCESSED.
  • Then we enrich the events, converting the SFDC USER_ID into a friendly username via a lookup.
  • Bucket (aliased to bin) allows us to group events based on _time, effectively flattening the actual _time value to the same day.
  • Next, we count and aggregate per user, per day.
  • We then calculate the mean, standard deviation, and most recent value.
  • Finally, we calculate the bounds as a multiple of the standard deviation.
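Putting those steps together, the demo search looks roughly like the sketch below. The lookup and field names (sfdc_event_log_demo.csv, sfdc_usernames, username) and the multiplier of 4 are illustrative assumptions, not necessarily what ships with the app.

``` pull in the demo SFDC dataset (lookup name is hypothetical) ```
| inputlookup sfdc_event_log_demo.csv
``` keep export events that processed at least one row ```
| search EVENT_TYPE=Export ROWS_PROCESSED>0
``` convert the SFDC USER_ID into a friendly username ```
| lookup sfdc_usernames USER_ID OUTPUT username AS user
``` flatten _time to the day ```
| bin _time span=1d
``` total rows exported per user, per day ```
| stats sum(ROWS_PROCESSED) AS count BY user, _time
``` mean, standard deviation, and most recent value per user ```
| stats avg(count) AS avg stdev(count) AS stdev latest(count) AS latest BY user
``` bounds as a multiple (here 4) of the standard deviation ```
| eval upperBound=(avg + stdev * 4)
| where latest > upperBound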

Live Data

  • First we pull in our SFDC dataset and filter for what we're looking for in this use case, specifically export EVENT_TYPEs with at least one ROWS_PROCESSED.
  • Then we enrich the events, converting the SFDC USER_ID into a friendly username via a lookup.
  • Bucket (aliased to bin) allows us to group events based on _time, effectively flattening the actual _time value to the same day.
  • Next, we count and aggregate per user, per day.
  • We then calculate the mean, standard deviation, and most recent value.
  • Finally, we calculate the bounds as a multiple of the standard deviation.
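Against live data the shape is the same; only the first line changes, pulling events from wherever your Salesforce event logs are indexed rather than from a demo lookup. The index and sourcetype below are assumptions; adjust them to your environment.

``` live SFDC event logs; index and sourcetype are assumptions ```
index=salesforce sourcetype=sfdc:logfile EVENT_TYPE=Export ROWS_PROCESSED>0
``` convert the SFDC USER_ID into a friendly username ```
| lookup sfdc_usernames USER_ID OUTPUT username AS user
``` flatten _time to the day, then aggregate per user, per day ```
| bin _time span=1d
| stats sum(ROWS_PROCESSED) AS count BY user, _time
``` mean, standard deviation, most recent value, and bounds ```
| stats avg(count) AS avg stdev(count) AS stdev latest(count) AS latest BY user
| eval upperBound=(avg + stdev * 4)
| where latest > upperBound

Run this over at least 30 days of data so the mean and standard deviation have enough samples to be meaningful.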

Screenshot of Demo Data