Healthcare Worker Opening More Patient Records Than Usual


If a healthcare worker views more patient records than normal or more than their peers, it could be a sign that their system is infected, or that they are exfiltrating patient data.

Use Case

Insider Threat, Advanced Threat Detection, Compliance


Data Exfiltration, Insider Threat

Security Impact

Healthcare organizations need to be particularly concerned about privacy laws (HIPAA/HITECH) and the exfiltration of financially lucrative patient records. Patient records are worth 10x or 20x what credit card numbers are worth on the black market, because the more you know about an individual, the more useful that individual's information is for identity theft. This data can also be used to impersonate the victim and obtain healthcare services, with the bill for those services going to the “real” individual. Therefore, distinguishing “normal” access to patient records from anomalous access patterns is critical to detecting this activity before records are exfiltrated. Note that this search could find true insiders acting maliciously, or users whose credentials have been compromised and whose accounts are now being hijacked for data exfiltration.

Alert Volume


SPL Difficulty


Data Availability



Stage 4



MITRE ATT&CK Techniques

Data from Information Repositories
Data from Network Shared Drive

MITRE Threat Groups

Gamaredon Group

Kill Chain Phases

Actions On Objectives

Data Sources


   How to Implement

Implementation of this example (or any of the Time Series Spike / Standard Deviation examples) is generally pretty simple.

  • Validate that you have the right data onboarded, and that the fields you want to monitor are properly extracted. You can confirm this by checking that the base search you see in the box below returns results.
  • Save the search to run over a long period of time (recommended: at least 30 days).

For most environments, these searches can be run once a day, often overnight, without worrying too much about a slow search. If you wish to run this search more frequently, or if this search is too slow for your environment, we recommend using a summary index that first aggregates the data. We will have documentation for this process shortly, but for now you can refer to the Splunk documentation on summary indexing.

   Known False Positives

This is a strictly behavioral search, so we define "false positive" slightly differently. Every time this fires, it will accurately reflect a spike in the number we're monitoring... it's nearly impossible for the math to lie. But while there are really no "false positives" in a traditional sense, there is definitely lots of noise.

How you handle these alerts depends on where you set the standard deviation threshold. If you set a low threshold (2 or 3 standard deviations), you are likely to get a lot of events that are useful only as contextual information. If you set a high threshold (6 or 10 standard deviations), the amount of noise can be reduced enough to send an alert directly to analysts.
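The two tiers described above can be sketched outside of Splunk. The Python below is only an illustration, not part of the search itself: the function name `triage`, the tier labels, and the default thresholds of 3 and 6 standard deviations are assumptions chosen to mirror the ranges mentioned in the text.

```python
def triage(recent, avg, stdev, context_at=3.0, alert_at=6.0):
    """Classify the most recent daily value against a user's baseline.

    recent: most recent daily count for the user
    avg, stdev: baseline average and standard deviation for that user
    context_at, alert_at: hypothetical multipliers for the two tiers
    """
    excess = recent - avg
    if excess > alert_at * stdev:
        return "alert"      # quiet enough to send directly to analysts
    if excess > context_at * stdev:
        return "context"    # keep as supporting evidence only
    return "ignore"


# Example baseline: about 10.86 patients per day, standard deviation 1.35
for recent in (12, 16, 55):
    print(recent, triage(recent, avg=10.86, stdev=1.35))
```

Lowering `alert_at` sends more events to analysts; raising it trades coverage for less noise.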

   How To Respond

When this search returns values, initiate your incident response process and identify the user demonstrating this behavior. Determine the time, role, and number of accesses occurring, and from what address if possible. Contact the user and their manager to determine whether the access is authorized; if it is, document that it was authorized and by whom. If not, the user's credentials may have been used by another party, and additional investigation is warranted to determine whether patient data is being collected.


Healthcare Worker Opening More Patient Records Than Usual Help

This example leverages the Detect Spikes (standard deviation) search assistant. Our demo dataset is a set of manufactured logs based on a Cerner patient record system, which tracked the number of unique patients whose records a particular doctor, nurse, DBA, etc. opened per day, 'dc(PatientID) as NumOpens by EmployeeName _time'. (In effect, if you worked with one patient all day and opened their chart 100 times, you would have a NumOpens of 1, because you only viewed one patient.) Then we calculate the average, the standard deviation, and the most recent value, and filter out any users whose most recent value is within the configurable number of standard deviations of their average. Notably, the daily dc() is already done in this dataset; this is akin to analyzing a summary index, as is explained in the High Cardinality Alert dialog.
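As a rough illustration of that calculation outside of Splunk, the Python sketch below takes already-aggregated daily NumOpens values per employee and keeps only the employees whose most recent value exceeds the upper bound. The function name `find_spikes`, the sample data, and the default multiplier of 6 are hypothetical. It also assumes the baseline excludes the most recent day; the actual search may compute this differently.

```python
from statistics import mean, stdev


def find_spikes(daily_opens, num_stdevs=6.0):
    """Return employees whose latest NumOpens exceeds avg + num_stdevs * stdev.

    daily_opens maps an employee name to a list of daily NumOpens values,
    oldest first. The last value is treated as the most recent day.
    """
    spikes = {}
    for employee, counts in daily_opens.items():
        if len(counts) < 3:
            continue  # a sample standard deviation needs two baseline days
        baseline, recent = counts[:-1], counts[-1]
        avg = mean(baseline)
        sd = stdev(baseline)  # sample standard deviation
        upper_bound = avg + num_stdevs * sd
        if recent > upper_bound:
            spikes[employee] = {
                "recent": recent,
                "avg": avg,
                "stdev": sd,
                "upper_bound": upper_bound,
            }
    return spikes


# Manufactured example data, one value per day
daily_opens = {
    "nurse_a": [10, 12, 11, 9, 13, 10, 11, 55],
    "doctor_b": [20, 22, 19, 21, 20, 23, 21, 22],
}
spikes = find_spikes(daily_opens)
print(sorted(spikes))
```

Only `nurse_a` is returned here, because 55 is far above that employee's own baseline, while `doctor_b` stays within the normal range despite a higher absolute count.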

SPL for Healthcare Worker Opening More Patient Records Than Usual

Demo Data

First we pull in our demo dataset.
We would normally need to aggregate now per user per day, but in this case the demo dataset is already aggregated (pulling from a summary index, as is often done in this scenario).
Calculate the mean, standard deviation, and most recent value.
Calculate the bounds as a multiple of the standard deviation.

Live Data

First we pull in our Cerner audit log dataset.
Bucket (aliased to bin) allows us to group events based on _time, effectively flattening the actual _time value to the same day.
Finally, we can count and aggregate per user, per day.
Calculate the mean, standard deviation, and most recent value.
Calculate the bounds as a multiple of the standard deviation.
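The bucketing and per-user, per-day aggregation steps for live data can be sketched in Python as follows. This is an illustration of the idea rather than the search itself: the event tuple layout, the function name `daily_unique_patients`, and the sample events are assumptions, and real Cerner audit logs will have different field names and formats.

```python
from collections import defaultdict
from datetime import datetime


def daily_unique_patients(events):
    """Count distinct patients per employee per day.

    events: iterable of (iso_timestamp, employee_name, patient_id) tuples.
    Returns a dict keyed by (employee_name, "YYYY-MM-DD").
    """
    seen = defaultdict(set)
    for timestamp, employee, patient_id in events:
        # Flatten the timestamp to its day, as bucketing _time by day would
        day = datetime.fromisoformat(timestamp).date().isoformat()
        seen[(employee, day)].add(patient_id)
    # The distinct count plays the role of dc(PatientID) as NumOpens
    return {key: len(patients) for key, patients in seen.items()}


# Manufactured example events
events = [
    ("2024-01-01T08:00:00", "nurse_a", "P1"),
    ("2024-01-01T09:30:00", "nurse_a", "P1"),  # same chart again, counted once
    ("2024-01-01T10:00:00", "nurse_a", "P2"),
    ("2024-01-02T08:00:00", "nurse_a", "P3"),
    ("2024-01-01T11:00:00", "doctor_b", "P1"),
]
num_opens = daily_unique_patients(events)
print(num_opens)
```

The resulting daily values are what the mean, standard deviation, and bounds are then calculated over, one series per employee.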