Processes with Lookalike (typo) Filenames

Description

To evade analysts, attackers will create a service with a name similar to that of a standard Windows service. This search looks for process filenames that differ only slightly from those of standard Windows processes. Idea from David Bianco, formerly of Sqrrl.


Use Case

Advanced Threat Detection

Category

Endpoint Compromise

Alert Volume

Very Low

SPL Difficulty

Hard

Journey

Stage 4

MITRE ATT&CK Tactics

Defense Evasion
Execution

MITRE ATT&CK Techniques

Masquerading
Service Execution

MITRE Threat Groups

APT1
APT32
APT41
BRONZE BUTLER
Carbanak
Dragonfly 2.0
FIN6
FIN7
Honeybee
Ke3chang
MuddyWater
PLATINUM
Patchwork
Poseidon Group
Scarlet Mimic
Silence
Soft Cell
Sowbug
TEMP.Veles
admin@338
menuPass

Kill Chain Phases

Installation

Data Sources

Windows Security
Endpoint Detection and Response

How to Implement

Implementing this search is similar to implementing any of the other searches that require EDR data. To use it, we need process launch events. For the demo and live versions of this search we use Microsoft Sysmon logs, but you could apply it to any other data source that shows launched processes. Just adjust the file path field (Image) to match, since it is not a CIM field.

Known False Positives

This search will find any launched executable with a filename similar to that of a standard Windows process, much like running dnstwist on a domain name. If there are valid processes with names similar to standard Windows processes, they will create false positives. One might imagine a system speed-up service creating a process like svchaste.exe, which has a difference of 2 from svchost.exe and would trigger (one edit for the o->a, one for the added e).
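To make the edit count above concrete, here is a minimal Python sketch of a standard Levenshtein distance. It is an illustration only and not the URL Toolbox implementation the search actually uses; the function name `levenshtein` is chosen here for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


# The false-positive example from the text: one substitution plus one insertion.
print(levenshtein("svchost.exe", "svchaste.exe"))  # 2
# An exact match scores 0 and is excluded by the search.
print(levenshtein("svchost.exe", "svchost.exe"))   # 0
```

Because a score of 2 falls inside the range the search flags, a legitimate binary named like this would need to be reviewed and documented by an analyst.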

How To Respond

When this search returns values, initiate your incident response process and capture the time of the event, the filename and path of the suspect file, the system it executed on, the user, and any other pertinent information. Contact the user and the owner of the system. If the behavior is authorized, document that it is authorized and by whom. If not, the user credentials may have been used by another party and additional investigation is warranted.

Help

Processes with Lookalike (typo) Filenames Help

This example leverages the Simple Search assistant. Our dataset is a collection of Windows process launches (EventID 4688), with one injected to use scvhost.exe instead of svchost.exe. The search then leverages the URL Toolbox app from apps.splunk.com to run a Levenshtein distance calculation against several common Windows processes. It first makes sure that there are no exact matches (distance=0), and then filters for cases where any of the matches is 1 or 2 away from the normal name (e.g., svdhost.exe or scvhost.exe). A few other techniques in the search give the analyst more context. The most interesting is the use of curly braces, "{score}", which inserts the value of the score field (e.g., "2") into the variable name (so, score2). The remaining complicated parts of the search are just formatting, in particular the use of foreach to pull out all suspicious filenames from the dataset.
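For readers more familiar with general-purpose languages, the curly-brace trick and the foreach step can be approximated in Python as below. This is only an analogy, not SPL. The `scores` values are hypothetical inputs assumed to be already computed: the 2 matches the scvhost.exe example, while the 9 is a placeholder standing in for any distant name.

```python
# Hypothetical event and precomputed scores (comparison term -> distance).
event = {"filename": "scvhost.exe"}
scores = {"svchost.exe": 2, "explorer.exe": 9}  # 9 is a placeholder value

# Analogue of the curly-brace trick: the score's value becomes part of the
# field name, so a score of 2 creates a field called "score2".
for comparison_term, score in scores.items():
    event[f"score{score}"] = comparison_term

# Analogue of the foreach step: walk every field starting with "score" and
# keep the stored value when the numeric suffix is below 3.
suspect_files = [
    value
    for field, value in event.items()
    if field.startswith("score") and int(field[len("score"):]) < 3
]

print(event["score2"])  # svchost.exe
print(suspect_files)    # ['svchost.exe']
```

Encoding the score in the field name is what lets a later wildcard step recover which standard process the lookalike was close to.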

SPL for Processes with Lookalike (typo) Filenames

Demo Data

This is one of the longest searches in Splunk Security Essentials, but we'll break it down for you. We start easy: first we pull in our demo dataset.
Earlier versions of Sysmon didn't extract a filename by default, so we are adding that in here.
Ultimately when we're analyzing filenames, we are going to want to have a stats by filename, but with the context to understand the broader incident. So here we're adding the hosts that ran that filename, and the Image (file path) that it ran from.
We are about to start the process of comparing multiple standard Windows processes against each process that runs, and recording the scores in a field called ut_levenshtein. To make that work, we first have to initialize the field, so that we can continually add to it.
In this line, we compare the filename we see to svchost.exe using the Levenshtein algorithm that comes packaged in the free URL Toolbox app. Levenshtein is an edit distance algorithm: it compares two strings and counts the number of single-character edits it would take to make them match, e.g., svchost.exe vs ssvchost.exe would be a difference of 1 because you'd need to add one character. At the end, we use eval's mvappend function to add the score to the field levenshtein_scores, which by the end will have four values, one per comparison term.
This one is a neat SPL trick, though it only works in this one scenario. When you have an eval (and not an eval inside of a | foreach, or anything like that), you specify another variable in curly braces and it will insert the value of that variable. So here, ut_levenshtein contains a numeric score; let's suppose it is 2 for argument's sake. We then assign the value of comparisonterm (svchost.exe) to a field called score2. Note that this only works on the left-hand side of an eval statement (you can't create pointers, for those coming from C/C++ land), and it doesn't work inside of things like foreach. Still, it lets you do some awesome things you probably didn't expect.
Now we're going to repeat lines 4-6 for the term iexplore.exe, again adding the score to levenshtein_scores.
Now we're going to repeat lines 4-6 for the term ipconfig.exe, again adding the score to levenshtein_scores.
Now we're going to repeat lines 4-6 for the term explorer.exe, again adding the score to levenshtein_scores.
Before we filter, we're going to grab the total number of hosts in our environment, so we can filter out files that are seen by all of them (unlikely to be malware).
Now we filter out noise. Generally with Levenshtein, we look for scores that are greater than 0 (i.e., not an exact match), but less than 3.
Great, we now just have suspicious process launches, so it's now time to start making this usable for an analyst. To start with, let's grab that lowest levenshtein_scores value so we can tell them that.
Now we need to pull out the matching suspicious filename. This uses foreach, which basically iterates over anything starting with score (remember that SPL trick from line 6?), and if the score is less than 3, it will add it to the suspect_files field.
Next we calculate what percentage of the environment shows this filename.
Finally, we create a nice table.
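The grouping and prevalence steps in the walk-through above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not a translation of the search: the events, host names, and paths are invented, the function name `summarize_by_filename` is hypothetical, and the Levenshtein scoring is left out so that only the per-filename aggregation is shown.

```python
from collections import defaultdict

# Hypothetical process launch events; host names and paths are invented.
EVENTS = [
    {"host": "ws01", "Image": r"C:\Windows\System32\svchost.exe"},
    {"host": "ws02", "Image": r"C:\Windows\System32\svchost.exe"},
    {"host": "ws03", "Image": r"C:\Windows\System32\svchost.exe"},
    {"host": "ws02", "Image": r"C:\Users\Public\scvhost.exe"},
]


def summarize_by_filename(events):
    """Group launches by filename and measure how widespread each one is."""
    total_hosts = len({event["host"] for event in events})
    grouped = defaultdict(lambda: {"hosts": set(), "images": set()})
    for event in events:
        # Keep only the part of the path after the last backslash.
        filename = event["Image"].rsplit("\\", 1)[-1]
        grouped[filename]["hosts"].add(event["host"])
        grouped[filename]["images"].add(event["Image"])

    rows = {}
    for filename, info in grouped.items():
        num_hosts = len(info["hosts"])
        if num_hosts == total_hosts:
            continue  # seen on every host, so unlikely to be malware
        rows[filename] = {
            "hosts": sorted(info["hosts"]),
            "images": sorted(info["images"]),
            "num_hosts": num_hosts,
            "percentage": round(100 * num_hosts / total_hosts, 1),
        }
    return rows


summary = summarize_by_filename(EVENTS)
print(summary["scvhost.exe"]["hosts"])  # ['ws02']
```

Keeping the host list and full paths alongside each filename is what gives the analyst the broader context the walk-through mentions, while the percentage shows how rare the file is in the environment.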

Live Data

This is one of the longest searches in Splunk Security Essentials, but we'll break it down for you. We start easy: first we pull in our dataset of process launch events (here from Windows Sysmon or Native Windows, but could come from any EDR data source).
Next, we do a little eval magic to join the disparate fields in Windows Sysmon (Image) and Windows native Event Code 4688 (New_Process_Name). At the time of this writing, there is no Endpoint Data Model to drive a common naming scheme, so we need this extra step to support either way.
Then we extract the filename portion of the field, so that we can compare different filenames.
Ultimately when we're analyzing filenames, we are going to want to have a stats by filename, but with the context to understand the broader incident. So here we're adding the hosts that ran that filename, and the Image (file path) that it ran from.
We are about to start the process of comparing multiple standard Windows processes against each process that runs, and recording the scores in a field called ut_levenshtein. To make that work, we first have to initialize the field, so that we can continually add to it.
In this line, we compare the filename we see to svchost.exe using the Levenshtein algorithm that comes packaged in the free URL Toolbox app. Levenshtein is an edit distance algorithm: it compares two strings and counts the number of single-character edits it would take to make them match, e.g., svchost.exe vs ssvchost.exe would be a difference of 1 because you'd need to add one character. At the end, we use eval's mvappend function to add the score to the field levenshtein_scores, which by the end will have four values, one per comparison term.
This one is a neat SPL trick, though it only works in this one scenario. When you have an eval (and not an eval inside of a | foreach, or anything like that), you specify another variable in curly braces and it will insert the value of that variable. So here, ut_levenshtein contains a numeric score; let's suppose it is 2 for argument's sake. We then assign the value of comparisonterm (svchost.exe) to a field called score2. Note that this only works on the left-hand side of an eval statement (you can't create pointers, for those coming from C/C++ land), and it doesn't work inside of things like foreach. Still, it lets you do some awesome things you probably didn't expect.
Now we're going to repeat lines 4-6 for the term iexplore.exe, again adding the score to levenshtein_scores.
Now we're going to repeat lines 4-6 for the term ipconfig.exe, again adding the score to levenshtein_scores.
Now we're going to repeat lines 4-6 for the term explorer.exe, again adding the score to levenshtein_scores.
Before we filter, we're going to grab the total number of hosts in our environment, so we can filter out files that are seen by all of them (unlikely to be malware).
Now we filter out noise. Generally with Levenshtein, we look for scores that are greater than 0 (i.e., not an exact match), but less than 3.
Great, we now just have suspicious process launches, so it's now time to start making this usable for an analyst. To start with, let's grab that lowest levenshtein_scores value so we can tell them that.
Now we need to pull out the matching suspicious filename. This uses foreach, which basically iterates over anything starting with score (remember that SPL trick from line 6?), and if the score is less than 3, it will add it to the suspect_files field.
Next we calculate what percentage of the environment shows this filename.
Finally, we create a nice table.
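The two preparation steps that are specific to the live search, merging the Sysmon and native Windows path fields and then extracting the filename, might look like this in Python. It is a sketch only: the helper names `process_path` and `filename_of` are made up for illustration, and the sample paths are hypothetical.

```python
import re


def process_path(event):
    """Return the process path from whichever field the source provides."""
    # Sysmon uses Image; native Event Code 4688 uses New_Process_Name.
    return event.get("Image") or event.get("New_Process_Name")


def filename_of(path):
    """Keep everything after the last backslash or forward slash."""
    return re.split(r"[\\/]", path)[-1]


sysmon_event = {"Image": r"C:\Windows\System32\svchost.exe"}
native_event = {"New_Process_Name": r"C:\Users\Public\scvhost.exe"}

print(filename_of(process_path(sysmon_event)))  # svchost.exe
print(filename_of(process_path(native_event)))  # scvhost.exe
```

Normalizing to a single path value first means every later step can work on one filename field, regardless of which data source produced the event.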