Connection to New Domain

Description

Detects when users browse to domains never before seen in your organization.


Use Case

Advanced Threat Detection

Category

Command and Control, Data Exfiltration

Security Impact

Savvy threat hunters always know when users browse to new domains. This can be relevant in a variety of scenarios, but the primary one is that command and control servers, and staging servers hosting malware, usually sit on unusual domains. If you believe that a host is infected, checking whether it hit new domains is a great indicator to review. For more information on this detection in general, check out the great blog post specifically about this detection by Splunk's own Andrew Dauria (link).

Alert Volume

Very High

SPL Difficulty

Medium

Journey

Stage 2

MITRE ATT&CK Tactics

Exfiltration
Command and Control

MITRE ATT&CK Techniques

Exfiltration Over Command and Control Channel
Exfiltration Over Alternative Protocol
Standard Application Layer Protocol

MITRE Threat Groups

APT18
APT19
APT28
APT3
APT32
APT33
APT37
APT38
APT41
BRONZE BUTLER
Cobalt Group
Dark Caracal
Dragonfly 2.0
FIN4
FIN6
FIN7
FIN8
Gamaredon Group
Honeybee
Ke3chang
Kimsuky
Lazarus Group
Machete
Magic Hound
Night Dragon
OilRig
Orangeworm
Rancor
SilverTerrier
Soft Cell
Stealth Falcon
Threat Group-3390
Thrip
Turla
WIRTE

Kill Chain Phases

Actions on Objectives

Data Sources

Web Proxy

How to Implement

Implementing this search is relatively straightforward, as it expects Common Information Model-compliant data. Just ingest your proxy data (or other web browsing visibility, such as stream:http or bro), and make sure there is a uri field. The only other step is to make sure that you have the URL Toolbox app installed, which allows us to parse out the domains. When scaling this search to greater volumes of data (or more frequent runs), leverage data model acceleration, as shown in the Accelerated Data version of the search.

Known False Positives

This search will inherently be very noisy. In most organizations, new domains are a very small percentage of total domains, but sending every one of these events to analysts would still be overwhelming. As a result, there are no known false positives per se; rather, the value of any single alert is so small that you will want to treat these alerts differently from most correlation searches. They are mostly appropriate as contextual data, or to correlate with other indicators.

How to Respond

These events are generally best viewed as contextual data for another event, for example uncleaned malware, new services, or unusual logins. The easiest way to accomplish this is to record the events in a summary index, and then include searching that index as part of your investigative actions. Enterprise Security customers can do this easily with the Risk Framework, which is effectively that: create a risk indicator adaptive response action when saving this search, and it will adjust the risk score of the assets involved and show up in the Investigation Workbench when you analyze an asset. Ultimately, to analyze the efficacy of any given alert here, we recommend looking up the domains in an open source intelligence source such as VirusTotal or ThreatCrowd.
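As a sketch of the summary-index approach, you could append the collect command to the end of the scheduled detection search (the index name new_domain_tracker here is a hypothetical example; create and choose your own):

```spl
... | where earliest >= relative_time(now(), "-1d@d")
| collect index=new_domain_tracker
```

During an investigation, searching that index (e.g., index=new_domain_tracker ut_domain=<suspect domain>) then tells you quickly whether a host of interest recently hit a never-before-seen domain.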

Help

Connection to New Domain Help

This example leverages the First Seen Assistant. Our example dataset is a collection of anonymized proxy logs, in which a user browses to a new website. Our live search looks for the same behavior, as detailed in How to Implement.

SPL for Connection to New Domain

Demo Data

First we bring in our basic demo dataset. In this case, sample Proxy logs from a Palo Alto Networks NGFW. We're using a macro called Load_Sample_Log_Data to wrap around | inputlookup, just so it is cleaner for the demo data.
Now we use URL Toolbox to parse the domain out of the URL. Parsing out domains is actually wildly complicated (a regex will not suffice!), but URL Toolbox makes it easy. Check out more detail on Splunk Blogs (https://www.splunk.com/blog/2017/09/21/ut-parsing-domains-like-house-slytherin.html).
Finally, we exclude IP addresses from our search using the regex filtering command. This is an optional step, but we've found that the noise-to-value ratio when including IP addresses can be quite high, given that some applications will connect to many ephemeral AWS instance IPs during normal operations.
Here we use the stats command to calculate the earliest and the latest time that we have seen this combination of fields.
Next we calculate the most recent timestamp in our demo dataset.
We end by checking whether the earliest time we've seen this value is within the last day of the end of our demo dataset.
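The steps above can be sketched in SPL roughly as follows. This is a sketch, not the assistant's exact search: the Load_Sample_Log_Data macro argument is a hypothetical placeholder, and the field names (url, ut_domain) assume URL Toolbox's ut_parse_extended macro, which takes the URL field and a "list" field selecting the suffix list to parse against:

```spl
| `Load_Sample_Log_Data(...)`
| eval list="mozilla"
| `ut_parse_extended(url, list)`
| regex ut_domain!="^(\d{1,3}\.){3}\d{1,3}$"
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain
| eventstats max(latest) as maxlatest
| where earliest >= relative_time(maxlatest, "-1d@d")
```

The eventstats line computes the most recent timestamp across the whole demo dataset, so the final where clause can anchor "within the last day" to the end of the sample data rather than to wall-clock time.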

Live Data

First we bring in our proxy dataset, leveraging Common Information Model fields, filtering for just events that actually have a URI.
Now we use URL Toolbox to parse the domain out of the URL. Parsing out domains is actually wildly complicated (a regex will not suffice!), but URL Toolbox makes it easy. Check out more detail on Splunk Blogs (https://www.splunk.com/blog/2017/09/21/ut-parsing-domains-like-house-slytherin.html).
Finally, we exclude IP addresses from our search using the regex filtering command. This is an optional step, but we've found that the noise-to-value ratio when including IP addresses can be quite high, given that some applications will connect to many ephemeral AWS instance IPs during normal operations.
Here we use the stats command to calculate the earliest and the latest time that we have seen this combination of fields.
We end by checking whether the earliest time we've seen this value is within the last day.
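A rough SPL sketch of the live version, assuming CIM-compliant proxy data tagged web (the base filter and field names are assumptions; adjust to your environment):

```spl
tag=web url=*
| eval list="mozilla"
| `ut_parse_extended(url, list)`
| regex ut_domain!="^(\d{1,3}\.){3}\d{1,3}$"
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain
| where earliest >= relative_time(now(), "-1d@d")
```

Here the final where clause compares against now() rather than the end of a sample dataset, which is the only structural difference from the demo version.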

Accelerated Data

First we bring in our proxy dataset, the Common Information Model Web Data Model, grouping by URL.
Next we rename the URL field to make it more usable.
Now we use URL Toolbox to parse the domain out of the URL. Parsing out domains is actually wildly complicated (a regex will not suffice!), but URL Toolbox makes it easy. Check out more detail on Splunk Blogs (https://www.splunk.com/blog/2017/09/21/ut-parsing-domains-like-house-slytherin.html).
Finally, we exclude IP addresses from our search using the regex filtering command. This is an optional step, but we've found that the noise-to-value ratio when including IP addresses can be quite high, given that some applications will connect to many ephemeral AWS instance IPs during normal operations.
Here we use the stats command to calculate the earliest and the latest time that we have seen this combination of fields.
We end by checking whether the earliest time we've seen this value is within the last day.
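A rough sketch of the accelerated version, pulling from the CIM Web data model with tstats (the exact macro and field names are assumptions, as above):

```spl
| tstats summariesonly=true earliest(_time) as earliest latest(_time) as latest from datamodel=Web by Web.url
| rename Web.url as url
| eval list="mozilla"
| `ut_parse_extended(url, list)`
| regex ut_domain!="^(\d{1,3}\.){3}\d{1,3}$"
| stats min(earliest) as earliest, max(latest) as latest by ut_domain
| where earliest >= relative_time(now(), "-1d@d")
```

Because tstats groups by URL before the domain is parsed out, a second stats pass is needed to roll the per-URL first/last times up to the per-domain level.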