Connection to New Domain

Description

Detects when users browse to domains never before seen in your organization.

Use Case

Advanced Threat Detection

Category

Command and Control, Data Exfiltration, Zero Trust

Security Impact

Savvy threat hunters always know when users browse to new domains. This can be relevant in a variety of scenarios, but the primary one is that when a system connects to a command and control server, or to a staging server hosting malware, those servers usually sit on unusual domains. If you believe that a host is infected, checking whether it hit new domains is a great indicator to review. For more information on this detection in general, check out the blog post on this detection by Splunk's own Andrew Dauria.

Alert Volume

Very High

SPL Difficulty

Medium

Data Availability

Bad

Journey

Stage 2

MITRE ATT&CK Tactics

Exfiltration
Command and Control

MITRE ATT&CK Techniques

Exfiltration Over C2 Channel
Exfiltration Over Alternative Protocol
Application Layer Protocol

MITRE Threat Groups

Lazarus Group
APT3
Kimsuky
Magic Hound
MuddyWater
Rocke
APT32
Stealth Falcon
Gamaredon Group
Frankenstein
Sandworm Team
Dragonfly 2.0
Wizard Spider
Soft Cell
Ke3chang

Kill Chain Phases

Actions On Objectives

Data Sources

Web Proxy

How to Implement

Implementing this search is relatively straightforward, as it expects Common Information Model-compliant data. Just ingest your proxy data (or other web browsing visibility, such as stream:http or Bro), and make sure there is a uri field. The only other step is to install the URL Toolbox app, which allows us to parse out the domains. When scaling this search to greater volumes of data (or more frequent runs), leverage acceleration capabilities.
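Before enabling the search, it can help to confirm that your proxy data is actually populating CIM fields. A quick sanity check might look like the following (a sketch; it assumes your proxy data is mapped to the CIM Web data model):

```spl
| tstats count from datamodel=Web where Web.url=* by sourcetype
```

If your proxy sourcetype appears with a non-zero count, the URL field is mapped and the detection will have data to work with.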

Known False Positives

This search will inherently be very noisy. In most organizations, new domains are a very small percentage of total domains, but sending every one of these events to analysts would still be overwhelming. As a result, there are no known false positives per se, but the value of any given alert is so small that you should treat these alerts differently from most correlation searches. They are mostly appropriate as contextual data, or for correlation with other indicators.

How To Respond

These events are generally best viewed as contextual data for another event, for example uncleaned malware, new services, or unusual logins. The easiest way to accomplish this is to record the events in a summary index, and then include searching that index as part of your investigative actions. Enterprise Security customers can do this easily with the Risk Framework: create a risk indicator adaptive response action when saving this search, and it will adjust the risk score of the assets involved and show up in Investigation Workbench when you analyze an asset. Ultimately, to analyze the efficacy of any given alert here, we recommend looking up the domains in an open source intelligence source such as VirusTotal or ThreatCrowd.
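The summary index approach described above can be sketched like this (the index name new_domain_summary is illustrative, not a standard): schedule the detection with a collect command appended to its end.

```spl
| collect index=new_domain_summary
```

During an investigation, a search such as index=new_domain_summary dest=<host under investigation> then surfaces any new domains that host visited, without re-running the full detection over raw data.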

Help

Connection to New Domain Help

This example leverages the First Seen Assistant. Our example dataset is a collection of anonymized proxy logs in which a user browses to a new website. Our live search looks for the same behavior, as detailed in How to Implement.

SPL for Connection to New Domain

Demo Data

First we bring in our basic demo dataset. In this case, sample proxy logs from a Palo Alto Networks NGFW. We're using a macro called Load_Sample_Log_Data to wrap around | inputlookup, just to keep the demo data loading clean.
Now we use URL Toolbox to parse the domain out of the URL. Parsing out domains is actually wildly complicated (a regex will not suffice!), but URL Toolbox makes it easy. Check out more detail on Splunk Blogs (https://www.splunk.com/blog/2017/09/21/ut-parsing-domains-like-house-slytherin.html).
Finally, we exclude IP addresses from our search using the regex filtering command. This is an optional step, but we've found that including IP addresses can add substantial noise for little value, given that some applications connect to many ephemeral AWS instance IPs during normal operations.
Here we use the stats command to calculate what the earliest and the latest time is that we have seen this combination of fields.
Next we calculate the most recent value in our demo dataset.
We end by seeing if the earliest time we've seen this value is within the last day of the end of our demo dataset.
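Put together, the demo pipeline described above looks roughly like this (a sketch: the Load_Sample_Log_Data argument and the url field name are illustrative, and ut_parse_extended is the URL Toolbox macro):

```spl
| `Load_Sample_Log_Data(Palo_Alto_Proxy)`
| eval list="mozilla"
| `ut_parse_extended(url, list)`
| regex ut_domain!="^(\d{1,3}\.){3}\d{1,3}$"
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain
| eventstats max(latest) as maxlatest
| where earliest >= relative_time(maxlatest, "-1d")
```

Because the demo data is static, we anchor the "last day" window to the newest event in the dataset (maxlatest) rather than to now().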

Live Data

First we bring in our proxy dataset, leveraging Common Information Model fields, filtering for just events that actually have a URI.
Now we use URL Toolbox to parse the domain out of the URL. Parsing out domains is actually wildly complicated (a regex will not suffice!), but URL Toolbox makes it easy. Check out more detail on Splunk Blogs (https://www.splunk.com/blog/2017/09/21/ut-parsing-domains-like-house-slytherin.html).
Finally, we exclude IP addresses from our search using the regex filtering command. This is an optional step, but we've found that including IP addresses can add substantial noise for little value, given that some applications connect to many ephemeral AWS instance IPs during normal operations.
Here we use the stats command to calculate what the earliest and the latest time is that we have seen this combination of fields.
We end by seeing if the earliest time we've seen this value is within the last day.
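A rough equivalent of the live search described above is sketched below (assumptions: your proxy events carry tag=web and a CIM url field; substitute your own index and field names, e.g. uri, as needed):

```spl
index=* tag=web url=*
| eval list="mozilla"
| `ut_parse_extended(url, list)`
| regex ut_domain!="^(\d{1,3}\.){3}\d{1,3}$"
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain
| where earliest >= relative_time(now(), "-1d")
```

Here the window is anchored to now(), so the search surfaces only domains first seen within the last day.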

Accelerated Data

First we bring in our proxy dataset, the Common Information Model Web Data Model, grouping by URL.
Next we rename the URL field to make it more usable.
Now we use URL Toolbox to parse the domain out of the URL. Parsing out domains is actually wildly complicated (a regex will not suffice!), but URL Toolbox makes it easy. Check out more detail on Splunk Blogs (https://www.splunk.com/blog/2017/09/21/ut-parsing-domains-like-house-slytherin.html).
Finally, we exclude IP addresses from our search using the regex filtering command. This is an optional step, but we've found that including IP addresses can add substantial noise for little value, given that some applications connect to many ephemeral AWS instance IPs during normal operations.
Here we use the stats command to calculate what the earliest and the latest time is that we have seen this combination of fields.
We end by seeing if the earliest time we've seen this value is within the last day.
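The accelerated variant described above can be sketched with tstats against the CIM Web data model (a sketch; summariesonly=true assumes the data model is accelerated in your environment):

```spl
| tstats summariesonly=true min(_time) as earliest max(_time) as latest from datamodel=Web by Web.url
| rename Web.url as url
| eval list="mozilla"
| `ut_parse_extended(url, list)`
| regex ut_domain!="^(\d{1,3}\.){3}\d{1,3}$"
| stats min(earliest) as earliest max(latest) as latest by ut_domain
| where earliest >= relative_time(now(), "-1d")
```

Because tstats reads the accelerated summaries rather than raw events, this version scales to much larger data volumes and more frequent scheduled runs.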