Sources Sending a High Volume of DNS Traffic

Description

A common method of data exfiltration is to send out a large volume (in bytes) of DNS or ping requests with the stolen data embedded in the payload. Because this traffic is often not logged, it can go unnoticed.


Use Case

Insider Threat

Category

Data Exfiltration

Security Impact

DNS exfiltration is a sophisticated but increasingly common technique used by malware authors, as well as adversaries inside a network, to exfiltrate data. The technique is gaining popularity because organizations have increased their monitoring of data exfiltration over common protocols while failing to monitor DNS as an exfiltration vector. There are several methods of exfiltrating data via DNS, but one way to monitor the activity is to gauge the total bytes transferred and look for anomalies and deviations from normal traffic levels.

Alert Volume

Low

SPL Difficulty

Medium

Journey

Stage 1

MITRE ATT&CK Tactics

Exfiltration
Command and Control

MITRE ATT&CK Techniques

Exfiltration Over Alternative Protocol
Standard Application Layer Protocol
Exfiltration Over Command and Control Channel

MITRE Threat Groups

APT18
APT19
APT28
APT3
APT32
APT33
APT37
APT38
APT41
BRONZE BUTLER
Cobalt Group
Dark Caracal
Dragonfly 2.0
FIN4
FIN6
FIN7
FIN8
Gamaredon Group
Honeybee
Ke3chang
Kimsuky
Lazarus Group
Machete
Magic Hound
Night Dragon
OilRig
Orangeworm
Rancor
SilverTerrier
Soft Cell
Stealth Falcon
Threat Group-3390
Thrip
Turla
WIRTE

Kill Chain Phases

Command and Control
Actions on Objectives

Data Sources

Network Communication

How to Implement

This search is intended to be easy to implement, since firewall data is usually among the first data sources to be made Common Information Model compliant and its fields are straightforward. To implement it in your environment, first specify the correct index and sourcetype for your firewall data (the search string includes many different options, but it is not best practice to keep them all). Then make sure the standard fields (src_ip, dest_port, bytes_*) are populated as expected. You might also need to filter out particularly high-volume sources (in particular, your DNS servers). With that, you should be good to go. A sketch of the base search follows below.
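For example, a minimal sketch of the base search might look like the lines below. The index name, sourcetype, and excluded addresses are placeholders to adjust for your environment; the essential pieces are the dest_port=53 filter, the hourly bucketing, and the per-source byte count.

    index=firewall sourcetype=your_firewall_sourcetype dest_port=53 src_ip!=10.0.0.53 src_ip!=10.0.1.53
    | bin _time span=1h
    | stats sum(bytes_out) as bytes_out by src_ip, _time

The src_ip!= clauses are where you would exclude your internal DNS servers, which legitimately send large volumes of DNS traffic.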

Known False Positives

False positives for this search should be rare for hosts with static IPs. In one testing environment, a curiously configured free-standing IoT webcam was the only host that triggered the search, and it was easy to filter out.

If you have a small number of DHCP hosts that routinely send a large volume of DNS traffic (first off, why?), you may need to filter out those hosts to reduce noise, as in the sketch below.
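One way to manage that exclusion list is with a lookup. The lookup file name and field below are illustrative, not part of the shipped search:

    index=firewall sourcetype=your_firewall_sourcetype dest_port=53
        NOT [| inputlookup dns_noise_allowlist.csv | fields src_ip]
    | bin _time span=1h
    | stats sum(bytes_out) as bytes_out by src_ip, _time

The subsearch expands into a NOT (src_ip=... OR src_ip=...) filter, so hosts listed in the lookup are dropped before any statistics are calculated.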

How To Respond

When this search returns values, initiate your incident response process and identify the other systems the alerting system is communicating with. Capture the time, applications, destination systems, ports, byte counts, and any other pertinent information; a pivot search along the lines of the sketch below can help gather that context. Contact the owner of the alerting system about this activity. If the systems being communicated with are internal, contact the owner(s) of those systems as well. If the activity is authorized, make a note of that and of who authorized it. If not, additional investigation is warranted to determine whether DNS is being used as a covert channel to exfiltrate data.
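For example, a sketch of a follow-up pivot (again with a placeholder index and field names, and a hypothetical flagged address of 10.1.2.3) might look like:

    index=firewall src_ip=10.1.2.3 earliest=-24h
    | stats earliest(_time) as first_seen latest(_time) as last_seen sum(bytes_out) as bytes_out values(app) as apps values(dest_port) as dest_ports by dest_ip
    | convert ctime(first_seen) ctime(last_seen)
    | sort - bytes_out

This summarizes where the flagged host has been sending data, over which applications and ports, and how much, which is the context you want in hand before contacting the system owner.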

Help

Sources Sending a High Volume of DNS Traffic Help

This example leverages the Simple Search assistant. Our dataset is a collection of firewall logs (with DNS app detection) from the last day or so. In it, we track how much DNS traffic is sent per source IP per hour (you should exclude actual DNS servers from this analysis). We then look to see whether that volume is dramatically higher than the history for that source IP, and also dramatically higher than the organization overall, so that we can to some extent account for DHCP ranges. If this produces too much noise in your environment, you may need to tune it to cover just servers, or build two separate searches for user-subnet and server-subnet ranges so that the organization-wide averages stay separate.

SPL for Sources Sending a High Volume of DNS Traffic

Demo Data

First we pull in our demo dataset, which comes from firewall logs and targets a scenario where a small number of DNS connections carry a large volume of data.
Bucket (aliased to bin) allows us to group events based on _time, effectively flattening the actual _time value down to the hour.
Now we are looking at the number of bytes sent per source IP per hour, over our time range (usually the last day).
Eventstats then allows us to calculate all manner of statistics. This is one of the more complicated stats syntaxes you will see, but it's actually not that bad. The big component here is leveraging stats + eval, where we can embed the flexible logic of eval inside of stats. When calculating our average and standard deviation, we really want to exclude the most recent values (the ones we're concerned about) so that they don't sway the average: imagine you churn along at 1 KB per hour and then in the last hour it's 150 MB; you really want your normal baseline to stay at 1 KB. One other note -- we run two different eventstats, one on a global basis and one on a per-host basis. That lets us identify a host that's always at the top of the charts while also looking across the org as a whole, giving us a good balance between servers with static IPs and DHCP hosts that move around.
Here's where we really start doing the important work. Our lengthy eventstats gave us fields that we can filter on and interpret. (When testing this out, feel free to remove this line and those that follow so you can see the raw fields coming out of eventstats.) Now we filter for hosts that are substantially above the norm.
From the last line, we have narrowed down to just the hosts that are behaving abnormally. Here we use eval to add another field to the results -- not one focused on detection logic, but one that adds context and summarizes some of the math so an analyst can see why we are surfacing this host.
Finally, we drop some of the intermediate fields we don't care much about, again to make things clearer for the analyst. A sketch of the assembled demo search follows below.
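Putting those steps together, a sketch of the assembled search might look like the lines below. The demo index and sourcetype (index=demo sourcetype=demo_firewall), the four-standard-deviation threshold, the one-hour cutoff, and the output field names are illustrative assumptions rather than the exact shipped search.

    index=demo sourcetype=demo_firewall dest_port=53
    | bin _time span=1h
    | stats sum(bytes_out) as bytes_out by src_ip, _time
    | eventstats avg(eval(if(_time < relative_time(now(), "-1h@h"), bytes_out, null()))) as avg_bytes_out
                 stdev(eval(if(_time < relative_time(now(), "-1h@h"), bytes_out, null()))) as stdev_bytes_out
    | eventstats avg(eval(if(_time < relative_time(now(), "-1h@h"), bytes_out, null()))) as avg_bytes_out_per_host
                 stdev(eval(if(_time < relative_time(now(), "-1h@h"), bytes_out, null()))) as stdev_bytes_out_per_host
                 by src_ip
    | where _time >= relative_time(now(), "-1h@h")
        AND bytes_out > avg_bytes_out + 4 * stdev_bytes_out
        AND bytes_out > avg_bytes_out_per_host + 4 * stdev_bytes_out_per_host
    | eval deviations_above_org_norm = round((bytes_out - avg_bytes_out) / stdev_bytes_out, 1)
    | fields - avg_bytes_out_per_host, stdev_bytes_out_per_host

The where clause keeps only the most recent hourly buckets, and only when their byte counts sit well above both the organization-wide baseline and that host's own baseline.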

Live Data

First we pull in our basic dataset, which comes from your production firewall logs and targets the same scenario: a small number of DNS connections carrying a large volume of data. From the hourly bucketing onward, the logic is identical to the demo walkthrough above -- bin and stats build the per-source hourly byte counts, the two eventstats calls build the organization-wide and per-host baselines, the filtering keeps only hosts substantially above the norm, eval adds analyst-friendly context, and the final cleanup drops the intermediate fields. Only the base search differs, as sketched below.
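A sketch of that base search, assuming some common firewall sourcetypes (the specific sourcetypes and the index wildcard are placeholders for whatever carries your firewall traffic):

    index=* (sourcetype=pan:traffic OR sourcetype=cisco:asa OR sourcetype=opsec) dest_port=53
    | bin _time span=1h
    | stats sum(bytes_out) as bytes_out by src_ip, _time

From here the eventstats, where, eval, and fields logic shown in the demo sketch above applies unchanged.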

Accelerated Data

First we pull in our accelerated dataset, which comes from firewall logs and targets the same scenario: a small number of DNS connections carrying a large volume of data. tstats gives us the number of bytes sent per source IP per hour, filtered for dest_port 53, replacing the raw search, bin, and stats steps in a single command. From that point on, the logic is identical to the demo walkthrough above -- the two eventstats calls build the organization-wide and per-host baselines, the filtering keeps only hosts substantially above the norm, eval adds analyst-friendly context, and the final cleanup drops the intermediate fields. A sketch of the tstats portion follows below.
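A sketch of that tstats base, assuming an accelerated CIM Network Traffic data model (the data model and field names are an assumption; adjust them to whatever accelerated model holds your firewall traffic):

    | tstats summariesonly=true sum(All_Traffic.bytes_out) as bytes_out
        from datamodel=Network_Traffic
        where All_Traffic.dest_port=53
        by All_Traffic.src_ip _time span=1h
    | rename All_Traffic.src_ip as src_ip

The rename keeps the field names consistent with the non-accelerated versions, so the eventstats, where, eval, and fields logic from the demo sketch above can be reused as-is.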