Monitor Unsuccessful Backups

Description

With good backups, a ransomware attack changes from an unrecoverable loss to a manageable nuisance. This search shows how to find hosts whose backups have failed.


Use Case

Security Monitoring

Category

Operations

Alert Volume

Low

SPL Difficulty

Basic

Journey

Stage 1

Data Sources

Backup

How to Implement

Because there are so many different strategies for backups and data resiliency, this search is provided primarily as an example. To implement this use case, you need to index the backup logs from your endpoints, or from a central server responsible for performing the backups. This example uses NetBackup. Modify the search to match your specific backup solution and strategy.

Known False Positives

None at the moment

How to Respond

When this search fires, you will need to investigate why backups from these hosts have been failing.

Help

Monitor Unsuccessful Backups Help

When you maintain sufficient backups outside your local network, you reduce the impact of a ransomware attack from unrecoverable losses to a manageable nuisance. Organizations must track their backup posture as a part of their overall corporate data availability and resiliency plan. This means knowing that routine backups are taking place and notifying the appropriate personnel when they are not. This use case helps Splunk customers manage and verify that data resiliency processes are running, specifically looking for indications of failed backups.

SPL for Monitor Unsuccessful Backups

Demo Data

First, we load our basic demo data.
Next, we filter for the specific message that NetBackup sends for failed backups.
The bucket command (an alias of bin) groups events by _time, rounding each _time value down to the start of its day.
Finally, stats counts the failed backups per host for each day, so we can see which hosts failed to back up over time.
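
The four steps above can be sketched as a self-contained search. This is only an illustration under stated assumptions, not the search shipped with this use case: it generates eight synthetic events with makeresults instead of loading the demo data set, and the host names (hostA, hostB), the message field, and both message strings are hypothetical placeholders. Substitute the field and failure text that your backup product actually logs.

```
| makeresults count=8
| streamstats count AS n
| eval _time = relative_time(now(), "@d") - (n % 2) * 86400 + n * 3600
| eval host = if(n <= 4, "hostA", "hostB")
| eval message = if(n == 8, "Backup completed successfully.", "An error occurred, failed to backup.")
| search message="An error occurred, failed to backup."
| bin _time span=1d
| stats count AS failed_backups BY _time, host
```

The first five lines only build sample events spread across today and yesterday. The last three lines are the part that carries over to real data: the filter keeps failed backups, bin rounds each timestamp down to its day, and stats returns one row per day and host with the number of failures.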

Live Data

First, we load our NetBackup data and filter for the specific message that NetBackup sends for failed backups.
The bucket command (an alias of bin) groups events by _time, rounding each _time value down to the start of its day.
Finally, stats counts the failed backups per host for each day, so we can see which hosts failed to back up over time.
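
Against live data, the same three steps might look like the sketch below. The index name (backup), the sourcetype (netbackup_logs), and the quoted failure message are assumptions made for illustration and will differ by environment. The search also assumes that the standard host field identifies the machine whose backup failed. If your logs come from a central backup server, group by whichever field holds the client name instead.

```
index=backup sourcetype=netbackup_logs "An error occurred, failed to backup."
| bin _time span=1d
| stats count AS failed_backups BY _time, host
```

Because the result is one row per host per day, an alert built on this search can trigger whenever any rows are returned, or only when failed_backups exceeds a threshold that fits your backup schedule.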