User Finding Project Code Names from Many Departments
Description
Find users trying to collect and analyze internal projects from across multiple departments by analyzing their search logs on company wiki software.
How to Implement |
---|
Implementing this detection has two key components. First, make sure that you have the right Confluence logs: run the base search, or look for Confluence logs in your environment by searching for something like dositesearch (that's how I found them!), and record the index and sourcetype. If you use a different internal wiki (like SharePoint), you will need to alter the search to pull the search logs for that system. The second piece is harder: to find project code names being searched in your logs, you have to know what those code names are and, for this detection, which department each one belongs to. You will have to reach out to the different departments in your organization to gather this information, but once you have it you can mirror the format of the sample sse_project_codenames lookup. |
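
As a rough illustration of the information that lookup needs to carry, the Python sketch below parses a small CSV that maps each code name to its owning department. The column names codename and department, and the sample rows, are assumptions made for illustration; check the sample sse_project_codenames lookup for the actual layout.

```python
import csv
import io

# Hypothetical lookup contents. The column names and rows are assumptions,
# not the contents of the real sse_project_codenames lookup.
SAMPLE_LOOKUP = """codename,department
bluebird,Engineering
redwood,Finance
nightfall,Legal
"""


def load_codenames(csv_text):
    """Return a dict mapping lower-cased code name -> department."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["codename"].strip().lower(): row["department"].strip()
        for row in reader
    }


codename_to_department = load_codenames(SAMPLE_LOOKUP)
print(codename_to_department)
```

Whatever the real column names turn out to be, the detection only needs those two pieces per row: the code name to match in search strings, and the department it belongs to.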
Known False Positives |
---|
Because this detection uses a static threshold for the number of different departments, you will need to adjust that threshold to suit your organization. These types of events are inherently fairly bursty (someone catches up on their email, someone switches into a project management role, etc.), so it is difficult to use ML to model them but relatively easy to understand them given business context. In isolation, this alert is often benign for exactly those reasons. |
How To Respond |
---|
Because these activities can be benign (see Known False Positives), look for other indications of suspicious behavior with this user, or validate with their management or HR that the behavior is expected. |
Help |
---|
This example leverages the Detect Spikes (standard deviation) search assistant. Our dataset is an anonymized collection of logs from Confluence (internal wiki software), centered around a few users over two months. |
SPL for User Finding Project Code Names from Many Departments
Demo Data
| First we bring in our basic demo dataset, in this case anonymized Confluence logs. We're using a macro called Load_Sample_Log_Data to wrap around | inputlookup, just so the search is cleaner for the demo data. |
| Next we filter for just search history in Confluence. |
| While you wouldn't have to do this with live data, for our sample data we're going to extract the queryString field explicitly. |
| Next we use eval's urldecode function to convert plus signs to spaces and to decode any other URL encoding that might exist. |
| Now that everything looks clean, we're going to use a regex to extract project code names from the search string. Normally with rex, you would include the regular expression here as a quoted string (much less scary). We're going to make this more complicated by using a subsearch, but the benefit is that you don't have to enter the data twice. We'll explain the subsearch in the next line. |
| The goal of this line is to return a single string with the list of all the project code names in a field extraction, like "(? |
| Now we use the lookup command so that we can understand what department every project codename belongs to, pulled from the CSV file. |
| For simplicity, we want to group events together based on the day (you might look at this based on the hour, you might use the transaction command to give you a rolling window -- there are lots of approaches). |
| Now we use stats to compute the distinct count of department (the number of unique departments) whose code names each user searched, by day. |
| Finally we filter for users who searched code names from five or more different departments in a day. |
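
To make the logic of these steps concrete outside of SPL, here is a minimal Python sketch of the same flow: URL-decode the query, extract code names with a regex built from the lookup, map them to departments, group by user and day, and keep users at or above the threshold. It is not the search itself. The code names, departments, users, and events are all invented for illustration, and unlike a default rex extraction it collects every code name in a query rather than only the first.

```python
import re
from collections import defaultdict
from datetime import datetime
from urllib.parse import unquote_plus

# Hypothetical code name -> department mapping (stands in for the lookup).
CODENAMES = {
    "bluebird": "Engineering",
    "redwood": "Finance",
    "nightfall": "Legal",
    "ironclad": "Sales",
    "sundial": "Marketing",
}

# Build one alternation regex from the mapping, so the code names are
# only entered once (the same motivation as the subsearch).
PROJECT_RE = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in CODENAMES) + r")\b",
    re.IGNORECASE,
)


def departments_per_user_day(events):
    """events: iterable of (iso_timestamp, user, raw_query_string)."""
    seen = defaultdict(set)  # (user, day) -> departments searched
    for timestamp, user, raw_query in events:
        query = unquote_plus(raw_query)  # '+' -> space, %xx -> character
        day = datetime.fromisoformat(timestamp).date()
        for match in PROJECT_RE.finditer(query):
            seen[(user, day)].add(CODENAMES[match.group(1).lower()])
    return seen


def flag_users(events, threshold=5):
    """Return {(user, day): count} where count >= threshold."""
    return {
        key: len(depts)
        for key, depts in departments_per_user_day(events).items()
        if len(depts) >= threshold
    }


# Invented sample events: alice touches five departments in one day.
events = [
    ("2024-03-01T09:00:00", "alice", "bluebird+roadmap"),
    ("2024-03-01T09:05:00", "alice", "redwood%20budget"),
    ("2024-03-01T09:10:00", "alice", "nightfall"),
    ("2024-03-01T09:15:00", "alice", "ironclad+pricing"),
    ("2024-03-01T09:20:00", "alice", "Sundial+launch"),
    ("2024-03-01T10:00:00", "bob", "bluebird+status"),
]
flagged = flag_users(events)
print(flagged)
```

The threshold is a parameter here for the same reason it needs tuning in the search: the right number of departments depends on your organization.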
Live Data
| First we bring in our dataset, filtered for just the search history in Confluence. |
| Next we use eval's urldecode function to convert plus signs to spaces and to decode any other URL encoding that might exist. |
| Now that everything looks clean, we're going to use a regex to extract project code names from the search string. Normally with rex, you would include the regular expression here as a quoted string (much less scary). We're going to make this more complicated by using a subsearch, but the benefit is that you don't have to enter the data twice. We'll explain the subsearch in the next line. |
| The goal of this line is to return a single string with the list of all the project code names in a field extraction, like "(? |
| Now we use the lookup command so that we can understand what department every project codename belongs to, pulled from the CSV file. |
| For simplicity, we want to group events together based on the day (you might look at this based on the hour, you might use the transaction command to give you a rolling window -- there are lots of approaches). |
| Now we use stats to compute the distinct count of department (the number of unique departments) whose code names each user searched, by day. |
| Finally we filter for users who searched code names from five or more different departments in a day. |
Screenshot of Demo Data
