Processing Data Example

@AlanOrlikoski edited this page Apr 19, 2018 · 13 revisions

This page provides the information required to

  • Obtain a sample disk image
  • Use CDQR to process the disk image (multiple options)
  • View data in the built-in Kibana Dashboards
  • View data in TimeSketch

This example processes an entire disk image, which takes a good amount of time. A full image was chosen over a CyLR Live Response (LR) collection because it gives the user the most data to explore in Kibana and TimeSketch.

It is not meant to be a speed test or an accurate reflection of how long it takes to process a CyLR LR collection. To measure that, collect data with CyLR, transfer it to the Skadi system, and process it with CDQR using these commands as a guide.

The screenshots in this example come from a VM with 12 cores and 16 GB of RAM. More cores and memory are better, but the minimum requirements of 4 cores and 8 GB of RAM work fine.

Download the Disk Image

This example uses the sample data sets from https://www.cfreds.nist.gov/, specifically the Data Leakage Case. Both DD and E01 images can be used; this example uses the E01 version.

On the Skadi Desktop, open a terminal window (alternatively, open an SSH session to the Skadi Server).

Once the terminal window is open, run the following commands to download the files.
Pro-tip: Use the up arrow to recall the last wget command and change E01 to E02 (and so forth).

cd ~/
mkdir ~/samples
cd ~/samples
wget https://www.cfreds.nist.gov/data_leakage_case/images/pc/cfreds_2015_data_leakage_pc.E01
wget https://www.cfreds.nist.gov/data_leakage_case/images/pc/cfreds_2015_data_leakage_pc.E02
wget https://www.cfreds.nist.gov/data_leakage_case/images/pc/cfreds_2015_data_leakage_pc.E03
wget https://www.cfreds.nist.gov/data_leakage_case/images/pc/cfreds_2015_data_leakage_pc.E04
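
As an alternative to editing the wget command by hand, the four downloads can be sketched as a small loop. This is only a sketch: the URL pattern is copied from the commands above, the function names are made up for this example, and DRY_RUN is a hypothetical switch that prints the URLs instead of downloading them.

```shell
#!/bin/sh
# Base URL and file stem are copied from the wget commands above.
BASE_URL="https://www.cfreds.nist.gov/data_leakage_case/images/pc"
STEM="cfreds_2015_data_leakage_pc"

# Print one download URL per image segment (E01 through E04).
segment_urls() {
    for n in 01 02 03 04; do
        printf '%s/%s.E%s\n' "$BASE_URL" "$STEM" "$n"
    done
}

# Download every segment into ~/samples.
# The subshell keeps the caller's working directory unchanged.
# DRY_RUN=1 (hypothetical switch for this sketch) only prints the URLs.
fetch_segments() {
    (
        mkdir -p "$HOME/samples" && cd "$HOME/samples" || exit 1
        segment_urls | while read -r url; do
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "would fetch $url"
            else
                wget -c "$url"   # -c resumes a partially downloaded file
            fi
        done
    )
}

# List the URLs that would be downloaded; call fetch_segments to download them.
segment_urls
```

Running fetch_segments afterwards performs the actual downloads; the result should be the same four files in ~/samples as with the manual commands.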

Process Data with CDQR

This command parses all the data files (given the E01 file, it knows to process the other files in that image based on the E01 format) with the CDQR Windows options for Plaso parsers. This is the default option in CDQR, but datt, lin, and mac options are available as well and can be chosen with the -p flag.
Note: DATT is short for Do All The Things and uses every parser Plaso has. This is the most thorough option, but it can dramatically slow down processing.

The --max_cpu flag lets CDQR use all the cores available. Without it, CDQR uses three fewer than the total number of CPU cores and defaults to 1 if there are fewer than 4 in total.
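
The default core rule can be sketched as a small shell function. It follows the wording of the paragraph above and is not taken from the CDQR source; the function name cdqr_workers is made up for this example.

```shell
#!/bin/sh
# cdqr_workers TOTAL
# Echo the number of cores that would be used without --max_cpu,
# according to the rule as described on this page (an assumption about
# CDQR's behavior, not its actual code).
cdqr_workers() {
    total=$1
    if [ "$total" -lt 4 ]; then
        echo 1                  # fewer than 4 cores: fall back to 1
    else
        echo $((total - 3))     # otherwise: three fewer than the total
    fi
}

cdqr_workers 12   # the 12-core VM from this example, prints 9
cdqr_workers 4    # the minimum requirement, prints 1
cdqr_workers 2    # prints 1
```

On the 12-core example VM this comes to 9 cores without --max_cpu versus 12 with it. On a Linux system, cdqr_workers "$(nproc)" gives the figure for the current machine.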

Kibana Format

The --es_kb flag tells CDQR to output to ElasticSearch (es) in the Kibana (kb) format. The flag requires a case name, which is cfreds_sample_datt in this example.

cdqr.py --max_cpu cfreds_2015_data_leakage_pc.E01 --es_kb cfreds_sample_datt  

TimeSketch Format

The --es_ts flag tells CDQR to output to ElasticSearch (es) in the TimeSketch (ts) format. The flag requires a case name, which is cfreds_sample_datt in this example. Using CDQR to input the data allows both formats to co-exist in the same ElasticSearch database without overlapping.

DON'T REPROCESS THE SAME ARTIFACTS!!

The --plaso_db flag tells CDQR that the artifacts were processed (in the previous step) and to read the data from the ~/samples/Results/cfreds_2015_data_leakage_pc.plaso file.

The Results_ts argument was added to provide an alternative destination for the results. By default, CDQR places the results in a folder named Results in the directory it was called from. If there is an overlap with existing results, CDQR prompts for the action the user wants to take (follow the prompts).

cdqr.py --plaso_db ~/samples/Results/cfreds_2015_data_leakage_pc.plaso Results_ts --es_ts cfreds_sample_datt  
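
The "don't reprocess" advice can be sketched as a simple check for the .plaso file before choosing a command. This is a minimal sketch: next_cdqr_command is a hypothetical helper, it only prints the command rather than running it, and the two commands it prints are the ones shown on this page.

```shell
#!/bin/sh
# Path of the Plaso database produced by the Kibana step above.
PLASO_DB="$HOME/samples/Results/cfreds_2015_data_leakage_pc.plaso"

# next_cdqr_command PLASO_FILE
# Print (do not run) the CDQR command that fits the current state:
# reuse the existing .plaso file if it is there, otherwise process the image.
next_cdqr_command() {
    plaso_db=$1
    if [ -f "$plaso_db" ]; then
        # Artifacts were already parsed: read them back with --plaso_db.
        echo "cdqr.py --plaso_db $plaso_db Results_ts --es_ts cfreds_sample_datt"
    else
        # No .plaso file yet: process the disk image first.
        echo "cdqr.py --max_cpu cfreds_2015_data_leakage_pc.E01 --es_kb cfreds_sample_datt"
    fi
}

next_cdqr_command "$PLASO_DB"
```

Printing the command first makes it easy to confirm which path will be taken before committing to a long processing run.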

To monitor the progress, open another terminal window and run htop (screen is installed in Skadi and is a good option when using the Skadi Server).

Then open the Kibana page, set the time range to "500 years ago", and look at the results in the multiple dashboards, or add the timelines to a new sketch in TimeSketch.

This is a great way to see sample data with known attributes. I recommend comparing what can be seen in Kibana vs. TimeSketch.