Identifying Cybersecurity Threats

Using machine learning to analyze network communications, predict whether they are malicious, and, if so, determine the type of attack.

View on Career Karma

Note

After finding a database of simulated network communication events, I wanted to tackle two tasks: first, predict whether an event was malicious, and second, predict the type of attack for the events that were.
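As a rough illustration of how the two tasks relate, the sketch below derives both prediction targets from the same set of events. It is not the project's actual code: the field names `bytes` and `attack_type`, the attack names, and the helper `build_targets` are all hypothetical and chosen only for the example.

```python
from collections import Counter

# Hypothetical records: the field names and attack names below are
# illustrative assumptions, not the dataset's actual schema.
events = [
    {"bytes": 120, "attack_type": None},
    {"bytes": 9000, "attack_type": "DoS"},
    {"bytes": 300, "attack_type": "Reconnaissance"},
    {"bytes": 80, "attack_type": None},
]


def build_targets(rows):
    """Derive the targets for both tasks from one list of events."""
    # Task 1: binary target over every event (1 = malicious, 0 = benign).
    binary = [1 if row["attack_type"] is not None else 0 for row in rows]
    # Task 2: attack-type target, defined only for the malicious events.
    malicious_rows = [row for row in rows if row["attack_type"] is not None]
    attack_types = [row["attack_type"] for row in malicious_rows]
    return binary, malicious_rows, attack_types


binary_target, malicious_events, attack_target = build_targets(events)

# Counting each class is a quick check for imbalance before training.
class_counts = Counter(binary_target)
```

Under these assumptions, a binary classifier would be trained on all events against `binary_target`, and a separate multiclass classifier on `malicious_events` against `attack_target`.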

The original database contained over 16 million rows, making this my first experience with big data. It was interesting to learn how to handle data at this scale with tools such as Spark, among other strategies.

The size of the data was only one of many hurdles, however. I've learned quite a lot throughout this project, and although it's far from finished, I hope you can enjoy my findings.

Presentation

https://docs.google.com/presentation/d/1b0OuQY0-tJ83sMYF_ZWu12m241WTaMj0SrKTKRZirHQ/edit?usp=sharing

Colab Notebook (Full Project)

https://colab.research.google.com/drive/1efSE4MkkXXv2oF35OeOzXSyM4IbKvO6A?usp=sharing

Databricks Notebook (Spark Demo)

https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/5838070033619084/1491280581433062/6937281827665532/latest.html