In this project we try to identify threats using network analysis. To cope with real-world difficulties, we have broken the project into multiple stages, so that if we cannot finish it, somebody else who gets interested can carry on the work.
In the very first steps we must be able to watch the network for any malware moving around, before any system gets infected.
- parse a pcap file that was sniffed from the network to get all the URLs that passed through it
- it must be capable of filtering the results by source or destination
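The two requirements above can be sketched roughly as follows. This is only a minimal sketch: it assumes the pcap file has already been decoded into TCP segments by whatever capture library we end up using, and `Packet`, `extract_url` and `sniffed_urls` are hypothetical names made up for this example. It only sees plain HTTP requests that fit in a single segment.

```python
from typing import Iterable, Iterator, NamedTuple, Optional, Tuple


class Packet(NamedTuple):
    """Hypothetical, already-decoded TCP segment (not a real library type)."""
    src: str        # source IP address
    dst: str        # destination IP address
    payload: bytes  # raw TCP payload


HTTP_METHODS = (b"GET", b"POST", b"HEAD", b"PUT", b"DELETE", b"OPTIONS", b"PATCH")


def extract_url(payload: bytes) -> Optional[str]:
    """Return the URL of a plain-HTTP request, or None if the payload is not one."""
    head = payload.split(b"\r\n\r\n", 1)[0]
    lines = head.split(b"\r\n")
    parts = lines[0].split(b" ")
    if len(parts) != 3 or parts[0] not in HTTP_METHODS:
        return None
    target = parts[1].decode("latin-1")
    if target.startswith("http://"):  # absolute-form request target (proxies)
        return target
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"host":
            return "http://" + value.strip().decode("latin-1") + target
    return None  # no Host header, so the URL cannot be rebuilt


def sniffed_urls(
    packets: Iterable[Packet],
    src: Optional[str] = None,
    dst: Optional[str] = None,
) -> Iterator[Tuple[str, str, str]]:
    """Yield (src, dst, url) for every HTTP request, optionally filtered."""
    for pkt in packets:
        if src is not None and pkt.src != src:
            continue
        if dst is not None and pkt.dst != dst:
            continue
        url = extract_url(pkt.payload)
        if url is not None:
            yield pkt.src, pkt.dst, url
```

Calling `sniffed_urls(packets, src="10.0.0.5")` would list only the URLs requested by that host. HTTPS traffic is invisible to this approach, since the request line is encrypted.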
We want to generalize the malware detection part eventually, but right now I think Cuckoo Sandbox is sufficient.
- create a workflow to analyse URLs and files using Cuckoo Sandbox
- it must be capable of passing the URLs that were sniffed in Stage Zero to Cuckoo Sandbox
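A rough sketch of how sniffed URLs could be handed to Cuckoo through its REST API. It assumes the API server is reachable at `http://localhost:8090` and accepts a form-encoded `url` field on `/tasks/create/url`; newer Cuckoo versions also expect a bearer token. The address, the token handling and the function names `build_url_task` and `submit_urls` are assumptions for this example, so check them against the Cuckoo version you run.

```python
import urllib.parse
import urllib.request
from typing import Iterable, List

CUCKOO_API = "http://localhost:8090"  # assumed address of the Cuckoo REST API


def build_url_task(url: str, api_base: str = CUCKOO_API, token: str = "") -> urllib.request.Request:
    """Build (but do not send) a POST request asking Cuckoo to analyse a URL."""
    body = urllib.parse.urlencode({"url": url}).encode("ascii")
    req = urllib.request.Request(api_base + "/tasks/create/url", data=body, method="POST")
    if token:  # only needed when the API is configured with a token
        req.add_header("Authorization", "Bearer " + token)
    return req


def submit_urls(urls: Iterable[str], api_base: str = CUCKOO_API, token: str = "") -> List[str]:
    """Send every URL to Cuckoo and return the raw JSON replies as text."""
    replies = []
    for url in urls:
        req = build_url_task(url, api_base, token)
        with urllib.request.urlopen(req, timeout=10) as resp:
            replies.append(resp.read().decode("utf-8"))
    return replies
```

Building the request separately from sending it keeps the workflow testable without a running sandbox.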
Some of the malware currently in the wild is just a mutation of older malware, but it goes undetected because no signature exists for it. Adding blacklisted hosts and community signatures may help us overcome that problem.
- use blacklists and signatures to increase the malware detection rate
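As a small illustration of the blacklist half of this stage, here is a sketch that loads a plain-text host list and checks sniffed hosts against it. The file format (one host per line, `#` for comments) and the names `load_blacklist` and `is_blacklisted` are assumptions for this example; real community feeds come in several formats.

```python
from typing import Iterable, Set


def load_blacklist(lines: Iterable[str]) -> Set[str]:
    """Parse one host per line; anything after '#' is treated as a comment."""
    hosts = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip().lower().rstrip(".")
        if entry:
            hosts.add(entry)
    return hosts


def is_blacklisted(host: str, blacklist: Set[str]) -> bool:
    """True if the host or any of its parent domains is on the blacklist."""
    labels = host.lower().rstrip(".").split(".")
    # Check "cdn.evil.example", then "evil.example", then "example".
    return any(".".join(labels[i:]) in blacklist for i in range(len(labels)))
```

Matching parent domains means that listing `evil.example` also catches `cdn.evil.example`, which helps when a mutated sample only changes the subdomain.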
Some malware uses well-known ports with its own protocol to evade detection. For example, if some host is talking on port 443 but not using the HTTPS protocol, it is a little bit suspicious, don't you agree?
- use protocol analysis to detect unusual activity on known ports
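For the port 443 example, one cheap check is whether the first payload of a connection looks like a TLS record at all. The sketch below only inspects the 5-byte record header (content type, version, length); the function names are made up for this example, and it should only be fed the first data segment of a connection, since later segments may start in the middle of a record.

```python
# TLS record content types: change_cipher_spec, alert, handshake, application_data
TLS_CONTENT_TYPES = {0x14, 0x15, 0x16, 0x17}
MAX_RECORD_LENGTH = 2**14 + 2048  # upper bound for an encrypted TLS record


def looks_like_tls(payload: bytes) -> bool:
    """Check whether the payload starts with a plausible TLS record header."""
    if len(payload) < 5:
        return False
    content_type, major, minor = payload[0], payload[1], payload[2]
    length = int.from_bytes(payload[3:5], "big")
    return (
        content_type in TLS_CONTENT_TYPES
        and major == 3          # SSL 3.0 and every TLS version use major version 3
        and minor <= 4
        and length <= MAX_RECORD_LENGTH
    )


def suspicious_on_443(dst_port: int, payload: bytes) -> bool:
    """Flag non-empty traffic to port 443 that does not look like TLS."""
    return dst_port == 443 and len(payload) > 0 and not looks_like_tls(payload)
```

The same idea extends to other known ports by swapping in a check for the protocol expected there.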
Many infected hosts talk to their botnet C&C servers through HTTP API calls.
- analyse HTTP headers for any unusual HTTP API calls
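What counts as "unusual" is still open, so the following is only a sketch with two example heuristics that we picked for illustration: a request without a `User-Agent` header, and a `Host` header that is a bare IP address instead of a name. Neither is proof of infection, and the function names are hypothetical.

```python
import ipaddress
from typing import Dict, List, Tuple


def parse_headers(request: bytes) -> Tuple[str, Dict[str, str]]:
    """Split a raw HTTP request into its request line and lower-cased headers."""
    head = request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return lines[0], headers


def host_is_ip(host: str) -> bool:
    """True if a Host header value is an IP literal (an optional port is ignored)."""
    if host.startswith("["):        # bracketed IPv6 literal, e.g. [::1]:8080
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:      # name or IPv4 address followed by a port
        host = host.split(":", 1)[0]
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def header_findings(request: bytes) -> List[str]:
    """Return a list of human-readable reasons why a request looks unusual."""
    _, headers = parse_headers(request)
    findings = []
    if "user-agent" not in headers:
        findings.append("missing User-Agent")
    if host_is_ip(headers.get("host", "")):
        findings.append("Host header is a bare IP address")
    return findings
```

An empty list means none of the example heuristics fired; more rules can be appended without changing the callers.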
Many infected hosts contact their botnet C&C servers periodically and/or with similar packets, so in this stage we will introduce ways to detect those patterns and mark them as suspicious traffic.
- Time Based: an infected host asks for a specific (non-whitelisted) DNS name periodically.
- DNS Answer Based: many DNS name requests end up with the same IP address (many APTs try to hide by using different DNS names for their C&C servers).
- TTL Value Based: packets that are transferred between infected hosts and the C&C server have a very low TTL, to be effective in running commands.
- Domain Name Based: another possible method is to check the percentage of meaningful words in the DNS name.
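Three of the ideas above (time based, DNS answer based and domain name based) can be sketched with a few lines each. The thresholds, the word list and all function names here are assumptions chosen for illustration and would need tuning on real traffic.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, List, Sequence, Set, Tuple


def is_periodic(timestamps: Sequence[float], min_events: int = 5, max_jitter: float = 0.1) -> bool:
    """True if the gaps between queries for one name are nearly constant."""
    if len(timestamps) < min_events:
        return False
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    avg = mean(gaps)
    # Coefficient of variation: spread of the gaps relative to their mean.
    return avg > 0 and pstdev(gaps) / avg <= max_jitter


def names_per_ip(answers: Iterable[Tuple[str, str]], min_names: int = 3) -> Dict[str, List[str]]:
    """Map each IP to the DNS names resolving to it, keeping IPs with many names."""
    by_ip = defaultdict(set)
    for name, ip in answers:
        by_ip[ip].add(name.lower().rstrip("."))
    return {ip: sorted(names) for ip, names in by_ip.items() if len(names) >= min_names}


def meaningful_ratio(name: str, words: Set[str]) -> float:
    """Fraction of the first label covered by known words (greedy, longest first)."""
    label = name.lower().split(".")[0]
    if not label:
        return 0.0
    covered, i = 0, 0
    while i < len(label):
        match = max((w for w in words if label.startswith(w, i)), key=len, default="")
        if match:
            covered += len(match)
            i += len(match)
        else:
            i += 1
    return covered / len(label)
```

Shared hosting and CDNs also put many names behind one IP, so `names_per_ip` is expected to need a whitelist before its output is useful.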
Do all the previous steps in real time (not from a saved pcap file).
- another plus in this stage would be to check for any IRC traffic and mark it as suspicious.
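IRC is a line-based text protocol, so a first approximation of the IRC check is to look for well-known IRC commands at the start of a line in the payload, on any port. This is a sketch only: the command list is a small subset that we chose, `looks_like_irc` is a hypothetical name, and other text protocols that happen to use the same words would trigger it too.

```python
# A few common IRC commands (a subset, picked for this example).
IRC_COMMANDS = (b"NICK", b"USER", b"JOIN", b"PRIVMSG", b"PASS", b"NOTICE", b"PING", b"PONG")


def looks_like_irc(payload: bytes) -> bool:
    """True if any line of the payload starts with a known IRC command."""
    for line in payload.split(b"\n"):
        line = line.strip()
        if line.startswith(b":"):
            # Server messages carry an optional ":prefix " before the command.
            line = line.partition(b" ")[2]
        command = line.split(b" ", 1)[0].upper()
        if command in IRC_COMMANDS:
            return True
    return False
```

Because the check does not depend on the port number, it also catches IRC bots that avoid the usual port 6667.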
Use whitelists and machine learning algorithms to lower the false positives.
If we're 100% sure that a network is clean, for example an industrial network that is completely off the grid and to which we have not connected any device, we can train our program to consider all traffic in that stage clean. Then, when we connect the network to the outside world, we can use anomaly detection to increase our zero-day detection rate.
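The train-on-clean idea can be illustrated with the simplest possible model: remember every flow seen while the network is known to be clean, and later flag anything that was never seen. This is a deliberately naive stand-in for real anomaly detection, not a machine learning algorithm, and `Flow` and `BaselineModel` are hypothetical names for this sketch.

```python
from typing import Iterable, NamedTuple, Set


class Flow(NamedTuple):
    """Hypothetical flow summary: who talks to whom, over what."""
    src: str
    dst: str
    proto: str   # e.g. "tcp" or "udp"
    dport: int   # destination port


class BaselineModel:
    """Learns the flows of a clean network and flags everything else."""

    def __init__(self) -> None:
        self.known: Set[Flow] = set()

    def train(self, clean_flows: Iterable[Flow]) -> None:
        """Record every flow observed during the clean phase."""
        self.known.update(clean_flows)

    def is_anomalous(self, flow: Flow) -> bool:
        """A flow is anomalous if it never appeared during training."""
        return flow not in self.known
```

This works best on networks with static communication patterns, which is why the industrial network is a good first target; on an office network the baseline would go stale quickly.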
Use traffic classification to manually analyse the suspicious categories.
Use dynamic analysis and sandboxing to increase the malware detection rate.
And many other ideas that will be added gradually...