files aren't downloaded #182

lhunt23 · 2017-02-05T16:55:16Z

Hello,

I'm running Maltrieve on Ubuntu 16.04. I installed the dependencies as described in the installation instructions. When I run 'python maltrieve.py', the script doesn't download any files. Please see the output below and let me know if you have any suggestions.

python maltrieve.py -d /home/acme/malware/020517
Processing source URLs
Completed source processing
/usr/local/lib/python2.7/dist-packages/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 514 of the file maltrieve.py. To get rid of this warning, change code that looks like this:

BeautifulSoup([your markup])

to this:

BeautifulSoup([your markup], "lxml")

markup_type=markup_type))
Downloading samples, check log for details
Completed downloads

tail maltrieve.log
2017-02-05 11:49:24 140020353632000 Starting new HTTP connection (1): malc0de.com
2017-02-05 11:49:29 140020353632000 http://www.malwaredomainlist.com:80 "GET /hostslist/mdl.xml HTTP/1.1" 200 4938
2017-02-05 11:49:29 140020353632000 http://malc0de.com:80 "GET /rss/ HTTP/1.1" 200 None
2017-02-05 11:49:30 140020353632000 http://malwareurls.joxeankoret.com:80 "GET /normal.txt HTTP/1.1" 200 11192
2017-02-05 11:49:30 140020353632000 http://support.clean-mx.de:80 "GET /clean-mx/rss?scope=viruses&limit=0%2C64 HTTP/1.1" 200 918
2017-02-05 11:49:30 140020353632000 http://vxvault.net:80 "GET /URL_List.php HTTP/1.1" 200 None
2017-02-05 11:49:30 140020353632000 https://zeustracker.abuse.ch:443 "GET /monitor.php?urlfeed=binaries HTTP/1.1" 200 3869
2017-02-05 11:49:32 140020353632000 http://urlquery.net:80 "GET / HTTP/1.1" 200 4766
2017-02-05 11:49:33 140020353632000 Dumping past URLs to urls.json
2017-02-05 11:49:33 140020353632000 Dumping hashes to hashes.json

clayball · 2017-02-28T14:53:30Z

I can confirm.. nothing is being downloaded.

lhunt23 · 2017-02-28T14:57:11Z

Thanks for the follow up. If you have any suggestions as to how to get the script working again, please let me know. I've found this script to be extremely useful and appreciate you making it available.

hi-T0day · 2017-03-04T11:52:11Z

Add sudo before 'python maltrieve.py' or change python to python3
Good Luck!

hi-T0day · 2017-03-04T12:30:47Z

Sorry, I gave you a wrong answer just now, but I have it now.
You can change this in "maltrieve.py":
def process_urlquery(response):
    soup = BeautifulSoup(response)
    urls = set()
    for t in soup.find_all("table", class_="test"):
        for a in t.find_all("a"):
            urls.add('http://' + re.sub('&amp;', '&', a.text))
    return urls

to:

def process_urlquery(response):
    soup = BeautifulSoup(response, "html.parser")
    urls = set()
    for t in soup.find_all("table", class_="test"):
        for a in t.find_all("a"):
            urls.add('http://' + re.sub('&amp;', '&', a.text))
    return urls
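
Whichever parser is used, the function only returns URLs if the fetched page still contains a table with class "test" that has links inside it. A minimal stand-in for checking that selector logic on a saved copy of the page is sketched below. It deliberately swaps BeautifulSoup for the standard-library html.parser so it runs without extra packages; the class name TestTableLinks, the helper extract_urls, and the sample HTML are all hypothetical and not part of maltrieve.

```python
from html.parser import HTMLParser


class TestTableLinks(HTMLParser):
    """Collect the text of <a> elements nested inside <table class="test">."""

    def __init__(self):
        # convert_charrefs=True means "&amp;" arrives in handle_data as "&".
        super().__init__(convert_charrefs=True)
        self.table_depth = 0    # > 0 while inside a matching table
        self.in_anchor = False
        self.urls = set()
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            classes = (dict(attrs).get("class") or "").split()
            if self.table_depth or "test" in classes:
                self.table_depth += 1
        elif tag == "a" and self.table_depth:
            self.in_anchor = True
            self._text = []

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth:
            self.table_depth -= 1
        elif tag == "a" and self.in_anchor:
            self.in_anchor = False
            text = "".join(self._text).strip()
            if text:
                self.urls.add("http://" + text)

    def handle_data(self, data):
        if self.in_anchor:
            self._text.append(data)


def extract_urls(html):
    """Hypothetical helper: return the set of URLs found in the page text."""
    parser = TestTableLinks()
    parser.feed(html)
    parser.close()
    return parser.urls


# Made-up sample markup, not a real urlquery.net response.
SAMPLE = """
<table class="test">
  <tr><td><a href="/report/1">example.test/a.exe?x=1&amp;y=2</a></td></tr>
</table>
<table class="other"><tr><td><a href="#">ignored.test</a></td></tr></table>
"""

if __name__ == "__main__":
    print(extract_urls(SAMPLE))
    print(extract_urls("<html><body>no matching table</body></html>"))
```

If the second call's result (an empty set) is what a real page produces, changing the parser argument alone would not be expected to make a difference.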

lhunt23 · 2017-03-04T14:08:58Z

Hello,

Thanks for your response. I made the suggested changes and the script still isn’t downloading files. Please let me know if you have any additional suggestions. Thanks.

def process_urlquery(response):
    soup = BeautifulSoup(response, "html.parser")
    urls = set()
    for t in soup.find_all("table", class_="test"):
        for a in t.find_all("a"):
            urls.add('http://' + re.sub('&amp;', '&', a.text))
    return urls

root@ubuntu:~/scripts/maltrieve-master# python maltrieve.py
Processing source URLs
Completed source processing
Downloading samples, check log for details
Completed downloads

panw-ren · 2017-03-07T23:56:10Z

Having the same issue.

attrs==15.2.0
BeautifulSoup==3.2.1
beautifulsoup4==4.5.3
bs4==0.0.1
chardet==2.3.0
configobj==5.0.6
cryptography==1.2.3
ecdsa==0.13
enum34==1.1.2
feedparser==5.2.1
gevent==1.2.1
greenlet==0.4.12
idna==2.0
ipaddress==1.0.16
Landscape-Client==16.3+bzr834
ndg-httpsclient==0.4.0
PAM==0.4.2
paramiko==1.16.0
pyasn1==0.1.9
pyasn1-modules==0.0.7
pycrypto==2.6.1
pyOpenSSL==0.15.1
pyserial==3.0.1
python-apt==1.1.0b1
python-debian==0.1.27
python-magic==0.4.12
requests==2.13.0
scapy==2.3.3
service-identity==16.0.0
six==1.10.0
Twisted==16.0.0
zope.interface==4.1.3

user1@ubuntu-template:~/maltrieve/maltrieve-0.7/files$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.1 LTS
Release: 16.04
Codename: xenial

user1@ubuntu-template:~/maltrieve/maltrieve-0.7$ sudo ./maltrieve.py
Processing source URLs
Completed source processing
Downloading samples, check log for details
Completed downloads

user1@ubuntu-template:~/maltrieve/maltrieve-0.7$ more urls.json
[]

user1@ubuntu-template:/maltrieve/maltrieve-0.7$ sudo ./maltrieve.py -d ./files/
Processing source URLs
Completed source processing
Downloading samples, check log for details
Completed downloads
user1@ubuntu-template:/maltrieve/maltrieve-0.7$ cd files/
user1@ubuntu-template:/maltrieve/maltrieve-0.7/files$ ls
user1@ubuntu-template:/maltrieve/maltrieve-0.7/files$

hi-T0day · 2017-03-08T08:19:49Z

I'm using another branch: https://github.com/HarryR/maltrieve. It works now. I believe you can succeed with it too.

lhunt23 · 2017-03-08T13:28:37Z

Hello,

Please see below and let me know if you have any suggestions.

python maltrieve.py -d /home/lhunt/malware/030817/
Traceback (most recent call last):
  File "maltrieve.py", line 580, in <module>
    main()
  File "maltrieve.py", line 520, in main
    cfg = config(args, 'maltrieve.cfg')
  File "maltrieve.py", line 131, in __init__
    self.cuckoo_dist = self.configp.get('Maltrieve', 'cuckoo_dist')
  File "/usr/lib/python2.7/ConfigParser.py", line 623, in get
    return self._interpolate(section, option, value, d)
  File "/usr/lib/python2.7/ConfigParser.py", line 669, in _interpolate
    option, section, rawval, e.args[0])
ConfigParser.InterpolationMissingOptionError: Bad value substitution:
    section: [Maltrieve]
    option : cuckoo_dist
    key    : dist_port_9003_tcp_addr
    rawval : http://%(DIST_PORT_9003_TCP_ADDR)s:9003

sudo python maltrieve.py
Traceback (most recent call last):
  File "maltrieve.py", line 580, in <module>
    main()
  File "maltrieve.py", line 520, in main
    cfg = config(args, 'maltrieve.cfg')
  File "maltrieve.py", line 131, in __init__
    self.cuckoo_dist = self.configp.get('Maltrieve', 'cuckoo_dist')
  File "/usr/lib/python2.7/ConfigParser.py", line 623, in get
    return self._interpolate(section, option, value, d)
  File "/usr/lib/python2.7/ConfigParser.py", line 669, in _interpolate
    option, section, rawval, e.args[0])
ConfigParser.InterpolationMissingOptionError: Bad value substitution:
    section: [Maltrieve]
    option : cuckoo_dist
    key    : dist_port_9003_tcp_addr
    rawval : http://%(DIST_PORT_9003_TCP_ADDR)s:9003

python3 maltrieve.py
  File "maltrieve.py", line 125
    self.priority = args.priority
    ^
TabError: inconsistent use of tabs and spaces in indentation
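
The traceback shows ConfigParser trying to expand "%(DIST_PORT_9003_TCP_ADDR)s" in the cuckoo_dist option while no value with that name is defined. A minimal sketch of that behaviour follows. It assumes Python 3's configparser module rather than the Python 2.7 ConfigParser in the traceback, uses a made-up two-option config rather than the real maltrieve.cfg, and takes 127.0.0.1 as a placeholder address only.

```python
import configparser

# Made-up config text; only the cuckoo_dist value is taken from the traceback.
CFG_TEXT = """
[Maltrieve]
dumpdir = archive
cuckoo_dist = http://%(DIST_PORT_9003_TCP_ADDR)s:9003
"""

parser = configparser.ConfigParser()
parser.read_string(CFG_TEXT)

# 1. A plain get() tries to expand %(...)s and fails: the key is undefined.
try:
    parser.get("Maltrieve", "cuckoo_dist")
    failed = False
except configparser.InterpolationMissingOptionError:
    failed = True

# 2. raw=True skips interpolation and returns the value untouched.
raw_value = parser.get("Maltrieve", "cuckoo_dist", raw=True)

# 3. Supplying the missing key through vars lets the expansion succeed.
#    The address is a placeholder, not a recommended setting.
expanded = parser.get(
    "Maltrieve",
    "cuckoo_dist",
    vars={"DIST_PORT_9003_TCP_ADDR": "127.0.0.1"},
)

if __name__ == "__main__":
    print("plain get() failed:", failed)
    print("raw value:", raw_value)
    print("expanded value:", expanded)
```

Option names are lowercased by default, which is why the error reports the key as dist_port_9003_tcp_addr. Commenting out the cuckoo_dist line in the config file is another way to avoid the expansion, provided the script does not require that option.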

hi-T0day · 2017-03-13T10:16:48Z

If you add "#" at the start of lines 8 and 9 in the file "maltrieve.cfg", does maltrieve work?

rkalugdan · 2017-03-13T21:53:17Z

upon review, they were already commented out.

[Maltrieve]
dumpdir = archive
logfile = maltrieve.log
logheaders = true
User-Agent = Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)

#viper = http://127.0.0.1:8080
#cuckoo = http://127.0.0.1:8090
#vxcage = http://127.0.0.1:8080
#crits = https://127.0.0.1
#crits_user = maltrieve
#crits_key = <api_key>
#crits_source = maltrieve

# Filter Lists are based on mime type NO SPACE BETWEEN ,

#black_list = text/html,text/plain
#white_list = application/pdf,application/x-dosexec

panw-ren · 2017-03-13T23:16:23Z

utilized the other branch as mentioned by hi-T0day but still no luck

user1@ubuntu-template:/maltrieve-0.7$ sudo ./maltrieve.py -d /home/user1/malware
Processing source URLs
Completed source processing
Downloading samples, check log for details
Completed downloads
user1@ubuntu-template:/maltrieve-0.7$ cd /home/user1/malware/
user1@ubuntu-template:/malware$ ls
user1@ubuntu-template:/malware$

The hashes.json and urls.json files are empty.

2017-03-13 16:13:32 140601425241856 Loaded urls from urls.json
2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): support.clean-mx.de
2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): www.malwaredomainlist.com
2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): vxvault.siri-urz.net
2017-03-13 16:13:32 140601425241856 Starting new HTTPS connection (1): zeustracker.abuse.ch
2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): urlquery.net
2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): malwareurls.joxeankoret.com
2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): malc0de.com
2017-03-13 16:13:32 140601425241856 http://www.malwaredomainlist.com:80 "GET /hostslist/mdl.xml HTTP/1.1" 200 5735
2017-03-13 16:13:33 140601425241856 https://zeustracker.abuse.ch:443 "GET /monitor.php?urlfeed=binaries HTTP/1.1" 200 3882
2017-03-13 16:13:33 140601425241856 http://malwareurls.joxeankoret.com:80 "GET /normal.txt HTTP/1.1" 200 11192
2017-03-13 16:13:33 140601425241856 http://malc0de.com:80 "GET /rss/ HTTP/1.1" 200 None
2017-03-13 16:13:33 140601425241856 http://urlquery.net:80 "GET / HTTP/1.1" 200 4703
2017-03-13 16:13:34 140601425241856 http://support.clean-mx.de:80 "GET /clean-mx/rss?scope=viruses&limit=0%2C64 HTTP/1.1" 200 918
2017-03-13 16:13:34 140601425241856 Dumping past URLs to urls.json
2017-03-13 16:13:34 140601425241856 Dumping hashes to hashes.json

user1@ubuntu-template:/maltrieve-0.7$ more hashes.json
[]
user1@ubuntu-template:/maltrieve-0.7$ more urls.json
[]
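
In the log excerpt above, every request line is a feed fetch that returned 200, and the dump messages follow with no other requests in between. A rough sketch for pulling that out of a longer maltrieve.log is shown below. The line pattern is inferred from the excerpts in this thread, the helper name split_requests is hypothetical, and treating any non-feed host as a sample download attempt is an assumption.

```python
import re
from urllib.parse import urlsplit

# Feed hosts copied from the log excerpts in this thread.
FEED_HOSTS = {
    "malc0de.com",
    "www.malwaredomainlist.com",
    "malwareurls.joxeankoret.com",
    "support.clean-mx.de",
    "vxvault.net",
    "vxvault.siri-urz.net",
    "zeustracker.abuse.ch",
    "urlquery.net",
}

# date, time, thread id, origin, quoted request line, status, size
REQUEST_LINE = re.compile(
    r'^\S+ \S+ \d+ (?P<origin>https?://\S+) '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) \S+$'
)


def split_requests(log_lines):
    """Split completed requests into (feed_requests, other_requests)."""
    feeds, others = [], []
    for line in log_lines:
        match = REQUEST_LINE.match(line.strip())
        if not match:
            continue  # connection notices, dump messages, and so on
        host = urlsplit(match["origin"]).hostname
        entry = (host, match["path"], int(match["status"]))
        # Assumption: anything not requested from a feed host is a sample URL.
        (feeds if host in FEED_HOSTS else others).append(entry)
    return feeds, others


# A few lines taken from the excerpt above.
SAMPLE = [
    '2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): malc0de.com',
    '2017-03-13 16:13:32 140601425241856 http://www.malwaredomainlist.com:80 "GET /hostslist/mdl.xml HTTP/1.1" 200 5735',
    '2017-03-13 16:13:33 140601425241856 http://malc0de.com:80 "GET /rss/ HTTP/1.1" 200 None',
    '2017-03-13 16:13:34 140601425241856 http://support.clean-mx.de:80 "GET /clean-mx/rss?scope=viruses&limit=0%2C64 HTTP/1.1" 200 918',
    '2017-03-13 16:13:34 140601425241856 Dumping past URLs to urls.json',
]

if __name__ == "__main__":
    feeds, others = split_requests(SAMPLE)
    print("feed requests:", len(feeds))
    print("other requests:", len(others))
```

To run it against a real log, pass an open file object instead of SAMPLE. An empty second list would match the empty urls.json and hashes.json shown above.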

lhunt23 · 2017-03-14T00:32:11Z

panw-ren · 2017-03-14T14:09:49Z

are people still able to get help on issues w/ maltrieve?

rkalugdan · 2017-03-22T20:42:11Z

bump

getChester · 2017-03-25T05:17:20Z

confirming that nothing is being downloaded.

futex · 2017-06-21T11:36:18Z

Same problem, nothing is downloaded.

atefsaleh · 2017-09-20T18:26:15Z

I realize these questions are 2 years old, but I have the same issue. Did anybody come up with a solution or find the cause?
Thanks
