An internal reporting tool that will use information from the database ( contains newspaper articles, as well as the web server log for the site) to discover what kind of articles the site's readers like. The log has a database row for each time a reader loaded a web page.
Using above information, this code will answer following questions:
- What are the most popular three articles of all time?
- Who are the most popular article authors of all time?
- On which days did more than 1% of requests lead to errors?
Python
Vagrant
Virtual Box
PostgreSQL
- Install Virtual Box, Vagrant.
- Clone the VM configuration vm
- Download this repo Or Clone this repo into the
/vagrant
directory. - Launch the VM:
$ vagrant up
- SSH into the VM:
$ vagrant ssh
- In the VM navigate to the
/vagrant
folder:vagrant@vagrant:~$ cd /vagrant
- Download Data.
The file inside is called
newsdata.sql
, put this file into thevagrant
directory, which is shared with your virtual machine. - Load the data into the
news
database already in the VM:vagrant@vagrant:~$ psql -d news -f newsdata.sql
- Run application:
vagrant@vagrant:~$ python wlogs_analysis.py