Amazon Web Services (AWS) SDK for Python S3 and using CloudFront logs

Connect to S3 via Boto3. Boto is the Amazon Web Services (AWS) SDK for Python.

Fetching CloudFront log files. You can be analysed log files by downloading the files to the EC2 instance. This Python script does the following:

Reads bucket name as the first argument.
Saves CloudFront log files to the EC2 instance as the second argument.
Combines the gzipped (gz) log files into a single date file.
Decompresses the file.
Removes comments.
Removes S3 CloudFront logs.

Getting Started

You can install Python version 3+ and Boto3 library. And all things will be done in few minutes.

Installing

If your haven't installed Python with Boto3, you have to install.

Server system on Unix platforms:

$ sudo -i
$ yum -y install python36
$ pip install botocore
$ pip install boto3

Using Boto3

After installing boto3. You can use it to configure your credentials file:

Basic Configuration

You need to set up your AWS security credentials before run as Python script.

aws configure

Alternatively, you can create the credential file yourself. By default, its location is at ~/.aws/credentials:

[default]
aws_access_key_id = <YOUR_AWS_ACCESS_KEY>
aws_secret_access_key = <YOUR_AWS_SECRET_KEY>

How to fetch Python code

You can clone this repository.

$ git clone https://github.com/spakkanen/python-boto3-mv-cloudfront-s3-logs-to-ec2-instances.git
$ cd python-boto3-mv-cloudfront-s3-logs-to-ec2-instances
$ python mv_s3_all_files.py <S3_BUCKET_NAME> <WORK_PATH>

Connect to S3 services

This Python application connects to the AWS S3. Script fetch CloudFront log files.

$ python mv_s3_all_files.py <S3_BUCKET_NAME> <WORK_PATH>

Cron schedule

You should set up automatically run Python scripts on a schedule.

Schedule syntax is normal Unix cron/crontab syntax for examples:

2,12,22,32,42,52 * * * * python /root/python-boto3-mv-cloudfront-s3-logs-to-ec2-instances/mv_s3_all_files.py <S3_BUCKET_NAME> <WORK_PATH> > /dev/null
# Removes all files older than 2 weeks.
02 04 * * * find /root/python-boto3-mv-cloudfront-s3-logs-to-ec2-instances/cloudfront_prod1/ -mtime +14 -type f -exec rm -f {} \;

Read the log files

The log file is a UNIX standard format and that's easy using and read UNIX tools.

Find main page links (/) amount from current date log files.

$ cat logs/$(date +"%Y%m%d").log | grep -P '\t/\t' | wc -l

Find all main page links.

$ cat logs/$(date +"%Y%m%d").log | grep -P '\t/\t' | cut -f 5 | sort | uniq -c | sort -nr | head -n10

Find all 404 error links.

$ cat logs/$(date +"%Y%m%d").log | grep -P '\t404\t' | cut -f 5 | sort | uniq -c | sort -nr | head -n10

Sort most hits.

$ cat logs/$(date +"%Y%m%d").log | cut -f4-5 -d$'\t' | sort | uniq -c | sort -nr | head -n10

Contributing

Contributions are welcome. Open an issue first before sending in a pull request. All pull requests review before they are merged to master.

Contact

Please contact: [email protected] .

Workgroup

Member

Member Name	GitHub Alias	Company	Role
Saku Pakkanen	[@spakkanen] (https://github.com/spakkanen)	[Kliffa Innovations Oy] (https://kliffa.fi/en/)	Committer

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
README.md		README.md
_config.yml		_config.yml
google483ade844fbd4a2e.html		google483ade844fbd4a2e.html
mv_s3_all_files.py		mv_s3_all_files.py
sitemap.xml		sitemap.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Amazon Web Services (AWS) SDK for Python S3 and using CloudFront logs

Getting Started

Installing

Basic Configuration

How to fetch Python code

Connect to S3 services

Cron schedule

Read the log files

Contributing

Contact

Workgroup

Member

About

Releases

Packages

Languages

spakkanen/python-boto3-mv-cloudfront-s3-logs-to-ec2-instances

Folders and files

Latest commit

History

Repository files navigation

Amazon Web Services (AWS) SDK for Python S3 and using CloudFront logs

Getting Started

Installing

Basic Configuration

How to fetch Python code

Connect to S3 services

Cron schedule

Read the log files

Contributing

Contact

Workgroup

Member

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages