russellbrooks/aws-data-wrangler

 
 

AWS Data Wrangler

Pandas on AWS


NOTE

We just released a new major version, 1.0, with breaking changes. Please make sure that all your old projects have their dependencies frozen on the desired version (e.g. pip install awswrangler==0.3.2).
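One way to audit old projects for this is to scan each requirements line and confirm that awswrangler is pinned exactly. The helper below is a minimal sketch using only the standard library; the name `pinned_version` is hypothetical and not part of awswrangler, and it assumes simple `name==version` lines (no extras, environment markers, or hashes).

```python
import re

# Matches an exact pin such as "awswrangler==0.3.2" (spaces around "==" allowed).
_PIN = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*==\s*([A-Za-z0-9.*+!_-]+)\s*$")


def pinned_version(requirement_line, package="awswrangler"):
    """Return the exactly pinned version of `package`, or None if the line does not pin it."""
    line = requirement_line.split("#", 1)[0]  # ignore trailing comments
    match = _PIN.match(line)
    if match and match.group(1).lower() == package.lower():
        return match.group(2)
    return None


print(pinned_version("awswrangler==0.3.2"))
print(pinned_version("awswrangler>=0.3"))
```

A line that uses a range specifier such as `>=` returns None, which flags a project that could silently pick up the 1.0 release on its next install.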


Source   Downloads         Page   Installation Command
PyPI     PyPI Downloads    Link   pip install awswrangler
Conda    Conda Downloads   Link   conda install -c conda-forge awswrangler

Quick Start

Install the Wrangler with: pip install awswrangler

import awswrangler as wr
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "value": ["foo", "boo"]})

# Storing data on Data Lake
wr.s3.to_parquet(
    df=df,
    path="s3://bucket/dataset/",
    dataset=True,
    database="my_db",
    table="my_table"
)

# Retrieving the data directly from Amazon S3
df = wr.s3.read_parquet("s3://bucket/dataset/", dataset=True)

# Retrieving the data from Amazon Athena
df = wr.athena.read_sql_query("SELECT * FROM my_table", database="my_db")
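The write and read steps above can be wrapped in a small function so the same flow can be exercised without AWS credentials. This is only a sketch: `round_trip` is a hypothetical helper, not part of awswrangler, and it takes the `wr` module as an argument so that a stand-in object can be substituted in tests. The two calls mirror the Quick Start exactly.

```python
def round_trip(wr, df, path, database, table):
    """Write df as a Parquet dataset registered as database.table, then read it back from S3."""
    # Same keyword arguments as the Quick Start call to wr.s3.to_parquet.
    wr.s3.to_parquet(
        df=df,
        path=path,
        dataset=True,
        database=database,
        table=table,
    )
    # Read the dataset straight from S3, as in the Quick Start.
    return wr.s3.read_parquet(path, dataset=True)
```

With real credentials this would be called as `round_trip(wr, df, "s3://bucket/dataset/", "my_db", "my_table")` after `import awswrangler as wr`; the bucket, database, and table names are the placeholders used above.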
