SSB Package Statistics Viewer

This project provides a tool to download, process, and interactively explore statistics about public packages using the Libraries.io API. It fetches data about all public packages associated with Statistics Norway (SSB) and presents the results in an interactive table.

If you're just interested in the processed results, visit the GitHub Pages deployment.

Features

- Package Data Fetching: Fetches data about all public packages from Libraries.io.
- Interactive Table: Displays package data in a dynamic, searchable, and sortable table using Tabulator.js.
- CSV Download: Allows users to download the dataset as a CSV file for offline use.
- DuckDB Integration: Lets you query and sample the data in the DuckDB Web Shell for further analysis.

Requirements

To fetch and process data using the Libraries.io API:

Libraries.io API Key:
- You'll need a valid API key from Libraries.io, which you can get by signing up for an account there.
- Add your API key to fetch_data.py, the data-fetching script, before running it.
Python Environment:
- Install the required Python dependencies:
```
pip install pandas requests
```
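
One way to handle the API key step without hardcoding the key in the script is to read it from an environment variable. The sketch below assumes a variable named `LIBRARIES_IO_API_KEY` and a helper called `load_api_key`; both names are hypothetical and are not something fetch_data.py is known to use.

```python
import os


def load_api_key(env=None, var_name="LIBRARIES_IO_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `var_name` is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your Libraries.io API key."
        )
    return key
```

With this approach the key would be exported in the shell before running the script, which keeps it out of version control.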

How to Use

1. Download and Process Data

Run the data-fetching script to download the package data:

```
python fetch_data.py
```

The script will:

- Fetch all public packages associated with Statistics Norway from Libraries.io.
- Save the results as results.csv in the src/ directory.
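
The two steps above could be sketched roughly as follows. This is not the contents of fetch_data.py: it assumes the Libraries.io search endpoint with `q`, `page`, `per_page`, and `api_key` parameters, picks an illustrative set of columns, and uses only the standard library so it runs without extra dependencies. The function names and the `FIELDS` list are hypothetical.

```python
import csv
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://libraries.io/api/search"  # assumed endpoint for this sketch
FIELDS = ["name", "platform", "latest_release_number", "stars"]  # assumed columns


def fetch_page(query, page, api_key, per_page=100):
    """Fetch one page of search results as a list of dicts (network call)."""
    params = urllib.parse.urlencode(
        {"q": query, "page": page, "per_page": per_page, "api_key": api_key}
    )
    with urllib.request.urlopen(f"{SEARCH_URL}?{params}", timeout=30) as resp:
        return json.load(resp)


def fetch_all(query, api_key, get_page=fetch_page, max_pages=50):
    """Collect results page by page until an empty page comes back."""
    records = []
    for page in range(1, max_pages + 1):
        batch = get_page(query, page, api_key)
        if not batch:
            break
        records.extend(batch)
    return records


def write_csv(records, out, fields=FIELDS):
    """Write the selected fields of each record to an open text file."""
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for record in records:
        # Missing fields become empty cells; extra fields are dropped.
        writer.writerow({f: record.get(f, "") for f in fields})
```

Passing `get_page` as a parameter keeps the pagination loop testable without network access. To mirror the steps above, the records from `fetch_all` would be written with `write_csv` to a file opened as `open("src/results.csv", "w", newline="")`; the search term to use is left open here because the source does not state it.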

2. Open the Results Viewer

Open index.html in your browser to view the interactive table.

3. Explore the Results

- Use the "Download CSV" button to save the data for offline use.
- Use the "Open in DuckDB Web Shell" button to query the dataset directly in the DuckDB Web Shell.

Preprocessed Results

If you don't want to fetch and process the data yourself, you can access the processed results directly:

Interactive Viewer on GitHub Pages

DuckDB Query Example

The DuckDB Web Shell button includes a query to:

- Load the dataset into a table called ssb_packages.
- Sample 10 random rows from the table.

The SQL query used:

```sql
-- Load CSV file and create a table
CREATE TABLE ssb_packages AS
SELECT *
FROM read_csv_auto('https://trygu.github.io/ssb-pypi-statistics/results.csv');

-- Sample 10 rows from the table
FROM ssb_packages USING SAMPLE 10;
```
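
For readers who would rather sample a downloaded copy of the CSV locally, a rough Python counterpart to the sampling statement might look like this. It uses only the standard library; `sample_rows` is a hypothetical helper, and unlike DuckDB it reads every row into memory before sampling.

```python
import csv
import random


def sample_rows(csv_file, n=10, seed=None):
    """Return up to `n` randomly chosen rows from an open CSV file as dicts."""
    rows = list(csv.DictReader(csv_file))
    rng = random.Random(seed)  # pass a seed for a repeatable sample
    # Never ask for more rows than the file contains.
    return rng.sample(rows, min(n, len(rows)))
```

It could be called on a file opened with `open("src/results.csv", newline="")`, assuming the CSV was saved to the location described under How to Use.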

Development Notes

- Ensure the API key is correctly configured in the fetch_data.py script before running it.
- The data viewer (index.html) is designed to use a preprocessed results.csv. Modify the DuckDB query URL in the HTML if hosting the dataset elsewhere.
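
When the dataset is hosted elsewhere, the SQL text from the DuckDB Query Example needs the new CSV location. The helper below is a minimal sketch that regenerates that query for any URL; `build_duckdb_query` is a hypothetical name, and how index.html embeds the query is not shown in this README, so placing the result into the page is left to the reader.

```python
def build_duckdb_query(csv_url, table="ssb_packages", sample_size=10):
    """Return the load-and-sample SQL for a dataset hosted at `csv_url`."""
    # A single quote would end the SQL string literal early, so reject it.
    if "'" in csv_url:
        raise ValueError("csv_url must not contain single quotes")
    return (
        f"CREATE TABLE {table} AS\n"
        f"SELECT *\n"
        f"FROM read_csv_auto('{csv_url}');\n"
        f"\n"
        f"FROM {table} USING SAMPLE {int(sample_size)};"
    )
```

Generating the query from one function keeps the table name and URL consistent between the two statements.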

Credits

License

This project is licensed under the MIT License.
