This is a project I worked on in 2024 to find arbitrage on Polymarket.com, specifically in the markets tied to the presidential election. My original writeup can be found here: Arbitrage in Polymarket.com.
The repository includes Python scripts and utilities to interact with the Polymarket Central Limit Order Book (CLOB) API on Polygon Mainnet (chain ID 137). These scripts enable:
- Generating/fetching API keys
- Fetching real-time or historical market data (prices, volumes, order books)
- Searching and analyzing markets
- Calculating arbitrage opportunities
- Scraping leaderboards
- Managing user trades (fetching, storing, analyzing)
- Automated or semi-automated trading logic
Important: All sensitive data (private key, secrets, passphrases) must be stored in environment variables (keys.env or .env). Never commit private keys to source control.
- Installation and Dependencies
- Environment Setup
- Project Structure
- Script Descriptions
- Usage Examples
- Outdated / Duplicate Scripts
- Recommendations and Best Practices
- Disclaimer
- License
Python 3.9+ recommended.
python -m venv venv
source venv/bin/activate # Linux/Mac
# or
venv\Scripts\activate # Windows
pip install -r requirements.txt
The Selenium-based scraping scripts drive Chrome, so you may need to download a ChromeDriver that matches your installed Chrome version.
Store your sensitive data:
PK=<YOUR_PRIVATE_KEY>
API_KEY=<YOUR_API_KEY>
SECRET=<YOUR_API_SECRET>
PASSPHRASE=<YOUR_API_PASSPHRASE>
POLYGONSCAN_API_KEY=<YOUR_POLYGONSCAN_KEY>
Always add .env and keys.env to your .gitignore.
source keys.env # Or use dotenv
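If you would rather load the file from inside Python than source it in the shell, the parsing step is small. The sketch below is a stdlib-only stand-in for what a package such as python-dotenv does. The names parse_env_text and load_env_file are hypothetical helpers, not functions from this repository, and the parser deliberately ignores inline comments and export prefixes.

```python
import os
from pathlib import Path


def parse_env_text(text):
    """Return a dict of KEY=VALUE pairs, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one layer of matching single or double quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_env_file(path="keys.env", override=False):
    """Copy parsed values into os.environ; existing variables win unless override=True."""
    values = parse_env_text(Path(path).read_text())
    for key, value in values.items():
        if override or key not in os.environ:
            os.environ[key] = value
    return values
```

Calling load_env_file() once at the top of a script would make PK, API_KEY, SECRET, and PASSPHRASE available through os.getenv without exporting them in the shell.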
.
├── data/
│ ├── markets_data.csv
│ ├── market_lookup.json
│ ├── user_trades/
│ ├── historical/
│ ├── book_data/
│ ├── polymarket_trades/
│ ├── ...
├── old/
│ └── condition_id_question_mapping.json
├── plots/
│ └── # HTML/visual outputs from Plotly
├── strategies/
│ └── # Arbitrage strategies or multi-trade definitions
├── create_markets_data_csv.py
├── derive_api_key.py
├── generate_api_key.py
├── generate_market_lookup_json.py
├── generate_markets_data_csv.py
├── get_all_historical_data.py
├── get_api_key.py
├── get_leaderboard_wallet_ids.py
├── get_live_price.py
├── get_market_book_and_live_arb.py
├── get_order_book.py
├── get_polygon_data.py
├── get_polygon_latest_trade_price.py
├── get_presidential_state_odds.py
├── get_trade_slugs_to_parquet.py
├── get_user_profile.py
├── get_user_trade_prices.py
├── goldsky.py
├── live_trade.py
├── plot_arb.py
├── rcp_poller.py
├── strategies.py
├── keys.env
├── .env
└── README.md
Below is a brief overview of each script:
create_markets_data_csv.py
- Purpose: Fetch all Polymarket CLOB markets and store them as CSV (./data/markets_data.csv).
- Notes: A simpler, older approach. Potentially replaced by generate_markets_data_csv.py.
derive_api_key.py
- Purpose: Demonstrates deriving an API key from your private key (PK).
- Notes: Example usage of client.derive_api_key() from py_clob_client.
generate_api_key.py
- Purpose: Similar to derive_api_key.py, but uses client.create_or_derive_api_creds().
- Notes: Outputs API_KEY, SECRET, and PASSPHRASE, which you must store securely.
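As a rough illustration of the credential flow described above, the sketch below reads PK from the environment and calls create_or_derive_api_creds() on a py_clob_client ClobClient. The host URL, the helper names derive_api_creds and to_env_lines, and the lazy import are assumptions made for this example; the repository's own script may be organized differently.

```python
import os

# Assumed CLOB endpoint; verify against the current Polymarket documentation.
CLOB_HOST = "https://clob.polymarket.com"
POLYGON_CHAIN_ID = 137  # Polygon Mainnet, as stated in this README


def derive_api_creds(private_key=None):
    """Return API credentials as a dict keyed like the entries in keys.env."""
    private_key = private_key or os.getenv("PK")
    if not private_key:
        raise RuntimeError("Set PK in keys.env or .env before deriving API credentials")
    # Imported lazily so the missing-key check works without the package installed.
    from py_clob_client.client import ClobClient

    client = ClobClient(CLOB_HOST, key=private_key, chain_id=POLYGON_CHAIN_ID)
    creds = client.create_or_derive_api_creds()
    return {
        "API_KEY": creds.api_key,
        "SECRET": creds.api_secret,
        "PASSPHRASE": creds.api_passphrase,
    }


def to_env_lines(creds):
    """Format a credentials dict as KEY=VALUE lines ready to paste into keys.env."""
    return "\n".join(f"{key}={value}" for key, value in creds.items())
```

Printing to_env_lines(derive_api_creds()) would give three lines to paste into keys.env. Avoid writing them to any file that is tracked by git.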
generate_market_lookup_json.py
- Purpose: Reads markets_data.csv and generates market_lookup.json, mapping condition_id → details (description, market_slug, tokens).
- Notes: Helps with quick lookups of slug, outcome, and token_id.
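To make the shape of that mapping concrete, here is a minimal sketch of building the lookup from rows that have already been read from the CSV. It assumes each row carries condition_id, description, market_slug, and a tokens field that is either a list or a JSON string of objects with token_id and outcome keys. The real column layout in markets_data.csv may differ, and build_market_lookup and find_token_id are hypothetical names.

```python
import json


def build_market_lookup(rows):
    """Map condition_id -> {description, market_slug, tokens} for quick lookups."""
    lookup = {}
    for row in rows:
        tokens = row.get("tokens") or []
        if isinstance(tokens, str):
            # Assumption: the CSV stores the token list as a JSON string.
            tokens = json.loads(tokens)
        lookup[row["condition_id"]] = {
            "description": row.get("description", ""),
            "market_slug": row.get("market_slug", ""),
            "tokens": [
                {"token_id": t["token_id"], "outcome": t["outcome"]} for t in tokens
            ],
        }
    return lookup


def find_token_id(lookup, market_slug, outcome):
    """Return the token_id for a slug/outcome pair, or None if there is no match."""
    for details in lookup.values():
        if details["market_slug"] != market_slug:
            continue
        for token in details["tokens"]:
            if token["outcome"].lower() == outcome.lower():
                return token["token_id"]
    return None
```

Rows could come from csv.DictReader, and json.dump(lookup, fh, indent=2) would write the result to market_lookup.json.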
generate_markets_data_csv.py
- Purpose: A more robust, updated market-fetching script that includes additional logic (pagination, keyword search).
- Notes: Potentially supersedes create_markets_data_csv.py.
get_all_historical_data.py
- Purpose: Uses markets_data.csv to pull minute-level historical prices for each market token. Saves to Parquet/CSV in ./data/historical.
- Notes: Useful for time-series analysis or backtesting.
get_api_key.py
- Purpose: Fetches existing API keys linked to your Polymarket account using client.get_api_keys().
- Notes: Credentials must already be set in .env or keys.env.
get_leaderboard_wallet_ids.py
- Purpose: Scrapes Polymarket’s leaderboard page (volume/profit) for top addresses using Selenium + BeautifulSoup.
- Notes: Uses ChromeDriver in headless mode. Outputs JSON to stdout by default.
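Once the page HTML has been rendered, pulling wallet addresses out of it is the simple part. The sketch below swaps the BeautifulSoup step for a plain regular expression over the HTML string, so it is not the script's actual parsing logic. extract_wallet_ids is a hypothetical helper, and it assumes addresses appear in the markup as 0x followed by 40 hex characters.

```python
import json
import re

# 0x + exactly 40 hex characters; the lookahead rejects longer hex strings
# such as 64-character transaction hashes.
ADDRESS_RE = re.compile(r"0x[a-fA-F0-9]{40}(?![a-fA-F0-9])")


def extract_wallet_ids(html):
    """Return unique wallet addresses (lowercased) in order of first appearance."""
    seen = {}
    for address in ADDRESS_RE.findall(html):
        seen.setdefault(address.lower(), None)
    return list(seen)


def to_json(addresses):
    """Serialize the address list the way a stdout-oriented script might print it."""
    return json.dumps(addresses, indent=2)
```

With Selenium, the input string would typically be driver.page_source after the leaderboard has finished loading.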
get_live_price.py
- Purpose: Fetches the last-trade price of a token ID from the Polymarket CLOB, with caching to minimize repeated calls.
- Notes: Called by scripts that need up-to-date prices.
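The caching idea can be illustrated with a small time-to-live cache wrapped around whatever function performs the network call. This is a sketch under assumptions, not the script's implementation: LivePriceCache is a hypothetical class, the 30-second default is arbitrary, and fetch_price stands in for the real CLOB request.

```python
import time


class LivePriceCache:
    """Cache prices per token ID and refetch only after ttl_seconds have passed."""

    def __init__(self, fetch_price, ttl_seconds=30.0, clock=time.monotonic):
        self._fetch_price = fetch_price  # callable: token_id -> price
        self._ttl = ttl_seconds
        self._clock = clock              # injectable so tests can control time
        self._entries = {}               # token_id -> (fetched_at, price)

    def get(self, token_id):
        now = self._clock()
        cached = self._entries.get(token_id)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        price = self._fetch_price(token_id)
        self._entries[token_id] = (now, price)
        return price
```

A short TTL keeps prices fresh enough for monitoring while preventing a loop over many positions from issuing one request per row.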
get_market_book_and_live_arb.py
- Purpose: Integrates user trade data, real-time order books, and live_price to calculate arbitrage. Renders an HTML summary with jinja2.
- Notes: Can run continuously, updating every X minutes. Useful for live monitoring.
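For intuition about the kind of check involved, consider the simplest case: a binary market whose Yes and No tokens together redeem for $1. If the best asks on both sides sum to less than $1, buying one of each locks in the difference. The sketch below shows only that textbook case and ignores fees, gas, and depth beyond the top level. It is not the script's own calculation, and complementary_arb is a hypothetical helper that takes order-book asks as (price, size) tuples.

```python
def best_ask(asks):
    """Return the lowest-priced (price, size) level, or None for an empty book side."""
    return min(asks, key=lambda level: level[0]) if asks else None


def complementary_arb(yes_asks, no_asks):
    """Compare the cost of buying Yes + No at the best asks against the $1 payout."""
    yes = best_ask(yes_asks)
    no = best_ask(no_asks)
    if yes is None or no is None:
        return None
    cost = yes[0] + no[0]          # price paid for one Yes plus one No
    edge = 1.0 - cost              # guaranteed payout minus cost, per pair
    return {
        "cost": cost,
        "edge": edge,
        "edge_pct": edge / cost * 100.0,
        "max_size": min(yes[1], no[1]),  # pairs available at the top of book
        "is_arb": edge > 0,
    }
```

For example, a Yes ask of 0.46 and a No ask of 0.50 cost 0.96 per pair, leaving an edge of about 0.04 before costs.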
get_order_book.py
- Purpose: Fetches order books for relevant trades (from strategies.py) and stores them as CSV in ./data/book_data.
- Notes: Typically invoked before get_market_book_and_live_arb.py or other arbitrage scripts.
get_polygon_data.py
- Purpose: Comprehensive script to fetch user trade data (ERC-20 and ERC-1155 from Polygonscan), enrich it with market info, merge, calculate P/L, produce plots, etc.
- Notes: Calls sub-processes like get_live_price.py.
- Output: Writes enriched transaction data to CSV and/or Parquet in ./data/user_trades/.
get_polygon_latest_trade_price.py
- Purpose: Similar to get_polygon_data.py but focuses on retrieving the latest transaction prices for specific markets/trades.
- Notes: Good for “just-in-time” price checks. Merges data with user profiles and updates latest_blockchain_prices.json.
get_presidential_state_odds.py
- Purpose: Specialized for fetching presidential election data by U.S. state (e.g., “Will Republicans or Democrats win?”).
- Notes: Builds state_condition_map from search results and writes odds to state_odds.csv.
get_trade_slugs_to_parquet.py
- Purpose: Given a token ID, market slug, and outcome, fetch a minute-level time series from Polymarket /prices-history.
- Notes: Saves to Parquet and, optionally, CSV for easy repeated analysis (./data/historical).
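A request to /prices-history can be sketched as a URL builder plus a parser for the response. The host, the query parameter names (market, startTs, endTs, fidelity), and the response shape (a history list of objects with t and p fields) are assumptions drawn from the public CLOB API as I understand it and should be checked against the current documentation. build_prices_history_url and parse_history are hypothetical helpers.

```python
from urllib.parse import urlencode

# Assumed endpoint; only the /prices-history path is named in this README.
PRICES_HISTORY_URL = "https://clob.polymarket.com/prices-history"


def build_prices_history_url(token_id, start_ts, end_ts, fidelity_minutes=1):
    """Build the request URL for a token's price history between two Unix timestamps."""
    query = urlencode({
        "market": token_id,            # assumed: the token ID goes in `market`
        "startTs": int(start_ts),
        "endTs": int(end_ts),
        "fidelity": int(fidelity_minutes),  # assumed: resolution in minutes
    })
    return f"{PRICES_HISTORY_URL}?{query}"


def parse_history(payload):
    """Turn an assumed {'history': [{'t': ..., 'p': ...}]} payload into sorted (ts, price) tuples."""
    points = [(int(point["t"]), float(point["p"])) for point in payload.get("history", [])]
    return sorted(points)
```

The resulting tuples load directly into a DataFrame with pandas.DataFrame(points, columns=["timestamp", "price"]) before being written to Parquet.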
get_user_profile.py
- Purpose: Uses Selenium to scrape Polymarket user profile pages for user details (username, P/L, volume, etc.).
- Notes: Invoked by other scripts to attach user metadata to addresses.
get_user_trade_prices.py
- Purpose: Orchestrates user trades and merges them with strategies.
- Notes: Calls get_polygon_data.py as a subprocess, loads strategies from strategies.py, and generates an HTML summary.
goldsky.py
- Purpose: Demonstrates how to download a Parquet file from S3 (GoldSky data) and load it into a DataFrame.
- Notes: The example includes AWS credentials inline; store them securely (for example, in environment variables) instead.
live_trade.py
- Purpose: Example of how to place live trades (limit or market) via py_clob_client.
- Notes: Includes logic for checking balances/allowances, adjusting the limit price, etc. Use caution in production.
plot_arb.py
- Purpose: Aggregates, merges, and plots various trades from strategies.py.
- Notes: Uses get_trade_slugs_to_parquet.py behind the scenes to fetch data, then produces Plotly HTML. Helps visualize multi-leg positions and arbitrage %.
rcp_poller.py
- Purpose: Example script that polls a RealClearPolitics page for poll data and sends Twilio WhatsApp alerts if changes occur.
- Notes: Demonstrates web scraping + push notifications. Not strictly a Polymarket script but relevant for external signals.
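The poll-and-alert pattern reduces to comparing a fingerprint of the page between runs. In the sketch below, fetch_page and notify are injected callables standing in for the HTTP request and the Twilio message, so nothing here touches either service. content_fingerprint and poll_once are hypothetical names, and hashing the whole page text is an assumption; the actual script may compare parsed poll values instead.

```python
import hashlib


def content_fingerprint(text):
    """Hash the text after collapsing whitespace so layout-only changes are ignored."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def poll_once(fetch_page, notify, last_fingerprint):
    """Fetch the page, call notify() if it changed, and return the new fingerprint."""
    fingerprint = content_fingerprint(fetch_page())
    if last_fingerprint is not None and fingerprint != last_fingerprint:
        notify("Poll page changed")
    return fingerprint
```

A driver loop would call poll_once repeatedly with time.sleep between iterations, feeding each returned fingerprint into the next call.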
strategies.py
- Purpose: Defines a Python list of dictionaries describing multi-leg trades or “basket trades.” Each dictionary includes:
  - trade_name, subtitle, side_a_trades, and side_b_trades (for the “balanced” method)
  - or positions for “all_no.”
- Notes: Scripts like get_market_book_and_live_arb.py and plot_arb.py reference these entries to systematically fetch and analyze data.
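To show what such entries might look like, the sketch below defines two illustrative strategies and a worst-case return calculation for an all-No basket. Only the key names trade_name, subtitle, side_a_trades, side_b_trades, and positions come from this README. The method key, the (slug, outcome) tuple shape, the slugs, and both helper functions are assumptions. The return formula assumes the markets are mutually exclusive, so at most one No position can lose and at least n - 1 pay out $1 each; fees are ignored.

```python
EXAMPLE_STRATEGIES = [
    {
        "trade_name": "Example balanced trade",
        "subtitle": "Two hypothetical legs that offset each other",
        "method": "balanced",  # assumed key name
        "side_a_trades": [("example-slug-a", "Yes")],
        "side_b_trades": [("example-slug-b", "No")],
    },
    {
        "trade_name": "Example all-No basket",
        "subtitle": "No on every outcome of one hypothetical race",
        "method": "all_no",  # assumed key name
        "positions": [
            ("example-slug-1", "No"),
            ("example-slug-2", "No"),
            ("example-slug-3", "No"),
        ],
    },
]

REQUIRED_KEYS = {
    "balanced": {"side_a_trades", "side_b_trades"},
    "all_no": {"positions"},
}


def missing_keys(entry):
    """List the keys an entry still needs for its method, in sorted order."""
    required = {"trade_name", "subtitle"} | REQUIRED_KEYS[entry["method"]]
    return sorted(required - set(entry))


def all_no_arb_pct(no_prices):
    """Worst-case return (%) from buying one No share in each mutually exclusive market."""
    if len(no_prices) < 2:
        raise ValueError("an all-No basket needs at least two markets")
    cost = sum(no_prices)
    worst_case_payout = len(no_prices) - 1  # at most one No position loses
    return (worst_case_payout - cost) / cost * 100.0
```

With No prices of 0.60, 0.70, and 0.65, the basket costs 1.95 and pays at least 2.00, a worst-case return of roughly 2.6% before costs.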
python generate_api_key.py
# or
python derive_api_key.py
Place the outputs in keys.env.
python generate_markets_data_csv.py
# or the older version
python create_markets_data_csv.py
python generate_market_lookup_json.py
python get_all_historical_data.py
python get_polygon_data.py --wallets 0xYourAddressHere
This produces an enriched CSV/Parquet in ./data/user_trades/.
python get_order_book.py
python get_market_book_and_live_arb.py
python get_trade_slugs_to_parquet.py <token_id> <market_slug> <outcome>
# Then
python plot_arb.py