- Synopsis
- Contributions
- Requirements
- Installation
- Plaid
- Plaid2Text
- Arguments Summary
- Options Summary
- Example Uses
- Processing a Transaction
- Configuration Files
- Workflow
- DISCLAIMER
The purpose of this python script is to bring those of us who chose to use
command line accounting
, some of the benefits of more conventional accounting
programs, namely the ability to pull our transaction information from supported
institutions via automated means and format them into our preferred text syntax.
Currently, this program supports Ledger and Beancount syntax exports.
To download transactions, we use Plaid. This program will help you setup your accounts and download transactions from the Plaid API. I have tried to make this as simple as possible to setup.
Also, once downloaded, your transactions will be stored in a Mongo database. It is actually the transactions pulled from the database that we run though our syntax renderers. This is required to help keep track of which transaction we have already processed (as well as have records to reconstruct our files should the need arise).
The main inspiration for the workings of the export part of the script came from the excellent icsv2ledger. I borrowed heavily from Quentin’s excellent script in making this tool, and in some places I have shamelessly stolen his code altogether.
Feedback and contributions are encouraged. I hope that this will pick up some traction in the community, and in the end we can all have a rock-solid program to help us with our accounting.
- Python => 3.5
- PyMango => 0.1.1
- prompt_toolkit => 0.57
- plaid-python-legacy => 1.3.0
- ledger-cli => 3 (if using ledger syntax)
- Beancount => 2.0 (if using beancount syntax)
- MongoDB => 3.2.3
I have only tested this on Linux. I have no desire to run this on Windows, but feel free to give it a shot, it may work. The same goes for Mac, although sometime in the future I may test on OSX.
Note: python dependencies must be installed prior to running the script. All
can be installed from pip
or of course whatever means you wish to use.
This program is setup as a python package, and can be installed using your
preferred tools, but this document will only cover using pip
.
First, clone the git repo:
git clone [email protected]:madhat2r/plaid2text.git
or if preferring HTTPS:
git clone https://github.com/madhat2r/plaid2text.git
Then change into folder:
cd plaid2text
Then use pip
to install:
pip install .
If you plan on making modifications to the program, then you may want to install
it as editable. (this is my preferred way)
pip install -e .
Plaid is an API used in building web-based financial applications. It provides access to our transactions at a number of institutions. To get started with this program, you must sign up for a plaid account. Once you have done that Plaid will issue you some developer keys to use with their API.
The keys we are interested in are:
- client_id
- this is your developer ID
- secret
- this is your authentication token
Once you have obtained your keys then use plaid2text
to create your
configuration file and save your keys into it. You can do this by simply
invoking plaid2text without arguments. plaid2text
will prompt you for your
keys and store them in your config file.
A note about Plaid. Plaid is a paid service, but developers have access to the developer API without paying. The developer API has all the features of the production API. I have been using this for a few months now on my 6 accounts and everything is still working just fine. I did contact them about what the cost would be (and told them my use case), and was informed that a paid version comes to 0.25 USD per account, per month. That is still a heck-of-a-better deal than Quickbooks online in my opinion. I can get my 6 accounts for 1.50 USD per month, but like I mentioned, I have yet run into any caps on my developer account, so that may be all I ever use.
In order to use plaid2text
you must have already followed the instructions in
the Plaid section. Once you have your initial config in place, then let’s get
started in creating your first account.
In order to get transactions from Plaid, you must create an account. In order to
create an account, you must authenticate yourself to your institution via your
username and password, and also most institutions require some form of multi
factor authentication, usually in the form of security questions, or codes sent
to registered phone/email for the account. Using plaid2text
this is easy to
accomplish.
First issue the following command:
plaid2text account_nickname --create-account
As an example let’s say I am setting up an account for my personal checking, and the institution I use is Chase. I might use this command to create it.
plaid2text chase_checking --create-account
This would allow me to setup and account with the nickname chase_checking
. You
will use the nickname for downloading and parsing transactions, so make
it something meaningful for you.
Once you have issued the above command plaid2text
will present you with a list of
institutions from which to choose.
What bank is this account going to be for? 1 Bank of America 2 Chase 3 Wells Fargo 4 Citi 5 US Bank 6 USAA 7 American Express 8 Navy Federal Credit Union 9 SunTrust Enter Number [q to quit]:
In our example we will choose 2
and hit enter. This will take us to the
authorization screen, and will prompt for your credentials and follow up with
the multi factor authentication (if required). If you enter everything correctly
you will have setup your first account in Plaid. The program will then prompt
you for some default config options. Config options are covered in the proceeding
sections.
Note: wait at least 15 minutes before the first download of your transactions, this give Plaid time to collect the information from your institution. Plaid says it will have them within 240 seconds, but I think it’s better to give it time.
Also, different institutions keep your historical data for different lengths of time. You can check the availability here: Data Availability
plaid_account: (mandantory) this is the nickname you assigned when creating account outfile: output filename or stdout in your chosen snytax (ledger,beancount)
Note: the outfile will be overwritten each time this is run so be careful that you do not erase your current journal file, or any other file of importance.
A lot of these options also have an equivalent setting in the config file
(~/.config/plaid2text/config
). Where this happens, the config file settings
will be underscored versions of the command line long options: --mappings-file
would become mappings_file
.
Also, note that when there are both config setting and command line options, the command line options take precedence over config file settings.
--accounts-file FILE file which holds a list of account names (LEDGER ONLY) (default : ~/.config/plaid2text/accounts) --all-transactions pull all transactions even those who have been previously marked as processed (default: False) --clear-screen, -C clear screen for every transaction (default: False) --cleared-character {*,!} character to clear a transaction (default: *) --create-account Create a new Plaid account using the plaid-account argument as the new nickname (Example: chase_savings) --currency STR the currency of amounts (default: USD ) --default-expense STR expense account used as default destination (default: Expenses:Unknown) --download-transactions, -d download transactions into Mongo for given plaid account --from-date STR specify a the starting date for transactions to be pulled; use in conjunction with --to-date to specify rangeDate format: YYYY-MM-DD --headers-file FILE file which contains contents to be written to the top of the output file (default: ~/.config/plaid2text/headers) --journal-file FILE, -j FILE journal file where to read payees/accounts Tip: you can use includes to pull in your other journal files (default journal file: ~/.config/plaid2text/journal) --mapping-file FILE file which holds the mappings (default: ~/.config/plaid2text/mapping) --mongo-db STR The name of the Mongo database (default: plaid2text) --mongo-db-uri STR The URI for your MongoDB in the MongoDB URI format (default: mongodb://localhost:27017) --no-mark-processed, -n Do not mark pulled transactions. When given, the pulled transactions will still be listed as new transactions upon the next run. (default: False) --output-date-format STR date format for output file (default: YYYY/MM/DD) --output-format {beancount,ledger}, -o {beancount,ledger} what format to use for the output file. (default format: beancount) --posting-account STR, -a STR posting account used as source (default: Assets:Bank:Checking) --quiet, -q do not prompt if account can be deduced from mappings (default: False) --tags, -t prompt for transaction tags (default: False) --template-file FILE file which holds the template (default: ~/.config/plaid2text/template) --to-date STR specify the ending date for transactions to be pulled; use in conjunction with --from-date to specify rangeDate format: YYYY-MM-DD -h, --help show this help message and exit
--accounts-file
is a file that you can store predefined account definitions for Ledger in
the form of account Expenses:Unknown
. This file is parsed for the account
names and all lines that do not start with account will be ignored.
This is LEDGER specific setting.
--all-transactions
will pull all transactions regardless if they are marked as already pulled.
By default only transactions that have not been pulled to text are returned.
--clear-screen, -C
clears the screen before every transaction prompt. Default is False
.
--cleared-character {*,!}
is the character mark a transactions as cleared or not. Default is *
--create-account
is used to create a new account. See creating account section above for more.
--currency STR
is the currency used for transactions. Default is USD
.
--default-expense STR
is the default account for which to post transactions to. Default
Expenses:Unknown
--download-transactions, -d
fetches new transactions from Plaid into Mongo for given account.
Use: plaid2text acct_nickname -d
--from-date STR
specify a the starting date for transactions to be pulled.
Use in conjunction with --to-date
to specify range
Date format: YYYY-MM-DD
or YYYY/MM/DD
--headers-file FILE
file which contains contents to be written to the top of the output file. For
example, I store my beancount files as OrgMode files, so I have my headers file
setup to insert instructions at the top for Emacs
, to help ease my editing of
them once they are exported to text. And also I include my main beancount file
which has all my accounts listed, this also allows for easy running of
bean-check
to verify the newly exported file.
;; -*- mode: org; mode: beancount; -*- include "/path/to/somewhere/main.beancount"
Default: ~/.config/plaid2text/headers
--journal-file FILE, -j FILE
journal file where to read payees/accounts. This could be your main ledger file
or your main beancount file.
Tip: you can use includes to pull in your other journal files
Default journal file: ~/.config/plaid2text/journal
--mapping-file FILE
file which holds the mappings for matching transactions to accounts/payees as
well as some default tags, if you want.
You can have a separate mappings file per account.
default: ~/.config/plaid2text/mapping
--mongo-db STR
name of the Mongo database that stores downloaded transactions.
Default: plaid2text
--mongo-db-uri STR
The URI for your MongoDB in the MongoDB URI format
Default: mongodb://localhost:27017
--no-mark-processed, -n
will not mark pulled transactions as pulled. When passed, the pulled transactions will still be listed as new
transactions upon the next run.
Default: False
--output-date-format STR
date format for output file
Default: YYYY/MM/DD
--output-format {beancount,ledger}, -o {beancount,ledger}
what syntax to use for the output file.
Default output format: beancount
--posting-account STR, -a STR
posting account used as source
Default: Assets:Bank:Checking
--quiet, -q
do not prompt if account can be deduced from mappings
Default: False
--tags, -t
causes the program to prompt for transaction tags
Default: False
--template-file FILE
file which holds the text template used in the output file for formatting transactions.
Default: ~/.config/plaid2text/template
--to-date STR
specify the ending date for transactions to be pulled.
use in conjunction with --from-date
to specify range
Date format: YYYY-MM-DD
or YYYY/MM/DD
The following will set up a new account with nickname chase_checking
plaid2text chase_checking --create-account
The following will download all new transactions for the account
chase_checking
.
NOTE: when downloading for the first time, be sure to wait at least 15min after setting up the account. This gives Plaid time to pull your transactions from the institution.
plaid2text chase_checking --downlad-transactions
The following will pull all new transactions for account
chase_checking
and output them to /tmp/onetime.ldg
Ledger syntax
after prompting you for the correct information for every transaction and
marking all pulled transaction in the database as pulled.
plaid2text chase_checking /tmp/onetime.ldg --output-format ledger
The following will pull all transactions starting from the given date for the
chase_checking
account and will not mark them as pulled in the database, and
will output beancount syntax to stdout.
plaid2text chase_checking --all-transactions --from-date 2015/04/15 --no-mark-processed
When you start processing transactions, you will be presented with
several prompts related to the current transaction. These prompts will be to
get the associated account, the payee, and optionally tags. If you have a
mappings file, provided a journal file, or have already processed a few
transactions, then TAB
completion is available at all these prompts.
During your first run, when your mappings file has not yet been established, you will have to manually (via prompts) establish the correct accounts and payees. But once you have done so, your mappings file will have the correct information for transactions in the future, and given that most of us are creatures of habit and make purchases from the same places, you will only occasionally have to account for new entries.
Now let’s walk through a transaction for you can get an idea of what to expect.
Keeping with out sample account chase_checking
, we will pull the latest
transactions, and also prompt for tags (--tags
) and suppress prompting for
known transactions via our mappings (--quiet
), we will also be using
beancount
output syntax (--output-format
).
plaid2text chase_checking /tmp/onetime.bnc --quiet --tags --output-format beancount
When the above command is run, you will be presented with a prompt for the first
non-matched transaction. The first prompt is for the payee. You will notice that
the default answers are in []
, so if you just hit enter, that will be the
value. When looking at the transaction prompt, you will see it starts with a
date followed by the name that Plaid assigns this transaction, in this case
Plaid got it correct, this will not always be the case. The next area shows the
amount of the transaction.
Following the payee prompt is the “Account” prompt. Enter the correct associated account, then hit enter.
Then we are prompted for tags (because we passed --tags
). Tags work a little
differently, you will be prompted over and over for tags until you just hit
enter without typing another value. If you make a mistake in entering your tag,
you may prefix the tag with -
(minus) to remove it. For instance say you
accidentally typed mvoie
and hit enter, when the prompt comes back you see
your mistake in the default area and want to correct it. So now you type
-mvoie
and hit enter, and you will notice that the tag has been removed.
When you hit enter the final time on tags, the program will move on to the next transaction needing your input.
Again, all of these prompts use TAB
completion, and the more information you
give the program, via your config files, the better the completion becomes.
This is an example config file that has an account setup that is nicknamed
chase_checking
. You will notice some settings that are obfuscated with xxx,
these are created when setting up accounts, and are not entered manually.
[DEFAULT] posting_account = Assets:Bank:Checking default_expense = Expenses:Unknown encoding = utf-8 currency = USD mongo_db = plaid2text mongo_db_uri: mongodb://localhost:27017 quiet = False tags = False output_date_format = %%Y/%%m/%%d clear_screen = False cleared_character = * output_format = beancount [PLAID] client_id = xxxxxxxa66710877xxxxxxxx secret = xxxxxxxxx8c9a0cd27xxxxxxxxxxxx [chase_checking] access_token = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxdf0fbb4cc4f5323f7ccd6 account = xxxxxxxxxxxxxPzJ3nAkFxxxxxxxxxxxxxxxx currency = USD posting_account = Assets:Bank:Chase:Checking mapping_file = ~/.config/plaid2text/chase_checking/mapping_bc headers_file = ~/.config/plaid2text/chase_checking/headers_bc accounts_file = ~/somewhere/main.beancount journal_file = ~/somewhere/beancount/main.beancount template_file = ~/.config/plaid2text/chase_checking/template_bc
The template file is what transforms your transactions into the desired text
based accounting syntax. You have access to all the fields that plaid returns to
use in your templates. But be aware that not all fields are returned with every
transaction, and you might have to modify the source to handle this, should you
choose to use them in your template. Below is a list of all fields available.
The A
column indicates if field is always available.
Field | Types | A |
---|---|---|
_account | String | y |
_id | String | y |
amount | Number | y |
name | String | y |
date | Date | y |
meta | Object | y |
meta.location | Object | y |
pending | Boolean | y |
score | Object | y |
score.location | Object | y |
score.name | Number | y |
type | Object | y |
type.primary | String | y |
meta.location.state | String | n |
score.location.state | Number | n |
category | Array | n |
category_id | String | n |
meta.location.city | String | n |
score.location.city | Number | n |
meta.location.coordinates | Object | n |
meta.location.coordinates.lat | Number | n |
meta.location.coordinates.lon | Number | n |
score.location.address | Number | n |
score.location.zip | Number | n |
meta.location.address | String | n |
meta.location.zip | String | n |
meta.location.store_number | String | n |
meta.payment_processor | String | n |
meta.ppd_id | String | n |
_pendingTransaction | String | n |
meta.reference_number | String | n |
meta.payee | String | n |
meta.payment_method | String | n |
In addition to the above fields plaid2text
also provides the following:
Field | type |
---|---|
posting_account | String |
associated_accounts | String |
payee | String |
tags | String |
- _account
- the Plaid account ID
- _id
- the Plaid transaction ID, Also the MongoDB
_id
- name
- the Plaid name for the transaction. (i.e. Best Buy)
- amount
- the amount of debit/credit. This is a signed number.
- date
- the date the transaction occurred
- posting_account
- the account transaction are posted to
- associated_account
- the expense or other account attributed to the transaction
- payee
- the payee for the transaction
- tags
- the given tags for the transaction in a string
- beancount
- format: ‘#tag1 #tag2 #etc’
- ledger
- format: ‘:tag1:tag2:etc:’
NOTE: The tags
field is prefixed with a space, when tags are present, this
allows us to loose the trailing space that would otherwise exist in situations
where there were no tags, and the configured template supports them.
Example of trailing space template:
{transaction_date} {cleared_character} "{payee}" "" {tags}
Using the above template would result in a trailing space when no tags are present.
Example proper template:
{transaction_date} {cleared_character} "{payee}" ""{tags}
This template will add prefix the tags
with a space only if they are present,
otherwise it returns an empty string. This line will not have a trailing space.
{transaction_date} {cleared_character} "{payee}" ""{tags} plaid_name: "{name}" plaid_id: "{_id}" {associated_account:<60} {amount} {currency} {posting_account}
{transaction_date} {cleared_character} {payee}{tags} ; plaid_name: {name} ; _id: {_id} {associated_account:<60} {currency} {amount} {posting_account:<60}
The headers file is used to add some text to the top of the output file. This
can be anything you like. I use mine for adding some header info for Emacs
to
read for it sets the correct mode for me when I edit the file.
I also use the include
directive to pull in my main file, to aide in running bean-check
.
The mappings file is simply a CSV
formatted file, that contains four fields. When
exporting transactions, this file will try to establish the proper accounts and
payees for each transaction based on the fields in the file. It also handles
adding some default tags.
This file is created for you, if you do not have one defined in the settings.
Also, it is appended to every time you are exporting transactions with the new
matches, that way next time you export you will not have to enter the
information again if you use --quiet
.
- text to match against the Plaid
name
field. This can be either plain text or a regex. If the field starts and ends with/
it is assumed to be a regex. Note: all the regexes will be matched case insensitive. - the name you wish to use for the
payee
- the associated expense or other account (i.e.
Expenses:Unknown
) - tags to be used for this transaction. This should be in the form of a string.
For ledger the format would be:
:tag1:tag2:etc:
and for beancount:#tag1 #tag2 #etc
The matching algorithm will always use the latest match when processing entries. So if for example you have a regex setup that matches //best buy// at the top of the mappings file and another that has //buy// later in the file, the last match wins.
Some of the listings will contain ledger formatted tags while other will be beancount, you of course will only have the type that you need, do not mix them.
/Amazon/,"Amazon",Expenses:Unknown:Amazon, #sort-out /PAYPAL INST XFER/,"PayPal",Expenses:Unknown:PayPal, :sort-out: /.*NETFLIX.*/,"Netflix",Expenses:Bills:Subscriptions:Netflix /.*DROPBOX.*/,"Dropbox",Expenses:Bills:Subscriptions:Dropbox /Amazon Video/,"Amazon Video",Expenses:Entertainment:Movies The Doughnut Palace,"The Doughnut Palace",Expenses:Food:FastFood 54th Street,"54th Street",Expenses:Food:Restaurant BJ'S RESTAURANTS,"BJ's Restaurant",Expenses:Food:Restaurant
Also notice the sorting of the entries so that Amazon Video
gets categorized
properly. If it were above the Amazon
entry, it would use the setting from
there instead, as the last entry always wins.
In this section I will just describe my basic workflow to demonstrate how I use
this tool. Going forwards assumes you have already established your plaid setup
as well as at least one account. I will continue to demonstrate with the example
account chase_checking
to keep things consistent.
When I get ready to work on my books, I start by downloading the newest transactions for the account I am working on.
plaid2text chase_checking -d
This will download all the newest transactions from my accounts into the MongoDB.
You can of course setup a cron job to do this nightly, but I find it fits into my workflow just doing it manually.
I export new transaction (all the ones that haven’t previously been pulled), into a temporary file, where I can do some manual checking and editing.
plaid2text chase_checking /tmp/onetime.beancount --quiet
Using the --quiet
switch, the program will only prompt me for information on
the transactions that it cannot deduce based on the mappings file. You can of
course leave that switch off if you want to be able to change the defaults from
the mapping file.
Also, if you want to do a test run, without marking the transactions as pulled
use the --no-mark-pulled
switch.
IMPORTANT I want to stress that the outfile is OVERWRITTEN or created every time this command is run. So be careful. :)
When I am satisfied that all is well with my temp file. I copy the new entries into my actual journal file.
This should be considered *beta* version code. I have released it hoping that it will be of benefit to others in a similar situation as me. This version of the code is really hacked together and in need of serious refactoring, and will most likely contain bugs. I have had this working for myself for a few weeks, and have found it stable and usable. But I do caution you, to use at your own risk.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You can obtain a copy of the license here: GNU General Public License