healthequity: ignore balance-after when merging
The healthequity source merges newly downloaded transaction data into the
previously saved file. This failed when there were multiple transactions of
the same type in a single day: the order in which HealthEquity reports
transactions is not stable, so the running balances are not stable either, and
an already-saved transaction could reappear with a different balance and be
treated as new. The result was that past transactions would sometimes be
spontaneously duplicated in the list upon a new finance-dl run.

This change causes the merge process to ignore the "Balance After" column.
This also means that the running balance within a day may end up incorrect
if newly available transactions happen to be listed before previously
available ones in the new download. The only ways to prevent this are to
recalculate the balance-after column ourselves after the merge or to throw it
out entirely, neither of which is proposed in this change.
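
To illustrate the failure mode described above, here is a minimal sketch. It
is not the csv_merge implementation: merge_rows is a hypothetical helper, and
the column names other than 'Date' and 'Balance After' are assumptions made
for the example. It only shows why comparing whole rows duplicates
transactions when same-day ordering changes, and why comparing the first three
columns does not.

```python
from collections import Counter

# Assumed column layout; only 'Date' and 'Balance After' are named in the commit.
HEADERS = ['Date', 'Description', 'Amount', 'Balance After']


def merge_rows(existing, new, compare_fields):
    """Hypothetical merge: append rows from `new` not already in `existing`.

    Two rows count as the same transaction when they agree on every field in
    `compare_fields`. Matches are counted, so two genuinely identical
    transactions on the same day are both kept.
    """
    def key(row):
        return tuple(row[field] for field in compare_fields)

    unmatched = Counter(key(row) for row in existing)
    merged = list(existing)
    for row in new:
        k = key(row)
        if unmatched[k] > 0:
            unmatched[k] -= 1  # already saved; consume one match
        else:
            merged.append(row)
    # sorted() is stable, so rows sharing a date keep their relative order.
    return sorted(merged, key=lambda r: r['Date'])


# Two same-day transactions as saved by an earlier run.
saved = [
    {'Date': '2022-09-01', 'Description': 'Contribution',
     'Amount': '50.00', 'Balance After': '150.00'},
    {'Date': '2022-09-01', 'Description': 'Contribution',
     'Amount': '25.00', 'Balance After': '175.00'},
]

# The same two transactions, reported in the opposite order on a later visit,
# so the running balance attached to each one differs.
downloaded = [
    {'Date': '2022-09-01', 'Description': 'Contribution',
     'Amount': '25.00', 'Balance After': '125.00'},
    {'Date': '2022-09-01', 'Description': 'Contribution',
     'Amount': '50.00', 'Balance After': '175.00'},
]

full_compare = merge_rows(saved, downloaded, HEADERS)
subset_compare = merge_rows(saved, downloaded, HEADERS[0:3])

print(len(full_compare))    # both transactions are duplicated
print(len(subset_compare))  # no duplicates
```

With whole-row comparison neither downloaded row matches a saved one, so both
are appended. With the balance column excluded, both match and the saved file
is left unchanged, including its previously recorded running balances.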
jktomer committed Sep 2, 2022
1 parent 496404f commit 6716ce3
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion finance_dl/healthequity.py
@@ -174,7 +174,11 @@ def write_transactions(raw_transactions_data, path):
         rows.append(row_values)
     rows.reverse()
     csv_merge.merge_into_file(filename=path, field_names=output_headers,
-                              data=rows, sort_by=lambda x: x['Date'])
+                              data=rows, sort_by=lambda x: x['Date'],
+                              # Don't consider balance-after in comparing rows,
+                              # because txn order (and therefore running
+                              # balance) is not stable across visits
+                              compare_fields = output_headers[0:3])


 class Scraper(scrape_lib.Scraper):