Running balance available under transaction detail #553

Open
jimbasiq opened this issue Nov 2, 2022 · 26 comments
Labels
Banking Banking domain APIs

Comments

@jimbasiq

jimbasiq commented Nov 2, 2022

Description

The Balance for an account is available via https://consumerdatastandardsaustralia.github.io/standards/#get-account-balance

This change request is for a running balance attribute to be added to the transaction object.

This is needed for example:

  1. If a data recipient wants to see a point in time balance
  2. If a data recipient wants to see an average balance over a defined period
  3. If a data recipient wants to predict a balance flow over time

Area Affected

Data returned by a GET /banking/accounts/{accountId}/transactions request response.

Change Proposed

The change would be the inclusion of a runningBalance of type BankingBalance on the ResponseBankingTransactionList.

@perlboy

perlboy commented Nov 2, 2022

This isn't actually data that is stored in many source systems, particularly in the location that stores individual transactions. Instead it is often computed on presentation, typically from a daily checkpoint of current balance, which in the context of list transactions is problematic because the endpoint is not bounded to a day. This request therefore seems to be asking for what is, in essence, a materialised view to be created and would effectively force holders to implement a data transform/store layer they may not be able to afford.

Further notes:

  • Is there a reason an ADR can't reverse engineer this running value from current balance and the transactions themselves?
  • Isn't the (1) scenario what get balance is for?
  • Isn't (2) solvable recipient side?
  • (3) doesn't seem solvable with the proposal
  • What happens if a transaction is reversed/altered, now every record after is altered
  • This breaks completely once we switch industry context

This proposal seems to be adding derived data for specific use cases (ie. it's degeneralising the data) that seem like they are better solved and more appropriate within the Recipient space.

@jimbasiq
Author

The assumption here is all Banks are capable of doing this as they currently do it on their online banking and their mobile apps. I would be surprised if this is being done in the front end systems.

To clarify the use cases:

  1. If a data recipient wants to see a point in time balance without having to call the get balance api multiple times or once that point in time has passed
  2. If a data recipient wants to calculate an average balance over a defined period
  3. If a data recipient wants to predict a balance flow over time e.g. what is the expected (average) balance on the 25th of each month

@ghost

ghost commented Nov 23, 2022

Hey @jimbasiq! In our case it is true that the running balances you see on statements or in Internet banking are calculated by the CBS as required. Others may be different, but that's how ours works :-).

What is the granularity required for your use-cases? Would the ability to get end-of-day balances for specified dates be sufficient? (E.g. by specifying date windows when calling the get balance endpoint). Again, this data is not stored simply like this and would need to be calculated/generated, but may be more efficiently processed. Maybe... ;-)

@rob-hale

You beat me to it @mattp-rab - nicely done 👍
I get why you'd assume what you did though @jimbasiq - it seems completely logical. However, Internet Banking and mobile bank apps only need to display a page of transactions at a time so calculating the running balance on the fly in the FE is not such a big overhead. It also gracefully accommodates reversed or retro-fixed transactions with ease. Probably worth bearing in mind too that the data source for CDR data will not necessarily be the same source as that used to serve mobile and IB apps. Banks have had to get a bit creative for CDR - some use an ODS to improve performance, so even if it were available for a mobile app, it might not be a trivial exercise to include a new data element for CDR. A final little nugget is that there are actually two balances for many accounts - the current balance and the available balance - I'm sure that will surface at some point too if it hasn't already...
I think I get your use case though - for example, finding the best (most reliable) day of the month to make a payment based on a profile of the average balance of an account - it's a useful one to improve payment success / confidence.
Would love to know if other banking DHs actually store the balance against each transaction... anyone?

@jimbasiq
Author

Thanks both. It pains me to see a situation where a job is being done in multiple places (with possible differing results) when it could be done once at source. Just because this has been the pattern to date, should we persist it ¯_(ツ)_/¯

We do have some challenges if we were to calculate on our side.
e.g. Some use cases require historical balances, for example "what was the balance prior to the salary payment 1 year ago".
We can Get Transactions For Account with a date range to find the salary transaction.
However, to find the balance at that time we will need to get the current balance then make multiple calls to Get Transactions For Account to get all the transactions from now to 1 year ago.
Rather than making 1 service call we are making multiple, pushing up already strained TPS limits.

The other way to solve this problem is to put a date/time on the Get Balances For Specific Accounts.

@ghost

ghost commented Nov 24, 2022

@jimbasiq consistency and source of truth are really good reasons why this should be done by the DH if it is to be done anywhere. I like it.

The other way to solve this problem is to put a date/time on the Get Balances For Specific Accounts.

👆 This was my question above - would this approach meet your use-case requirements??? It is less granular and therefore potentially easier/more efficient to calculate/generate.

@rob-hale

This is making sense @jimbasiq and @mattp-rab...
I was originally wondering if the ADR could just collect account balances each day and store them, knowing the timestamp of that action, but that creates problems because of data minimisation and also if it's a new authorisation, there is no history of what the balance was a year ago until we've been doing this for a year. So I think your suggestion of being able to request a balance for a specific historical datetime could be a useful way of meeting this need. Not sure how hard it will be for all DHs to implement, but it feels easier than stamping every transaction with a balance and all the complexity that may bring. Let's face it, if this data is needed to support a popular use case, the market will use whatever means are currently necessary and as you point out, today that means more TPS which isn't ideal.

@WestpacOpenBanking

Westpac does not support this change as the raw data required for the calculation of running balance is already available to ADRs via existing APIs.

Westpac requests that DSB considers the economic benefit vs costs to the ecosystem when receiving requests for additional data to be provided from Data Holders. Data that can be derived may benefit specific use-cases, but can also be calculated or solutioned by ADRs developing those use cases. Expanding mandatory data fields to be provided by Data Holders for data easily derived has high costs of implementation across the ecosystem.

@NationalAustraliaBank

We echo Westpac's point of view on this item and thus do not support this change.

Further, we also support Westpac's opinion on weighing up economic benefits vs cost to the ecosystem when new CRs are being raised.

@DougFromPayPal

PayPal is not supportive of this change and is aligned with the concerns posted by Westpac and National Australia Bank. In addition, PayPal is a Purchased Payment/Stored Value Facility, that uses a customer’s digital wallet to facilitate payment transactions via online and brick/mortar business entities. Although PayPal account holders have the ability to ‘store value’ in their digital wallet account, the vast majority do not and rely on linked funding instruments (credit card, debit card, bank account) to fund each payment transaction. Even with a stored balance available, a PayPal account holder can choose to fund a transaction in full with a linked account, in full using their PayPal Digital Wallet balance, or partially fund it using both the available balance and a linked account. A PayPal account balance is unlike a typical bank account and the concept of a running balance is not applicable or relevant in our business model. Keeping and/or calculating a running balance, on demand, is not a function currently supported in PayPal. PayPal agrees with other comments that this warrants further investigation in weighing up economic benefits vs cost to the ecosystem.

@markskript

I can appreciate the issues brought up by the ADHs here - but I also want to call out that it's essentially impossible for the ADRs to generate a running balance due to the inability to track payments across the PENDING and POSTED barrier, and also the fact that a few large ADHs seem to randomize the transaction ID every time a transaction is updated, leading to duplicates which we cannot resolve.

@perlboy

perlboy commented Jul 13, 2023

Responding specifically to some of the Recipient comments here and it's a bit chronological cause I dropped from this convo.

The assumption here is all Banks are capable of doing this as they currently do it on their online banking and their mobile apps. I would be surprised if this is being done in the front end systems.

No. They aren't. In IB they are providing a balance according to the ledger not according to the actual balance that could be drawn down on an account. For instance, credit card holds, unreconciled transactions, reversals, payment clearings etc. all of those aren't on the ledger and many (most?) organisations either show them as standalone "Pending" without running balances or in the transaction list with balances redacted.

A ledger balance is not the same as an available balance.

Thanks both. It pains me to see a situation where a job is being done in multiple places (with possible differing results) when it could be done once at source. Just because this has been the pattern to date, should we persist it ¯_(ツ)_/¯

The CDR is about sharing data organisations already have. It might "pain" you but the alternative is effectively Recipients outsourcing their problems, at zero cost, back to Holders.

  1. If a data recipient wants to see a point in time balance without having to call the get balance api multiple times or once that point in time has passed

There's no reason to not call the Get Balance API multiple times. If the desire is to be able to call it more often then that's a discussion to be had around NFRs etc.

  2. If a data recipient wants to calculate an average balance over a defined period
  3. If a data recipient wants to predict a balance flow over time e.g. what is the expected (average) balance on the 25th of each month

These both seem like high school mathematics problems, especially since the ADR has access to 2 years' worth of data.
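
A minimal sketch of both calculations, assuming the recipient has already derived a closing balance for each day from data it holds. The function names and the `daily` mapping are illustrative only and are not part of the Standards; how reliable the input balances are is a separate question raised elsewhere in this thread.

```python
from datetime import date
from decimal import Decimal


def average_balance(daily: dict[date, Decimal], start: date, end: date) -> Decimal:
    """Mean of the closing balances for the days in [start, end], inclusive."""
    values = [bal for day, bal in daily.items() if start <= day <= end]
    return sum(values) / len(values)


def average_on_day_of_month(daily: dict[date, Decimal], day_of_month: int) -> Decimal:
    """Mean closing balance across every month's occurrence of the given day."""
    values = [bal for day, bal in daily.items() if day.day == day_of_month]
    return sum(values) / len(values)


# Hypothetical closing balances (assumption: one entry per day of interest)
daily = {
    date(2023, 5, 25): Decimal("120"),
    date(2023, 6, 25): Decimal("80"),
    date(2023, 6, 26): Decimal("400"),
}
june_average = average_balance(daily, date(2023, 6, 25), date(2023, 6, 26))
typical_25th = average_on_day_of_month(daily, 25)
```

Both functions raise `ZeroDivisionError` when no day matches, which a real implementation would want to handle explicitly.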

I note that ADRs seem to be deliberately retrieving the full 2 years regardless of use case so arguably this is a problem they can already solve from data they have already downloaded. It remains to be seen if this meets the data minimisation privacy safeguards (2 years of history seems appropriate for a house loan not a pay advance or <$2500 loan).

We do have some challenges if we were to calculate on our side.
e.g. Some use cases require historical balances, for example "what was the balance prior to the salary payment 1 year ago".

Play the transactions back in reverse.
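
A minimal sketch of that reverse playback, assuming posted transactions only, sorted newest first, with signed amounts (credits positive, debits negative) and a current balance that already reflects every listed transaction. `Txn` and `running_balances` are illustrative names, not part of the Standards, and the sketch does not address the pending/posted and data quality problems raised by recipients in this thread.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Txn:
    txn_id: str
    amount: Decimal  # signed: credits positive, debits negative (assumption)


def running_balances(current_balance: Decimal, newest_first: list[Txn]) -> list[tuple[str, Decimal]]:
    """Return (txn_id, balance immediately after that transaction), newest first."""
    results = []
    balance = current_balance
    for txn in newest_first:
        results.append((txn.txn_id, balance))  # balance once this transaction had posted
        balance -= txn.amount                  # undo it to get the balance before it
    return results


# Hypothetical data: a 135.00 deposit followed by two debits, ending at 100.00
txns = [
    Txn("t3", Decimal("-15.00")),
    Txn("t2", Decimal("-20.00")),
    Txn("t1", Decimal("135.00")),
]
history = running_balances(Decimal("100.00"), txns)
```

The result is only as good as the assumption that the balance and the transaction list describe the same moment in time.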

We can Get Transactions For Account with a date range to find the salary transaction.
However, to find the balance at that time we will need to get the current balance then make multiple calls to Get Transactions For Account to get all the transactions from now to 1 year ago.

That is correct and you can do it 1000 transactions at a time. For what it's worth our observation is that a typical Consumer rarely has more than a few thousand transactions in a year. There are grounds here to instead consider an async approach (i.e. "dump all of it and let me know when it's ready") as an alternative and that's something CDR+ has some initial conversations going on around.
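
A rough sketch of paging through the transaction list 1000 records at a time. The `fetch_page` callable is a hypothetical stand-in for the HTTP request, and the response shape used here (`data.transactions`, `meta.totalPages`) reflects my reading of the Standards' pagination conventions, so it should be checked against the published schema before being relied on.

```python
from typing import Callable, Iterator


def iter_transactions(fetch_page: Callable[[int, int], dict], page_size: int = 1000) -> Iterator[dict]:
    """Yield every transaction by requesting successive pages until the last one."""
    page = 1
    while True:
        body = fetch_page(page, page_size)
        yield from body["data"]["transactions"]
        if page >= body["meta"]["totalPages"]:
            break
        page += 1


calls = []  # records each page requested, to show how many calls were needed


def fake_fetch(page: int, page_size: int) -> dict:
    """Stand-in for the real request; a client would send page and page size as query parameters."""
    calls.append(page)
    all_txns = [{"transactionId": f"t{i}"} for i in range(2500)]
    chunk = all_txns[(page - 1) * page_size : page * page_size]
    total_pages = -(-len(all_txns) // page_size)  # ceiling division
    return {"data": {"transactions": chunk}, "meta": {"totalPages": total_pages}}


transactions = list(iter_transactions(fake_fetch))
```

With 2500 hypothetical transactions this takes three calls, which is the kind of per-account call count being weighed against TPS limits in this discussion.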

Rather than making 1 service call we are making multiple, pushing up already strained TPS limits.

I don't think introducing functionality through attributes to solve for throughput challenges is a solution, all this is doing is moving the problem further into the banking architecture. It might work in some cases but what is also almost certain is velocity of change will dramatically slow down. This is, in essence, the whole reason for existence of many Fintech organisations (i.e. high velocity).

The other way to solve this problem is to put a date/time on the Get Balances For Specific Accounts.

If this is a suggestion to request a balance at a point in time it is, again, moving the burden into the stack not solving for it.

I can appreciate the issues brought up by the ADHs here - but I also want to call out that it's essentially impossible for the ADRs to generate a running balance due to the inability to track payments across the PENDING and POSTED barrier,

This seems to imply Holders actually can do this without fundamental changes at very high cost to numerous backend systems (tl;dr: They can't). The CDR isn't meant to solve all the underlying challenges in banking. There is a statement to be made here that Recipients need to "get with the program" of the challenges holders face every day. Opposition to this proposal isn't really a Holder vs. Recipient situation but most likely technologist(s) in the Holder side saying "you don't understand the full problem space".

and also the fact that a few large ADHs seem to randomize the transaction ID every time a transaction is updated, leading to duplicates which we cannot resolve.

So this might be a compliance issue but it probably won't be. Taking the PENDING and POSTED situation, in quite a number of deployments PENDING lives in a completely separate part of a bank's infrastructure (i.e. the part attached to the payment rails) and is then "broadcasted" to the downstream ledgers which then reconcile it as POSTED. This means that from a bank's perspective the transaction identifier is different, the first representing the transaction id in the payment processing ledger and the second representing the reconciled version of it in the core system. I'll note here that in the CDR and for NPP payments this seems to be a solved problem with endToEndId so there may be a question around whether it is suitable (I'm not an NPP expert) and whether organisations are supplying it.

The NFRs of the Standards force organisations to concurrently broadcast these to a third system (ie. an operational data store or similar) which itself has no way of deduplicating such things to provide a single transaction identifier. In internet banking situations (ie. "other digital channels") this is completely fine because the user typically sees "Pending" transactions often without identifiers and this can be pulled from the payment rails ledger (that gets flushed after reconciliation).

@joshuanicholson

We understand the request for a running account balance and would use it should it be made available. However, we also appreciate the feedback and issues raised by the Data Holders, as a balance is a calculated value rather than a stored value.

We feel this is fundamentally a data quality and compliance issue, as we have data integrity issues with balance & transaction data collected from DHs. This means calculating the balance backwards from the current balance is proving problematic and essentially impossible for some DHs.

A rather simple example can be demonstrated as follows (appreciate a couple of assumptions in the following example, but as we know, without clarity of the specification and consistent delivery of data, sometimes assumptions must be made).

Let's say two API calls are made within milliseconds of each other; a get balance and get transactions.

  1. The Balance call returns an available balance of $80 and a current ‘ledger’ balance of $100
  2. The Transaction call returns many transactions for the required period (28 days) and includes two ‘pending’ transactions that have a net value of $35
    a) One pending transaction is $20 to Big Telco
    b) Second pending transaction is $15 to Best Friend

So given the above pending transactions and a $20 difference between the available & current balance, how can this mismatch be explained? (Ignoring things like credit card & overdraft limits, just an everyday transaction account.) Based on this simple example, a human reviewing the data would suggest that there is only a single pending transaction, as the $15 transaction is in fact posted and included in the current balance (suggesting the DH's APIs for balance & transactions are not synchronised). Alternatively, the available balance should be $65. This also means any attempt to reverse engineer a running balance will yield incorrect information.
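
The check described in this example can be sketched as follows, using the figures above. It assumes every pending item is a debit that lowers the available balance, and `explain_gap` is an illustrative name; the brute-force subset search is only practical for a handful of pending items.

```python
from decimal import Decimal
from itertools import combinations


def explain_gap(current: Decimal, available: Decimal, pending: dict[str, Decimal]) -> list[tuple[str, ...]]:
    """Return every subset of pending debits whose total equals current - available."""
    gap = current - available
    names = list(pending)
    matches = []
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            if sum((pending[n] for n in subset), Decimal("0")) == gap:
                matches.append(subset)
    return matches


pending = {"Big Telco": Decimal("20"), "Best Friend": Decimal("15")}
current, available = Decimal("100"), Decimal("80")

# Available balance if both pending debits were really outstanding
expected_available = current - sum(pending.values())

# Which pending items, taken together, account for the reported gap
matches = explain_gap(current, available, pending)
```

Here only the Big Telco debit accounts for the $20 gap, which matches the conclusion a human reviewer would reach. An empty or multi-entry result would mean the gap cannot be attributed unambiguously.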

Sure, the above could be done programmatically, but we see examples where it is impossible to identify transactions that match the difference between the available and current balance. To complicate matters, should this be a credit card and the limit not be provided, any calculation/reconciliation becomes impossible. This lack of data quality and noncompliance is forcing us to seek more data points to assist us in reconstructing the ‘ledger’ of a consumer's bank account.

Is the answer compliance enforcement? We say yes. Should there be a change to provide more data? As an ADR we'd never say no to more data if it ensured data integrity.

@nils-work
Member

Part of the discussion on this issue yesterday explored the impact that statement re-issuing, transaction reversals/adjustments and pending/posted status changes could have on a set of historical current/available balance snapshots (or even balances running alongside transactions). It was suggested that snapshots could be captured at 'end of day' or monthly statement time, for example.

Would the impact of those adjustments be equivalent in any existing 'offline' process where a customer is sharing possibly-outdated paper or PDF statements to demonstrate a balance over time? Is the intent of this proposal to improve access and efficiency, but basically provide parity with such a process, if there is value in doing that?

Would unaltered historical snapshots of balances be any better or worse for such a process, than the potential for incorrectly-calculated values being determined by retrieving and replaying transaction entries against an uncertain point-in-time balance, which may not have been in sync with any transactions available at the time of the respective invocations?

@kambasiq

kambasiq commented Aug 7, 2024

It would be great to get some additional data points against an account level balance so when we are calculating the balance against each transaction we could avoid mistakes, information like:

  • The transaction ID and timestamp of the last transaction included in the balance calculation
  • Number of transactions included in the balance calculation (if the balance was not calculated using all historic transactions, then the opening balance that was used to start the calculation should be included)
  • Any set of transaction IDs which were adjusted after their original transaction date and when the adjustment happened. This will be used to recalculate any previously calculated balance based on historic transaction adjustments

If the running balance in each transaction is not available in the source data (like how ledgers have worked for 100s of years) and DHs allow for historic transactions to be modified (as opposed to a true ledger where adjustments are applied as a new line item) then ADRs should be given all the necessary data points to do the running balance calculations similar to banking front-ends and bank statements.

@nils-work
Member

To help refine the discussion and solutions explored, how relevant and achievable would the following options be?

  1. Provide a balance history endpoint (for example, end of day balance for last x days) (to support historical balance use cases)
  2. Timestamp field indicating the time the values in the Balance endpoints were 'sourced' (to support accounting/reconciliation use cases)
  3. Transaction 'modified date' field - this could assist with the provision of a modified-since query parameter (related to NFR issues), and possibly with reconciling transactions to the balance 'sourced' date (to support accounting/reconciliation use cases)
  4. Something to explain differences between current and available balances? (to support accounting/reconciliation use cases by dealing with potential ambiguity in each implementation.)

Are there other key challenges or options? Which areas have the priority/value/complexity?

@Macca2805

Macca2805 commented Sep 4, 2024

Adding some insight - WeMoney currently has a feature that depends on tracking account balances over time. Here are some notes from our experience working with balance history;

Initially, we relied on our aggregator (Yodlee) who provide a balance history endpoint that allows us to extract daily balances, which we assume is calculated by them given there is no endpoint available in the standards.

Over time, we ran into some challenges relying on the data in this endpoint. For example a common issue we've come across is historical balances being restricted to far fewer than the 365 days requested. Other common issues we've experienced relate to time zones where balances appear to shift on the incorrect day based on the time of day the transaction occurred.
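
A small illustration of how a balance can land on the wrong day when timestamps are bucketed by UTC date rather than local date. It uses a fixed UTC+10 offset as a simplifying assumption; production code would more likely use a named zone (for example via the standard library `zoneinfo` module) so that daylight saving is handled. `local_posting_date` is an illustrative name.

```python
from datetime import date, datetime, timedelta, timezone

AEST = timezone(timedelta(hours=10))  # fixed offset, ignores daylight saving (assumption)


def local_posting_date(timestamp: str, tz: timezone = AEST) -> date:
    """Parse an ISO 8601 timestamp and return the calendar date in the given zone."""
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return parsed.astimezone(tz).date()


stamp = "2024-06-30T14:30:00Z"  # hypothetical transaction time, expressed in UTC
utc_day = datetime.fromisoformat(stamp.replace("Z", "+00:00")).date()
local_day = local_posting_date(stamp)
```

The same transaction falls on 30 June when grouped by UTC date and on 1 July when grouped by local date, so an end-of-month balance differs depending on which convention the calculation uses.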

We attributed the issues to a mix of incorrect data returned by the data holder, plus bugs in the calculation logic in the aggregator endpoint.

WeMoney is moving towards calculating daily balances ourselves with an in-house algorithm. This gives us more control and has allowed us to fix some of the issues mentioned in the previous paragraph. We learned through this process that accurately calculating daily balances comes with a few nuances that make it more challenging than simply playing back transactions in reverse. We've made a few assumptions that result in our calculated balance history not perfectly matching 100% of the time (although close enough for our use case).

I'm on the fence about whether or not daily balances should be calculated and made available by data holders. I echo concerns about economic benefit vs costs, especially given the raw data needed to calculate it should already be available to ADRs. However, in our experience it can be challenging to calculate historic balances accurately without perfect understanding of the many nuances and without consistent and correct raw data from data holders.

Hope above context is helpful, thanks.

@amanuel13

Something to explain differences between current and available balances? (to support accounting/reconciliation use cases by dealing with potential ambiguity in each implementation.)

The previous comment by Josh Nicholson on this issue supported those of NAB and Westpac in relation to cost/benefit.

Current balance is the ledger balance.
Available balance is, as it has been forever, the current balance plus/minus outstanding holds (for credit cards this would be authorisation holds for unposted transactions, calculated, accumulated, accrued but unposted fees such as within a month, and cheques deposited into an account but not yet cleared).

@nils-work
Member

Hi @amanuel13

Re:

Something to explain differences between current and available balances

The aim of this is to help Data Recipients understand the apparent inconsistency in what Data Holders are disclosing at any point in time. This may be due to differences in the latency of each data set, i.e. real time balance and delayed transactions, or incorrect/inconsistent/delayed PENDING/POSTED status changes.

For example, are you suggesting that the items you listed may affect the available balance but not appear as PENDING transactions, or would they always be in sync and aligned?

  • authorisations holds for unposted transactions,
  • calculated, accumulated, accrued but unposted fees such as within a month, and
  • cheques deposited into an account but not yet cleared.

@markskript

markskript commented Sep 25, 2024

Other common issues we've experienced relate to time zones where balances appear to shift on the incorrect day based on the time of day the transaction occurred.

We are seeing the same. We have a compliance ticket open against one DH at the moment to fix the timezones they are supplying, and we hope to use that as precedent to raise the same issue with a few others that have the same problem.

However, in our experience it can be challenging to calculate historic balances accurately without perfect understanding of the many nuances and without consistent and correct raw data from data holders.

It's these nuances which fuel my belief that ADRs currently do not have enough information to calculate the balances from the raw data to an acceptable level of accuracy.

I'm on the fence about whether or not daily balances should be calculated and made available by data holders. I echo concerns about economic benefit vs costs,

To me, it comes down to the use case. Some use cases may be able to support "close enough" calculations of a running balance based on the data we have now. The use case a few ADRs have mentioned in this discussion is the "reproduce a bank statement" use case. In this situation, the data must be accurate.

Or, we need to find another pattern for obtaining "statement quality" data as opposed to "transaction" data from the CDR to fulfill this use case (a concept that is also being discussed in the NFR working group). The driver behind this use-case is often audit and compliance - accountants want running balances, so it's likely more applicable to the business/corporate consumer than the retail consumer. That should also be taken into account in the cost/benefit discussion.

To me there are 3 paths that would allow this use-case to move forward.

Option 1 - this change
Option 2 - we analyse and fix all the data quality issues across all the DHs to allow ADRs to accurately calculate a running balance. This would entail a lot of fixes related to postingDateTime, and pending transactions, and being able to track a transaction when it is no longer pending, and ensuring the balances reflect that... this is a long road. A good road though, as improvements in data quality are always a win and will likely benefit other use cases.
Option 3 - a statements API where the ADR can request a static "export" of transaction and balance data, akin to accessing a PDF statement through Online Banking.

Skript supports a more detailed cost/benefit analysis of these options (plus any other ideas) to determine the way forward.

Regards,
Mark.

@nils-work
Member

I think the current position and options available to address this issue are reflected in:

Any challenges in using the existing or proposed fields for the referenced use cases could be considered a compliance concern with respect to data quality.

@Mekaal

Mekaal commented Oct 3, 2024

Depending on exactly what change to standard is ultimately proposed, there appears to be a high likelihood that Bendigo Bank would not need to supply it as it would not be considered "required consumer data" as per rule 3.2 (1) c) in Part 3, not being held in a digital form.

@markskript

As discussed on the last MI call, we propose that to progress this issue we set up an experiment that will allow us to determine if adding a known date/time value to the account balance endpoint will be enough to resolve the current issues ADRs are having aligning balance and transaction data.

We suggest adding a new field to the https://consumerdatastandardsaustralia.github.io/standards/#cdr-banking-api_schemas_tocSbankingbalance structure as follows

Name: effectiveAt
Type: DateTimeString
Required: Mandatory
Description: The date/time when the current balance was last updated at the data holder. This value must relate to the postingDateTime value on the Banking Transactions.
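
A sketch of how a recipient might use the proposed value to align a balance with a transaction list. `effectiveAt` is only the field proposed in this comment, not part of the current schema, and `split_by_effective_at` is an illustrative name. The sketch assumes `postingDateTime` is present on posted transactions and absent on pending ones.

```python
from datetime import datetime


def parse(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing Z for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def split_by_effective_at(effective_at: str, transactions: list[dict]) -> tuple[list[dict], list[dict]]:
    """Partition posted transactions into those reflected in the balance and those posted later."""
    cutoff = parse(effective_at)
    included, later = [], []
    for txn in transactions:
        posted = txn.get("postingDateTime")
        if posted is None:
            continue  # no posting time (e.g. pending); handled separately
        (included if parse(posted) <= cutoff else later).append(txn)
    return included, later


# Hypothetical data for illustration
transactions = [
    {"transactionId": "a", "postingDateTime": "2024-10-01T02:00:00Z"},
    {"transactionId": "b", "postingDateTime": "2024-10-01T06:00:00Z"},
    {"transactionId": "c"},  # pending, no postingDateTime
]
included, later = split_by_effective_at("2024-10-01T03:00:00Z", transactions)
```

Transactions in `later` would need to be applied on top of the reported balance before any replay, which is the alignment step that is ambiguous today.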

@joshuanicholson can you please review the description above to see if it meets your requirements as well.

@Mekaal

Mekaal commented Nov 17, 2024

I don't understand how an effectiveAt attribute as described above gives any new information. The currentBalance provided in the BankingBalance is "The balance of the account at this time" and the effectiveAt has to therefore be the date/time of the last posted transaction (which is already available).

@markskript

@Mekaal - while that might be true for some DHs, it is not true for all. We are finding a disconnect between the balances and transactions on some DHs, almost like they are being sourced from disconnected systems.

@perlboy

perlboy commented Nov 19, 2024

@Mekaal - while that might be true for some DHs, it is not true for all. We are finding a disconnect between the balances and transactions on some DHs, almost like they are being sourced from disconnected systems.

Because in an attempt to achieve NFRs, they are. For what it's worth this was predicted as the result when fast and inaccurate was prioritised above slow and inaccurate. Unfortunately the Regulator put fuel on the fire by then chasing people for solutions to response time problems (what they could see in Metrics) without having visibility of data quality/latency problems. The number of times I've seen a bank present their CDR data mastering architecture that inherently disconnects transactions from the rest of the source data is so numerous I can safely say this is a majority.

Unfortunately I'm dubious about the running balance proposal being the solution. Instead, slowing down to speed up is likely to allow the data preparation layer to be "distilled", likely resulting in a cleaner transaction set. This is exactly what Biza within DataRight+ has specified, and is now releasing as a voluntary API for its holders, in Bulk Transaction Detail.

Projects
Status: Iteration Candidates