- Java 11, for example from Azul
- Eclipse IDE - PP is built using the Eclipse RCP (Rich Client Platform) framework, so it generally does not make sense to use other IDEs. Download the Eclipse IDE for RCP and RAP Developers package.
Optionally, install language packs for Eclipse:
- Menu --> Help --> Install New Software
- Use the following update site:
https://download.eclipse.org/technology/babel/update-site/latest/
- Select the language packs you want to install
- By default, Eclipse uses the host operating system language (locale).
To force the use of another language, use the -nl parameter:
eclipse.exe -nl de
Optionally, install via the Eclipse Marketplace (drag and drop the Install button to your Eclipse workspace)
- Eclipse PDE (Plug-in Development Environment) (skip if you installed the Eclipse IDE for RCP and RAP Developers)
- Infinitest
- ResourceBundle Editor
- Checkstyle Plug-In
- SonarLint
- Launch Configuration DSL
- Eclipse e4 Tools Developer Resources
- Menu --> Help --> Install New Software
- Pick Latest Eclipse Simultaneous Release from the dropdown menu
- Under General Purpose Tools select the Eclipse e4 Tools Developer Resources
Configure the following preferences (Menu --> Window --> Preferences):
- Java --> Editor --> Save Actions
  - Activate Format Source Code and then Format edited lines
  - Activate Organize imports
- Java --> Editor --> Content Assist
  - Activate Add import instead of qualified name
  - Activate Use static imports
- Java --> Editor --> Content Assist --> Favorites
  - Click on New Type... and add the following favorites:
    - name.abuchen.portfolio.util.TextUtil
    - name.abuchen.portfolio.datatransfer.pdf.PDFExtractorUtils
- Java --> Installed JREs
  - Add the Java 11 JDK
For further discussion, check out the thread in the (German) forum.
To contribute to Portfolio Performance, you create a fork, clone the repository, make and push changes to your repository, and then create a pull request.
- Create your own fork
- Within Eclipse, clone your repository. In the last step, choose to import all existing Eclipse projects.
- Within Eclipse, import projects from an existing repository
- Open the portfolio-target-definition project
- Open the portfolio-target-definition.target file with the Target Editor (this may take a while as it requires Internet access). If you just get an XML file, right-click and choose Open With Target Editor
- In the resulting editor, click on the "Set as Active Target Platform" link at the top right (this may also take a while)
PP uses Eclipse Launch Configuration DSL to define Eclipse launch configurations in an OS-independent way.
First, add the Launch Configuration view to your workspace:
Menu --> Window --> Show View --> Other... --> Debug --> Launch Configuration
To run the application, select Eclipse Application --> PortfolioPerformance and right-click Run.
To run the tests, select under JUnit Plug-in Tests --> PortfolioPerformance_Tests or PortfolioPerformance_UI_Tests.
It is not required to use Maven, as you can develop using the Eclipse IDE with the setup above. The Maven build is used for the GitHub Actions build.
The Maven build works fine when JAVA_HOME points to an (Open-)JDK 11 installation.
Linux/macOS
export MAVEN_OPTS="-Xmx2g"
mvn -f portfolio-app/pom.xml clean verify
Windows
set MAVEN_OPTS="-Xmx2g"
mvn -f portfolio-app\pom.xml -Denforcer.skip=true clean verify
- Write a good commit message in English
- If the change is related to a GitHub issue, add a line Issue: #<ISSUE NUMBER> after an empty line
- If the change is related to a thread in the forum, add a line Issue: https://... with the link to the post in the forum
- Format the source code. The formatter configuration is part of the project source code. Exception: Do not reformat the PDF importer source code. Instead, carefully insert new code into the existing formatting.
- Add test cases where applicable. Today, there are no tests that test the SWT UI. But add tests for all calculations.
- Do not merge the master branch into your feature branch. Instead, rebase your local changes onto the head of the master branch.
- Create a pull request, for example with GitHub Desktop by following this tutorial
The project uses Java property files to translate the application into multiple languages.
There are two ways to contribute translations:
- Register and translate using POEditor. If you only want to contribute to one language (or fix the translation for existing labels), this is the easiest way. On a regular basis, we pull the translations from POEditor into the source code.
- Update the property files directly. Open the default property file (the one without the language). The Resource Bundle Editor (installed above) will detect all existing languages and display a consolidated editor.
When adding new labels,
- right-click in the source editor and choose Source --> Externalize Strings
- use the formatting exactly as done by the Resource Bundle Editor
- use DeepL to translate new labels into all existing languages
Importers are created for each supported bank and/or broker. The process works like this:
- The user selects one or more PDF files via the import menu (or drags and drops multiple PDF files to the sidebar navigation)
- Each PDF file is converted to an array of strings; one entry per line
- Each importer is presented with the strings and applies its regular expressions to extract transactions
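The process above can be sketched in plain Java. This is only an illustration under stated assumptions: the class name, the sample text, and the pattern are hypothetical and do not reproduce the actual importer API of the project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineExtractionSketch {
    // Hypothetical pattern for a line such as "Kauf 10 Stück IE00BKM4GZ66".
    // The dot in "St.ck" stands in for the umlaut.
    static final Pattern BUY_LINE = Pattern.compile(
            "^Kauf (?<shares>[\\.,\\d]+) St.ck (?<isin>[A-Z]{2}[A-Z0-9]{9}[0-9])$");

    /** Splits the text extracted from a PDF into one entry per line. */
    static String[] toLines(String pdfText) {
        return pdfText.split("\\r?\\n");
    }

    /** Returns "isin:shares" for every line recognized by the pattern. */
    static List<String> extract(String[] lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            Matcher matcher = BUY_LINE.matcher(line);
            if (matcher.matches())
                result.add(matcher.group("isin") + ":" + matcher.group("shares"));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Musterbank AG\nKauf 10 St\u00fcck IE00BKM4GZ66\nVielen Dank";
        System.out.println(extract(toLines(text))); // prints [IE00BKM4GZ66:10]
    }
}
```

Lines that no pattern recognizes are simply skipped, which is why each importer needs a unique recognition feature for its documents.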
If you want to add an importer for a new bank or a new transaction type, check out the existing importers for naming conventions, structure, formatting, etc.
PDF importer: name.abuchen.portfolio/src/name/abuchen/portfolio/datatransfer/pdf/
Test cases: name.abuchen.portfolio.tests/src/name/abuchen/portfolio/datatransfer/pdf/
The naming convention is BANKExtractor and BANKExtractorTest for extractor class and test class respectively.
PP distinguishes between PortfolioTransaction (booked on a securities account) and AccountTransaction (booked on a cash account). The available types are defined as an enum within the respective file, for example for purchase (BUY) and sale (SELL) of securities, etc.
The structure of the PDF importers is as follows:
- Client
  - addBankIdentifier --> unique recognition feature of the PDF document
- Transaction types (basic types)
  - addBuySellTransaction --> Purchase and sale (single settlement)
  - addSummaryStatementBuySellTransaction --> Purchase and sale (multiple settlements)
  - addBuyTransactionFundsSavingsPlan --> Savings plans
  - addDividendeTransaction --> Dividends
  - addTaxTreatmentForDividendeTransaction --> Tax treatment for dividends
  - addAdvanceTaxTransaction --> Advance tax payment
  - addCreditcardStatementTransaction --> Credit card transactions
  - addAccountStatementTransaction --> Giro account transactions
  - addDepotStatementTransaction --> Securities account transactions (settlement account)
  - addTaxStatementTransaction --> Tax settlement
  - addDeliveryInOutBoundTransaction --> Inbound and outbound deliveries
  - addTransferInOutBoundTransaction --> Transfer in and outbound deliveries
  - addReinvestTransaction --> Reinvestment transaction
  - addTaxReturnBlock --> Tax refund
  - addFeeReturnBlock --> Fee refund
- Bank name
  - getLabel --> display label of bank/broker, e.g., Deutsche Bank Privat- und Geschäftskunden AG
- Taxes and fees
  - addTaxesSectionsTransaction --> handling of taxes
  - addFeesSectionsTransaction --> handling of fees
- Override the value extractor methods if the documents use locales other than the standard ones (English, German):
  - Example: Bank SLM (de_CH)
  - Example: Baader Bank AG (de_DE + en_US)
- Add post-processing for imported transactions using a postProcessing method:
  - Example: Comdirect
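To illustrate why a non-standard locale needs its own value parsing, here is a minimal sketch that parses a Swiss-style amount such as 74'120.00. It uses only the JDK; the class and method names are hypothetical, and the actual value extractor methods of the project are not shown here.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.util.Locale;

class SwissAmountParserSketch {
    /** Parses amounts written with an apostrophe as grouping separator and a dot as decimal separator. */
    static BigDecimal parseSwissAmount(String value) throws ParseException {
        // Set the separators explicitly instead of relying on the JDK's locale data (assumption:
        // the documents use the plain ASCII apostrophe).
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setGroupingSeparator('\'');
        symbols.setDecimalSeparator('.');

        DecimalFormat format = new DecimalFormat("#,##0.00", symbols);
        format.setParseBigDecimal(true); // parse into BigDecimal instead of Double or Long
        return (BigDecimal) format.parse(value);
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(parseSwissAmount("74'120.00"));
    }
}
```

A parser configured for German documents would read the same text differently, because there the dot is the grouping separator and the comma the decimal separator.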
The importers are structured according to the following scheme and the mapping variables are to be adhered to as far as possible:
- Type (Optional)
  - type --> Exchange of the transaction pair (e.g. from purchase to sale)
- Security identification
  - name --> Security name
  - isin --> International Securities Identification Number
  - wkn --> Security code number
  - tickerSymbol --> Ticker symbol (Optional)
  - currency --> Security currency
- Shares of the transaction
  - shares --> Shares
- Date and time
  - date --> Date
  - time --> Time (Optional)
- Total amount (With fees and taxes)
  - amount --> Amount e.g. 123,15
  - currency --> Currency of the total amount
- Foreign currency
  - gross --> Total amount in transaction currency without fees and taxes
  - currency --> Currency of the total amount
  - fxGross --> Total amount in foreign currency without fees and taxes
  - fxCurrency --> Currency of the total amount in foreign currency
- Exchange rate
  - exchangeRate --> Foreign currency exchange rate
  - baseCurrency --> Base currency
  - termCurrency --> Foreign currency
- Notes (Optional)
  - note --> Notes e.g. quarterly dividend
- Tax section
  - tax --> Amount
  - currency --> Currency
  - withHoldingTax --> Withholding tax
  - creditableWithHoldingTax --> Creditable withholding tax
- Fee section
  - fee --> Amount
  - currency --> Currency
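As a rough illustration of how the mapping variables above can appear in a pattern, the following sketch captures them as named groups of a regular expression. The document line, the class name, and the pattern are invented for this example; only the variable names (shares, name, isin, amount, currency) come from the list above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MappingVariablesSketch {
    // Hypothetical document line; the layout is invented for illustration.
    static final String LINE = "Kauf 12,5 Stk. Beispiel ETF IE00BKM4GZ66 751,68 EUR";

    static final Pattern PATTERN = Pattern.compile(
            "^Kauf (?<shares>[\\.,\\d]+) Stk\\. (?<name>.*) "
                    + "(?<isin>[A-Z]{2}[A-Z0-9]{9}[0-9]) "
                    + "(?<amount>[\\.,\\d]+) (?<currency>[A-Z]{3})$");

    /** Returns the captured mapping variables, or an empty map if the line does not match. */
    static Map<String, String> extract(String line) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher matcher = PATTERN.matcher(line);
        if (matcher.matches()) {
            for (String key : new String[] { "name", "isin", "shares", "amount", "currency" })
                values.put(key, matcher.group(key));
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(extract(LINE));
    }
}
```

Using the same variable names in every importer keeps the patterns comparable across banks and brokers.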
A good finished importer to use as a basis is, for example, the V-Bank AG PDF importer.
Standardized conversions are invoked from AbstractPDFExtractor.java and implemented in the utility class PDFExtractorUtils.java. PDFExchangeRate helps with processing foreign currencies.
Use the Money class when working with amounts (it includes the currency and the value rounded to cents). Use BigDecimal for exchange rates and the conversion between currencies.
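The division of labor between amounts and exchange rates can be sketched as follows. Note the assumptions: CentAmount is a hypothetical stand-in written for this example and is not the project's Money class, and the rounding mode is chosen only for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class CurrencyConversionSketch {
    /** Hypothetical stand-in for a money value: currency code plus amount in cents. */
    static final class CentAmount {
        final String currency;
        final long cents;

        CentAmount(String currency, long cents) {
            this.currency = currency;
            this.cents = cents;
        }
    }

    /** Multiplies with the exchange rate in BigDecimal and rounds once, at the end, to whole cents. */
    static CentAmount convert(CentAmount source, BigDecimal rate, String targetCurrency) {
        long cents = BigDecimal.valueOf(source.cents)
                .multiply(rate)
                .setScale(0, RoundingMode.HALF_UP) // rounding mode is an assumption
                .longValueExact();
        return new CentAmount(targetCurrency, cents);
    }

    public static void main(String[] args) {
        CentAmount usd = new CentAmount("USD", 10000); // 100.00 USD
        CentAmount eur = convert(usd, new BigDecimal("0.9231"), "EUR");
        System.out.println(eur.currency + " " + eur.cents); // prints EUR 9231
    }
}
```

Keeping the rate as a BigDecimal avoids the binary rounding errors of double, and the amount is only rounded to cents after the multiplication.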
Use the TextUtil class for string manipulation such as trimming strings and stripping whitespace characters. The text created from PDF files has some corner cases that are not supported by the usual Java methods.
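One such corner case is the non-breaking space (U+00A0), which String.trim() does not remove. The helper below is a hypothetical illustration of the problem and not the implementation of TextUtil.

```java
class PdfWhitespaceSketch {
    /** Removes leading and trailing whitespace, including non-breaking spaces. */
    static String trimAll(String value) {
        int start = 0;
        int end = value.length();
        while (start < end && isBlank(value.charAt(start)))
            start++;
        while (end > start && isBlank(value.charAt(end - 1)))
            end--;
        return value.substring(start, end);
    }

    // Character.isWhitespace excludes non-breaking spaces; Character.isSpaceChar includes them.
    private static boolean isBlank(char c) {
        return Character.isWhitespace(c) || Character.isSpaceChar(c);
    }

    public static void main(String[] args) {
        String raw = "\u00A0751,68\u00A0";
        System.out.println(raw.trim().length()); // prints 8: the non-breaking spaces survive
        System.out.println(trimAll(raw).length()); // prints 6
    }
}
```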
Due to the many comments with text fragments from the PDF documents, we do not auto-format the PDF importer class files. Instead, carefully insert new code into the existing formatting manually. To protect a section from automatic formatting, enclose it in the @formatter:off and @formatter:on tags.
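A minimal sketch of how the tags are placed in a source file, assuming the Off/On tags are enabled in the Eclipse formatter profile. The class, the quoted text fragments, and the pattern are hypothetical.

```java
class FormatterTagsSketch {
    // @formatter:off
    // Text fragments copied from a (hypothetical) PDF document:
    // Wertpapier Abrechnung Kauf
    // Stk. 10      BEISPIEL ETF          IE00BKM4GZ66
    static final String REGEX = "^Stk\\. (?<shares>[\\.,\\d]+) .*$";
    // @formatter:on

    /** Returns true if the whole line matches the pattern documented above. */
    static boolean matches(String line) {
        return line.matches(REGEX);
    }

    public static void main(String[] args) {
        System.out.println(matches("Stk. 10      BEISPIEL ETF          IE00BKM4GZ66"));
    }
}
```

Everything between the two tags keeps its manual layout, so the aligned text fragments in the comments stay readable.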
Please take a look at the formatting and structure in the other PDF importers! Example: V-Bank AG
Via the application menu, users can create a test case file. The test file is the extracted text from the PDF documents. Users then anonymize the text by replacing personally identifiable information and account numbers with alternative text.
- The test files should not be modified beyond the anonymization
- All source code (including the test files) is stored in UTF-8 encoding
- Follow the naming convention for test files (type in the local language, two-digit counter):
  - Buy01.txt, Sell01.txt --> Purchase and sale (single settlements)
  - Dividend01.txt --> Dividends (single statements)
  - SteuermitteilungDividende01.txt --> Tax settlement for dividends (single settlement)
  - SammelabrechnungKaufVerkauf01.txt --> Purchase and sale (multiple settlements)
  - Wertpapiereingang01.txt --> Incoming securities
  - Wertpapierausgang01.txt --> Outgoing securities
  - Vorabpauschale01.txt --> Advance taxes
  - GiroKontoauszug01.txt --> Giro account statement
  - KreditKontoauszug01.txt --> Credit card account statement
  - Depotauszug01.txt --> Securities account transaction history (settlement account)
- Samples
  - one transaction per PDF: Erste Bank Gruppe - see testWertpapierKauf06() and testDividende05()
  - supporting securities with multiple currencies: Erste Bank Gruppe with testWertpapierKauf09() / testWertpapierKauf09WithSecurityInEUR() and testDividende10() / testDividende10WithSecurityInEUR()
    - Background: in the PP model, the currency of the transaction must always match the currency of the security and its historical prices. However, sometimes securities are purchased on a different exchange with prices in another currency. The importer tries to handle this case automatically. This is reflected in the paired test cases
  - multiple transactions per PDF: DKB AG with testGiroKontoauszug01()
  - if transactions are created based on two separate PDF files, use post-processing: Comdirect with testDividendeWithTaxTreatmentForDividende01() and testDividendeWithTaxTreatmentReversedForDividende01()
To test regular expressions, you can use https://regex101.com/.
Besides general good practices for regular expressions, keep in mind:
- all special characters in the PDF document (äöüÄÖÜß as well as e.g. circumflex or similar) should be matched by a . (dot) because the PDF to text conversion can create different results
- an expression in .match(" ... ") starts with the anchor ^ and ends with $
- with .find(" ... ") do not add anchors as they will be added automatically
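The effect of the dot and of the anchors can be tried out with plain java.util.regex. This sketch does not use the .match and .find methods mentioned above; the class name and the sample lines are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorAndDotSketch {
    // The dot stands in for the umlaut; ^ and $ tie the expression to the whole line.
    static final Pattern SHARES = Pattern.compile("^St.ck (?<shares>[\\.,\\d]+)$");

    /** Returns the shares if the line consists of exactly the expected text, otherwise null. */
    static String shares(String line) {
        Matcher matcher = SHARES.matcher(line);
        return matcher.find() ? matcher.group("shares") : null;
    }

    public static void main(String[] args) {
        System.out.println(shares("St\u00fcck 10")); // umlaut as extracted correctly
        System.out.println(shares("St\uFFFDck 10")); // umlaut garbled by the conversion
        System.out.println(shares("Kauf St\u00fcck 10 zu 75,00")); // rejected by the anchors
    }
}
```

One caveat of this approach: a dot matches exactly one character, so an umlaut that the conversion emits as two characters would need a more tolerant expression.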
Keep in mind that the regular expressions work against text that is automatically created from PDF files. Due to the nature of the process, there can always be slight differences in the text files. The following table collects the regular expressions that worked well to match typical values.
| Value | Example | Not Helpful | Works Well |
|---|---|---|---|
| Date | 01.01.1970 | \\d+.\\d+.\\d{4} | [\\d]{2}\\.[\\d]{2}\\.[\\d]{4} |
| | 1.1.1970 | \\d+.\\d+.\\d{4} | [\\d]{1,2}\\.[\\d]{1,2}\\.[\\d]{4} |
| Time | 12:01 | \\d+:\\d+ | [\\d]{2}\\:[\\d]{2} |
| ISIN | IE00BKM4GZ66 | \\w+ | [A-Z]{2}[A-Z0-9]{9}[0-9] |
| | | | [\\w]{12} |
| WKN | A111X9 | \\w+ | [A-Z0-9]{6} |
| | | | [\\w]{6} |
| Amount | 751,68 | [\\d,.]+ | [\\.,\\d]+ |
| | | | [\\.\\d]+,[\\d]{2} |
| | 74'120.00 | [\\d.']+ | [\\.'\\d]+ |
| | 20 120.00 | [\\d.\\s]+ | [\\.\\d\\s]+ |
| Currency | EUR | \\w+ | [A-Z]{3} |
| | | | [\\w]{3} |
| Currency | € or $ | \\D | \\p{Sc} |
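The expressions in the table can be checked against the example values with a few lines of Java. This is a small sketch with a hypothetical class name; the constants are copied from the Works Well column, and LOOSE_DATE from the Not Helpful column for comparison.

```java
import java.util.regex.Pattern;

class ValuePatternsSketch {
    static final String DATE = "[\\d]{1,2}\\.[\\d]{1,2}\\.[\\d]{4}";
    static final String LOOSE_DATE = "\\d+.\\d+.\\d{4}"; // unescaped dots match any character
    static final String TIME = "[\\d]{2}\\:[\\d]{2}";
    static final String ISIN = "[A-Z]{2}[A-Z0-9]{9}[0-9]";
    static final String WKN = "[A-Z0-9]{6}";
    static final String AMOUNT = "[\\.\\d]+,[\\d]{2}";
    static final String CURRENCY = "[A-Z]{3}";
    static final String CURRENCY_SYMBOL = "\\p{Sc}";

    /** Returns true if the regular expression matches the complete value. */
    static boolean fullMatch(String regex, String value) {
        return Pattern.matches(regex, value);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch(DATE, "1.1.1970"));
        System.out.println(fullMatch(DATE, "01-01-1970"));
        System.out.println(fullMatch(LOOSE_DATE, "01-01-1970"));
        System.out.println(fullMatch(ISIN, "IE00BKM4GZ66"));
        System.out.println(fullMatch(AMOUNT, "751,68"));
        System.out.println(fullMatch(CURRENCY_SYMBOL, "\u20AC"));
    }
}
```

The comparison between DATE and LOOSE_DATE shows why the stricter variants are preferred: the loose expression also accepts text that is not a date in the expected format.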