Contributing to Portfolio Performance

Development Setup

Install Eclipse

  • Java 11, for example from Azul

  • Eclipse IDE - PP is built using the Eclipse RCP (Rich Client Platform) framework. Therefore, it generally does not make sense to use other IDEs. Download the Eclipse IDE for RCP and RAP Developers package.

Optionally, install language packs for Eclipse:

  • Menu --> Help --> Install New Software
  • Use the following update site:
    https://download.eclipse.org/technology/babel/update-site/latest/
    
  • Select the language packs you want to install
  • By default, Eclipse uses the host operating system language (locale). To force the use of another language, use the -nl parameter:
    eclipse.exe -nl de
    

Install Eclipse Plugins

Optionally, install via the Eclipse Marketplace (drag and drop the Install button to your Eclipse workspace)

Configure Eclipse

Configure the following preferences (Menu --> Window --> Preferences)

  • Java --> Editor --> Save Actions
    • Activate Format Source Code and then Format edited lines
    • Activate Organize imports
  • Java --> Editor --> Content Assist
    • Activate Add import instead of qualified name
    • Activate Use static imports
  • Java --> Editor --> Content Assist --> Favorites
    • Click on New Type... and add the following favorites
      • name.abuchen.portfolio.util.TextUtil
      • name.abuchen.portfolio.datatransfer.pdf.PDFExtractorUtils
  • Java --> Installed JREs
    • Add the Java 11 JDK

Project Setup

For further discussion, check out the thread in the (German) Forum.

Source Code

To contribute to Portfolio Performance, you create a fork, clone the repository, make and push changes to your repository, and then create a pull request.

Setup Target Platform

  • Open the portfolio-target-definition project
  • Open the portfolio-target-definition.target file with the Target Editor (this may take a while as it requires Internet access). If you just get an XML file, right-click and choose Open With Target Editor
  • In the resulting editor, click on the "Set as Active Target Platform" link at the top right (this may also take a while)

Launch Portfolio Performance

PP uses Eclipse Launch Configuration DSL to define Eclipse launch configurations in an OS-independent way.

First, add the Launch Configuration view to your workspace:

  • Menu --> Window --> Show View --> Other... --> Debug --> Launch Configuration

To run the application, select Eclipse Application --> PortfolioPerformance, right-click, and choose Run.

To run the tests, select JUnit Plug-in Tests --> PortfolioPerformance_Tests or PortfolioPerformance_UI_Tests.

Build with Maven

It is not required to use Maven as you can develop using the Eclipse IDE with the setup above. The Maven build is used for the GitHub Actions build.

The Maven build works fine when JAVA_HOME points to an (Open-)JDK 11 installation.

Linux/macOS

export MAVEN_OPTS="-Xmx2g"
mvn -f portfolio-app/pom.xml clean verify

Windows

set MAVEN_OPTS="-Xmx2g"
mvn -f portfolio-app\pom.xml -Denforcer.skip=true clean verify

Contribute Code

  • Write a good commit message in English
  • If the change is related to a GitHub issue, add a line Issue: #<ISSUE NUMBER> after an empty line
  • If the change is related to a thread in the forum, add a line Issue: https://... with the link to the post in the forum
  • Format the source code. The formatter configuration is part of the project source code. Exception: Do not reformat the PDF importer source code. Instead, carefully insert new code into the existing formatting.
  • Add test cases where applicable. Today, there are no tests that test the SWT UI, but add tests for all calculations.
  • Do not merge the master branch into your feature branch. Instead, rebase your local changes onto the head of the master branch.
  • Create a Pull Request - for example with GitHub Desktop, following this tutorial

Translations

The project uses Java property files to translate the application into multiple languages.

There are two ways to contribute translations:

  • Register and translate using POEditor. If you only want to contribute to one language (or fix the translation for existing labels), this is the easiest way. On a regular basis, we pull the translations from POEditor into the source code.
  • Update the property files directly. Open the default property file (the one without the language). The Resource Bundle Editor (installed above) will detect all existing languages and display a consolidated editor.

When adding new labels,

  • right-click in the source editor and choose Source --> Externalize Strings
  • use the formatting exactly as done by the Resource Bundle Editor
  • use DeepL to translate new labels into all existing languages

PDF Importer

Importers are created for each supported bank and/or broker. The process works like this:

  • The user selects one or more PDF files via the import menu (or drags and drops multiple PDF files to the sidebar navigation)
  • Each PDF file is converted to an array of strings; one entry per line
  • Each importer is presented with the strings and applies its regular expressions to extract transactions
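
The steps above can be pictured with plain java.util.regex. This is only a minimal sketch under assumptions: the class LineScanSketch, the sample text, and the line layout are made up for illustration and are not PP's extractor API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; the real importers build on PP's own extractor classes.
final class LineScanSketch
{
    // Made-up line layout: "Kauf <shares> Stueck-with-umlaut ISIN <isin>"
    private static final Pattern BUY_LINE = Pattern
                    .compile("^Kauf (?<shares>[\\.,\\d]+) St.ck ISIN (?<isin>[A-Z]{2}[A-Z0-9]{9}[0-9])$");

    // The extracted PDF text becomes an array of strings, one entry per line.
    static String[] toLines(String pdfText)
    {
        return pdfText.split("\\r?\\n");
    }

    // Apply the regular expression to every line and collect what it detects.
    static List<String> extract(String[] lines)
    {
        List<String> found = new ArrayList<>();
        for (String line : lines)
        {
            Matcher m = BUY_LINE.matcher(line);
            if (m.matches())
                found.add(m.group("isin") + " x " + m.group("shares"));
        }
        return found;
    }

    public static void main(String[] args)
    {
        String text = "Musterbank AG\nKauf 10 St\u00FCck ISIN IE00BKM4GZ66\nSeite 1";
        System.out.println(extract(toLines(text))); // [IE00BKM4GZ66 x 10]
    }
}
```

Lines that do not match are simply skipped, which is why header and footer lines of a document do not need special handling in this sketch.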

If you want to add an importer for a new bank or a new transaction type, check out the existing importers for naming conventions, structure, formatting, etc.

Source Location

  • PDF importer: name.abuchen.portfolio/src/name/abuchen/portfolio/datatransfer/pdf/
  • Test cases: name.abuchen.portfolio.tests/src/name/abuchen/portfolio/datatransfer/pdf/

The naming convention is BANKExtractor for the extractor class and BANKExtractorTest for the test class.

Imported Transactions

PP distinguishes between PortfolioTransaction (booked on a securities account) and AccountTransaction (booked on a cash account). The available types are defined as an enum within the respective class, for example purchase (BUY) and sale (SELL) of securities.

Anatomy of a PDF Importer

The structure of the PDF importers is as follows:

  • Client
    • addBankIdentifier --> unique recognition feature of the PDF document
  • Transaction types (basic types)
    • addBuySellTransaction --> Purchase and sale (single settlement)
    • addSummaryStatementBuySellTransaction --> Purchase and sale (multiple settlements)
    • addBuyTransactionFundsSavingsPlan --> Savings plans
    • addDividendeTransaction --> Dividends
    • addTaxTreatmentForDividendeTransaction --> Tax treatment for dividends
    • addAdvanceTaxTransaction --> Advance tax payment
    • addCreditcardStatementTransaction --> Credit card transactions
    • addAccountStatementTransaction --> Giro account transactions
    • addDepotStatementTransaction --> Securities account transactions (settlement account)
    • addTaxStatementTransaction --> Tax settlement
    • addDeliveryInOutBoundTransaction --> Inbound and outbound deliveries
    • addTransferInOutBoundTransaction --> Inbound and outbound transfers
    • addReinvestTransaction --> Reinvestment transaction
    • addTaxReturnBlock --> Tax refund
    • addFeeReturnBlock --> Fee refund
  • Bank name
    • getLabel --> display label of bank/broker, e.g., Deutsche Bank Privat- und Geschäftskunden AG
  • Taxes and fees
    • addTaxesSectionsTransaction --> handling of taxes
    • addFeesSectionsTransaction --> handling of fees
  • Override the value extractor methods if the documents use a locale other than the standard ones (English, German).
  • Add post-processing of imported transactions using a postProcessing method.
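
To illustrate why the locale matters for the value extractor methods mentioned above: the same amount is written differently in German and US documents, and parsing it with the wrong locale gives a different number. The following is a minimal sketch with the JDK's NumberFormat. The class LocaleAmountSketch and the method toCents are hypothetical and do not reproduce PP's extractor methods.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical helper, not part of PP.
final class LocaleAmountSketch
{
    // Parses an amount string for the given locale and returns whole cents.
    static long toCents(String value, Locale locale)
    {
        // Assumption: for common locales the JDK returns a DecimalFormat here.
        DecimalFormat format = (DecimalFormat) NumberFormat.getNumberInstance(locale);
        format.setParseBigDecimal(true); // avoid binary floating point
        try
        {
            BigDecimal parsed = (BigDecimal) format.parse(value);
            return parsed.movePointRight(2).setScale(0, RoundingMode.HALF_UP).longValueExact();
        }
        catch (ParseException e)
        {
            throw new IllegalArgumentException("Not a number: " + value, e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(toCents("1.234,56", Locale.GERMANY)); // 123456
        System.out.println(toCents("1,234.56", Locale.US)); // 123456
        // German notation read with the US locale does not yield 123456
        System.out.println(toCents("1.234,56", Locale.US));
    }
}
```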

Naming Conventions for Detected Values

The importers follow the scheme below. Use these names for the detected values wherever possible:

  • Type (Optional)
    • type --> Switches the transaction type within a pair (e.g. from purchase to sale)
  • Security identification
    • name --> Security name
    • isin --> International Securities Identification Number
    • wkn --> Security code number
    • tickerSymbol --> Ticker symbol (Optional)
    • currency --> Security currency
  • Shares of the transaction
    • shares --> Shares
  • Date and time
    • date --> Date
    • time --> Time (Optional)
  • Total amount (With fees and taxes)
    • amount --> Amount e.g. 123,15
    • currency --> Currency of the total amount
  • Foreign currency
    • gross --> Total amount in transaction currency without fees and taxes
    • currency --> Currency of the total amount
    • fxGross --> Total amount in foreign currency without fees and taxes
    • fxCurrency --> Currency of the total amount in foreign currency
  • Exchange rate
    • exchangeRate --> Foreign currency exchange rate
    • baseCurrency --> Base currency
    • termCurrency --> Foreign currency
  • Notes (Optional)
    • note --> Notes e.g. quarterly dividend
  • Tax section
    • tax --> Amount
    • currency --> Currency
    • withHoldingTax --> Withholding tax
    • creditableWithHoldingTax --> Creditable withholding tax
  • Fee section
    • fee --> Amount
    • currency --> Currency
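
In a regular expression, these names typically appear as named capture groups. The sketch below uses plain java.util.regex with a made-up line layout. The class NamedGroupSketch and the sample line are assumptions for illustration, not code from an existing importer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the naming scheme; not taken from a PP importer.
final class NamedGroupSketch
{
    // Made-up layout: "<name> <isin> <shares> <amount> <currency>"
    private static final Pattern LINE = Pattern.compile("^(?<name>.*) "
                    + "(?<isin>[A-Z]{2}[A-Z0-9]{9}[0-9]) "
                    + "(?<shares>[\\.,\\d]+) "
                    + "(?<amount>[\\.,\\d]+) "
                    + "(?<currency>[A-Z]{3})$");

    // Returns the detected values keyed by the conventional names,
    // or an empty map if the line does not match.
    static Map<String, String> detect(String line)
    {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher m = LINE.matcher(line);
        if (m.matches())
        {
            for (String key : new String[] { "name", "isin", "shares", "amount", "currency" })
                values.put(key, m.group(key));
        }
        return values;
    }

    public static void main(String[] args)
    {
        System.out.println(detect("Muster ETF IE00BKM4GZ66 12 1.234,56 EUR"));
    }
}
```

Keeping the group names identical across importers makes it easier to compare a new importer with an existing one.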

A finished PDF importer that can serve as a basis is, for example, the V-Bank AG PDF importer.

Auxiliary classes

Standardized conversions are invoked through AbstractPDFExtractor.java and implemented in the utility class PDFExtractorUtils.java. The PDFExchangeRate class helps with processing foreign currencies.

Use the Money class when working with amounts (it includes the currency and the value rounded to cents). Use BigDecimal for exchange rates and the conversion between currencies.
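
The reason for BigDecimal is that binary floating point cannot represent most decimal fractions exactly. The following sketch shows a conversion with a BigDecimal exchange rate. It does not use PP's Money class: the class ExchangeRateSketch, the representation of the amount as a long in cents, and the HALF_UP rounding are assumptions made for this example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration; PP's Money class is not used here.
final class ExchangeRateSketch
{
    // Converts an amount in cents with a BigDecimal rate and rounds to whole cents.
    static long convertCents(long cents, BigDecimal exchangeRate)
    {
        return BigDecimal.valueOf(cents).multiply(exchangeRate)
                        .setScale(0, RoundingMode.HALF_UP).longValueExact();
    }

    // True if a + b equals the expected sum when computed with BigDecimal.
    static boolean decimalSumEquals(String a, String b, String expected)
    {
        return new BigDecimal(a).add(new BigDecimal(b)).compareTo(new BigDecimal(expected)) == 0;
    }

    public static void main(String[] args)
    {
        // 100.00 in the base currency at a rate of 0.9173
        System.out.println(convertCents(10000, new BigDecimal("0.9173"))); // 9173
        // double arithmetic is inexact, BigDecimal is exact
        System.out.println(0.1 + 0.2 == 0.3); // false
        System.out.println(decimalSumEquals("0.1", "0.2", "0.3")); // true
    }
}
```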

Use the TextUtil class for string manipulation such as trimming strings and stripping whitespace characters. The text created from PDF files has some corner cases that are not supported by the usual Java methods.
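
One corner case of this kind, as far as the standard library is concerned, is the no-break space (U+00A0): neither String.trim() nor String.strip() removes it. The sketch below demonstrates the behavior with a hand-written helper. WhitespaceSketch and stripAll are hypothetical names and do not show how TextUtil is implemented.

```java
// Hypothetical illustration; this is not the implementation of TextUtil.
final class WhitespaceSketch
{
    // Removes leading and trailing whitespace, including the no-break space.
    static String stripAll(String value)
    {
        int start = 0;
        int end = value.length();
        while (start < end && isBlank(value.charAt(start)))
            start++;
        while (end > start && isBlank(value.charAt(end - 1)))
            end--;
        return value.substring(start, end);
    }

    private static boolean isBlank(char c)
    {
        // isSpaceChar covers Unicode space separators such as U+00A0,
        // which isWhitespace deliberately excludes.
        return Character.isWhitespace(c) || Character.isSpaceChar(c);
    }

    public static void main(String[] args)
    {
        String raw = "\u00A0751,68\u00A0";
        System.out.println(raw.trim().length()); // 8, nothing removed
        System.out.println(raw.strip().length()); // 8, nothing removed
        System.out.println(stripAll(raw).length()); // 6
    }
}
```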

Formatting of PDF Importer

Due to the many comments with text fragments from the PDF documents, we do not auto-format the PDF importer class files. Instead, carefully insert new code into the existing formatting manually. To protect a section from automatic formatting, enclose it in @formatter:off and @formatter:on comments.
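
A protected section could look like the following sketch. The class FormatterTagSketch, the PDF fragment in the comment, and the expression are made up for illustration. Note that Eclipse only honors these tags if "Off/On Tags" are enabled in the active formatter profile; whether the project's formatter configuration enables them is assumed here.

```java
// Hypothetical illustration of a section protected from the Eclipse formatter.
final class FormatterTagSketch
{
    // @formatter:off
    // Made-up text fragment from a PDF document, kept next to its expression:
    // Kauf 10 Anteile        ISIN IE00BKM4GZ66
    static final String BUY_LINE = "^Kauf (?<shares>[\\.,\\d]+) Anteile[\\s]+"
                                 + "ISIN (?<isin>[A-Z]{2}[A-Z0-9]{9}[0-9])$";
    // @formatter:on

    public static void main(String[] args)
    {
        String line = "Kauf 10 Anteile        ISIN IE00BKM4GZ66";
        System.out.println(line.matches(BUY_LINE)); // true
    }
}
```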

Please take a look at the formatting and structure in the other PDF importers! Example: V-Bank AG

Test Cases

Via the application menu, users can create a test case file. The test file is the extracted text from the PDF documents. Users then anonymize the text by replacing personally identifiable information and account numbers with alternative text.

  • The test files should not be modified beyond the anonymization
  • All source code (including the test files) is stored in UTF-8 encoding
  • Follow the naming convention for test files (type in the local language, two-digit counter):
    • Buy01.txt, Sell01.txt --> Purchase and sale (single settlements)
    • Dividend01.txt --> Dividends (single statements)
    • SteuermitteilungDividende01.txt --> Tax settlement for dividends (single settlement)
    • SammelabrechnungKaufVerkauf01.txt --> Purchase and sale (multiple settlements)
    • Wertpapiereingang01.txt --> Incoming securities
    • Wertpapierausgang01.txt --> Outgoing securities
    • Vorabpauschale01.txt --> Advance taxes
    • GiroKontoauszug01.txt --> Giro account statement
    • KreditKontoauszug01.txt --> Credit card account statement
    • Depotauszug01.txt --> Securities account transaction history (settlement account)
  • Samples
    • one transaction per PDF: Erste Bank Gruppe - see testWertpapierKauf06() and testDividende05()
    • supporting securities with multiple currencies: Erste Bank Gruppe with testWertpapierKauf09() / testWertpapierKauf09WithSecurityInEUR() and testDividende10()/testDividende10WithSecurityInEUR()
      • Background: in the PP model, the currency of the transaction must always match the currency of the security and its historical prices. However, sometimes securities are purchased on a different exchange with prices in another currency. The importers try to handle this case automatically. This is reflected in the two test cases
    • multiple transactions per PDF: DKB AG with testGiroKontoauszug01()
    • if transactions are created based on two separate PDF files, use post processing: Comdirect with testDividendeWithTaxTreatmentForDividende01() and testDividendeWithTaxTreatmentReversedForDividende01()

Regular Expressions

To test regular expressions, you can use https://regex101.com/.

Besides general good practices for regular expressions, keep in mind:

  • all special characters in the PDF document (äöüÄÖÜß as well as e.g. circumflex or similar) should be matched by a . (dot) because the PDF to text conversion can create different results
  • an expression in .match(" ... ") starts with the anchor ^ and ends with $
  • with .find(" ... "), do not add anchors as they are added automatically
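
The effect of the dot and of the anchors can be tried out with plain java.util.regex. This is a minimal sketch: it does not call PP's .match() or .find() methods, and the class AnchorAndDotSketch and the sample lines are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical illustration using the JDK regex classes only.
final class AnchorAndDotSketch
{
    // The dot stands in for the umlaut, whatever character the conversion produced.
    static final Pattern WITH_DOT = Pattern.compile("^St.ck (?<shares>[\\.,\\d]+)$");

    // Spelling out the umlaut only works if the conversion kept it intact.
    static final Pattern WITH_UMLAUT = Pattern.compile("^St\u00FCck (?<shares>[\\.,\\d]+)$");

    static boolean matchesLine(Pattern pattern, String line)
    {
        return pattern.matcher(line).find();
    }

    public static void main(String[] args)
    {
        System.out.println(matchesLine(WITH_DOT, "St\u00FCck 10")); // true
        System.out.println(matchesLine(WITH_DOT, "St?ck 10")); // true
        System.out.println(matchesLine(WITH_UMLAUT, "St?ck 10")); // false
        // the anchors tie the expression to the whole line
        System.out.println(matchesLine(WITH_DOT, "10 St\u00FCck 10 EUR")); // false
    }
}
```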

Keep in mind that the regular expressions work against text that is automatically created from PDF files. Due to the nature of the process, there can always be slight differences in the text files. The following table collects the regular expressions that worked well to match typical values.

| Value | Example | Not Helpful | Works Well |
| --- | --- | --- | --- |
| Date | 01.01.1970 | \\d+.\\d+.\\d{4} | [\\d]{2}\\.[\\d]{2}\\.[\\d]{4} |
| | 1.1.1970 | \\d+.\\d+.\\d{4} | [\\d]{1,2}\\.[\\d]{1,2}\\.[\\d]{4} |
| Time | 12:01 | \\d+:\\d+ | [\\d]{2}\\:[\\d]{2} |
| ISIN | IE00BKM4GZ66 | \\w+ | [A-Z]{2}[A-Z0-9]{9}[0-9] |
| | | [\\w]{12} | |
| WKN | A111X9 | \\w+ | [A-Z0-9]{6} |
| | | [\\w]{6} | |
| Amount | 751,68 | [\\d,.]+ | [\\.,\\d]+ |
| | | | [\\.\\d]+,[\\d]{2} |
| | 74'120.00 | [\\d.']+ | [\\.'\\d]+ |
| | 20 120.00 | [\\d.\\s]+ | [\\.\\d\\s]+ |
| Currency | EUR | \\w+ | [A-Z]{3} |
| | | [\\w]{3} | |
| Currency | € or $ | \\D | \\p{Sc} |
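
The difference between the two columns becomes visible when the expressions are run against text that is not the intended value. The sketch below compares the date and ISIN expressions from the table using java.util.regex. The class TableRegexSketch and the counterexamples are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical illustration comparing expressions from the table.
final class TableRegexSketch
{
    static final String LOOSE_DATE = "\\d+.\\d+.\\d{4}";
    static final String STRICT_DATE = "[\\d]{2}\\.[\\d]{2}\\.[\\d]{4}";
    static final String LOOSE_ISIN = "\\w+";
    static final String STRICT_ISIN = "[A-Z]{2}[A-Z0-9]{9}[0-9]";

    // True if the whole text matches the expression.
    static boolean matches(String regex, String text)
    {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args)
    {
        // both expressions accept a real date
        System.out.println(matches(LOOSE_DATE, "01.01.1970")); // true
        System.out.println(matches(STRICT_DATE, "01.01.1970")); // true
        // the unescaped dot also accepts a plain run of digits
        System.out.println(matches(LOOSE_DATE, "1234567890")); // true
        System.out.println(matches(STRICT_DATE, "1234567890")); // false
        // \w+ accepts any word, the strict expression only an ISIN-shaped value
        System.out.println(matches(LOOSE_ISIN, "Depotnummer")); // true
        System.out.println(matches(STRICT_ISIN, "Depotnummer")); // false
        System.out.println(matches(STRICT_ISIN, "IE00BKM4GZ66")); // true
    }
}
```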