
What is this?

This project contains scripts to automatically extract html financial statements with Selenium and parse them with PyQuery. The parsed transactions are stored in a database, and can be browsed and searched through a Django application.
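To make the parsing step concrete, here is a minimal sketch of turning a statement table into transaction records. It is not this project's parser: it uses the standard library's html.parser as a stand-in for PyQuery so that it runs without extra dependencies, and the markup in SAMPLE_HTML, the class name, and the field names are all assumptions made for illustration.

```python
from decimal import Decimal
from html.parser import HTMLParser

# Hypothetical statement markup; real bank pages will look different.
SAMPLE_HTML = """
<table id="transactions">
  <tr><th>Date</th><th>Description</th><th>Amount</th></tr>
  <tr><td>2015-03-01</td><td>Grocery store</td><td>-42.50</td></tr>
  <tr><td>2015-03-02</td><td>Salary</td><td>2500.00</td></tr>
</table>
"""


class StatementTableParser(HTMLParser):
    """Collects the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            if self._row:  # header rows contain only <th> cells and stay empty
                self.rows.append(self._row)
            self._row = None


def parse_transactions(html):
    """Return one dict per data row; amounts are Decimal to avoid float rounding."""
    parser = StatementTableParser()
    parser.feed(html)
    return [
        {"date": date.strip(), "description": desc.strip(), "amount": Decimal(amount.strip())}
        for date, desc, amount in parser.rows
    ]


transactions = parse_transactions(SAMPLE_HTML)
```

With PyQuery the row selection would typically be a CSS selector rather than a hand-written state machine, but the resulting records would have the same shape.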

Currently, there are parsers that understand HTML financial statements from these institutions:

You need to have a valid username and password for each bank that you want to use with this application.


In order to get started with this project, you need

Run the application locally

With that installed, you should be able to run the following in a terminal

git clone [email protected]:captnswing/banking.git
cd banking
docker-compose up -d

After all is done, you should be able to see the started containers

$ docker-compose ps

Name                   Command               State                        Ports
banking_db_1    / postgres   Up>5432/tcp
banking_es_1    /elasticsearch/bin/elastic ...   Up>9200/tcp,>9300/tcp
banking_web_1   python runserver ...   Up>8000/tcp

Now, initialize the database

docker-compose run web python ./banking/ migrate --noinput --syncdb

You should now be able to open the views at http://localhost:49174/statements (or open http://$(docker-machine ip docker 2>/dev/null):49174/statements/). They will not show any data yet. For that, you need to run the scripts below.

Note: the port will most likely be different on your machine. Check the output of docker-compose ps to find it.
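If you would rather look the port up from a script than read it off the table, the following sketch shows one way to do it. It assumes that docker-compose port web 8000 prints a single host:port binding such as 0.0.0.0:49174; the function names are made up for this example.

```python
import subprocess


def extract_port(binding):
    """Return the port from a 'host:port' string such as '0.0.0.0:49174'."""
    host, sep, port = binding.strip().rpartition(":")
    if not sep or not port.isdigit():
        raise ValueError("unexpected port binding: %r" % binding)
    return int(port)


def published_web_port(service="web", private_port=8000):
    """Ask docker-compose which host port is mapped to the service's private port."""
    output = subprocess.check_output(
        ["docker-compose", "port", service, str(private_port)], text=True
    )
    return extract_port(output)
```

Only extract_port is pure; published_web_port needs the containers to be running and docker-compose to be on the PATH.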

Collect and ingest the data

To collect statements from the banking sites, invoke

docker-compose run web python ./bin/ --bankname=<bankname> --username=<login> --password=<pwd>

for each combination of <bankname> and <username>. Use

docker-compose run web python ./bin/ --help

to view the supported banknames. The collected .html files are stored in the folder specified in BANKING_OFFLINE_DATADIR (the default is ./data/).

To parse the collected html files and save all transactions into the database, run

docker-compose run web python ./bin/

webport=`docker-compose port web 8000 | sed -E 's|.*:(.*)|\1|g'`
dockerhost=`docker-machine ip docker 2>/dev/null`
open http://$dockerhost:$webport

Now the views at http://localhost:49174/statements (or open http://$(docker-machine ip docker 2>/dev/null):49174/statements/) should show you the parsed transactions.

Create or update the index

In order to be able to search, you need to create the search index

docker-compose run web python banking/ rebuild_index --noinput

The command

docker-compose run web python banking/ update_index --age=2

lets you update the index with transactions that have been modified or added in the last (in this example) 2 hours.
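Conceptually, the age option selects only the records whose modification time falls inside the given window. The sketch below illustrates that cutoff logic on plain dictionaries. It is not the indexing code itself, and the "modified" field name is an assumption.

```python
from datetime import datetime, timedelta


def modified_within(records, age_hours, now):
    """Return the records modified during the last `age_hours` hours before `now`."""
    cutoff = now - timedelta(hours=age_hours)
    return [record for record in records if record["modified"] >= cutoff]


# Illustrative data: one recent and one older transaction.
now = datetime(2015, 3, 1, 12, 0)
records = [
    {"id": 1, "modified": datetime(2015, 3, 1, 11, 0)},
    {"id": 2, "modified": datetime(2015, 3, 1, 9, 0)},
]
recent = modified_within(records, age_hours=2, now=now)
```

Here only the first record is selected, because the second one was modified three hours before `now`.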

Customize settings

Settings are configured in banking/ For example, set BANKING_OFFLINE_DATADIR to the name of the folder where the extracted .html files should be saved. The default is ./data/.
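As a rough illustration of how such a setting might be consumed, the sketch below builds a file location underneath the data directory. Only the setting name and its default come from this README. The helper function and the per-bank, per-login folder layout are hypothetical and may not match what the scripts actually do.

```python
import os

# Setting name and default as documented above.
BANKING_OFFLINE_DATADIR = "./data/"


def statement_path(bankname, username, filename, datadir=BANKING_OFFLINE_DATADIR):
    """Build a hypothetical per-bank, per-login location for a saved .html file."""
    return os.path.join(datadir, bankname, username, filename)


# Example with made-up names; no file is created.
example = statement_path("mybank", "alice", "2015-03.html")
```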

Use the Django admin

To use the Django admin interface (for example, to give your bank accounts a name), you need a superuser. Load one from a fixture:

docker-compose run web python ./banking/ loaddata admin_user

Then, access the admin interface at http://localhost:49174/statements (or open http://$(docker-machine ip docker 2>/dev/null):49174/statements/) and log in with admin/admin.

Tips for debugging the parsers

Sometimes the HTML of the statement pages changes, and the parsers stop matching. I found SelectorGadget useful for finding the right CSS selector expression.
