Anton Keks
14 Mar 2013

Online bank from scratch in five months

Recently, here at Codeborne, we created an advanced online bank for Bank St. Petersburg from scratch in just five months.

This mid-size Russian bank is a major player in its home city and region. We decided to share how we ran this project and to describe some of the implementation details.

Agile

We, a gang of four, started development in late April 2012. We had to integrate with back-end systems that were new to us: the core and processing systems. Fortunately, we were given direct access to the bank’s specialists who were familiar with those systems. By access we mean Skype contacts, emails, and phone numbers. We had no dedicated analysts on our team, so we just started writing code without any formal requirements, but with some stories in Pivotal Tracker.

Although the developers were in Tallinn (Estonia) and the rest of the team was in St. Petersburg (Russia), we stayed in close contact: daily Skype and email conversations, regular video conferences (several times a week), and mandatory meetings in the bank’s office in St. Petersburg every two weeks. We even had a chance to ride there on motorbikes several times, which added more fun to the project. In June, we also got a taste of local life when we lived and worked on site for almost three weeks. We spent the working days learning the specifics of Russian online payments and government taxes, and the famous White Nights crawling local bars and clubs.

We worked in two pairs (yes, pair programming) and switched pairs every day, so it was fun and everyone knew all the code. By doing this we effectively increased the truck factor of our team to four. Already on 1 June, about a month after the start, we made the very first release to production, connecting to the real systems with real financial data. Initially, we restricted access to the application so that only the bank’s employees could try it out. We started with really basic stuff: the account statement and payments between a customer’s own accounts. Regular builds with new functionality and almost daily deployments to production gave us a fast feedback loop and allowed us to move forward very efficiently.

To sum it up, we did agile software development, more precisely XP (Extreme Programming). It was a proven path: we had used it very successfully when working at Swedbank Estonia and had practiced it with all of our other customers.

Architecture? Play!

Actually, we do not like to use the word “architecture” in the context of IT. The project lead, Anton Keks, has spoken many times at IT conferences across Europe, attacking the traditional concepts of IT architecture (there are some videos in Russian on the habrahabr.ru community site). Martin Fowler has a really good definition of software architecture: “things that people perceive as hard to change”. We strongly believe that our code is not the place for such things. Instead, we like simplicity and follow Bob Martin’s clean code movement. So, we picked the Play Framework.

“What? Play Framework for online banking? You’re probably kidding!” - that was the first reaction of other developers we know. Let’s see what was behind the decision:

  • We wanted to use Java, because we have a lot of experience with both the language and the platform. Java is not dead; one just has to use it properly.
  • We had a small team and a tight schedule, so we had to be as productive as we could. Good productivity would be impossible without IntelliJ IDEA, an extremely good tool that, incidentally, is also developed in St. Petersburg.
  • Java runs on Linux, which is fast and easy to administer. Later, after we went to general availability, the system administrators confirmed that our platform was a better experience than the inconvenient Windows-based solution it replaced.
  • Play makes Java development faster by making redeploys unnecessary. It automatically recompiles and reloads changed code on the fly, saving a lot of time and hassle.
  • Finally, Play solves many common problems of web development (starting with the structure of the project) in a very elegant way, bringing the ease of dynamic languages to the world of Java. In Play, one can find many ideas inspired by Ruby on Rails, Django for Python, and even Node.js.

Last but not least, we had years of experience developing online banking and self-service systems on the Internet. Naturally, we knew what we wanted to create: a cozy online banking platform.

Playing more

Play Framework is completely stateless. It is designed just like HTTP itself: the server does not know anything about the user session, nor does it allocate extra memory for each active user, so the application scales linearly. We can handle twice as many users by doubling the number of servers. So far, we use DNS-based load balancing between the servers, and the hardware load is not very high. Since we keep the user’s session only in a cookie, when one of the servers is temporarily down, users are silently redirected to the other server without noticing it and without losing their session.

This gives us the opportunity to update the platform’s software at any time, even during working hours, without losing active users. On launch day we updated the application four times while receiving more than 10 requests per second from users. It is worth mentioning that the user data in cookies is cryptographically signed, so that no one can forge it.
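To illustrate how a signed session cookie resists forgery, here is a minimal sketch using HMAC-SHA256 from the standard Java library. It is an illustration only, not Play’s actual implementation: the class name `SignedCookie`, the `signature-payload` layout, and the choice of algorithm are all assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: signs and verifies a cookie value with a server-side secret.
class SignedCookie {
    private final byte[] secret;

    SignedCookie(String secret) {
        this.secret = secret.getBytes(StandardCharsets.UTF_8);
    }

    /** Returns "signature-payload", where signature is the hex HMAC of the payload. */
    String sign(String payload) {
        return hmacHex(payload) + "-" + payload;
    }

    /** Returns the payload if the signature matches, or null if the value was tampered with. */
    String verify(String cookieValue) {
        int dash = cookieValue.indexOf('-'); // hex never contains '-', so this is the separator
        if (dash < 0) return null;
        String signature = cookieValue.substring(0, dash);
        String payload = cookieValue.substring(dash + 1);
        byte[] expected = hmacHex(payload).getBytes(StandardCharsets.UTF_8);
        byte[] actual = signature.getBytes(StandardCharsets.UTF_8);
        // Constant-time comparison avoids leaking how many characters matched.
        return MessageDigest.isEqual(expected, actual) ? payload : null;
    }

    private String hmacHex(String data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            StringBuilder hex = new StringBuilder();
            for (byte b : mac.doFinal(data.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SignedCookie cookies = new SignedCookie("server-side-secret");
        String value = cookies.sign("user=42");
        System.out.println("valid:  " + cookies.verify(value));
        String forged = value.substring(0, value.indexOf('-')) + "-user=43";
        System.out.println("forged: " + cookies.verify(forged));
    }
}
```

Because every server shares the same secret, any of them can verify the cookie without a shared session store, which is what lets a user move between servers unnoticed.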

To hide the low performance of the back-end systems, we use Play’s built-in integration with memcached, a distributed cache that is not a standard solution in the Java world. We cache heavily, minimizing the number of queries to the core banking systems. We even decided to prefetch the list of the customer’s accounts and cards into the cache while he or she deals with the second step of two-step authentication (e.g. receives and enters the SMS confirmation code), so the overview page is displayed much more quickly than would otherwise be possible. We are able to do this thanks to Play’s built-in support for background tasks, called “jobs”. As soon as the customer enters the correct password (the first step of authentication), we simply run the appropriate job asynchronously in a separate thread. We use jobs for many background tasks, from processing recurring payments to receiving bank messages via SMTP.
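The prefetching idea can be sketched in plain Java. This is not Play’s job API or a memcached client: a `ConcurrentHashMap` stands in for the distributed cache, `CompletableFuture.runAsync` stands in for a background job, and the `AccountPrefetcher` class and its `coreBanking` function are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class AccountPrefetcher {
    // Stand-in for memcached; a real deployment would use a distributed cache client.
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    // Hypothetical slow call to the core banking system.
    private final Function<String, List<String>> coreBanking;

    AccountPrefetcher(Function<String, List<String>> coreBanking) {
        this.coreBanking = coreBanking;
    }

    /** Called right after the password check succeeds; returns immediately. */
    CompletableFuture<Void> prefetch(String customerId) {
        return CompletableFuture.runAsync(
            () -> cache.put(customerId, coreBanking.apply(customerId)));
    }

    /** Called by the overview page; hits the back end only if nothing is cached yet. */
    List<String> accounts(String customerId) {
        return cache.computeIfAbsent(customerId, coreBanking);
    }

    public static void main(String[] args) {
        AtomicInteger backEndCalls = new AtomicInteger();
        AccountPrefetcher prefetcher = new AccountPrefetcher(id -> {
            backEndCalls.incrementAndGet();
            return List.of("account-of-" + id);
        });
        // The customer is typing the SMS code while this runs in the background.
        prefetcher.prefetch("42").join();
        System.out.println(prefetcher.accounts("42") + ", back-end calls: " + backEndCalls.get());
    }
}
```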

Since we have already started talking about asynchrony, we should note Play’s ability to serve multiple concurrent HTTP requests in a single thread, similarly to Node.js. This saves memory and makes it possible to serve a much larger number of users per thread than traditional Java application servers can. Play provides an excellent asynchronous framework based on promises and continuations, which performs most I/O operations asynchronously and releases threads for other tasks in the meantime.
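The promise style can be illustrated with the JDK’s own `CompletableFuture`. This is a sketch of the general technique and not Play’s promise API; the two back-end calls and their return values are invented for the example.

```java
import java.util.concurrent.CompletableFuture;

class AsyncOverview {
    // Hypothetical back-end calls: each returns at once with a promise of the result.
    static CompletableFuture<Integer> accountCount(String customerId) {
        return CompletableFuture.supplyAsync(() -> 3);
    }

    static CompletableFuture<Integer> cardCount(String customerId) {
        return CompletableFuture.supplyAsync(() -> 2);
    }

    /** Both calls run concurrently; the results are combined in a callback when both arrive. */
    static CompletableFuture<String> overview(String customerId) {
        return accountCount(customerId).thenCombine(cardCount(customerId),
            (accounts, cards) -> accounts + " accounts, " + cards + " cards");
    }

    public static void main(String[] args) {
        // join() is used here only to print the result in a command-line demo.
        System.out.println(overview("42").join());
    }
}
```

The calling thread only registers a callback and is then free to serve other requests, which is the property the paragraph above describes.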

Testing

Where Play did not meet our expectations was automated testing, the only reliable way to make several updates per day to a production system without regressions.

Although Play supports automated tests, we think the support is not perfect: we do not want to run unit tests in a browser, and we wanted to practice TDD (Test-Driven Development). We first write tests for functionality that is not yet implemented, and only then implement the new features in the code, which makes the tests a measure of readiness to deliver. In practice, most of the functionality can be covered by unit tests, which are much faster than integration tests. When the application needs to work with third-party systems, usually via a text-based protocol (most often XML or JSON), we (1) make a test query, (2) save the textual answer in its original form, and (3) write a unit test for input validation and data interpretation. This is very convenient both for developing our app and for documenting the behavior of the third-party system.
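The three steps above can be sketched as follows. The XML format, the element names, and the `BalanceResponseParser` class are invented for illustration; a real recorded answer would be kept exactly as the third-party system returned it, and the checks in `main` would normally live in a JUnit test.

```java
import java.io.ByteArrayInputStream;
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class BalanceResponseParser {
    // Steps 1 and 2: an answer recorded once from a test query (hypothetical format).
    static final String RECORDED_RESPONSE =
        "<account><number>40817810000000000001</number><balance>1234.50</balance></account>";

    /** Step 3 target: validates the answer and interprets the balance in minor units. */
    static long parseBalanceInKopecks(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Response is not well-formed XML", e);
        }
        NodeList balances = doc.getElementsByTagName("balance");
        if (balances.getLength() != 1) {
            throw new IllegalArgumentException("Expected exactly one <balance> element");
        }
        return new BigDecimal(balances.item(0).getTextContent().trim())
            .movePointRight(2).longValueExact();
    }

    public static void main(String[] args) {
        // Data interpretation: the recorded answer is parsed without touching the network.
        long balance = parseBalanceInKopecks(RECORDED_RESPONSE);
        if (balance != 123450L) throw new AssertionError("unexpected balance: " + balance);

        // Input validation: an answer without a balance must be rejected.
        try {
            parseBalanceInKopecks("<account/>");
            throw new AssertionError("missing balance was accepted");
        } catch (IllegalArgumentException expected) {
            System.out.println("all checks passed");
        }
    }
}
```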

For the rest we use Selenide, an open-source library developed by our company and based on Selenium WebDriver. It allows us to write concise UI tests that run in real browsers. When you run the tests with JUnit, it automatically (1) starts the whole web app with an in-memory database as a back end, (2) creates all the required tables, (3) inserts test data into those tables, (4) starts your browser and finally (5) executes the test cases. Startup takes just a few seconds, which is important for efficient development. All the tests (about 250 cases at the moment), covering all of the critical and most of the important functionality of the application, run in a little over 5 minutes, which gives us fairly rapid feedback in case of regressions. Of course, to achieve this speed we went through many stages of optimization. The main motivation for the optimization was that all the tests are an essential and inseparable part of every build of the software; they do not run somewhere separately, as is done in most projects.

Builds

We use Jenkins for building our software. It monitors changes in the code and, if any are detected, immediately makes a new build (including running all the UI tests). Automatically. When we need to deliver, we simply take the latest build, and voilà! When we need a quick fix, it is sufficient to commit the change to the version control system (we use git), and in 5 minutes the fix can be installed in the production environment. Typically, these urgent builds also contain other non-urgent changes, but as they have already been tested, there is usually no problem with including them. We like to make frequent releases, which keeps the number of changes in every installation small, so the probability of failure is low and rolling back to a previous version (if something goes wrong) is also low-risk.

Play also has built-in support for changes in database structure: evolution scripts, sometimes called delta scripts. Every time a new build is installed, Play checks the current revision of the app’s database and applies all the changes since the previous build, such as creating new tables and indexes, adding columns, and migrating data. Every single build of our online bank contains a full set of scripts that can be used to recreate the entire database from scratch (except for the data). This makes it very easy to set up a new development (or test) environment.
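The core of the delta-script idea is choosing which scripts still need to run. The sketch below shows only that selection step, assuming scripts are numbered by revision; it is not Play’s implementation, and the `Evolutions` class and the SQL statements are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class Evolutions {
    /** Returns the scripts with a revision above currentRevision, in ascending order. */
    static List<String> pending(Map<Integer, String> scriptsByRevision, int currentRevision) {
        TreeMap<Integer, String> ordered = new TreeMap<>(scriptsByRevision);
        return new ArrayList<>(ordered.tailMap(currentRevision, false).values());
    }

    public static void main(String[] args) {
        Map<Integer, String> scripts = Map.of(
            1, "CREATE TABLE payment (id BIGINT PRIMARY KEY)",
            2, "ALTER TABLE payment ADD COLUMN amount DECIMAL(19, 2)",
            3, "CREATE INDEX payment_amount_idx ON payment (amount)");

        // A database at revision 1 needs scripts 2 and 3.
        pending(scripts, 1).forEach(System.out::println);

        // An empty database (revision 0) gets the full set, recreating the schema from scratch.
        System.out.println(pending(scripts, 0).size() + " scripts for a new environment");
    }
}
```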

Simplicity / Usability

Despite all the bells and whistles and an effective way of developing, no one will use a new online bank if it is complex or incomprehensible to the user. “Don’t Make Me Think” is a famous book by Steve Krug on the subject.

Early in the development, we set a goal to make online banking clear to the average user, not just accountants or economists. Normal people should not have to know what exactly an overdraft account is, or look at three separate statements to understand where they have spent their money. That’s why a good online bank needs to considerably transform and combine the low-level data coming from the underlying banking system. Nor should it overwhelm the user with an over-abundance of details. Everything should be brief and to the point.

While simplifying the payment forms (which in Russia are horribly complicated), we got an idea that the bank later advertised as “smart payment”: what if we initially ask the user only one thing, “to whom?” Users can simply start typing whatever they have: a person’s name, a company name, someone’s account or card number. The system then guesses what exactly was entered and loads the other necessary data, like Google search. We started with a prototype and everyone liked it! Of course, the idea can be improved even further, but this is already a matter of technology.
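A toy version of the guessing step might look like this. The rules are assumptions made for illustration (16 digits for a card number, 20 digits for an account number, anything else treated as a name) and say nothing about how the real “smart payment” recognizes input; `RecipientGuesser` is a hypothetical name.

```java
class RecipientGuesser {
    enum Kind { CARD_NUMBER, ACCOUNT_NUMBER, NAME }

    /** Guesses what the user typed into the single "to whom?" field. */
    static Kind guess(String input) {
        // Users often group digits with spaces or dashes, so strip those first.
        String compact = input.replaceAll("[\\s-]", "");
        if (compact.matches("\\d{16}")) return Kind.CARD_NUMBER;    // assumed card length
        if (compact.matches("\\d{20}")) return Kind.ACCOUNT_NUMBER; // assumed account length
        return Kind.NAME; // a person or company name, to be looked up by a search
    }

    public static void main(String[] args) {
        System.out.println(guess("4111 1111 1111 1111"));
        System.out.println(guess("40817810000000000001"));
        System.out.println(guess("Ivan Petrov"));
    }
}
```

A real system would add further checks (for example, a checksum on the digits) and would then load the remaining payment details for the recognized recipient.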

Another important aspect is validating ideas on real people: housewives, schoolchildren, company employees, etc. For this, the bank held usability testing, in which volunteers were asked to complete specific tasks (for example, “Pay $100 to your grandmother”) while observers watched how quickly and easily they could find a way to do that. This is another kind of feedback that is very helpful in creating a successful and intuitive product. It is a much more reasonable investment than writing boring user documentation, which no one reads anyway.

In case errors still occur in production (no one is perfect), instead of showing technical nonsense to the user, we show a form with an error code and the possibility to describe what led to the error, or simply to express their emotions. Upon receiving this kind of feedback message, we can quickly find the cause of the error in the logs, and even respond to the user once it has been fixed. This really helped us in the first days of the public launch.

Security

One could write a lot about the security of online banking. In short, Play Framework (when used properly) can prevent such common attack vectors as XSS (cross-site scripting) and SQL injection out of the box: everything that is shown to the user passes through compulsory escaping, and all values that go to the database via JPA / JDBC are sent separately from the query text, so they cannot alter the query through cleverly constructed input. This is an elementary practice, but, nevertheless, security experts from a Russian company specializing in penetration testing admitted that ours was the first application they had ever tested in which no serious vulnerabilities were found. We were very pleased, but of course the framework alone cannot solve all the potential security problems. Developers must always remember to check such things as whether incoming IDs or account numbers belong to the active session. Only attention to detail and years of experience developing web-based applications can help to avoid leaving a hole somewhere. Just in case, we even had to change Play itself and disable the automatic loading of objects from the database by IDs coming from request parameters. There are also more clever attacks like CSRF (cross-site request forgery), HTTP response splitting, session hijacking, replay attacks and so on, which all need to be prevented by the programmer.
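The ownership check mentioned above is simple to express, which is exactly why it is easy to forget. Here is a minimal sketch, assuming the set of accounts per customer is already known; `AccountAccess` and its method are hypothetical names, not part of Play.

```java
import java.util.Map;
import java.util.Set;

class AccountAccess {
    // Accounts owned by each customer; in a real system this comes from the back end.
    private final Map<String, Set<String>> accountsByCustomer;

    AccountAccess(Map<String, Set<String>> accountsByCustomer) {
        this.accountsByCustomer = accountsByCustomer;
    }

    /** Returns the account ID only if it belongs to the customer of the current session. */
    String requireOwned(String sessionCustomerId, String requestedAccountId) {
        Set<String> owned = accountsByCustomer.getOrDefault(sessionCustomerId, Set.of());
        if (requestedAccountId == null || !owned.contains(requestedAccountId)) {
            // Never trust an ID just because it arrived in a request parameter.
            throw new SecurityException("Account does not belong to the active session");
        }
        return requestedAccountId;
    }

    public static void main(String[] args) {
        AccountAccess access = new AccountAccess(Map.of(
            "alice", Set.of("acc-1", "acc-2"),
            "bob", Set.of("acc-3")));
        System.out.println("allowed: " + access.requireOwned("alice", "acc-2"));
        try {
            access.requireOwned("alice", "acc-3"); // bob's account, requested in alice's session
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```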

In terms of user-visible security, we of course use two-factor authentication, personal greetings and avatars, and show the time and location of the last login, all of which make social engineering attacks such as phishing more difficult. Overall, we tried not to compromise the usability of the application too much, securing things in other ways. In our online bank users are perfectly able to use all the standard browser buttons such as back / forward / refresh. Interestingly, some users, accustomed to being punished with an invalidated session after pressing one of these buttons, still ask customer support, “Where is the refresh button on the page?”.

Another important security factor is proper auditability: all user actions are logged in a compact format suitable for processing with standard Unix utilities, with references to a unique request ID and session. So the choice of the operating system on which the online bank runs (Linux) also matters here.

Summary

To sum it up, we enjoyed working on this project, and we are still adding new features. We truly hope that the simplicity and clarity of the new online bank will be appreciated by the customers of Bank St. Petersburg. And for us, the developers, there is nothing better than our code running in production, making users’ lives a little easier!
