August 23, 2012

Converting Single-Tenant to Multi-Tenant Apps

Characteristics of a Successful SaaS Application

Scott Chate, the VP of Product at Corent Technologies, describes the characteristics of a successful SaaS application very well in his 2010 post Convert your Web Application to a Multi-Tenant SaaS Solution. According to his post, a successful SaaS application must possess the following characteristics:

  • It must support multi-tenancy
  • It must offer self-service sign-up
  • It must have subscription and billing mechanisms in place
  • It must scale efficiently
  • It must support monitoring and management of tenants
  • It must support user authentication and authorization for each tenant
  • It must support tenant customization

In order to achieve true multi-tenancy, which also allows the highest efficiency, your application should be able to share the database and the application logic among tenants.

But what does this mean for application developers?

Database Redesign

The first step in the application redesign is the introduction of a tenant identifier column in each database table and view. The tenant identifier is used to filter the data that belongs to a particular tenant. This has several implications for application developers:

  • All database scripts need to be changed so that they include the tenant identifier. This includes creation scripts, updates to primary and foreign keys, stored procedures, etc. For example, if you have an order processing application and you used the order number as the primary key, you need to make sure that the primary key now also includes the tenant ID. That way two different tenants can have the same order numbers if their policies require it.
  • As part of the database redesign you need to update the indices on all tables so that they take the tenant ID into account. This ensures that database queries that require tenant-specific information perform well.
  • Next you need to update all database queries made at the business logic tier to include the tenant identifier. This has a direct impact on the source code, and depending on how well your application is architected this may be relatively easy or hard to do. If, for example, there is no designated data access layer and SQL queries are hardcoded and spread all across the code, changing them will be a nightmare.
  • Last but not least, you need to think about how you can scale the database tier. Now that you store data from multiple tenants in the same database, the chances are that you will reach its limits much faster than when you had a separate database for each tenant. You need to think about how to shard the data, and whether you will do this at the application tier or at the data tier.
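
To make the first three points more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The orders table, the OrderRepository class and the tenant names are all made up for illustration; a real application would have its own schema and database engine. The sketch shows a primary key and an index that both lead with the tenant ID, and a small data access layer that adds the tenant filter to every query so the business logic cannot forget it.

```python
import sqlite3


def create_schema(conn):
    # tenant_id is part of the primary key, so order numbers only need to be
    # unique within a single tenant.
    conn.execute(
        """
        CREATE TABLE orders (
            tenant_id    TEXT    NOT NULL,
            order_number INTEGER NOT NULL,
            total        REAL    NOT NULL,
            PRIMARY KEY (tenant_id, order_number)
        )
        """
    )
    # Secondary indices also lead with tenant_id so that tenant-scoped
    # queries can use them.
    conn.execute(
        "CREATE INDEX idx_orders_tenant_total ON orders (tenant_id, total)"
    )


class OrderRepository:
    """Hypothetical data access layer bound to exactly one tenant."""

    def __init__(self, conn, tenant_id):
        self._conn = conn
        self._tenant_id = tenant_id

    def add(self, order_number, total):
        self._conn.execute(
            "INSERT INTO orders (tenant_id, order_number, total) VALUES (?, ?, ?)",
            (self._tenant_id, order_number, total),
        )

    def get_total(self, order_number):
        # Every query filters on tenant_id in addition to the business key.
        row = self._conn.execute(
            "SELECT total FROM orders WHERE tenant_id = ? AND order_number = ?",
            (self._tenant_id, order_number),
        ).fetchone()
        return row[0] if row else None


conn = sqlite3.connect(":memory:")
create_schema(conn)
acme = OrderRepository(conn, "acme")
globex = OrderRepository(conn, "globex")
acme.add(1001, 250.0)
globex.add(1001, 75.5)  # same order number, different tenant
```

Because the tenant ID is fixed when the repository is created, the calling code never passes it around by hand, which is one way to keep the change contained if you already have a data access layer.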

Security

The next big topic you need to consider during the redesign process is security. Although it is always about securing the data, there are two aspects here:

  • Security at runtime
  • Security at the data tier

In the true multi-tenancy case the business logic code is shared among multiple tenants. This means that users from different tenants will be handled by the same code, running not only on the same machine but even in the same process on that machine. In order to ensure that users from a particular tenant never see the data of other tenants, you need to be much more diligent about security.

Let's look at a particular scenario. Imagine that you have a mortgage calculator that calculates the monthly payments for a customer based on the principal amount and the length of the loan, both supplied by the customer, and the interest rate that you read from the database. Because the interest rate does not change very often and is the same for every customer, you may be tempted to cache it in a static field in your application. This may work fine for a single-tenant application, but if you want to have multiple banks using your application in a multi-tenant scenario it will be disastrous. The issue is that you cannot assume that all banks will offer the same interest rate to their customers, and the code that reads the interest rate from the database will overwrite the static variable for each tenant. In this case you will not only provide the end user with misleading information but will also expose competitive information to the rest of the tenants.
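
One way to keep the cache without sharing it across tenants is to key it by tenant ID. The sketch below is only an illustration: InterestRateCache, the loader function and the rates are hypothetical, the dictionary stands in for the database, and cache invalidation and thread safety are left out.

```python
class InterestRateCache:
    """Caches one interest rate per tenant instead of one per process."""

    def __init__(self, load_rate):
        self._load_rate = load_rate  # callable: tenant_id -> annual rate in percent
        self._rates = {}             # tenant_id -> cached rate

    def get(self, tenant_id):
        if tenant_id not in self._rates:
            self._rates[tenant_id] = self._load_rate(tenant_id)
        return self._rates[tenant_id]


def monthly_payment(principal, annual_rate_percent, years):
    """Standard fixed-rate amortization formula."""
    months = years * 12
    monthly_rate = annual_rate_percent / 100 / 12
    if monthly_rate == 0:
        return principal / months
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -months)


# Stand-in for the database table that holds each bank's rate.
RATES_BY_TENANT = {"bank-a": 4.5, "bank-b": 6.0}

cache = InterestRateCache(lambda tenant_id: RATES_BY_TENANT[tenant_id])
payment_a = monthly_payment(200_000, cache.get("bank-a"), 30)
payment_b = monthly_payment(200_000, cache.get("bank-b"), 30)
```

With a single static field, whichever bank's rate was loaded last would be used for both calculations; here each tenant only ever sees its own rate.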

As we already discussed, on the data tier each tenant must be uniquely identified when accessing the data. You may want to create a different login for each tenant and give it permissions to just that tenant's view of the data, or you may want to restrict access with a special WHERE clause to achieve the same result. And of course each tenant may have different access permissions for users in different roles, so you will need to keep the user authorization code from your single-tenant app (maybe with some modifications).

Last but not least, data access auditing is even more important for multi-tenant applications than for single-tenant ones. Now you need to keep track not only of which user accessed the data but also of which tenant that user belongs to, in order to be able to trace back any unauthorized access.
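
As a rough sketch of what such an audit entry could look like, the snippet below writes the tenant next to the user in every record. The field names and the audit_entry helper are assumptions made for illustration, not a prescribed format.

```python
import json
from datetime import datetime, timezone


def audit_entry(tenant_id, user_id, action, resource, when=None):
    """Build one audit log line that records the tenant as well as the user."""
    when = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": when.isoformat(),
            "tenant_id": tenant_id,
            "user_id": user_id,
            "action": action,
            "resource": resource,
        },
        sort_keys=True,
    )


print(audit_entry("bank-a", "jsmith", "read", "interest_rates"))
```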

 

Scale and Performance

 

I've already touched a bit on this topic in the Database Redesign section when I discussed the need for data sharding, but there are other things that you need to consider when you are converting your application to a multi-tenant one.

One of them is the diverse set of tenants you may have. If we take the previous example, the mortgage calculator may be used by banks of any size, from small local banks and credit unions with just a few thousand clients to big banks with millions of clients. In a multi-tenant environment you cannot expect every tenant to be the same size, so you need to make sure that your application is able to serve them all equally well and that it is easy to scale out and in when the need arises. As part of the application design you need to take care of things like:

  • Throttling the requests of demanding tenants. Sometimes scaling out your application takes time; it can vary from a couple of seconds to tens of minutes, or may even require manual intervention. In the meantime, if your application is not able to throttle the requests from the one tenant that consumes all the resources, your other tenants may be down. Hacker attacks or security issues may also be the reason for such spikes in a particular tenant's activity.
  • Avoiding code that stores the session state in memory on the server side. If you suddenly need to scale your application out, the odds are that the next request from the user will not land on the same server, and if the session state is stored in memory the user will lose all that information. You need to make sure that such state is stored either on the client side (browser cookie or local browser storage) or in a shared location like a database. Although this is true for every cloud application, not only multi-tenant ones, you need to keep in mind that the scale-out scenario is much more common in multi-tenant applications.
  • Handling errors gracefully. A lot of things can go wrong when your application is under heavy load. Timeouts, session data loss and connectivity loss are just a few of the causes of errors. You need to make sure that such fault scenarios are easy to recover from on the server side as well as on the client side.
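
The throttling point can be sketched with a token bucket kept per tenant. This is just one possible approach and not something a particular platform prescribes; the TenantThrottle class, its parameters and the single-process, in-memory bookkeeping are assumptions for illustration. In a scaled-out deployment the counters would have to live in a shared store, for the same reason session state does.

```python
import time


class TenantThrottle:
    """Token bucket per tenant: each tenant gets `burst` tokens that refill
    at `rate_per_sec`, so one busy tenant cannot use up another's budget."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self._rate = rate_per_sec
        self._burst = burst
        self._clock = clock   # injectable so the behavior can be tested
        self._buckets = {}    # tenant_id -> (tokens, time of last update)

    def allow(self, tenant_id):
        now = self._clock()
        tokens, last = self._buckets.get(tenant_id, (self._burst, now))
        # Refill for the time elapsed since the last request, capped at burst.
        tokens = min(self._burst, tokens + (now - last) * self._rate)
        if tokens >= 1:
            self._buckets[tenant_id] = (tokens - 1, now)
            return True
        self._buckets[tenant_id] = (tokens, now)
        return False


throttle = TenantThrottle(rate_per_sec=5, burst=10)
if not throttle.allow("big-bank"):
    # A web application would typically answer with HTTP 429 here.
    print("request rejected, tenant is over its limit")
```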

Those are just some of the design considerations for multi-tenant applications. There are certainly platforms (like my current employer's Apprenda) that will do most of the work for you when you migrate your applications to multi-tenant ones; however, you still need to be aware of possible areas where such automatic conversion cannot be done. Taking a closer look at your code is always necessary, even when you use an automation platform.