Development

June 17, 2013

Open Source in the Cloud - How Much Should You Care?

In his opening keynote at Red Hat Summit, Jim Whitehurst, the CEO of Red Hat, asked the audience: "Name an innovation that isn't happening in Open Source - other than Azure!" I can certainly add the iPhone and AWS to the mix, but let me stick to the cloud topic with the following question: "How much does Open Source matter in the cloud?"

 

Let's first elaborate on two misconceptions about Open Source.

 

Open Source is Free

Not really! In the cloud it doesn't matter whether you are running on an Open Source platform or not - it is NOT free, because you pay for the service. Open Source projects have long been funded through the service premiums that you pay. I would argue that Open Source vendors have mastered the way to profit from Open Source services and are far ahead of the proprietary vendors. The whole catch here is that you pay nothing for the software and incur no capital expenditures (CapEx), but you pay for the services (i.e. operational expenditures, or OpEx) - remember, this is also the cloud model. The bottom line is that you may be better off with a proprietary vendor than an Open Source one, because the former have yet to master that business model.

 

Open Source Means No-Lock-In

I am not so sure about that either! Do you remember J2EE? It wasn't long ago that Sun created the specification and said there would be portability between vendors. Those of you who have tried to migrate a J2EE application from JBoss to WebLogic to WebSphere will agree that the migration costs weren't negligible. It is the same with Open Source clouds - it doesn't matter that HP and Rackspace both use the Open Source OpenStack - you still need to plan for your migration costs.

 

I am far from saying that Open Source is not important. Quite the opposite - I am a big Open Source fan, and the biggest example I can give is… well, Azure. They also understand that the source code is not what matters anymore, hence they open-sourced their SDKs (and continue to add more). It is time to forget those technology wars and really start thinking about the goals we have and the experience we provide for our customers. When you choose your cloud provider you should not ask the question: "Are they Open Source or proprietary?" Better questions to ask are:

  • Does the vendor provide functionality that will save me money?
  • Can they support my business for the next 5 or 10 years?
  • Do they provide the services and support that I need?
  • Are they agile enough to add the features that I need?
  • Do they have the feedback channel to the core development team that I can use to submit requests?
  • Do they have the vision to predict the future in the cloud?

All those are much more important questions for your technology strategy and your business than whether their cloud is Open Source or not.

April 01, 2013

Enabling Remote Debugging for Windows 8 Apps

A very short post, but I hope it will save somebody a lot of time. I was trying to debug a Windows 8 Store app on my Surface RT, however quite a few of my attempts failed. The issue was obviously networking related, because all the tools were properly installed, configured and started.

The whole problem was that my workstation running Visual Studio 2012 was not able to see the Surface RT machine and attach to the process. After some poking around I figured out that ping to the Surface was not working, although during the configuration of the Remote Debugger I explicitly set the options to allow debugging on any network. Once ping was enabled everything worked out. You can enable ping by issuing the following command from an elevated Command Prompt:

netsh firewall set icmpsetting 8

Also, don't forget to install the Remote Tools for Visual Studio 2012 (ARM) from http://go.microsoft.com/?linkid=9810474

March 20, 2013

Migrating Legacy Applications to the Cloud

With everybody jumping on the cloud computing bandwagon lately, developers and architects need to spend extra time analyzing which applications are good candidates for migration. It is wrong to believe that every legacy application can be easily migrated from the traditional on-premises infrastructure to any cloud computing environment. Therefore such migration efforts should be approached carefully and systematically.

Let's look at a couple of issues that you may face when trying to migrate legacy applications to the cloud.

Client-Server Applications

Client-server applications are characterized by tight coupling between the business logic and the data tier. Most of the time the business logic is implemented as stored procedures in the database, and pulling it out can be a substantial effort. In addition, such applications establish a sticky session between the client and the server, which violates common cloud architecture patterns and complicates the migration process.

The obvious approach for migrating client-server applications to the cloud is to gradually abstract the business logic into a service layer and deploy the latter to the cloud. The cleaned-up data tier can still be hosted on the current infrastructure until the time comes to either migrate the data or retire it. At a high level you should follow these steps:

  • Identify the business services that are exposed to the clients
  • Implement those services as a separate business layer
  • Deploy the new business layer on a cloud enabled infrastructure (either IaaS or PaaS)
  • Implement a thin client layer on top of the services (in certain cases you may be able to modify the existing clients to connect to the services instead of the data tier)
  • Roll-out the new client among your users
  • Retire the business logic in the data tier
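The second and third steps above can be sketched as follows. This is a minimal sketch: the service class, table, and discount rule are all made up for illustration, and an in-memory SQLite database stands in for the legacy data tier.

```python
import sqlite3

class OrderService:
    """Service layer that owns a business rule which, in the legacy
    client-server app, lived in a hypothetical stored procedure."""

    def __init__(self, conn):
        self.conn = conn

    def get_order_total(self, order_id):
        # Plain data access still goes against the existing data tier...
        row = self.conn.execute(
            "SELECT amount FROM orders WHERE id = ?", (order_id,)).fetchone()
        amount = row[0]
        # ...while the business rule (a made-up volume discount)
        # now lives in the cloud-hosted service layer
        return amount - 10 if amount > 100 else amount

# Stand-in for the existing on-premises database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 200), (2, 50)])

service = OrderService(conn)
print(service.get_order_total(1))  # 190
print(service.get_order_total(2))  # 50
```

Once the clients talk only to the service layer, the stored procedure in the data tier can be retired without touching the clients again.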

This approach provides a smooth migration because it postpones the data migration, a highly critical business component, to a later stage, and in the meantime the organization gains important knowledge and discovers potential issues with the cloud technologies.

Scheduled Tasks

Scheduled tasks or batch jobs are another legacy application pattern that can introduce some challenges when migrating to the cloud. The premise of such applications is that they are triggered either at certain intervals or by a new batch of data that gets delivered. The majority of the time the latter approach involves transfers of files between machines. Two things at the core of such applications contradict the modern cloud architectural patterns:

  • The reliance on always-up machines that will trigger the execution at certain intervals
  • The reliance on always-available file system used for file exchange

The functionality that such applications provide is easily achieved through the queue-centric workflow pattern as described by Bill Wilder in his book Cloud Architecture Patterns. However, redesigning those legacy applications to use message queues can be a substantial implementation effort, hence you should approach the migration in phases. For jobs that rely on file transfers you can use these steps:

  • Change the jobs to use cloud storage instead of local file systems
  • Add functionality at the delivery side to drop a message in the queue in addition to dropping the file
  • Remove the polling functionality in the processing job and instead use the message in the queue as a triggering mechanism
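The steps above can be sketched like this. It is a minimal sketch: an in-process queue.Queue stands in for a real cloud message queue, and the function and message names are made up.

```python
import queue

# Stand-in for a cloud message queue (e.g. a hosted queue service)
message_queue = queue.Queue()

def deliver_file(blob_name):
    """Delivery side: after uploading the file to cloud storage,
    drop a message in the queue in addition to dropping the file."""
    # (the upload to cloud storage would happen here)
    message_queue.put({"blob": blob_name})

def process_next():
    """Processing job: triggered by a message, not by polling a folder
    or waking up on a timer."""
    msg = message_queue.get()
    return "processed %s" % msg["blob"]

deliver_file("batch-2013-03-20.csv")
print(process_next())  # processed batch-2013-03-20.csv
```

The key design change is that the processing side no longer needs an always-up machine or an always-available file system; it only reacts to messages as they arrive.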

For the scheduled tasks you need to change the implementation to use messages in the queue instead of time intervals to trigger the tasks.

You can achieve additional benefits if you add MapReduce as part of your modern application design.

Scale Up Applications

Last but not least is the type of application that relies on additional local resources in order to handle increased load. Such resources can be CPU speed, memory or disk storage. Unfortunately such applications are hard to migrate to the cloud unless they are redesigned to use horizontal instead of vertical scaling. Most of the time such challenges appear at the data tier of the application and can be solved through data sharding.

The process for migration involves:

  • Analyzing the data and potential de-normalization
  • Identifying the shard key
  • Splitting the data amongst the shards
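Routing on the shard key can be sketched like this (a minimal sketch; the shard names are made up and a hypothetical customer id serves as the shard key):

```python
import hashlib

# Hypothetical shard map - in practice these would be connection strings
SHARDS = ["shard0.example.com", "shard1.example.com", "shard2.example.com"]

def shard_for(shard_key):
    """Pick a shard with a stable hash so the same shard key always
    resolves to the same shard, on every machine and every run."""
    digest = hashlib.md5(str(shard_key).encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Rows for one customer always land on (and are read from) one shard
print(shard_for("customer-42") == shard_for("customer-42"))  # True
```

Note the use of a cryptographic digest rather than Python's built-in hash(), which is randomized between interpreter runs and therefore unsuitable for a persistent shard map.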

As a bottom line, the gains for the organization from the above-mentioned migration approaches are:

  • Improved (and more cloud-ready) application architecture
  • Enabled economies of scale at the different tiers of the application

However, the biggest benefit is the cloud computing knowledge that the organization gains throughout the process.

November 06, 2012

Windows Phone 8 Development on Mac - All the Gotchas

After failing in my attempts to kickstart my Windows 8 development I decided to give Windows Phone 8 a try. Creating a Windows Phone 8 developer account was straightforward compared to the Windows 8 experience, however setting up the development environment on my Mac turned out to be painful. I will spare you all the steps, which involved reinstalling Mountain Lion, setting up a Bootcamp partition, installing Parallels etc., and which resulted in a few dozen screenshots with errors and exceptions in Visual Studio. Let's get straight to the point.

Is your machine ready for Windows Phone 8 development?

First of all, if you want to do Windows Phone 8 development (and the same is true for Windows 8 development), you need a Windows 8 machine. In all fairness this is no different from iOS development... however, there are the gotchas below :)

One of the most important things that you need when you do device development is the Emulator. If you don't want to spend all your money buying every possible Windows Phone, the Emulator will be your solution for testing. Now, the problem with Windows Phone 8 development is that the Emulator uses the hypervisor built into Windows 8. However, in order to run the Windows 8 hypervisor, your development machine (actually the CPU in your machine) needs to support SLAT (Second Level Address Translation). If you need more details you can check Wikipedia's article about SLAT, but briefly, it is a virtualization technology available in modern CPUs.

Here are the official requirements for Windows Phone 8 SDK posted by Microsoft.

If you are like me and you bought your Mac awhile ago it may not be equipped with a "modern" CPU. Here are the steps you can use to check whether you will be able to run Windows Phone 8 Emulator on your Mac:

  1. Install Windows on your Mac
    If you already have Windows running on it, that is fine. It doesn't need to be Windows 8; Windows 7 will work just fine. You can use Parallels or Bootcamp or whatever else you want. The sole purpose of this step is to have a way to run the program from Step #2.
  2. Download Coreinfo
    Coreinfo is a small program written by Mark Russinovich that reports low-level hardware details.
  3. Run Coreinfo and check the output
    Here is the interesting part. If you have an older Intel-based Mac (or MacBook Pro like me), your machine may be equipped with a Core 2 Duo processor that doesn't support SLAT. The general rule is that the Intel Core i-series processors support SLAT, while everything before them doesn't. Wikipedia has a good list of all Mac models and their CPUs.

Here is how the output from Coreinfo looked for me:

CoreInfo

I have a Core 2 Duo P8600, and although it supports hardware virtualization, I wasn't able to run the Windows Phone 8 Emulator (as well as Hyper-V in Windows 8). Interestingly though, the Windows 8 Simulator runs fine, although I thought it is based on Hyper-V too. The best I managed to achieve was the following:

Win8Phone Designer Issue

Interestingly, before upgrading from Lion to Mountain Lion I was at least able to see the designer working. Now that is gone too.

Should you use Parallels or Bootcamp?

If your Mac passed the CPU test, the answer to this question is quite simple - you must use Bootcamp (at least for now). The reason is that Microsoft Hyper-V doesn't run in a virtualized environment, hence your only option is to have Windows 8 installed on a separate partition using Bootcamp.

My personal opinion is that the approach Microsoft took (again) will prevent some potential developers from starting development for Windows Phone 8 (and Windows 8), as you can see from the following articles and blog posts:
Developers praise Windows Phone 8 SDK, but virtualization and upgrades rankle
Windows Phone 8 SDK what a Big Time Flop
http://forum.parallels.com/printthread.php?t=264863
Windows Phone 8 Emulator on MacBook Pro – No SLAT

There are two articles that I found explaining how you can use VMware Fusion to run Windows 8 on your Mac. I doubt it will be successful if your machine doesn't have SLAT support, but maybe it is worth looking into them:

http://blog.brightpointuk.co.uk/running-windows-phone-8-emulator-mac-os
http://www.developer.nokia.com/Community/Discussion/showthread.php?238276-Windows-Phone-8-development-on-Mac-and-VMWare-also-using-the-Phone-Emulator

Good luck with your Windows (Phone) 8 development :)

November 01, 2012

I was so close to like Windows 8... so close!

Everything was great (kind of)! The Build conference started lame - with a T-shirt and a shopping tote - but SteveB fixed things in the next hour by giving every one of us a Surface, a Lumia 920 and 100GB of SkyDrive. In the last two days Microsoft did a pretty good job of exciting me about Windows 8 and Windows Phone 8, and after listening to Josh Twist's talk about Azure Mobile Services today I decided to give it a try. Yes, I decided to develop my first application for Windows 8 (and maybe Windows Phone 8).

I went to the Microsoft Company Store to buy a copy of Windows 8 and Visual Studio 2012 with the voucher that every Build 2012 attendee was given. Windows 8 was on the shelves but there was no sign of Visual Studio 2012 - neither in the physical store nor in the e-store. Nevertheless, using my MSDN account I spent half an hour downloading and another hour installing, and I was able to get Windows 8 and VS2012 running in Parallels on my Mac. Impressive! Windows 8 was running much faster than Windows 7 and was not killing my machine. So far so good!

The first thing I did was go to the Windows Store Apps Dev Center and sign up for a developer account (I forgot to mention that SteveB also lowered the sign-up fee from $99 to $8 for the 8 days following the Build start). And here is where my enthusiasm vanished in just a minute.

Clicking on Get your developer account now sent me to the following screen:

Error "We don't recognize the computer you're using"

First, I had no idea what content I was supposed to see, but at least it was clear that I should do this from a machine named LAJOLLA. I used to have a Windows machine with that name, but now my Mac is called LAJOLLA, so I decided to give it a try. Hehe, silly me! Of course it didn't work! My assumption was that the machine must be a Windows one.

OK, no problem! I can easily rename the newly installed Windows 8 VM to LAJOLLA and give it a try. Well... not really! Although I renamed the machine to LAJOLLA, when I log in to the page above I still see the same error. It must be something else!

My only option is to choose "Not using this computer anymore! Update your info." So I clicked on that and was asked whether I want to delete LAJOLLA. Hell, yeah! I want to create my developer account as fast as possible! But... SURPRISE! The info will be deleted in 30 days. For security reasons! Ugh, why? I am not sure I understood, but whatever - I can go and add my brand new Surface and my newly installed Win8 VM as trusted PCs and I should be set to go. Nope! Those will be added... guess when... in 30 days.

OK, let's go back and undelete the LAJOLLA PC and see what we can do to get my developer account set up. Because I WANT MY WINDOWS 8 DEVELOPER ACCOUNT to develop my first Windows 8 application.

Although I did the deletion from the same browser session, when I clicked on Cancel the deletion I got the following:

Now I am hosed, with no trusted PC and no way to create a Windows 8 developer account for (at least) the next 30 days.

It is not about the $91 that I would have saved. It is about the hour of frustration and the anger against the PM who invented this feature. I am wondering - what is the point of the 30-day wait time? Can't you just send me an email, or an SMS on the phone, or ask me the security question? What will happen in 30 days that can't happen in a day? Except that you may lose one more customer.

Update: I just chatted with one of the Windows 8 people present at the Build conference. The suggestion I got was to create a new Microsoft Account (or LiveID or Passport or whatever the name of it is) that I can use as my development account. He mentioned that they had similar issues, because the MS Account team does not allow more than 5 machines connected to an account (yes, ONLY FIVE!), and they use this trick internally to overcome the limitation.

Update #2: One thing that I forgot to mention is that although I renamed my newly installed Win8 VM to LAJOLLA, when I access the Windows Account UI I still see the old name of the machine. Hence I thought that if I create a new Windows 8 VM and name it LAJOLLA it may work (yeah, I really thought that this feature is not only lame but also badly implemented). However, after doing the above I ended up with the following:

Screen Shot 2012-11-01 at 9.02.25 PM

Well, this proves that it won't be easy to cheat the feature, but as a user... I really don't get it! I am accessing the site from a machine with the name LAJOLLA - why can't I change the security info? And what the heck does it mean that LAJOLLA will be deleted on 11/30/12 and added on 11/30/12?

Update #3: Here is also a transcript of the chat I had with the Windows Store app development support representative:

Please wait for an agent to respond. You are currently '1' in the queue.

Privacy Statement

You are now chatting with 'Steven'.

Toddy: hi

Steven: Hello Toddy, my name is Steven. How can I help you?

Toddy: I am unable to create Windows 8 Developer Account

Toddy: there are some issues with my Windows Account I don't have anymore access to my trusted PC and I am unable to add any other PC as trusted for the next 30 days

Steven: Have you already completed the developer registration or is this preventing you from being able to actually complete the registration?

Toddy: This is preventing me from completing the registration

Toddy: It comes out immediately when I click the registration link

Steven: Unfortunately the 30 day waiting period cannot be bypassed. You may want to create a different Microsoft account and go through the registration with the new account. Unfortunately you won't be able to link your Windows Phone developer registration because the publisher name will be locked to your Microsoft account that is already registered as a Windows Phone developer account.

Toddy: Why is this waiting period required?

Steven: This is to protect the owner of the account if the account access has been compromised. The 30 day period is the time provided for the user to realize that they no longer has access to the account and to report the problem.

Steven: In a case where the account has not been compromised, the user must wait the 30 days, as this process cannot be bypassed or expedited.

Toddy: Can't this be done via email confirmation or phone SMS or something more advanced than 1 month waiting period? I do own the account and I can change everything like password etc.

Steven: You must already have a different method that has already been confirmed on the website below:

Steven: https://account.live.com/Proofs/Manage

Steven: If you do have a method listed there that had previously been approved, you can use that method to confirm your account.

Toddy: All other methods will be added also in a month

Steven: Unfortunately if the methods hadn't previously been added, you won't be able to verify your account through this authentication process until after the 30 day reset period.

Toddy: So! What you are saying is that there is no way for me to become Windows 8 Developer today correct?

Steven: Unless you were to use a different Microsoft account to go through the registration. This will require you to use a different publisher name than your Windows Phone developer account. If you have a business account, the publisher name must match your business name, so you would need to wait for the 30 day reset period to expire.

Toddy: Well, what can I say not very welcoming for new Windows 8 developers.

Toddy: Thank you for your time

Steven: You're welcome, sorry I could not do more to get you into the registration today.

Steven: Have a nice day.

Chat session has been terminated by the agent.

 

September 03, 2012

Configuring Logging in Python - The Real Life Example

For some time I have been playing with Python, and a few nights ago I finally reached the point where I had to implement configurable logging in order to capture information that would help me troubleshoot some issues. Although I am pretty familiar with Log4J, which is very similar to the Python logging module, it took me some time to get my loggers configured properly. Unfortunately the documentation is not very rich in examples, hence I thought it would be useful for other people if I published a good Python logging example tailored to a real-life scenario.

I will also describe a few gotchas that I discovered while playing with the logging module.

Python Logging Scenario

In order to demonstrate the logging functionality I will use a simple Python application that consists of the following packages:

rootpackage
rootpackage.levelonepackage
rootpackage.levelonepackage.leveltwopackage


Each package (except the rootpackage one) contains two modules with exactly the same code in them (I know, I know - it is against all the rules of software development to repeat code :)). The code initializes the logger and also defines a function that logs a message with each log level. Here it is:

[1] import logging
[2] import logging.config
[3] from rootpackage.utils import get_logging_config

[4] # set up logging
[5] logging.config.fileConfig(get_logging_config())
[6] logger = logging.getLogger(__name__)

[7] def log_messages():
[8]     """Logs a message with each of the log levels
[9]     """
[10]     print logger.level
[11]     logger.critical("This is a critical message")
[12]     logger.error("This is an error message")
[13]     logger.warn("This is a warning message")
[14]     logger.info("This is an info message")
[15]     logger.debug("This is a debug message")

Line [6] above is important to note because this is the easiest way to get the full name of the module (including the package name). This will allow you to easily change the logging configuration using the package hierarchy structure.

The rootpackage package contains an additional module called utils with just one function in it, whose sole purpose is to retrieve the absolute path to the logging configuration file. Here is the code for it:

import rootpackage
from os import path

def get_logging_config():
    """Returns the absolute path to the logging config file
    """
    return path.join(path.split(rootpackage.__file__)[0], 'logging.conf')

The logging configuration file logging.conf is also stored in the folder where the rootpackage modules are stored.

Here is the structure of the files in the app:

+ rootpackage
    - __init__.py
    - logging.conf
    - root_logging.py
    - root_logging2.py
    - utils.py
    + levelonepackage
        - __init__.py
        - levelone_logging.py
        - levelone_logging2.py
        + leveltwopackage
            - __init__.py
            - leveltwo_logging.py
            - leveltwo_logging2.py


Hence there are 3 packages (rootpackage, levelonepackage and leveltwopackage) in the sample app, 7 modules (root_logging, root_logging2, utils, levelone_logging, levelone_logging2, leveltwo_logging and leveltwo_logging2) and 1 logging configuration file (logging.conf). You can download the full source code for the Python logging example from my site.

Sample App Logging Goals

Now that you have looked at the app, let's set the following four goals for logging from the app:

  • Limit the messages to a particular log level for a module in a particular package
  • Limit the messages to a particular log level for a specific package
  • Log the messages from a particular package in a file, while the messages from other packages go to the console
  • Log the messages from the same package in a file as well as on the console, together with the messages from other packages

Those seem to be enough to give you a good idea of what is possible, as well as how to change the configuration to achieve the desired outcome. Let's start!

Configuring Python Logging

As you may already know from the Python documentation, there are three main objects in the Python logging package that you need to configure: the Logger, the Handler and the Formatter. In this post I will concentrate on the Logger and Handler objects and will use a simple Formatter like this one:

format=%(asctime)s - [%(name)s] - %(levelname)s - %(message)s


This formatter will print a message in the following format:

2012-08-23 23:46:03,463 - [rootpackage.root_logging] - DEBUG - This is a debug message


There are other objects, like Filter for example, but I will not discuss those.

Here is the initial code for the logging.conf file that enables DEBUG level for all the modules in my sample app:

[1] [loggers]
[2] keys = root,rootlogging,rootlogging2,levelonelogging,levelonelogging2,leveltwologging,leveltwologging2


[3] [handlers]
[4] keys = console


[5] [formatters]
[6] keys = generic


[7] [logger_root]
[8] level = DEBUG
[9] handlers = console
 


[10] [logger_rootlogging]
[11] level = DEBUG
[12] handlers = console
[13] qualname = rootpackage.root_logging
[14] propagate = 0


[15] [logger_rootlogging2]
[16] level = DEBUG
[17] handlers = console
[18] qualname = rootpackage.root_logging2
[19] propagate = 0


[20] [logger_levelonelogging]
[21] level = DEBUG
[22] handlers = console
[23] qualname = rootpackage.levelonepackage.levelone_logging
[24] propagate = 0


[25] [logger_levelonelogging2]
[26] level = DEBUG
[27] handlers = console
[28] qualname = rootpackage.levelonepackage.levelone_logging2
[29] propagate = 0


[30] [logger_leveltwologging]
[31] level = DEBUG
[32] handlers = console
[33] qualname = rootpackage.levelonepackage.leveltwopackage.leveltwo_logging
[34] propagate = 0


[35] [logger_leveltwologging2]
[36] level = DEBUG
[37] handlers = console
[38] qualname = rootpackage.levelonepackage.leveltwopackage.leveltwo_logging2
[39] propagate = 0


[40] [handler_console]
[41] class = StreamHandler
[42] level = DEBUG
[43] formatter = generic
[44] args = (sys.stdout,)


[45] [formatter_generic]
[46] format=%(asctime)s - [%(name)s] - %(levelname)s - %(message)s

Besides the Formatter that I explained earlier, I have defined 7 Loggers (one for each of the modules in the app except the utils one, plus the root Logger) and one Handler for handling console logs.

Lines 2, 4 and 6 of the logging.conf file define the dictionary keys for the configuration of the different logging objects (Loggers, Handlers and Formatters). If you define a configuration key for one of the objects you MUST have a section in logging.conf that corresponds to this key. For example, I have defined the Logger configuration key levelonelogging. The corresponding section for this key is [logger_levelonelogging], and the information from that section will be used to configure the corresponding logger. Also, as Gavin Baker mentions in his blog post A real Python logging example, you SHOULD NOT add any spaces in the keys list.

If you call the log_messages() function from any of the modules you will see output similar to the following on the console:

10
2012-08-26 20:32:40,302 - [rootpackage.root_logging] - CRITICAL - This is a critical message
2012-08-26 20:32:40,302 - [rootpackage.root_logging] - ERROR - This is an error message
2012-08-26 20:32:40,302 - [rootpackage.root_logging] - WARNING - This is a warning message
2012-08-26 20:32:40,302 - [rootpackage.root_logging] - INFO - This is an info message
2012-08-26 20:32:40,302 - [rootpackage.root_logging] - DEBUG - This is a debug message


The only difference you will see is that for different modules you will get the corresponding module name in the square brackets. The first row of the output is the numeric equivalent of the log level; in the case of DEBUG this is 10.
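These numeric values come from the constants defined in the logging module itself, which you can check directly (a quick sanity check, independent of the sample app):

```python
import logging

# The standard level names and their numeric equivalents
for name in ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG"):
    print(name, getattr(logging, name))
# CRITICAL 50, ERROR 40, WARNING 30, INFO 20, DEBUG 10
```
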

Changing the Log Level for Python Module

Now, let's see what changes we need to make in order to limit the messages from one of the modules to only ERROR messages. Let's, for example, choose the module rootpackage.levelonepackage.levelone_logging2 for our test. The configuration for that module is done in the section [logger_levelonelogging2] starting at line 25. The only change you need to make is to change line 26 to the following:

[26] level = ERROR

and you will get the following output if you call log_messages() from the module rootpackage.levelonepackage.levelone_logging2:

40
2012-08-26 21:55:29,654 - [rootpackage.levelonepackage.levelone_logging2] - CRITICAL - This is a critical message
2012-08-26 21:55:29,654 - [rootpackage.levelonepackage.levelone_logging2] - ERROR - This is an error message


All other modules (including rootpackage.levelonepackage.levelone_logging) will continue to log with level DEBUG. As you can see this was pretty simple.

Changing the Log Level for Python Package

Next, we wanted to have all the modules from a particular package use the same log level. Because we started with the module rootpackage.levelonepackage.levelone_logging2 in package rootpackage.levelonepackage, let's configure the logging in a way that all modules in this package (both levelone_logging and levelone_logging2) log with level ERROR. We will need to make a few more modifications to our configuration file:

  • Line 2: Remove the levelonelogging2 key from the list. Here is how the line should look:
    keys = root,rootlogging,rootlogging2,levelonelogging,leveltwologging,leveltwologging2
  • Lines 20-24: Change those as follows
    [20] [logger_levelonelogging]
    [21] level = ERROR
    [22] handlers = console
    [23] qualname = rootpackage.levelonepackage
    [24] propagate = 0


    Note that in line 23 we removed the module name and left just the name of the package.
  • Lines 25-29: Remove those as we won't need them anymore

Now, if you call log_messages() from the modules rootpackage.levelonepackage.levelone_logging or rootpackage.levelonepackage.levelone_logging2 you will get the following result:

0
2012-08-26 21:55:29,654 - [rootpackage.levelonepackage.levelone_logging] - CRITICAL - This is a critical message
2012-08-26 21:55:29,654 - [rootpackage.levelonepackage.levelone_logging] - ERROR - This is an error message

0
2012-08-26 21:55:32,254 - [rootpackage.levelonepackage.levelone_logging2] - CRITICAL - This is a critical message
2012-08-26 21:55:32,254 - [rootpackage.levelonepackage.levelone_logging2] - ERROR - This is an error message

Note that the log levels for the module loggers (in this particular case rootpackage.levelonepackage.levelone_logging and rootpackage.levelonepackage.levelone_logging2) are not set (the value is 0), but the level is set for the package logger, and all the modules in the package will log with the same level if they don't have a separate logger configured.

Changing the Log Handler for Python Package

The last two goals we had were to log the messages from a particular package to a file, or to a file and to the console at the same time. In order to achieve the former, let's define a new handler in our configuration file. Here are the steps for that:
  • Line 4: Add an additional key: file
  • Line 22: Change the handler from console to file for the rootpackage.levelonepackage package
    [22] handlers = file
  • At the bottom of the config file add the following section that defines the file handler
    [handler_file]
    class = FileHandler
    level = ERROR
    formatter = generic
    args = ('sample.log', 'a')

Now, if you call log_messages() from the modules rootpackage.levelonepackage.levelone_logging or rootpackage.levelonepackage.levelone_logging2, the only thing you will see on the console is the number 0, because the function prints the log level. If you exit Python and look in the folder where you started it, you will find a file sample.log that contains the log messages. If you want to see the log messages on the console and also log them to a file, the only change you need to make is to add the handler to the list of handlers for the package as follows:

[22] handlers = console,file


And remember - NO SPACES in the list!
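Putting the pieces together, here is a self-contained sketch of a configuration with both a console and a file handler attached to one package logger, loaded with logging.config.fileConfig() (the config is written to a temporary file only so the example can run standalone; the package names mirror the ones above):

```python
import logging
import logging.config
import os
import tempfile

# A minimal config: the package logger writes to both console and file.
CONFIG = """\
[loggers]
keys = root,levelonelogging

[handlers]
keys = console,file

[formatters]
keys = generic

[logger_root]
level = WARN
handlers = console

[logger_levelonelogging]
level = ERROR
handlers = console,file
qualname = rootpackage.levelonepackage
propagate = 0

[handler_console]
class = StreamHandler
level = NOTSET
formatter = generic
args = (sys.stdout,)

[handler_file]
class = FileHandler
level = ERROR
formatter = generic
args = ('sample.log', 'a')

[formatter_generic]
format = %(asctime)s - [%(name)s] - %(levelname)s - %(message)s
"""

path = os.path.join(tempfile.mkdtemp(), "logging.conf")
with open(path, "w") as f:
    f.write(CONFIG)

logging.config.fileConfig(path)
logging.getLogger(
    "rootpackage.levelonepackage.levelone_logging"
).error("This is an error message")

# The message went to the console AND to sample.log in the current folder.
with open("sample.log") as f:
    print("This is an error message" in f.read())  # True
```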

Hope this will kick start your logging with Python and make your life easier.

August 23, 2012

Converting Single-Tenant to Multi-Tenant Apps

Characteristics of a Successful SaaS Application

Scott Chate, the VP of Product at Corent Technologies, describes the characteristics of a successful SaaS application very well in his post Convert your Web Application to a Multi-Tenant SaaS Solution from 2010. As per his post, a successful SaaS application must possess the following characteristics:

  • It must support multi-tenancy
  • It must offer self-service sign-up
  • It must have subscription and billing mechanisms in place
  • It must scale efficiently
  • It must support monitoring and management of tenants
  • It must support user authentication and authorization for each tenant
  • It must support tenant customization

In order to achieve true multi-tenancy, which also allows the highest efficiency, your application should be able to share the database and the application logic among tenants.

However, what does this mean for application developers?

Database Redesign

The first step in the application redesign is the introduction of a tenant identifier column in each database table and view. The tenant identifier is used to filter the data that belongs to a particular tenant. This has several implications for application developers:

  • All database scripts need to be changed so they include the tenant identifier. This includes creation scripts, updates to primary and foreign keys, stored procedures etc. For example, if you have an order processing application and you used the order number as the primary key, you need to make sure that the primary key now also includes the tenant ID. Thus two different tenants can have the same order numbers if their policies require it.
  • As part of the database redesign you need to update the indices on all tables so that they take the tenant ID into account. This will make sure database queries that require tenant-specific information execute with the necessary performance.
  • Next you need to update all database queries made at the business logic tier to include the tenant identifier. This has direct impact on the source code, and depending on how well your application is architected this may be relatively easy or hard to do. If, for example, there is no designated data access layer and SQL queries are hardcoded and spread all across the code, changing those will be a nightmare.
  • Last but not least you need to think about how to scale the database tier. Now that you store data from multiple tenants in the same database, the chances are you will reach its limits much faster than when you had a separate database for each tenant. You need to think about how to shard the data, and whether you will do this at the application tier or at the data tier.
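The first two points can be sketched as follows - a minimal illustration using SQLite and an invented orders table, where both the primary key and the index include the tenant ID:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        tenant_id     INTEGER NOT NULL,
        order_number  INTEGER NOT NULL,
        amount        REAL    NOT NULL,
        PRIMARY KEY (tenant_id, order_number)  -- tenant ID is part of the key
    )
""")
# An index that leads with tenant_id keeps tenant-scoped queries fast.
conn.execute("CREATE INDEX idx_orders_tenant ON orders (tenant_id, amount)")

# Two tenants can now use the same order number without a conflict.
conn.execute("INSERT INTO orders VALUES (1, 1001, 250.0)")
conn.execute("INSERT INTO orders VALUES (2, 1001, 99.0)")

# Every query in the data access layer must filter on the tenant ID.
row = conn.execute(
    "SELECT amount FROM orders WHERE tenant_id = ? AND order_number = ?",
    (2, 1001),
).fetchone()
print(row[0])  # 99.0
```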

Security

The next big topic you need to consider during the redesign process is security. Although it is always about securing the data, there are two aspects here:

  • Security at runtime
  • Security at the data tier

In the true multi-tenancy case the business logic code is shared among multiple tenants. What that means is that users from different tenants will be handled by the same code running not only on the same machine but even in the same process on that machine. In order to ensure that users from a particular tenant never see the data of other tenants, you need to be much more diligent about security.

Let's look at a particular scenario. Imagine that you have a mortgage calculator that calculates the monthly payments for a customer based on the principal amount and the length of the loan supplied by the customer, and the interest rate that you read from the database. Because the interest rate does not change very often and is the same for every customer, you may be tempted to cache it in a static field in your application. This may work OK for a single-tenant application, but if you want to have multiple banks using your application in a multi-tenancy scenario it will be disastrous. The issue is that you cannot assume that all banks will offer the same interest rate to their customers, and the code that reads the interest rate from the database will overwrite the static variable for each tenant. In this case you will not only provide the end user with misleading information but will also expose competitive information to the rest of the tenants.
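A minimal sketch of the fix: key the cache by tenant instead of using a single shared variable (the function names and rate values here are invented for illustration):

```python
# BROKEN: a single shared value silently leaks data across tenants.
_interest_rate = None

# FIXED: cache per tenant ID instead of one value for everyone.
_interest_rate_by_tenant = {}

def get_interest_rate(tenant_id, load_from_db):
    """Return the cached rate for this tenant, loading it once per tenant."""
    if tenant_id not in _interest_rate_by_tenant:
        _interest_rate_by_tenant[tenant_id] = load_from_db(tenant_id)
    return _interest_rate_by_tenant[tenant_id]

# Two banks with different rates no longer overwrite each other.
rates = {"bank_a": 0.035, "bank_b": 0.041}
print(get_interest_rate("bank_a", rates.get))  # 0.035
print(get_interest_rate("bank_b", rates.get))  # 0.041
```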

As we already discussed, on the data tier each tenant must be uniquely identified when accessing the data. You may want to create different logins for each tenant and give them permissions to just their view of the data, or you may want to restrict the access with a special WHERE clause to achieve the same. And of course each tenant may have different access permissions for users in different roles, so you will need to keep the user authorization code from your single-tenant app (maybe with some modifications).

Last but not least, data access auditing is even more important for multi-tenant applications than for single-tenant ones. Now you need to keep track not only of which user accessed the data but also which tenant this user belongs to, in order to be able to trace back any unauthorized access.

 

Scale and Performance

 

I've already touched a bit on this topic in the Database Redesign section when I discussed the need for data sharding, but there are other things that you need to consider when you are converting your application to a multi-tenant one.

One of them is the diverse set of tenants you may have. If we take the previous example, the mortgage calculator may be used by banks of any size - from small local banks and credit unions with just a few thousand clients to big banks with millions of clients. In a multi-tenant environment you cannot expect that each tenant will be the same size, and you need to make sure that your application is able to serve them equally, and that it is easy to scale out and in when the need arises. As part of the application design you need to take care of things like:

  • Throttling the requests of demanding tenants. Sometimes scaling out your application may take time - anywhere from a couple of seconds to tens of minutes - or may even require manual intervention. In the meantime, if your application is not able to throttle the requests from the one tenant that consumes all the resources, your other tenants may be down. Hacker attacks or security issues may also be the reason for such spikes in a particular tenant's activity.
  • Avoiding code that stores session state in memory on the server side. If you suddenly need to scale your application out, the odds are that the next request from the user may not land on the same server, and if the session state is stored in memory they will lose all that information. You need to make sure that such state is stored either on the client side (browser cookie or local browser storage) or in a shared location like a database. Although this is true for every cloud application, not only multi-tenant ones, keep in mind that the scale-out scenario is much more common in multi-tenant applications.
  • Gracefully handling errors. Lots of things can go wrong when your application is under heavy load. Timeouts, session data loss, and connectivity loss are just a few of the causes of errors. You need to make sure that such fault scenarios are easy to recover from, on the server as well as on the client side.
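The throttling point can be sketched with a simple fixed-window limiter keyed by tenant ID (the limits here are arbitrary; a production system would more likely use a token bucket backed by a shared store):

```python
import time
from collections import defaultdict

class TenantThrottle:
    """Allow at most `limit` requests per tenant per `window` seconds."""

    def __init__(self, limit, window=1.0):
        self.limit = limit
        self.window = window
        self.counts = defaultdict(int)
        self.window_start = defaultdict(float)

    def allow(self, tenant_id, now=None):
        now = time.monotonic() if now is None else now
        if now - self.window_start[tenant_id] >= self.window:
            self.window_start[tenant_id] = now  # start a new window
            self.counts[tenant_id] = 0
        self.counts[tenant_id] += 1
        return self.counts[tenant_id] <= self.limit

throttle = TenantThrottle(limit=2)
# A greedy tenant is cut off; other tenants are unaffected.
print([throttle.allow("greedy", now=0.0) for _ in range(3)])  # [True, True, False]
print(throttle.allow("quiet", now=0.0))                       # True
```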

Those are just some of the design considerations for multi-tenant applications. There are certainly platforms (like my current employer's Apprenda) that will do most of the work for you when you migrate your applications to multi-tenancy; however, you still need to be aware of areas where such automatic conversion cannot be done. Taking a closer look at your code is always necessary in conjunction with the automation platforms.

June 21, 2011

Configuring Tomcat Logging

If you looked at my recent posts, you know I was playing with Java and Tomcat a lot, trying to run those on Windows Azure. One of the things I wanted to achieve was to store the Tomcat log files in a folder different from the default Tomcat location. Surprisingly for me, configuring Tomcat logging turned out to be not so intuitive. Let’s start with the basics…

 

Where are Tomcat Logs Stored?

By default Tomcat stores the log files under
$CATALINA_BASE\logs

Where CATALINA_BASE is the folder where Tomcat is installed. If you open that folder you will see something like this:

 

06/21/2011  02:49 PM  7,534 catalina.2011-06-21.log
06/21/2011  01:37 PM      0 host-manager.2011-06-21.log
06/21/2011  02:49 PM  1,872 localhost.2011-06-21.log
06/21/2011  02:49 PM      0 localhost_access_log.2011-06-21.txt
06/21/2011  01:37 PM      0 manager.2011-06-21.log

 

For more information about what each file contains, you can read the Tomcat Logging page.

My goal was to move those log files to a folder different from

$CATALINA_BASE\logs

 

How to Configure Tomcat Logging (Really How)?

If you search Google (or Apache’s web site) you will find out that in order to configure Tomcat logging you will need to either:

  • edit the logging.properties file in $CATALINA_BASE\conf
  • or create a new logging.properties file and set the java.util.logging.config.file System property to point to it

The easiest way to use the second approach is to set the Environment Variable


LOGGING_CONFIG="-Djava.util.logging.config.file=[your_logging.properties_file_location]"

 

As you may expect the default logging.properties file is located in $CATALINA_BASE\conf.


Now, the hard part is that you CANNOT use Environment Variables in a Java properties file. And of course this was what I really wanted to do: use the %ROLEROOT% Environment Variable in the location path for all the log files (see What Environment Variables Can You Use in Windows Azure). The workaround is to set a Java System property to the value of the Environment Variable (i.e. the –D option for java.exe). Tomcat startup scripts use the Environment Variable JAVA_OPTS for exactly this purpose:

 

set JAVA_OPTS=-DMY_SYSTEM_PROPERTY=%MY_ENVIRONMENT_VARIABLE%

 

For Windows Azure specifically you can use the Variable tag in CSDEF:

 

<Variable name="JAVA_OPTS" value="-D[my_property_name]=%ROLEROOT%\[some_folder]" />

 

Next, in order to use the System property in the Java properties file you need to specify it in the following format:

 

${[my_property_name]}

 

Here is what I actually did. In CSDEF you set the Environment Variables as follows:

 

<Environment>
    <Variable name="TomcatLocalResourcePath"
              value="%ROLEROOT%\Approot\temp" />
    <Variable name="JAVA_OPTS" value="-DTomcatLocalResourcePath=%
              TomcatLocalResourcePath%" />
</Environment>

 

and in the logging.properties you use the Java System property as follows:

 

1catalina.org.apache.juli.FileHandler.directory = ${TomcatLocalResourcePath}
2localhost.org.apache.juli.FileHandler.directory = ${TomcatLocalResourcePath}
3manager.org.apache.juli.FileHandler.directory = ${TomcatLocalResourcePath}
4host-manager.org.apache.juli.FileHandler.directory = ${TomcatLocalResourcePath}

 

This is all good, however it takes care only of the following log files:

 

catalina.2011-06-21.log
host-manager.2011-06-21.log
localhost.2011-06-21.log
manager.2011-06-21.log

 

What about localhost_access_log.2011-06-21.txt? The access log in Tomcat is not configured via the logging.properties file but in the server.xml file. You can read more about the Access Log Valve (which controls the access log) on Apache’s web site. The simple thing you need to do is set the directory attribute on the Valve tag as follows:

 

<Valve className="org.apache.catalina.valves.AccessLogValve"
               directory="${TomcatLocalResourcePath}" 
               prefix="localhost_access_log." suffix=".txt"
               pattern="%h %l %u %t &quot;%r&quot; %s %b"
               resolveHosts="false"/>

 

UNIX vs. Windows vs. Java Property Files

As a final note, some clarification on when to use the dollar sign $, the percent sign %, and the dollar sign with curly braces ${}, as I think it may be confusing for some people:

  • As you know dollar sign $ is used to evaluate Environment Variables in UNIX. For example if you define the following Environment Variable in UNIX:

    setenv MYTEMPPATH /usr/temp

    you can use it later on as follows:

    setenv SOMEPATH $MYTEMPPATH/new
  • In contrast Windows uses percent % to evaluate Environment Variables. Here is the same example for Windows:
    set MYTEMPPATH=C:\Temp

    and you can use it as follows:

    set SOMEPATH=%MYTEMPPATH%\new

  • Property files in Java use the UNIX type of format but with curly braces to evaluate System properties. For example if you define the System property as follows:

    -DMyTempPath=C:\Temp

    you can use it in Java properties files as follows:

    some.property=${MyTempPath}
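Incidentally, Python's string.Template happens to use the same ${} convention, so the substitution Java performs can be illustrated in a few lines (this only mimics what the Java runtime does internally; the property name is the one from the Tomcat example above):

```python
from string import Template

# A Java-style property value with a ${} placeholder.
line = "catalina.org.apache.juli.FileHandler.directory = ${TomcatLocalResourcePath}"

# The value that would come in via -DTomcatLocalResourcePath=...
props = {"TomcatLocalResourcePath": r"C:\Temp\tomcat-logs"}

print(Template(line).substitute(props))
# catalina.org.apache.juli.FileHandler.directory = C:\Temp\tomcat-logs
```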

October 10, 2010

3 Things Developers Always Miss... And Customers Get Annoyed With

Last week I had to deal with a very interesting bug that reminded me of the common mistakes developers make. Here is the story. We use an inclusion list (aka “white list”) to allow access to users from specific domains. Everything worked fine with the initial list of domains we had; however, after adding some more domains to the list, users from those domains were not able to log in. To amplify the problem, the issue appeared to be present only in our production environment, and we were not able to reproduce it in any of our test environments.

As it turned out, the problem was quite simple. Can you guess it? Yes - trimming white spaces. Something so simple is able to turn the lives of developers and testers into a nightmare, but not only that - it can annoy customers like hell. In each and every project I have been part of, I have seen this one repeated again and again, together with two other common mistakes: case insensitive comparison and sorting in the UI. Let me elaborate a bit on all of those.

Trimming Leading and Trailing White Spaces

Honestly, I can’t think of a scenario where one doesn’t want to trim leading and trailing white spaces when user input is involved. The chances that users will add an extra space while typing, or paste text with spaces at the front or the end, are close to 100%. On the other hand, we are so used to software that handles such simple transformations for us that we already take it as a given. Hey, we are not in the 90s anymore (or at least most of us aren’t)!

For comma- or semicolon-separated lists, trimming the white space is a must-have. The simple reason is that people intentionally add spaces for readability, and not trimming those will result in such silly bugs.
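A minimal sketch of parsing that avoids the bug (the function name is mine):

```python
def parse_domain_list(raw):
    """Split a comma-separated list and trim whitespace around each item."""
    return [item.strip() for item in raw.split(",") if item.strip()]

# Spaces added for readability no longer break the comparison.
allowed = parse_domain_list("example.com, contoso.com , fabrikam.com")
print("contoso.com" in allowed)  # True

# User input gets the same treatment before the lookup.
user_domain = "  contoso.com ".strip()
print(user_domain in allowed)    # True
```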

Here is an interview tip for you! Next time you interview a developer, ask him to read the items from a comma-separated list and compare them to some user input. Depending on how he does on this simple assignment, you can guess the quality of your future implementation. I am pretty sure you will get very interesting answers.

Case Insensitive Comparison

How often does it happen that you type your credentials on some web site and you cannot log in? Then you start wondering whether you forgot your password, only to discover 5 minutes later that they forgot to mention on the login screen that your username is case sensitive. Smart, isn’t it? Not really! The Shift and Caps Lock keys are so close to each other that the odds of switching to caps lock while typing are quite high. My nickname is ToddySM - it doesn’t matter whether I type it “toddySM”, or “Toddysm”, or “ToDdYsM”, I still expect you to sign me in. If you don’t, the chances are that I will use your product less or not at all.

One note here though - make sure you don’t do case insensitive comparison for passwords. People expect passwords to be case sensitive because this is part of the algorithm for creating strong passwords. And to give you a good example of usability, Windows warns you with a yellow bang when you have Caps Lock on on the login screen - a small hint that I bet a lot of people appreciate (including myself).

To expand my interview tip from above - ask the developer to implement the back end for a login screen. Pay close attention to how the comparison of the strings is done - case insensitive for the username but case sensitive for the password.
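Here is a minimal sketch of such a comparison - case insensitive for the username (via casefold()), exact for the password. A real back end would of course compare password hashes, not plain text:

```python
def check_credentials(username, password, stored_username, stored_password):
    """Usernames compare case-insensitively; passwords never do."""
    return (username.casefold() == stored_username.casefold()
            and password == stored_password)

# Any capitalization of the username signs the user in...
print(check_credentials("toDDysm", "S3cret!", "ToddySM", "S3cret!"))  # True
# ...but the password must match exactly.
print(check_credentials("ToddySM", "s3cret!", "ToddySM", "S3cret!"))  # False
```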

Sorting in the UI

Let me be clear here - it is not about sorting, it is about providing an intuitive way to find the one thing one needs among tens or hundreds of others. Take for example the list of countries - if nobody bothered to show a sorted list of countries in the drop-down lists, every registration page on the Web would be a nightmare to fill in. Unless you have a special reason not to sort, and this reason is more than obvious for the average user, sorting is your best bet to make your UI usable. Even if the default view you provide is not sorted, you should give the user an option to sort ascending or descending.

Here is an example from TweetDeck where I have a few lists created. Of course, the developers of TweetDeck forgot the sorting - it is manageable (but still annoying) at the moment, but can you imagine if I had more than 20 lists created (and I do plan to have more than 20)?

[Screenshot: my TweetDeck lists, October 8, 2010]

A different example is the way regions are presented in the Windows Azure Portal. Here is the current list:

Anywhere Asia
Anywhere Europe
Anywhere US
East Asia
North Central US
North Europe
South Central US
Southeast Asia
West Europe 

Of course the list is sorted, but is it usable? I don’t think so - if I am a US customer I will expect to see all US regions together so that I don’t need to search for them. Here is a better choice (again sorted):

Asia - Anywhere
Asia - East
Asia - Southeast
Europe - Anywhere
Europe - North
Europe - West
US - Anywhere
US - North Central
US - South Central

(Don’t worry! We are fixing this soon ☺)

My last interview tip - use the first list above, and ask the developer to present it in the UI.  
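For the curious, the transformation from the first list to the second can be sketched in a few lines (assuming the geography is always the last word of the region name, which holds for this list):

```python
# The original region names, as listed above.
regions = [
    "Anywhere Asia", "Anywhere Europe", "Anywhere US",
    "East Asia", "North Central US", "North Europe",
    "South Central US", "Southeast Asia", "West Europe",
]

def regroup(name):
    """Turn 'North Central US' into 'US - North Central' so that regions
    in the same geography sort next to each other."""
    *modifier, area = name.split()
    return f"{area} - {' '.join(modifier)}"

# Sorting the regrouped names yields the second list above.
for region in sorted(regroup(r) for r in regions):
    print(region)
```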

Development is no longer about moving bits from one place to another - development nowadays is about the experience. I hope the tips above will help you develop better and more usable software in the future.

 

