Friday, December 14, 2007

Mines in the Field of GWT Development Planning

Recently, Google Web Toolkit has attracted the attention of web developers all over the world. GWT really is a great technology for AJAX development. It helps get rid of many headaches associated with cross-browser development, user interaction and the development cycle.

The library provides unprecedented possibilities for building Web 2.0 applications with a high level of interactivity. And therein lies a trap.

Often, the user interface of a web application is simpler to build than a desktop UI, due to two factors. First, the HTML/CSS ecosystem provides a great number of tools for easily expressing the designer's vision, and web pages need no build/run/debug cycle.

Second, HTML offers a significantly lower level of interactivity, and that interactivity is encapsulated by the browser. So developers don't need to debug each interaction.

GWT's revolutionary approach to AJAX development changes both of these factors. GWT apps can be highly interactive, and they are component-based. So the initial expectation that a web application's look and feel is easy to change is no longer entirely true. Now, a change in the visual design may require significant effort. Moreover, since we now use components, the visual design is limited by the available set of controls. A slightly fancy text edit field invented by a web designer may create a real mess for the dev team. Isn't it just like the good old days of desktop application UI?

So GWT development requires very tight cooperation between the design and development teams. Keep these guys in one place, preferably in a room with closed doors and a small window to serve food.

Another feature of a GWT application is interactivity. Yes, this is a great feature and a great headache.

As I said above, web 1.0 applications rarely respond interactively to the user's actions. So the first move of web designers is to provide static mockups of the UI. This does not work here. Developers need the behavior of the application to be specified. And the behavior is part of the code, which is not an easy thing to change. So you need to reserve enough resources to debug and tweak the UI.

So let me say this again. Keep your developers and interface designers as close as possible. Maybe this is why Google apps (Gmail, Reader and others) are so well thought out. The usability team works very closely with the developers.

Wednesday, December 12, 2007

Production system maintenance. Part 2

In the previous post I described several organizational aspects of a production system update. Now it is time for technical tips and tricks.

Name code identically on all system nodes
Usually, it is a good idea to name things identically across the systems you manage. It can be a real problem to have jboss4, jboss4.0.5 and jboss as names for the same piece of code on your cluster nodes. Use a single, simple convention for code naming and placement.

Name resources differently
While a single naming scheme for code reduces the time spent on unproductive things, resource naming is a different matter. In a mature environment, code can easily be restored from several places: developer machines, continuous integration, staging servers, the source repository, etc. Resources (i.e. data) are different. The contents and schema of your database may be unique at any given moment. Moreover, the data may be of great value to your company.

So damage to data should be avoided by any means. As the first line of defence, name your databases differently depending on the type of environment (production, testing, development, etc.), the contents and the schema version. I usually use the following db identifiers:

contents-timestamp_of_last_schema_update-database_type

For example: chirp-20071210-prod

While it looks quite complex, this notation may protect you from actions performed on the wrong data by mistake. Unfortunately, the "Oh God, I've dropped the wrong database" problem is not as rare as I first expected. ;)
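
The naming convention above can be put into a small helper so that nobody has to type identifiers by hand. This is only a sketch in Python: the function names and the set of environment labels are assumptions made for illustration, not part of any tool.

```python
from datetime import date

# Hypothetical environment labels; replace them with your own convention.
ENVIRONMENTS = {"prod", "staging", "test", "dev"}


def db_identifier(contents: str, schema_updated: date, environment: str) -> str:
    """Build a name of the form contents-YYYYMMDD-environment."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    return f"{contents}-{schema_updated:%Y%m%d}-{environment}"


def environment_of(identifier: str) -> str:
    """Return the environment suffix (the part after the last dash)."""
    return identifier.rsplit("-", 1)[-1]


print(db_identifier("chirp", date(2007, 12, 10), "prod"))  # chirp-20071210-prod
```

A maintenance script can call environment_of() on the database it is about to touch and stop if the answer is not the one it expects.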

Use consistent hostnames
A correctly set hostname is not an absolute requirement for a server to function. But the right names can help you during operation. It is fun to have a server called 'snoopy' if you have three in total. When you have more, their names should be a little more transparent and tied to their properties, such as the hosting data center and IP address.

Also, server software needs to know the name of the server it runs on. That is why the hostname and the DNS name must be the same.

Highlight your current context
Most maintenance errors I have seen were "the right command in the wrong context". For example, I have dropped a production db while thinking it was the staging environment. (This is why I now back up all data on the production server before an update.)

So put information about your current host, database and directory everywhere.
  • Put hostname, username and directory into the command line prompt.
  • Put hostname into the xterm title.
  • Put hostname and database name into mysql client prompt.
And be attentive.
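
The same idea carries over to maintenance scripts: print the context before doing anything, and refuse to run on an unexpected host. Below is a minimal Python sketch. The function names are hypothetical, and the host check is my own addition to the advice above, not something the prompt settings give you.

```python
import getpass
import os
import socket


def context_banner(user=None, host=None, directory=None, database=None):
    """Return 'user@host:directory [db: name]' for the current context."""
    user = user or getpass.getuser()
    host = host or socket.gethostname()
    directory = directory or os.getcwd()
    banner = f"{user}@{host}:{directory}"
    if database:
        banner += f" [db: {database}]"
    return banner


def confirm_context(expected_host, actual_host=None):
    """Raise an error unless the script runs on the host we expect."""
    actual_host = actual_host or socket.gethostname()
    if actual_host != expected_host:
        raise RuntimeError(
            f"expected host {expected_host!r}, but running on {actual_host!r}"
        )
```

Calling print(context_banner(database="chirp-20071210-prod")) as the first line of an update script costs nothing and puts the context right in front of your eyes.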

Don't work as root
Actually, this is impossible. Just try to work as superuser as little as possible. root or Administrator can do many dangerous things. You know, one error and your root filesystem is empty. ;-)

Have a remotely controlled power switch
This is not really required for a VPS. However, it may be essential for dedicated servers. It is not that rare for a system administrator to have to configure the network remotely. And it can be very, very, very inconvenient to have a node with an improperly configured network that can be managed only over ssh.

Yes, the data center support team can press "Power" for you. But only if you pay for 24x7 support.

Use cluster management software
Not long ago, Debian Package of the Day published an article about ClusterSSH, which I found very useful. There are several similar programs as well. While relatively simple, these programs may ease your life and decrease the number of errors.

Production system maintenance

Lately, I spent around 5 hours recovering a production system from a failure caused by improper actions of the engineering staff during an update procedure. I was lucky: it was not me who pressed the red button. But after that emergency I tried to analyze the causes of the problem and establish a better deployment process for software updates. So here are some thoughts that somebody may find interesting:

Keep people without relevant experience out during important updates
Yes, some people are pretty cool at development and may be very attentive. But a production system update requires the engineer to be a little paranoid. If you have no bad experience in this area, you will probably be unable to predict side effects and the bad things that might happen. Ideally, the update should be done by a pair of engineers. That greatly reduces the human factor.

Have a detailed update plan
Well, if the first point does not apply... Prepare a detailed update plan that includes everything (shell commands, file and database operations, etc.). It would be wonderful to test it in your staging environment first. Ideally, you should have an auto-install package (deb, rpm or whatever).

Have a detailed fallback plan
This is actually a continuation of the previous thought. Your update plan should include a description of a graceful fallback procedure. Some errors appear on the production system only. Or your development team might miss something. So you must have a way to return to the previous version. Always.
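
One way to keep the update plan and the fallback plan together is to write every step with its own undo action. The Python sketch below is only an illustration of that idea under my own assumptions: run_plan and the (name, apply, undo) step layout are hypothetical, not taken from any deployment tool.

```python
from typing import Callable, List, Tuple

# A step is (name, apply, undo); both callables take no arguments.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_plan(steps: List[Step]) -> bool:
    """Apply steps in order; on failure undo completed steps in reverse."""
    done: List[Step] = []
    for step in steps:
        name, apply, _undo = step
        try:
            apply()
        except Exception as exc:
            print(f"step {name!r} failed: {exc}; falling back")
            # Undo only the steps that completed, newest first.
            # An error inside undo() is not handled here and will propagate.
            for _name, _apply, undo in reversed(done):
                undo()
            return False
        done.append(step)
    return True
```

The failed step itself is not undone, so each apply action should either finish completely or leave nothing behind.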

Have a Plan B
In some cases the fallback scenario cannot help, for example when the update script is executed improperly, or when somebody has deleted important files or dropped the wrong database. Yes, your update script contains a backup step. But what if that step failed silently?
The general rule is "expect trouble, always". So regular backups of production are a must. Some data centers (like ours) even provide daily server backups in the default service package.
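
A cheap guard against a silently failed backup step is to check the backup file before any destructive command runs. The following Python sketch assumes the backup is a single file. The function names and the size threshold are hypothetical, and a size check does not prove the backup can be restored.

```python
import os


def backup_looks_valid(path: str, min_bytes: int = 1) -> bool:
    """Return True if the file exists and has at least min_bytes bytes."""
    try:
        return os.path.getsize(path) >= min_bytes
    except OSError:
        # Missing file, permission problem, and so on.
        return False


def require_backup(path: str, min_bytes: int = 1) -> None:
    """Stop the update if the backup is missing or suspiciously small."""
    if not backup_looks_valid(path, min_bytes):
        raise RuntimeError(f"backup {path!r} is missing or too small; aborting")
```

Call require_backup() right after the backup step and before the first step that changes data. Restoring the dump on a staging server from time to time is the only real test.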