Crossing the Rubicon (Op-Ed)
By mirleid Sat Aug 6th, 2005 at 11:57:59 PM EST
How often have you lovingly designed a system, using proven guidelines and patterns, making sure that there are consistent and coherent interfaces, only to have it butchered later in the development stage by others more concerned with immediate goals like processing X transactions per second, rather than longer term ones like maintainability and scalability?
This has happened to me more than once in projects that I worked on. The end result was always a system that performed according to spec, but that was not viable in the longer term.
So, I ask you: What is more desirable?
- A system that is consistently designed along coherent guidelines, using well understood design principles, even though those design principles might cause it to perform less efficiently than it would otherwise perform if some of those principles were relaxed or altogether dropped
- A system that is designed around achieving the end performance targets but suffers from design inconsistencies (i.e., different components being designed using different approaches) due to addressing performance concerns in inappropriate ways
The questions above become intensely relevant when the task at hand is building a mission-critical enterprise system of any meaningful size, with a projected lifespan on the order of 8-12 years. In fact, the very nature of the system and its life expectancy make issues like technology choices, maintenance and scalability of the solution increasingly important. And, I would argue, the system's expected performance normally stands in the way of making the right choices at design time. "What would be the point of creating a beautifully designed system that does not meet performance requirements, and is thus not fit for purpose?" you ask. If you are interested in my take on it, please read on.
Technology choices
When initiating a programme to build a system such as the one described above, one of the first things that needs to be decided is what technologies should be used to support it. By this I mean making choices like:
- Should we go WinTel or Unix? (This oversimplifies the issue, because we can have Linux running on Intel machines, but it'll serve to illustrate the point.)
- Which programming language or platform should we use (Java, .NET, C++)?
- Do we want a relational database or an object database?
There are people saying that the targeting of the system should only happen later in its development cycle, but I would argue that these choices need to be made up front. You might not get down to the level of selecting specific vendors, but you need to have a clear idea of what you are going to have available at product level when designing the system. For instance, if the system is required to support a 24/7 mode of operation, and the regulator for the specific market the company is operating in requires you to have two live-live data centers, with a DR site in a different country, this might influence you to choose a specific database vendor and product, due to the parallel operation and data replication requirements that this implies.
Maintenance
This is where things start getting hairy. Let's assume that, after examining your requirements, you decided that what you need is a system running on Solaris targeted at a J2EE-based execution platform. Well, writing EJBs (and doing it properly) is not something that anybody can do and is a skill that is hard to find (contrary to what most people posting CVs on the job sites would have you believe; and, yes, there's a bit more to J2EE than writing JSPs). So, you decide that you need to create some infrastructure code that will be used by the developers employed to write the system: this infrastructure code will materialize a number of patterns and coding guidelines aimed at "dumbing down" J2EE and thus making it possible to employ people that only know J2SE. Furthermore, creating this piece of infrastructure will ensure that there is a consistent "theme" to the code produced, making it simpler to code review and debug (or so you hope). It should also have a beneficial effect on the maintenance of the system, since the code that comprises it will fit a particular pattern that should be well documented (if not simple to understand).
Obviously, there is a flip side:
- Developers will start complaining almost instantly: this piece of infrastructure will necessarily restrict what they can do, and developers do not like to be restricted. If you hired consultants from the Big 5, their expectation is that, by being staffed to your project, they will acquire J2EE coding skills, which your infrastructure is preventing them from doing.
- By its very nature, your infrastructure piece will introduce overhead. This overhead means that there's less time for business logic to execute if SLAs are to be achieved, which leads to more complaining from the developers, because infrastructure is preventing them from delivering to spec. Overall, you are effectively slowing the system down by adding infrastructure.
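To make the trade-off concrete, here is a minimal sketch of what such an infrastructure piece might look like. It is plain Java with no container APIs, and every name in it (`BusinessStep`, `ManagedService`, the audit log) is a hypothetical stand-in of mine, not part of J2EE or of any real project. Developers only write the `BusinessStep`; the wrapper forces every call through the same validation and auditing, which is both the consistent "theme" and the overhead.

```java
import java.util.ArrayList;
import java.util.List;

final class InfrastructureSketch {
    /** What application developers write: plain business logic only. */
    interface BusinessStep<I, O> {
        O execute(I input);
    }

    /** Hypothetical infrastructure wrapper: one fixed path for every call. */
    static final class ManagedService<I, O> {
        private final String name;
        private final BusinessStep<I, O> step;
        private final List<String> auditLog = new ArrayList<>();

        ManagedService(String name, BusinessStep<I, O> step) {
            this.name = name;
            this.step = step;
        }

        O invoke(I input) {
            // Uniform validation: developers cannot skip it (the restriction).
            if (input == null) {
                throw new IllegalArgumentException(name + ": input must not be null");
            }
            // Uniform auditing: extra work on every call (the overhead).
            auditLog.add(name + " start");
            try {
                return step.execute(input);
            } finally {
                auditLog.add(name + " end");
            }
        }

        List<String> auditLog() {
            return auditLog;
        }
    }

    public static void main(String[] args) {
        ManagedService<Integer, Integer> pricing =
                new ManagedService<>("pricing", amount -> amount * 2);
        System.out.println(pricing.invoke(21));
        System.out.println(pricing.auditLog());
    }
}
```

Every service built this way looks the same to a reviewer, and every invocation pays for the bookkeeping whether it needs it or not.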
Scalability
Given that the system must live for quite a long time, it needs to be able to cope with (hopefully) expanding business volumes. In other words, it needs to be able to scale. In order to make sure that it scales, your architecture is based on asynchronous communication, so that you detach the producers from the consumers, enabling you to tune your system and allocate resources where they are most needed.
On the flip side, you have also decided that messages should be passed not in binary format, but in XML, because you communicate with a number of external systems (a number that is expected to grow) and you do not want to have to keep updating translation code throughout the life of the system. And, from a scalability perspective, this all makes perfect sense: asynchronous communication ensures that you can deploy more consumers if you need to process messages faster (even when the system is live and running), and XML ensures that you have decoupled your internal data formats from what external systems expect to see. The problem is that all this has introduced more overhead. And the system has, as a result, become slower.
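The shape of that architecture can be sketched in a few lines. This is an illustration under assumptions of mine, not the design of any actual system: an in-memory `BlockingQueue` stands in for whatever messaging product you would really use, and the `<order>`/`<amount>` message layout and the `<stop/>` sentinel are invented for the example. The point is that the producer never changes when you add consumers, and that each message pays for being built and parsed as XML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class AsyncXmlPipelineSketch {
    // Hypothetical sentinel message telling one consumer to stop.
    private static final String STOP = "<stop/>";

    /** Pushes messageCount XML orders through a queue; returns the sum of their amounts. */
    static long run(int messageCount, int consumerCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(consumerCount);
        try {
            Callable<Long> consumer = () -> {
                // DocumentBuilder is not thread-safe, so each consumer owns one.
                DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
                long sum = 0;
                while (true) {
                    String xml = queue.take();
                    if (STOP.equals(xml)) {
                        return sum;
                    }
                    // The XML overhead: every message is parsed before use.
                    Document doc = parser.parse(new InputSource(new StringReader(xml)));
                    sum += Long.parseLong(
                            doc.getElementsByTagName("amount").item(0).getTextContent());
                }
            };
            List<Future<Long>> results = new ArrayList<>();
            for (int c = 0; c < consumerCount; c++) {
                results.add(pool.submit(consumer));
            }
            // The producer only knows about the queue, not about the consumers.
            for (int i = 1; i <= messageCount; i++) {
                queue.put("<order id=\"" + i + "\"><amount>" + i + "</amount></order>");
            }
            for (int c = 0; c < consumerCount; c++) {
                queue.put(STOP); // one sentinel per consumer
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("pipeline failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(100, 1)); // one consumer
        System.out.println(run(100, 4)); // four consumers, same producer code
    }
}
```

Both calls produce the same total; only the number of workers draining the queue differs. A synchronous call from producer to consumer would avoid the queue and the parsing, which is exactly the kind of shortcut discussed next.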
So, what do you do?
You are now between a rock and a hard place. Your design achieves all the desired targets except for the one that you'll primarily be measured upon: the system being able to process those business transactions as fast as the business customers have (more often than not, arbitrarily) decided they should be processed.
The big question is: do you start compromising, "cutting corners" in the architectural design so as to accommodate specific performance requirements (e.g., allowing some components to invoke others synchronously), or do you just tell your client to buy (rent/lease/whatever) bigger boxes?
I would argue that the right option is the "bigger boxes" one. Obviously, this is not to say that you should not perform optimisations on your system, nor is it to say that you can get away with producing crap code that runs like a dog. What I am saying is that, in the long run, and at the rate that technology is evolving, you can always throw a bigger box at the problem, assuming your system scales (which is something that you might not even be able to achieve with the "optimised" approach to designing the system, given that you created a system that uses different techniques and mechanisms in different parts of it, and they respond differently to hardware-based scaling). What you cannot do is revert this state of affairs once you have gone down the "optimised" route.
Bottom line
Prepare to be confronted with the "what-are-you-talking-about-16-cpus-the-current-system-works-fine-on-1-and-runs-pretty-fast" syndrome. People who say this neglect to say that the current system is written in assembler and nobody can maintain it, except for those two contractors who have been around for 10 years earning more money than the CEO. And that there is a reason why it is being replaced.
What they also neglect to say (or put a dollar/pound/euro value on) is that the system cannot be extended because it simply wasn't designed for it: everything that was done to it over the last 10 years was to add yet another patch to something that now looks like prime hippy handicraft.
So, try to argue your point, and argue it strongly. When you finally lose (which you will 9 out of 10 times), remember to get in writing that it was their option. Otherwise, may God have mercy on your soul.