Category: Rants

Optimization - still an Enemy?

May 31, 2009

In 1974, within the context of Structured Programming, Donald Knuth said “Premature optimization is the root of all evil”.

All evil huh, Donald?

Eventually, “Don’t optimize early” or “Don’t optimize unless something is a problem” evolved into one of the mantras of software development. In fact, it is one of the most obediently followed mantras in the software industry. Probably the reason people follow it so closely is that the rule says “Don’t do it”. Don’t do anything, and you are following the principle. Don’t do anything (that you already don’t wanna do) and you are getting a pat on your back for doing great. How awesome! How easy! Don’t you wish all other principles in software development (and life in general) were as lenient as this one?

That cute little tip from the Guru decades ago has now been cleverly molded into a free pass to build things of the barest minimum quality, just enough to save your ass at the moment. As long as it seems to work somehow, it’s fine. In a way, it’s legal and ethical to do things the wrong (often quicker) way as long as it’s not a problem today. In today’s uber-speedy caricatures of AGILE, it means “Just do it and push it into production”. In a slightly different scenario, it could mean “Just do it for now and realize at the end of the week that what you just did was crap”. Whether you release the insecure and bloated piece of unresponsive crap that nobody is proud of, or fix the crap to make it less crap, depends on how shameless and ignorant you are. Unless a customer has found the problem, a problem is not a problem. So release it anyway coz it’s not a problem yet. Let it be a problem tomorrow.

Businesses, in general, seem to have a phobia of quality. Few companies make a quality product and then worry about scaling it to quantity. The majority churn out quantities of low-quality products, and then spend their lives trying to lift the curse. Some don’t have a clue; they just exist coz they were established and they have to keep moving.

Early analysis & optimization is a measure of your commitment towards quality. Ironically, when you hear of optimization, you think it’s a developer’s problem - you think it’s that extra for-loop in the developer’s code. I agree, developers are not perfect. But neither is anyone else, nor is the process as a whole.

Of course, in a practically perfect team, the customers would know exactly what they want. And the analysts would know exactly what’s feasible and appropriate. The architect would architect the perfect holistic view of the business requirements with the system requirements and the technologies. The designers would design the perfect interfaces and workflows. The developers would write the perfect code. And all this would be accomplished in a perfect environment provided by the manager. Oh well, do we still need testers when we have already written the perfect implementation for the perfect set of requirements?

The reality is that the customer has no clue about what he wants. The analyst has no idea about what’s feasible. The architect is the one who retired from programming and hasn’t learnt any new technologies in the last five years. The developers have no clue about anything other than the few shortcuts in their favorite IDEs. Oh ok. We need testers by now. After all, we are all human and to err is human. So testers are the people who come in at the end and validate the work.

But that solves only a tiny piece of the puzzle. Quality assurance will verify that the code and UI behave as per the requirements, and the verification is based on the few sets of tests it runs. It doesn’t validate the requirements. It doesn’t validate the architecture. It doesn’t validate the code. It doesn’t validate the aesthetics or the usability. It doesn’t validate the infrastructure. If you can have bugs in the developer’s code, you can have bugs in the requirements too. You can have bugs in the architecture too. You can have bugs in the process and management itself. And the effect of defects in the rest of the process is usually a lot bigger than that of the code and the coder.

You see the fundamental flaw in the process that I am talking about? Validation is done at the end of the whole cycle, and worse, it validates only a small portion of the whole work. The rest goes unvalidated for the most part, and you try to hide the filth under the carpet forever. But the ghost will appear from under the carpet.

What should happen is, after every phase of the software development life cycle, you should validate the output against your standards and optimize it right then if necessary. Don’t let the flaw bubble up the stack. You should optimize your requirements, you should optimize your design, you should optimize your architecture. Everything! If you are just postponing your problems, you are making them bigger and incurable. What would take 2 days to fix during design will take a month to fix after it’s developed.

Early optimization is a taboo phrase in the software development industry. It’s a shameful thing to do coz you are breaching the most popular software principle.

If you are a developer programming for the web, it’s even more complicated than it appears. You develop the product on your laptop like a laboratory test. What runs on your box will now be shipped to the Internet to be used by people all over the world with different connections, different clients and tools, and different degrees of understanding. It’s such a complicated thing that you have to get your standards and priorities right. If you don’t think about performance until it’s a problem, then you are doomed. If you don’t think about scalability until it’s a problem, then you are doomed. Because if you got your basics wrong in design and analysis, the only way to fix and optimize it would be to redo the whole thing, which is usually not an option. You cannot optimize a crappy product and make it great; all you can do is make it less crappy.

Optimization isn’t about spending weeks in a bat-cave trying to find a problem in somebody else’s shitty code. It’s more about doing things right from the beginning and minimizing the stupid things that could be avoided. It’s about setting up a realistic standard for yourself, and living up to it every day. It’s about having a vision even before you do something, and doing what it takes to make it happen. Optimization is about fixing things that you know will be a problem in the foreseeable future. It’s about curing the cancer before you even develop it. It’s about being optimistic that you will be in business one year from now.

How many times did you rush a feature in a week, and then spend the next three weeks fixing the problems in production? How many times did you make the wrong design decisions coz you thought Early Optimization is evil? How many times have you worked on the rewrite of a recent project? How many times did you try to achieve things like performance, scalability and security backwards, rather than incorporating them from the ground up?

NOT optimizing early is clearly a much bigger problem in software development than optimizing early. But even then, if you can get away with a bad product, why bother making a good one, right?

Wordpress @ Slicehost

March 26, 2009

So I finally moved this blog from shared hosting with Godaddy to a Slicehost 256MB VPS slice running Ubuntu Hardy. The whole process of setting up DNS and installing Apache, MySQL, Postfix and Wordpress (including my favourite theme and plugins) was very easy, and I didn’t run into any problems. I did back up my database with Godaddy before migration, but the ‘Export/Import as XML’ seemed to work just fine. All in all, I was able to get it up and running in about an hour with all the content migrated. When there are documents like Mensk.com and Slicehost Articles, you really don’t have anything left to figure out.

Having said that, I really wanted to get rid of Wordpress this time, along with any other Wordpress wannabes. Wordpress is an awesome piece of software, but it’s just not what I ideally would like to have.

1. Wordpress isn’t really suited for posting long snippets of code. If you want to get it working, you end up spending some time trying to fix encoding, line-wrap and syntax highlighting issues.

2. Wordpress is just too big for me. I don’t need those fancy features.

3. I don’t need a database to store a handful of my rants. Ideally, I would like to write a blog post in a text file (using some basic markup), and then just FTP it to a specific directory on my server, and it would just work. The day I don’t want to have a blog anymore, I would just grab that directory from my server and take it with me.

4. Every time I see a cool plugin or a theme I wanna try, I don’t want to be looking into every single line of code to see if there is anything malicious in there.

5. Every time I hear about any new vulnerability found in Wordpress, I don’t want to be worried about doing an upgrade.

I did briefly go through the major blogging packages and some wiki software, but they are all built around the same philosophy and more or less suffer from the same problems. At one point, I almost went with Webby (a static site generator based on Ruby), but then I would have had to use a separate service like Disqus for comments, which I didn’t want.

So eventually I had to decide between writing my own basic blogging software and using Wordpress. I chose the latter, coz I think there are things way more important to do in the world than writing your own blogging software in 2009. Well, that might be just another way of saying that I am a loser.

Frozen

February 18, 2009

Imagine what would happen to the earth if the sun were suddenly shut down for like 10 minutes (by some cosmic force). I guess the earth would be wrapped up in ice within instants, and most forms of life would be extinct within minutes. (Now some intellectually and morally bankrupt Hollywood filmmaker will steal this original imagination of mine from my blog and make a science fiction movie out of it, without giving me any credit. But that’s not what I am worried about right now.)

We normally take the sun for granted.

I did a little research. If we are able to trap all the solar energy falling onto the surface of earth for 3 minutes, it will solve all energy needs of the world for the next 25 years.

If the above is true, then how do you justify the $600 billion and 4000 American lives (Iraqis don’t have a life. So let’s not count theirs) lost during the Oil-War in Iraq? I wonder how many barrels of oil, on average, each soldier who dies is equivalent to. And I wonder how many years of solar energy (or any other alternative energy for that matter) that $600 billion is equivalent to.

REST WS is like fastfood!

January 17, 2009

I don’t know why I always had this picture in my mind. When I think of SOAP, I think of a typical public company and its IT department in a never-ending integration dilemma. When I think of REST, I mistakenly visualize some recently hacked social networking site or web2.0 application surviving on Google ads. I don’t know where that image came from, but I know it’s not fair and not always true.

SOAP wasn’t the easiest or the best, but there wasn’t anything better - up until REST emerged, when suddenly people started typecasting SOAP as bulky and over-engineered for most problems. I went with the crowd. If SOAP WS was the protocol of the Enterprise, REST brought a new set of developers to the game. REST was the protocol of the get-it-done and we-will-see believers. 

While I have been consuming RESTful web services for a couple of years now, I hadn’t actually designed and implemented one yet. But I knew HTTP, so I thought I knew REST. Just represent a resource in the URL and let different actions be invoked on it. If you use REST, you are probably using Rails or Grails or Django or the like. That makes it even easier. Just make your domain object addressable through the URL, and that’s all. We will address the other concerns in future iterations.

But once I started writing one myself, I started digging into books and articles and discussions looking for technical details of the RESTful world. All I saw was a mostly bewildered crowd and a small creative bunch that had already embraced REST. And all the advice and guidelines I could find came from purists and fanatics who stamp everything: this is Low REST, this is High REST, this is POX (Plain Old XML), this is unRESTfully REST, this is RESTfully unREST. I don’t understand why the RESTful Quotient is so overemphasized. Who the hell cares what Roy Fielding meant by REST in 2000? We just care about RESTful web services over HTTP, with an emphasis on every word except REST. For me, RESTful services are a way to expose some functionality through the URL. Whether I want to be disciplined about what I call resources is my design decision, but REST as an architectural style has to answer my enterprise concerns.

For me, a web service - whether it is RPC or SOAP or REST - should have to answer the same enterprise concerns. The difference is only in how they approach it:

1. Simple: Ok, REST is simple. REST is simple if you are talking about the barest minimum it takes to get it up and running. 

2. Secure: There are new standards like OAuth and HMAC-based authentication that make REST a little too complex, but they solve only a limited set of problems and scenarios. How do you secure your resource? If you are POSTing or PUTting an XML file, how do you secure it? How can you keep talking about REST without talking about an authentication mechanism for XML, or any data exchange format for that matter? That takes us back to the SOAP envelope and WS Security or something similar, doesn’t it?

Developers work on tight deadlines, and since security isn’t a built-in aspect of REST, they tend to sacrifice it to meet those deadlines, just like they used to do with JUnit. That makes REST security practically a nice-to-have. The few REST services that I have used all dealt with critical information, but one of them didn’t even have basic authentication, one had HTTP Basic, and the best of them had Basic over SSL.

3. Transactional: Is a resource the correct unit of transaction? What if you need to deal with transactions across multiple resources? Do you handle that on the server side or let the client handle it? If I make a convenience API to deal with transactions and expose it as a URL, would you blame me for breaching the RESTful contract? How important is being truly RESTful? Does REST even want to address transactions, or are PUT and POST on a resource enough?

4. Efficient: How can it be efficient without support for complex transactions? Whether it is REST or not, web services exist to give clients appropriate control over their assets, and they should be reasonably fast. Again, is providing the necessary functionality important or is being RESTful important? How do I CRUD on a collection of resources? Most of the RESTful public APIs that I have used either provide me too much (they are painfully slow) or provide me too little. What I need is information, and it could be across multiple resources.
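
To make the HMAC-based authentication from point 2 a bit more concrete, here is a minimal Python sketch of signing and verifying a request body. Everything specific in it is an assumption made for illustration: the shared secret, the layout of the signed string (method, path, timestamp, body digest) and the five-minute clock-skew window do not come from any particular standard or service. It only shows that the sender knew the secret and that the payload was not altered; it does nothing for confidentiality, which still needs SSL or message-level encryption.

```python
import hashlib
import hmac
import time

# Hypothetical shared secret; a real service would issue one per client.
SECRET = b"demo-shared-secret"


def sign_request(method, path, body, timestamp, secret=SECRET):
    """Return a hex HMAC-SHA256 signature over the essentials of a request."""
    body_digest = hashlib.sha256(body).hexdigest()
    # Assumed canonical form: one field per line, in a fixed order.
    message = "\n".join([method.upper(), path, str(timestamp), body_digest])
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_request(method, path, body, timestamp, signature,
                   secret=SECRET, max_skew=300, now=None):
    """Check freshness first, then compare signatures in constant time."""
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > max_skew:
        return False  # too old or too far in the future: possible replay
    expected = sign_request(method, path, body, timestamp, secret)
    return hmac.compare_digest(expected, signature)
```

A client would send the timestamp and signature alongside the PUT or POST (for example in headers), and the server would recompute the signature from what it actually received. Even this small sketch hints at the point above: once you add key distribution, replay protection and payload encryption, the "simple" part of REST starts to shrink.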

Is a RESTful service anything more than pretty URLs taking and returning XML/JSON? Does it really make sense when I need information and action across resources? Is it still fine if there is no API definition and the API keeps changing? Is it still fine if it only does the happy path well? How many developers know enough about REST to ensure that the standards that should have been specified and enforced by the architecture are met? It’s scary that too many critical web services have already been written in REST just for the sake of ease, without putting much thought into it.
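
One way to picture the "action across resources" question is a convenience batch operation that applies several changes as all-or-nothing. The Python sketch below is only an illustration under stated assumptions: the in-memory dictionary standing in for the server's resources, the operation format and the names `apply_batch` and `BatchError` are all hypothetical, and whether such an endpoint still counts as RESTful is exactly the open question.

```python
import copy


class BatchError(Exception):
    """Raised when any operation in a batch cannot be applied."""


def apply_batch(store, operations):
    """Apply every operation or none of them.

    `store` maps resource paths to representations. The work happens on a
    deep copy, so a failure part-way through leaves the original untouched.
    """
    working = copy.deepcopy(store)
    for op in operations:
        method, resource = op["method"], op["resource"]
        if method == "PUT":
            working[resource] = op["body"]
        elif method == "DELETE":
            if resource not in working:
                raise BatchError("cannot delete missing resource " + resource)
            del working[resource]
        else:
            raise BatchError("unsupported method " + method)
    return working  # the caller commits by swapping this in for the old store
```

The copy-then-swap approach is the simplest thing that gives atomicity in a sketch; a real service would lean on database transactions instead. The design cost is the same either way: the client now talks to a "batch" URL that is a procedure rather than a resource.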

I probably will never have to go back to SOAP again (it’s not my decision though), but the day REST has answered the important questions, it will look like some sort of SOAP WS stack - back to where we started.

Until then, REST is like fastfood. SOAP WS is a little tedious to cook, but you won’t regret it.

Erlang - overhyped or underestimated?

November 14, 2008

Erlang is like an exotic beautiful woman with no dressing sense.

I came across Erlang about a year ago. It is a one-of-a-kind language - precise, crafted and powerful. It’s been around for about two decades now, and people have been using it for serious real-time applications. But the hype Erlang has been getting in the last couple of years is mostly for the wrong reasons.

Having said that, it’s a language every programmer should look into at least once - it’s refreshing. And it might just be the perfect language that fits your needs.

What doesn’t Erlang have? It has its own virtual machine that it runs on. Its message-passing style of concurrency makes it ridiculously simple to write distributed applications. It has its own database. It supports hot code swapping without even restarting the servers, and that has already been proven in some telecom applications that need very high uptime. Hot code swapping is still a nightmare even in a supposedly mature language like Java, which is targeted at building web applications that need high availability and scalability.
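
For readers who haven't seen message-passing concurrency, here is a rough approximation of the idea. It is written in Python rather than Erlang, using a thread plus a queue as a stand-in for an Erlang process and its mailbox, so it is only an analogy: the names `counter_actor`, `spawn` and `ask_total` are made up for this sketch, and none of Erlang's distribution, supervision or hot code loading is represented.

```python
import queue
import threading


def counter_actor(mailbox):
    """Own some private state and change it only in response to messages."""
    total = 0
    while True:
        message = mailbox.get()          # block until a message arrives
        kind = message[0]
        if kind == "add":
            total += message[1]
        elif kind == "get":
            message[1].put(total)        # reply on the sender's own queue
        elif kind == "stop":
            break


def spawn(actor):
    """Start an actor on its own thread and return its mailbox and thread."""
    mailbox = queue.Queue()
    thread = threading.Thread(target=actor, args=(mailbox,), daemon=True)
    thread.start()
    return mailbox, thread


def ask_total(mailbox, timeout=2.0):
    """Send a 'get' message and wait for the reply."""
    reply = queue.Queue()
    mailbox.put(("get", reply))
    return reply.get(timeout=timeout)
```

Usage would look like `mailbox, thread = spawn(counter_actor)`, followed by `mailbox.put(("add", 5))` and `ask_total(mailbox)`. Nothing is shared between the caller and the actor except messages, which is the property that lets Erlang move the same pattern across machines.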

However, there are certain things about Erlang that either suck or are not as convincing.

1. Today’s mainstream developers who are used to C or Java like syntax won’t find its Prolog-like syntax too friendly. Unless you started programming in the 80’s or 90’s and are used to languages of similar syntax, it will take quite some time before you get comfortable with Erlang’s weird syntax. I never felt too comfortable with the syntax.

2. While the core language itself is small and easy to learn, the libraries within the language are inconsistent, incomplete and poorly documented. I posted a couple of questions in the forums regarding how to use a library, and usually the answers would be “Don’t use that library, use the other one”. (Oh yeah, kind of like the Java JDK Logging. Please use Commons Logging.)

3. Only a few people have written production-level code and you rarely get to hear from them. All you hear from are Erlang enthusiasts, who are hyping it as the next big thing but haven’t done more than a few labs from Armstrong’s book.

4. I can’t imagine how you can organize large code-bases in Erlang or even work as a team, and this doesn’t feel right to any OO programmer.

5. Most of the performance metrics are one-sided, and are produced by people who have an interest in Erlang. I would love to see some independent analysis.

6. Its support for web development is very primitive. With web frameworks like Rails and Grails around, there is a lot of serious work ahead for Erlang if it ever intends to go to that market.

7. Did I talk about Strings in Erlang? IO speed?

I know weaknesses aren’t as important as the strengths of a language. Erlang has its own expertise, its syntax structure, and its own audience. But the flaws of Erlang might just turn away a new programmer, even before he gets to its beauty.

If you are writing a web crawler, Erlang may very well be your choice. If you want to write a client-server system where the client makes a large number of requests, and you want to spawn concurrent processes to process the requests, Erlang could be your choice. If you want to write a Distributed Hash Table, Erlang could be your choice. The same goes if you are writing a video streaming server, doing system integration or writing any system utility. But a regular developer (building a CRUD application on top of a database, right?) doesn’t have much to do with Erlang as yet. Secondly, even if you are working on those highly scalable, reliable and concurrent systems, people have a hard time accepting Erlang along with its flaws.

The industry has a definite space for Erlang, currently and more so in future as we deal with more and more users, more data, and more forms of distribution. If not for Erlang exactly, then for an improved version of Erlang. It isn’t here to be the next Java, but to solve the problems that Java couldn’t solve smoothly in over a decade (despite having such a great community).

Erlang is going from an underestimated to an overhyped language. I wish it could convert the hype and raw interest into something meaningful. How about a modern variation of Erlang on Erlang’s virtual machine? Is it too late?