Optimization - still an Enemy?

May 31, 2009

In 1974, in the context of structured programming, Donald Knuth said “Premature optimization is the root of all evil”.

All evil huh, Donald?

Over time, “Don’t optimize early” or “Don’t optimize unless something is a problem” became one of the mantras of software development. In fact, it is one of the most obediently followed mantras in the software industry. The reason people follow it so closely is probably that the rule says “Don’t do it”. Don’t do anything, and you are following the principle. Don’t do anything (that you already don’t wanna do) and you get a pat on the back for doing great. How awesome! How easy! Don’t you wish all other principles in software development (and life in general) were as lenient as this one?

That cute little tip from the Guru decades ago has now been cleverly molded into a free pass to build things of the barest minimum quality, just enough to save your ass at the moment. As long as it seems to work somehow, it’s fine. In a way, it’s legal and ethical to do things the wrong (often quicker) way as long as it’s not a problem today. In today’s uber-speedy caricatures of Agile, it means “Just do it and push it into production”. In a slightly different scenario, it could mean “Just do it for now and realize at the end of the week that what you just did was crap”. Whether you release the insecure, bloated, unresponsive piece of crap that nobody is proud of or fix the crap to make it less crap depends on how shameless and ignorant you are. Unless a customer has found the problem, a problem is not a problem. So release it anyway coz it’s not a problem yet. Let it be a problem tomorrow.

Businesses, in general, seem to have a phobia of quality. Few companies make a quality product and then worry about scaling it to quantity. The majority churn out quantities of low-quality products, and then spend their lives trying to lift the curse. Some don’t have a clue; they just exist coz they were established and they have to keep moving.

Early analysis and optimization is a measure of your commitment to quality. Ironically, when you hear of optimization, you think it’s a developer’s problem; you think it’s that extra for-loop in the developer’s code. I agree, developers are not perfect. But neither is anyone else, nor is the process as a whole.

Of course, in a perfect team, the customers would know exactly what they want. And the analysts would know exactly what’s feasible and appropriate. The architect would architect the perfect holistic view of the business requirements with the system requirements and the technologies. The designers would design the perfect interfaces and workflows. The developers would write the perfect code. And all this would be accomplished in a perfect environment provided by the manager. Oh well, do we still need testers, coz we have already written the perfect implementation for the perfect set of requirements?

The reality is that the customer has no clue what he wants. The analyst has no idea what’s feasible. The architect is the one who retired from programming and hasn’t learnt any new technologies in the last five years. The developers have no clue about anything other than the few shortcuts in their favorite IDEs. Oh ok. We need testers by now. After all, we are all human and to err is human. So testers are the people who come in at the end and validate the work.

But that solves only a tiny piece of the puzzle. Quality assurance will verify that the code and UI behave as per the requirements, and the verification is based on the few sets of tests it runs. It doesn’t validate the requirements. It doesn’t validate the architecture. It doesn’t validate the code. It doesn’t validate the aesthetics or the usability. It doesn’t validate the infrastructure. If you can have bugs in the developer’s code, you can have bugs in the requirements too. You can have bugs in the architecture too. You can have bugs in the process and management itself. And the effect of defects in the rest of the process is usually a lot bigger than that of defects in the code.

You see the fundamental flaw in the process that I am talking about? Validation is done at the end of the whole cycle and, worse, it validates only a small portion of the whole work. The rest goes unvalidated for the most part, and you try to hide the filth under the carpet forever. But the ghost will appear from under the carpet.

What should happen is that, after every phase of the software development life cycle, you validate the output against your standards and optimize it right then if necessary. Don’t let the flaw bubble up the stack. You should optimize your requirements, you should optimize your design, you should optimize your architecture. Everything! If you are just postponing your problems, you are making them bigger and incurable. What would take 2 days to fix during design will take a month to fix after it’s developed.

Early optimization is a taboo phrase in the software development industry. It’s a shameful thing to do coz you are breaching the most popular software principle.

If you are a developer programming for the web, it’s even more complicated than it appears. You develop the product on your laptop like a laboratory test. What runs on your box will now be shipped to the Internet to be used by people all over the world with different connections, different clients and tools, and different degrees of understanding. It’s such a complicated thing that you do have to get your standards and priorities right. If you don’t think about performance until it’s a problem, then you are doomed. If you don’t think about scalability until it’s a problem, then you are doomed. Because if you got your basics wrong in design and analysis, the only way to fix and optimize it would be to redo the whole thing, which is usually not an option. You cannot optimize a crappy product and make it great again; all you can do is make it less crappy.

Optimization isn’t about spending weeks in a bat-cave trying to find a problem in somebody else’s shitty code. It’s more about doing things right from the beginning and minimizing the stupid things that could be avoided. It’s about setting a realistic standard for yourself, and living up to it every day. It’s about having a vision even before you do something, and doing what it takes to make it happen. Optimization is about fixing things that you know will be a problem in the foreseeable future. It’s about curing the cancer before you even develop it. It’s about being optimistic that you will be in business one year from now.

How many times did you rush a feature out in a week, and then spend the next three weeks fixing the problems in production? How many times have you made the wrong design decisions coz you thought Early Optimization is evil? How many times have you worked on the rewrite of a recent project? How many times did you try to achieve things like performance, scalability and security backwards, rather than incorporating them from the ground up?

NOT Optimizing early is clearly a much bigger problem in software development than optimizing early. But even then, if you can get away with a bad product, why bother making a good product. Right?

Book: Writing secure code

April 3, 2009

Have there been any inventions in human history that are as insecure as the software we use today? Just pull up the news from the last couple of weeks:

1. Google shares Docs without your permission: March 7, 2009 and again on March 26, 2009

2. Facebook reveals private photos on wall posts: March 20, 2009

3. Safari browser cracked in 2 seconds: March 18, 2009

4. Cached data exposes 20,000 Credit cards: March 20, 2009

If you actually go through the SANS list, you will be scared. And that is the world we live in today.

Writing Secure Code

An enormous amount of software has been written, deployed, and exposed over the Internet in the last 10 years without enough thought given to Security, and thus Security is going to be a huge, huge thing for the next 10 years. After all, you have to clean up your own shit, right (unless you are a dog)?

I started this book about a month ago, and I just finished it today. Written by Michael Howard and David LeBlanc from Microsoft, the book mostly talks in reference to C/C++ and the .NET Framework. But unlike other books that talk about cryptography, secure protocols and algorithms, this book actually talks about writing secure code on a daily basis, and develops some principles for building secure software. In that sense, although a little old and a little too big, this book is an awesome read for someone wanting to write secure code. While Absolute Security is a Myth, at least you can make it difficult for attackers to exploit the vulnerabilities.

Every good developer is a hacker himself. So the book goes into the details of buffer overflows, integer overflows, cross-site scripting, SQL injection, Code Access Security, proper use of Access Control Lists, cryptographic techniques and their proper use, encoding and internationalization, canonicalization, etc.

The book argues that Security should be part of the design rather than an add-on at the end of coding. You should define your trusted and untrusted boundaries, analyze all the threats involved, evaluate the risks associated with those threats, and define your security goals based on the risk factor. All this should be a fairly short, simple and high-level process, but it will tell the developer what needs extra attention and tell QA what to watch out for. According to the book, through code review alone you can reduce your bugs by 80%, and most of the bugs found in code review will hardly ever be found through QA testing.

The most interesting side of the book is how it relates all the security problems to some breach of fundamental security principles. Just for my own reference, there are quite a few security principles stated in the book that should be built into every developer’s subconscious:

1. Minimize the attack surface. (Think of hidden fields in forms)

2. All input is evil, unless proven otherwise. Also, assume all external systems you talk to are insecure. (Think whatever you want.)

3. Use principle of least privilege. Use elevated privilege only when you have to, and use it for the shortest amount of time possible.

4. Use defense in depth. Use OS level ACLs as the last line of defense.

5. Avoid security through obscurity. (Microsoft ?)

6. Security features != Secure features. Also don’t ever write your own encryption algorithms (unless ?).

7. Client-side security is an Oxymoron. Don’t try it (at work or otherwise).
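To make principle 2 a bit more concrete, here is a minimal sketch of my own (it is not taken from the book) that treats a file name coming from the outside as evil until proven otherwise. The class name UploadPathCheck, the method resolveInsideBase and the base directory /var/app/uploads are all made up for illustration. It only relies on java.nio.file.Path, so it assumes Java 7 or later.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class UploadPathCheck {

    // Hypothetical base directory, chosen only for this illustration.
    static final Path BASE = Paths.get("/var/app/uploads").toAbsolutePath().normalize();

    // Resolves an untrusted file name against BASE, collapses any "." and ".."
    // segments, and rejects the result if it no longer lives under BASE.
    static Path resolveInsideBase(String untrustedName) {
        Path candidate = BASE.resolve(untrustedName).normalize();
        if (!candidate.startsWith(BASE)) {
            throw new IllegalArgumentException("Path escapes base directory: " + untrustedName);
        }
        return candidate;
    }

    public static void main(String[] args) {
        // A plain file name stays inside the base directory.
        System.out.println("Accepted: " + resolveInsideBase("report.txt"));

        // A traversal attempt is rejected after normalization.
        try {
            resolveInsideBase("../../etc/passwd");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Note that normalize() works purely on the path string and does not follow symbolic links, so this is only one layer. In the spirit of defense in depth, OS-level permissions on the directory should still be the last line of defense.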

All in all, this book has completely changed the way I used to look at my own code and systems, and has definitely made me a better developer and a thinker.

Now I see loopholes everywhere in my own code. Am I guaranteed to be safe from SQL Injection attacks just because I used a parameterized query? Am I safe from Cross-site Scripting attacks just because I encoded the output? What if the attacker doesn’t use a browser? Did I canonicalize the filenames after the input? Although the language I use isn’t vulnerable to buffer overflows, am I safe from Integer overflows? What user is my application running as? What if somebody has already hacked into my system? While it’s not possible to chase after every theoretical bug within the code, we can at least prevent the ones that are obvious or extremely malicious.
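The integer overflow question is easy to try out in Java. The snippet below is a small sketch of my own, not something from the book; the class name OverflowCheck and its two methods are made up. It assumes Java 8 or later, since it uses Math.addExact. Plain int addition wraps around silently, while Math.addExact throws an ArithmeticException instead.

```java
public class OverflowCheck {

    // Plain int addition: wraps around silently on overflow.
    static int uncheckedAdd(int a, int b) {
        return a + b;
    }

    // Math.addExact (Java 8+): throws ArithmeticException on overflow.
    static int checkedAdd(int a, int b) {
        return Math.addExact(a, b);
    }

    public static void main(String[] args) {
        // Wraps from the largest int to the smallest one.
        System.out.println(uncheckedAdd(Integer.MAX_VALUE, 1)); // prints -2147483648

        try {
            checkedAdd(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            System.out.println("Overflow detected: " + e.getMessage());
        }
    }
}
```

So a memory-safe language does not save you here: a size or a price computed from untrusted input can still wrap around quietly unless you check for it.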

But it’s businesses that build software, right? So here comes the million dollar question:

Is it possible to write highly secure software without costing the company extra money and substantial time?

Answer: In general, at least most of the time, YES! Writing secure software is a habit more than anything else. However, secure code is only a small piece of the puzzle, and cannot alone make the system secure if other basics are violated.

Wordpress @ Slicehost

March 26, 2009

So I finally moved this blog from shared hosting with Godaddy to a Slicehost 256MB VPS slice running Ubuntu Hardy. The whole process of setting up DNS and installing Apache, MySQL, Postfix and Wordpress (including my favourite theme and plugins) was very easy, and I didn’t run into any problems. I did back up my database with Godaddy before migration, but the ‘Export/Import as XML’ seemed to work just fine. All in all, I was able to get it up and running in about an hour with all the content migrated. When there are documents like Mensk.com and the Slicehost Articles, you really don’t have anything left to think about.

With that said, I really wanted to get rid of Wordpress this time, or any other Wordpress wannabes. Wordpress is an awesome piece of software, but it’s just not what I ideally would like to have.

1. Wordpress isn’t really suited for posting long snippets of code. If you want to get it working, you end up spending some time trying to fix encoding, line-wrap and syntax-highlighting issues.

2. Wordpress is just too big for me. I don’t need those fancy features.

3. I don’t need a database to store a handful of rants of mine. Ideally, I would like to write a blog post in a text file (using some basic markup), and then just FTP it to a specific directory on my server, and it would just work. The day I don’t want to have a blog anymore, I would just grab that directory from my server and take it with me.

4. Every time I see a cool plugin or a theme I wanna try, I don’t want to be looking into every single line of code to see if there is anything malicious in there.

5. Every time I hear about any new vulnerability found in Wordpress, I don’t want to be worried about doing an upgrade.

I did briefly go through the major blogging software and some wiki software, but they are all built around the same philosophy and more or less suffer from the same problems. At one point, I almost went with Webby (a static site generator based on Ruby), but then I would have to go through a separate plugin for comments like Disqus, which I didn’t want.

So eventually I had to decide between writing my own basic blogging software or using Wordpress. I chose the latter, coz I think there are things way more important to do in the world than writing your own blogging software in 2009. Well, that might be just another way of saying that I am a loser.

Invoking Private Methods

March 3, 2009

A private modifier in Java means that the member (variable or method) can only be accessed within its own class.

As a rule, you should always make a class member private unless you have a reason not to. If you want a method to be visible outside of the class, you should make it public or protected. But let’s say you encounter a case where you need to invoke a private method of another class (you might need it while writing JUnit tests, or while writing debugger tools that need to access all public and private members). Can you access a private method of Class B from Class A? Is it possible?

Well, yeah. Use the Reflection API in Java. This will allow you to suppress the default Java language access control checks when using reflected members.

The AccessibleObject class within the java.lang.reflect package contains a method setAccessible(boolean flag). A false flag will enforce the Java language access checks, whereas a true flag will suppress them. So by setting the flag to true, you will be able to invoke a private method of another class.

Let’s say we have a Calculator class which has a private method called add.

package access;

public class Calculator {
	private int add(Integer a, Integer b) {
		return a + b;
	}
}

Now, by using Reflection, you can get a java.lang.reflect.Method object that represents the specified method. The Method class inherits from java.lang.reflect.AccessibleObject, which provides the setAccessible(boolean flag) method that you can use to suppress the access checks.

package access;

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class MainApp {

	public static void main(String[] args) {

		Calculator ac = new Calculator();

		try {

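			// Look up the private method by name and parameter types.
			Class<?> c = ac.getClass();
			Class<?>[] params = new Class<?>[] { Integer.class, Integer.class };
			Method m = c.getDeclaredMethod("add", params);

			// Suppress the access check, then invoke the method on the instance.
			m.setAccessible(true);
			Object o = m.invoke(ac, 1, 2);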

			System.out.println("The sum of the numbers is: "
					+ ((Integer) o).intValue());

		} catch (NoSuchMethodException x) {
			x.printStackTrace();
		} catch (InvocationTargetException x) {
			x.printStackTrace();
		} catch (IllegalAccessException x) {
			x.printStackTrace();
		}

	}

}

Once you set the Accessible flag to true, you can then invoke the method by passing any arguments that it requires. Running the class will print a sum of 3, which is calculated and returned by the private method ‘add’.

If you don’t set the flag to true, you will get an IllegalAccessException saying:

Class access.MainApp can not access a member of class access.Calculator with modifiers "private".
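If you want to see both outcomes side by side, here is a small self-contained sketch. The names SecretCalculator, AccessFlagDemo and tryAdd are made up for this illustration. SecretCalculator is just a stand-in for the Calculator class above, declared as a separate top-level class in the same file so that the whole thing compiles on its own.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical stand-in for Calculator, kept as its own top-level class.
class SecretCalculator {
    private int add(Integer a, Integer b) {
        return a + b;
    }
}

public class AccessFlagDemo {

    // Invokes the private add method through reflection. Returns the sum,
    // or null if the access check rejects the call.
    static Integer tryAdd(boolean suppressChecks) {
        try {
            SecretCalculator calc = new SecretCalculator();
            Method m = SecretCalculator.class.getDeclaredMethod("add", Integer.class, Integer.class);
            if (suppressChecks) {
                m.setAccessible(true);
            }
            return (Integer) m.invoke(calc, 1, 2);
        } catch (IllegalAccessException e) {
            // Thrown when the flag was left at its default of false.
            return null;
        } catch (NoSuchMethodException | InvocationTargetException e) {
            // Not expected in this sketch; surface it as a programming error.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flag=false -> " + tryAdd(false));
        System.out.println("flag=true  -> " + tryAdd(true));
    }
}
```

With the flag left alone, the call is rejected and tryAdd returns null. With the flag set to true, the private method runs and returns 3. The multi-catch syntax assumes Java 7 or later.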

Note: If there is a Security Manager, the context in which the code is run must have the suppressAccessChecks permission.

Frozen

February 18, 2009

Imagine what would happen to the earth if the sun were suddenly shut down for like 10 minutes (by some cosmic force). I guess the earth would be wrapped up in ice within instants, and most forms of life would be extinct within minutes. (Now some intellectually and morally bankrupt Hollywood filmmaker will steal this original imagination of mine from my blog and make a science fiction movie out of it, without giving me any credit. But that’s not what I am worried about right now.)

We normally take the sun for granted.

I did a little research. By commonly cited estimates, the sunlight that reaches the earth in about an hour carries roughly as much energy as the whole world uses in a year.

If the above is true, then how do you justify the $600 billion and 4000 American lives (Iraqis don’t have a life. So let’s not count theirs) lost during the Oil-War in Iraq? I wonder how many barrels of oil, on average, each soldier who died is equivalent to. And I wonder how many years of solar energy (or any other alternative energy for that matter) that $600 billion is equivalent to.