Category: Java

Nginx Proxy to Jetty for Java Apps

March 4, 2011

Traditionally, I have gone with Apache, mod_jk and Tomcat to host Java web apps. But this time I was working on a small hobby project written in Groovy on Grails and had to deploy it to a VPS with very limited resources. I had to make the most of the server I had, so I went with a combination of Nginx and Jetty.

If you’ve never heard of Nginx, it is a very simple HTTP server that is known for its high performance and its low, predictable memory footprint under load. It uses an asynchronous event-driven model to handle requests, which enables it to handle a large number of concurrent requests efficiently.

Similarly, Jetty provides a very good Java Servlet Container. Jetty can be used either as a standalone application server or embedded into an application or framework as an HTTP component or a servlet engine. It serves as a direct alternative to Tomcat in many cases. Because of its use of NIO and its small memory footprint, it scales very well.

Below, I will jot down the steps I went through to configure Nginx as a frontend to Jetty on my VPS running Ubuntu Hardy.

Install Java:

$ sudo apt-get install openjdk-6-jdk
$ java -version
java version "1.6.0_0"
OpenJDK  Runtime Environment (build 1.6.0_0-b11)
OpenJDK 64-Bit Server VM (build 1.6.0_0-b11, mixed mode)

$ which java
/usr/bin/java

Install Jetty:

Download the latest version of Jetty, and upload the tar file to the directory of your choice on your server.

$ scp jetty-6.1.22.tar user@sacharya.com:/user/java

Now log in to your server and go to the directory where you uploaded Jetty above.

$ cd /user/java
$ tar xvf  jetty-6.1.22.tar

Now you can start or stop the Jetty server using the following commands:

$ cd /user/java/jetty-6.1.22/bin
$ ./jetty.sh start
$ ps aux | grep java
root     21766  1.2 72.4 1085176 387196 ?    Sl   Mar27   1:12 /usr/lib/jvm/java-6-openjdk/
/bin/java -Djetty.home=/user/java/jetty-6.1.22 -Djava.io.tmpdir=/tmp -jar
/user/java/jetty-6.1.22/start.jar /user/java/jetty-6.1.22/etc/jetty-logging.xml
/user/java/jetty-6.1.22/etc/jetty.xml

The jetty logs are under jetty-6.1.22/logs if you are interested.

Open up your bash profile and set the following paths:

$ vi ~/.bash_profile
JAVA_HOME=/usr/lib/jvm/java-6-openjdk/
JETTY_HOME=/user/java/jetty-6.1.22/

PATH=$JETTY_HOME/bin:$PATH

export JAVA_HOME JETTY_HOME

Now that Jetty is running, you can go to its default port 8080 and verify that everything is working as expected.

Now that you have Jetty, it’s time to deploy your app to the Jetty container.

$ scp myapp.war  root@sacharya.com:/user/java/jetty-6.1.22/webapps
$ cd /user/java/jetty-6.1.22/webapps
$ unzip myapp.war -d myapp

$ vi /user/java/jetty-6.1.22/contexts/myapp.xml

<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Mort Bay Consulting//DTD Configure//EN" "http://jetty.mortbay.org/configure.dtd">
<Configure class="org.mortbay.jetty.webapp.WebAppContext">
  <Set name="configurationClasses">
    <Array type="java.lang.String">
      <Item>org.mortbay.jetty.webapp.WebInfConfiguration</Item>
      <Item>org.mortbay.jetty.plus.webapp.EnvConfiguration</Item>
      <Item>org.mortbay.jetty.plus.webapp.Configuration</Item>
      <Item>org.mortbay.jetty.webapp.JettyWebXmlConfiguration</Item>
      <Item>org.mortbay.jetty.webapp.TagLibConfiguration</Item>
    </Array>
  </Set>
  <Set name="contextPath">/</Set>
  <Set name="resourceBase"><SystemProperty name="jetty.home" default="."/>/webapps/myapp</Set>
</Configure>

Restart Jetty and go to http://ipAddress:8080/myapp, and you should see your app.

Install Nginx:

$ sudo aptitude install nginx

This will put the Nginx configuration under /etc/nginx.

You can start, stop or restart the Nginx server using the commands:

$ sudo /etc/init.d/nginx start
$ sudo /etc/init.d/nginx stop
$ sudo /etc/init.d/nginx restart

Go to your server’s IP address (or localhost, if you are working locally) in your browser, and you should be able to see the default welcome page.

Nginx Proxy to Jetty:
Now, let’s configure Nginx as a proxy to our Jetty server:

$ cd /etc/nginx/sites-available
$ vi default
Point your proxy_pass to:

location / {
    proxy_pass         http://127.0.0.1:8080;
}

Basically, Nginx listens on port 80 and forwards each request to port 8080. Since the context file above maps the context path / to webapps/myapp, any request that reaches Nginx at http://127.0.0.1 is answered by myapp running in Jetty.

Now if you type your IP address or domain name in the browser, content will be served from your application in Jetty. Right now, you are serving everything through Jetty, including static files like images, JavaScript and CSS. But you can easily serve the static files directly through Nginx. Just add a couple of locations in there:

location /images {
    root /user/java/jetty-6.1.22/webapps/myapp;
}
location /css {
    root /user/java/jetty-6.1.22/webapps/myapp;
}
location /js {
    root /user/java/jetty-6.1.22/webapps/myapp;
}

My final configuration is:

server {
    listen   80;
    server_name sacharya.com;

    access_log  /var/log/nginx/localhost.access.log;

    location / {
        proxy_pass http://127.0.0.1:8080;
    }
    location /images {
        root /user/java/jetty-6.1.22/webapps/myapp;
    }
    location /css {
        root /user/java/jetty-6.1.22/webapps/myapp;
    }
    location /js {
        root /user/java/jetty-6.1.22/webapps/myapp;
    }

    # redirect server error pages to the static page /50x.html
    #
    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /var/www/nginx-default;
    }
}

Find the Jar File Given a Class Name

March 3, 2011

Often, while working in Java, you get a ClassNotFoundException or a ClassCastException and you are trying to find out which Jar the class belongs to and where it is located in the classpath. Your application is either not finding the class or finding the wrong class with the same name in the classpath. So you want to know which Jar your class is coming from at runtime and whether that is the right class.

Grep to find all Jars with the Class name:

You could write a little bash script to do a find for the class file within your file system, but that doesn’t tell you what’s loaded in the classpath. So it will give you a whole load of jars that merely contain the class name:

$ cd ~/.groovy
$ find . -name "*.jar" -exec sh -c 'jar -tf {} | grep -H --label {} org.apache.commons.httpclient.HttpClient.class' \;
./lib/commons-httpclient-3.1.jar
./lib/commons-httpclient-3.1_patched.jar

The above command searches the current directory and all subdirectories for jars. Then, for each jar file, it lists the contents using jar -tf and looks for the Java class org.apache.commons.httpclient.HttpClient.class. The output will differ depending on where I run the command from.

Java Class to find Jars with the given Class in Classpath:

But you only want to find the Jar file loaded into the Java classpath. Here’s a simple Java class that does the same from within a main class:

import java.net.URL;

import org.apache.commons.httpclient.HttpClient;

public class MainApp {

    public static void main(String[] args) {
        System.out.println(findPathJar(HttpClient.class));
    }

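    public static String findPathJar(Class<?> context) throws IllegalStateException {
        URL location = context.getResource('/' + context.getName().replace(".", "/")
                            + ".class");
        if (location == null || !"jar".equals(location.getProtocol())) {
            // Not loaded from a jar (for example a plain .class file on disk)
            throw new IllegalStateException("Class is not loaded from a jar: " + context.getName());
        }
        // The path looks like file:/path/to/some.jar!/org/.../HttpClient.class
        String jarPath = location.getPath();
        return jarPath.substring("file:".length(), jarPath.lastIndexOf("!"));
    }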
}

This will print the jar file in the classpath that contains the class HttpClient.class:

/Users/sacharya/Documents/MyLibs/commons-httpclient-3.1.jar

The output will be the same no matter where I run the class from, since it looks at the classpath and not the current directory.
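As a rough alternative to parsing the resource URL, the JVM can also be asked for the code source of a class. The sketch below uses only the standard library. The class name CodeSourceLocator and the placeholder text are made up for illustration, and core classes such as String normally have no code source recorded, so the method falls back to the placeholder for them.

```java
import java.net.URL;
import java.security.CodeSource;

public class CodeSourceLocator {

    // Returns the jar or directory a class was loaded from, or a placeholder
    // when the JVM records no location (typical for core classes like String).
    public static String locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source recorded)";
        }
        URL location = source.getLocation();
        return location.getPath();
    }

    public static void main(String[] args) {
        System.out.println(locate(String.class));
        System.out.println(locate(CodeSourceLocator.class));
    }
}
```

For a class that lives in a jar on the classpath, this should give the path of that jar, without the substring work on the "!" separator.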

Handy Groovy Script to find Jar with the given Class in Classpath:

#!/usr/bin/env groovy

def klass
System.in.withReader {
   println "Enter the Class name you want to find the jar for:"
   klass = it.readLine()
}
def context = Class.forName(klass)
def absolutePath = context.getResource('/' + context.name.replace(".", "/")
           + ".class").getPath()
println absolutePath.substring("file:".length(), absolutePath.lastIndexOf("!"))

Running this as a script, I get:

$ ./getJarFile.groovy
Enter the Class name you want to find the jar for:
org.apache.commons.httpclient.HttpClient
/Users/sacharya/.groovy/lib/commons-httpclient-3.1.jar

Again, the output will be the same no matter where I run the script from.

Last of all, I also find the site http://jarfinder.com/ very helpful for finding Jar files for a given class.

Using Memcached with Java

August 10, 2009

Why not JBoss Cache?
By default, if you are looking for a caching solution for your Java-based enterprise application, the tendency is to go with Java caches. I have been using JBoss Cache for a couple of years now. It is a very powerful, smart cache, which provides clustering, synchronized replication and transaction support. That means that, given a cluster of JBoss Cache instances, each instance is aware of the others and is kept in sync. That way, if one of the instances is down, the other instances will still be serving your data.

Having been plagued with memory problems over and over again, I finally gave up on JBoss Cache and decided to go with a simpler and dumber solution - Memcached.

Memcached is widely popular esp. in the PHP and Rails community. My main reasons for switching from JBoss Cache to Memcached are:

1. JBoss Cache is replicated, so there is the overhead of syncing the nodes. All the nodes try to keep the same state. Memcached is distributed and each node is dumb about the other nodes. Each piece of data lives in only one of the nodes. And the nodes don’t know about each other. If one node fails, only some hits are missed. While this may seem like a disadvantage, it is actually a blessing if you are willing to give up the complexity for simplicity and ease of maintenance.

2. JBoss Cache comes with a pretty complicated configuration. Memcached doesn’t require any configuration.

3. JBoss Cache lives in your JVM, and you have to tune the JVM for optimum memory, which isn’t always fun as the nature and amount of your data changes. Memcached uses the amount of RAM you specify. If the memory becomes full, it will evict older data based on LRU.

In short, the fact that Memcached is so simple and requires almost no maintenance was a big win for me. However, if your application is such that a more sophisticated cache makes sense, you should definitely consider using one.
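The LRU eviction mentioned in point 3 can be pictured with a few lines of plain Java. This is only an in-process illustration built on LinkedHashMap, not how Memcached is implemented internally. The class name LruSketch and the capacity of 2 are assumptions for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruSketch(int capacity) {
        super(16, 0.75f, true); // true = keep entries ordered by most recent access
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after every put: drop the least recently
    // used entry once the map grows past its capacity.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruSketch<String, String> cache = new LruSketch<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");       // touch "a", so "b" is now the least recently used
        cache.put("c", "3");  // over capacity: "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```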

Memcached:

Memcached server (protocol defined here) is an in-memory cache that stores anything from binary to text to primitives associated with a key, as a key-value pair. As with any other cache, storing data in memory saves you from going to the database, file server or any other backend system every time a user requests the data. That takes a lot of load off your backend systems, leading to higher scalability. Since the data is stored in memory, it is generally faster than making an expensive backend call, too.

However, Memcached is not a persistent store, and doesn’t guarantee something will be in the cache just because you stored it. So you should never rely on the fact that Memcached is storing your data. Memcached should strictly be used for caching purposes only, and not for reliable storage.

The only limitation with Memcached (that you need to be aware of) is that a key must not be longer than 250 bytes and, by default, each value shouldn’t exceed 1 MB.

Installation:
1. Install Libevent
Memcached uses the Libevent library for network IO.

$ cd libevent-1.4.11-stable
$ autoconf
$ ./configure --prefix=/usr/local
$ make
$ sudo make install

2. Install Memcached:
Download the latest version of Memcached from Danga.com, which originally developed Memcached for LiveJournal.

$ cd memcached-1.4.0
$ autoconf
$ ./configure --prefix=/usr/local
$ make
$ sudo make install

3. Run memcached:
Start memcached as a daemon with 512 MB of memory, listening on 127.0.0.1 on port 11211 (the default). Then you can telnet to the server and port and use any of the available commands.

$ memcached -d -m 512 -l 127.0.0.1 -p 11211

$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.
get joe
END
set joe 0 3600 10  (Note: TTL 3600 and 10 bytes)
California
STORED
get joe
VALUE joe 0 10
California
END

Spy Memcached (Memcached Java Client):
Basic Usage:

There are a few good Java clients for Memcached. I briefly looked at Whalin’s Memcached Client and Dustin’s SpyMemcached Client, and decided to go with the latter for minor reasons. You can start with the API as shown in the docs:

MemcachedClient c=new MemcachedClient(new InetSocketAddress("127.0.0.1", 11211));
c.set("someKey", 3600, someObject);
Object myObject=c.get("someKey");
c.delete("someKey");

The MemcachedClient is a single-threaded client to each of the Memcached servers in the pool. The set method stores an object in the cache under a given key; if a value already exists for the key, it is overwritten. It takes a timeToLive value in seconds, after which the object expires. Even when many requests come in at once, the client processes one operation at a time while the rest wait in a queue. The get method retrieves the object for the given key, and the delete method removes it.

There are other methods available for storage, retrieval and update, but most of the time you will get by with just the three methods get, set and delete.
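To make the set/get/delete semantics and the time-to-live concrete, here is a small in-memory stand-in. It is not the SpyMemcached API: the class name TtlStoreSketch is made up, and the current time is passed in explicitly (an assumption for illustration) so that expiry is easy to follow without waiting.

```java
import java.util.HashMap;
import java.util.Map;

public class TtlStoreSketch {

    // A stored value together with the moment it stops being valid.
    private static final class Entry {
        final Object value;
        final long expiresAtMillis;

        Entry(Object value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // Stores or overwrites the value for the key, valid for ttlSeconds.
    public void set(String key, int ttlSeconds, Object value, long nowMillis) {
        entries.put(key, new Entry(value, nowMillis + ttlSeconds * 1000L));
    }

    // Returns the value, or null on a miss or once the entry has expired.
    public Object get(String key, long nowMillis) {
        Entry entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (nowMillis >= entry.expiresAtMillis) {
            entries.remove(key);
            return null;
        }
        return entry.value;
    }

    public void delete(String key) {
        entries.remove(key);
    }
}
```

A value set with a TTL of 3600 at time 0 is returned by get for the next hour and is treated as a miss from then on.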

Security:

By design, the memcached server doesn’t have any authentication around it. So it’s your job to secure the memcached server or its port from outside networks. Furthermore, just to obscure the key, you can prefix your key with some secret code or use the hash of the key as the key.

For example:

String randomCode = "aaaaaaaaaaaaaaaaaaaa";
c.set(randomCode + "someKey", 3600, someObject);
Object myObject=c.get(randomCode + "someKey");
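The other option mentioned above, using a hash of the key as the key, could look roughly like this. The helper name KeyHasher and the choice of MD5 are assumptions for illustration. MD5 is used here only to obscure and shorten the key, not as a security measure.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class KeyHasher {

    // Returns the MD5 digest of the key as 32 lowercase hex characters.
    public static String md5Hex(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("hello")); // prints 5d41402abc4b2a76b9719d911017c592
    }
}
```

A side benefit is that the hashed key always has the same length, so even a long SQL query used as a key stays well within the key size limit.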

Adding/Removing a cache server:

If you need to scale up and want to add a new memcached server, you just need to add the server IP and port to the pool of existing servers, and the memcached client will take it into account. If you want to scale down and get rid of a server, just remove the server from the pool. There will be cache misses for the data that lived on that server for a while, but the cache will soon recover as it starts caching the data onto the other available servers. The same thing will happen if you lose connectivity to one of the servers. If you are worried about flooding the database when you lose a memcached server, you should have the data pre-fetched onto another server. However, the memcached servers themselves don’t know anything about each other. It’s all a function of the client.

MemcachedClient c =  new MemcachedClient(new BinaryConnectionFactory(),
                        AddrUtil.getAddresses("server1:11211 server2:11211"));
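To see why the choice of server is purely a client-side matter, here is a deliberately naive sketch that picks a server from the hash of the key. The class NaiveServerPicker and the modulo scheme are assumptions for illustration only; the real client may well use a different strategy for locating nodes.

```java
import java.util.Arrays;
import java.util.List;

public class NaiveServerPicker {
    private final List<String> servers;

    public NaiveServerPicker(List<String> servers) {
        this.servers = servers;
    }

    // Maps a key to one server; the same key always lands on the same
    // server as long as the pool does not change.
    public String pick(String key) {
        int index = Math.floorMod(key.hashCode(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        NaiveServerPicker two = new NaiveServerPicker(
                Arrays.asList("server1:11211", "server2:11211"));
        NaiveServerPicker one = new NaiveServerPicker(
                Arrays.asList("server1:11211"));
        System.out.println(two.pick("AllProducts"));
        System.out.println(one.pick("AllProducts"));
    }
}
```

When the pool changes, some keys map to a different server and simply miss there until they are cached again, which is the recovery behaviour described above.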

Connection Pooling:

The MemcachedClient keeps a TCP connection open to the memcached server (Facebook has released a modified version of memcached that uses UDP to reduce the number of connections). So you might want to know how many connections are being used.

$ netstat -na | grep 11211
tcp4       0      0  127.0.0.1.11211        127.0.0.1.59321        ESTABLISHED
tcp4       0      0  127.0.0.1.59321        127.0.0.1.11211        ESTABLISHED

There is really no way to explicitly close the TCP connections. However, since each get or set is atomic in itself, it’s fairly straightforward to have an array of connections already set up and reuse them. There is really no harm in opening as many TCP connections as you like, as Memcached is designed to work well with a large number of open connections. Just for predictability, I like to open a fixed number of TCP connections and reuse them. That saves me from having to set up a TCP connection for every operation.
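The idea of a fixed set of reusable connections can be sketched independently of Memcached. The generic class below hands out pooled items in round-robin order, as one possible alternative to picking a slot at random. The name RoundRobinPool and the string items standing in for client connections are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinPool<T> {
    private final List<T> items;
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinPool(List<T> items) {
        this.items = items;
    }

    // Hands out the pooled items one after another, wrapping around at the end.
    public T take() {
        int index = Math.floorMod(next.getAndIncrement(), items.size());
        return items.get(index);
    }

    public static void main(String[] args) {
        RoundRobinPool<String> pool =
                new RoundRobinPool<>(Arrays.asList("conn0", "conn1", "conn2"));
        for (int i = 0; i < 4; i++) {
            System.out.println(pool.take()); // conn0, conn1, conn2, then conn0 again
        }
    }
}
```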

MyCache Singleton:

So with all the changes, here’s what my wrapper around MemcachedClient looks like:

import net.spy.memcached.AddrUtil;
import net.spy.memcached.BinaryConnectionFactory;
import net.spy.memcached.MemcachedClient;

public class MyCache {
	private static final String NAMESPACE= "SACHARYA:5d41402abc4b2a76b9719d91101";
	private static MyCache instance = null;
	private static MemcachedClient[] m = null;

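	private MyCache() {
		try {
			m= new MemcachedClient[21];
			// Fill every slot; the loop bound follows the array length
			for (int i = 0; i < m.length; i ++) {
				MemcachedClient c =  new MemcachedClient(
                                                new BinaryConnectionFactory(),
						AddrUtil.getAddresses("127.0.0.1:11211"));
				m[i] = c;
			}
		} catch (Exception e) {
			// Fail loudly rather than leave a half-built pool behind
			throw new IllegalStateException("Could not connect to memcached", e);
		}
	}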

	public static synchronized MyCache getInstance() {
		System.out.println("Instance: " + instance);
		if(instance == null) {
			System.out.println("Creating a new instance");
			instance = new MyCache();
	     }
	     return instance;
	}

	public void set(String key, int ttl, final Object o) {
		getCache().set(NAMESPACE + key, ttl, o);
	}

	public Object get(String key) {
		Object o = getCache().get(NAMESPACE + key);
        if(o == null) {
        	System.out.println("Cache MISS for KEY: " + key);
        } else {
            System.out.println("Cache HIT for KEY: " + key);
        }
        return o;
	}

	public Object delete(String key) {
		return getCache().delete(NAMESPACE + key);
	}

	public MemcachedClient getCache() {
		// Pick one of the pooled clients at random; scaling by the array
		// length makes every slot reachable, including the last one
		int i = (int) (Math.random() * m.length);
		return m[i];
	}
}

In the above code:
1. I am using the BinaryConnectionFactory (which is a new feature), which implements the new binary wire protocol. It is more efficient than parsing the text protocol.

2. MyCache is a singleton, and it sets up 21 connections when it is instantiated.

3. My keys are of the format SACHARYA:5d41402abc4b2a76b9719d91101 followed by the key, where SACHARYA is my domain. That way I can use the same memcached server to store data for two different applications. The random string 5d41402abc4b2a76b9719d91101 is just for some security through obscurity, which we discussed above. Finally, the key would be something like a userId, a username, a SQL query or any string that uniquely identifies the data to be stored.

Sample Use:

Generally you can use caching wherever there is a bottleneck. I use it at the Data Access Layer to save myself from making a database or web service call. If there is computation-heavy business logic, I cache the output at the business layer. Or you can cache at the presentation layer. Or you can cache at every layer. It all depends on what you are trying to achieve.

public List<Product> getAllProducts() {
        List<Product> products = (List<Product>) MyCache.getInstance().get("AllProducts");
        if(products != null) {
              return products;
        }
        products = getAllProductsFromDB();
        if(products != null) {
              // Cache the freshly loaded list for an hour
              MyCache.getInstance().set("AllProducts", 3600, products);
        }
        return products;
}

public void updateProduct(String id) {
        updateProductIntoDB(id);
        // Drop the cached list so that the next read reloads it
        MyCache.getInstance().delete("AllProducts");
}
public void deleteProduct(String id) {
        deleteProductFromDB(id);
        MyCache.getInstance().delete("AllProducts");
}
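The same read-through and invalidate pattern can be tried out without a Memcached server. In the sketch below a plain HashMap stands in for the cache and a Supplier stands in for the database call. The class name CacheAsideSketch and the backendCalls counter are assumptions added only to make the effect visible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class CacheAsideSketch {
    private final Map<String, Object> cache = new HashMap<>();
    private int backendCalls = 0;

    // Returns the cached value, or loads it from the backend and caches it.
    public Object getOrLoad(String key, Supplier<Object> loader) {
        Object cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        backendCalls++;
        Object fresh = loader.get();
        if (fresh != null) {
            cache.put(key, fresh);
        }
        return fresh;
    }

    // Called after an update or delete so that the next read goes to the backend.
    public void invalidate(String key) {
        cache.remove(key);
    }

    public int backendCalls() {
        return backendCalls;
    }

    public static void main(String[] args) {
        CacheAsideSketch products = new CacheAsideSketch();
        products.getOrLoad("AllProducts", () -> "list from DB");
        products.getOrLoad("AllProducts", () -> "list from DB");
        System.out.println(products.backendCalls()); // prints 1
        products.invalidate("AllProducts");
        products.getOrLoad("AllProducts", () -> "list from DB");
        System.out.println(products.backendCalls()); // prints 2
    }
}
```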

Warming the Cache:

When the application is first started, there is nothing in the cache. So you might want to pre-warm the cache through a job scheduler, just to avoid a large number of backend calls at once. I generally like to keep this piece outside of the application itself. It could be a separate app in itself where you pre-warm the cache based on a hit-list of keys.

Measuring Cache Effectiveness:

The stats command provides important information about how your cache is performing. Among other parameters, it provides the total number of get requests and how many of them were hits and misses.

$ telnet localhost 11211
stats
STAT cmd_get 13219
STAT get_hits 12232
STAT get_misses 512

This means that of a total of 13219 get requests, 12232 came back with results, giving a cache hit rate of 12232/13219 = 92.5%, which isn’t that bad.
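If you collect these numbers regularly, the arithmetic is easy to put into code. The helper below is a small sketch; the class name HitRate is made up, and the numbers are the get_hits and cmd_get values from the stats output above.

```java
public class HitRate {

    // Fraction of get requests that were answered from the cache.
    public static double hitRate(long hits, long gets) {
        if (gets == 0) {
            return 0.0; // no requests yet, so there is nothing to measure
        }
        return (double) hits / gets;
    }

    public static void main(String[] args) {
        // get_hits and cmd_get taken from the stats output above
        double rate = hitRate(12232, 13219);
        System.out.println(rate * 100); // roughly 92.5
    }
}
```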

Now once you have a general idea of your cache hit rate, you can improve it even further by logging which particular requests were missed and optimizing them over time.

You can get the memory stats by using the command “stats slabs”, or you can invalidate all items in the cache using “flush_all”.

Conclusion:

You should never rely on your cache only though. If you somehow lost connectivity to your caching server, the application should perform exactly the same. You should use caching only for scalability and/or speed. Implementing the cache itself is pretty simple. The difficult part is which data to cache, how long to cache, when to invalidate the cache, when to update stale data, and how to prevent the database being flooded once the cache is invalidated. This is something that depends on the nature of your data, how fresh you want it and how you update it. You should keep on measuring the stats and gradually improve the effectiveness over time.

Java on Google App Engine

April 9, 2009

Google launched Java support on Google App Engine, which is the Google Cloud Infrastructure, yesterday. Different companies like to define this ambiguous term ‘Cloud’ to their own benefit, but mostly what they are talking about is a cluster of a few virtual machines that are easier to provision on demand than a traditional dedicated server. But unlike many others, Google really makes it look like a cloud - not just in words.

As much as I like to hate Google, I think this one is going to have a slow but defining impact on how the community is going to embrace Cloud Hosting.

While Amazon seems to be leading the Cloud industry, Google seems to be warming up with a different intention. While the rest of the competitors are ecstatic with the few million dollars they have monetized, Google doesn’t seem to be bothered about money as yet (coz they can afford to?). While many of the so-called cloud-service providers are busy convincing the enterprises for a trial of the Cloud, Google for the time being plans to sell the Evil Opium mostly for free and obviously, Google’s target is the fearless newer generation, rather than some stubborn corporate CTO and CEO.

Java hosting has traditionally been very difficult and expensive, and you can imagine why there are so many applications written and deployed in a language like PHP. I am pretty sure that if Google keeps it this easy and welcoming for developers over the next two years, this accidental and unforeseen collaboration of dynamic JVM languages with Google’s cloud might pave the way for Java (JVM) to be the new PHP of the Internet, and Google the GOD in the Clouds.

It really is exciting to any developer, and it couldn’t have come from anyone other than Google, and it was all more or less expected as part of the Google’s master plan of the Internet.

Personally, I think the whole “Cloud” thing is illusive and evil, and it will just give better control of the Internet to the corporate giants, making them even richer and more powerful.

Invoking Private Methods

March 3, 2009

A private modifier in Java means that the member (variable or method) can only be accessed in its own class.

By rule, you should always make a class member private unless you have a reason not to. If you want a method to be visible outside of the class, you should make it public or protected. But let’s say you encounter a case when you need to invoke the private method of another class (You might need it while writing JUnit tests, or while writing debugger tools where you need to access all public and private members.). Can you access a private method of Class B from Class A? Is it possible?

Well, yeah. Use the Reflection API in Java. This will allow you to suppress the default Java language access control checks when using reflected members.

The AccessibleObject class within the java.lang.reflect package contains a method setAccessible(boolean flag). A false flag will enforce the Java language access checks, whereas a true flag will suppress them. So by setting the flag to true, you will be able to invoke a private method of another class.

Let’s say we have a Calculator class which has a private method called add.

package access;

public class Calculator {
	private int add(Integer a, Integer b) {
		return a + b;
	}
}

Now, by using Reflection, you can get a java.lang.reflect.Method object that represents the specified method. The Method class inherits from java.lang.reflect.AccessibleObject, which provides the setAccessible(boolean flag) method that you can use to suppress the access checks.

package access;

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class MainApp {

	public static void main(String[] args) {

		Calculator ac = new Calculator();

		try {

			Class<?> c = ac.getClass();
			Class<?>[] params = new Class<?>[] { Integer.class, Integer.class };
			Method m = c.getDeclaredMethod("add", params);

			m.setAccessible(true);
			Object o = m.invoke(ac, 1, 2);

			System.out.println("The sum of the numbers is: "
					+ ((Integer) o).intValue());

		} catch (NoSuchMethodException x) {
			x.printStackTrace();
		} catch (InvocationTargetException x) {
			x.printStackTrace();
		} catch (IllegalAccessException x) {
			x.printStackTrace();
		}

	}

}

Once you set the Accessible flag to true, you can then invoke the method by passing any arguments that it requires. Running the class will print a sum of 3, which is calculated and returned by the private method ‘add’.

If you don’t set the flag to true, you will get an IllegalAccessException saying:

Class access.MainApp can not access a member of class access.Calculator with modifiers “private”.

Note: If there is a Security Manager, the context in which the code is run must have the suppressAccessChecks permission.
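To see both outcomes side by side, here is a compact sketch that invokes a private method once with the access checks suppressed and once without. The classes Vault and AccessCheckDemo are made up for this illustration and are not part of the Calculator example above.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical stand-in for a class with a private method.
class Vault {
    private String secret() {
        return "opened";
    }
}

public class AccessCheckDemo {

    // Invokes Vault.secret() reflectively, with or without suppressing
    // the access checks, and reports what happened.
    public static String call(boolean suppressChecks) {
        try {
            Method m = Vault.class.getDeclaredMethod("secret");
            if (suppressChecks) {
                m.setAccessible(true);
            }
            return (String) m.invoke(new Vault());
        } catch (IllegalAccessException e) {
            return "IllegalAccessException";
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        } catch (InvocationTargetException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(call(true));   // prints opened
        System.out.println(call(false));  // prints IllegalAccessException
    }
}
```

Each call to getDeclaredMethod returns a fresh Method object, so setting the flag for the first call does not leak into the second one.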