
File Upload (Using Java and Commons Upload)

January 31, 2008

Form-based file upload in HTML is defined in RFC 1867. The enctype attribute of the form element specifies the content type used to encode the data submitted to the server. The default encoding type is application/x-www-form-urlencoded. This content type is inefficient for forms containing non-ASCII data, binary data or files. In such cases, the encoding type multipart/form-data is used. A typical HTML form for file upload looks like:

<form action="fileUploadServlet" enctype="multipart/form-data" method="post">
<input name="file" type="file" />
<input name="submit" type="submit" value="submit" />
</form>

Once the form is submitted, the multipart form data is available in the HttpServletRequest as an InputStream. The regular form fields are not exposed as request parameters, so request.getParameter("submit") returns null (unless the container itself has been set up to parse multipart requests, as Servlet 3.0 and later allow). So for any data submitted, you have to know how the content is laid out and retrieve it by parsing the stream:

InputStreamReader input = new InputStreamReader(request.getInputStream());
BufferedReader buffer = new BufferedReader(input);
String line="";
while ((line = buffer.readLine()) != null) {
     System.out.println("Multipart data " + line );
}
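
To make the layout of that stream concrete, here is a minimal sketch of what a hand-written parser has to deal with. It is only an illustration under several assumptions: the whole body fits in a String, the parts contain text only, and the boundary is already known (in a real request it comes from the Content-Type header). The class and method names (MultipartSketch, parse) are made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a toy parser for a text-only multipart/form-data body.
class MultipartSketch {

    // Matches the name="..." attribute, but not filename="...".
    private static final Pattern NAME = Pattern.compile("\\bname=\"([^\"]*)\"");

    // Returns field name -> part content for each part found in the body.
    static Map<String, String> parse(String body, String boundary) {
        Map<String, String> parts = new LinkedHashMap<>();
        String delimiter = "--" + boundary;
        for (String chunk : body.split(Pattern.quote(delimiter))) {
            // Skip the empty preamble and the closing "--" marker.
            if (chunk.isEmpty() || chunk.startsWith("--")) {
                continue;
            }
            // Part headers and part content are separated by an empty line.
            int split = chunk.indexOf("\r\n\r\n");
            if (split < 0) {
                continue;
            }
            String headers = chunk.substring(0, split);
            String content = chunk.substring(split + 4);
            // Drop the CRLF that precedes the next delimiter.
            if (content.endsWith("\r\n")) {
                content = content.substring(0, content.length() - 2);
            }
            Matcher m = NAME.matcher(headers);
            if (m.find()) {
                parts.put(m.group(1), content);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        String body =
              "--XyZ\r\n"
            + "Content-Disposition: form-data; name=\"submit\"\r\n"
            + "\r\n"
            + "submit\r\n"
            + "--XyZ\r\n"
            + "Content-Disposition: form-data; name=\"file\"; filename=\"file.txt\"\r\n"
            + "Content-Type: text/plain\r\n"
            + "\r\n"
            + "hello\r\n"
            + "--XyZ--\r\n";
        System.out.println(parse(body, "XyZ")); // prints {submit=submit, file=hello}
    }
}
```

A real parser also has to cope with binary content, streaming input, character encodings and malformed requests, which is a good reason to reach for an existing library.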

You can write a parser yourself to analyze each part of the multipart data in the stream and pick out the pieces that apply to your application. There are some good ones already written, though. Apache has the Commons FileUpload library, so let's see how we can use it to upload files.

Using the Commons FileUpload API:

First of all, download commons-fileupload.jar and put it in your classpath. It has a dependency on commons-io.jar, so download that and put it in the classpath too. Given a request object, you can use the API to check whether it is a multipart/form-data request.

boolean isMultipart = ServletFileUpload.isMultipartContent(request);

Now you know whether the request is a multipart request or not. You can choose whether an uploaded item is retained in memory or written temporarily to a file on disk.

FileItemFactory factory = new DiskFileItemFactory();
ServletFileUpload upload = new ServletFileUpload(factory);
List items = upload.parseRequest(request);

(For portlets, there is a PortletFileUpload class which works just like ServletFileUpload.) A factory created with the default constructor DiskFileItemFactory() keeps items of up to 10 KB (10240 bytes) in memory and writes larger ones to the system's default temporary directory. To change this, you can pass the constructor an integer size threshold in bytes (items above it are stored on disk in the given repository directory, the rest are retained in memory) and the repository directory as a File.

DiskFileItemFactory factory = new DiskFileItemFactory(0, new File("/test"));

You can also set different configuration options:

DiskFileItemFactory factory = new DiskFileItemFactory();
factory.setSizeThreshold(1000); // in bytes
factory.setRepository(new File("/test")); // expects a File, not a String
ServletFileUpload fileUpload = new ServletFileUpload(factory);
fileUpload.setSizeMax(10000); // in bytes, -1 indicates no limit
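
To illustrate what the size threshold means in practice, here is a rough sketch of the idea in plain Java: data is buffered in memory until the threshold would be exceeded, and from then on it goes to a temporary file in the repository directory. This is not the library's implementation, only a model of the behaviour described above; ThresholdBuffer and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: buffers data in memory and spills to a temp file
// in the repository directory once the size threshold would be exceeded.
class ThresholdBuffer {
    private final int threshold;
    private final Path repository;
    private final ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private Path file;          // null while the data is still in memory
    private OutputStream disk;  // open only after spilling to disk

    ThresholdBuffer(int threshold, Path repository) {
        this.threshold = threshold;
        this.repository = repository;
    }

    void write(byte[] data) throws IOException {
        if (file == null && memory.size() + data.length > threshold) {
            // Move what has been buffered so far into a temporary file.
            file = Files.createTempFile(repository, "upload_", ".tmp");
            disk = Files.newOutputStream(file);
            memory.writeTo(disk);
            memory.reset();
        }
        if (file == null) {
            memory.write(data, 0, data.length);
        } else {
            disk.write(data);
        }
    }

    void close() throws IOException {
        if (disk != null) {
            disk.close();
        }
    }

    boolean isInMemory() {
        return file == null;
    }

    Path getFile() {
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path repository = Files.createTempDirectory("repository");
        ThresholdBuffer buffer = new ThresholdBuffer(1000, repository);
        buffer.write(new byte[600]);
        System.out.println(buffer.isInMemory()); // true: 600 bytes is under the threshold
        buffer.write(new byte[600]);
        System.out.println(buffer.isInMemory()); // false: 1200 bytes exceeds it
        buffer.close();
        Files.delete(buffer.getFile());
        Files.delete(repository);
    }
}
```

A low threshold keeps memory use down at the cost of more disk I/O; a threshold of 0, as in the constructor example above, sends every non-empty item to disk.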

Once you have the FileItems, you want to know which of the FileItems are regular form fields, and which of them are file uploads, and treat them accordingly.

Iterator iter = items.iterator();
while (iter.hasNext()) {
     FileItem item = (FileItem) iter.next();
     if (item.isFormField()) {
        //Now, process the regular form field
        String name = item.getFieldName();
        String value = item.getString();
     } else {
        //Process the fileItem
        String fileName = item.getName();
        String contentType = item.getContentType();
        boolean isInMemory = item.isInMemory();
     }
}

If the item is in memory, you can get the file as an InputStream, get it as an array of bytes, or write the item to a file. The same calls also work for items that have been stored on disk.

// Read the data as a stream
InputStream inputStream = item.getInputStream();
// Or get the file as an array of bytes
byte[] fileInBytes = item.get();
// Or write the item to a file (write() is declared to throw Exception)
item.write(new File("file.txt"));

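If it is a DiskFileItem (the FileItem implementation created by DiskFileItemFactory) whose contents are temporarily stored on disk, you can get the File object for the temporary storage location. Once you have the handle to the file, you can do whatever you like with it.

// getStoreLocation() is defined on DiskFileItem and returns null if the item is held in memory
File file = ((DiskFileItem) item).getStoreLocation();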

These temporary files on disk are deleted automatically once the corresponding File object is no longer used and has been garbage collected. Even so, if an exception occurs, or the file is too big, or you are done processing the file, you should delete the FileItem as soon as you no longer require it.

item.delete();

When you call item.getName(), the file name returned depends on the browser. Some browsers (such as IE and Opera) return the original file name along with the whole path, while others (Firefox) return only the file name. So you will have to strip off the unwanted part of the file name.

// Backslashes must be escaped inside Java string literals
String fileName = "C:\\upload\\file.txt";
fileName = fileName.substring(fileName.lastIndexOf("\\") + 1);
String fileName1 = "/upload/file1.txt";
fileName1 = fileName1.substring(fileName1.lastIndexOf("/") + 1);
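
Since the server does not know which kind of path the browser sent, it can be convenient to handle both separators in one place. The following is a small sketch of such a helper; UploadFileNames and baseName are hypothetical names, not part of Commons FileUpload.

```java
// Illustrative helper: keeps only the last path segment of a client-supplied file name.
class UploadFileNames {

    static String baseName(String clientName) {
        if (clientName == null) {
            return null;
        }
        // Check both separators, since the client's operating system is unknown to the server.
        int cut = Math.max(clientName.lastIndexOf('/'), clientName.lastIndexOf('\\'));
        // lastIndexOf returns -1 when no separator is present, so cut + 1 is 0.
        return clientName.substring(cut + 1);
    }

    public static void main(String[] args) {
        System.out.println(baseName("C:\\upload\\file.txt")); // file.txt
        System.out.println(baseName("/upload/file1.txt"));    // file1.txt
        System.out.println(baseName("file2.txt"));            // file2.txt
    }
}
```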

This is just the basics of uploading a file from a client machine to the server using a web browser. If the file is big, it may take some time to upload, so you may want to keep the user informed about the progress by implementing the ProgressListener interface. Users can upload files of all sorts, some of which may contain viruses, so the first thing you need to do on the server side is to scan each file item for viruses. Files which are held in memory are not seen by virus scanners, so the files have to be written to disk temporarily and scanned before they are moved to permanent storage.
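
For the progress reporting mentioned above, one practical concern is that a listener can be called very often during an upload, so it helps to report only when the percentage actually changes. The sketch below shows that throttling logic in plain Java. UploadProgress is a hypothetical class; its update method deliberately takes the same three arguments as ProgressListener.update (bytes read, content length, item number) so that a real listener could simply delegate to it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Illustrative only: forwards a progress percentage to a sink when it changes.
class UploadProgress {
    private final IntConsumer sink;
    private int lastPercent = -1;

    UploadProgress(IntConsumer sink) {
        this.sink = sink;
    }

    // items is unused here; it is kept to mirror the listener callback's arguments.
    void update(long bytesRead, long contentLength, int items) {
        if (contentLength <= 0) {
            return; // total size unknown, so no percentage can be computed
        }
        int percent = (int) (bytesRead * 100 / contentLength);
        if (percent != lastPercent) {
            lastPercent = percent;
            sink.accept(percent);
        }
    }

    public static void main(String[] args) {
        List<Integer> reported = new ArrayList<>();
        UploadProgress progress = new UploadProgress(reported::add);
        long total = 200;
        for (long read = 0; read <= total; read += 50) {
            progress.update(read, total, 1);
            progress.update(read, total, 1); // repeated call, not reported again
        }
        System.out.println(reported); // prints [0, 25, 50, 75, 100]
    }
}
```

With Commons FileUpload on the classpath, a class implementing ProgressListener would be registered on the ServletFileUpload instance with setProgressListener, and the sink would typically store the value somewhere the page can poll it, such as the session.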