Sunday, August 29, 2010

Facebook Chat with Pidgin

In a previous post, I wrote about how you can use Pidgin to replace Office Communicator (OCS). Now, I will show you how you can use Pidgin for Facebook chat, thereby making Pidgin a truly universal chat client!
  • Start the Pidgin client.
  • Add a new Account with the following details:
    • Protocol: XMPP
    • Username: yourfacebookusername
    • Domain: chat.facebook.com
    • Password: facebookpassword
    • On the Advanced tab, untick "Require SSL/TLS".

Friday, August 27, 2010

Leaving DB

Today is my last day at Deutsche Bank.

I've been at DB for six years. I started as a graduate and, over the years, have learnt so much, not just about technology, business and processes, but also about myself. I've had the opportunity to work with some of the best minds in the bank and have made some great friends. I would like to thank all my colleagues for their guidance and support and for making my time at DB so enjoyable. I will miss you all!

Leaving DB is by far the hardest decision I've ever had to make, but I feel that the time has come for me to move on to new challenges.

I'm certainly planning to keep blogging and tweeting, so stay tuned to find out what I am getting up to.

Journey onward!

Wednesday, August 25, 2010

Software Inventory of my Windows Machine

As I will be leaving my computer soon, I have decided to make a quick list of useful applications currently installed:
Category      Name                   Description
Database      Aqua Data Studio       Database Query Tool
Database      JaySQL                 JDBC Database Tool
Database      Toad                   Oracle Database Tool
Dev           Ant                    Build Tool
Dev           Axis 1.4               Web Services Engine
Dev           Axis 2                 Web Services Engine
Dev           Chainsaw               Log Viewer
Dev           Cobertura              Code Coverage Tool
Dev           Eclipse                IDE
Dev           Groovy                 Programming Language
Dev           GWT                    Google Web Toolkit
Dev           HermesJMS              JMS Browser
Dev           HSQLDB                 Java Database Engine
Dev           JAD                    Java Decompiler
Dev           Java                   Programming Language
Dev           Maven                  Project Build Tool
Dev           Perl                   Programming Language
Dev           PuTTY                  SSH Client
Dev           Python 2.6             Programming Language
Dev           SciTE                  Source Code Editor
Dev           soapUI                 Web Service Caller
Dev           SQLite                 SQL database engine
Dev           Tomcat                 Servlet Engine
Dev           TortoiseCVS            CVS Client
Dev           WinMerge               Differencing and Merging Tool
Dev           xmlbeans-2.4.0         XML to Java type binding
Editor        Altova XML Spy         XML Editor
Editor        Emacs 22.3             Text Editor
Editor        TextPad 4              Text Editor
Performance   CachemanXP             Windows Tuneup Utility
Performance   CCleaner               PC Cleaner
Performance   Defraggler             Disk Defragmenter
Performance   HijackThis             System Scanner
Productivity  Alt-Tab Task Switcher  MS Powertoys
Productivity  Calculator Powertoy    MS Powertoys
Productivity  ClipX                  Clipboard History Manager
Productivity  CmdHere Powertoy       MS Powertoys
Productivity  LClock                 Clock
Productivity  Password Safe          Password Manager
Productivity  ProcessExplorer        Task Manager
Productivity  PsTools                Windows Tools
Productivity  SlickRun               Quick Application Launcher
Productivity  Taskix                 Reorder taskbar items
Productivity  Tweak UI               MS Powertoys
Productivity  UnxUtils               GNU Utilities for Windows
Productivity  xplorer² lite          Windows File Manager
Utilities     Convert Image To PDF   PDF Converter
Utilities     doPDF 7                PDF Converter
Utilities     Foxit                  PDF Reader
Utilities     WinRAR                 Archive Manager
Utilities     xmltidy                TextPad addon
Web           Mozilla Firefox        Web Browser
Web           Pidgin                 Instant Messenger
And, of course, browsing wouldn't be the same without my Firefox addons:
Firefox Addons
Adblock Plus
All-in-One Sidebar
British English Dictionary
Cache Status
Colorful Tabs
Delicious Bookmarks
Download Statusbar
DownThemAll!
dragdropupload
FaviconizeTab
Firebug
Fission
Flashblock
Forecastfox
FoxClocks
Greasemonkey
Mouse Gestures Redox
Tab Mix Plus
Tab Preview
Ubiquity
URL Fixer
If you think you have better alternatives for any of the applications above, please write a comment!

Saturday, August 21, 2010

Faster XPaths with VTD-XML

I've recently started using VTD-XML to evaluate XPath expressions on large XML documents. DOM supports XPath and random access to nodes, but it is a memory hog and too slow for large documents. A SAX parser is lightweight, but you can't apply XPaths with it, nor can you access nodes randomly or traverse the document easily. VTD-XML lets you run XPaths and provides random access to nodes, similar to DOM, but much more efficiently.

VTD-XML was 60 times faster than DOM when processing my 20MB XML document.

This post shows you how to use VTD-XML for fast XPath evaluation.

Sample XML:
I will use the following XML document in the examples below.

<?xml version="1.0"?>
<catalog>
 <book id="bk101">
  <author>Gambardella, Matthew</author>
  <author>Doe, John</author>
  <title>XML Developer's Guide</title>
  <genre>Computer</genre>
  <price>44.95</price>
  <publish_date>2000-10-01</publish_date>
 </book>
 <book id="bk102">
  <author>Ralls, Kim</author>
  <title>Midnight Rain</title>
  <genre>Fantasy</genre>
  <price>5.95</price>
  <publish_date>2000-12-16</publish_date>
 </book>
 <book id="bk103">
  <author>Corets, Eva</author>
  <title>Maeve Ascendant</title>
  <genre>Fantasy</genre>
  <price>5.95</price>
  <publish_date>2000-11-17</publish_date>
 </book>
</catalog>
Loading the XML document:
The following code parses the XML file and creates the navigator and autopilot objects.
final VTDGen vg = new VTDGen();
vg.parseFile("books.xml", false);
final VTDNav vn = vg.getNav();
final AutoPilot ap = new AutoPilot(vn);
Selecting all titles:
Print out all the title nodes using an XPath expression of /catalog/book/title. First call selectXPath to compile the expression and then use evalXPath to move the cursor to the selected nodes in the result.
ap.selectXPath("/catalog/book/title");
while (ap.evalXPath() != -1) {
  int val = vn.getText();
  if (val != -1) {
    String title = vn.toNormalizedString(val);
    System.out.println(title);
  }
}
Selecting all book ids and authors:
This one is a bit more involved as a book can have many authors. In the code below, I first run an XPath to select the books and then iterate over the children, selecting the author nodes.
ap.selectXPath("/catalog/book");
while (ap.evalXPath() != -1) {
  int val = vn.getAttrVal("id");
  if(val != -1){
    String id = vn.toNormalizedString(val);
    System.out.println("Book id: " + id);
  }

  if(vn.toElement(VTDNav.FIRST_CHILD,"author")){
    do{
      val = vn.getText();
      if(val != -1){
        String author = vn.toNormalizedString(val);
        System.out.println("\tAuthor:" + author);
      }
    }while(vn.toElement(VTDNav.NEXT_SIBLING,"author"));
    //move back up to the book, but only if we moved down to an author
    vn.toElement(VTDNav.PARENT);
  }
}
The output is:
Book id: bk101
 Author:Gambardella, Matthew
 Author:Doe, John
Book id: bk102
 Author:Ralls, Kim
Book id: bk103
 Author:Corets, Eva
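For comparison, here is a rough sketch of the same book id and author listing using only the JDK's built-in DOM parser and javax.xml.xpath, rather than VTD-XML. The class name DomXPathBaseline and the trimmed inline copy of the catalog are my own choices for illustration; this is the slower DOM route that VTD-XML is being measured against, not a VTD-XML example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomXPathBaseline {

    // Trimmed copy of the sample catalog (ids and authors only); illustrative.
    static final String XML =
          "<catalog>"
        + "<book id=\"bk101\"><author>Gambardella, Matthew</author>"
        + "<author>Doe, John</author></book>"
        + "<book id=\"bk102\"><author>Ralls, Kim</author></book>"
        + "<book id=\"bk103\"><author>Corets, Eva</author></book>"
        + "</catalog>";

    // Returns the lines that the VTD-XML example prints, built via DOM instead.
    static List<String> listBooks(String xml) {
        List<String> lines = new ArrayList<String>();
        try {
            // DOM loads the whole document into a tree of objects up front
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            NodeList books = (NodeList) xpath.evaluate(
                    "/catalog/book", doc, XPathConstants.NODESET);
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                lines.add("Book id: " + book.getAttribute("id"));

                // relative XPath, evaluated with the book as context node
                NodeList authors = (NodeList) xpath.evaluate(
                        "author", book, XPathConstants.NODESET);
                for (int j = 0; j < authors.getLength(); j++) {
                    lines.add("\tAuthor:" + authors.item(j).getTextContent());
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : listBooks(XML)) {
            System.out.println(line);
        }
    }
}
```

Running both versions against the same large file is a simple way to check the speed difference on your own data.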
Related Posts:
Using XPath with DOM

Friday, August 20, 2010

Fixing a ConcurrentModificationException

Question:
The following code throws a ConcurrentModificationException. What additional code can you add between the <FIXME>...</FIXME> tags in order to prevent this exception from being thrown?
final List<String> list = new ArrayList<String>();
list.add("HELLO");
final Iterator<String> iter = list.iterator();
System.out.println(iter.next());
list.add("WORLD");
//<FIXME>

//</FIXME>
System.out.println(iter.next());
Solution:
In this example, a ConcurrentModificationException is thrown because the Iterator detects that the list over which it is iterating has been changed. If you look into the source code for these classes you will find that when an Iterator is created, it contains an int variable called expectedModCount which is initialised to the modCount of the backing list. Whenever the backing list is structurally modified (with an add or remove operation, for example) then the modCount is incremented. As a result, the iterator's expectedModCount no longer matches the list's modCount and the iterator throws a ConcurrentModificationException.
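The mechanism can be sketched with a toy model. This is not the JDK source: TinyList and TinyIterator are made-up names, and the real ArrayList does considerably more. The sketch only mirrors the modCount and expectedModCount check, and includes the real ArrayList case alongside it for comparison.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class FailFastModel {

    /** Toy append-only list (hypothetical) that counts structural changes. */
    static class TinyList {
        private final String[] items = new String[16];
        private int size = 0;
        int modCount = 0;

        void add(String s) {
            items[size++] = s;
            modCount++; // every structural change bumps the counter
        }

        TinyIterator iterator() {
            return new TinyIterator(this);
        }
    }

    /** Toy iterator (hypothetical) that remembers the counter it started with. */
    static class TinyIterator {
        private final TinyList list;
        private final int expectedModCount;
        private int cursor = 0;

        TinyIterator(TinyList list) {
            this.list = list;
            this.expectedModCount = list.modCount;
        }

        String next() {
            // the fail-fast check: has the list changed behind our back?
            if (list.modCount != expectedModCount) {
                throw new ConcurrentModificationException();
            }
            return list.items[cursor++];
        }
    }

    // Replays the question's sequence on the toy model.
    static boolean toyIteratorFails() {
        TinyList list = new TinyList();
        list.add("HELLO");
        TinyIterator iter = list.iterator();
        iter.next();
        list.add("WORLD");
        try {
            iter.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Replays the same sequence on a real ArrayList.
    static boolean realIteratorFails() {
        List<String> list = new ArrayList<String>();
        list.add("HELLO");
        Iterator<String> iter = list.iterator();
        iter.next();
        list.add("WORLD");
        try {
            iter.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("toy model fails:  " + toyIteratorFails());
        System.out.println("ArrayList fails:  " + realIteratorFails());
    }
}
```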

In order to prevent this exception from being thrown, we need to bring the expectedModCount of the iterator and the modCount of the list back in line with each other. Here are a couple of ways this can be done:

1. Reflection:
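Reflection is the most direct way to change the internal counters of the iterator and list, although it depends on private implementation details of the JDK collections (on Java 9 and later, the module system may block this reflective access unless the java.util package is opened with --add-opens). In the fix below, I have set the expectedModCount of the iterator to the same value as the modCount of the list. The code no longer throws the ConcurrentModificationException.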

final List<String> list = new ArrayList<String>();
list.add("HELLO");
final Iterator<String> iter = list.iterator();
System.out.println(iter.next());
list.add("WORLD");
//<FIXME>
/* Using Reflection */
try{
  //get the modCount of the List
  Class<?> cls = Class.forName("java.util.AbstractList");
  Field f = cls.getDeclaredField("modCount");
  f.setAccessible(true);
  int modCount = f.getInt(list);

  //change the expectedModCount of the iterator
  //to match the modCount of the list
  cls = iter.getClass();
  f = cls.getDeclaredField("expectedModCount");
  f.setAccessible(true);
  f.setInt(iter, modCount);
}
catch(ClassNotFoundException e){
  e.printStackTrace();
}
catch(NoSuchFieldException e){
  e.printStackTrace();
}
catch(IllegalAccessException e){
  e.printStackTrace();
}
//</FIXME>
System.out.println(iter.next());
2. Integer Overflow:
Another approach is to keep modifying the list until the integer modCount overflows and reaches the same value as expectedModCount. At the moment, modCount=2 and expectedModCount=1. In the fix below, I repeatedly change the list (by calling trimToSize), forcing modCount to overflow and reach expectedModCount. This code took 38s to run on my machine.
final List<String> list = new ArrayList<String>();
list.add("HELLO");
final Iterator<String> iter = list.iterator();
System.out.println(iter.next());
list.add("WORLD");
//<FIXME>
for(int i = Integer.MIN_VALUE ; i < Integer.MAX_VALUE ; i++){
  ((ArrayList<String>) list).trimToSize();
}
//</FIXME>
System.out.println(iter.next());
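The arithmetic behind the overflow approach can be checked without waiting for the full loop. The following is a small sketch (the class and method names are my own) that computes how many times the loop body runs and where a 32-bit counter starting at 2 ends up after that many increments.

```java
public class ModCountOverflow {

    // Simulates incrementing a 32-bit int counter 'increments' times:
    // the narrowing cast keeps only the low 32 bits, just as repeated ++ would.
    static int wrappedCounter(int start, long increments) {
        return (int) (start + increments);
    }

    // Number of iterations of: for (i = MIN_VALUE; i < MAX_VALUE; i++)
    static long loopIterations() {
        return (long) Integer.MAX_VALUE - (long) Integer.MIN_VALUE;
    }

    public static void main(String[] args) {
        long iterations = loopIterations();
        System.out.println(iterations);                    // prints 4294967295
        System.out.println(wrappedCounter(2, iterations)); // prints 1
    }
}
```

The loop runs 2^32 - 1 times, so a counter that starts at 2 wraps all the way round and lands on 1, which is the iterator's expectedModCount.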

Sunday, August 15, 2010

DateFormat with Multiple Threads

The DateFormat class is not thread-safe. The javadocs state that "Date formats are not synchronized. It is recommended to create separate format instances for each thread. If multiple threads access a format concurrently, it must be synchronized externally."

The following code shows how you would typically use DateFormat to convert a String to a Date in a single-threaded environment. It is more efficient to keep the format in an instance variable and reuse it, so that the system doesn't have to fetch the information about the local language and country conventions every time.

public class DateFormatTest {

  private final DateFormat format =
            new SimpleDateFormat("yyyyMMdd");

  public Date convert(String source)
                      throws ParseException{
    Date d = format.parse(source);
    return d;
  }
}
This code is not thread-safe. We can test it out by invoking the method using multiple threads. In the calling code below, I create a thread pool with 2 threads and submit 5 date conversion tasks to it. I then examine the results.
final DateFormatTest t = new DateFormatTest();
Callable<Date> task = new Callable<Date>(){
    public Date call() throws Exception {
        return t.convert("20100811");
    }
};

//let's try 2 threads only
ExecutorService exec = Executors.newFixedThreadPool(2);
List<Future<Date>> results =
             new ArrayList<Future<Date>>();

//perform 5 date conversions
for(int i = 0 ; i < 5 ; i++){
    results.add(exec.submit(task));
}
exec.shutdown();

//look at the results
for(Future<Date> result : results){
    System.out.println(result.get());
}
When the code is run the output is unpredictable - sometimes it prints out the correct dates, sometimes the WRONG ones (e.g. Sat Jul 31 00:00:00 BST 2012!) and at other times it throws a NumberFormatException!

How can you use DateFormat concurrently?
There are different approaches you can take to use DateFormat in a thread-safe manner:

1. Synchronization
The easiest way of making this code thread-safe is to obtain a lock on the DateFormat object before parsing the date string. This way only one thread can access the object at a time, and the other threads must wait.

public Date convert(String source)
                    throws ParseException{
  synchronized (format) {
    Date d = format.parse(source);
    return d;
  }
}

2. ThreadLocals
Another approach is to use a ThreadLocal variable to hold the DateFormat object, which means that each thread will have its own copy and doesn't need to wait for other threads to release it. This is generally more efficient than synchronising as in the previous approach.

public class DateFormatTest {

  private static final ThreadLocal<DateFormat> df
                 = new ThreadLocal<DateFormat>(){
    @Override
    protected DateFormat initialValue() {
        return new SimpleDateFormat("yyyyMMdd");
    }
  };

  public Date convert(String source)
                     throws ParseException{
    Date d = df.get().parse(source);
    return d;
  }
}
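To convince yourself that the ThreadLocal version behaves, you can run it through the same kind of thread pool test as before. The harness below is a sketch under my own assumptions (the class name, 4 threads and 1000 tasks are arbitrary choices): it parses the same string many times concurrently and collects the distinct results, of which there should be exactly one.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadLocalDateFormatCheck {

    // one SimpleDateFormat per thread, created lazily on first use
    private static final ThreadLocal<DateFormat> DF =
            new ThreadLocal<DateFormat>() {
        @Override
        protected DateFormat initialValue() {
            return new SimpleDateFormat("yyyyMMdd");
        }
    };

    static Date convert(String source) throws ParseException {
        return DF.get().parse(source);
    }

    // Parses the same string 'tasks' times on 'threads' threads and
    // returns the set of distinct dates that came back.
    static Set<Date> distinctResults(int threads, int tasks) {
        ExecutorService exec = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Date>> futures = new ArrayList<Future<Date>>();
            for (int i = 0; i < tasks; i++) {
                futures.add(exec.submit(new Callable<Date>() {
                    public Date call() throws Exception {
                        return convert("20100811");
                    }
                }));
            }
            Set<Date> distinct = new HashSet<Date>();
            for (Future<Date> f : futures) {
                distinct.add(f.get());
            }
            return distinct;
        } catch (Exception e) {
            throw new IllegalStateException("Conversion failed", e);
        } finally {
            exec.shutdown();
        }
    }

    public static void main(String[] args) {
        // a single entry means every thread parsed the same date
        System.out.println(distinctResults(4, 1000));
    }
}
```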

3. Joda-Time
Joda-Time is a great, open-source alternative to the JDK's Date and Calendar API. Its DateTimeFormat is "thread-safe and immutable".

import org.joda.time.DateTime;
import org.joda.time.format.DateTimeFormat;
import org.joda.time.format.DateTimeFormatter;
import java.util.Date;

public class DateFormatTest {

  private final DateTimeFormatter fmt =
       DateTimeFormat.forPattern("yyyyMMdd");

  public Date convert(String source){
    DateTime d = fmt.parseDateTime(source);
    return d.toDate();
  }
}

Saturday, August 14, 2010

Using Compressed JMS Messages

If you are publishing large XML messages onto a JMS topic or queue, compression can give you much better performance because less data is sent over the network. Also, your JMS server can hold more messages and there is less risk of running out of memory. XML messages are great candidates for compression, due to the repetitive nature of XML.

Compressing JMS messages
The following code shows how you can create a compressed BytesMessage and publish it onto a topic:

InputStream in = null;
GZIPOutputStream out = null;
try {
  ByteArrayOutputStream bos = new
                            ByteArrayOutputStream(1024 * 64);
  out = new GZIPOutputStream(bos);

  String filename = "input.xml";
  in = new BufferedInputStream(new FileInputStream(filename));

  byte[] buf = new byte[1024 * 4];
  int len;
  while ((len = in.read(buf)) > 0) {
      out.write(buf, 0, len);
  }
  out.finish();

  //publish it
  BytesMessage msg = session.createBytesMessage();
  msg.writeBytes(bos.toByteArray());
  publisher.publish(msg);
}
catch (IOException e) {
  e.printStackTrace();
}
finally {
  if (in != null) {
    try {
        in.close();
    }
    catch (IOException ignore) {
    }
  }
  if (out != null) {
    try {
        out.close();
    }
    catch (IOException ignore) {
    }
  }
}
Decompressing JMS messages
The following code shows how you can decompress a JMS BytesMessage when your subscriber receives it and write it to file:
if (mesg instanceof BytesMessage) {
 final BytesMessage bMesg = (BytesMessage) mesg;

 byte[] sourceBytes;
 try {
    sourceBytes = new byte[(int) bMesg.getBodyLength()];
    bMesg.readBytes(sourceBytes);
    System.out.println("Read " + sourceBytes.length + " bytes");
 }
 catch (JMSException e1) {
    throw new RuntimeException(e1);
 }
 GZIPInputStream in = null;
 OutputStream out = null;
 try {
    in = new GZIPInputStream(
         new ByteArrayInputStream(sourceBytes));
    String filename = "message.xml";
    out = new FileOutputStream(filename);
    byte[] buf = new byte[1024 * 4];
    int len;
    while ((len = in.read(buf)) > 0) {
        out.write(buf, 0, len);
    }
    System.out.println("Wrote to " + filename);
 }
 catch (IOException e) {
    e.printStackTrace();
 }
 finally {
    if (in != null)
        try {
            in.close();
        }
        catch (IOException ignore) {
        }
    if (out != null)
        try {
            out.close();
        }
        catch (IOException ignore) {
        }
 }
}
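If you want to try the compression step without a JMS broker, the GZIP part can be exercised on its own. The sketch below is an illustration only: the class name, the sampleXml helper and its made-up catalog content are my own, and it works on in-memory byte arrays, which is what you would pass to writeBytes and get back from readBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {

    static final Charset UTF8 = Charset.forName("UTF-8");

    // Compresses a payload, as the publisher would before writeBytes().
    static byte[] compress(byte[] raw) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            GZIPOutputStream gz = new GZIPOutputStream(bos);
            gz.write(raw);
            gz.close(); // finishes the GZIP stream and flushes the trailer
            return bos.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("Compression failed", e);
        }
    }

    // Decompresses a payload, as the subscriber would after readBytes().
    static byte[] decompress(byte[] compressed) {
        try {
            GZIPInputStream gz = new GZIPInputStream(
                    new ByteArrayInputStream(compressed));
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            byte[] buf = new byte[1024 * 4];
            int len;
            while ((len = gz.read(buf)) != -1) {
                bos.write(buf, 0, len);
            }
            gz.close();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("Decompression failed", e);
        }
    }

    // Builds a repetitive XML document (hypothetical content) for the demo.
    static String sampleXml(int books) {
        StringBuilder sb = new StringBuilder("<catalog>");
        for (int i = 0; i < books; i++) {
            sb.append("<book id=\"bk").append(i).append("\">")
              .append("<title>Title ").append(i).append("</title>")
              .append("<genre>Fantasy</genre></book>");
        }
        return sb.append("</catalog>").toString();
    }

    public static void main(String[] args) {
        byte[] raw = sampleXml(1000).getBytes(UTF8);
        byte[] packed = compress(raw);
        System.out.println("Raw bytes:        " + raw.length);
        System.out.println("Compressed bytes: " + packed.length);
        System.out.println("Round trip ok:    "
                + Arrays.equals(raw, decompress(packed)));
    }
}
```

The exact ratio depends on your messages, so it is worth measuring with a representative payload before deciding whether the extra CPU cost is justified.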

Thursday, August 12, 2010

Java Monitoring Tools

JDK 6.0 comes bundled with a number of handy, but often overlooked, utilities to monitor, manage and troubleshoot Java applications. They can be found in the bin directory of your installation. Here are a few of them explained:

1) jps - report java process status
This command prints information about active Java processes running on a given machine. The output contains the local VM identifier (lvmid, typically the process id), the name of the main class and any arguments passed to it or the JVM.

sharfah@starship:~> jps -lmv
3936 sun.tools.jps.Jps -lmv -Xms8m
5184 test.TestClient -Xmx1024m

2) jstat - statistics monitoring tool
This command allows you to monitor memory spaces of a JVM. For example, using the "-gcutil" option will show you the utilisation of the eden (E), survivor (S0, S1), old (O) and permanent (P) generations and how long the minor and major GCs are taking. You can gather these statistics continuously by specifying a sampling interval and how many samples you wish to take.

sharfah@starship:~> jstat -gcutil 5184 1s 5
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00   0.00   3.04  85.47  11.68      5    0.009     1    0.023    0.032
  0.00   0.00   3.04  85.47  11.68      5    0.009     1    0.023    0.032
  0.00   0.00   3.04  85.47  11.68      5    0.009     1    0.023    0.032
  0.00   0.00   3.04  85.47  11.68      5    0.009     1    0.023    0.032
  0.00   0.00   3.04  85.47  11.68      5    0.009     1    0.023    0.032

3) jstack - stack trace
Prints out a complete thread dump of your application (just like "kill -3"). Useful for investigating what your application is doing and identifying deadlocks.
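
jstack attaches to a process from the outside. If you need roughly similar information from inside your own application, the java.lang.management API can produce it. The sketch below is only an approximation of what jstack prints, and the class and method names are my own; it uses ThreadMXBean.dumpAllThreads and findDeadlockedThreads, both available since Java 6.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class InProcessThreadDump {

    // Builds a simple text dump of every live thread and its stack.
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip locked monitor and synchronizer details
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\tat ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    // Number of threads currently stuck in a deadlock (0 if none).
    static int deadlockedThreadCount() {
        long[] ids = ManagementFactory.getThreadMXBean()
                .findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println(dump());
        System.out.println("Deadlocked threads: " + deadlockedThreadCount());
    }
}
```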

4) jmap - memory map
Use this command to print a histogram of the heap to show you the number of instances of each java class and how much memory they are occupying. You can also dump the heap to a file in binary format and then load it into Eclipse Memory Analyser as I described here.

sharfah@starship:~> jmap -histo 5184
sharfah@starship:~> jmap -dump:format=b,file=heap.bin 5184

5) jhat - heap analysis tool
This command reads a binary heap file (produced by the jmap command, for example). It launches a local webserver so that you can browse the heap using a web browser. The cool thing is being able to execute your own queries using Object Query Language (OQL) on the heap dump.

sharfah@starship:~> jhat heap.bin

6) jinfo - configuration info
Prints out java system properties (like the classpath and library path) and JVM command line flags. Doesn't work on Windows though! Also allows you to enable/disable/change VM flags.

sharfah@starship:~> jinfo -flag PrintGCDetails  4648
-XX:-PrintGCDetails
sharfah@starship:~> jinfo -flag +PrintGCDetails 4648