Sunday, December 26, 2010

Tuning HBase and Mapreduce for Mass Inserts

For several months my colleagues and I have been earning our battle scars as we run progressively larger bulk inserts into HBase. I'm going to try to recall as many of the lessons learned as possible. At the time of this writing HBase 20.6 is the latest production release:

  1. Use  HBase 20.6. We had been running 20.3 and it suffered from a hideous deadlock bug, that would cause clients to simply stall and do nothing, until they were killed by the Hadoop task tracker. Even worse, because they were killed and restarted, this caused a vicious cycle of reducers stalling, restarting, stalling, etc. All the while threads and client connections were eaten up in HBase.
  2. Use huge numbers of mappers, and huge numbers of reducers so that your data sets are broken into bite sized pieces. To do this, set mapred.tasktracker.[map|reduce].tasks.maximum (the max value that will be run by a single task tracker at once) to a value correlated to the number of cores on each box (we'll cal this "C") and the number of boxes "B". Then set set mapred.[map|reduce].tasks (the total number of mappers or reducers that will get run) equal to a*C*B. "a" is then a free parameter which controls how big your bite sized batches of data will be. As a rule of thumb, if you are reading data out of HBase via TableMapper, I try to make a*C*B between 1000 and 10000, but it all depends on how much processing you ned to do on each chunk.
  3. This is an addendum to rule (2). All sorts of bad things happen when your mappers or reducers bite off too much data, and run for too long. Just a few items on the buffet of errors and warnings that you'll chase if your mappers or reducers try to handle too much data:
    • HBase clients leases expiring
    • Chance of OoME
    • Failure to report status for more than 600 seconds causes task to get shot in the head
    • Hbase client slows down on writes(annecdotal. I can't prove this, but I know that if you want to avoid ptential slowdowns, then following rule 2, in my experience, helps a lot)
  4. Use fine grained status. Make calls to progress(), setStatus(msg), and use counters to track what your job is doing. Nothing is more annoying that not being able to tell what the hell your job is doing when it has been running for 5 hours. Rule 2 helps here too. The more mappers and reducers you have, the more continuous and fine grained will be your status reporting as each task completes with its data load into HBase.
  5. Don't screw around too much with the default settings. Ryan Rawson has put out a good set of defaults in this PPT. You can turn these dials ad infinitum but I can pretty much promise you that there is no magic set of settings. This can be a massive time sync. Stick with Ryan's recommendations.
  6. Watch out for problems with your hostnames. Let's say your hostname is "". If you execute "hostname" and all you get back is "s1" or "localhost" you are in for problems. Check your /etc/hosts and make sure that aliases like "s1" don't come before the fully qualified hostname.
  7. Watch out for msiconfigured NICs. Problems with vanishing packets sent me chasing my tail for weeks. This was the classic buffet of errors, with everything seemingly leading to no resolution.
  8. Disable the Write Ahead Log from your client code. Use an appropriately sized write buffer. There'sa point of diminishing returns here. Exceeding the size of the serverside buffers won't do you any good.
  9. Don't try to get too smart. Most bulk HBase data-producing M/R jobs are going to do some processing, then write the data from the reducer. Since the reducers all receive keys in the same order, this causes all the reducers to attack the same HBase region simultaneously. We had this "great idea" that if we reversed the keys that we wrote out of our mapper, then un-reversed them in the reducer, that our reducers would be randomly writing to different region servers, not hitting a single region in lock step. Now, I have some theories on why this seemingly innocuous approach repeatably destroyed our entire Hbase database. I won't wax philosophical here, but one thing is certain. Any table created via batch inserts of randomized keys got totally hosed. Scans became dirt slow and compactions ran constantly, even days after the table was created. None of these problems made a whole lot of sense, which is why it took 3-4 weeks of debugging for us to back this "key randomizing" out of our code. The hosed tables, actually had to be dropped for the problem, and ensuing chaos to totally abate.
  10. tail all your logs and scan them assiduously for errors and warnings. I use VNCserver with different desktops for tails of all the logs so that I never have to "re-setup" all my tailf commands when I log in and out of the server.
  11. Disable speculative execution. You don't want two reducers trying to insert the exact same content into the database.

Monday, November 29, 2010

Black Friday Buys

Picked up a wireless access point for 10 bucks on Placed it at the highest point in the house, and set it up to pull it's own IP address from DHCP off my existing wireless access point, so it's basically a range extender. For a $10 "TRENDnet", I was really impressed at how easy it was to setup. I tried the same thing with a Linksys several months back. The linksys would work on and off and just periodically die, and it cost something like $50. Anyway, I'm posting from my living room and receiving a good signal, so definitely worth the $10.

Saturday, November 27, 2010

NoSQL: Quite Possibly The Stupidest Catch Phrase Ever?

A recent article on Gigaom is titled "Why cloud computing sells and NoSQL is fading".  I'd like to parlay that discussion into one of "why NoSQL is quite possibly the stupidest catch phrase ever". Here's where I'm coming from. Clearly the "NoSQL" moniker arose from a desire to describe systems that avoided the scalability limits of relational databases. But from the get-go, the term is just...STUPID. First of all, SQL is a query language, and there are relational systems that don't use SQL. So, "NoSQL" isn't even a good way to say "non-relational". Secondly, defining systems in terms of what they *don't* do means that dozens of systems that shouldn't get grouped together, do get grouped together. Corn: NoSQL! Windbreakers: NoSQL!

Adding the confusion was this classic: Map Reduce: A Major Step Backwards.  It's amazing that this was written by Michael Stonebreaker, but this just goes to show you that even a genius can write a ridiculous article when he's writing for a PR blog. Then there was the 180 degree follow up (ooops, how stupid do we look?): Reaffirming Our Commitment to Hadoop and Mapreduce.

Honestly, the entire NoSQL thing is, to me, a mark of short sited PR as well as a failure to recognize that SQL databases can and must peacefully co-exist with systems designed to tackle other aspects of scalability and data processing. When Hadoop and Mapreduce get thrown in with "NoSQL" it especially drives me nuts, because Hadoop and Mapreduce are not even data retrieval or management systems! C'mon folks, Hadoop is for data processing. And there are lots of reasons to take the output and put it in a relational database. 

So, I was happy to see the Gigaom article. The NoSQL emperor has no clothes. It's a stupid catch phrase, and a clear indicator that the person using it is clueless.

Sunday, October 24, 2010

SPIRE 2010 Los Cabos

I recently attended the String Processing and Information Retrieval conference (SPIRE 2010). In terms of content, this conference far exceeded my expectations. Excellent presentations on the latest research. Stuff I'd *never* even heard of: Wavelet Trees for example. Very friendly presenters, who enthusiastically spent time with me between presentations to thoroughly teach some of the concepts.

 Yahoo! Research presents Time Explorer

Microsoft Research Keynote

No complaints about the venue either. Met so many cool and interesting researchers in the field of IR (which is basically "search").

Java SimpleDateFormat: simply a waste of Time

I've now sunk probably 8 hours  into trying to make a minimally robust date parser that can handle dates in a variety of formats. Sadly, I'm afraid this is the best I've been able to do, given the fragility of the Java SimpleDateFormat class. I've composed a little framework in two pieces. The first piece is a properties file that defines the formats that the user will be allowed to enter dates in. The second piece is the code that jumps through an absurd series of hoops in order to work around bugs and shortcomings in SimpleDateFormat. If anyone knows a better way, that's been *tested*, please, for the love of GOD, post a comment here.

format1=yyyy-MM-dd HH:mm:ss.SSS z
format2=yyyy-MM-dd HH:mm:ss.SSS
format3=yyyy-MM-dd HH:mm:ss
format4=yyyy-MM-dd HH:mm
format5=E, MMM dd, yyyy
format6=MMM dd yyyy
format7=MMM dd yyyy HH:mm
format8=MMM dd, yyyy
format9=MMM. dd, yyyy
format14=yyyy-MM-dd, E
format15=yyyy MMM. dd, HHh 1mmm ss's'

Notice in the formats above that all months and days are defined with two format chars. E.g. "MM" or "DD". Why not "M" or "D"?  SimpleDateFormat format strings are very rigid. We want the user to be able to enter 10/1/07 or 10/01/07. But I don't want to clutter the properties file with single and double format chars. For this reason, the code automatically pads any single digits with zero. So "1" in isolation is converted to "01". So if the user enters "Jul 7, 2001" the string is automatically zero-padded to "Jul 07, 2001". This allows the properties file to include only the double-digit variants of date numbers without having to account for every variation in which a digit could be either single or double digit. With regard to the code below, I haven't taken time to extract it as a library, so please forgive the unexplained parameters such as "id" and "paramName".This code convers the time provided by the user into the current Greenwich Mean Time.

Java Source Fragments

private static final Pattern ISOLATED_DIGIT = Pattern.compile("(^|\\D|\\s)(\\d)(\\D|$|\\s)"); //digit can be surrounded by non-digit but also begining or end of line or whitespace

public static Object parseDate(String dateString, String id, String paramName) {
        if(null == dateString){
            return dateString;
            //default time stamp is Zulu time (time at prime meridian, Greenwich)
            long currentTime = System.currentTimeMillis();
            Timestamp ts = new java.sql.Timestamp(currentTime-TimeZone.getDefault().getOffset(currentTime));
            return ts;
            dateString = zeroPadIsolatedDigits(dateString);
            if(log.isDebugEnabled())log.debug("input zero-padded to " + dateString);
            for(Map.Entry entry:DATE_FORMATS.entrySet()){
                String key = entry.getKey().toString();
                DateFormat dateFormat = new SimpleDateFormat(entry.getValue().toString());
                    java.util.Date date = dateFormat.parse(dateString);
                    //I have found bugs in the parse method, in which strings that *don't* actually match the format template
                    //get parsed without Exception, thus truncating trailing parts of the date (like the timezone). For this reason
                    //I am forced to regurgitate the date string, and compare it to the input, and if they don't match, skip forward in the loop
                    String regurgitated = dateFormat.format(date);//...and no, we can't just compare the regurgitated string b/c PST and get adjusted to PDT
                    if(regurgitated.length() != dateString.length()){ //compare their lengths to make sure nothing got truncated
                        if(log.isDebugEnabled())log.debug(dateString + " was parsed with format " + entry.getValue() + " but was apparently parsed incorrectly as " + regurgitated + " length difference was "+(regurgitated.length()-dateString.length()));
                        //the length of the regurgitated string matches the length of the input.
                        //We still need to eat the regurgitated string, and compare the resulting  ms since epoch to the Date we originally parsed to make sure we didn't
                        //encounter a SimpleDateFormat parsing bug.
                         //Example: 2010-9-23 12:22:23.000 PST gets regurgintated as 2010-09-23 13:22:23.000 PDT, which is different text, but same ms since epoch
                        //So the above illustrates why we cannot just compare the regurgitated text to the input dateString, because Java may decide on an equivalent but
                        //textually different representation from the input (PST is the same as PDT in the half of the year when daylight savings time isn't in effect).
                        java.util.Date reparsed = dateFormat.parse(regurgitated);
                        if(reparsed.getTime() != date.getTime()){
                            if(log.isDebugEnabled())log.debug(dateString+" produces different ms since epoch than  " +regurgitated );
                    if(log.isDebugEnabled())log.debug("handled date" + dateString +" using format template: " + entry.getValue().toString() + " which regurgitated " + dateString);
                    TimeZone timeZone = dateFormat.getTimeZone();
                    //convert to GMT time and account for timezone
                    return new java.sql.Timestamp(date.getTime()-timeZone.getOffset(date.getTime()));
                }catch(ParseException pe){
                        log.debug(dateString + "couldn't be parsed with format " + entry.getValue());
            throw Util.newApplicationException(id, "date "+ dateString+" could not be parsed.","DataFormatException", 130, paramName);

     private static String zeroPadIsolatedDigits(String s) {
        //Java date parsing will prepend leading zeros to islolated digits. For this reason, I prepend a zero to the isolated digits before parsing the string, so that the regurgitation step
        //produces a string identical to the input
        boolean done = false;
        //this is an obscure case in which the matching regions overlap, therefore m.find() only finds one of the overlapping instance.
        //For instance, 100-1-2. Find() won't find "-2 " because it overalps "-1-". Consequently we have to repeatedly call find on a new matcher after each replacement has been made
        while (!done) {
            Matcher m = ISOLATED_DIGIT.matcher(s);
            StringBuffer buf = new StringBuffer();
            if (m.find()) {
                if(log.isDebugEnabled())log.debug("found isolated digit:" +;
                m.appendReplacement(buf, + "0" + +; //need to stuff back in the stuff before the isolated digit, then 0 (zero-pad), then the digit, then stuff after)
            }else {
                done = true;
            m.appendTail(buf); //OK, al isolated digits have been replaced by leading-zero padded digits
            s = buf.toString(); //replace input with zero-padded input
        return s;

Saturday, October 23, 2010

Data Storm

This morning's project was installing a Davis Vantage Pro weather station. It's really an amazing piece of equipment. It logs windspeed and direction, solar radiation, and dozens of other variabls. It even has a rain collector. The data comes off the unit in real-time, and is transmitted to a nice LCD console that shows graphs, weather forecasts etc. I hope to log enough data on sun and wind in order to make a determination as to whether solar panels or wind power will be my best alterative energy source. I'm also interested in quantifying just how bizarre the microclimate of western san francisco actually is. How many days is it foggy? What time does the fog typically roll in or out in a given month? With a bit of free time, I hope to be able to load the data into nextdb to create a detailed data history. 

 I'm already getting some valuable siting data on windspeed. There are several interesting vertical-axis helical wind-power generators that might be appropriate. Hmmm.

Saturday, September 25, 2010

Government "Black Ice" Becomes a Reality

Keeping with the William Gibson theme, the future - whoops, I meant "the present"- look increasingly like something dreamed up by said sci fi author.  This week we learned about STUXNET: Government Black Ice designed to attack industrial control facilities, and in particular the Bashir Nuclear Reactor. The sophisticated source code includes the ability to inject itself into programmable logic devices; hacking the actual hardware itself, relying on proprietary hardware secretes stolen from two Taiwanese manufacturers.  Apparently too complex to have been written by a hacker in a dorm room, the media concludes that it must be "state sponsored". Maybe. Probably. But in a present that increasingly looks like the cyberscape described in Neuromancer we shouldn't rule out multinational corporations as the source. As pointed out in the media, STUXNET *is* a *weapon*. Cyberspace is a battlefield where corporations, states, and individuals can wage war on a practically equal footing, with the ability to cloak their identity. Imagine that; anonymous war with motives known only to the participants. Target: a nuclear reactor.

Wednesday, September 01, 2010

Awesome Op-ed in New York Times by William Gibson

This a fantastic Op-ed by William Gibson on Google. 

Thursday, July 15, 2010

Neuromancer Vs. Matrix

Neuromancer was written by William Gibson and published in 1984.  I first read it high school back in 1993. Every few years I read it again, and each time I find something new to like.

In my most recent reading, I find myself noticing how much of the concept matter in The Matrix is based on Neuromancer. Gibson coined the term "The Matrix" and "cyberspace", and Gibson describes a Rasta "freeside" called "Zion" (albeit in orbit), and simstim memory sticks that jack in through a socket behind the ear. It's hard to imagine that Gibson wrote Neuromancer before there was a world wide web.

"The Matrix is ... a concentual hallucination. ..A graphic representation of data abstracted from the banks of every computer in the human system. Unthinkable complexity. Lines of light arranged in the nonspace of the mind, clusters and constellations of data. Like city lights, receding..."

This part reminds me of the "symbols" that fall down the screens in the movie The Matrix:

"And in the bloodlit dark behind his eyes, silver phosphenes boiling in from the edge of space, hypnagogic images jerking past like film compiled from random frames. Symbols, figures, faces, a blurred fragmented mandala of visual information". 

Wednesday, July 07, 2010

Android Earcon

Did you notice that Google's Android phones have a startup sound that's quite familiar...that is, if you are a fan of Blade Runner. The little "bootup jingle" is from the opening scene soundtrack by Vangelis in  Blade Runner. Listen here, its at 1 minute 40 seconds.

Hadoop Summit

Last week I attended the Hadoop Summit. It was sold out ten days prior! That doesn't happen very often these days, in an environment where corporate travel is severely cut back. I even saw sizable contingents from Japan and Brazil. This is a very exciting time to be involved in the field of search and data processing. The science track was quite interesting, as was the talk by facebook about how they are crunching something like 90 TB per week, and are siting on 85 Peta Bytes. They are using hadoop mapreduce in near real-time. Processing latencies are getting close to 1 minute. That's nearly real-time availability of all the information you post on facebook, to everyone else (and every moving part) in the facebook system. Oh, and the lunch was great.

Tuesday, July 06, 2010

badassery with bits: counting 1 bits

I was recently looking at some code to count the number of bits in an integer that are set to 1. Here is the method:

int = getARandomInt();
   n &= n-1;
print("There are "+count+" bits turned on.");

Can you figure out why it works?

hint: negative one has every bit turned on because negating a number is to take it's two's complement (which flips every bit), and add 1. Watch what happens with the carry bit when you add n to -1.

Sunday, June 13, 2010

Matrix Challenge

Here is an interesting challenge posted on the website of a company called Rapleaf:

We wish to find, for an arbitrary N, an N x N matrix with the following property: each entry in the matrix is the smallest non-negative number which does not appear either above the entry or to its left.
  1. Write a function that computes the entry in a particular row and column. That is, complete the following:

    int entry_at(int row, int column) {
        // ...
  2. Can you write the entry_at function to run in constant space and constant time?
  3. Can you prove that your algorithm is correct?

 For instance, here is a 6x6

I posted a solution before, which was wrong, because I only looked at a 6x6, and my solution failed for larger matrices. Thanks to whomever posted the comment pointing out that my solution was broken. So I rewrote it in Java, after looking at some larger matrices.  Here is a printout of the 32x32 matrix produced by the program.

 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
 1  0  3  2  5  4  7  6  9  8 11 10 13 12 15 14 17 16 19 18 21 20 23 22 25 24 27 26 29 28 31 30 
 2  3  0  1  6  7  4  5 10 11  8  9 14 15 12 13 18 19 16 17 22 23 20 21 26 27 24 25 30 31 28 29 
 3  2  1  0  7  6  5  4 11 10  9  8 15 14 13 12 19 18 17 16 23 22 21 20 27 26 25 24 31 30 29 28 
 4  5  6  7  0  1  2  3 12 13 14 15  8  9 10 11 20 21 22 23 16 17 18 19 28 29 30 31 24 25 26 27 
 5  4  7  6  1  0  3  2 13 12 15 14  9  8 11 10 21 20 23 22 17 16 19 18 29 28 31 30 25 24 27 26 
 6  7  4  5  2  3  0  1 14 15 12 13 10 11  8  9 22 23 20 21 18 19 16 17 30 31 28 29 26 27 24 25 
 7  6  5  4  3  2  1  0 15 14 13 12 11 10  9  8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24 
 8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7 24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23 
 9  8 11 10 13 12 15 14  1  0  3  2  5  4  7  6 25 24 27 26 29 28 31 30 17 16 19 18 21 20 23 22 
10 11  8  9 14 15 12 13  2  3  0  1  6  7  4  5 26 27 24 25 30 31 28 29 18 19 16 17 22 23 20 21 
11 10  9  8 15 14 13 12  3  2  1  0  7  6  5  4 27 26 25 24 31 30 29 28 19 18 17 16 23 22 21 20 
12 13 14 15  8  9 10 11  4  5  6  7  0  1  2  3 28 29 30 31 24 25 26 27 20 21 22 23 16 17 18 19 
13 12 15 14  9  8 11 10  5  4  7  6  1  0  3  2 29 28 31 30 25 24 27 26 21 20 23 22 17 16 19 18 
14 15 12 13 10 11  8  9  6  7  4  5  2  3  0  1 30 31 28 29 26 27 24 25 22 23 20 21 18 19 16 17 
15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 
17 16 19 18 21 20 23 22 25 24 27 26 29 28 31 30  1  0  3  2  5  4  7  6  9  8 11 10 13 12 15 14 
18 19 16 17 22 23 20 21 26 27 24 25 30 31 28 29  2  3  0  1  6  7  4  5 10 11  8  9 14 15 12 13 
19 18 17 16 23 22 21 20 27 26 25 24 31 30 29 28  3  2  1  0  7  6  5  4 11 10  9  8 15 14 13 12 
20 21 22 23 16 17 18 19 28 29 30 31 24 25 26 27  4  5  6  7  0  1  2  3 12 13 14 15  8  9 10 11 
21 20 23 22 17 16 19 18 29 28 31 30 25 24 27 26  5  4  7  6  1  0  3  2 13 12 15 14  9  8 11 10 
22 23 20 21 18 19 16 17 30 31 28 29 26 27 24 25  6  7  4  5  2  3  0  1 14 15 12 13 10 11  8  9 
23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24  7  6  5  4  3  2  1  0 15 14 13 12 11 10  9  8 
24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7 
25 24 27 26 29 28 31 30 17 16 19 18 21 20 23 22  9  8 11 10 13 12 15 14  1  0  3  2  5  4  7  6 
26 27 24 25 30 31 28 29 18 19 16 17 22 23 20 21 10 11  8  9 14 15 12 13  2  3  0  1  6  7  4  5 
27 26 25 24 31 30 29 28 19 18 17 16 23 22 21 20 11 10  9  8 15 14 13 12  3  2  1  0  7  6  5  4 
28 29 30 31 24 25 26 27 20 21 22 23 16 17 18 19 12 13 14 15  8  9 10 11  4  5  6  7  0  1  2  3 
29 28 31 30 25 24 27 26 21 20 23 22 17 16 19 18 13 12 15 14  9  8 11 10  5  4  7  6  1  0  3  2 
30 31 28 29 26 27 24 25 22 23 20 21 18 19 16 17 14 15 12 13 10 11  8  9  6  7  4  5  2  3  0  1 
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0 

The valAboveDiag method takes advantage of the fact the matrix is symetric about the main diagonal. For values below the diagonal, the row and column numbers are simply transposed. To compute an arbitrary value in an NxN matrix, the method executes O(log(N)). However, if I always intended to print the entire matrix, I could have rewritten this to take advantage of values already stored in the matrix. In that case, adding a new value to the matrix would occur in constant time. The central idea behind the code below, is that the matrix is built up of matrices whose sizes are powers of 2 (e.g. 2x2,4x4,8x8, etc), organized in a pattern [[A,B][B,A]]. For a given cell, above the main diagonal, the cell is either in quandrant [0,0], [0,1], or [1,1]. Because of the pattern [[A,B][B,A]], this means that quadrant [0,0] is A, quadrant [0,1] is B, and [1,1] is A. Observing that B=A+mid, where "mid" is the "middle value" midway between 0 and the maximum value of the power-of-two block, we can only need to be able to produce A to find any other value in the power-of-two block. This process simply recurses until A is found to be the identity matrix (i.e. [[0,1],[1,0]]).

package random;

import java.util.Arrays;

 * @author geoffreyhendrey
public class MadMatrix {
    int[][] a;

    public MadMatrix(int size){
        a = new int[size][size];
        for(int r=0;r<size;r++){
            for(int c=0;c<size;c++){
                a[r][c] = valAboveDiag(r,c);

    public String toString(){
        StringBuilder buf = new StringBuilder();
        for(int[] row:a){
            for(int column:row){
                buf.append(String.format("%2d ", column));
        return buf.toString();

    private int valAboveDiag(int r, int c) {
            int tmp = c;
        int order = getOrder(c);
        if(0 == order){
            if(r==c) return 0;
            return 1;
        int mid = 1<<order;
        if(c >=mid && r<mid){
            return valAboveDiag(r, c-mid) + mid;
            return valAboveDiag(r-mid, c-mid);

    private static int getOrder(int c){
        int order = 0;
        while(c > 1){
            c >>>=1;
        return order;

     * @param args the command line arguments
    public static void main(String[] args) {
        MadMatrix mm = new MadMatrix(32);


Sunday, April 04, 2010

AES Encryption in Java

I wrote this block of code a couple years ago, but it has served me well. I remember it being quite a challenge to get it to work properly. There was, and maybe still is a dearth of decent examples on using AES encryption from Java. The examples I found didn't deal with Cypher Block Coding (CBC). CBC is essential to decent encryption because it feeds the output of one cyphered block forward into the next, thus creating a cascade of randomness. All you need is an initial random phrase in the data you wish to encrypt. If you don't use CBC, you'll notice that blocks of your encrypted data can easily be recognized as separators in what you are encrypting. For example, let's say I encrypt the phrase "user:bob:password:abc". The substring "user:" and "password:" will cause an identifiable pattern in the encrypted data. If someone has access to snoop the wire, they can literally lift the encrypted username and password out of someone else's transmission, and paste it into their own. While CBC with a leading random phrase will make it impossible to identify such non-randomness, you still have to guard against someone injecting cruft into the encrypted data. This should be done by appending an MD5 hash of the unencrypted data, then encrypting the whole ball of wax.

    private static String encrypt(String encryptMe, byte[] cryptKey, SecretKey secretKey, byte[] iv){
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding", "SunJCE");
            cipher.init(Cipher.ENCRYPT_MODE, secretKey, new IvParameterSpec(iv));
            byte[] raw = encryptMe.toString().getBytes("ASCII");
            if(log.isDebugEnabled())log.debug("unencrypted bytes: " + raw.length);
            byte[] cipherText = new byte[cipher.getOutputSize(raw.length)];
            int ctLength = cipher.update(raw, 0, raw.length, cipherText, 0);
            ctLength += cipher.doFinal(cipherText, ctLength);
            if(log.isDebugEnabled())log.debug("ctLength: " + ctLength);
            if(log.isDebugEnabled())log.debug("cipherText Length: " + cipherText.length);
            //raw = cipher.doFinal(raw, 0, raw.length);
            byte[] copy = new byte[ctLength];
            System.arraycopy(cipherText, 0, copy, 0, ctLength);
            String encrypted = new String(new Base64().encode(copy), "US-ASCII");
            return encrypted;
        } catch (Exception ex) {
            throw new RuntimeException(ex.getMessage(), ex);

Friday, April 02, 2010

what's in your history?

[ghendrey@geoffunix hadoop]$ cut -f1 -d" " ~ghendrey/.bash_history | sort | uniq -c | sort -nr | head -n 20
    206 hadoop
     27 ls
     23 cd
     18 su
     17 grep
     15 tac
     11 pwd
     10 exit
      8 env
      5 less
      4 /mnt/hadoop/hbase-0.20.3/bin/hbase
      4 man
      4 ifconfig
      3 telnet
      3 source
      3 mount
      3 ant
      2 vi

Tuesday, February 23, 2010

Webscale Computing and Hadoop

I've been using Hadoop extensively for a month or so now at the office. I've been a big fan of Doug Cutting's Lucene technology, so when I heard that Doug Cutting was the guy behind, Hadoop, that pretty much pushed me over the edge and I started using Hadoop for building new types of database indexes from multi-gigabyte datasets.

Hadoop is an incredibly rich software stack consisting of three main pieces:
  1. The Hadoop Distributed File System (HDFS)
  2. A Map/Reduce infrastructure consisting of Daemons that layer services on top of HDFS
  3. A set of APIs allowing a data processing task to be split into simple Map and Reduce phases
There's a fourth piece, which is the many related tools (HBase, Hive, Pig, etc) which are layered on top of Hadoop.

If one was to read the description of map/reduce one might conclude that it is pretty much nonsense. In fact, it sounds to trivial to even be called an algorithm. Put things into groups, operate on the groups. Big deal. It sounds pretty much like common sense. Until you work with Hadoop, you really cannot appreciate all of the benefits that the Hadoop stack brings. It's really the collective benefits of the entire stack that make for the game changing experience that programming with Hadoop really is.

With problems of processing huge datasets, the devil is in the details. Hadoop provides a framework that removes all of the annoying complexity associated with big data sets. In essence you write two simple methods and from that point on you don't care much whether you are operating on 1 byte or 1 TB. This is possible in large part because HDFS allows you to think of terrabytes worth of data as a simple URL such as hdfs://namenode/user/geoff/BigDataSet. HDFS takes care of replicating your data, and figuring out where the data blocks actually reside in the cluster. As an added bonus, Hadoop automatically deals with nuances such as the files being zipped. But wait, there's more. From the hadoop command line, you can run commands to cat and grep these URLs, again, acting as if they were simple local files.

For me, one of the most interesting side effects of running a hadoop cluster has been how it changes ones ability to interact with peers on a webscale computing project. You can now shoot your co-worker a URL to a 200GB file, and they can poke and prod at the contents by doing things like "hadoop fs -text hdfs://namenode/user/geoff/BigData".

I'll try to write more about this later, but for now suffice it to say that Hadoop is quite exciting to work with. I think the criticism by Michael Stonebreaker have totally missed the point of what a good implementation of a map/reduce framework can yield, probably because their criticism focused on the map/reduce algorithm, which in and of itself is trivial. And that's really the point. It's really all about the tools and the entire stack that make map/reduce simply a "part of this complete breakfast" when it comes to hadoop. So don't forget the toast, juice, and yummy Cheerios!

Sunday, February 21, 2010

HATEOAS - Hypermedia As The Engine Of Application State

Roy Fielding's PhD dissertation on Architectural Styles and the Design of Network-based Software Architectures is the seminal paper on the RESTful style of web architecture. Before discussing the paper, let's first discuss what it is, because it seems quite unusual in its approach as a PhD dissertation. Reading this paper in the 21st century, one might conclude that the paper is an "experience paper" (as defined here). However, as noted, "An experience paper that merely confirms that which is well known as opposed to that which is widely believed is of little value." However, in the early 1990's, the topic of the dissertation was probably not widely known. The irony now, is that as the term REST becomes more and more commonplace, and is applied to non-RESTful systems, the level of confusion over REST might be rising, not falling.

I was certainly confused for a long time about what constitutes a RESTful system. Isn't anything that has a URL RESTful? Definitely not, as the presence of resource identifier is just one of several properties that make a system RESTful. Fielding has recently written this article listing the properties he sees as essential to a RESTful system. After working quite a bit on the RESTful system for, it seems that me that the principle of Hypermedia As The Engine Of Application State (HATEOAS), identified on the original dissertation section 5.1.5, might be the most important principle of all. Incidentally, of all the properties outlines in the recent article, Fielding chose to title the article "REST APIs Must be Hypertext Driven". To my mind, that confirms my hunch that HATEOAS is perhaps the least understood principle of REST, and the principle in need of the most discussion and explanation.

The essence of HATEOAS is that the client needs to understand as little as possible. In essence, this is quite similar to the experience a human requires when he browses a website. In order to move from one page to the next, we must be given links that we can navigate. This is our common experience on web pages. Consider an alternative, and less user friendly,  scenario in which the home page simply consisted of a set of instructions (an API, if you will). One instruction might read "To view articles about sports, visit a URL consisting of "http://sitename/topics?name=topic-name" and replace topic-name with 'SPORTS'". You would be forced to manually construct the URL and enter it into the browser. The notion of such a website seems absurd, and yet it describes the scenario common to RPC or "fake REST" APIs in which the client is responsible for constructing the state-transition links rather than being handed a hypermedia document that provides the available state transitions. 

Quoting Fielding: "...all application state transitions must be driven by client selection of server-provided choices that are present in the received representations or implied by the user’s manipulation of those representations".

This is an interesting article that shows how non-HTML hypermedia can be used in keeping with HATEOAS. However, I think there are excellent reasons why Plain Old HTML (POH) is the best hypermedia choice for use with RESTful systems:

  1. HTML was made for hypermedia (it has well known structures for encoding links)
  2.  Your mom knows what HTML is
  3. By using POH for HATEOAS, your RESTful system's content can be indexed by search engines
  4. You can easily view and traverse your system states using a web browser
POH for HATEOAS is the approach taken by NextDB's REST system. In the documentation, we do describe the explicit structure of several of the URL's, but we stick strongly to the principle outlined by Fielding that, beyond the entry point URL, or "bookmark" you shouldn't need to know the structure of any of the URLs. Using POH for HATEOAS, the proof of NextDBs' adherence to this principle is quite straightforward: I simply give you an entry point URL (a bookmark) to a table and allow you to navigate the links served in the POH. For example, here is the bookmark to an account named 'geoff' with a database named 'testchars' and a table named 'lines':

From the URL above you can click on links to visit various pages of the content and sort the content. No knowledge of an API is required. Taken to its logical conclusion, I see no reason why the RESTful service should not be allowed to directly present styled content, when the consumer of the content is human. So that is exactly what we do in NextDB, by allowing the POH to include CSS. Here is an example:

We're techically violating HATEOAS because we don't provide links to the styled content, rather the developer has to know about the available styles by reading our documentation. However, we'll soon correct this, as well as allowing the developer to pass in links to his own CSS for inclusion in the returned POH.

Finally, not only do we allow you to style the POH, but we also allow you to apply XSLT transformations to the POH in order to alter the structure (as opposed to style) of the POH. Fielding discusses "Processing Elements" or "Transformers of Data" in his thesis. I believe XSLT to be the POH analog for processing elements, in that they are well understood, easily encapsulated in a markup, and supported by a wide array of processors (including browser-side processing, although this tends to be less portable, which is why NextDB opted to perform the transformation on the server).

In summary, NextDB is a truly RESTful database. This is important not because it conforms to a buzzword, but because it has made NextDB so easy to use that even non-programmers are able to embed the POH hypermedia in their sites. Project Have Hope is a great example of one such site. The "catalogs" of sponsorable children and women in Africa is POH served out of  The HATEOAS architecture is the key to opening the database content to CMS engine-driven sites, as well as accessibility to web search engines.

Monday, February 15, 2010

Spam Bucket

I'm curious to see how much spam accumulates in this database if I put a publicly writable database table on my blog

UPDATE: I have the answer: "A LOT"
"Salad Man", I commend you! "get three inches on your dick!" is a noble spam to leave on my blog, but I am dissapointed you did not include a video!!!

Saturday, February 13, 2010

NextDB REST tables

NextDB REST tables can be easily embedded anywhere on the web. For example:
<iframe src=';~cols=PK,video_CONTENT_TYPE,video_FID/style/newspaper-a/pgsz/3' />

Friday, February 12, 2010

An update on JTIdy and XSLT

A couple months back I blogged about JTidy I have an update to that story. If you plan to run your XHTML through an XSLT transformation, don't use tidy.setXHTML(true). The reason for this, I found after a lot of debugging. There are a "named entities", specifically ones such as &acirc; (and many others) that are declared in the XHTML DTD. And guess what? They are NOT valid in XML.

Quick refresher: XML has only five named entities that it supports

So you're thinking, no biggie, I don't think my documents will ever have a &acirc; named entity in them so what do I care? Well, they can creep in my accident, if you are allowing users to save inputs into a database. For example, there are ISO-8859-1 (Latin-1) characters with no UTF-8 equivalent. One that I have repeatedly seeen over the years from European users is the "left quote". It doesn't even exist on an american keyboard, and when it gets posted to a system expecting UTF-8 it wreaks havoc, causing a run of several incorrectly decoded characters.

And so, you can get "inadvertant" named entities in your XHTML output due to this sort of character mangling where the UTF-8 byte stream interpretation gets borked. So now, instead of just getting some gibberish in your XSLT, the entire transformation crashes, and you get no output at all except a complaint that the input document had an unsupported character reference. Hence the solution: tell jtidy NOT to produce XHTML, but instead to produce plain old XML, like this: tidy.setXmlOut(true); When you do this JTidy doesn't put these named entities in anymore. 

One other surprise I ran into is that XML actually *DOES* support numeric character entities. So you will still get numeric character entities in the XML output such as &#128, which although specifically forbidden by HTML still render properly in most browsers. So JTidy happily outputs these numeric character entities in its XML output. You are now safe to apply XSLT transformations on a valid XML input, however, just be aware that the XHTML output of your XSLT transformation is in violation of the aforementioned forbidden, but practically allowed, numeric entities.

Confused yet? Me too. Let's all speak Esperanto and adopt 7-bit ASCII as the only allowed character set.

Thursday, January 21, 2010

Character Encoding Hell

Over the years, one of the perennial hassles that web services programmers grapple with, is undoubtedly character encoding. One of the biggest contributing factors to the problem goes all the way back to the fact that the original definition of URL-encoding failed to specify how to deal with UTF-8 or other non-ascii encodings outside the reserved character set. The current spec states that if you want to send UTF-8 strings, for example "てすと   (te-su-to) ", then you should percent encode each byte of the UTF-8 character sequence (%E3%81%A6).

Two excellent pages for test data, should you need to test some multi-byte UTF-8 characters are:

Sadly for AJAX applications, actually generating these percent encoded byte sequences is non-trivial, and doesn't seem to be readily available "off the shelf". I've had to resort to JS source such as this.Without such scripts, various attempts to use "off the shelf" JS methods produce wacky results, such as unicode representations of the strings (like \uXX\uXX) which is completely useless for transmission in a URL.

On the server, there are also problems in receiving and properly decoding the bytes. One of the biggest problems is that Java web servers don't do it in a standard way. For example, when I was working on Tomcat 5.x,  the Servlet getParameter method would assume ISO-8859-1 (Latin 1) encoding, which would garble any properly UTF-8 percent encoded bytes. There is a well hidden setting to switch the default to UTF-8.

On the other hand, Jetty *does* assume the URL is UTF-8 percent encoded (love jetty!!). So without some config tweaks, don't expect servlet containers to *uniformly* deliver properly decoded UTF-8 strings.

My latest installment of UTF-8 Hell comes as I implement some more REST capabilities for Here is the content of a message I just posted to Paul Sandoz of Jersey fame (man, that guy must never sleep. he is *on the ball* on the Jersey mailing list). Mad props to Jersey. It is just killer. But I digress:

Hi Paul,

I got to the bottom of this by trying to unmarshal the string in three different ways. As I already mentioned the first way was just to call FormDataBodyPart.getValue().

toString(). This produced the improperly decoded String.

Then I tried two other ways, both of which correctly unmarshalled the bytes from the POST. As supporting information, here is the CURL line I was testing with, and an excerpt from the CURL trace, showing the proper bytes being posted.

curl --trace traced -F line=てすと http://localhost:8080/nextdb/rest/geoff/testchars/lines/rowid/PK/1

The 9 bytes highlighted below are the three japanese characters.

=> Send data, 148 bytes (0x94)
0000: 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d ----------------
0010: 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 63 61 --------------ca
0020: 33 32 31 65 31 30 39 66 36 37 0d 0a 43 6f 6e 74 321e109f67..Cont
0030: 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f 6e 3a ent-Disposition:
0040: 20 66 6f 72 6d 2d 64 61 74 61 3b 20 6e 61 6d 65  form-data; name
0050: 3d 22 6c 69 6e 65 22 0d 0a 0d 0a e3 81 a6 e3 81 ="line".........
0060: 99 e3 81 a8 0d 0a 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d ......----------
0070: 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d ----------------
0080: 2d 2d 2d 2d 63 61 33 32 31 65 31 30 39 66 36 37 ----ca321e109f67
0090: 2d 2d 0d 0a          

Both of the following two methods will properly unmarshal the correct String:

method 1: use InputStream to get raw bytes

                    InputStream is = theFormDataBodyPart.
                    try {
                        byte[] bytes = Util.readInputStream(is, 1024 * 1024, 1024 * 1024 * 1024);
                        log.debug("this many bytes: " + bytes.length);
                        for(byte b:bytes){
                        String s = new String(bytes, "UTF-8");
                        return s;
                    } catch (IOException ex) {
                        throw new RuntimeException(ex.
getMessage(), ex);

Method 2) use theFormDataBodyPart.


Monday, January 04, 2010

Well said....

The exchange between Churchill & Lady Astor: She said, "If you were my husband I'd give you poison." He said, "If you were my wife, I'd drink it."

A member of Parliament to Disraeli: "Sir, you will either die on the gallows or of some unspeakable disease." "That depends, Sir," said Disraeli, "whether I embrace your policies or your mistress."

"He had delusions of adequacy." - Walter Kerr

"He has all the virtues I dislike and none of the vices I admire." - Winston Churchill 

"I have never killed a man, but I have read many obituaries with great pleasure." Clarence Darrow

"He has never been known to use a word that might send a reader to the dictionary." - William Faulkner (about Ernest Hemingway). 

"Thank you for sending me a copy of your book; I'll waste no time reading it." - Moses Hadas

"I didn't attend the funeral, but I sent a nice letter saying I approved of it." - Mark Twain

"He has no enemies, but is intensely disliked by his friends." - Oscar Wilde

"I am enclosing two tickets to the first night of my new play; bring a friend.... if you have one." - George Bernard Shaw to Winston Churchill

"Cannot possibly attend first night, will attend second... if there is one." - Winston Churchill, in response.

"I feel so miserable without you; it's almost like having you here." - Stephen Bishop

"He is a self-made man and worships his creator." - John Bright

"I've just learned about his illness. Let's hope it's nothing trivial." - Irvin S. Cobb

"He is not only dull himself; he is the cause of dullness in others." - Samuel Johnson

"He is simply a shiver looking for a spine to run up." - Paul Keating 

"In order to avoid being called a flirt, she always yielded easily." - Charles, Count Talleyrand

"He loves nature in spite of what it did to him." - Forrest Tucker

"Why do you sit there looking like an envelope without any address on it?" - Mark Twain

"His mother should have thrown him away and kept the stork." - Mae West

"Some cause happiness wherever they go; others, whenever they go." - Oscar Wilde

"He uses statistics as a drunken man uses lamp-posts... for support rather than illumination." - Andrew Lang (1844-1912)

"He has Van Gogh's ear for music." - Billy Wilder

"I've had a perfectly wonderful evening. But this wasn't it." - Groucho Marx