A Conversation with the jGurus

Photos by Heather Champ

jGuru (jguru.com) is a great example of a niche community site with a specific problem to solve. In the beginning, it used a commercial backend product to help power the site. But as it grew, it needed a much more powerful tool, so Chief Scientist Terence Parr and CEO Thomas Burns wrote their own backend.

While their story isn’t universal (how many startups are lucky enough to have a Chief Scientist who studied linguistics and pattern recognition in college?), it tells a common tale of outgrowing a backend system. In their case, the jGurus wound up building something entirely customized, unavailable anywhere else, and perfectly suited to their needs. That is no surprise: when Parr talks about the system, he starts to confuse his pronouns, and several times in our interview he said “I” to describe the actions of the system. With people this passionate at the helm, it’s little wonder jGuru’s community is thriving.

I spoke with the jGurus over email and phone in mid-2001.

Please introduce yourself and jGuru.

Burns: I’m Tom Burns, the CEO of jGuru, which is the largest independent site for Java developers. It features over 5,000 answered Java questions, 40+ Java forums, 14 Java training courses, articles from the major Java sites, and news. I co-founded jGuru because I was frustrated at how bad companies were at marketing to and supporting developers (I have 13 years’ experience writing software). I also decided that the absence of a good, cost-effective means of reaching developers greatly slowed down the progress of software. If it takes a huge amount of money to get developers to look at a new technology, there will not be many successful new technologies. The goal of jGuru is to improve the situation: to make it less expensive for companies to provide developers with the information they want, and less expensive to reach them (hopefully putting developers in reach of smaller companies).

Parr: I’m Terence Parr, the Chief Scientist at jGuru, responsible for the implementation of the software that generates the jGuru website, and an all-around nice guy. jGuru is a portal for Java developers that provides all the tools they need to do their job. Basically, I call it “your view of the Java universe.”

In the beginning, you used a commercial backend product to help power the site. What were the pros and cons of this?

Burns: The biggest advantage was eliminating choices! I know that sounds strange. We were in a big hurry, and I (correctly, it turned out) assumed that everything we thought was important would be wrong. So I didn’t want us to worry too much about the detail—just get something up that was worth criticizing. The product that we used (Epicentric) had a definite bias with regard to site design, and it forced us to use its approach. I think it would have taken much longer to get the site done if we had started with a blank slate. Ultimately, we ended up completely changing our site design, and Epicentric was no longer appropriate.

Parr: In general, I’ve found that, as a fairly experienced hardcore technoid, the things that other people want tend to annoy me. So 90 percent of their effort goes into making the software soft and fluffy, with a great web interface, which is really irritating for me.

How do I install that on a machine by just unzipping something? I can’t—I’ve gotta go click click click, delete this guy, click click click, add that guy. It was annoying as a programmer, because, in general, the things that people will provide are for people unlike me.

Even if you do like fluffy interfaces, in general, if you have the time and expertise, you’re gonna be able to beat it and do a much better job yourself.

Ultimately, you decided to code your own backend. Why did you make this decision? How was the transition?

Burns: The transition was fairly easy for us. Epicentric was really in the middle—we did the back and the very front. The new design of the site was fairly easy to implement in the sense that it has just a couple of concepts (basically, view lists sorted and filtered in various ways).

You’ve put a great deal of thought into creating your current custom backend. Tell us a little about it.

Burns: From a performance perspective, the biggest change we made to the backend is that we put everything except “person records” in memory at startup (there are too many registered users to load them all). This makes a huge difference in both complexity (it is much simpler) and performance. I would recommend it to anyone who inherently has a manageable amount of data (our machines have a gig of RAM). The other big feature is that we templatized the look: we can change it by just changing a few files.
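To make the idea concrete, here is a minimal Python sketch of the pattern Burns describes: content is read into memory once at startup, while person records are fetched only when asked for. The interview does not show jGuru's code, so every name here (`SiteStore`, `FaqEntry`, the loader callbacks) is hypothetical, and remembering person records after the first fetch is an assumption rather than something Burns states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaqEntry:
    entry_id: int
    question: str
    answer: str


class SiteStore:
    """Keeps all FAQ entries in memory; fetches person records on demand."""

    def __init__(self, load_all_entries, load_person):
        # Eager: read every entry once at startup and index it by id.
        self._entries = {e.entry_id: e for e in load_all_entries()}
        # Lazy: there are too many person records to preload, so only the
        # ones actually requested are fetched (and then remembered).
        self._load_person = load_person
        self._people = {}

    def entry(self, entry_id):
        return self._entries.get(entry_id)

    def person(self, person_id):
        if person_id not in self._people:
            self._people[person_id] = self._load_person(person_id)
        return self._people[person_id]


loaded_people = []  # records which person ids reached the (fake) database


def fake_load_person(person_id):
    loaded_people.append(person_id)
    return {"id": person_id, "name": f"user{person_id}"}


store = SiteStore(
    load_all_entries=lambda: [FaqEntry(1, "What is a JAR file?", "A Java archive.")],
    load_person=fake_load_person,
)
store.person(7)
store.person(7)  # second call is served from memory
```

Note that this sketch never evicts person records, so a long-running process would need some bound on that cache.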

Parr: Our goal is to build a great set of FAQs. We have managers who used to answer questions. People would submit questions, and if they were good enough, we’d add them to the FAQ. So we had a human who was the direct interface to the outside world.

So there was a choke point there, right? It had to be the expert who answered your question, and they had to have the time to get to it. So we came up with a highly structured forum that was really just a series of questions, and then everybody could answer them. That way you don’t have to wait to get your question answered by the expert at the choke point. Anybody can do it.

Then the expert, who’s now a manager, could come in and say, “Hey, that was a great answer!” and promote it, pull it out of the forums, edit it, and stick it into the FAQ.

So you’ve got the jGuru community answering each other’s questions. But how do you keep the forums from discussing the same questions that are already answered in the FAQ?

Parr: What we do is, when you submit a question, we search for an appropriate answer and provide you with a list of potentials, in an effort to say, “Here’s the answer.” That way the system automatically tries to reduce the amount of noise in the forums.

Further, it tries to guess when you’re in the wrong forum. You don’t want someone in the database forum asking a question about building a GUI (Graphical User Interface). So, if someone does that, the system says, “There’s an above-average chance that you’re in the wrong forum. I suggest one of these topics.” Then the user can just click on it and switch to the right forum.
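One simple way to picture the wrong-forum guess is keyword overlap: score the question against each forum's vocabulary and suggest any forum that scores higher than the one the user is in. The Python sketch below is an illustration only. The forum names, the `FORUM_KEYWORDS` table, and the scoring rule are assumptions, not jGuru's actual method.

```python
import re

# Hypothetical per-forum vocabularies; a real system would presumably
# derive these from the forums' own content.
FORUM_KEYWORDS = {
    "jdbc": {"jdbc", "sql", "database", "query", "resultset", "connection"},
    "swing": {"swing", "gui", "jbutton", "jframe", "layout", "window"},
    "servlets": {"servlet", "http", "session", "request", "response"},
}


def tokenize(text):
    return set(re.findall(r"[a-z]+", text.lower()))


def suggest_forums(question, current_forum, forum_keywords=FORUM_KEYWORDS):
    """Return forums that match the question better than the current one."""
    words = tokenize(question)
    scores = {name: len(words & vocab) for name, vocab in forum_keywords.items()}
    current = scores.get(current_forum, 0)
    better = [name for name, score in scores.items() if score > current]
    # Best match first; ties are broken alphabetically.
    return sorted(better, key=lambda name: (-scores[name], name))
```

A GUI question posted in the database forum, such as `suggest_forums("How do I add a JButton to a JFrame window in my GUI?", "jdbc")`, yields `["swing"]`, while a question that already fits its forum yields an empty list.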

The system also tries to detect when you haven’t said anything about Java. So if somebody just says, “Hey, what’s this site about?” or posts a thigh cream commercial, the system says, “You know, there’s not a single word in there that I recognize as part of the Java lexicon. So click here to re-edit and add something that has to do with Java.”
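The off-topic check Parr describes amounts to asking whether a post contains any word from the Java lexicon. A minimal Python sketch follows. The small `JAVA_LEXICON` set is a made-up stand-in for the real lexicon, and the function names are hypothetical.

```python
import re

# Hypothetical stand-in for the site's Java lexicon.
JAVA_LEXICON = frozenset({
    "java", "jvm", "compile", "servlet", "applet",
    "thread", "jdbc", "swing", "classpath", "bytecode",
})


def recognized_terms(post, lexicon=JAVA_LEXICON):
    """Return the lexicon words found in the post, sorted alphabetically."""
    words = re.findall(r"[a-z]+", post.lower())
    return sorted(set(words) & lexicon)


def needs_reedit(post, lexicon=JAVA_LEXICON):
    """True when the post contains no word from the lexicon at all."""
    return not recognized_terms(post, lexicon)
```

With this lexicon, "Hey, what's this site about?" is sent back for re-editing, while "Why won't my servlet compile?" passes because "servlet" and "compile" are recognized. Exact word matching is crude: it misses plurals and misspellings that a fuzzier matcher would catch.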

How do you know what is and isn’t Java talk?

Parr: We have a fuzzy logic search engine that tries to strip out everything but the important keywords in your question. Then I do a fuzzy comparison against all of the other FAQ entries in our system. I do this not by which FAQs have these keywords, but by how important these keywords are in that particular FAQ. In other words, how often they’re used, and the frequency of the use of that word, and how important this document is compared with the rest, so I can bubble the most important one to the top.
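Parr does not spell out the comparison he uses, but ranking documents by how heavily they use a keyword, discounted by how common that keyword is across all documents, is the idea behind TF-IDF weighting. The Python sketch below substitutes plain TF-IDF for his fuzzy comparison. It is a stand-in technique, and `rank_faqs` and the sample FAQ texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def rank_faqs(question_keywords, faqs):
    """Rank FAQ ids by the summed TF-IDF weight of the question's keywords."""
    term_counts = {faq_id: Counter(tokenize(text)) for faq_id, text in faqs.items()}
    n_docs = len(faqs)
    scores = {}
    for faq_id, counts in term_counts.items():
        total = sum(counts.values()) or 1
        score = 0.0
        for kw in set(question_keywords):
            if counts[kw] == 0:
                continue
            # Term frequency: how much of this FAQ is about the keyword.
            tf = counts[kw] / total
            # Inverse document frequency: rarer keywords count for more.
            docs_with_kw = sum(1 for c in term_counts.values() if kw in c)
            idf = math.log((1 + n_docs) / (1 + docs_with_kw)) + 1
            score += tf * idf
        if score > 0:
            scores[faq_id] = score
    # Highest score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda faq_id: (-scores[faq_id], faq_id))


sample_faqs = {
    "threads": "A thread runs code concurrently. Start a thread by calling start on the thread object.",
    "jdbc": "Open a connection and run a query through a statement.",
    "io": "Use a reader to read text. A thread is not needed.",
}
```

Here `rank_faqs(["thread"], sample_faqs)` puts "threads" ahead of "io", because the word makes up a larger share of that entry, and leaves out "jdbc", which never mentions it.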

The way we started this system was to spider the New York Times website. I got, I dunno, 250,000 words. And I said, okay, that’s English. And then I spidered our own website, and I said, that’s Java. And then, to distinguish the Java lexicon from the English, I did some complicated fuzzy logic stuff that revealed a set of keywords that are specific to Java.

Because those words don’t appear in the New York Times?

Parr: Well, they may appear in the New York Times, but it’s a difference in their usage. For example, the word “compile” is probably at the New York Times website. However, when you’re talking about Java, it’s used way more. So, if a word is overused in the Java lexicon, and underused, relatively speaking, in English, I say, “Aha! That’s Java.”

So once I get a definition of what Java looks like, I can take any question you ask me, strip out all the English words, and then do pattern matching to figure out what you’re talking about.
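The core of that idea, a word counting as "Java" when it is used far more often in Java text than in general English, can be sketched as a ratio of relative frequencies. The Python below is a deliberate simplification of what Parr calls "complicated fuzzy logic stuff". The add-one smoothing, the `min_ratio` threshold, and the toy corpora are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def derive_lexicon(domain_text, general_text, min_ratio=2.0):
    """Keep words whose relative frequency in the domain corpus is at
    least min_ratio times their relative frequency in general text."""
    domain = Counter(tokenize(domain_text))
    general = Counter(tokenize(general_text))
    vocab_size = len(set(domain) | set(general))
    domain_total = sum(domain.values()) + vocab_size
    general_total = sum(general.values()) + vocab_size
    lexicon = set()
    for word, count in domain.items():
        # Add-one smoothing keeps the ratio finite for words that never
        # appear in the general corpus.
        domain_rate = (count + 1) / domain_total
        general_rate = (general[word] + 1) / general_total
        if domain_rate / general_rate >= min_ratio:
            lexicon.add(word)
    return lexicon


def strip_general_words(question, lexicon):
    """Drop everything from the question except lexicon words."""
    return [w for w in tokenize(question) if w in lexicon]


# Toy corpora standing in for the spidered Java site and news site.
java_text = "compile the class then compile the applet and compile again"
english_text = "the mayor said the city will compile the report and then vote again"
lexicon = derive_lexicon(java_text, english_text)
```

Even though "compile" appears in both toy corpora, it is overused in the Java text relative to the English text, so it lands in the lexicon along with "class" and "applet". Stripping "How do I compile an applet class?" then leaves only those three words for pattern matching.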

So in the “buy vs. build” debate, you’re on the build side?

Burns: I am actually generally for buying when you can, but Ter prefers to build (and he is the one actually doing all the work, so his vote trumps mine). One thing to watch out for when “buying” is that integration can be a nightmare—if you can’t buy something that does almost everything you need, you will probably be better off building.

The biggest advantage to our approach—a fully custom site—is that it is all very integrated and easy to maintain. We can build a new jGuru box and get a site up with just a few commands.

Parr: If you don’t have the expertise, buy. It’s as simple as that.

One of the things I would recommend is: Keep your setup as simple as possible. For example, Oracle (databases) have to have several partitions on the disk, you gotta have a full-time administrator, it’s just a mess. We had a very simple database problem, so we bought a very simple database (from Solid Technology) that runs on a small amount of memory. It installs by unzipping the thing. You back it up by copying a file! That solves a huge amount of trouble.

Also, use stuff where you have the source. Like we use Apache (web server). Use free tools if you know that they are good and you get the source. Just make it as simple as possible.