In-Depth

Open-source and commercial tools mashup

Special Report

You don’t often encounter folks who’ve gone gaga over commercial, off-the-shelf reporting tools. There’s a reason for this. Products such as Crystal Reports, Actuate e.Spreadsheet and Cognos ReportNet are slick, powerful, and often extensible solutions. In most cases, their price is commensurate with their attributes, so much so that users expect these products to perform as advertised.

Contrast this with the open-source reporting tools space, where users of Eclipse’s Business Intelligence and Reporting Tools (BIRT), JasperReports and JFreeReport seem almost ecstatic about their respective solutions, although they lack the capabilities of commercial reporting tools. Free trumps paid, most of the time. See related story, “The open-source business intelligence ecosystem expands.”

Soft spot for open source
Consider the case of Steve Birtles, a quality assurance manager with UK electronics giant Micromark, who’s a happy JasperReports user. Like a lot of other codejockeys, Birtles has a soft spot for open-source software. But he’s by no means an open-source bigot; he recently transitioned from Eclipse to Oracle’s JDeveloper IDE.

When Micromark needed an inexpensive way to report against a product information repository that’s populated with more than 28,000 retail items (spanning 2,500 product lines), Birtles went a-Googling—not in search of costly COTS tools, but, instead, for serviceable open-source reporting solutions.

He found plenty, checked out BIRT and others, and selected JasperReports, which is supported by one of the most active user communities in the SourceForge.net universe. JasperReports isn’t a reporting tool, exactly. It’s a command line-driven reporting library that uses XML templates to generate ready-to-print documents, and it’s vastly different from Crystal Reports and other visually stunning COTS fare.

Given the choice, most end users would probably opt for the slickness of a Crystal Reports or a ReportNet over the command line efficiency of JasperReports, which is frequently paired with its GUI-based design complement, iReport. But Birtles—like most codejockeys—isn’t so easily seduced by slickness. JasperReports and iReport do all that he could ask of them, he says. They offer capabilities such as dynamic report rendering that rival COTS tools, integrate with his preferred IDE—and, best of all, they’re free. See related story, “Tools of change.”

“[We had] a number of options,” Birtles says. “[We could] generate 2,500 document files for the evaluation process and try to keep them updated to the latest requirements. [Or you could] keep the requirements on a database, grouped by product type, then dynamically generate the inspection reports the morning or hour you’re going to do the inspection, along with product photographs. Which option would you choose?”

In this respect, Birtles explains, JasperReports and iReport more than get the job done. “[We can] take the results of an inspection, enter back into the database, and generate a release and shipping document, dynamically, and at the same time tie it into a fax system. It faxes the generated report to the supplier and generates a cross report of that supplier’s performance and quality levels, with up to date figures—from a couple of minutes ago.”

Enterprise apps go BI
BIRT has matured rapidly. It plugs into Eclipse, has an embeddable reporting engine, offers report lifecycle management capabilities and ships with what many users say is a slick, graphical authoring tool.

That’s good, because David Peterson, a programmer with a prominent New York online publisher, says his employer is asking BIRT to do a lot more than plain-vanilla reporting.

The open-source business intelligence ecosystem expands

There’s a thriving open-source reporting tools segment, of which Eclipse’s Business Intelligence and Reporting Tools (BIRT) and SourceForge’s JasperReports are probably the two most visible examples, that’s complemented by a broader BI services stack. It includes an online analytical processing project called Mondrian, which is part of a full-fledged BI stack dubbed Pentaho.

Few users of BIRT, JasperReports or other open-source tools seem to be aware of this open-source BI ecosystem. That could change over time, predicts BI guru Wayne Eckerson, director of TDWI.

If anything, the open-source BI tools segment will grow in importance, largely in lockstep with still another emerging trend–that of integrating more and more BI functionality into apps.

“I think ultimately, BI is embedded into applications,” Eckerson comments. “That’s where it’s getting the most use by the most people.”

In this sense, the development and maturation of the open-source BI stack could reprise the evolution of a related animal–Microsoft’s SQL Server BI stack. Microsoft not only delivered a new version of SQL Server Reporting Services in SQL Server 2005, it also rolled out a retooled BI stack.

Open source has a pattern
Among J2EE codejockeys, Eclipse is nearly as prevalent as Visual Studio is among .NET developers. And most open-source reporting projects, even those that aren’t explicitly J2EE-oriented, accommodate Eclipse, either as a sanctioned tool, as with BIRT, or via plug-in support. Ditto for burgeoning open-source BI projects like Pentaho and Mondrian, the OLAP component of Pentaho.

Of course, Eclipse, unlike Visual Studio, isn’t the last word in its respective app dev universe. In spite of the promise of nominal compatibility among J2EE IDEs, the reverse is often the case. Nor is the Eclipse experience uniform from one app dev platform, such as Linux or Windows, to another, such as Mac OS X. Even so, BIRT, JasperReports and other solutions support a range of different IDEs via plug-ins. So codejockeys can, theoretically, work with the IDE and platform that’s most suited to their needs.

What’s missing is an analog
So the app architecture and IDE pieces are more or less there. What’s missing, in substance if not exactly in form, is the open-source analog of a SQL Server, which in the .NET universe is the foundation of Microsoft’s BI value proposition. Open-source relational databases abound, of course, but not even the most visible of open-source RDBMSes, MySQL and Postgres, can approximate SQL Server’s market reach or celebrity.

And there’s a lot to recommend Microsoft’s all-in-one SQL Server BI value proposition. Some J2EE codejockeys–and otherwise happy users of open-source reporting tools–acknowledge as much.

Likes and dislikes
“What we liked about SSRS was the powerful server backend with its own SQL server cache of reports, automated report generation and email, and an auto generated homepage to navigate our collection of reports,” says Luke Philips, a software engineer with a prominent U.S. telco.

What Philips didn’t like about the SQL Server and SSRS combination was its resource-intensiveness. His employer eventually selected the open-source BIRT project, which, he says, offers superior reporting capabilities. “What I miss in BIRT that SSRS has is the server backend bells and whistles, good automated caching of reports and the homepage for navigation of my report portfolio.”

Nevertheless, Philips isn’t sure a combined Java and open-source BI stack could flourish at his company. “We don’t have a lot of Java expertise in the BI portion of our IT shop,” he says. “As a consequence, we will not make use of Mondrian or Pentaho anytime soon.” If anything, his employer will instead pursue an open-source and SQL Server 2005 approach.

“It is possible we could use open-source .NET tools in this space. We are using BIRT as a reporting tool primarily from the OLTP area at the moment.”

Stephen Swoyer

“We’re using BIRT to create performance reports for our e-mail newsletter distribution,” Peterson explains. “Basically, people come to our sites and sign up to receive newsletters. We then schedule periodic e-mails to go out to that group and we’d like to measure and report on the performance of those newsletters. We like to track positive metrics of interest like click-throughs to our Web sites to gauge the interest level in what we’re sending out. We also track and report negative metrics like opt-outs.”
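The arithmetic behind the metrics Peterson describes is simple enough to sketch. The Java below is an illustration only, not his BIRT report: the class name, field names and sample counts are all assumptions made for the example. It turns raw counts into a click-through rate and an opt-out rate, the sort of figures a report template could then display.

```java
import java.util.Locale;

// Hypothetical per-newsletter counts; names and numbers are illustrative only.
final class NewsletterStats {
    final String name;
    final int delivered;
    final int clickThroughs;
    final int optOuts;

    NewsletterStats(String name, int delivered, int clickThroughs, int optOuts) {
        this.name = name;
        this.delivered = delivered;
        this.clickThroughs = clickThroughs;
        this.optOuts = optOuts;
    }

    // Positive metric: share of delivered e-mails that produced a click-through.
    double clickThroughRate() {
        return delivered == 0 ? 0.0 : (double) clickThroughs / delivered;
    }

    // Negative metric: share of delivered e-mails that produced an opt-out.
    double optOutRate() {
        return delivered == 0 ? 0.0 : (double) optOuts / delivered;
    }

    // One line of a plain-text performance summary.
    String summaryLine() {
        return String.format(Locale.ROOT, "%s: %.1f%% click-through, %.1f%% opt-out",
                name, clickThroughRate() * 100, optOutRate() * 100);
    }

    public static void main(String[] args) {
        NewsletterStats weekly = new NewsletterStats("Weekly Digest", 2000, 150, 10);
        System.out.println(weekly.summaryLine());
    }
}
```

In a real deployment the counts would presumably come from a database query feeding the report, rather than from constants in code.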

No time to lose or choose
Performance management is the newest frontier in the COTS reporting space, but Peterson says he didn’t consider using a commercial tool in place of BIRT.

“I chose BIRT because the due date for this project is very short and I cannot wait for the full product evaluation to be completed, vendor selected, and project implemented,” he explains. “I chose BIRT based upon recommendations from friends, price, open-source, examination of the tutorials and online documentation for capabilities, and the Eclipse plug-in availability.”

In this respect, he concedes, BIRT probably isn’t quite as sexy as a COTS offering—but it doesn’t have to be. “I’ve evaluated other commercial offerings… from BusinessObjects, Informatica, AbInitio, and Microsoft [Analysis Services]. They all offer a higher breadth of capabilities, features and ease of use. However…the timing of this project was such that I could not wait for the full fledged [COTS] product to be chosen.”

More to the point, Peterson says, BIRT does what he asks of it—and does it well. His company is currently in the midst of a separate process to select its next-gen data warehouse, ETL tool and front-end reporting solution. “We’re investigating Crystal Reports, Cognos and BIRT, but we will likely expand the search into other vendors as well,” he explains.

Tools of Change

As reporting requirements grow more sophisticated, many enterprises try several tools before finding a satisfactory solution.

That's more or less what happened to Mark Lorenz, an application architect with a major healthcare company based in the Southeast. "In the past, we have used Apache's Formatting Objects Processor, which is a low-level PDF framework," he says. "This was very difficult, brittle and time-consuming to use. We also looked at iText, which was meant to make PDF generation easier. However, we had technical problems with that as well."

The company also considered using Crystal Reports, with technical staff creating reports and then distributing them along with the company's product. "This was a problem, with buying the product, learning how to use the product, administering the product installation and configuration and upgrades, and then having a relatively large amount of time spent on report creation and editing separate from our development processes," Lorenz says. What's more, Crystal Reports wasn't as Eclipse-friendly as most of the open-source solutions he evaluated.

Lorenz & Co. opted for BIRT. Since then, he reports, they haven't looked back. "It is fully integrated with Eclipse, which is our development environment, and it uses standard technologies" like JavaScript, Lorenz explains. "It is extremely powerful yet easy to use, and it has a strong and active support community."

J2EE programmer Steve Atkins also recently changed from JasperReports to BIRT. "I'm using a private instance of Tomcat to run the BIRT reporting engine," he explains, adding that his main app code basically calls the BIRT engine as a Web service. "The main application code takes the XML templates from the user, stores them in the database, alters them as necessary, then writes them out as temporary files for Tomcat and BIRT to access. It then throws HTTP requests at Tomcat to generate PDF and HTML reports."
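The flow Atkins describes (template out of the database, onto disk, then an HTTP request per output format) can be sketched roughly as follows. This is a guess at the shape of such code, not his implementation. The class and method names, the host and the viewer path are hypothetical, and the `__report` and `__format` query parameters are modeled on those of BIRT's sample Web viewer, which may not match his setup. The sketch stops short of actually sending the request.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ReportRequestSketch {

    // Writes a template (as loaded from the application's database) to a
    // temporary file that the reporting engine can read.
    static Path writeTemplate(String templateXml) {
        try {
            Path file = Files.createTempFile("report-", ".rptdesign");
            Files.write(file, templateXml.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds the URL the application would request to render the template.
    // Parameter names are an assumption, modeled on BIRT's sample viewer.
    static URI buildRequest(String viewerUrl, Path template, String format) {
        try {
            String report = URLEncoder.encode(template.toAbsolutePath().toString(), "UTF-8");
            return URI.create(viewerUrl + "?__report=" + report + "&__format=" + format);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    public static void main(String[] args) {
        Path template = writeTemplate("<report/>"); // placeholder, not a valid design
        String viewer = "http://localhost:8080/birt/run"; // hypothetical private Tomcat
        System.out.println(buildRequest(viewer, template, "pdf"));
        System.out.println(buildRequest(viewer, template, "html"));
    }
}
```

Keeping the engine behind a URL like this means the main application needs nothing from the reporting stack beyond an HTTP client, which fits Atkins' description of calling it "as a Web service."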

Atkins says he looked at a number of different reporting tools, Java and otherwise. Cross platform support (or available source code), availability of a GUI-based design tool, and support for charts were his key selection criteria. "I looked at several Java reporting engines, but the only one that came close was JasperReports, and while the reporting engine was adequate it really required more programming to use than an end user could be expected to handle. The GUI designers available were downright clunky compared to BIRT."

Stephen Swoyer

Goods delivered on Java
Open-source reporting tools—and open-source BI solutions, in general—are increasingly popular among Java and J2EE developers.

Take Christiaan des Bouvrie, a Java programmer with Cost Engineering Consultancy, a Dutch provider of cost management software and services. His company is developing a new product that it expects to release by the end of the year. It’s a tool that calculates total lifecycle costs for engineering construction in a variety of different vertical markets including petrochemical, energy and heavy industry.

Cost Engineering designed the product as a rich client, two-tier app based on JDK 1.5—des Bouvrie is actually testing on the JDK 1.6 beta—and JDO2.

“We currently have a product on the market which is developed in Delphi using local databases,” he explains. “Based on customer input, we needed to move the application to a client-server technology. The company decided to make the most of this change. [So] not only the [technology] had to be renewed, [but] also the functional part had to meet the newest needs of the market.”

As des Bouvrie notes, his company’s app relies on the strength of its reporting capabilities. You would think that’s enough reason to pay for a powerful and extensible COTS solution. Instead, Cost Engineering is using JasperReports as the linchpin in its lifecycle management app.

The open-source decision wasn’t easy, however. When the company identified its app requirements, it found that none of the available open-source tools were up to snuff. So Cost Engineering selected J2PrinterWorks, which des Bouvrie describes as an “inexpensive” commercial solution. “We’ve actually just started using JasperReports for this. When we started this project more than a year ago, there weren’t many good reporting tools for low cost that were able to deal with Java objects,” he explains.

Neck-breaking speed
Like BIRT, which went from zero to version 2.0 in less than 18 months, JasperReports is evolving at a breakneck pace. So when des Bouvrie and Cost Engineering identified a potential showstopper in J2PrinterWorks—it has difficulty printing very large tables—they circled back to an open-source reporting space that, des Bouvrie says, had been all but reborn.

“JasperReports looks very promising so far,” he says. “We can easily modify the design at runtime so we can provide our users with easy-to-use wizards. It can handle very large reports.” des Bouvrie has used JasperReports to render reports with more than 16,000 pages. “And it has … some open-source UI components, which we can integrate with our application,” he continues. “Also, the development team is very responsive on the newsgroup forum.”

From a reporting perspective, des Bouvrie says, JasperReports more than delivers the goods. “Our users want an easy-to-use tool; they are not programmers. As such, they don’t want to be bothered with difficult report generators, SQL queries or database schemas,” he explains.

“Reports should resemble the user interface as closely as possible. If the user has re-ordered the columns in a certain view, applied certain filters, it shouldn’t be difficult to get something similar for printing.”
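One way to read that requirement is that the report's rows should be derived from the same view state the user sees on screen. The Java below is a minimal sketch under that assumption. The class, method and column names are invented for illustration and have nothing to do with Cost Engineering's code or the JasperReports API. It applies the view's filter and column order to in-memory rows, producing a table that could then be handed to a report engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class ViewToReportSketch {

    // Keeps only the rows the user's filter accepts and emits each row's
    // cells in the column order the user chose in the on-screen view.
    static List<List<Object>> project(List<Map<String, Object>> rows,
                                      List<String> columnOrder,
                                      Predicate<Map<String, Object>> filter) {
        List<List<Object>> printable = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            if (!filter.test(row)) {
                continue; // hidden on screen, so hidden in print
            }
            List<Object> cells = new ArrayList<>();
            for (String column : columnOrder) {
                cells.add(row.get(column));
            }
            printable.add(cells);
        }
        return printable;
    }

    public static void main(String[] args) {
        Map<String, Object> pump = new HashMap<>();
        pump.put("item", "Pump");
        pump.put("cost", 1200);
        pump.put("phase", "Build");

        Map<String, Object> survey = new HashMap<>();
        survey.put("item", "Survey");
        survey.put("cost", 300);
        survey.put("phase", "Design");

        // The user moved "cost" ahead of "item" and filtered to the Build phase.
        List<List<Object>> table = project(
                Arrays.asList(pump, survey),
                Arrays.asList("cost", "item"),
                row -> "Build".equals(row.get("phase")));
        System.out.println(table);
    }
}
```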

Used in tandem with its iReport UI complement, JasperReports addresses all of these requirements—along with another. “Cost is definitely an issue since we have a wide variety of customers, from small contractors to large multinationals. Especially for the small customers, we need to deliver a price-competitive product,” des Bouvrie explains.

Universal acid eats through everything
Cost is also an issue as reporting evolves and begins to permeate apps throughout the enterprise. Philosopher and cognitive scientist Daniel Dennett famously describes Darwin’s account of evolution as a kind of “universal acid” that eats through successive “containers” of thought. The idea, says Dennett, is that evolution permeates and explicitly outstrips everything.

Reporting may not be acidic, but it is—or soon will be—permeating. In this sense, it’s like an acid that leaches into, and ultimately outstrips, each of its containers. At its essence, reporting is the most basic of activities, reflecting—in its business context, anyway—a feral hunger for qualitative or quantitative insight into performance and operations.

It shouldn’t surprise anyone, then, that once codejockeys find an open-source reporting tool that’s scalable, and mostly user-friendly, they discover new ways to use it. Unconstrained by licensing issues, support costs, or other familiar COTS showstoppers—and encouraged by IDE plug-in support, thriving user communities with expert help and code samples a go-go, and other programmer-friendly features—they weave reporting capabilities sinuously into the fabric of their app dev efforts.

That’s what happened to Luke Philips, a software engineer with a prominent U.S. telco. “We needed a reporting tool to help us [produce] professional-looking reports with the difficulty coming from regularly repeating this task with new or updated data,” Philips says. His company needed a solution—preferably Eclipse-friendly—that could analyze and report against an enormous repository of geospatial data. The reporting tool had to appeal to two very different user constituencies—software engineers and technology-averse business users. It also needed to provide a quantitative and qualitative feedback loop for developers and business users alike.

“Our work of address validation from customers is a process of massive geocoding of customer address to latitude and longitude coordinates,” Philips explains. “Through the advanced development of our group, our systems, and our algorithms, we know how well we found your location, from which spatial method we found your location, whether your location is valid from known quantities … and if we didn’t find you, why we didn’t, plus how close can we get from what we know.”

They also needed a tool to track company assets. “Being a large telco with tens of thousands of miles of underground cable, our ability to locate accurately and precisely where a certain segment of cable is anywhere in the country—and world—is crucial to our ability to upgrade, maintain, and repair quickly and affordably,” he says.

Betting on BIRT
Philips’ employer chose BIRT because it’s free, it has an active user community, and most importantly, it supports Eclipse. The strongest commercial contender was Microsoft’s SQL Server Reporting Services, which is starting to vie with best-of-breed COTS tools for market share. Philips’ employer already uses SQL Server so he could’ve deployed SSRS at no charge. But SSRS is not as Eclipse-friendly as the Eclipse-native BIRT.

“Unfortunately, to do report generation right, you need Visual Studio 2005 and some SQL tools, which can start eating upwards of 2 to 4GB [of system memory] on your box,” Philips says. He also doesn’t “find the reports to be as deep and powerful as BIRT reports.”

Philips and his team use BIRT to provide a kind of feedback loop into their geocoding and spatial identification efforts. Because the BIRT engine is Java-based, it plugs right into the company’s WebLogic app servers, which is another strike against SSRS. This functionality helps Philips’ team easily publish report abstracts for business users via the company’s intranet.

“The business needed something quantitative to base decisions off of and the developers wanted a quantitative way to measure application performance and see which enhancements and bug fixes would return the best results,” he says.

BIRT is more than up to the task, according to Philips. It can report against massive volumes of information, support most popular rendering or publishing formats—HTML, PDF, CSV—and thanks to its Eclipse integration, is easily incorporated into his team’s app dev projects.

That’s just the beginning, however. Philips—like many of his colleagues—is intrigued by the universal acidity of BIRT and other open-source reporting solutions. “If you are a report engineer, you can already see that there are tons of report possibilities in the work we deal with,” Philips explains.

The point, Philips says, is that if it’s so easy to incorporate robust reporting capabilities into the J2EE apps that he and his team build, why not do it? “As advertised, BIRT plugs right into our world of Eclipse development and gets the information from our databases, analyzes that information, then generates summaries, charts, and analysis for our reports,” he concludes.