9th VMworld Draws 20k Attendees, Crowd of Vendors

It's hard to believe that the 9th annual VMworld conference wrapped up in San Francisco last week. It seems like only yesterday that I sat down with Diane Greene, VMware's co-founder, at a LinuxWorld conference to talk about a then largely misunderstood technology that she, her husband, Stanford professor Mendel Rosenblum, two graduate students (Edouard Bugnion and Scott Devine) and a friend from Berkeley (Edward Wang) had unearthed from the mainframe midden and re-imagined for x86.

This year's show drew an estimated 20,000 attendees and took up all three wings of the Moscone Center. But more remarkable to me was the number of third-party vendors working booths and making announcements at this sprawling event. Here are a few of the many highlights I could have mentioned (and would have, if they gave me more space in this blog). Some were big news; some you might not have noticed, but should have:

- Cloud and virtual infrastructure control company HyTrust, for example, unveiled version 3.0 of the HyTrust Appliance -- a good example of VMware working with third-party vendors to move virtualization into the enterprise. HyTrust provides policy management and access control for virtual infrastructures. As the company's president and co-founder, Eric Chiu, explained it to me, the HyTrust Appliance (which is virtual) "enforces policies on the control plane of VMware-based virtual infrastructure and provides the visibility required for security and compliance."

"If you think about VMware, it's a new operating system for the data center," Chiu said. "We provide security and compliance controls around VMware's vSphere, in particular around the management and administration. So you get this fine-grained authorization; any time anyone is managing the environment, technically they're going through us."

The Mountain View, Calif.-based company also has one of my favorite slogans from the show: "Virtualization Under Control."

- Zend Technologies announced a partnership with show organizer VMware that integrates the vFabric Application Director with Zend Server. By integrating Zend's PHP-based Web-app server with VMware's cloud-enabled application provisioning solution, the two companies aim to make it easier for enterprises to deploy and manage their virtualized PHP apps on public, private and hybrid clouds. The Cupertino, Calif.-based company is the creator and commercial maintainer of the PHP dynamic scripting language and of various frameworks, solutions, and services supporting it.

The companies noted in a statement that a "key" to this integration was a set of portable deployment blueprints that Zend created by working closely with VMware. "The blueprints feature a reference implementation that codifies the best-in-class standards of Zend Server for private, public and hybrid clouds," the companies said. "By creating the blueprints, Zend and VMware have made it easier for users to create a self-service interface to provision PHP applications."

- ServiceNow, a provider of cloud-based services that automate enterprise IT operations, announced the addition of end-to-end lifecycle automation for managing VMware virtual machines (VMs). The new capabilities in its cloud-based software are designed to manage the VMs "from creation to retirement," the company said.

"Until now, VMware has provided utilities for provisioning a VM and retiring one, but everything in-between and beforehand has been missing," Caitlin Regan, a spokesperson for the San Diego-based company, said in an e-mail.

Company founder and chief product officer Fred Luddy led a session at the show entitled "VMs Rock. But managing them on behalf of other people … sucks."

- Seattle's ExtraHop Networks, a provider of network-based application performance monitoring (APM) technology, launched version 3.7 of its Application Delivery Assurance system at this year's show. The new version introduces features that have not been available in the APM market before, including:

Advanced Web Payload Analysis, which makes it possible for companies to manage all mission-critical APIs; Precision Syslogging, which lets organizations log critical events and metrics that previously weren't available to log-aggregation solutions for analysis; Flex Grids, which provide a way to create versatile reporting summaries of user-specified metrics across devices, device groups and apps; and Dynamic GeoMaps, which show worldwide activity and metrics based on a translation of IP addresses to geographical locations.

The ExtraHop folks pointed me to a blog post on their site about Gartner's latest Magic Quadrant for APM report. Worth a look if you're following this market.

- Savvis, a provider of cloud infrastructure and hosted IT solutions for the enterprise, unveiled a new cloud ecosystem program, with which the company aims to "deliver greater flexibility in the cloud computing environment through collaboration/partnerships with innovative cloud technology providers."

I'd swear I heard "cloud" and "ecosystem" more often at this show than "virtualization." But the Savvis news is worth noting, if for no other reason than the lineup. Among the participants in the newly launched Savvis Enterprise Cloud Ecosystem Program, the company lists BMC Software, ServiceMesh, Rackware, Compuware, DataGardens, Racemi, RiverMeadow Software and ScaleXtreme.

Under the auspices of this program these companies are making their offerings available to Savvis clients. In return, they get complimentary access to the Savvis Symphony Virtual Private Data Center (VPDC) for API integration and testing. The company describes the VPDC as an "enterprise-class virtual private data center cloud solution." Participants also get direct access to Savvis' product management and engineering teams. They even get to use Savvis' sales and marketing resources.

Savvis is owned by Monroe, La.-based telecommunications company CenturyLink. 

- If there was a buzzphrase at this year's show, it was VMware's "software-defined data center." Network solutions provider Brocade offered a variation on that theme by unveiling its ADX Series of application delivery switches, which are part of its "software-defined networking (SDN) vision, strategy and innovation roadmap." The switches are designed to deliver highly scalable VXLAN (Virtual eXtensible Local Area Network) gateway services, developed in partnership with VMware, for virtualized cloud networks.

- VMware announced at the show an expansion of its VMware Ready for Networking and Security Program, which the company describes on its Web site as "a partner-focused initiative to integrate third party networking and security products into the VMware vCloud suite." One partner taking advantage of that expansion is F5 Networks, a provider of application delivery networking solutions. The company announced plans to integrate its BIG-IP products using the new program. BIG-IP is a suite of app delivery services designed to work together on the same hardware platform or software virtual instance.

- MokaFive, provider of a desktop-as-a-service platform, demonstrated a new product capability at the show. Dubbed Trickleback, it's a cloud sync capability designed to leverage commercial cloud-storage providers (Amazon S3, for example) to provide customers with "secure, encrypted synchronization of data across all of their computers and mobile devices." It works with both the MokaFive Suite, which includes tools that allow users to create, run, distribute and manage VMs called "LivePCs," and MokaFive for iOS, which provides secure access to corporate files from an iPad.

Posted by John K. Waters on September 4, 2012


Java after 2.5 Years with Big O

It would be hard to exaggerate the collective apprehension that seized the Java community when Oracle announced its plans to acquire Sun Microsystems back in 2009, and with it, the stewardship of Java. That acquisition was completed in January 2010, and Java jocks everywhere held their breath.

And then nothing really bad happened.

In fact, if you ask IDC analyst Al Hilwa, Java has fared relatively well in the occasionally clumsy grasp of the database giant in Redwood Shores. In a published report, "Java: Two and a Half Years After the Acquisition," Hilwa argues that the acquisition has been mostly good for Java.

"The story here is one of fears not materializing and a company learning to do the right thing with open source," Hilwa told me in an e-mail.

In his report, Hilwa cites the long-awaited release of Java SE 7 last July as one piece of evidence that "Java made more significant advancements after the Sun acquisition than in the two and a half years prior to the acquisition." He also cites Oracle's influence on "key vendors," including IBM, Apple, and SAP, which have joined the OpenJDK open source implementation of Java and "anointed OpenJDK as the reference implementation for the technology."

Perhaps one of the greatest fears at the time of the acquisition was that Oracle would attempt to rule the Java community with a heavy corporate hand without sufficient community input. But in fact, Hilwa writes, "The Java ecosystem is healthy and remains on a growing trajectory, with more programming languages than ever now hosted on the Java Virtual Machine (JVM) and with many new developers further bolstering the broader Java skills ecosystem as mobile Android developers."

But Big O has managed to make an enemy or two on its way up the slippery Steward-of-Java learning curve, perhaps most notably when it decided to continue refusing to provide the Apache Software Foundation with a license for a Technology Compatibility Kit (TCK), which the ASF needed to complete work on its Harmony implementation of Java SE. Sun was the original denier. The ASF resigned its seat on the Executive Committee (EC) of the Java Community Process (JCP) in December 2010, and in November 2011 the Harmony project was sent to the "Attic," where Apache projects go when they lose their committers.

Oracle also stepped on toes in the open-source Hudson community when it announced plans to migrate the Java-based continuous integration (CI) server project to its java.net infrastructure and to trademark the Hudson name. That decision led Hudson community members to vote, in early 2011, to rename the project "Jenkins" and move the code from java.net to GitHub. Oracle later donated Hudson, lock, stock and source code, to the Eclipse Foundation.

But Hilwa argues that, despite a few wrong turns, Oracle has "navigated most decisions with a deliberate and decisive approach that should inspire the community's confidence in Java's long-term prospects."

Challenges remain for Oracle and the Java community, Hilwa observes, not the least of which is growing pressure on Java "from competing developer ecosystems, including the aggressively managed Microsoft platform ecosystem and the broader Web ecosystem with its diverse technologies and lightweight scripting languages and frameworks." He also believes that the success of Android, and its potential "evolution into client and server form factors," has put Java at risk of fragmentation into "multiple forks of loosely similar but competing technologies."

"To remain relevant and attractive to new developers," Hilwa concludes, "Java must evolve on a faster schedule and effectively support the ongoing industry transformation into mobile, cloud, and social applications."

Posted by John K. Waters on August 21, 2012


Apache Hadoop Community Promotes YARN -- But Don't Call it MapReduce 2

The Hadoop community recently promoted YARN -- the next-gen Hadoop data processing framework -- to the status of "sub-project" of the Apache Hadoop Top Level Project. The promotion puts YARN on the same level as Hadoop Common, the Hadoop Distributed File System, and MapReduce. It had been part of the MapReduce project; the promotion means it'll now get the spotlight and developer attention its proponents believe it deserves.

"We now have consent from the community to separate YARN from MapReduce," says Arun C. Murthy. "Which is as it should be. YARN is not another generation of MapReduce, and I really don't like the 'MapReduce 2.0' label. This is a different paradigm. This is much more general and much more interesting."

Murthy ought to know: he has been a full-time contributor to the Hadoop MapReduce project since it got off the ground at Yahoo in early 2006. Along the way, he and fellow Yahoo software engineer Owen O'Malley set a world data-sorting record (http://sortbenchmark.org/) using MapReduce: a terabyte in 60 seconds. Today, Murthy is a member of the Apache Hadoop Project Management Committee and a co-founder of Hortonworks, one of the chief providers of commercial support and services for Hadoop.

And he's been working on YARN full-time for about two and a half years.

"We knew that we were going to have to take Hadoop beyond MapReduce," Murthy says. "The programming model—the MapReduce algorithm—was limited. It can't support the very wide variety of use-cases we're now seeing for Hadoop. YARN turns Hadoop into a generic resource-management-and-distributed-application framework that lets you implement multiple customized apps. I expect to see MPI, graph-processing, simple services, all co-existing with MapReduce applications in a Hadoop YARN cluster. You can even run MapReduce now as an application for YARN."

Hadoop, of course, is the open-source framework for running applications on large data clusters built on commodity hardware (let's just say it: Big Data). I sometimes forget that Hadoop is actually a combination of two technologies: an implementation of Google's MapReduce model and HDFS. MapReduce is a programming model for processing large data sets; it supports parallel computations on so-called unreliable clusters. HDFS is the storage component, designed to scale to petabytes and run on top of the file systems of the underlying operating systems.
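
If you've never looked at the programming model Murthy is talking about, here's roughly what the canonical word-count job looks like against the standard org.apache.hadoop.mapreduce API. Treat it as a sketch: the class names and the two-argument command line are mine, not taken from any particular distribution.

```java
// A minimal word-count job against the org.apache.hadoop.mapreduce API.
// The class names and two-argument command line are illustrative only.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map step: emit (word, 1) for every word in the input split.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(value.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce step: sum the counts emitted for each word.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

What YARN changes is everything around this code: the map and reduce functions stay the same, but resource management and scheduling become a general-purpose layer that other kinds of applications can use too.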

What Murthy and others are hoping to do is redefine Hadoop from "HDFS-plus-MapReduce" to "HDFS-plus-YARN."

"The users can now look at Hadoop as a much more general-purpose system," Murthy says. "And from a developer perspective, we've opened up Hadoop itself to the point where now anyone can implement their own applications without having to worry about the nitty-gritty details of how you manage resources in a cluster and what you do for fault tolerance. [Promoting it] will also help us get more users and more developers to build an ecosystem around YARN. I guarantee you that next year at this time, we will be looking at four or five ways of doing real-time processing on Hadoop."

And I had to ask: What does YARN stand for?

"We were sitting around at lunch one day, trying to come up with the most inane names for our product," Murthy confessed to me. "The result was 'Yet Another Resource Negotiator—YARN.' I know: it's a really bad name."

But really promising technology.

Hortonworks is in the process of publishing a still-unfolding series of blog posts by Murthy and Hortonworks' product marketing director Jim Walker on the subject of YARN and its implications for Hadoop. And there's a new collaboration mailing list ([email protected]) for those who want to get involved in the project.

Posted by John K. Waters on August 15, 2012


Judge Orders Google, Oracle to Disclose Paid Bloggers

This week a California court ordered both Google and Oracle to disclose the identities of any bloggers, commentators or journalists who were paid to write about the companies' courtroom Java battle.

"The Court is concerned that the parties and/or counsel herein may have retained or paid print or internet authors, journalists, commentators or bloggers who have and/or may publish comments on the issues in this case," wrote Judge William Alsup in a Tuesday filing.

The judge added that even though this particular case is almost over, "the disclosure required by this order would be of use on appeal or on any remand to make clear whether any treatise, article, commentary or analysis on the issues posed by this case are possibly influenced by financial relationships to the parties or counsel."

Both sides in the case were ordered to file a statement clearly identifying "all authors, journalists, commentators or bloggers who have reported or commented on any issues in this case and who have received money (other than normal subscription fees) from the party or its counsel during the pendency of this action." The two companies are required to file those statements by Friday, August 17.

Oracle had alleged that Google infringed on Java-related patents and copyrights when it developed its Android operating system. The jury in the case ruled unanimously in May that Google had not infringed on those patents. But it delivered a partial verdict on May 7, holding that Google had infringed on Oracle's copyrights in its use of 37 Java APIs while deadlocking on whether that infringement could be considered "fair use."

Alsup is the judge who presided over the case in the U.S. District Court for the Northern District of California, and ruled in June that the Java APIs are not subject to copyright, though he kept his ruling narrow: "This order does not hold that Java API packages are free for all to use without license," Alsup wrote. "It does not hold that the structure, sequence and organization of all computer programs may be stolen. Rather, it holds on the specific facts of this case, the particular elements replicated by Google were free for all to use under the Copyright Act."

On April 18, blogger Florian Mueller, who writes the FOSS Patents blog and is a long-time follower of the Oracle v. Google case, disclosed to his readers a new consulting relationship with Oracle. Mueller wrote:

  • "I have been following Oracle v. Google since the filing of the lawsuit in August 2010 and have read pretty much every line of each court filing in this litigation. My long-standing views on this matter are well-documented. As an independent analyst and blogger, I will express only my own opinions, which cannot be attributed to any one of my diversity of clients. I often say things none of them would agree with. That said, as a believer in transparency I would like to inform you that Oracle has very recently become a consulting client of mine. We intend to work together for the long haul on mostly competition-related topics including, for one example, FRAND licensing terms."

Mueller noted in that posting that he "vocally opposed Oracle's acquisition of Sun Microsystems."

Posted by John K. Waters on August 8, 2012


No Project Jigsaw in Java 8

Looks like we won't be seeing the Java-native module system known as Project Jigsaw in the upcoming Java 8 release. In a blog post published this week, the chief architect of Oracle's Java Platform Group, Mark Reinhold, proposed deferring the project to the Java 9 release. Java 8 is currently on track for a September 2013 ship date; Java 9 is expected in 2015.

Although "steady progress is being made" on Jigsaw, some "significant technical challenges remain," Reinhold wrote, adding, "There is, more importantly, not enough time left for the broad evaluation, review, and feedback which such a profound change to the Platform demands."

Not to be confused with the weird puppet-head guy in the Saw movies, Project Jigsaw is the OpenJDK project focused on implementing a standard module system for Java Standard Edition (SE). Sponsored by the Java programming language Compiler Group, and originally aimed at modularizing just the JDK, the project will ultimately apply to the Java SE, EE, and ME platforms and the JDK.

"The growing demand for a truly standard module system for the Java Platform motivated expanding the scope of the [Jigsaw] Project," the sponsors explain on the OpenJDK Web site. The goal of the project is to "produce a module system that can ultimately become a JCP-approved part of the Java SE Platform and also serve the needs of the ME and EE Platforms."

When it is implemented, a modular system for Java will "ease the construction, maintenance, and distribution of large applications, at last allowing developers to escape the 'JAR hell' of the brittle and error-prone class-path mechanism," Reinhold wrote. Such a system will support customizable configurations that scale from large servers to embedded devices and "in the long term, enable the convergence of Java SE with the higher-end Java ME Platforms." Reinhold also pointed out that "Modular applications built on top of a modular platform can be downloaded more quickly, and the run-time performance of the code they contain can be optimized more effectively."
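
To make the module-system idea a little more concrete, here's a hedged sketch of what a Jigsaw module declaration might look like. The syntax is still being worked out, so the keywords and names below are illustrative only, not the final specification.

```java
// Purely illustrative: a module-info.java in the general shape Jigsaw has
// been exploring. The module and package names here are hypothetical.
module com.example.inventory {
    // An explicit dependency, resolved when the module is compiled and
    // launched, rather than discovered (or not) on a runtime class path.
    requires com.example.persistence;

    // Only this package is exposed to other modules; everything else in
    // the module stays internal.
    exports com.example.inventory.api;
}
```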

Reinhold expects Java 8 to include the much-anticipated Project Lambda (JSR 335), which adds closures and related features to the Java language to support programming in multicore environments. Java 8 will also include the new Date/Time API (JSR 310), Type Annotations (JSR 308), and "a selection of the smaller features already in progress," he said.
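
For the curious, here's a small sketch of the kind of code JSR 335 is meant to enable. The lambda syntax shown reflects the current drafts and could still shift before Java 8 ships:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class LambdaSketch {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Reinhold", "Goetz", "Hilwa");

        // Today: an anonymous inner class just to pass a bit of behavior around.
        Collections.sort(names, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return a.compareToIgnoreCase(b);
            }
        });

        // With JSR 335: a lambda expression says the same thing in one line.
        Collections.sort(names, (a, b) -> a.compareToIgnoreCase(b));

        System.out.println(names);
    }
}
```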

Work on Jigsaw will, in the meantime, proceed at full speed, he added.

In that same blog post, Reinhold also advocated for a regular two-year cycle for all future Java SE releases.

"In all the years I've worked on Java," Reinhold wrote, "I've heard repeatedly that developers, partners, and customers strongly prefer a regular and predictable release cycle. Developers want rapid innovation while enterprises want stability, and a cadence of about two years seems to strike the right balance. It's therefore saner for all involved -- those working on new features, and those who want to use the new features -- to structure the development process as a continuous pipeline of innovation that's only loosely coupled to the actual release process, which itself has a constant rhythm. If a major feature misses its intended release train then that's unfortunate but it's not the end of the world: It will be on the next train, which will also leave at a predictable time."

IDC analyst and long-time Java watcher Al Hilwa believes that delaying Project Jigsaw is probably the right move.

"Java does not exist in a vacuum and delays in the Java modularity project of the JDK will no doubt hinder certain parts of the ecosystem," Hilwa told ADTmag. "However, under the circumstances, I think it is wise to prioritize schedule over features. The maturity of any development process is measured by the predictability of its schedule. Oracle has done a decent job of steering Java to be schedule driven, and kudos to the team for owning up at the right time, because the ecosystem needs to know as early as possible."

I would argue that the success of the annual Eclipse Release Train, now in its seventh year, offers an example of the value of predictable releases, both to commercial adopters and to the community itself. Hilwa believes the releases should come even faster.

"I would argue that, in the era of cloud services, social interaction, and mobile app stores, a faster cadence is needed," Hilwa added, "and the two-year cycle should give way to a more incremental and faster approach to development everywhere."

 

Posted by John K. Waters on July 19, 2012


Spring Creator Rod Johnson Leaves VMware

Rod Johnson, who wrote the first version of the open-source, Java-based Spring framework and later co-founded SpringSource, has left his position as SVP and GM of VMware's SpringSource product division. Johnson joined the Palo Alto, Calif.-based virtualization company in 2009, when it acquired SpringSource, where he was then serving as CEO.

In the blog post announcing his departure, Johnson gave no specific reasons for leaving the company, but described the past decade as "a wild and engrossing ride that I could never have imagined when I wrote the first lines of BeanFactory code in my study in London in 2001."

The Spring Framework is one of the most popular Java application frameworks on the market today. It's a layered Java/J2EE framework based on code published in Johnson's book Expert One-on-One J2EE Design and Development (Wrox Press, October 2002). Although SpringSource has been a Java-focused operation, the company has also ported the framework to .NET.
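
For readers who have never used it, the framework's central idea is dependency injection: a container constructs your objects and wires them together, so application code stays free of lookup and plumbing logic. Here's a minimal sketch using made-up class and bean names and the Java-based configuration style Spring has supported since version 3.0:

```java
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// A hypothetical service whose dependency is supplied by the container
// rather than constructed and looked up by hand.
class GreetingService {
    private final String greeting;

    GreetingService(String greeting) {
        this.greeting = greeting;
    }

    String greet(String name) {
        return greeting + ", " + name;
    }
}

// The configuration class tells the container which beans exist and how
// they fit together.
@Configuration
class AppConfig {
    @Bean
    String greeting() {
        return "Hello";
    }

    @Bean
    GreetingService greetingService(String greeting) {
        return new GreetingService(greeting);
    }
}

public class SpringSketch {
    public static void main(String[] args) {
        // The container wires the beans together; application code never
        // news up or looks up its own dependencies.
        ApplicationContext ctx = new AnnotationConfigApplicationContext(AppConfig.class);
        GreetingService service = ctx.getBean(GreetingService.class);
        System.out.println(service.greet("Rod"));
    }
}
```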

The open source Spring project was launched in 2003, and Johnson co-founded SpringSource in 2004. When the company was acquired by VMware in 2009, Johnson saw the merger as a joining of forces.

"Both of these companies grew up around great technology," he told ADTMag.com at the time. "We believe that the technology synergies are very, very strong, and that they will allow us to do incredibly exciting things with Platform as a Service and Java cloud technologies."

The VMware merger is responsible, at least in part, for the Spring Framework's expansion into management, runtimes, and non-Java development tools. In 2010 the company launched a lightweight version of its tc Server to provide a small footprint for running applications in virtualization and cloud-deployment architectures. The division also acquired data management vendor GemStone that year with plans to use that company's GemFire enterprise data fabric to give developers using the Spring Framework the infrastructure necessary for emerging cloud-centric applications.

Johnson served as a member of the Executive Committee (EC) of the Java Community Process (JCP) and was an outspoken critic of the JCP's slow progress toward resolving problems with J2EE. In 2009, during the latest dustup in the ongoing conflict between Sun Microsystems and the Apache Software Foundation (ASF) over Sun's refusal to provide the Foundation with a license for a Technology Compatibility Kit (TCK), Johnson expressed his disappointment with the process to ADTmag: "This issue raises legitimate concerns about the credibility of the JCP as a whole," he said. "I mean, the JCP is either open or it's not. I have a lot of sympathy for the Foundation on this issue."

In 2011, Johnson told attendees at the annual JAX Conference in San Francisco that Java developers needed to "seize the lead in cloud computing." Developers would soon need to be able to build applications that "leverage a dynamic and changing infrastructure, access data in non-traditional storage formats, perform complex computations against large data sets, support access from a plethora of client platforms and do so more quickly than ever before without sacrificing scalability, reliability and performance," he declared. What's called for now, he said, is "an open, productive Java Platform-as-a-Service."

Mike Milinkovich, executive director of the Eclipse Foundation, believes that, whatever Johnson does next, he'll be remembered for his work on the Spring Framework and his efforts to simplify Java development.

"I think Rod will be remembered as one of the pioneers of the open source and Java community," Milinkovich said. "He showed how open source can be used to create innovative technology that is widely used by the enterprise Java community. His lasting legacy will be forcing the simplification of the enterprise Java middleware stack. In doing so, he played a very large part in making Java a success."

IDC analyst Al Hilwa sees Johnson as a "role model in entrepreneurship" who has had a big impact on Java developers during his tenure at the head of SpringSource.

"Even though at heart he is a developer, very few have been able to roll obscure developer frameworks into an acquisition of the size that VMware paid for SpringSource," Hilwa said. "We may continue to evaluate whether VMware's ventures in application platform will make a lasting business model or generate sustainable revenue, but Rod's impact on the life of developers is undisputedly sizeable. The ideas pioneered by the Spring Framework have had long-lasting impact in the Java world as well as wide adoption. What's more, they affected the way Java EE has evolved, which has absorbed many of these innovations."

In his blog post, Johnson expressed his satisfaction with the success of Spring as a means of simplifying Java development. "Spring was created to simplify enterprise Java development, and has succeeded in that goal," he wrote, adding that Spring has become the dominant programming model for enterprise Java. Johnson also pointed to the framework's evolution as enterprise technology "well beyond the scope of the original Spring Framework." He cited a range of "Spring-created technology at the forefront of enterprise development," including Spring for Apache Hadoop (Big Data), Spring Data (NoSQL and distributed datastores), Spring Social (social networking), and Spring Mobile (mobile development).

Johnson also sought to reassure members of the open source Spring community in his blog post: "Spring will continue to be driven forward by the Spring project leads, whom you've all come to know and trust over the past several years. Their experience, deep technical knowledge and innovative thinking will continue to guide Spring's development. I look forward to seeing what they'll create for the next decade, in partnership with their communities."

Posted by John K. Waters on July 10, 2012


Hadoop Summit 2012 Highlights

The fifth annual Hadoop Summit brought an estimated 2,100 attendees to the Convention Center in downtown San Jose, Calif., last week. The two-day, big-data event was hosted by Yahoo, Hadoop's first large-scale user, and Hortonworks, a leading commercial support-and-services provider.

Among the announcements coming out of this year's summit were updates from the three leading commercial Hadoop distributors. Hortonworks unveiled the first general release of its Apache Hadoop software distro, Hortonworks Data Platform (HDP) 1.0, a day before the start of the show. The company bills the open source data management platform as "the next generation enterprise data architecture." Built on Apache Hadoop 1.0, this release includes a bundle of new provisioning, management, and monitoring capabilities built into the core platform. It also comes with an integration of the Talend Open Studio for Big Data tool.

Cloudera got a big jump on the competition by announcing a new release a week earlier, but the company showed off its new CDH4 and Cloudera Manager 4, which are part of Cloudera Enterprise 4.0, at the show. Version 4 of CDH, the company's open source Hadoop platform (on which Enterprise 4.0 is built), expands the number of computational processes executable under Hadoop and introduces a new feature designed to allow software programs to be embedded within the data itself. Dubbed "coprocessors," these programs are executed when certain pre-defined conditions are met.

MapR Technologies showed off version 2.0 of its Hadoop distro, the first to support multi-tenancy. The new version also comes with advanced monitoring and management tools, isolation capabilities, and added security. MapR is offering this release in a basic edition (M3) and an advanced edition (M5). The M3 edition supports HBase, Pig, Hive, Mahout, Cascading, Sqoop and Flume. The M5 edition adds high-availability features and additional security tools, including JobTracker HA, Distributed NameNode HA, Snapshots and Mirroring.

Also, VMware launched a new open source project codenamed "Serengeti" at the show. The project's Web site describes its goal as being "to enable the rapid deployment of an Apache Hadoop cluster... on a virtual platform." VMware says the project aims to produce a virtualization-aware Hadoop configuration and management tool. The company is partnering with Cloudera, Hortonworks, MapR and big data analysis company Greenplum on this project.

Apache Hadoop is an increasingly popular, Java-based, open-source framework for data-intensive distributed computing. The system is designed to analyze large amounts of data in a small amount of time. At its core, it is a combination of Google's MapReduce and the Hadoop Distributed File System (HDFS). MapReduce is a programming model for processing and generating large data sets. It supports parallel computations over large data sets on unreliable computer clusters. HDFS is designed to scale to petabytes of storage and to run on top of the file systems of the underlying OS.

Attendance at this year's Hadoop Summit set a record. The first event, held in 2008, drew an estimated 500 attendees. The Summit's sponsorship roster underscores the growing importance of the data analysis platform. Cisco, Facebook, IBM, Microsoft and VMware were among the heavy hitters adding their support to the event; there were 49 event sponsors total.

Speaking at the conference, Facebook engineer Andrew Ryan talked with attendees about his company's record-setting reliance on HDFS clusters to store more than 100 petabytes of data. During his talk, Ryan explained how Facebook has worked around Hadoop's key weakness: its reliance on a single name server (the Namenode), through which clients perform all filesystem metadata operations while sending and receiving file data through a pool of Datanodes. If a Datanode goes down there's little impact on the cluster, but if the Namenode goes down, no clients can read from or write to HDFS. The fix: AvatarNode, a piece of software designed to provide a backup Namenode. Ryan laid out the details from his talk in a blog post.

Posted by John K. Waters on June 18, 2012


JNBridge 'Lab' Helps .NET Devs With Hadoop

JNBridge, maker of tools that connect Java and .NET Framework-based components and apps, on Monday released a free interoperability kit for developers looking for new ways of connecting disparate technologies. This second JNBridge Lab demonstrates how to build and use .NET-based MapReducers with Apache Hadoop, the popular Java-based, open-source platform for data-intensive distributed computing.

The company began offering these kits in March. The first JNBridge Lab was an SSH Adapter for BizTalk Server designed to enable the secure access and manipulation of files over the network. This new Lab aims to provide a faster and better way to create heterogeneous Hadoop apps than other current alternatives, the company claims. All of the Labs come with pointers to documentation and links to source code.

The new Hadoop Lab shows developers how to write .NET-based Hadoop MapReducers against the Java-based Hadoop API, which avoids the overhead of the Hadoop streaming utility. The resulting .NET code can run directly inside Hadoop processes.

"Streaming works," said JNBridge CTO Wayne Citrin, "but it's kind of thin gruel. It really makes non-Java MapReducers into second-class citizens in the Hadoop world. You have to manage and configure a separate process. You have to parse the output and put it back together when you're done, which is another overhead cost. Then there's the overhead of going through sockets. It's not surprising that not that many people actually use .NET in this case."

The code provided in the Hadoop Lab can be run as an example, Citrin explained, or it can be used as a design pattern for users to develop their own Hadoop apps using C# or VB.NET.

JNBridge started its Labs project earlier this year as part of the company's 10-year anniversary celebration.

"It was a way of showing people how to use the out-of-the-box functionality of JNBridgePro to do useful things that they may not have thought of, or that don't exist out there as products," Citrin said.

The company's flagship product, JNBridgePro, is a general purpose Java/.NET interoperability tool designed to bridge anything Java to .NET, and vice versa, allowing developers to access the entire API from either platform. Last year the company stepped into the cloud with JNBridgePro 6.0.

Why would anyone want to build MapReducers in .NET?

"For the same reasons you would want to use JNBridgePro in the first place," Citrin said. "Your organization might have .NET-based libraries they need or want to use in a Hadoop application. Your company might have more people skilled in .NET than Java. Or you might be working with Windows Azure, which supports Java, but the .NET tooling is better."

Citrin confesses that developers have yet to begin trampling each other to download the JNBridge Labs, but there has been enough interest and feedback to keep the project going.

The JNBridge Labs are available for free download from the company's Web site. Although the kits are free, they require a JNBridgePro license for use beyond the trial period. The company announces new Lab releases on its blog.

Posted by John K. Waters on May 21, 2012


Architect Spotlight: Brian Noyes, High Flying F-14 Vet Turned .NET MVP

Brian Noyes didn't set out to become a software architect. He started writing code "to stimulate his brain," while he was flying F-14 Tomcat fighter aircraft for the U.S. Navy. As his software expertise developed, he found himself "going down a technical track" managing onboard mission computer software in the aircraft, and later, systems and ground support software for mission planning and controlling satellites.

"It was just a hobby," Noyes says, "but it led me to work that I still love to do."

Noyes left the Navy in 2000 and today is chief architect at IDesign, a .NET-focused architecture, design, consulting, and training company. He's also a Microsoft Regional Director and an MVP, and the author of several books, including: Data Binding with Windows Forms 2.0: Programming Smart Client Data Applications with .NET (Addison-Wesley Professional, 2006) and Developer's Guide to Microsoft Prism 4: Building Modular MVVM Applications with Windows Presentation Foundation and Microsoft Silverlight (Microsoft Press, 2011).

Noyes specializes in smart client architecture and development, presentation-tier technologies, ASP.NET, workflow and data access. He writes about all these topics and more on his blog, ".NET Ramblings."

Not surprisingly, Noyes is a fan of Microsoft's Extensible Application Markup Language (XAML). He says Microsoft got a lot of things right when it created this declarative, XML-based language for the .NET Framework back in 2005/2006.

"XAML provides a clean separation between the declarative structure and the code that supports it," Noyes says. "That can either come in the form of the code-behind that's inherently called to it in the way Visual Studio does it, or using the Model View ViewModel (MVVM) pattern to have even better separation. They put mechanisms into the bindings and control templates and data templates that just give you this nice separation of things -- if you want them.

"They really facilitated both ends of the spectrum," he continues. "They made it so you have a drag-and-droppy, RAD-development kind of approach, where you're not so concerned about the cleanliness of the code and how maintainable it is and you just want to get it done. Or, if you're more of a maintainability Nazi, as I am, and want absolutely clean code and separation of concerns and things like that, it facilitates that as well."

XAML shipped with .NET 3.0, along with the Windows Presentation Foundation (WPF), of which Noyes is also a fan. "One thing I always say about WPF is that they did a darned good job of getting it right the first time," he says, "because, since the first release, there has been very little change to the core framework. Whereas with Silverlight they've had to do substantial improvements with each release to inch it up closer to what WPF was capable of."

Noyes explores uses for all of these tools and technologies in his sessions scheduled for upcoming Visual Studio Live! conferences. "For events like this, it's about giving them knowledge they can take home and use in the trenches the very next day," he says. "I try to keep things close to the code."

Posted by John K. Waters on May 11, 2012