Carnegie Mellon To Engage Yahoo! Open Source Supercomputing Project

Carnegie Mellon University will become the first higher education institution to work with M45, a newly announced Yahoo! project designed to advance distributed computing research and software development. The program, which builds on the Apache Software Foundation's open source Hadoop, will allow researchers to test software on a Yahoo!-provided 4,000-processor supercomputer.

According to Yahoo!, its M45 project differs from other supercomputing projects in that it's focused exclusively on "pushing the boundaries of large-scale systems software research." For the program, Yahoo! will make available to researchers a 4,000-processor computing cluster capable of performing 27 teraFLOPS and sporting 3 TB of memory and 1.5 petabytes of storage. It will run the latest version of Hadoop (to which Yahoo! is one of the principal contributors) and other open source software, including, according to Yahoo!, the Pig parallel programming language.
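To put those totals in perspective, the short Python sketch below divides each quoted figure by the processor count. It is only a rough illustration: it assumes decimal (SI) units and an even split across all 4,000 processors, neither of which Yahoo! states, and the function and variable names are made up for this example.

```python
# Cluster totals as quoted in the article.
# Assumption: decimal SI units (1 TB = 1e12 bytes, 1 PB = 1e15 bytes).
PROCESSORS = 4_000
PEAK_FLOPS = 27e12       # 27 teraFLOPS
MEMORY_BYTES = 3e12      # 3 TB of memory
STORAGE_BYTES = 1.5e15   # 1.5 petabytes of storage


def per_processor(total: float, processors: int = PROCESSORS) -> float:
    """Average share of a cluster-wide total per processor (even split assumed)."""
    return total / processors


gflops_each = per_processor(PEAK_FLOPS) / 1e9
memory_gb_each = per_processor(MEMORY_BYTES) / 1e9
storage_gb_each = per_processor(STORAGE_BYTES) / 1e9

print(f"Compute per processor: {gflops_each:.2f} gigaFLOPS")
print(f"Memory per processor:  {memory_gb_each:.2f} GB")
print(f"Storage per processor: {storage_gb_each:.0f} GB")
```

Under those assumptions, each processor averages 6.75 gigaFLOPS, 0.75 GB of memory, and 375 GB of storage. The actual hardware layout of the cluster may differ.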

"Hadoop has become an important computing environment for data-intensive applications and Yahoo! is playing a leading role in its development. We are excited about collaborating with Yahoo! on systems software research, helping to advance the state of the art, and creating new research possibilities in this critical area," said Randall E. Bryant, dean of the School of Computer Science at Carnegie Mellon, in a statement. "We look forward to working with Yahoo! and jointly contributing back to the open source community."

Carnegie Mellon, for its part, will be the first institution to use the system. According to Yahoo!, CMU professors Garth Gibson and Greg Ganger will "instrument the system and evaluate its performance." The company added: "Simultaneously, Carnegie Mellon computer science professors Jamie Callan and Christos Faloutsos, academic leaders in text and Web mining, will solve challenging information retrieval and large-scale graph problems on the cluster. Carnegie Mellon faculty members Alexei Efros, Noah Smith, and Stephan Vogel will also use the cluster to tackle large-scale computer graphics, natural language processing, and machine translation problems, respectively."

CMU is also involved in the Google/IBM parallel computing initiative, a pilot academic program likewise centered on Hadoop and a large-scale processor cluster.

Yahoo! said it plans to make M45 available to other institutions for research in the future.

About the Author

Dave Nagel is the executive editor for 1105 Media's educational technology online publications and electronic newsletters.