Corporations want employees to "think big." Some want them to "think many," as well. Servers, that is. We're talking about thousands of servers linked together and delivering the power of "cluster" or "cloud computing." That's what people are calling the computing model that takes vast amounts of computational horsepower, produced by many machines working in parallel, and makes that resource available via the Internet or some other network.
In the age of the Internet and its attendant "Internet-scale" computing, it's "no longer enough to program one machine well," blogs Googler Christophe Bisciglia, a senior software engineer and thought-daddy of the Academic Cluster Computing Initiative (ACCI), a program offered jointly by Google and IBM. In tackling tomorrow's challenges, he says, "Students need to be able to program thousands of machines to manage massive amounts of data in the blink of an eye."
That's what Google does, and it's what others can do in a cloud-computing environment. Consequently, ACCI is making such an environment accessible to academic researchers. This fall, students and faculty at Arizona State University are becoming part of this Google/IBM effort to bring cloud computing to college campuses.
Not a cloud in the sky
"To practice cloud computing, you have to have a cloud," notes Adrian Sannier, university technology officer for Arizona State. "That's the obstacle most universities have. A cloud is a big thing, and it's very expensive."
Since last year, ACCI has provided a cloud to students at six U.S. university campuses: MIT, Stanford, Carnegie Mellon, the University of Washington, the University of Maryland and the University of California at Berkeley. ACCI also runs at one university in China and two in Taiwan.
For all of these institutions, there is more to ACCI than sheer computing power. Students also get instruction, which was pioneered through curricula originally produced at the University of Washington, Bisciglia's alma mater.
At ASU, the course will be developed this fall through a series of workshops that begin September 22. A curriculum will evolve from those workshops and will be offered next spring.
"We'll start with an overview of important computing models," says Daniel Stanzione, director of the high-performance computing center at ASU's Fulton School of Engineering and a member of the faculty team that will teach the cloud-computing course. "Then, we'll learn programming in Google style," he adds.
That style includes MapReduce, a Google programming model for processing large data sets. In addition, students will use Hadoop, a software platform whimsically named after a child's toy elephant. It, too, focuses on large-scale data processing.
Raghu Santanam, professor of information systems at the W. P. Carey School of Business and another cloud-computing course instructor, explains the idea behind MapReduce this way: "First you split up the problem and send it off to multiple computers. Then, you bring those data back together and combine them into a single answer."
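The split-and-combine idea Santanam describes can be sketched in a few lines of Python. This is only an illustration of the pattern, not Google's MapReduce or Hadoop's actual API: the function names (`split_into_chunks`, `map_chunk`, `combine`, `word_count`) are made up for this example, and a pool of threads on one machine stands in for the many computers of a real cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks roughly equal pieces."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def map_chunk(lines):
    """Map step: count the words in one chunk, independently of the others."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def combine(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def word_count(lines, n_workers=4):
    """Split the input, count each chunk in parallel, then merge the results."""
    chunks = split_into_chunks(lines, n_workers)
    # Assumption: threads stand in for the separate machines of a cluster.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(combine, partials, Counter())


if __name__ == "__main__":
    sample = ["the cloud is big", "the cloud is expensive", "big data"]
    print(word_count(sample, n_workers=2))
```

Because each chunk is counted without reference to any other, the map step can be spread across as many workers as are available; only the final merge needs to see all the partial results.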
According to Santanam, this kind of process applies to many different applications. But, "if you want to do something like that, you need a huge computing infrastructure," he notes. This is where the ACCI cloud comes in handy.
The technology that students and researchers get to work on mirrors, on a smaller scale, what is used in Google's internal operations. Google provides computing resources, and IBM provides system administration, according to Google spokesman Andrew Pederson.
"Our students get to work on the same kind of computing platform that Google uses in its own computing operations," Santanam says.
How did ASU get involved? The school is well known to Google. ASU is the largest user of Google Apps, says Kari Barlow, assistant vice president of the university's technology office. In fact, all 65,000 of ASU's students have Gmail, and the system was deployed in a two-week period. "We peaked during the last semester at 63,000 seven-day active users," she says. This means that, "during a seven-day period, 63,000 different people logged in. The next highest institutional user is at 32,000, so our system is almost double its size."
Barlow believes that ASU's ability to deploy Gmail and other Google Apps on the fly and get adoption so quickly proved its commitment to top-tier technology and worthiness as a business partner.
She also thinks ASU's commitment to interdisciplinary education earned a nod from Google. According to Barlow, other campuses are focusing on the computer science of cloud computing. ASU will put "a set of computer science students working with business students or people from biology or political science to solve their research questions." In that way, the groups will perform like a cross-functional team in a corporation. They'll apply cloud computing to the same kinds of down-to-earth problems they may encounter on the job.
Apps fit for the cloud
What kinds of applications are appropriate for this computing venue?
"Cloud computing is a compelling model for a lot of research problems, especially the data-intensive ones," says the Ira A. Fulton School's Stanzione. "There are a lot of biology problems that might work well, such as searching through a large gene databank to find correlations to some disease." It also could analyze financial information, he says, "looking for trends and anomalies."
Geology might be another use for clouds, he adds. "You could search through large image repositories to look for certain features -- things that might indicate the presence of oil or minerals." Stanzione's computing center worked on a storage solution for data that will be gathered this fall, when NASA launches a moon-imaging mission. It doesn't necessarily require cloud computing, "although we might use it," he says.
So might Santanam, who is working on a computerized simulation of a flu pandemic. He and his team have been looking at the social contacts people have day-to-day -- at work, school, restaurants, the movies -- and virtually replicating how disease spreads through these human networks.
"Think about the people who might need this information," he says. "It's decision-support for executives who are looking at pandemic planning. You can give them this tool so they can see what happens if they quarantine an area. What happens if they close all the schools in the area?"
Santanam notes that this is a computer-intensive application, particularly as it could, conceivably, pull from many sources, such as the U.S. Department of Transportation's data on how many people travel between airports daily or census data on travel between counties. That's a great application for a cloud, he says.
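To make the "what happens if they close all the schools" question concrete, here is a deliberately tiny sketch of spread through a contact network. It is not Santanam's model: the people, the contact list and the function names are invented for illustration, and it assumes, unrealistically, that every infected person infects all of their contacts the next day and nobody recovers.

```python
# Hypothetical contact list: (person_a, person_b, setting where they meet).
CONTACTS = [
    ("ana", "ben", "school"),
    ("ben", "cy", "work"),
    ("cy", "dee", "school"),
    ("dee", "eli", "restaurant"),
    ("ana", "fay", "work"),
]


def build_network(contacts, closed_settings=()):
    """Build an adjacency map, dropping contacts made in closed settings."""
    network = {}
    for a, b, setting in contacts:
        network.setdefault(a, set())
        network.setdefault(b, set())
        if setting not in closed_settings:
            network[a].add(b)
            network[b].add(a)
    return network


def simulate_spread(network, first_case, days):
    """Return the set of people infected after the given number of days.

    Simplifying assumption: each newly infected person infects all of
    their contacts on the following day, and nobody recovers.
    """
    infected = {first_case}
    frontier = {first_case}
    for _ in range(days):
        newly_infected = set()
        for person in frontier:
            newly_infected |= network.get(person, set()) - infected
        if not newly_infected:
            break  # the outbreak has run out of new contacts
        infected |= newly_infected
        frontier = newly_infected
    return infected


if __name__ == "__main__":
    open_schools = build_network(CONTACTS)
    closed_schools = build_network(CONTACTS, closed_settings={"school"})
    print(sorted(simulate_spread(open_schools, "ana", days=3)))
    print(sorted(simulate_spread(closed_schools, "ana", days=3)))
```

A realistic version would use probabilities of transmission, recovery times and millions of people drawn from travel and census data, which is where the need for large-scale computing comes from.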
So is business data mining, maintains tech officer Sannier. "Business intelligence has moved to the forefront" of data-intensive operations, he says. "We've been gathering these data for years and only using a small fraction of them to help us understand how to run our businesses. These kinds of large scale algorithms, which today are the province of only a few, will be the thing that separates successful companies from unsuccessful ones in the global marketplace. Students who encounter how to do that in the cloud are going to bring companies into that new world."
Reach for the clouds
Putting such applications into the cloud is akin to outsourcing your information technology department, says Haluk Demirkan, professor of information systems at the W. P. Carey School of Business. And, he says, such computing models demand different skill sets than using a computer or buying a server. Based on his service-oriented technology and management research work with corporate giants such as Intel, IBM and American Express, he sees a number of shifts ahead in IT procurement and contracting. Those shifts will bring opportunities as well as new challenges for corporate executives and employees.
For instance, business and IT people will need to negotiate service-level agreements, not just purchasing contracts, once they move to such computing models. Instead of focusing on the details of a transaction, contracting for cloud-computing services means forging a relationship, he says. Integration and communication skills become keys to success, as IT people shift from writing code to designing and coordinating dynamic services.
According to Sannier, it was the relationship that ASU had already fostered with Google that prompted him to push for inclusion in ACCI.
"For students, cloud computing is one of the rarest and highest-valued skills in computer science at the moment. And, it's one of the most difficult to acquire," he says. "Short of working in a commercial environment, you don't have the computing horsepower necessary to learn how to do problems at the scale of the Internet. It's a big deal to get access to that kind of environment and put on your resume that you've operated MapReduce at scale."
Students need this kind of exposure, says Google's Pederson, if they want to keep up with how IT is evolving.
"As large-scale, highly parallel computing -- cloud computing -- becomes the industry standard, the next generation of software developers will need to move towards a model based on hundreds or thousands of computers working together," he says. "Lowering the barriers to entry for students and faculty to access these otherwise prohibitively costly resources will ensure that the next generation of software developers is equipped to build the Internet-scale applications of tomorrow."
- Google and IBM have teamed up to provide cloud-computing resources to a small number of U.S. universities for students and faculty to use in research.
- Arizona State University has been selected to join schools such as MIT and Stanford in this initiative.
- A series of cloud-computing workshops will take place this fall; a cloud-computing course will be offered in the spring.
- This initiative gives students access to the same type of computing architecture Googlers use in developing Internet-scale applications.