Diego Lucio Barral, Hugo Malvar Álvarez, Miguel Medina Albero
Cloud Computing: the path that led us here and the current possibilities
This document was produced by Diego Lucio Barral, Hugo Malvar Álvarez and Miguel Medina Albero during the academic year 2012-2013, as part of the course Inglés Técnico (Technical English) of Computing Sciences, taught by Dr. Pablo Cancelo López, senior lecturer at the English Department of the University of A Coruña. All the book and website references used in this work are cited in the Bibliography / Webgraphy section.
In the narrow column, you can find supplementary information and explanations of concepts related to the subject matter.
This is the main column. Here you can find the major issues treated in each chapter. It uses lists to organize elements such as: – Characteristics. – Advantages and disadvantages. – Examples.
In the gray box, you can find historical facts, related anecdotes and relevant quotes.
Citations: [b.1] refers to book 1 in the bibliography; [w.1] refers to article 1 in the webgraphy.
Index of contents
Part 1: The evolution of computing structures
– Typical / Basic deployment
– Clusters
– Grid Computing
– Volunteer Computing
– Utility Computing
Part 2: Cloud Computing
– The path to Cloud Computing
– The different kinds of cloud
– Infrastructure as a Service
– The 4 levels of IaaS
– Platform as a Service
– Software as a Service
Part 3: Practical Case Study
– FOLDING@home
– Autodesk 360
– Google Drive
Conclusions
Bibliography / Webgraphy
Part 1
The evolution of computing structures
Typical / Basic deployment
The basic computing structure consists of a single computer working in isolation. It has a fixed storage and processing capacity, which is used to perform tasks and to store information. Applications run on the designated server. That implies some limitations:
– The designated server may become overloaded, slowing down processes or even causing loss of data.
– The other available servers stay idle, so their capacity goes unused for long periods of time.

Let us go back to the first wave of Internet-based computing, sometimes called Web 1.0, which arrived in the 1990s. In the typical interaction between a user and a website, the website would display some information, and the user could click on hyperlinks to get additional information. Information flow was thus strictly one-way, from the institutions that maintained websites to the users. This model was that of a gigantic library, with search engines being the library catalog. [b. 3]
Image from the video IBM Grid Computing Demo, at pcudc.es.
Clusters
If we want more performance than a single computer can provide, we can join several of them to create a cluster. A cluster is a set of connected computers that work together so that they can be viewed as a single system. Clusters pool the resources of all their computers and are usually deployed to improve performance and availability at a better cost-effectiveness ratio. IBM's Sequoia was ranked the world's fastest supercomputer in 2012, with a performance of 16.32 petaFLOPS running on over 1.5 million processor cores: more than 98,000 nodes with 16 cores each. It runs on Linux.
This way of working is known as Parallel Computing: a computing structure in which computers are tightly coupled.
Scheme from Wikipedia.
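As an illustration of this tightly coupled way of working, the following sketch splits one large computation across several worker processes on a single machine, using Python's multiprocessing module; a real cluster applies the same idea across networked nodes (for instance through MPI). The problem size and worker count are arbitrary.

```python
# A minimal sketch of parallel computing: one large task split across
# several workers on one machine. Clusters distribute the same idea
# across networked nodes.
from multiprocessing import Pool

def partial_sum(bounds):
    """Compute the sum of squares over one sub-range of the problem."""
    start, end = bounds
    return sum(i * i for i in range(start, end))

if __name__ == "__main__":
    n = 1_000_000
    workers = 4
    step = n // workers
    chunks = [(i * step, (i + 1) * step) for i in range(workers)]
    with Pool(workers) as pool:
        # Each worker handles one chunk; partial results are combined at the end.
        total = sum(pool.map(partial_sum, chunks))
    print(total)
```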
Grid Computing
The second wave of Internet computing developed in the early 2000s, when applications that allowed users to upload information to the Web became popular. This brought about a new class of applications, due to the rapid growth of user-generated content, social networking, etc. This new generation of Internet usage is called Web 2.0 and, if Web 1.0 looked like a massive library, Web 2.0 is more like a virtual world that in many ways resembles a replica of the physical world. [b. 3]
Distributed Computing [w. 3] A distributed system consists of multiple computers that communicate through a computer network and interact to achieve a common goal. A program that runs on a distributed system is called a distributed program.
A grid is a form of distributed computing that involves a group of computers connected through a network so that they can share their resources, both storage and processing power. Compared to clusters, grids tend to be more loosely coupled, heterogeneous and dispersed. The advantages of Grid Computing over isolated computers are:
– It allows you to leverage the resources of the whole grid to perform large tasks.
– It gives users the ability to work together on a single database.
– It can be more cost-efficient for obtaining high-level performance, and it can be more reliable.
– It is easier to expand.
A grid implements a 'virtual supercomputer' made up of a group of networked computers acting together to perform large tasks. [b. 2]
A grid can basically work in two ways:
– Pursuing a common goal, such as solving a single computational problem.
– Serving the individual needs of each computer, where the purpose is to coordinate the use of shared resources.
Image from the video IBM Grid Computing Demo, at pcudc.es.
The concept of "The Grid" was presented in the early 1990s by Ian Foster and Carl Kesselman. [b.2] Parallel / Distributed [w. 3] The definitions overlap considerably, but we can say that parallel computing is a tightly coupled form of distributed computing, while ordinary grids are loosely coupled.
Some computing providers started to offer network deployments for enterprises so that employees could work on a single large database. A notable example is the IBM Grid System, whose structure used an automatic scheduler to set rules and priorities.
Volunteer Computing
Sometimes a public entity or an enterprise needs to perform a computational task of public interest involving amounts of data too huge to be processed by its own computers in an acceptable amount of time. Volunteer Computing is a type of distributed computing in which computer owners all around the world donate their computing resources, processing power or storage, to perform the operations. Some of these initiatives have been very successful. Two of the most important are:
– FOLDING@home: a project for disease research, run by Stanford University, that simulates protein folding, computational drug design and other types of molecular dynamics.
– SETI@home: a project, run by the University of California, Berkeley, that analyzes radio signals in search of signs of extraterrestrial intelligence.
Utility Computing
If computers of the kind I have advocated become the computers of the future, then computing may someday be organized as a public utility just as the telephone system is a public utility... The computer utility could become the basis of a new and important industry. John McCarthy, 1961 [b. 4]
Utility Computing was an intermediate step between grid deployment and Cloud Computing. It was the first attempt to commercially provide on-demand resource provisioning: a service provider makes computing resources and infrastructure management available to customers as needed, and charges them for specific usage rather than a flat rate. The word utility draws an analogy to other services, such as electrical power, that seek to meet fluctuating customer demand.
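The pay-per-use idea can be illustrated with simple arithmetic. The sketch below, with invented prices, compares a metered utility bill against a flat-rate server: light users pay less on the utility model, while heavy users may not.

```python
# Illustrative only: pay-as-you-go billing versus a flat rate.
# Both prices below are invented for the example.
HOURLY_RATE = 0.10       # cost per server-hour, billed on demand
FLAT_MONTHLY_FEE = 60.0  # cost of keeping a dedicated server all month

def utility_cost(hours_used):
    """Bill only for the hours actually consumed."""
    return hours_used * HOURLY_RATE

for hours in (100, 400, 800):
    print(f"{hours} h: utility = {utility_cost(hours):.2f}, "
          f"flat = {FLAT_MONTHLY_FEE:.2f}")
```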
Newly appointed IBM Chairman Samuel J. Palmisano said last month that he is betting $10 billion that customers will turn to Big Blue to deliver computing resources the way a power utility doles out electricity. The company will spend not just $10 billion to deliver on-demand computing, but an estimated $10 billion per year over the next several years. That figure includes a significant marketing and sales education budget as well as acquisitions, R&D and build outs of hosting facilities. Hewlett-Packard is also looking to carve out a share of the utility computing market with its Planetary Scale Computing initiative. Dan Farber, 2002 [w.4]
Scheme created by the authors.
There is an ever-growing list of providers that have successfully used cloud-like architectures with little or no centralized infrastructure or billing systems, such as the peer-to-peer network BitTorrent. [b. 2]
Part 2
Cloud Computing
The path to Cloud Computing
Everyone seems to have a different definition. Some analysts and vendors define cloud computing narrowly as an updated version of utility computing: basically virtual servers available over the Internet. Others go very broad, arguing that anything you consume outside the firewall is in the cloud. [w.1] Let us look at some of these definitions. Wikipedia defines cloud computing, basically, as delivering computing at the Internet scale. IBM has defined it as follows: "A cloud is a pool of virtualized computer resources. A cloud can host a variety of different workloads, including batch-style backend jobs and interactive and user-facing applications."
A client is a computer program that runs on the user's computer and displays the information sent by the server. It basically has two functions:
– Sending requests to the server, with information provided by the user.
– Receiving the server's information and showing it to the user in a specific, established form.
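As a minimal sketch of these two functions, the following snippet sends a request carrying user-provided information and then displays the server's reply; httpbin.org is a public echo service used here only for demonstration, and any HTTP server would do.

```python
# A minimal client: send a request with user data, show the server's reply.
from urllib import parse, request

query = parse.urlencode({"q": "cloud computing"})        # information from the user
with request.urlopen("https://httpbin.org/get?" + query) as resp:
    body = resp.read().decode("utf-8")                   # information from the server
print(body)                                              # shown in an established form
```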
"Cloud Computing is one of the major technologies predicted to revolutionize the future of computing." [b. 3] The Utility Computing system by HP, Amazon and many others mentioned before established the basis of a computing resources industry. But what we know as Cloud Computing goes further beyond that. More than simple resources providers, Cloud providers establish online platforms with which the user interacts through a client, usually the internet browser itself. On one hand, the cloud platforms have many advantages for ordinary users: – Everything is located in a server that can be accessed remotely from any location and at any time. – The interaction with those remote servers is multi-platform, that means that the client can be used in a wide range of devices with internet connection. – Cloud platforms offer its own applications that allow the users to create the documentation directly on the cloud, i.e., on the server. – Depending on the cloud system, the user can customize or even create its own applications to perform specific actions. – Security issues and hardware maintenance are in charge of the Cloud provider, so that consumers can forget about those questions.
On the other hand, the model of delivering IT as a service has several advantages for enterprises:
– It enables businesses to dynamically adapt their computing infrastructure to meet the rapidly changing requirements of the environment.
– It greatly reduces the complexity of IT management, enabling a more pervasive use of IT.
– It is an attractive option for small and medium enterprises to reduce upfront investments, enabling them to use sophisticated business intelligence applications that only large enterprises could previously afford.
The different kinds of Cloud
As seen before, Cloud Computing is not a clear-cut concept. In the next chapters, we will approach the actual characteristics of clouds by understanding the three basic types or levels:
– Infrastructure as a Service (IaaS)
– Platform as a Service (PaaS)
– Software as a Service (SaaS)
These three cloud service types each focus on a specific layer of a computer's runtime stack: the hardware (IaaS), the system software or platform (PaaS) and the applications (SaaS). [b.3, ch.1]
Infrastructure as a Service
The IaaS model is about providing compute and storage resources as a commercial service. [b.3] It is the heir of HP's and Microsoft's first attempts at utility computing. It allows users to run any application on the cloud supplier's hardware.
Scheme from Moving to the Cloud [b.3]
The cloud user can manage the virtual resources as desired, including installing any desired OS, software and applications; IaaS therefore offers complete control over the software stack. Well-known IaaS platforms include Amazon EC2, Rackspace and RightScale. Traditional vendors such as HP, IBM and Microsoft offer solutions that can be used to build private IaaS. Amazon Web Services is basically divided into two main pillars: Storage as a Service and Compute as a Service. Amazon's storage services are Amazon Simple Storage Service (S3), based on HTTP, and Amazon SimpleDB, a key-value store. The Compute as a Service part is Amazon Elastic Compute Cloud (EC2), which provides computing resources and the possibility of associating them with the storage.
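As a hedged illustration of Storage as a Service, the sketch below stores and retrieves an object in Amazon S3 using the boto3 SDK (a current Python library, not the one available in 2012). It assumes AWS credentials are already configured on the machine, and the bucket name is hypothetical.

```python
# Sketch: Storage as a Service with Amazon S3 via boto3.
# Assumes AWS credentials are configured; the bucket name is hypothetical.
import boto3

s3 = boto3.client("s3")
s3.put_object(
    Bucket="example-report-bucket",   # hypothetical, must already exist
    Key="drafts/report.txt",
    Body=b"Document stored in the cloud rather than on a local disk.",
)
# Reading it back works from any machine with the same credentials.
obj = s3.get_object(Bucket="example-report-bucket", Key="drafts/report.txt")
print(obj["Body"].read().decode("utf-8"))
```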
The 4 levels of IaaS
Server virtualization is the key that enables IaaS. Running an application inside a virtual machine breaks the link between the physical hardware and the software. Using this method, it is possible to physically move a running application from its original hardware to another machine simply by recreating its virtual hardware on both sides. This allows a huge increase in resource efficiency through load balancing. It is also possible to 'divide' a physical server into several virtual units that work independently for different users and applications without any interference.
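A toy model can make this idea concrete. In the sketch below, invented purely for illustration, each physical server hosts several virtual units, and a scheduler places every new virtual machine on the least-loaded host: a simple form of load balancing.

```python
# Toy model: one physical server hosts several isolated virtual machines,
# and a scheduler places each new VM on the least-loaded host.
class PhysicalServer:
    def __init__(self, name, cpu_cores):
        self.name = name
        self.cpu_cores = cpu_cores
        self.vms = []  # (vm_name, cores) currently placed on this host

    def free_cores(self):
        return self.cpu_cores - sum(cores for _, cores in self.vms)

def place_vm(hosts, vm_name, cores_needed):
    """Pick the host with the most free capacity, mimicking a load balancer."""
    host = max(hosts, key=PhysicalServer.free_cores)
    if host.free_cores() < cores_needed:
        raise RuntimeError("no capacity left in the pool")
    host.vms.append((vm_name, cores_needed))
    return host.name

hosts = [PhysicalServer("host-a", 16), PhysicalServer("host-b", 16)]
for i in range(6):
    print(f"vm-{i} ->", place_vm(hosts, f"vm-{i}", cores_needed=4))
```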
There are 4 levels of this structure, ranging from a fully private cloud to the use of public servers on which you run virtual server instances:
– Private cloud: the servers are owned and maintained by the consumer, i.e., the user keeps the servers in his own physical space.
– Dedicated hosting: the servers are owned and maintained by a service provider, but specific physical servers are completely dedicated to a given consumer.
– Hybrid hosting: the critical servers are owned by the user, while the rest, which need scalability, are owned by the service provider.
– Cloud hosting: the servers are fully owned and maintained by the service provider, and the user does not know where his data is physically located.
Platform as a Service
PaaS includes services to develop, test, deploy, host and manage applications, supporting the whole application development life cycle. [b.2, ch.2.5]
The PaaS model consists of providing a system stack or platform for application deployment as a service. [b. 3] It provides the environment and tools for creating new online applications. This offers some flexibility and control over the system, but not as much as the IaaS model.
Scheme from Moving to the Cloud [b.3]
The hardware, as well as any mapping of hardware to virtual resources such as virtual servers, is controlled by the PaaS provider. The user can configure and build on top of this middleware, for example by defining a database and developing applications.
Windows Azure and Google App Engine are well-known PaaS platforms.
Google App Engine.
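To give an idea of how little code a PaaS user writes, here is a sketch of the classic Google App Engine Python "hello world", built on the webapp2 framework of that era; the platform itself takes care of servers, routing and scaling. It is illustrative only: current App Engine runtimes use standard frameworks such as Flask instead.

```python
# Sketch of a classic (Python 2 era) Google App Engine application.
# The developer supplies only this handler; the PaaS provisions everything else.
import webapp2

class MainPage(webapp2.RequestHandler):
    def get(self):
        # Respond to HTTP GET on the mapped route.
        self.response.headers["Content-Type"] = "text/plain"
        self.response.write("Hello from a PaaS-hosted application")

# The platform serves this WSGI app and scales it automatically.
app = webapp2.WSGIApplication([("/", MainPage)], debug=True)
```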
Software as a Service
The traditional model of software distribution, in which software is purchased for and installed on personal computers, is sometimes referred to as software as a product. Software as a Service is a software distribution model in which applications are hosted by a vendor or service provider and made available to customers over the Internet. [b.2, ch.2.6]
The SaaS model provides the complete application as a service. In fact, any application that can be accessed using a web browser can be considered SaaS. [b. 3] SaaS runs existing online applications, which can be free or subscription-based, are accessible from any computer and support collaborative work.
Scheme from Moving to the Cloud [b.3]
The SaaS provider controls all the layers apart from the application; users can log in to the SaaS service to use and configure the application.
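Many SaaS applications can also be driven programmatically. The sketch below is entirely hypothetical: the endpoint, token and document fields are invented, and it only illustrates the general pattern of logging in and then using the hosted application over HTTP.

```python
# Hypothetical sketch of consuming a SaaS application over HTTP:
# authenticate, then ask the hosted application to create a document.
import json
from urllib import request

API_BASE = "https://api.example-saas.com/v1"  # hypothetical SaaS endpoint
TOKEN = "user-session-token"                  # would be obtained when logging in

req = request.Request(
    API_BASE + "/documents",
    data=json.dumps({"title": "Quarterly report"}).encode("utf-8"),
    headers={"Authorization": "Bearer " + TOKEN,
             "Content-Type": "application/json"},
)
with request.urlopen(req) as resp:            # the provider runs everything else
    print(resp.read().decode("utf-8"))
```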
The expansion of SaaS
An important use of cloud computing is sharing data. This sharing can be either:
– with friends and colleagues, or
– just for personal use across multiple devices.
It is not uncommon for a consumer to own more than one computing device. In those cases, using a cloud service just to upload a document to a secure place and then accessing it anytime and anywhere is very valuable. If the cloud service also lets others modify and update the shared files, it becomes a very useful tool for collaboration as well. [b.3, ch.4]
SaaS is becoming an increasingly prevalent delivery model as the underlying technologies that support web services mature. It is also often associated with a pay-as-you-go subscription licensing model. [b.2, ch.2.6] Many types of software are well suited to the SaaS model (e.g., accounting, human resources, conferencing, web content management). The distinction between SaaS and earlier applications delivered over the Internet is that SaaS solutions were developed specifically to work within a web browser. The options offered by SaaS platforms are countless, from the most prominent ones (Google Drive, Pixlr, AutoCAD WS) to thousands of specific browser applications by independent developers that have achieved great success (YouTube, WhatsApp, etc.).
Part 3
Practical Case Study
Autodesk 360 case study
Autodesk is an enterprise that focuses on the development of 3D design software for architecture, engineering and other industries. In recent years, Autodesk 360 has been offering online services at different levels. In this example we are going to use the Rendering tool to create a rendered image from a previously uploaded .dwg file (the AutoCAD format) generated by us. This tool can also be run directly from the desktop AutoCAD program in the latest version, 2013. We can classify this tool as a utility computing service, because we are using computing resources from a remote server. There is no strict classification of all web service tools, but this one fits the SaaS kind, as the user's control is limited to using the tool and its settings. From a technical point of view, the service provider performs this action by computing the rendering on virtualized servers that can be used by many users at the same time.
1. Create an AutoCAD 2012 document that contains the 3D shapes, materials, textures, lights, views and any other elements that take part in the desired rendered image.
2. Save it as a .dwg file type.
3. Enter the Autodesk 360 Rendering Tool and log in with a free account.
4. You will access a web desktop that offers some options for rendering files. Use the Select File option to render from AutoCAD.
5. Select the file in an explorer in order to upload it.
6. The online tool will upload the file and validate the type.
7. A new option menu opens with image and modeling settings. It also calculates the time the process will take, as well as the time it would take using the local computer's resources.
8. After validating, the web tool will take some time to perform the actions.
9. Once rendered, the web tool offers some options, such as downloading the image, sending it by e-mail or sharing it on Facebook.
10. You can preview it directly using the web tool. This is the final rendered image; the render can be seen publicly on the Autodesk 360 Facebook profile.
FOLDING@home case study
Volunteer Computing is a type of distributed computing in which computer owners all around the world donate their computing resources, processing power or storage, to perform the operations. FOLDING@home, by Stanford University, is a project for disease research that simulates protein folding, computational drug design and other types of molecular dynamics. The university's servers manage the whole computing process while sending parts of the computing operations to the donors' computers. The operations performed using the user's CPU resources run in the background, so the user has no access to the process itself. The project uses the idle processing resources of thousands of personal computers owned by volunteers who have installed the software on their systems.
1. The project website allows users to download a client that performs the actions requested by the server on the user's CPU and sends back the results.
2. After downloading the installation file, install the client on your hard disk drive.
3. After the installation is complete, run the application FAHControl.exe. The client's main window has a menu with different options to manage the processes and settings.
4. Start the operations by clicking the Fold button. The process will start running in the background, and the CPU resource consumption increases rapidly.
5. To temporarily stop the processes and the CPU resource consumption, click Pause.
6. The client can also show you an overview of the operations your CPU is performing.
7. The user can also hide the client while it is running by using the Hide button. Clicking Finish or Quit causes the client to stop once the ongoing operations are completed.
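The overall behaviour of such a client can be summarised in a toy loop like the one below. It is not FOLDING@home's real protocol: the work-unit server URL and the "compute" step are placeholders for illustration.

```python
# Toy sketch of a volunteer-computing client loop:
# fetch a work unit, crunch it locally, send the result back, repeat.
import time
import urllib.request

WORK_SERVER = "https://example.org/workunit"  # hypothetical endpoint

def compute(work_unit: bytes) -> bytes:
    """Stand-in for the heavy simulation performed on the donor's CPU."""
    return bytes(reversed(work_unit))

while True:
    with urllib.request.urlopen(WORK_SERVER) as resp:   # 1. fetch a work unit
        unit = resp.read()
    result = compute(unit)                              # 2. crunch it locally
    urllib.request.urlopen(WORK_SERVER, data=result)    # 3. send the result back
    time.sleep(60)                                      # then ask for more work
```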
Google Drive case study
Google Drive is a popular cloud-based application. It allows users to create, store and edit documents online, and enables teams of people to share and work together on a single document. It consists of a platform, accessed through the web browser, that gives the user access to his files. Those files are stored on the remote servers of the service provider. Server virtualization, explained previously, allows those public servers to be shared by many users at the same time and independently. These operations are automated, so human action is not needed for most of them. Apart from the technical aspects, users can perform many actions while working on this platform, including the following:
1. This is the Google Drive user interface that can be accessed with a web browser using a free Google account. It allows users to see their files and to access them.
2. The files and folders can be either created using Google Docs or uploaded from the local computer onto Google Docs. This screen shows a newly created document being edited with the Google Drive text processing tool.
3. It allows the owner of a certain file or folder to assign specific properties to it, such as sharing it with other users and allowing them to edit it or just view it.
4. Google Docs provides several default options to create new documents, such as a word processor, a spreadsheet or a presentation package. It also offers the possibility of natively integrating a wide variety of applications from the Chrome Web Store, or even creating and running the code of your own application.
5. Integrating web applications into Google Drive allows users to run or edit specific file types, or to save newly created files from online applications, all directly in the user's Drive.
6. Finally, there is also the possibility to download the Google Drive client for the different operating systems.
7. After installing, the user must log in using his Google account to allow synchronization.
8. During the installation process, you can configure the client to synchronize in a certain way, or to synchronize only some of the folders.
9. This desktop tool provides a simple drag-and-drop interface to work faster and more easily with your documents.
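For completeness, the same kind of upload can be done programmatically through the Google Drive v3 API with the google-api-python-client library. The sketch below assumes OAuth credentials have already been obtained through a real authorization flow; the token shown is only a placeholder.

```python
# Sketch: uploading a file to Google Drive via the v3 API.
# `creds` must come from a real OAuth flow; the token here is a placeholder.
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

creds = Credentials(token="placeholder-access-token")  # hypothetical token
service = build("drive", "v3", credentials=creds)

media = MediaFileUpload("notes.txt", mimetype="text/plain")
created = service.files().create(
    body={"name": "notes.txt"},   # metadata as the file will appear in Drive
    media_body=media,
    fields="id",
).execute()
print("Uploaded file id:", created["id"])
```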
Conclusions
The enormous expansion that cloud platforms have experienced over the last five years has occurred at the same time as the popularization, in many countries, of mobile Internet access and fixed broadband connections. Furthermore, we must understand that this process, pursued especially since the 1990s to reach the full connectivity of computers, has not advanced in linear and simple steps. Besides the technical aspects, there have been a number of obstacles that have not yet been overcome: on one hand, the issue of security on the network; on the other, the difficulty of getting the many leading companies to agree on the foundations that must be laid for the development of Internet technologies. Perhaps in the next decade a high degree of integration of web services will be reached, leading to the abandonment of many of the bases on which modern computing sits:
– A commercial shift, from conventional software distribution to increasingly widespread pay-as-you-go applications, or low-priced applications with a broad reach.
– A massive move to online storage systems, accessible from anywhere in the world.
What is clear is that the last five years have been a period in which the expansion of web technologies to millions of new users has served as a trial-and-error method for establishing procedures for the development of these services.
Information sources
Bibliography
1. HWANG, Kai; FOX, Geoffrey C.; DONGARRA, Jack J. Distributed and Cloud Computing: From Parallel Processing to the Internet of Things. Waltham, Massachusetts: Morgan Kaufmann, 2012. ISBN 978-0-12-385880-1.
2. RITTINGHOUSE, John W.; RANSOME, James F. Cloud Computing: Implementation, Management, and Security. Boca Raton, Florida: CRC Press, 2010. ISBN 978-1-4398-0680-7.
3. SITARAM, Dinkar; MANJUNATH, Geetha. Moving to the Cloud: Developing Apps in the New World of Cloud Computing. Waltham, Massachusetts: Syngress, 2012. ISBN 978-1-59749-725-1.
4. GARFINKEL, Simson. Architects of the Information Society: 35 Years of the Laboratory for Computer Science at MIT. MIT Press, 1999. ISBN 978-0-262-07196-3.
Webgraphy
1. KNORR, Eric; GRUMAN, Galen. What cloud computing really means. At InfoWorld.com.
2. MYERSON, Judith. Cloud computing versus grid computing: Service types, similarities and differences, and things to consider. At ibm.com.
3. WIKIPEDIA.org. In our research, we have tried to make the most of this large online source of information, which has the virtue of being very clearly organized and, at the same time, very thorough on Information Technology issues.
4. FARBER, Dan. On-demand computing: What are the odds? 2002.