Page 1:
CLOUD COMPUTING:
CONCEPTS,
TECHNOLOGIES AND
BUSINESS IMPLICATIONS
Page 2:
OUTLINE OF THE TALK
+ ... processors, parallel computing models, big-data storages...
+ Data and Computing models: MapReduce
+ Questions and Answers
Wipro Chennai 2011
Page 3:
SPEAKERS’ BACKGROUND IN
CLOUD COMPUTING
Page 4:
Introduction: A Golden Era in
Computing
+ Powerful multi-core processors
+ General purpose ...
+ Explosion of ...
+ ... software methodologies
+ Virtualization leveraging the powerful hardware
+ Wider bandwidth for communication
Page 5:
CLOUD CONCEPTS,
ENABLING-
TECHNOLOGIES, AND
MODELS: THE CLOUD
CONTEXT
Page 6:
EVOLUTION OF INTERNET
COMPUTING
Page 7:
Top Ten Largest Databases
Page 8:
CHALLENGES
Alignment with the needs of the business / user / non-
computer specialists / community and society
Need to address the scalability issue: large scale data,
high performance computing, automation, response
time, rapid prototyping, and rapid time to production
Need to effectively address (i) ever shortening cycle of
obsolescence, (ii) heterogeneity and (iii) rapid changes
in requirements.
Transform data from diverse sources into intelligence
and deliver that intelligence to the right people, users and systems
What about providing all this in a cost-effective
manner?
Page 9:
ENTER THE CLOUD
Cloud computing is Internet-based computing,
whereby shared resources, software and information are
provided to computers and other devices on-demand,
like the electricity grid.
Cloud computing is the culmination of numerous
attempts at large scale computing with seamless access
to virtually limitless resources.
* on-demand computing, utility computing, ubiquitous
computing, autonomic computing, platform computing,
edge computing, elastic computing, grid computing, ...
Page 10:
“GRID TECHNOLOGY: A SLIDE FROM MY
PRESENTATION
TO INDUSTRY (2005)
Emerging enabling technology.
Natural evolution of distributed systems and the Internet.
Middleware supporting network of systems to facilitate
sharing, standardization and openness.
Infrastructure and application model dealing with sharing
of compute cycles, data, storage and other resources.
Publicized by prominent industries as on-demand
computing, utility computing, etc.
Move towards delivering “computing” to masses similar to
other utilities (electricity and communication).”
Page 11:
IT IS A CHANGED WORLD
NOW
+ Explosive growth in applications: biomedical informatics, space exploration, business analytics,
web 2.0 social networking: YouTube, Facebook
+ Extraordinary rate of digital content consumption: digital gluttony: Apple iPhone, iPad, Amazon
Kindle
+ ... (virtualization)
+ ... (Google, Hadoop), multi-core, wireless and mobile
+ You simply cannot manage this complex situation with your traditional IT infrastructure:
Page 12:
ANSWER: THE CLOUD
COMPUTING?
+ software (SaaS),
+ infrastructure (IaaS),
+ Services-based application programming interface (API)
+ A cloud computing environment can provide one or more of these ...
* An organization could also maintain a private cloud and/or use both public and private clouds.
Page 13:
ENABLING TECHNOLOGIES
Models: S3, BigTable, BlobStore, ...
Multi-core architectures
64-bit processor
Page 14:
COMMON FEATURES OF CLOUD
PROVIDERS
+ Development Environment: IDE, SDK, ...
+ Production Environment
+ Simple storage
+ Table: <key, value>
+ Drives
+ Management Console and Monitoring tools
& multi-level security
Page 15:
WINDOWS AZURE
Enterprise-level on-demand capacity builder
Fabric of cycles and storage available on-request for a
cost
You have to use Azure API to work with the infrastructure
offered by Microsoft
Significant features: web role, worker role, blob storage,
table storage and drive storage
Page 16:
AMAZON EC2
Amazon EC2 is one large complex web service.
EC2 provides an API for instantiating computing instances
with any of the operating systems supported.
It can facilitate computations through Amazon Machine
Images (AMIs) for various other models.
Signature features: S3, Cloud Management Console,
MapReduce Cloud, Amazon Machine Image (AMI)
Excellent distribution, load balancing, cloud monitoring tools
Page 17:
GOOGLE APP ENGINE
This is more of a web interface to a development environment
that offers a one-stop facility for the design, development and
deployment of applications in Java, Go and Python
Google offers the same reliability, availability and scalability,
on par with Google’s own applications
Interface is software programming based
Comprehensive programming platform irrespective of the size
(small or large)
Signature features: templates and appspot, excellent
monitoring and management console
Page 18:
DEMOS
+ Amazon AWS: EC2 & S3 (among the many infrastructure
services)
* Google App Engine
* MS Visual Studio Azure development and production environment
Page 19:
CLOUD
PROGRAMMING
MODELS
Page 20:
THE CONTEXT: BIG-DATA
+ Data mining huge amounts of data collected in a wide range of
domains from astronomy to healthcare has become essential for
planning and performance.
+ We are in a knowledge economy.
* Data is an important asset to any organization
* Discovery of knowledge; Enabling discovery; annotation of data
+ Complex computational models
+ No single environment is good enough: need elastic, on-demand
capacities
+ We are looking at newer
+ Programming models, and
* Supporting algorithms and data structures.
Page 21:
GOOGLE FILE SYSTEM
+ The Internet introduced a new challenge in the form of web logs and web
crawler data: large-scale, “peta-scale” data
+ But observe that this type of data has a characteristic that sets it
apart from transactional or “customer order” data:
it is “write once read many (WORM)”;
* Privacy-protected healthcare and patient information;
* Historical financial data;
* Other historical data
* Google exploited this characteristic in its Google File System
(GFS)
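The write-once-read-many access pattern can be illustrated with a few lines of Python. This is only a sketch of the pattern, not of how GFS is implemented; the class `WriteOnceStore` and its methods are hypothetical names introduced here.

```python
class WriteOnceStore:
    """In-memory stand-in for a write-once-read-many (WORM) store.

    Hypothetical illustration: a blob can be written exactly once and
    read any number of times; updates in place are rejected.
    """

    def __init__(self) -> None:
        self._blobs = {}

    def write(self, name: str, data: bytes) -> None:
        # WORM rule: refuse to overwrite something that already exists.
        if name in self._blobs:
            raise PermissionError(f"{name!r} was already written")
        self._blobs[name] = bytes(data)

    def read(self, name: str) -> bytes:
        # Reads are unrestricted and never change the stored data.
        return self._blobs[name]


store = WriteOnceStore()
store.write("weblog-day-1", b"GET /index.html 200")
first_read = store.read("weblog-day-1")
second_read = store.read("weblog-day-1")
```

Because the data never changes after the first write, a store like this can be copied and cached freely, which is the property the slide says Google exploited.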
Page 22:
WHAT IS HADOOP?
+ At Google, MapReduce operations are run on a special file
system called Google File System (GFS) that is highly
optimized for this purpose.
+ GFS is not open source.
+ Doug Cutting and others at Yahoo! reverse engineered the GFS
and called it Hadoop Distributed File System (HDFS).
+ The software framework that supports HDFS, MapReduce and
other related entities is called the project Hadoop or simply
Hadoop.
+ This is open source and distributed by Apache.
Page 23:
FAULT TOLERANCE
Failure is the norm rather than the exception.
An HDFS instance may consist of thousands of server machines,
each storing part of the file system’s data.
With such a huge number of components, each with a non-trivial
probability of failure, there is always some component that is
non-functional.
Detection of faults and quick, automatic recovery from them is a
core architectural goal of HDFS.
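The claim that some component is always down can be checked with a little arithmetic. The sketch below assumes that failures are independent and that every machine has the same failure probability; both are simplifying assumptions made here, and the function name and numbers are hypothetical.

```python
def prob_any_failure(num_components: int, p_fail: float) -> float:
    """Probability that at least one of num_components is down.

    Assumes independent failures with identical probability p_fail,
    so P(at least one down) = 1 - (1 - p_fail) ** num_components.
    """
    if num_components < 0:
        raise ValueError("num_components must be non-negative")
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    return 1.0 - (1.0 - p_fail) ** num_components


# Hypothetical figure: each machine is down 0.1% of the time.
for n in (1, 100, 1000, 10000):
    print(f"{n:>6} machines: P(at least one down) = {prob_any_failure(n, 0.001):.4f}")
```

Under these assumptions a cluster of 1000 machines already has a failed component roughly 63% of the time, which is why automatic detection and recovery have to be built into the file system.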
Page 24:
Page 25:
HADOOP DISTRIBUTED FILE
SYSTEM
HDFS Server Master node
Name Nodes
Page 26:
WHAT IS MAPREDUCE?
+ MapReduce is a programming model Google has used successfully
in processing its “big-data” sets (~20 petabytes per day)
* A map function extracts some intelligence from raw data.
* A reduce function aggregates, according to some guides, the
data output by the map.
* Users specify the computation in terms of a map and a reduce
function,
* The underlying runtime system automatically parallelizes the
computation across large-scale clusters of machines, and
* The underlying system also handles machine failures, efficient
communications, and performance issues.
-- Reference: Dean, J. and Ghemawat, S. 2008. MapReduce: simplified data
processing on large clusters. Communications of the ACM 51, 1 (Jan. 2008),
107-113.
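The division of labor between user code and runtime can be sketched in plain Python. This is a sequential stand-in written for illustration, not Google's or Hadoop's implementation: a real runtime would parallelize the phases and handle failures. The log format and all names (`map_reduce`, `status_mapper`, `count_reducer`) are assumptions made here.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Sequential sketch of a MapReduce run: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):  # map phase
            groups[key].append(value)      # grouping step ("shuffle")
    # reduce phase: one reducer call per distinct key
    return {key: reducer(key, values) for key, values in groups.items()}


def status_mapper(log_line):
    """User-supplied map: pull the status code out of a raw log line.

    Assumes a hypothetical "METHOD PATH STATUS" line format.
    """
    parts = log_line.split()
    if len(parts) >= 3:
        yield parts[2], 1


def count_reducer(key, values):
    """User-supplied reduce: aggregate the values emitted for one key."""
    return sum(values)


LOG_LINES = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 200",
    "GET /index.html 200",
]

status_counts = map_reduce(LOG_LINES, status_mapper, count_reducer)
print(status_counts)
```

The user writes only `status_mapper` and `count_reducer`; everything inside `map_reduce` is what the underlying system would take over.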
Page 27:
CLASSES OF PROBLEMS
“MAPREDUCABLE”
+ Benchmark for comparing: Jim Gray’s challenge on data-intensive
computing. Ex: “Sort”
+ Google uses it for wordcount, adwords, pagerank, indexing data.
+ Simple algorithms such as grep, text-indexing, reverse indexing
+ Bayesian classification: data mining domain
+ Facebook uses it for various operations: demographics
+ Financial services use it for analytics
+ Astronomy: Gaussian analysis for locating extra-terrestrial
objects.
+ Expected to play a critical role in semantic web and web 3.0
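Reverse indexing, one of the problems listed above, fits the map/reduce shape well. The following is a minimal single-machine sketch with hypothetical documents and function names; it shows the shape of the computation, not a production indexer.

```python
from collections import defaultdict


def map_document(doc_id, text):
    """Map: emit (word, doc_id) once for each distinct word in a document."""
    for word in set(text.lower().split()):
        yield word, doc_id


def reduce_postings(word, doc_ids):
    """Reduce: turn the doc ids seen for one word into a sorted posting list."""
    return sorted(set(doc_ids))


def build_inverted_index(documents):
    """Run map over every document, group by word, then reduce each group."""
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for word, emitted_id in map_document(doc_id, text):
            grouped[word].append(emitted_id)
    return {word: reduce_postings(word, ids) for word, ids in grouped.items()}


# Hypothetical three-document corpus.
DOCS = {
    "d1": "cloud computing on demand",
    "d2": "grid computing and cloud storage",
    "d3": "storage on demand",
}

index = build_inverted_index(DOCS)
print(index["cloud"])
```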
Page 28:
Page 29:
MAPREDUCE ENGINE
MapReduce requires a distributed file system and an engine
that can distribute, coordinate, monitor and gather the results.
Hadoop provides that engine through HDFS (the file system we
discussed earlier) and the JobTracker + TaskTracker system.
JobTracker is simply a scheduler.
TaskTracker is assigned a Map or Reduce (or other operations);
the Map or Reduce runs on a node, and so does the TaskTracker; each
task is run on its own JVM on a node.
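The scheduling role described above can be mimicked on one machine. This toy sketch is not Hadoop's code: worker threads stand in for TaskTrackers (Hadoop runs each task in its own JVM on a cluster node), and the retry-on-failure policy and the name `run_job` are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(task_fn, splits, num_trackers=3, max_attempts=2):
    """Toy scheduler: hand each input split to a worker and gather results.

    Worker threads play the part of TaskTrackers. A failed attempt is
    retried up to max_attempts times (an assumed policy).
    """
    def attempt(split):
        last_error = None
        for _ in range(max_attempts):
            try:
                return task_fn(split)
            except Exception as exc:  # treat any error as a failed attempt
                last_error = exc
        raise RuntimeError(f"task failed after {max_attempts} attempts") from last_error

    with ThreadPoolExecutor(max_workers=num_trackers) as pool:
        # pool.map returns results in the same order as the input splits
        return list(pool.map(attempt, splits))


# Each split is summed by a separate task; the partial sums are then combined.
partial_sums = run_job(sum, [[1, 2, 3], [4, 5], [6]])
total = sum(partial_sums)
print(partial_sums, total)
```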
Page 30:
DEMOS
* Word count application: a simple foundation for text-
mining; with a small text corpus of inaugural speeches
by US presidents
* Graph analytics is the core of analytics involving linked
structures (ab
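The word count demo can be approximated in a few lines. This sketch counts words per chunk (map) and then merges the partial counts (reduce); the two-line corpus is a made-up placeholder, not text from the inaugural speeches used in the talk, and the function names are hypothetical.

```python
import re
from collections import Counter


def map_count(chunk):
    """Map: count the words in one chunk of text."""
    return Counter(re.findall(r"[a-z']+", chunk.lower()))


def reduce_counts(partials):
    """Reduce: merge the per-chunk counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Placeholder corpus; each string stands for one input split.
CORPUS = [
    "the people and the nation",
    "a nation of the people",
]

word_counts = reduce_counts(map_count(chunk) for chunk in CORPUS)
print(word_counts.most_common(3))
```

Counting inside each chunk before merging keeps the amount of data passed to the reduce step small, which matters once the chunks live on different machines.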
Page 31:
A CASE-STUDY IN
BUSINESS:
CLOUD STRATEGIES
Page 32:
PREDICTIVE QUALITY PROJECT
OVERVIEW
Identify special causes that relate to bad outcomes for the quality-
related parameters of the products and visually inspected defects
Complex upstream process conditions and dependencies making the
problem difficult to solve using traditional statistical / analytical
methods
Determine the optimal process settings that can increase the yield
and reduce defects through predictive quality assurance
Potential savings are huge, as the costs of rework and rejects are very high
Page 33:
WHY CLOUD COMPUTING FOR
THIS PROJECT
Well-suited for incubation of new technologies
+ Semantic technologies still evolving
+ Use of Prototyping and Extreme Programming
+ Server and Storage requirements not completely known
Technologies used (TopBraid, Tomcat) not part of emerging or
core technologies supported by corporate IT
Scalability on demand
Development and implementation on a private cloud
Page 34:
PUBLIC CLOUD VS. PRIVATE CLOUD
Rationale for Private Cloud:
+ Security and privacy of business data were a big concern
+ Potential for vendor lock-in
+ SLAs required for real-time performance and reliability
* Cost savings of the shared model achieved because of the
multiple projects involving semantic technologies that the
company is actively developing
Page 35:
CLOUD COMPUTING FOR THE
ENTERPRISE
WHAT SHOULD IT DO
Revise cost model to utility-based computing: CPU/hour,
GB/day etc.
Include hidden costs for management, training
Different cloud models for different applications
Use for prototyping applications and learn
Link it to current strategic plans for Services-Oriented
Architecture, Disaster Recovery, etc.
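A utility-based cost model of the kind recommended above might be sketched as follows. All rates and usage figures are hypothetical placeholders, not any provider's prices, and the names `UtilityRates` and `monthly_cost` are introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UtilityRates:
    """Hypothetical unit prices; substitute a provider's published rates."""
    per_cpu_hour: float
    per_gb_day: float


def monthly_cost(rates, cpu_hours, gb_stored, days=30, hidden_costs=0.0):
    """Utility-style bill: metered compute + metered storage + hidden costs.

    hidden_costs covers items such as management and training that do not
    appear on the provider's invoice.
    """
    compute = cpu_hours * rates.per_cpu_hour
    storage = gb_stored * days * rates.per_gb_day
    return compute + storage + hidden_costs


# Placeholder numbers: one server-equivalent running all month, 500 GB stored.
rates = UtilityRates(per_cpu_hour=0.10, per_gb_day=0.005)
estimate = monthly_cost(rates, cpu_hours=720, gb_stored=500, hidden_costs=400.0)
print(f"Estimated monthly cost: {estimate:.2f}")
```

In this made-up example the hidden costs are larger than the metered charges, which is the reason to include them in the model from the start.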
Page 36:
REFERENCES & USEFUL LINKS
Amazon AWS: http://aws.amazon.com/free/
AWS Cost Calculator: http://calculator.s3.amazonaws.com/calc5.html
...tz_MLG2010.pdf
For miscellaneous information: http://www.cse.buffalo.edu/~bina
Page 37:
SUMMARY
We illustrated cloud concepts and demonstrated the cloud
capabilities through simple applications
We discussed the features of the Hadoop File System and
MapReduce for handling big-data sets.
We also explored some real business issues in the adoption of
the cloud.
Cloud is indeed an impactful technology that is sure to
transform computing in business.