Page 1:
XQuery+Cloud
Daniela Florescu
Oracle
Page 2:
My personal history
* PhD in object-oriented query processing/optimization
* Loved the database theory and practice (relational, object-oriented, semi-structured)
* Got really interested in it, and thought it was important...
* ...then I joined Oracle.
Page 3:
... after 4 years in Oracle
* Applications are the really important issue
  * How to develop, deploy, maintain, evolve, customize
  * Databases are a side effect
    * Customers are educated to think they need them
    * DBs are only useful as part of a general application architecture
* Customer is the king
  * If they don't make $$$, you don't either
* Customers are [illegible] building apps right now
Page 4:
Agenda
* Current pain in building apps
* What can XQuery do for customers?
* What can the Cloud do for customers?
* How do we put them together?
* How do competitors solve the problem?
* Some open research problems
Page 5:
Imagine I am a customer: I need to build a new app
1. How much does it cost?
  * Cost of developing the app (salaries)
  * Cost of deploying the app
    * Hardware, software licenses, maintenance
  * Loss of income because of mis-provisioning
  * Do I have to pay up front?
  * Is the cost proportional to the income?
Page 6:
Other questions?
2. How fast can I deliver the app?
  * Quicker on the market than my competitors?
3. How good is the application?
  * More customers for the app => more income
  * Acceptable operational characteristics?
4. Can I adapt if something changes?
  * Operational characteristics
  * Functionality
5. Can I customize the same app for a different vertical / different set of customers?
6. Is there a risk in the technology?
Page 7:
Customers' concerns
* Cost
* Time to market
* Flexibility
* Customizability
* Sustainability
* Risk
Often a tradeoff
Page 8:
Classes of customers
* Enterprise (e.g. Bank of America)
  * [priorities illegible]
* Government agency (e.g. DoD)
  * Cost
  * Time to market (?)
  * Customizability
* Consumer (e.g. Craigslist)
  * Time to market
  * Flexibility
  * Customizability
Page 9:
Typical enterprise app stack
* Communication (XML, REST, WS)
* Application logic (Java, C#)
* Database (SQL)
Vendors: Oracle, IBM, SAP, Microsoft
Page 10:
Cost? $$$$!
* Cost of developing the app
* Cost of deploying the app (hardware, software licenses, maintenance)
* Loss of income because of mis-provisioning
* Do I have to pay up front?
* Is the cost proportional to the income?
[Figure: Communication (XML, REST, WS) / Application logic (Java, C#) / Database (SQL)]
Page 11:
Time to market? Years!
* How fast can I deliver the app?
[Figure: Communication (XML, REST, WS) / Application logic (Java, C#) / Database (SQL)]
Page 12:
Flexibility? Customizability?
* Can I adapt to any changes?
  * Operational characteristics
* Can I customize it to a different vertical?
[Figure: Communication (XML, REST, WS) / Application logic (Java, C#) / Database (SQL)]
* Oracle experience: for every $1M of Oracle app licenses, customers [illegible]
* SAP experience even worse :-)
Page 13:
[Title illegible]
* Multi-layer infrastructure
* [illegible]
[Figure: Application Logic (schema-less)]
* New apps:
  * Even the Oracle apps!
* New platforms:
  * Salesforce, GoogleApps, ...
XQuery a possible solution.
Page 14:
Another evil point
* Lack of cost elasticity
  * Cost proportional to income
* Lack of elasticity in performance
  * Response time independent of # clients
The Cloud is the beginning of a solution
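To make "cost proportional to income" concrete, here is a minimal Python sketch. The demand figures, the unit price and the function names are all invented for illustration; they are not from the talk.

```python
def fixed_cost(demand, capacity, unit_price):
    """Pay for a fixed capacity in every period, whether it is used or not."""
    return capacity * unit_price * len(demand)


def unserved(demand, capacity):
    """Requests above the fixed capacity: income lost to mis-provisioning."""
    return sum(max(0, d - capacity) for d in demand)


def elastic_cost(demand, unit_price):
    """Pay only for what each period actually uses (pay as you go)."""
    return sum(d * unit_price for d in demand)


# Hypothetical load per period, with one spike.
demand = [10, 20, 100, 30]
```

With a fixed capacity of 50 and a unit price of 1, the fixed setup costs 200 and still turns away 50 requests during the spike, while the elastic setup costs 160 and serves everything.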
Page 15:
Agenda
* Current pain in building apps
* What can XQuery do for customers?
* What can the Cloud do for customers?
* How do we put them together?
* How do competitors solve the problem?
* Some open research problems
Page 16:
Why XML?
* Covers the whole spectrum from structured data to text
* Schema independent
* Platform independent
* Continuity with the basic Internet infrastructure (URI, HTML, HTTP)
Page 17:
What is XQuery?
* A programming language for XML processing
* Functional in style
* Turing complete
  * Navigation
  * Declarative query and aggregation (FLWOR)
  * Search (full text)
  * Declarative updates
  * Transforms
  * Scripting
  * Streaming and windowing
  * Error handling and second order expressions
  * Packaging (modules)
* Has limitations (discussed later)
Page 18:
History and status
* Standard of the W3C
  * Good and bad
* 10 years old
* 40 existing implementations
* Implemented in major databases
* Best implementations in open source
* If you have XML data, it is [illegible]
Page 19:
Navigation
* fn:doc("catalog.xml")/items/item
* fn:doc("catalog.xml")/items//item
* fn:doc("catalog.xml")/items//*
* fn:doc("catalog.xml")/items/@item
* fn:doc("parts.xml")/parts/part[partno = $i/partno]
* $x/items/item
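For readers without an XQuery processor at hand, the difference between the child step (/), the descendant step (//) and a predicate can be sketched with Python's standard xml.etree.ElementTree, which supports a small XPath subset. The catalog content is invented for illustration; the slides do not show the real catalog.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; not taken from the talk.
CATALOG = """
<items>
  <item><partno>1</partno><price>10</price></item>
  <group>
    <item><partno>2</partno><price>20</price></item>
  </group>
</items>
"""

root = ET.fromstring(CATALOG)  # root is the <items> element

# /items/item : item elements that are direct children of items
children = root.findall("item")

# /items//item : item elements at any depth below items
descendants = root.findall(".//item")

# /items//* : every element below items
everything = root.findall(".//*")

# //item[partno = 2] : a predicate on a child element's text
matching = root.findall(".//item[partno='2']")


def partnos(elems):
    """Collect the partno text of each element, in document order."""
    return [e.findtext("partno") for e in elems]
```

The child step only sees the first item, while the descendant step also finds the one nested inside group.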
Page 20:
FLWOR
for $i in fn:doc("catalog.xml")/items/item,
    $p in fn:doc("parts.xml")/parts/part[partno = $i/partno],
    $s in fn:doc("suppliers.xml")/suppliers
            /supplier[suppno = $i/suppno]
order by $p/description, $s/suppname
return ...  (: return expression illegible in the source :)
Page 21:
Creation of new XML
<descriptive-catalog>
{ for $i in fn:doc("catalog.xml")/items/item,
      $p in fn:doc("parts.xml")/parts/part[partno = $i/partno],
      $s in fn:doc("suppliers.xml")/suppliers
              /supplier[suppno = $i/suppno]
  order by $p/description, $s/suppname
  return
    <item> { $p/description, $s/suppname,
             $i/price }
    </item> }
</descriptive-catalog>
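As a rough cross-check of what the join query computes, the same three-way join, ordering and element construction can be sketched in Python. The records stand in for catalog.xml, parts.xml and suppliers.xml and are invented for illustration; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data standing in for the three documents.
items = [{"partno": "1", "suppno": "A", "price": "10"},
         {"partno": "2", "suppno": "B", "price": "5"}]
parts = [{"partno": "1", "description": "wheel"},
         {"partno": "2", "description": "axle"}]
suppliers = [{"suppno": "A", "suppname": "Acme"},
             {"suppno": "B", "suppname": "Bolt"}]


def descriptive_catalog(items, parts, suppliers):
    # for $i ..., $p ...[predicate], $s ...[predicate]: nested loops with join conditions
    rows = [(p["description"], s["suppname"], i["price"])
            for i in items
            for p in parts if p["partno"] == i["partno"]
            for s in suppliers if s["suppno"] == i["suppno"]]
    # order by $p/description, $s/suppname
    rows.sort(key=lambda r: (r[0], r[1]))
    # return <item>{ ... }</item> inside <descriptive-catalog>
    root = ET.Element("descriptive-catalog")
    for description, suppname, price in rows:
        item = ET.SubElement(root, "item")
        ET.SubElement(item, "description").text = description
        ET.SubElement(item, "suppname").text = suppname
        ET.SubElement(item, "price").text = price
    return root


catalog = descriptive_catalog(items, parts, suppliers)
xml_text = ET.tostring(catalog, encoding="unicode")
```

The Python version needs explicit loops, a sort and element-building calls for what the FLWOR expression states in a handful of clauses.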
Page 22:
Textual search
* $doc ftcontains ( ( "mustang" ftand
  ({("great", "excellent")} any word
  occurs at least 2 times) ) window 11
  words ftand ftnot "rust" ) same
  paragraph
Page 23:
Declarative updates
for $p in /inventory/part
let $deltap := $changes/part[partno eq $p/partno]
return
  replace value of node $p/quantity with
    $p/quantity + $deltap/quantity
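A declarative update of this kind is evaluated in two steps: the query first collects a list of pending updates, and only then are they applied. A minimal Python sketch of that two-phase idea follows. The documents are invented, and one assumption differs from the slide: parts without a matching change are left untouched here, which is a guess at the intent.

```python
import xml.etree.ElementTree as ET

# Hypothetical inventory and change documents.
inventory = ET.fromstring(
    "<inventory>"
    "<part><partno>1</partno><quantity>5</quantity></part>"
    "<part><partno>2</partno><quantity>7</quantity></part>"
    "</inventory>")
changes = ET.fromstring(
    "<changes><part><partno>1</partno><quantity>3</quantity></part></changes>")


def apply_changes(inventory, changes):
    """Add each change's quantity to the matching part; return the update count."""
    deltas = {p.findtext("partno"): int(p.findtext("quantity"))
              for p in changes.findall("part")}
    # Phase 1: collect the pending updates without touching the data.
    pending = []
    for part in inventory.findall("part"):
        delta = deltas.get(part.findtext("partno"))
        if delta is not None:
            qty = part.find("quantity")
            pending.append((qty, int(qty.text) + delta))
    # Phase 2: apply all collected updates.
    for node, value in pending:
        node.text = str(value)
    return len(pending)


applied = apply_changes(inventory, changes)
quantities = [p.findtext("quantity") for p in inventory.findall("part")]
```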
Page 24:
Transforms
let $oldx := /a/b/x
return
  copy $newx := $oldx
  modify
    (rename node $newx as "newx",
     replace value of node $newx with $newx * 2)
  return ($oldx, $newx)
Page 25:
Streams and windowing
for sliding window $w in (2, 4, 6, 8, 10, 12, 14)
    start at $s when fn:true()
    only end at $e when $e - $s eq 2
return <window>{ $w }</window>
* Result of the above query:
<window>2 4 6</window>
<window>4 6 8</window>
<window>6 8 10</window>
<window>8 10 12</window>
<window>10 12 14</window>
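The window semantics can be checked with a small Python sketch: a window starts at every position, ends two positions later, and windows that never reach their end condition are dropped ("only end"). The function name and the span parameter are illustrative, not part of XQuery.

```python
def sliding_windows(seq, span=2):
    """Return every window that starts at position s and ends at s + span.

    Windows that would run past the end of the sequence are dropped,
    mirroring the 'only end' condition.
    """
    windows = []
    for s in range(len(seq)):      # start at $s when fn:true()
        e = s + span               # end at $e when $e - $s eq 2
        if e < len(seq):           # 'only end': skip unterminated windows
            windows.append(seq[s:e + 1])
    return windows


result = sliding_windows([2, 4, 6, 8, 10, 12, 14])
```

Run on the seven input values, this yields the same five three-item windows as the query result listed above.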
Page 26:
Scripting
{
  declare $a as xs:integer := 0;
  declare $b as xs:integer := 1;
  declare $c as xs:integer := $a + $b;
  declare $fibseq as xs:integer* := ($a, $b);
  while ($c < 100)
  {
    set $fibseq := ($fibseq, $c);
    set $a := $b;
    set $b := $c;
    set $c := $a + $b;
  };
  $fibseq;
}
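For comparison, the same loop written in Python, keeping the variable names of the script. This is a transliteration for readers unfamiliar with the scripting syntax; the function name and the limit parameter are illustrative.

```python
def fib_below(limit):
    """Build the Fibonacci sequence, appending terms while they stay below limit."""
    a, b = 0, 1
    c = a + b
    fibseq = [a, b]
    while c < limit:
        fibseq.append(c)   # set $fibseq := ($fibseq, $c)
        a = b              # set $a := $b
        b = c              # set $b := $c
        c = a + b          # set $c := $a + $b
    return fibseq


fibseq = fib_below(100)
```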
Page 27:
Where can it be used in today's architectures?
* Databases
* Middle tiers
* Information dispatch
* [illegible]
* Data integration
* Browsers (see XQIB demo, WWW'09 paper)
* Mobile devices (XQuery on iPhone, anyone?)
Page 28:
XQuery's real potential
* Standalone programming language for information intensive applications
* Can build extremely rich applications
[Figure: Application Logic (XQuery)]
Page 29:
Why XQuery?
* Because of XML
  * Schema independent
  * Continuity with basic Internet infrastructure
  * Continuity structured data <--> textual data
* XQuery's own advantages
  * Declarative
  * Single layer code
  * Open source friendly
* Extra goodies
  * Opportunity to rethink ACID transactions
  * Unique opportunities for introspection
  * Code and data migration
Customer concerns: 1. Cost, 2. Time to market, 3. Flexibility, 4. Customizability, 5. Sustainability, 6. Risk
Page 30:
Declarativity
* Small number of lines of code
  * Development cost
  * Time to market
  * # bugs
* Easy to optimize automatically
* Easy to parallelize automatically
  * Especially important in the cloud
  * Easier to achieve elasticity in performance
* Easier to generate automatically
  * Important for smart/non-developer UIs
Page 31:
Declarativity, negative side
1. Fewer developers capable of writing such code
2. Easy to write, harder to read
3. Tools harder to make (e.g. debuggers)
4. Performance can be unstable
* Despite that, in the history of CS we evolve in the direction of declarativity
  * Assembly, C, C++, Java, Haskell
  * Cobol, SQL
Page 32:
Rethink transactions and data consistency
* XQuery is silent as far as ACID transactions go
  * On purpose!
* Are ACID transactions really needed?
* Are they really enforced in Web apps?
  * No.
* Open research field
  * Interaction of programming languages with new transactional models and new data consistency models
Page 33:
[Title illegible]
* Data consistency is something to optimize, not an absolute requirement
* Data consistency models [Tanenbaum]
  * Shared-Disk (Naive approach)
    * No concurrency control at all
  * Eventual Consistency (Basic Protocol)
    * Updates become visible any time and will persist
    * No lost update on page level
  * Atomicity
    * All or no updates of a transaction become visible
  * Monotonic reads, Read your writes, Monotonic writes, ...
  * Strong Consistency
    * database-style consistency (ACID) via OCC
* Data consistency a la carte
Page 34:
Introspection
* Everything is (or will be) XML
  * Data, schemas, code, PULs, metadata, configs, runtime information
* Unique opportunity to:
  * introspect at runtime all of them
  * reason about them
  * change them dynamically (not only data but schemas, code and configuration)
* Open research field:
  * Consequences on programming
Page 35:
Why NOT XQuery
* XML is complicated
* XML Schema is hard/impossible to understand
* XQuery is complicated
* [illegible] (maybe research opport.?)
  * Missing a standard persistent data model
  * Missing DDL functionality (indexes, integrity constraints)
  * Missing basic functionalities (e.g. eval, function overloading)
  * Missing basic data modeling functionality (n:m relationships)
* XQuery lacks a standard environment (e.g. J2EE) (maybe research opport.?)
* No tools (debuggers, profilers) (maybe research opport.?)
* [illegible] (certainly research necessary!)
Page 36:
Agenda
* Current pain in building apps
* What can XQuery do for customers?
* What can the Cloud do for customers?
* How do we put them together?
* How do competitors solve the problem?
* Some open research problems
Page 37:
What is Cloud
* [...] paradigm for computing
* [...] of (certain) aspects of [illegible]
* Goal 1: Reduction of cost
  * principle: fine-grained renting of resources
  * "pay as you go" (elasticity of cost)
* Goal 2: Simplification of management
  * potentially infinite/unbreakable computing resources
  * potentially no administration
* Goal 3: Elasticity of performance
  * Same response time independently of workload
Page 38:
Case Study: Amazon AWS
* EC2: scalable virtual private servers using Xen
* S3: WS based storage for applications
* SQS: hosted message queue for web applications
* SimpleDB: the core functionality of a database
* Hadoop based functionality
* Similar providers: [illegible], Microsoft Azure, (Google App Engine)
Page 39:
The limits of the (Amazon) Cloud
* Cloud Computing is a great starting point
* Unfortunately, only a fraction of the stack
Page 40:
Making use of the Cloud
* Solution 1 (conservative)
  * Take an existing application (Java+SQL, etc.) and try to make it run on the cloud (e.g. make Oracle run on AWS)
* Solution 2 (reactionary)
  * Create a new infrastructure, specially designed for Web app requirements, to be deployed in the cloud
Page 41:
Solution 1 (conservative)
* take a traditional DBMS (e.g., Oracle, MySQL, ...)
* install it on an EC2 instance
* use S3 or EBS as a persistent store
* Advantages
  * traditional databases are available
  * proven to work well; many tools
  * people trained and confident with them
* Disadvantages
  * [illegible]
Page 42:
Solution 2 (reactionary)
* Rethink the whole system architecture
  * do NOT use a traditional DBMS and app server
  * create new breed of application server (with DB)
  * run application server on n EC2 instances
  * use S3 + distributed consistency protocols
* Advantages and Disadvantages
  * requires new breed of (immature) systems + tools
  * Solves the right problem and gets it right
* Examples:
  * GoogleApps (Python in the cloud)
  * Sausalito (www.28msec.com) (XQuery in the cloud)
Page 43:
Agenda
* Current pain in building apps
* What can XQuery do for customers?
* What can the Cloud do for customers?
* How do we put them together?
* How do competitors solve the problem?
* Some open research problems
Page 44:
XQuery + AWS Cloud
* Cookbook:
  * Take an existing XQuery processor
  * Partition the XML data on S3
  * Map REST calls to XQuery programs
  * Run the XQuery programs on EC2
  * Use SQS for (asynchronous) updates
  * Voila.
* The magic is in the glue (XQuery proc. + AWS)
* Application [illegible]
  * integrated XQuery based application stack for Web-based apps
  * fully SOA enabled
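The "glue" in the cookbook can be pictured with a small Python sketch: hash-partitioning documents, a routing table from REST calls to XQuery programs, and a queue for updates. Everything here is hypothetical (module names, function names, paths, status codes); a plain list stands in for SQS, and no AWS API is called.

```python
import hashlib


def partition_for(doc_uri, n_partitions):
    """Pick a partition (for example a storage key prefix) for a document URI."""
    digest = hashlib.sha256(doc_uri.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_partitions


# Hypothetical routing table: (HTTP method, path) -> (XQuery module, function).
ROUTES = {
    ("GET", "/catalog"): ("catalog.xq", "list-items"),
    ("POST", "/catalog"): ("catalog.xq", "add-item"),
}


def dispatch(method, path):
    """Return the XQuery program that should handle a REST call, or None."""
    return ROUTES.get((method, path))


def handle(method, path, update_queue):
    """Route one call: reads run directly, updates are queued for later."""
    target = dispatch(method, path)
    if target is None:
        return 404
    if method == "GET":
        return 200                # a read would run the query synchronously
    update_queue.append(target)   # stand-in for sending a message to SQS
    return 202
```

Queuing updates instead of applying them inline is what makes them asynchronous: the caller gets an immediate acknowledgement and the work is done later by whichever server picks up the message.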
Page 45:
XQuery in the [illegible] (connected)
Page 46:
Customers' concerns
* Cost
* Time to market
* Flexibility
* Customizability
* Sustainability
Page 47:
XQuery in the Cloud ([illegible] Server)
Page 48:
XQuery in the [illegible] (offline)
Page 49:
[Title illegible]
* Look at [illegible] use cases (consumer and enterprise mashups)
Page 50:
Competitors: Internet
* Web 2.0 Development Frameworks
  * E.g., Ruby on Rails, PHP / LAMP, ...
  * Deployment in the cloud still [illegible]
* Google AppEngine, Facebook Apps
  * [illegible] programming model (Python-...)
  * Limited functionality
  * Vendor lock-in, privacy issues, ...
* Oracle on AWS, do-it-yourself on AWS
  * limited functionality and/or scalability
Page 51:
Competitors: Enterprise
* Salesforce AppExchange
  * proprietary programming model
  * Limited application domain (CRM)
* Microsoft Azure
  * .Net programming model
  * manual configuration needed
  * (recent offering, market adoption unclear)
* Virtualization companies (e.g., VMWare)
  * No offerings / expertise for data management
* Oracle (Grid, RAC)
  * limited scalability, cost prohibitive
Page 52:
Web 2.0 Support vs. Cloud Support
[Figure: AWS, Google App Engine, Salesforce, Workday, VMWare Cloud and Ruby on Rails positioned by Web 2.0 support and cloud support; other labels illegible]
Page 53:
Agenda
* Current pain in building apps
* What can XQuery do for customers?
* What can the Cloud do for customers?
* How do we put them together?
* How do competitors solve the problem?
* Some open research problems
Page 54:
Versions and variations
* Human mind does not like agreements
  * We like our differences (for a good reason)
* Different ways to see:
  * Data
  * Schemas
  * Code
* Current stack is imposing agreement
  * unlike our own nature
* We have to come up with solutions that allow, welcome and exploit variations
* Darwinian, evolutionary approach to data, schema and code mutations
Page 55:
Versions and Variations
* Research problems:
  * What is a (data, schema, code) variation?
  * What does it mean to run an app in the presence of variations?
  * How do you store (index, etc.) variations?
  * How do you re-integrate them back into the mainstream app (e.g. [illegible] voting?)
  * What is the correct [illegible] for data, schema, code that allows and maximally exploits variations?
  * [final item illegible]
Page 56:
Conclusion
* XQuery in the cloud as an alternative for some (large # and large $$) customers
* Nothing equivalent in the competition:
  * How "solid" (standard, tested) this is
  * Richness of applications
  * Potential for optimization and parallelization
  * Ease of porting to the cloud
Page 57:
My advice
* Keep the eye on the apps, not the db
* Keep the customer in mind
* Rethink the entire stack
  * Don't be afraid to shake down existing ideas about how applications are supposed to work