Page 1:
XQuery+Cloud
Daniela Florescu
Oracle

Page 2:
My personal history
- PhD in object-oriented query processing/optimization
- Loved the database theory and practice (relational, object-oriented, semi-structured)
- Got really interested in it, and thought it was important...
- ...then I joined Oracle.

Page 3:
... after 4 years in Oracle
- Applications are the really important issue
  - How to develop, deploy, maintain, evolve, customize
- Databases are a side effect
  - Customers are educated to think they need them
  - Databases are only useful as part of a general application architecture
- Customer is the king
  - If they don't make $$$, you don't either
- Customers are in pain building apps right now

Page 4:
Agenda
- Current pain in building apps
- What can XQuery do for customers?
- What can the Cloud do for customers?
- How do we put them together?
- How do XQuery+Cloud solve the problem?
- Some open research problems

Page 5:
Imagine I am a customer: I need to build a new app.
1. How much does it cost?
   - Cost of developing the app (salaries)
   - Cost of deploying the app
     - Hardware, software licenses, maintenance
     - Loss of income because of mis-provisioning
   - Do I have to pay up front?
   - Is the cost proportional to the income?

Page 6:
Other questions?
2. How fast can I deliver the app?
   - Quicker on the market than my competitors?
3. How good is the application?
   - More customers for the app => more income
   - Acceptable operational characteristics?
4. Can I adapt if something changes?
   - Operational characteristics
   - Functionality
5. Can I customize the same app for a different vertical / different set of customers?
6. Is there a risk in the technology?

Page 7:
Customers' concerns
- Cost
- Time to market
- Flexibility
- Customizability
- Sustainability
- Risk
Often a tradeoff

Page 8:
Different classes of customers
- Enterprise (e.g. Bank of America): Cost, Sustainability, Risk, Customizability, Flexibility, Time to market
- Government agency (e.g. DoD): Sustainability, Cost, Time to market (?), Flexibility (?), Customizability, Risk
- Consumer (e.g. Craigslist): Time to market, Cost, Flexibility, Customizability, Sustainability, Risk

Page 9:
Typical enterprise app stack
- Communication (XML, REST, WS)
- Application logic (Java, C#)
- Database (SQL)
Vendors: Oracle, IBM, SAP, Microsoft

Page 10:
Cost? $$$$!
- Cost of developing the app
- Cost of deploying the app (hardware, software licenses, maintenance)
- Loss of income because of mis-provisioning
- Do I have to pay up front?
- Is the cost proportional to the income?
Stack: Communication (XML, REST, WS) / Application logic (Java, C#) / Database (SQL)

Page 11:
Time to market? Years!
2. How fast can I deliver the app?
Stack: Communication (XML, REST, WS) / Application logic (Java, C#) / Database (SQL)

Page 12:
Flexibility? Customizability? Hardly any!
- Can I adapt if something changes?
  - Operational characteristics
  - Functionality
- Can I customize it to a different vertical?
Oracle experience: for every $1M of Oracle app licenses, customers pay $2M to customize it. SAP experience even worse :-)
Stack: Communication (XML, REST, WS) / Application logic (Java, C#) / Database (SQL)

Page 13:
Two major evil points
1. Multi-layer infrastructure
2. Schemas a prerequisite
Diagram: Communication / Application Logic (schema-less) / Persistent (key, value) store with put and get (schema-less)
- New apps: even the Oracle apps!
- New platforms: Salesforce, GoogleApps, Facebook
XQuery a possible solution.

Page 14:
Another evil point
- Lack of cost elasticity
  - Cost proportional with income
- Lack of elasticity in performance
  - Response time independent of # clients
The Cloud is the beginning of a solution.

Page 15:
Agenda
- Current pain in building apps
- What can XQuery do for customers?
- What can the Cloud do for customers?
- How do we put them together?
- How do XQuery+Cloud solve the problem?
- Some open research problems

Page 16:
Why XML?
- Covers the whole spectrum from structured data to textual information
- Schema independent
- Platform independent
- Continuity with the basic Internet infrastructure (URI, HTML, HTTP)

Page 17:
What is XQuery?
- A programming language for XML processing
- Functional in style
- Turing complete
- Contains:
  - Navigation
  - Declarative query and aggregation (FLWOR)
  - Search (full text)
  - Declarative updates
  - Transforms
  - Scripting
  - Streaming and windowing
  - Error handling and second order expressions
  - Packaging (modules)
- Has limitations (further)

Page 18:
History and status
- Standard of the W3C
  - Good and bad
- 10 years old
- 40 existing implementations
- Implemented in major databases
- Best implementations in open source
- If you have XML data, it is hard to avoid.

Page 19:
Navigation
    fn:doc("catalog.xml")/items/item
    fn:doc("catalog.xml")/items//item
    fn:doc("catalog.xml")/items//*
    fn:doc("catalog.xml")/items/@item
    fn:doc("parts.xml")/parts/part[partno = $i/partno]
    $x/items/item
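
The path steps above can be tried out without an XQuery processor. The following is a minimal Python sketch, assuming a made-up catalog document (the slides do not show the contents of catalog.xml), that uses the limited XPath subset of the standard library's ElementTree to mirror the child step, the descendant step, the wildcard, and a predicate.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; the element layout is an assumption for illustration.
catalog = ET.fromstring(
    "<items>"
    "<item id='a'><partno>1</partno></item>"
    "<group><item id='b'><partno>2</partno></item></group>"
    "</items>"
)

# /items/item : child step (catalog is the <items> root element)
children = catalog.findall("item")
# /items//item : descendant step
descendants = catalog.findall(".//item")
# /items//* : every descendant element
all_elements = catalog.findall(".//*")
# item[partno = "2"] : predicate on the text of a child element
matching = catalog.findall(".//item[partno='2']")

print([e.get("id") for e in children])
print([e.get("id") for e in descendants])
print(len(all_elements))
print([e.get("id") for e in matching])
```

ElementTree only covers a small part of XPath, so this shows the shape of the navigation steps rather than the full XQuery data model.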

Page 20:
FLWOR
    for $i in fn:doc("catalog.xml")/items/item,
        $p in fn:doc("parts.xml")/parts/part[partno = $i/partno],
        $s in fn:doc("suppliers.xml")/suppliers/supplier[suppno = $i/suppno]
    order by $p/description, $s/suppname
    return $s

Page 21:
Creation of new information
    <descriptive-catalog>
    {
      for $i in fn:doc("catalog.xml")/items/item,
          $p in fn:doc("parts.xml")/parts/part[partno = $i/partno],
          $s in fn:doc("suppliers.xml")/suppliers/supplier[suppno = $i/suppno]
      order by $p/description, $s/suppname
      return
        <item>{ $p/description, $s/suppname, $i/price }</item>
    }
    </descriptive-catalog>
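
To make the join-and-construct pattern concrete, here is a small Python sketch of what the query above computes. The sample data, the dictionary-based lookups (which assume unique partno and suppno values), and the function name descriptive_catalog are all assumptions made for illustration; the slides do not show the input documents.

```python
import xml.etree.ElementTree as ET

# Hypothetical data standing in for catalog.xml, parts.xml and suppliers.xml.
items = [{"partno": 1, "suppno": 10, "price": "5.00"},
         {"partno": 2, "suppno": 20, "price": "7.50"}]
parts = {1: "widget", 2: "bolt"}            # partno -> description
suppliers = {10: "Acme", 20: "Bolts Inc"}   # suppno -> suppname

def descriptive_catalog(items, parts, suppliers):
    # "for" clauses: the two join predicates become dictionary lookups.
    rows = [(parts[i["partno"]], suppliers[i["suppno"]], i["price"])
            for i in items
            if i["partno"] in parts and i["suppno"] in suppliers]
    # "order by $p/description, $s/suppname"
    rows.sort(key=lambda r: (r[0], r[1]))
    # "return <item>{...}</item>" inside the <descriptive-catalog> constructor
    root = ET.Element("descriptive-catalog")
    for description, suppname, price in rows:
        item = ET.SubElement(root, "item")
        ET.SubElement(item, "description").text = description
        ET.SubElement(item, "suppname").text = suppname
        ET.SubElement(item, "price").text = price
    return root

result = descriptive_catalog(items, parts, suppliers)
print(ET.tostring(result, encoding="unicode"))
```

The Python version spells out the join, the sort, and the element construction as separate steps, which the FLWOR expression states in one declarative block.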

Page 22:
Textual search
    $doc ftcontains (
      ( "mustang" ftand ({("great", "excellent")} any word occurs at least 2 times) )
      window 11 words
      ftand ftnot "rust"
    ) same paragraph

Page 23:
Declarative updates
    for $p in /inventory/part
    let $deltap := $changes/part[partno eq $p/partno]
    return
      replace value of node $p/quantity
      with $p/quantity + $deltap/quantity
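
A rough Python sketch of how such an update could be evaluated follows. It assumes hypothetical inventory and changes documents, and it separates collecting the pending updates from applying them, in the spirit of the pending update lists (PULs) mentioned later in the deck. Unlike the query as written, this sketch leaves parts with no matching change untouched.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents; the slides do not show their contents.
inventory = ET.fromstring(
    "<inventory>"
    "<part><partno>1</partno><quantity>10</quantity></part>"
    "<part><partno>2</partno><quantity>4</quantity></part>"
    "</inventory>")
changes = ET.fromstring(
    "<changes><part><partno>1</partno><quantity>5</quantity></part></changes>")

def apply_changes(inventory, changes):
    # Phase 1: collect pending updates without modifying any node.
    pending = []
    for part in inventory.findall("part"):
        for delta in changes.findall("part"):
            if delta.findtext("partno") == part.findtext("partno"):
                node = part.find("quantity")
                new_value = int(node.text) + int(delta.findtext("quantity"))
                pending.append((node, new_value))
    # Phase 2: apply all collected updates.
    for node, new_value in pending:
        node.text = str(new_value)

apply_changes(inventory, changes)
quantities = [p.findtext("quantity") for p in inventory.findall("part")]
print(quantities)
```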

Page 24:
Transforms
    let $oldx := /a/b/x
    return
      copy $newx := $oldx
      modify (rename node $newx as "newx",
              replace value of node $newx with $newx * 2)
      return ($oldx, $newx)
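
The point of the transform expression is that the original node is never modified: only the copy is. A minimal Python sketch of the same copy, modify, return pattern, assuming a hypothetical input element <x>21</x>:

```python
import copy
import xml.etree.ElementTree as ET

def transform(oldx):
    newx = copy.deepcopy(oldx)             # copy $newx := $oldx
    newx.tag = "newx"                      # rename node $newx as "newx"
    newx.text = str(int(newx.text) * 2)    # replace value of node $newx with $newx * 2
    return oldx, newx                      # return ($oldx, $newx)

oldx = ET.fromstring("<x>21</x>")          # hypothetical input node
old, new = transform(oldx)
print(ET.tostring(old, encoding="unicode"), ET.tostring(new, encoding="unicode"))
```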

Page 25:
Streams and windowing
    for sliding window $w in (2, 4, 6, 8, 10, 12, 14)
      start at $s when fn:true()
      only end at $e when $e - $s eq 2
    return <window>{ $w }</window>
Result of the above query:
    <window>2 4 6</window>
    <window>4 6 8</window>
    <window>6 8 10</window>
    <window>8 10 12</window>
    <window>10 12 14</window>
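
The windowing semantics can be checked with a few lines of Python. In this sketch (the function name sliding_windows is made up), a window starts at every position, the end condition $e - $s eq 2 fixes the width at three items, and "only end" discards the trailing windows that never reach a valid end.

```python
def sliding_windows(seq, width):
    # One window per start position; incomplete trailing windows are dropped,
    # which corresponds to the "only end" clause.
    return [seq[s:s + width] for s in range(len(seq) - width + 1)]

windows = sliding_windows([2, 4, 6, 8, 10, 12, 14], 3)
for w in windows:
    print("<window>" + " ".join(map(str, w)) + "</window>")
```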

Page 26:
Scripting expressions
    block {
      declare $a as xs:integer := 0;
      declare $b as xs:integer := 1;
      declare $c as xs:integer := $a + $b;
      declare $fibseq as xs:integer* := ($a, $b);
      while ($c < 100) {
        set $fibseq := ($fibseq, $c);
        set $a := $b;
        set $b := $c;
        set $c := $a + $b;
      };
      $fibseq;
    }
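
For readers unfamiliar with the scripting syntax, the same loop reads as follows in Python. This is a line-by-line transliteration of the block above, with the function name fib_below and the limit parameter added for illustration.

```python
def fib_below(limit):
    a, b = 0, 1
    c = a + b
    fibseq = [a, b]
    while c < limit:
        fibseq.append(c)   # set $fibseq := ($fibseq, $c)
        a = b              # set $a := $b
        b = c              # set $b := $c
        c = a + b          # set $c := $a + $b
    return fibseq

print(fib_below(100))
```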

Page 27:
Where can it be used in today's architectures?
- Databases
- Middle tiers
- Information dispatch
- Transformation
- Data integration
- Browsers (see XQIB demo, WWW'09 paper)
- Mobile devices (XQuery on iPhone anyone?)

Page 28:
XQuery's real potential
- Standalone programming language for information intensive applications
- Can build extremely rich applications
Diagram: Application Logic (XQuery) surrounded by XML

Page 29:
Why XQuery?
Concerns addressed: 1. Cost, 2. Time to market, 3. Flexibility, 4. Customizability, 5. Sustainability, 6. Risk
- Because of XML
  - Schema independent
  - Continuity with basic Internet infrastructure
  - Continuity structured data <--> textual information
- XQuery's own advantages
  - Declarative
  - Single layer code
  - Open source friendly
- Extra goodies
  - Opportunity to rethink ACID transactions
  - Unique opportunities for introspection
  - Code and data migration

Page 30:
Declarativity
- Small number of lines of code
  - Development cost
  - Time to market
  - # bugs
- Easy to optimize automatically
- Easy to parallelize automatically
  - Especially important in the cloud
  - Easier to achieve elasticity in performance
- Easier to generate automatically
  - Important for smart/non-developers UIs

Page 31:
Declarativity, negative side
1. Fewer developers capable of writing such code
2. Easy to write, harder to read
3. Tools harder to make (e.g. debuggers)
4. Performance can be unstable
Despite that, in the history of CS we evolve in the direction of declarativity
- Assembly, C, C++, Java, Haskell
- Cobol, SQL

Page 32:
Rethink transactions and data consistency
- XQuery is silent as far as ACID transactions go
  - On purpose!
- Are ACID transactions really needed?
- Are they really enforced in Web apps? No.
- Open research field
  - Interaction of programming languages with new transactional models and new data consistency models

Page 33:
Sigmod'08
- Data consistency is something to optimize, not an absolute requirement
- Data consistency models [Tanenbaum]
  - Shared-Disk (naïve approach)
    - No concurrency control at all
  - Eventual Consistency (basic protocol)
    - Updates become visible any time and will persist
    - No lost update on page level
  - Atomicity
    - All or no updates of a transaction become visible
  - Monotonic reads, Read your writes, Monotonic writes, ...
  - Strong Consistency
    - database-style consistency (ACID) via OCC
- Data consistency a la carte

Page 34:
Introspection opportunities
Closed world
- Everything is (or will be) XML
  - Data, schemas, code, PULs, metadata, configs, runtime information
- Unique opportunity to:
  - introspect at runtime all of them
  - reason about them
  - change them dynamically (not only data, but schemas, code and configuration)
- Open research field:
  - Consequences on programming

Page 35:
Why NOT XQuery
- XML is complicated
- XML Schema is hard/impossible to understand
- XQuery is complicated
- XQuery is incomplete (maybe research opport.?)
  - Missing a standard persistent data model
  - Missing DDL functionality (indexes, integrity constraints)
  - Missing basic functionalities (e.g. eval, function overloading)
  - Missing basic data modeling functionality (n:m relationships)
- XQuery lacks a standard environment (e.g. J2EE) (maybe research opport.?)
- No tools (debuggers, profilers) (maybe research opport.?)
- Performance is not clear yet (certainly research opportunity!)

Page 36:
Agenda
- Current pain in building apps
- What can XQuery do for customers?
- What can the Cloud do for customers?
- How do we put them together?
- How do XQuery+Cloud solve the problem?
- Some open research problems

Page 37:
What is Cloud Computing?
- The "rental cars" paradigm for computing
- Commoditization of (certain aspects of) computing
  - CPU, storage, and network
- Goal 1: Reduction of cost
  - principle: fine-grained renting of resources
  - "pay as you go" (elasticity of cost)
- Goal 2: Simplification of management
  - potentially infinite/unbreakable computing resources
  - potentially no administration
- Goal 3: Elasticity of performance
  - Same response time independently of workload
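
The cost argument behind Goal 1 can be illustrated with simple arithmetic. The sketch below compares provisioning for peak load up front against renting capacity hour by hour. Every number, the per-server capacity, and both function names are hypothetical and chosen only to show the shape of the calculation; they are not figures from the talk or from any provider's price list.

```python
import math

def fixed_cost(peak_servers, hours, price_per_server_hour):
    # Provisioned up front: pay for peak capacity during every hour.
    return peak_servers * hours * price_per_server_hour

def pay_as_you_go_cost(hourly_load, capacity_per_server, price_per_server_hour):
    # Rented per hour: pay only for the servers each hour's load needs.
    return sum(math.ceil(load / capacity_per_server) * price_per_server_hour
               for load in hourly_load)

hourly_load = [100, 100, 900, 100]   # requests/sec over four hours (made up)
capacity = 100                       # requests/sec one server can handle (made up)
price = 1.0                          # cost of one server-hour (made up)

peak = math.ceil(max(hourly_load) / capacity)
fixed = fixed_cost(peak, len(hourly_load), price)
elastic = pay_as_you_go_cost(hourly_load, capacity, price)
print(fixed, elastic)
```

With a short spike in an otherwise quiet workload, the fixed deployment pays for nine servers in all four hours, while the elastic one pays for nine servers only during the spike.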

Page 38:
Case Study: Amazon AWS
- EC2: scalable virtual private servers using Xen
- S3: WS-based storage for applications
- SQS: hosted message queue for web applications
- SimpleDB: the core functionality of a database
- Hadoop-based functionality
- Similar providers: IBM Blue Cloud, Microsoft Azure, (GoogleApp engine)

Page 39:
The limits of the (Amazon) Cloud
- Cloud Computing a great starting point
- Unfortunately, only a fraction of the stack
Stack: Customization, Training, ... / Application / Application Server / DBMS / Hardware

Page 40:
Making use of the Cloud
Diagram labels: Risk, Benefit
- Solution 1 (conservative)
  - Take an existing application (Java+SQL, etc.) and try to make it run on the cloud (e.g. make Oracle run on AWS)
- Solution 2 (reactionary)
  - Create a fresh new infrastructure, specially designed for Web apps requirements, to be deployed in the cloud

Page 41:
Solution 1 (conservative)
- take a traditional DBMS (e.g., Oracle, MySQL, ...)
- install it on an EC2 instance
- use S3 or EBS as a persistent store
- Advantages
  - traditional databases are available
  - proven to work well; many tools
  - people trained and confident with them
- Disadvantages
  - traditional DBMS solve the wrong problem anyway (e.g. focus on consistency)
  - traditional DBMS make the wrong assumptions (DB optimizers fail on virtualized hardware)

Page 42:
Solution 2 (reactionary)
- Rethink the whole system architecture
  - do NOT use a traditional DBMS and app server
  - create new breed of application server (with DB)
  - run application server on n EC2 instances
  - use S3 + distributed consistency protocols
- Advantages and Disadvantages
  - requires new breed of (immature) systems + tools
  - solves the right problem and gets it right
- Examples:
  - GoogleApps (Python in the cloud)
  - Sausalito (www.28msec.com) (XQuery in the cloud)

Page 43:
Agenda
- Current pain in building apps
- What can XQuery do for customers?
- What can the Cloud do for customers?
- How do we put them together?
- How do XQuery+Cloud solve the problem?
- Some open research problems

Page 44:
XQuery + AWS Cloud
- Cookbook:
  - Take an existing XQuery processor
  - Partition the XML data on S3
  - Map REST calls to XQuery programs
  - Run the XQuery programs on EC2
  - Use SQS for (asynchronous) updates
  - Voila.
- The magic is in the glue (XQuery proc. + AWS)
- Application Server + Web Server + Database
  - integrated XQuery-based application stack for Web-based apps
  - fully SOA enabled
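
The cookbook can be pictured as a small dispatch layer. The Python sketch below is only a toy model of that glue under stated assumptions: a dictionary stands in for partitioned storage on S3, queue.Queue stands in for SQS, and plain functions stand in for XQuery programs. None of the names (ROUTES, handle, drain_updates, the keys and paths) come from the talk or from any AWS API.

```python
import queue

store = {"catalog/1": "<item>widget</item>"}   # stand-in for XML partitions on S3
update_queue = queue.Queue()                   # stand-in for SQS

def get_item(key):
    # Stand-in for a read-only query program over one partition.
    return store.get(key)

def put_item(key, body):
    # Updates are queued rather than applied inline (asynchronous updates).
    update_queue.put((key, body))
    return "accepted"

# Map REST calls to query programs.
ROUTES = {("GET", "/item"): get_item, ("PUT", "/item"): put_item}

def handle(method, path, *args):
    return ROUTES[(method, path)](*args)

def drain_updates():
    # A background worker would run this; here it is called explicitly.
    while not update_queue.empty():
        key, body = update_queue.get()
        store[key] = body

status = handle("PUT", "/item", "catalog/2", "<item>bolt</item>")
before = handle("GET", "/item", "catalog/2")   # update not yet visible
drain_updates()
after = handle("GET", "/item", "catalog/2")    # visible once the queue is drained
print(status, before, after)
```

The gap between before and after is the kind of delayed visibility that the consistency discussion earlier in the deck is concerned with.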

Page 45:
XQuery in the Cloud (connected)

Page 46:
Customers' concerns
- Cost
- Time to market
- Flexibility
- Customizability
- Sustainability

Page 47:
XQuery in the Cloud (no Server)

Page 48:
XQuery in the Cloud (offline)

Page 49:
Demo at www.28msec.com!
- Look at www.programmableweb.com for use cases (consumer and enterprise mashups)

Page 50:
Competitors: Internet
- Web 2.0 Development Frameworks
  - E.g., Ruby on Rails, PHP / LAMP, ...
  - Deployment in the cloud still problematic
- Google AppEngine, Facebook Apps
  - Proprietary programming model (Python-based)
  - Limited functionality
  - Vendor lock-in, privacy issues
- Oracle on AWS, do-it-yourself on AWS
  - limited functionality and/or scalability

Page 51:
Competitors: Enterprise
- Salesforce AppExchange
  - proprietary programming model
  - Limited applications domain (CRM)
- Microsoft Azure
  - .Net programming model
  - manual configuration needed
  - (recent offering, market adoption unclear)
- Virtualization companies (e.g., VMWare)
  - No offerings / expertise for data management
- Oracle (Grid, RAC)
  - limited scalability, cost prohibitive

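Page 52:
Web 2.0 Support vs. Cloud Support
Figure: chart with axes Deployment (Trad. to Cloud) and Development (Proprietary to Standard). Entries shown: AWS; Google App Engine, Facebook; Salesforce, Workday; Azure; VMWare Cloud, Citrix; XQuery+AWS; Oracle; Ruby on Rails.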

Page 53:
Agenda
- Current pain in building apps
- What can XQuery do for customers?
- What can the Cloud do for customers?
- How do we put them together?
- How do XQuery+Cloud solve the problem?
- Some open research problems

Page 54:
Versions and variations
- Human mind does not like agreements
  - We like our differences (for a good reason)
- Different ways to see:
  - Data
  - Schemas
  - Code
- Current stack is imposing agreement
  - unlike our own nature
- We have to come up with solutions that allow, welcome and exploit variations
- Darwinian, evolutionary approach to data, schema and code mutations

Page 55:
Versions and variations
- Research problems:
  - What is a (data, schema, code) variation?
  - What does it mean to run an app in the presence of variations?
  - How do you store (index, etc.) variations?
  - How do you re-integrate them back into mainstream app (e.g. community voting?)
  - What is the correct lifecycle for data, schema, code that allows and maximally exploits variations?
- Note: I have an easier time to think of

Page 56:
Conclusion
- XQuery in the cloud a serious alternative for some (large # and large $$) customers
- Nothing equivalent in the competition:
  - How "solid" (standard, tested) this is
  - Richness of applications
  - Potential for optimization and parallelization
  - Ease of porting to the cloud

Page 57:
My advice
- Keep the eye on the apps, not db
- Keep the customer in mind
- Rethink the entire stack
- Don't be afraid to shake down existing ideas about how applications are supposed to work

