XQuery+Cloud
Slide 1: The magic is in the glue: XQuery+Cloud
Daniela Florescu, Oracle
Slide 2: My personal history
- PhD in object-oriented query processing/optimization
- Loved the database theory and practice (relational, object-oriented, semi-structured)
- Got really interested in it, and thought it was important...
- ...then I joined Oracle.
Slide 3: ... after 4 years in Oracle
- Applications are the really important issue
  - How to develop, deploy, maintain, evolve, customize
- Databases are a side effect
  - Customers are educated to think they need them
  - DBs are only useful as part of a general application architecture
- Customer is the king
  - If they don't make $$$, you don't either
  - Customers are in pain building apps right now
Slide 4: Agenda
- Current pain in building apps
- What can XQuery do for customers?
- What can the Cloud do for customers?
- How do we put them together?
- How do XQuery+Cloud solve the problem?
- Some open research problems
Slide 5: Imagine I am a customer, I need to build a new app.
- How much does it cost?
  - Cost of developing the app (salaries)
  - Cost of deploying the app
    - Hardware, software licenses, maintenance
  - Loss of income because of mis-provisioning
- Do I have to pay up front?
- Is the cost proportional to the income?
Slide 6: Other questions?
- How fast can I deliver the app?
  - Quicker on the market than my competitors?
- How good is the application?
  - More customers for the app => more income
  - Acceptable operational characteristics?
- Can I adapt if something changes?
  - Operational characteristics
  - Functionality
- Can I customize the same app for a different vertical / different set of customers?
- Is there a risk in the technology?
Slide 7: Customers' concerns
- Cost
- Time to market
- Flexibility
- Customizability
- Sustainability
- Risk
- Often a tradeoff
Slide 8: Different classes of customers
- Enterprise (e.g. Bank of America): Cost, Sustainability, Risk, Customizability, Flexibility, Time to market
- Government agency (e.g. DoD): Sustainability, Cost, Time to market (?), Flexibility (?), Customizability, Risk
- Consumer (e.g. Craigslist): Time to market, Cost, Flexibility, Customizability, Sustainability, Risk
Slide 9: Typical enterprise app stack
- Communication (XML, REST, WS)
- Application logic (Java, C#)
- Database (SQL)
- Vendors: Oracle, IBM, SAP, Microsoft
Slide 10: Cost? $$$$!
- Stack: Communication (XML, REST, WS); Application logic (Java, C#); Database (SQL)
- Cost of developing the app
- Cost of deploying the app (hardware, software licenses, maintenance)
- Loss of income because of mis-provisioning
- Do I have to pay up front?
- Is the cost proportional to the income?
Slide 11: Time to market? Years!
- Stack: Communication (XML, REST, WS); Application logic (Java, C#); Database (SQL)
- How fast can I deliver the app?
Slide 12: Flexibility? Customizability? Hardly any!
- Stack: Communication (XML, REST, WS); Application logic (Java, C#); Database (SQL)
- Can I adapt if something changes?
  - Operational characteristics
  - Functionality
- Can I customize it to a different vertical?
- Oracle experience: for every $1M of Oracle app licenses, customers pay $2M to customize it. (SAP experience even worse :-)
Slide 13: Two major evil points
- Multi-layer infrastructure
- Schemas as a prerequisite
- New apps: even the Oracle apps!
- New platforms: Salesforce, GoogleApps, Facebook
- Diagram: Communication; Application Logic (schema-less); Persistent (key, value) store (schema-less), accessed with put / get
- XQuery is a possible solution.
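The schema-less put/get store in the diagram can be pictured with a small sketch. It assumes an XQuery 3.1 processor and uses an in-memory map as a stand-in for the persistent store; the functions `local:put` and `local:get` and the sample keys are hypothetical names chosen for illustration, not part of the talk.

```xquery
xquery version "3.1";

declare namespace map = "http://www.w3.org/2005/xpath-functions/map";

(: Hypothetical in-memory stand-in for a persistent (key, value) store.
   Values are arbitrary XML; no schema is declared anywhere. :)
declare function local:put($store as map(*), $key as xs:string, $value as item()*) as map(*) {
  map:put($store, $key, $value)
};

declare function local:get($store as map(*), $key as xs:string) as item()* {
  map:get($store, $key)
};

let $empty := map {}
let $s1 := local:put($empty, "order-17", <order><item>bolt</item><qty>4</qty></order>)
let $s2 := local:put($s1, "note-3", <note priority="high">call supplier</note>)
(: Two differently shaped values live side by side; navigation works on whatever comes back. :)
return local:get($s2, "order-17")/qty/text()
```

The application logic only needs `put`, `get`, and path navigation, which is the point of the single schema-less layer on the slide.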
Slide 14: Another evil point
- Lack of cost elasticity
  - Cost should be proportional to income
- Lack of elasticity in performance
  - Response time should be independent of # clients
- The Cloud is the beginning of a solution.
Slide 15: Agenda
- Current pain in building apps
- What can XQuery do for customers?
- What can the Cloud do for customers?
- How do we put them together?
- How do XQuery+Cloud solve the problem?
- Some open research problems
Slide 16: Why XML?
- Covers the whole spectrum from structured data to textual information
- Schema independent
- Platform independent
- Continuity with the basic Internet infrastructure (URI, HTML, HTTP)
Slide 17: What is XQuery?
- A programming language for XML processing
- Functional in style
- Turing complete
- Contains:
  - Navigation
  - Declarative query and aggregation (FLWOR)
  - Search (full text)
  - Declarative updates
  - Transforms
  - Scripting
  - Streaming and windowing
  - Error handling and second-order expressions
  - Packaging (modules)
- Has limitations (discussed further on)
Slide 18: History and status
- Standard of the W3C
  - Good and bad
- 10 years old
- 40 existing implementations
- Implemented in major databases
- Best implementations in open source
- If you have XML data, it is hard to avoid.
Slide 19: Navigation
    fn:doc("catalog.xml")/items/item
    fn:doc("catalog.xml")/items//item
    fn:doc("catalog.xml")/items//*
    fn:doc("catalog.xml")/items/@item
    fn:doc("parts.xml")/parts/part[partno = $i/partno]
    $x/items/item
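To make the difference between the child step (`/`), the descendant step (`//`), the wildcard (`*`) and the attribute step (`@`) concrete, here is a minimal sketch. The inline `items` element and its `bundle` child are invented sample data, bound to a variable in place of `catalog.xml`.

```xquery
xquery version "1.0";

(: Invented sample data; $catalog is bound to the items element itself. :)
let $catalog :=
  <items>
    <item id="1"><partno>10</partno><price>5</price></item>
    <bundle>
      <item id="2"><partno>20</partno><price>7</price></item>
    </bundle>
  </items>
return
  <result
    children="{ count($catalog/item) }"
    descendants="{ count($catalog//item) }"
    all-elements="{ count($catalog//*) }"
    ids="{ $catalog//item/@id }"/>
(: children = 1 (only the direct child), descendants = 2 (also the nested item),
   all-elements = 7, ids = "1 2" :)
```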
Slide 20: FLWOR
    for $i in fn:doc("catalog.xml")/items/item,
        $p in fn:doc("parts.xml")/parts/part[partno = $i/partno],
        $s in fn:doc("suppliers.xml")/suppliers/supplier[suppno = $i/suppno]
    order by $p/description, $s/suppname
    return $s
Also: group by, having, outer joins, etc.
Slide 21: Creation of new information
    <descriptive-catalog>
      { for $i in fn:doc("catalog.xml")/items/item,
            $p in fn:doc("parts.xml")/parts/part[partno = $i/partno],
            $s in fn:doc("suppliers.xml")/suppliers/supplier[suppno = $i/suppno]
        order by $p/description, $s/suppname
        return
          <item>
            { $p/description, $s/suppname, $i/price }
          </item>
      }
    </descriptive-catalog>
Slide 22: Textual search
    $doc ftcontains
      ( ( "mustang" ftand
          ({("great", "excellent")} any word occurs at least 2 times)
        ) window 11 words
        ftand ftnot "rust"
      ) same paragraph
Slide 23: Declarative updates
    for $p in /inventory/part
    let $deltap := $changes/part[partno eq $p/partno]
    return
      replace value of node $p/quantity
      with $p/quantity + $deltap/quantity
Slide 24: Transforms
    let $oldx := /a/b/x
    return
      copy $newx := $oldx
      modify (rename node $newx as "newx",
              replace value of node $newx with $newx * 2)
      return ($oldx, $newx)
Slide 25: Streams and windowing
    for sliding window $w in (2, 4, 6, 8, 10, 12, 14)
      start at $s when fn:true()
      only end at $e when $e - $s eq 2
    return <window>{ $w }</window>
Result of the above query:
    <window>2 4 6</window>
    <window>4 6 8</window>
    <window>6 8 10</window>
    <window>8 10 12</window>
    <window>10 12 14</window>
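For contrast with the sliding window on this slide, the same clause with `tumbling` produces windows that never overlap. A minimal sketch, assuming a processor that supports the XQuery 3.0 window clause; `$s` and `$e` are the start and end positions, so each window spans three items.

```xquery
xquery version "3.0";

(: Tumbling variant: a new window can only start after the previous one has ended. :)
for tumbling window $w in (2, 4, 6, 8, 10, 12, 14)
  start at $s when fn:true()
  only end at $e when $e - $s eq 2
return <window>{ $w }</window>
(: Yields <window>2 4 6</window> and <window>8 10 12</window>.
   The window that would start at 14 never reaches its end condition,
   and "only end" discards it. :)
```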
Slide 26: Scripting expressions
    block {
      declare $a as xs:integer := 0;
      declare $b as xs:integer := 1;
      declare $c as xs:integer := $a + $b;
      declare $fibseq as xs:integer* := ($a, $b);
      while ($c < 100) {
        set $fibseq := ($fibseq, $c);
        set $a := $b;
        set $b := $c;
        set $c := $a + $b;
      };
      $fibseq;
    }
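The block, declare/set and while constructs on this slide rely on scripting support in the processor. On a processor that offers only standard functional XQuery, the same Fibonacci sequence can be sketched with recursion; the function name `local:fib-below` is a hypothetical helper, not something from the talk.

```xquery
xquery version "1.0";

(: Returns the Fibonacci numbers starting at $a, $b that are below $limit. :)
declare function local:fib-below(
  $a as xs:integer, $b as xs:integer, $limit as xs:integer
) as xs:integer* {
  if ($a ge $limit) then ()
  else ($a, local:fib-below($b, $a + $b, $limit))
};

local:fib-below(0, 1, 100)
(: 0 1 1 2 3 5 8 13 21 34 55 89 :)
```

The recursive form has no mutable state, which is the property the later slides on automatic optimization and parallelization rely on.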
Slide 27: Where can it be used in today's architectures?
- Databases
- Middle tiers
  - Information dispatch
  - Transformation
  - Data integration
- Browsers (see XQIB demo, WWW'09 paper)
- Mobile devices (XQuery on iPhone, anyone?)
Slide 28: XQuery's real potential
- Standalone programming language for information-intensive applications
- Can build extremely rich applications
- Diagram labels: Application Logic (XQuery); XML; XML; XML
Slide 29: Why XQuery?
- Because of XML
  - Schema independent
  - Continuity with basic Internet infrastructure
  - Continuity structured data <--> textual information
- XQuery's own advantages
  - Declarative
  - Single layer code
  - Open source friendly
- Extra goodies
  - Opportunity to rethink ACID transactions
  - Unique opportunities for introspection
  - Code and data migration
- Customer concerns: Cost, Time to market, Flexibility, Customizability, Sustainability, Risk
Slide 30: Declarativity
- Small number of lines of code
  - Development cost
  - Time to market
  - # bugs
- Easy to optimize automatically
- Easy to parallelize automatically
  - Especially important in the cloud
  - Easier to achieve elasticity in performance
- Easier to generate automatically
  - Important for smart / non-developer UIs
Slide 31: Declarativity, negative side
- Fewer developers capable of writing such code
- Easy to write, harder to read
- Tools harder to make (e.g. debuggers)
- Performance can be unstable
- Despite that, in the history of CS we evolve in the direction of declarativity
  - Assembly, C, C++, Java, Haskell
  - Cobol, SQL
Slide 32: Rethink transactions and data consistency
- XQuery is silent as far as ACID transactions go
  - On purpose!
- Are ACID transactions really needed?
- Are they really enforced in Web apps?
  - No.
- Open research field
  - Interaction of programming languages with new transactional models and new data consistency models
Slide 33: SIGMOD'08
- Data consistency is something to optimize, not an absolute requirement
- Data consistency models [Tanenbaum]
  - Shared-Disk (naïve approach)
    - No concurrency control at all
  - Eventual consistency (basic protocol)
    - Updates become visible any time and will persist
    - No lost update on page level
  - Atomicity
    - All or no updates of a transaction become visible
  - Monotonic reads, read your writes, monotonic writes, ...
  - Strong consistency
    - Database-style consistency (ACID) via OCC
- Data consistency à la carte
Slide 34: Introspection opportunities
- Closed world
- Everything is (or will be) XML
  - Data, schemas, code, PULs, metadata, configs, runtime information
- Unique opportunity to:
  - introspect all of them at runtime
  - reason about them
  - change them dynamically (not only data, but schemas, code and configuration)
- Open research field: consequences on programming
Slide 35: Why NOT XQuery
- XML is complicated
  - XML Schema is hard/impossible to understand
- XQuery is complicated
- XQuery is incomplete (maybe a research opportunity?)
  - Missing a standard persistent data model
  - Missing DDL functionality (indexes, integrity constraints)
  - Missing basic functionalities (e.g. eval, function overloading)
  - Missing basic data modeling functionality (n:m relationships)
- XQuery lacks a standard environment (e.g. J2EE) (maybe a research opportunity?)
- No tools (debuggers, profilers) (maybe a research opportunity?)
- Performance is not clear yet (certainly a research opportunity!)
- There are few XQuery developers (a teaching opportunity)
Slide 36: Agenda
- Current pain in building apps
- What can XQuery do for customers?
- What can the Cloud do for customers?
- How do we put them together?
- How do XQuery+Cloud solve the problem?
- Some open research problems
Slide 37: What is Cloud Computing?
- The "rental cars" paradigm for computing
  - Commoditization of (certain aspects of) computing
  - CPU, storage, and network
- Goal 1: Reduction of cost
  - Principle: fine-grained renting of resources
  - "Pay as you go" (elasticity of cost)
- Goal 2: Simplification of management
  - Potentially infinite/unbreakable computing resources
  - Potentially no administration
- Goal 3: Elasticity of performance
  - Same response time independently of workload
- Note: does not work yet for DBs or apps
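The "pay as you go" principle can be illustrated with a little arithmetic. This is a sketch with made-up figures: the monthly load, the price per server-month and the function names are all assumptions for illustration, not numbers from the talk or from any provider.

```xquery
xquery version "1.0";

(: Made-up illustration values: servers needed in each of six months,
   and a hypothetical price per server per month. :)
declare variable $monthly-load := (10, 12, 15, 80, 20, 11);
declare variable $price := 100;

(: Fixed provisioning: buy enough for the peak month and pay for it every month. :)
declare function local:fixed-cost($load as xs:integer*, $unit as xs:integer) as xs:integer {
  max($load) * count($load) * $unit
};

(: Pay as you go: pay only for what each month actually used. :)
declare function local:elastic-cost($load as xs:integer*, $unit as xs:integer) as xs:integer {
  sum($load) * $unit
};

<cost fixed="{ local:fixed-cost($monthly-load, $price) }"
      elastic="{ local:elastic-cost($monthly-load, $price) }"/>
(: fixed = 80 * 6 * 100 = 48000; elastic = 148 * 100 = 14800 :)
```

With a single peak month, the fixed model pays for 80 servers all year, while the elastic cost follows the load, which is what "cost proportional to income" asks for.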
Slide 38: Case Study: Amazon AWS
- EC2: scalable virtual private servers using Xen
- S3: WS-based storage for applications
- SQS: hosted message queue for web applications
- SimpleDB: the core functionality of a database
- Hadoop-based functionality
- Similar providers: IBM Blue Cloud, Microsoft Azure, (Google App Engine)
Slide 39: The limits of the (Amazon) Cloud
- Cloud Computing is a great starting point
- Unfortunately, it covers only a fraction of the stack
- Stack: Hardware; DBMS; Application Server; Application; Customization, Training, ...
Slide 40: Making use of the Cloud
- Solution 1 (conservative)
  - Take an existing application (Java+SQL, etc.) and try to make it run on the cloud (e.g. make Oracle run on AWS)
- Solution 2 (reactionary)
  - Create a fresh new infrastructure, specially designed for Web apps' requirements, to be deployed in the cloud
- Diagram axes: Benefit, Risk
Slide 41: Solution 1 (conservative)
- Take a traditional DBMS (e.g., Oracle, MySQL, ...)
- Install it on an EC2 instance
- Use S3 or EBS as a persistent store
- Advantages
  - Traditional databases are available
  - Proven to work well; many tools
  - People trained and confident with them
- Disadvantages
  - Traditional DBMSs solve the wrong problem anyway (e.g. focus on consistency)
  - Traditional DBMSs make the wrong assumptions (DB optimizers fail on virtualized hardware)
Slide 42: Solution 2 (reactionary)
- Rethink the whole system architecture
  - Do NOT use a traditional DBMS and app server
  - Create a new breed of application server (with DB)
  - Run the application server on n EC2 instances
  - Use S3 + distributed consistency protocols
- Advantages and disadvantages
  - Requires a new breed of (immature) systems + tools
  - Solves the right problem and gets it right
- Examples:
  - GoogleApps (Python in the cloud)
  - Sausalito (www.28msec.com) (XQuery in the cloud)
Slide 43: Agenda
- Current pain in building apps
- What can XQuery do for customers?
- What can the Cloud do for customers?
- How do we put them together?
- How do XQuery+Cloud solve the problem?
- Some open research problems
Slide 44: XQuery + AWS Cloud
- Cookbook:
  - Take an existing XQuery processor
  - Partition the XML data on S3
  - Map REST calls to XQuery programs
  - Run the XQuery programs on EC2
  - Use SQS for (asynchronous) updates
  - Voilà.
- The magic is in the glue (XQuery processor + AWS)
  - Application Server + Web Server + Database
  - Integrated XQuery-based application stack for Web-based apps
  - Fully SOA enabled
  - All pre-configured and lean (ZERO admin)
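The cookbook step "map REST calls to XQuery programs" can be sketched as follows. This is one possible shape, assuming a processor that supports RESTXQ-style annotations; the talk does not say which mapping mechanism is used, and the module namespace, the function name and the URL path are hypothetical. `catalog.xml` is the sample document from the navigation slides.

```xquery
xquery version "3.0";

(: Hypothetical library module; namespace and names are illustration only. :)
module namespace shop = "http://example.org/shop";

declare namespace rest = "http://exquery.org/ns/restxq";

(: An HTTP GET on /items/{partno} is routed to this function;
   the path segment is bound to $partno. :)
declare
  %rest:GET
  %rest:path("/items/{$partno}")
  %rest:produces("application/xml")
function shop:item($partno as xs:string) as element(item)* {
  fn:doc("catalog.xml")/items/item[partno = $partno]
};
```

Each REST endpoint is then just one function in a module, so the same code serves as web server handler, application logic and query, in line with the single-layer stack argued for earlier.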
Slide 45: XQuery in the Cloud (connected)
Slide 46: Customers' concerns
- Cost
- Time to market
- Flexibility
- Customizability
- Sustainability
Slide 47: XQuery in the Cloud (no Server)
Slide 48: XQuery in the Cloud (offline)
Slide 49: Demo at www.28msec.com!
- Look at www.programmableweb.com for use cases (consumer and enterprise mashups)
Slide 50: Competitors: Internet
- Web 2.0 development frameworks
  - E.g., Ruby on Rails, PHP / LAMP, ...
  - Deployment in the cloud still problematic
- Google AppEngine, Facebook Apps
  - Proprietary programming model (Python-based)
  - Limited functionality
  - Vendor lock-in, privacy issues
- Oracle on AWS, do-it-yourself on AWS
  - Limited functionality and/or scalability
Slide 51: Competitors: Enterprise
- Salesforce AppExchange
  - Proprietary programming model
  - Limited application domain (CRM)
- Microsoft Azure
  - .NET programming model
  - Manual configuration needed
  - (Recent offering, market adoption unclear)
- Virtualization companies (e.g., VMware)
  - No offerings / expertise for data management
- Oracle (Grid, RAC)
  - Limited scalability, cost prohibitive
Slide 52: Web 2.0 Support vs. Cloud Support
[Chart. Axis labels: Proprietary / Standard; Development / Deployment; Cloud / Trad. Offerings plotted: XQuery+AWS; AWS; Azure; Google App Engine, Facebook; Ruby on Rails; Oracle; VMware Cloud, Citrix; Salesforce, Workday.]
Slide 53: Agenda
- Current pain in building apps
- What can XQuery do for customers?
- What can the Cloud do for customers?
- How do we put them together?
- How do XQuery+Cloud solve the problem?
- Some open research problems
Slide 54: Versions and variations
- The human mind does not like agreements
  - We like our differences (for a good reason)
- Different ways to see:
  - Data
  - Schemas
  - Code
- The current stack imposes agreement
  - Unlike our own nature
- We have to come up with solutions that allow, welcome and exploit variations
  - Darwinian, evolutionary approach to data, schema and code mutations
Slide 55: Versions and variations
- Research problems:
  - What is a (data, schema, code) variation?
  - What does it mean to run an app in the presence of variations?
  - How do you store (index, etc.) variations?
  - How do you re-integrate them back into the mainstream app (e.g. community voting?)
  - What is the correct lifecycle for data, schema and code that allows and maximally exploits variations?
- Note: I have an easier time thinking of a solution if the app is in XML/XQuery rather than in Java+SQL (or even Python)
Slide 56: Conclusion
- XQuery in the cloud is a serious alternative for some (large # and large $$) customers
- Nothing equivalent in the competition:
  - How "solid" (standard, tested) this is
  - Richness of applications
  - Potential for optimization and parallelization
  - Ease of porting to the cloud
Slide 57: My advice
- Keep your eye on the apps, not the DB
- Keep the customer in mind
- Rethink the entire stack
- Don't be afraid to shake up existing ideas about how applications are supposed to work
- Thank you!