Page 1:
Chapter 14: Query Optimization
Page 2:
Chapter 14: Query Optimization
Introduction
Transformation of Relational Expressions
Catalog Information for Cost Estimation
Statistical Information for Cost Estimation
Cost-based optimization
Dynamic Programming for Choosing Evaluation Plans
Materialized views
Database System Concepts - 5th Edition, Aug 27, 2005. ©Silberschatz, Korth and Sudarshan
Page 3:
Introduction
Alternative ways of evaluating a given query:
Equivalent expressions
Different algorithms for each operation (Chapter 13)
The cost difference between a good and a bad way of evaluating a query can be enormous.
Need to estimate the cost of operations:
Statistical information about relations. Examples: number of tuples, number of distinct values for an attribute, etc.
Statistics estimation for intermediate results, to compute the cost of complex expressions
Page 4:
Introduction (Cont.)
Relations generated by two equivalent expressions have the same set of attributes and contain the same set of tuples, although their tuples/attributes may be ordered differently.
[Figure: (a) Initial expression tree: Π_{customer_name} over σ_{branch_city = Brooklyn} over branch ⋈ (account ⋈ depositor). (b) Transformed expression tree: Π_{customer_name} over (σ_{branch_city = Brooklyn}(branch)) ⋈ (account ⋈ depositor).]
Page 5:
Introduction (Cont.)
Generation of query-evaluation plans for an expression involves several steps:
1. Generating logically equivalent expressions using equivalence rules.
2. Annotating resultant expressions to get alternative query plans.
3. Choosing the cheapest plan based on estimated cost.
The overall process is called cost-based optimization.
Page 6:
Transformation of Relational Expressions
Two relational algebra expressions are said to be equivalent if on every legal database instance the two expressions generate the same set of tuples.
Note: order of tuples is irrelevant.
In SQL, inputs and outputs are multisets of tuples.
Two expressions in the multiset version of the relational algebra are said to be equivalent if on every legal database instance the two expressions generate the same multiset of tuples.
An equivalence rule says that expressions of two forms are equivalent.
Can replace an expression of the first form by the second, or vice versa.
Page 7:
Equivalence Rules
1. Conjunctive selection operations can be deconstructed into a sequence of individual selections.
σ_{θ1 ∧ θ2}(E) = σ_{θ1}(σ_{θ2}(E))
2. Selection operations are commutative.
σ_{θ1}(σ_{θ2}(E)) = σ_{θ2}(σ_{θ1}(E))
3. Only the last in a sequence of projection operations is needed; the others can be omitted.
Π_{L1}(Π_{L2}(…(Π_{Ln}(E))…)) = Π_{L1}(E)
4. Selections can be combined with Cartesian products and theta joins.
a. σ_{θ}(E1 × E2) = E1 ⋈_{θ} E2
b. σ_{θ1}(E1 ⋈_{θ2} E2) = E1 ⋈_{θ1 ∧ θ2} E2
Page 8:
Equivalence Rules (Cont.)
5. Theta-join operations (and natural joins) are commutative.
E1 ⋈_{θ} E2 = E2 ⋈_{θ} E1
6. (a) Natural join operations are associative:
(E1 ⋈ E2) ⋈ E3 = E1 ⋈ (E2 ⋈ E3)
(b) Theta joins are associative in the following manner:
(E1 ⋈_{θ1} E2) ⋈_{θ2 ∧ θ3} E3 = E1 ⋈_{θ1 ∧ θ3} (E2 ⋈_{θ2} E3)
where θ2 involves attributes from only E2 and E3.
Page 9:
Pictorial Depiction of Equivalence Rules
[Figure: expression-tree depictions of equivalence rules, including Rule 7a, which applies if θ only has attributes from E1.]
Page 10:
Equivalence Rules (Cont.)
7. The selection operation distributes over the theta join operation under the following two conditions:
(a) When all the attributes in θ0 involve only the attributes of one of the expressions (E1) being joined.
σ_{θ0}(E1 ⋈_{θ} E2) = (σ_{θ0}(E1)) ⋈_{θ} E2
(b) When θ1 involves only the attributes of E1 and θ2 involves only the attributes of E2.
σ_{θ1 ∧ θ2}(E1 ⋈_{θ} E2) = (σ_{θ1}(E1)) ⋈_{θ} (σ_{θ2}(E2))
Page 11:
Equivalence Rules (Cont.)
8. The projection operation distributes over the theta join operation as follows:
(a) If θ involves only attributes from L1 ∪ L2:
Π_{L1 ∪ L2}(E1 ⋈_{θ} E2) = (Π_{L1}(E1)) ⋈_{θ} (Π_{L2}(E2))
(b) Consider a join E1 ⋈_{θ} E2.
Let L1 and L2 be sets of attributes from E1 and E2, respectively.
Let L3 be attributes of E1 that are involved in join condition θ, but are not in L1 ∪ L2, and
let L4 be attributes of E2 that are involved in join condition θ, but are not in L1 ∪ L2.
Π_{L1 ∪ L2}(E1 ⋈_{θ} E2) = Π_{L1 ∪ L2}((Π_{L1 ∪ L3}(E1)) ⋈_{θ} (Π_{L2 ∪ L4}(E2)))
Page 12:
Equivalence Rules (Cont.)
9. The set operations union and intersection are commutative:
E1 ∪ E2 = E2 ∪ E1
E1 ∩ E2 = E2 ∩ E1
(set difference is not commutative).
10. Set union and intersection are associative.
(E1 ∪ E2) ∪ E3 = E1 ∪ (E2 ∪ E3)
(E1 ∩ E2) ∩ E3 = E1 ∩ (E2 ∩ E3)
11. The selection operation distributes over ∪, ∩ and -.
σ_{θ}(E1 - E2) = σ_{θ}(E1) - σ_{θ}(E2)
and similarly for ∪ and ∩ in place of -.
Also: σ_{θ}(E1 - E2) = σ_{θ}(E1) - E2
and similarly for ∩ in place of -, but not for ∪.
12. The projection operation distributes over union.
Π_{L}(E1 ∪ E2) = (Π_{L}(E1)) ∪ (Π_{L}(E2))
Page 13:
Transformation Example
Query: Find the names of all customers who have an account at some branch located in Brooklyn.
Π_{customer_name}(σ_{branch_city = "Brooklyn"}(branch ⋈ (account ⋈ depositor)))
Transformation using rule 7a:
Π_{customer_name}((σ_{branch_city = "Brooklyn"}(branch)) ⋈ (account ⋈ depositor))
Performing the selection as early as possible reduces the size of the relation to be joined.
Page 14:
Example with Multiple Transformations
Query: Find the names of all customers with an account at a Brooklyn branch whose account balance is over $1000.
Π_{customer_name}(σ_{branch_city = "Brooklyn" ∧ balance > 1000}(branch ⋈ (account ⋈ depositor)))
Transformation using join associativity (Rule 6a):
Π_{customer_name}((σ_{branch_city = "Brooklyn" ∧ balance > 1000}(branch ⋈ account)) ⋈ depositor)
The second form provides an opportunity to apply the "perform selections early" rule, resulting in the subexpression
σ_{branch_city = "Brooklyn"}(branch) ⋈ σ_{balance > 1000}(account)
Thus a sequence of transformations can be useful.
Page 15:
Multiple Transformations (Cont.)
[Figure: (a) Initial expression tree; (b) Tree after multiple transformations. Both trees have Π_{customer_name} at the root over the relations branch, account and depositor; in (b) the selections on branch_city and balance are applied directly to branch and account.]
Page 16:
Projection Operation Example
Π_{customer_name}((σ_{branch_city = "Brooklyn"}(branch) ⋈ account) ⋈ depositor)
When we compute
(σ_{branch_city = "Brooklyn"}(branch) ⋈ account)
we obtain a relation whose schema is:
(branch_name, branch_city, assets, account_number, balance)
Push projections using equivalence rules 8a and 8b; eliminate unneeded attributes from intermediate results to get:
Π_{customer_name}((Π_{account_number}((σ_{branch_city = "Brooklyn"}(branch)) ⋈ account)) ⋈ depositor)
Performing the projection as early as possible reduces the size of the relation to be joined.
Page 17:
Join Ordering Example
For all relations r1, r2, and r3,
(r1 ⋈ r2) ⋈ r3 = r1 ⋈ (r2 ⋈ r3)
If r2 ⋈ r3 is quite large and r1 ⋈ r2 is small, we choose
(r1 ⋈ r2) ⋈ r3
so that we compute and store a smaller temporary relation.
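Page 18:
Join Ordering Example (Cont.)
Consider the expression
Π_{customer_name}((σ_{branch_city = "Brooklyn"}(branch)) ⋈ (account ⋈ depositor))
Could compute account ⋈ depositor first, and join the result with
σ_{branch_city = "Brooklyn"}(branch)
but account ⋈ depositor is likely to be a large relation.
Only a small fraction of the bank's customers are likely to have accounts in branches located in Brooklyn, so it is better to compute
σ_{branch_city = "Brooklyn"}(branch) ⋈ account
first.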
Page 19:
Enumeration of Equivalent Expressions
Query optimizers use equivalence rules to systematically generate expressions equivalent to the given expression.
Conceptually, generate all equivalent expressions by repeatedly executing the following step until no more expressions can be found:
for each expression found so far, use all applicable equivalence rules, and add newly generated expressions to the set of expressions found so far.
The above approach is very expensive in space and time.
Space requirements are reduced by sharing common subexpressions:
when E1 is generated from E2 by an equivalence rule, usually only the top level of the two is different; the subtrees below are the same and can be shared (e.g., when applying join associativity).
Time requirements are reduced by not generating all expressions.
More details shortly.
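The repeat-until-nothing-new enumeration can be sketched in a few lines of Python. This is only an illustration under stated assumptions: join expressions are modelled as nested tuples of relation names, only join commutativity (rule 5) and natural-join associativity (rule 6a) are applied, and the function names `rewrites` and `enumerate_equivalent` are made up for this sketch. It does not share subexpressions, so it shows the naive approach that the slide calls expensive.

```python
def rewrites(expr):
    """Yield every expression reachable from expr by one rule application."""
    if isinstance(expr, str):            # a base relation: nothing to rewrite
        return
    left, right = expr
    yield (right, left)                  # rule 5: commutativity
    if isinstance(left, tuple):          # rule 6a: (a JOIN b) JOIN c -> a JOIN (b JOIN c)
        a, b = left
        yield (a, (b, right))
    if isinstance(right, tuple):         # rule 6a applied in the other direction
        b, c = right
        yield ((left, b), c)
    for new_left in rewrites(left):      # apply the rules inside the left subtree
        yield (new_left, right)
    for new_right in rewrites(right):    # apply the rules inside the right subtree
        yield (left, new_right)


def enumerate_equivalent(expr):
    """Repeat the rewrite step until no new expression is found."""
    found = {expr}
    frontier = [expr]
    while frontier:
        current = frontier.pop()
        for candidate in rewrites(current):
            if candidate not in found:
                found.add(candidate)
                frontier.append(candidate)
    return found


plans = enumerate_equivalent((("r1", "r2"), "r3"))
print(len(plans))  # 12 join trees for three relations
```

Even for three relations the closure already holds 12 trees, which is why practical optimizers avoid materializing the full set.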
Page 20:
Cost Estimation
Cost of each operator is computed as described in Chapter 13.
Need statistics of input relations, e.g., number of tuples, sizes of tuples.
Inputs can be results of sub-expressions:
need to estimate statistics of expression results.
To do so, we require additional statistics, e.g., number of distinct values for an attribute.
More on cost estimation later.
Page 21:
Evaluation Plan
An evaluation plan defines exactly what algorithm is used for each operation, and how the execution of the operations is coordinated.
[Figure: evaluation plan tree. Π_{customer_name} (sort to remove duplicates) over a hash join with depositor; the other input is the pipelined result of a merge join between σ_{branch_city = Brooklyn}(branch) (use index 1) and σ_{balance < 1000}(account) (use linear scan).]
Page 22:
Choice of Evaluation Plans
Must consider the interaction of evaluation techniques when choosing evaluation plans: choosing the cheapest algorithm for each operation independently may not yield the best overall algorithm. E.g.:
merge-join may be costlier than hash-join, but may provide a sorted output which reduces the cost for an outer level aggregation.
nested-loop join may provide opportunity for pipelining.
Practical query optimizers incorporate elements of the following two broad approaches:
1. Search all the plans and choose the best plan in a cost-based fashion.
2. Use heuristics to choose a plan.
Page 23:
Cost-Based Optimization
Consider finding the best join order for r1 ⋈ r2 ⋈ … ⋈ rn.
There are (2(n - 1))! / (n - 1)! different join orders for the above expression. With n = 7, the number is 665280; with n = 10, the number is greater than 17.6 billion!
No need to generate all the join orders. Using dynamic programming, the least-cost join order for any subset of {r1, r2, …, rn} is computed only once and stored for future use.
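The join-order count is easy to check numerically. The short sketch below evaluates (2(n - 1))! / (n - 1)! with Python's standard library; the function name `join_order_count` is chosen here for illustration.

```python
from math import factorial


def join_order_count(n):
    """Number of different join orders for n relations: (2(n-1))! / (n-1)!."""
    return factorial(2 * (n - 1)) // factorial(n - 1)


for n in (3, 5, 7, 10):
    print(n, join_order_count(n))
# n = 7 gives 665280 and n = 10 gives 17643225600 (about 17.6 billion)
```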
Page 24:
Dynamic Programming in Optimization
To find the best join tree for a set of n relations:
To find the best plan for a set S of n relations, consider all possible plans of the form S1 ⋈ (S - S1), where S1 is any non-empty subset of S.
Recursively compute costs for joining subsets of S to find the cost of each plan. Choose the cheapest of the 2^n - 1 alternatives.
When the plan for any subset is computed, store it and reuse it when it is required again, instead of recomputing it.
This is dynamic programming.
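Page 25:
Join Order Optimization Algorithm
procedure findbestplan(S)
    if (bestplan[S].cost ≠ ∞)
        return bestplan[S]
    // else bestplan[S] has not been computed earlier, compute it now
    if (S contains only 1 relation)
        set bestplan[S].plan and bestplan[S].cost based on the best way of accessing S /* using selections on S and indices on S */
    else for each non-empty subset S1 of S such that S1 ≠ S
        P1 = findbestplan(S1)
        P2 = findbestplan(S - S1)
        A = best algorithm for joining results of P1 and P2
        cost = P1.cost + P2.cost + cost of A
        if cost < bestplan[S].cost
            bestplan[S].cost = cost
            bestplan[S].plan = "execute P1.plan; execute P2.plan; join results of P1 and P2 using A"
    return bestplan[S]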
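A runnable sketch of the same dynamic-programming idea follows. Everything numeric here is an assumption made for illustration: the relation names and tuple counts in `CARD`, the single uniform join selectivity, and the toy cost model (a scan costs one unit per tuple, a join costs the product of its input sizes). A real optimizer would plug in the cost formulas of the chosen join algorithms instead.

```python
from itertools import combinations

# Hypothetical statistics: tuple counts per relation and one uniform join selectivity.
CARD = {"r1": 10, "r2": 1000, "r3": 100}
SELECTIVITY = 0.01


def findbestplan(S, bestplan=None):
    """Return (cost, estimated_rows, plan) for joining the relations in frozenset S."""
    if bestplan is None:
        bestplan = {}
    if S in bestplan:                       # computed earlier: reuse it
        return bestplan[S]
    if len(S) == 1:                         # single relation: cost of a full scan
        (name,) = S
        best = (CARD[name], CARD[name], name)
    else:
        best = None
        members = sorted(S)
        for k in range(1, len(members)):    # every non-empty proper subset S1
            for subset in combinations(members, k):
                S1 = frozenset(subset)
                c1, n1, p1 = findbestplan(S1, bestplan)
                c2, n2, p2 = findbestplan(S - S1, bestplan)
                cost = c1 + c2 + n1 * n2    # toy join cost: compare every pair
                rows = n1 * n2 * SELECTIVITY
                if best is None or cost < best[0]:
                    best = (cost, rows, (p1, p2))
    bestplan[S] = best                      # store for later reuse
    return best


cost, rows, plan = findbestplan(frozenset(CARD))
print(plan, cost)
```

With these made-up numbers the cheapest plan joins the two small relations first and only then brings in the large one, which matches the intuition from the join ordering example.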
Page 26:
Left Deep Join Trees
In left-deep join trees, the right-hand-side input for each join is a relation, not the result of an intermediate join.
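Page 27:
Cost of Optimization
With dynamic programming, the time complexity of optimization with bushy trees is O(3^n).
With n = 10, this number is about 59000 instead of 17.6 billion!
Space complexity is O(2^n).
To find the best left-deep join tree for a set of n relations:
Consider n alternatives with one relation as the right-hand-side input and the other relations as the left-hand-side input.
Using the (recursively computed and stored) least-cost join order for each alternative on the left-hand side, choose the cheapest of the n alternatives.
If only left-deep trees are considered, the time complexity of finding the best join order is O(n 2^n).
Space complexity remains at O(2^n).
Cost-based optimization is expensive, but worthwhile for queries on large datasets (typical queries have small n, generally < 10).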
Page 28:
Interesting Orders in Cost-Based Optimization
Consider the expression (r1 ⋈ r2 ⋈ r3) ⋈ r4 ⋈ r5.
An interesting sort order is a particular sort order of tuples that could be useful for a later operation.
Generating the result of r1 ⋈ r2 ⋈ r3 sorted on the attributes common with r4 or r5 may be useful, but generating it sorted on the attributes common only to r1 and r2 is not useful.
Using merge-join to compute r1 ⋈ r2 ⋈ r3 may be costlier, but may provide an output sorted in an interesting order.
Not sufficient to find the best join order for each subset of the set of n given relations; must find the best join order for each subset, for each interesting sort order.
Simple extension of earlier dynamic programming algorithms.
Usually, the number of interesting orders is quite small and doesn't affect time/space complexity significantly.
Page 29:
Heuristic Optimization
Cost-based optimization is expensive, even with dynamic programming.
Systems may use heuristics to reduce the number of choices that must be made in a cost-based fashion.
Heuristic optimization transforms the query tree by using a set of rules that typically (but not in all cases) improve execution performance:
Perform selection early (reduces the number of tuples).
Perform projection early (reduces the number of attributes).
Perform the most restrictive selection and join operations before other similar operations.
Some systems use only heuristics; others combine heuristics with partial cost-based optimization.
Page 30:
Steps in Typical Heuristic Optimization
1. Deconstruct conjunctive selections into a sequence of single selection operations (Equiv. rule 1).
2. Move selection operations down the query tree for the earliest possible execution (Equiv. rules 2, 7a, 7b, 11).
3. Execute first those selection and join operations that will produce the smallest relations (Equiv. rule 6).
4. Replace Cartesian product operations that are followed by a selection condition by join operations (Equiv. rule 4a).
5. Deconstruct and move as far down the tree as possible lists of projection attributes, creating new projections where needed (Equiv. rules 3, 8a, 8b, 12).
6. Identify those subtrees whose operations can be pipelined, and execute them using pipelining.
Page 31:
Structure of Query Optimizers
The System R/Starburst optimizer considers only left-deep join orders. This reduces optimization complexity and generates plans amenable to pipelined evaluation.
System R/Starburst also uses heuristics to push selections and projections down the query tree.
Heuristic optimization used in some versions of Oracle:
Repeatedly pick the "best" relation to join next.
Starting from each of n starting points, pick the best among these.
For scans using secondary indices, some optimizers take into account the probability that the page containing the tuple is in the buffer.
Intricacies of SQL complicate query optimization.
E.g., nested subqueries.
Page 32:
Structure of Query Optimizers (Cont.)
Some query optimizers integrate heuristic selection and the generation of alternative access plans.
System R and Starburst use a hierarchical procedure based on the nested-block concept of SQL: heuristic rewriting followed by cost-based join-order optimization.
Even with the use of heuristics, cost-based query optimization imposes a substantial overhead.
This expense is usually more than offset by savings at query-execution time, particularly by reducing the number of slow disk accesses.
Page 33:
Statistical Information for Cost Estimation
n_r: number of tuples in a relation r.
b_r: number of blocks containing tuples of r.
l_r: size of a tuple of r.
f_r: blocking factor of r, i.e., the number of tuples of r that fit into one block.
V(A, r): number of distinct values that appear in r for attribute A; same as the size of Π_{A}(r).
If tuples of r are stored together physically in a file, then:
b_r = ⌈n_r / f_r⌉
Page 34:
Histograms
Histogram on attribute age of relation person
[Figure: histogram with frequency on the vertical axis and the value ranges 1-5, 6-10, 11-15, 16-20, 21-25 on the horizontal axis.]
Page 35:
Selection Size Estimation
σ_{A = v}(r)
n_r / V(A, r): number of records that will satisfy the selection.
Equality condition on a key attribute: size estimate = 1.
σ_{A ≤ v}(r) (the case of σ_{A ≥ v}(r) is symmetric)
Let c denote the estimated number of tuples satisfying the condition.
If min(A, r) and max(A, r) are available in the catalog:
c = 0 if v < min(A, r)
c = n_r * (v - min(A, r)) / (max(A, r) - min(A, r)) otherwise
If histograms are available, can refine the above estimate.
In the absence of statistical information, c is assumed to be n_r / 2.
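The two selection estimates translate directly into code. The sketch below is a plain restatement of the formulas; the function names are invented for this example, and the clamp to n_r when v reaches max(A, r) is an added safeguard that the slide does not spell out.

```python
def estimate_equality(n_r, distinct_values):
    """Estimated size of sigma_{A = v}(r): n_r / V(A, r)."""
    return n_r / distinct_values


def estimate_at_most(n_r, v, min_a=None, max_a=None):
    """Estimated size of sigma_{A <= v}(r) from the catalog minimum and maximum."""
    if min_a is None or max_a is None:
        return n_r / 2                      # no statistics: assume half the tuples
    if v < min_a:
        return 0
    if v >= max_a:                          # added clamp: every tuple qualifies
        return n_r
    return n_r * (v - min_a) / (max_a - min_a)


print(estimate_equality(10_000, 50))                        # 200.0
print(estimate_at_most(10_000, 500, min_a=0, max_a=1000))   # 5000.0
```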
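Page 36:
Size Estimation of Complex Selections
The selectivity of a condition θ_i is the probability that a tuple in the relation r satisfies θ_i.
If s_i is the number of satisfying tuples in r, the selectivity of θ_i is given by s_i / n_r.
Conjunction: σ_{θ1 ∧ θ2 ∧ … ∧ θn}(r). Assuming independence, the estimate of tuples in the result is:
n_r * (s_1 * s_2 * … * s_n) / n_r^n
Disjunction: σ_{θ1 ∨ θ2 ∨ … ∨ θn}(r). Estimated number of tuples:
n_r * (1 - (1 - s_1 / n_r) * (1 - s_2 / n_r) * … * (1 - s_n / n_r))
Negation: σ_{¬θ}(r). Estimated number of tuples:
n_r - size(σ_{θ}(r))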
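As a quick numeric check of the three formulas, here is a small sketch. The function names and the sample sizes (a relation of 1000 tuples with conditions satisfied by 100 and 500 tuples) are made up for illustration; the independence assumption is the one stated on the slide.

```python
from math import prod


def conjunction_size(n_r, sizes):
    """n_r * (s_1 * ... * s_n) / n_r^n, assuming independent conditions."""
    return n_r * prod(s / n_r for s in sizes)


def disjunction_size(n_r, sizes):
    """n_r * (1 - (1 - s_1/n_r) * ... * (1 - s_n/n_r))."""
    return n_r * (1 - prod(1 - s / n_r for s in sizes))


def negation_size(n_r, size):
    """n_r - size(sigma_theta(r))."""
    return n_r - size


print(conjunction_size(1000, [100, 500]))
print(disjunction_size(1000, [100, 500]))
print(negation_size(1000, 100))
```

For these inputs the conjunction estimate is about 50 tuples and the disjunction estimate about 550, so the disjunction is smaller than the naive sum 100 + 500 because tuples satisfying both conditions are counted once.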
Page 37:
Join Operation: Running Example
Running example: depositor ⋈ customer
Catalog information for join examples:
n_customer = 10,000.
f_customer = 25, which implies that b_customer = 10000/25 = 400.
n_depositor = 5000.
f_depositor = 50, which implies that b_depositor = 5000/50 = 100.
V(customer_name, depositor) = 2500, which implies that, on average, each customer has two accounts.
Also assume that customer_name in depositor is a foreign key on customer.
V(customer_name, customer) = 10000 (primary key!)
Page 38:
Estimation of the Size of Joins
The Cartesian product r × s contains n_r * n_s tuples; each tuple occupies s_r + s_s bytes.
If R ∩ S = ∅, then r ⋈ s is the same as r × s.
If R ∩ S is a key for R, then a tuple of s will join with at most one tuple from r.
Therefore, the number of tuples in r ⋈ s is no greater than the number of tuples in s.
If R ∩ S in S is a foreign key in S referencing R, then the number of tuples in r ⋈ s is exactly the same as the number of tuples in s.
The case for R ∩ S being a foreign key referencing S is symmetric.
In the example query depositor ⋈ customer, customer_name in depositor is a foreign key of customer;
hence, the result has exactly n_depositor tuples, which is 5000.
Page 39:
Estimation of the Size of Joins (Cont.)
If R ∩ S = {A} is not a key for R or S:
If we assume that every tuple t in R produces tuples in R ⋈ S, the number of tuples in R ⋈ S is estimated to be:
n_r * n_s / V(A, s)
If the reverse is true, the estimate obtained will be:
n_r * n_s / V(A, r)
The lower of these two estimates is probably the more accurate one.
Can improve on the above if histograms are available:
use a formula similar to the above, for each cell of the histograms on the two relations.
Page 40:
Estimation of the Size of Joins (Cont.)
Compute the size estimates for depositor ⋈ customer without using information about foreign keys:
V(customer_name, depositor) = 2500, and V(customer_name, customer) = 10000.
The two estimates are 5000 * 10000/2500 = 20,000 and 5000 * 10000/10000 = 5000.
We choose the lower estimate, which in this case is the same as our earlier computation using foreign keys.
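The two join-size estimates for the running example can be reproduced with a few lines of Python. The helper name `join_size_estimates` is chosen for this sketch; the input numbers are the catalog values from the running example.

```python
def join_size_estimates(n_r, n_s, v_a_r, v_a_s):
    """Return both estimates for |r JOIN s| on a shared attribute A:
    n_r * n_s / V(A, s) and n_r * n_s / V(A, r)."""
    return n_r * n_s / v_a_s, n_r * n_s / v_a_r


# r = depositor (5000 tuples, 2500 distinct names),
# s = customer (10000 tuples, 10000 distinct names)
via_customer, via_depositor = join_size_estimates(5000, 10_000, 2500, 10_000)
print(via_customer, via_depositor)         # 5000.0 20000.0
print(min(via_customer, via_depositor))    # 5000.0, the estimate that is kept
```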
Page 41:
Set operations
For unions/intersections of selections on the same relation: rewrite and use the size estimate for selections.
E.g., σ_{θ1}(r) ∪ σ_{θ2}(r) can be rewritten as σ_{θ1 ∨ θ2}(r).
For operations on different relations:
estimated size of r ∪ s = size of r + size of s.
estimated size of r ∩ s = minimum of size of r and size of s.
estimated size of r - s = size of r.
All three estimates may be quite inaccurate, but provide upper bounds on the sizes.
Page 42:
Size Estimation (Cont.)
Outer join:
Estimated size of r ⟕ s = size of r ⋈ s + size of r
Case of right outer join is symmetric
Estimated size of r ⟗ s = size of r ⋈ s + size of r + size of s
Page 43:
Estimation of Number of Distinct Values
Selections: σ_{θ}(r)
If θ forces A to take a specified value: V(A, σ_{θ}(r)) = 1.
E.g., A = 3.
If θ forces A to take on one of a specified set of values: V(A, σ_{θ}(r)) = number of specified values.
E.g., (A = 1 ∨ A = 3 ∨ A = 4).
If the selection condition θ is of the form A op v:
estimated V(A, σ_{θ}(r)) = V(A, r) * s
where s is the selectivity of the selection.
In all the other cases: use an approximate estimate of
min(V(A, r), n_{σθ(r)})
A more accurate estimate can be obtained using probability theory, but this one works fine generally.
Page 44:
Estimation of Distinct Values (Cont.)
Joins: r ⋈ s
If all attributes in A are from r:
estimated V(A, r ⋈ s) = min(V(A, r), n_{r ⋈ s})
If A contains attributes A1 from r and A2 from s, then estimated
V(A, r ⋈ s) = min(V(A1, r) * V(A2 - A1, s), V(A1 - A2, r) * V(A2, s), n_{r ⋈ s})
A more accurate estimate can be obtained using probability theory, but this one works fine generally.
Page 45:
Estimation of Distinct Values (Cont.)
Estimation of distinct values is straightforward for projections.
They are the same in Π_{A}(r) as in r.
The same holds for grouping attributes of aggregation.
For aggregated values:
For min(A) and max(A), the number of distinct values can be estimated as min(V(A, r), V(G, r)), where G denotes the grouping attributes.
For other aggregates, assume all values are distinct, and use V(G, r).
Page 46:
Optimizing Nested Subqueries**
SQL conceptually treats nested subqueries in the where clause as functions that take parameters and return a single value or set of values.
Parameters are variables from the outer level query that are used in the nested subquery; such variables are called correlation variables.
E.g.
select customer_name
from borrower
where exists (select *
from depositor
where depositor.customer_name = borrower.customer_name)
Conceptually, the nested subquery is executed once for each tuple in the cross-product generated by the outer level from clause.
Such evaluation is called correlated evaluation.
Note: other conditions in the where clause may be used to compute a join (instead of a cross-product) before executing the nested subquery.
Page 47:
Optimizing Nested Subqueries (Cont.)
Correlated evaluation may be quite inefficient since
a large number of calls may be made to the nested query
there may be unnecessary random I/O as a result
SQL optimizers attempt to transform nested subqueries to joins where possible, enabling use of efficient join techniques.
E.g., the earlier nested query can be rewritten as
select customer_name
from borrower, depositor
where depositor.customer_name = borrower.customer_name
Note: the above query doesn't correctly deal with duplicates; it can be modified to do so, as we will see.
In general, it is not possible/straightforward to move the entire nested subquery from clause into the outer level query from clause.
A temporary relation is created instead, and used in the body of the outer level query.
Page 48:
Optimizing Nested Subqueries (Cont.)
In general, SQL queries of the form below can be rewritten as shown.
Rewrite:
select …
from L1
where P1 and exists (select *
from L2
where P2)
To:
create table t1 as
select distinct V
from L2
where P2^1
select …
from L1, t1
where P1 and P2^2
P2^1 contains predicates in P2 that do not involve any correlation variables.
P2^2 reintroduces predicates involving correlation variables, with relations renamed appropriately.
V contains all attributes used in predicates with correlation variables.
Page 49:
Optimizing Nested Subqueries (Cont.)
In our example, the original nested query would be transformed to
create table t1 as
select distinct customer_name
from depositor
select customer_name
from borrower, t1
where t1.customer_name = borrower.customer_name
The process of replacing a nested query by a query with a join (possibly with a temporary relation) is called decorrelation.
Decorrelation is more complicated when
the nested subquery uses aggregation, or
when the result of the nested subquery is used to test for equality, or
when the condition linking the nested subquery to the other query is not exists,
and so on.
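The effect of decorrelation, and the duplicate problem of the naive join rewrite, can be observed with Python's built-in sqlite3 module. This is an illustrative sketch only: the sample rows are invented, and SQLite stands in for whatever database system is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table borrower(customer_name text, loan_number text);
    create table depositor(customer_name text, account_number text);
    insert into borrower values ('Jones', 'L-17'), ('Jones', 'L-93'), ('Smith', 'L-23');
    insert into depositor values ('Jones', 'A-101'), ('Jones', 'A-201'), ('Hayes', 'A-102');
""")

# Original nested query, conceptually evaluated once per borrower tuple.
correlated = sorted(conn.execute("""
    select customer_name from borrower
    where exists (select * from depositor
                  where depositor.customer_name = borrower.customer_name)
""").fetchall())

# Naive join rewrite: produces extra duplicates when a customer has several accounts.
naive_join = conn.execute("""
    select borrower.customer_name from borrower, depositor
    where depositor.customer_name = borrower.customer_name
""").fetchall()

# Decorrelated form using a temporary relation with distinct customer names.
conn.execute("create table t1 as select distinct customer_name from depositor")
decorrelated = sorted(conn.execute("""
    select borrower.customer_name from borrower, t1
    where t1.customer_name = borrower.customer_name
""").fetchall())

print(correlated, len(naive_join), decorrelated)
```

With this sample data the nested query and the decorrelated query both return Jones twice (once per loan), while the naive join returns four rows.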
Page 50:
Materialized Views**
A materialized view is a view whose contents are computed and stored.
Consider the view
create view branch_total_loan(branch_name, total_loan) as
select branch_name, sum(amount)
from loan
group by branch_name
Materializing the above view would be very useful if the total loan amount is required frequently.
Saves the effort of finding multiple tuples and adding up their amounts.
Page 51:
Materialized View Maintenance
The task of keeping a materialized view up-to-date with the underlying data is known as materialized view maintenance.
Materialized views can be maintained by recomputation on every update.
A better option is to use incremental view maintenance:
changes to database relations are used to compute changes to the materialized view, which is then updated.
View maintenance can be done by
manually defining triggers on insert, delete, and update of each relation in the view definition
manually written code to update the view whenever database relations are updated
direct support by the database
Page 52:
Incremental View Maintenance
The changes (inserts and deletes) to a relation or expression are referred to as its differential.
The sets of tuples inserted to and deleted from r are denoted i_r and d_r.
To simplify our description, we only consider inserts and deletes.
We replace updates to a tuple by deletion of the tuple followed by insertion of the updated tuple.
We describe how to compute the change to the result of each relational operation, given changes to its inputs.
We then outline how to handle relational algebra expressions.
Page 53:
Join Operation
Consider the materialized view v = r ⋈ s and an update to r.
Let r_old and r_new denote the old and new states of relation r.
Consider the case of an insert to r:
We can write r_new ⋈ s as (r_old ∪ i_r) ⋈ s
and rewrite the above to (r_old ⋈ s) ∪ (i_r ⋈ s).
But (r_old ⋈ s) is simply the old value of the materialized view, so the incremental change to the view is just i_r ⋈ s.
Thus, for inserts v_new = v_old ∪ (i_r ⋈ s)
Similarly for deletes v_new = v_old - (d_r ⋈ s)
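The insert and delete rules for a materialized join can be checked on tiny relations. In this sketch relations are Python sets of tuples, r has schema (A, B), s has schema (B, C), and the helper `join` is a made-up natural join on B; the sample data is invented.

```python
def join(r, s):
    """Natural join of r(A, B) and s(B, C) on the shared attribute B."""
    return {(a, b, c) for (a, b) in r for (b2, c) in s if b == b2}


r_old = {(1, "x"), (2, "y")}
s = {("x", 10), ("y", 20), ("y", 30)}
v_old = join(r_old, s)                        # the materialized view

# Insert: only the new tuples i_r are joined with s.
i_r = {(3, "x")}
v_after_insert = v_old | join(i_r, s)
r_after_insert = r_old | i_r

# Delete: the join of the deleted tuples d_r with s is removed from the view.
d_r = {(2, "y")}
v_after_delete = v_after_insert - join(d_r, s)
r_after_delete = r_after_insert - d_r

# Both incremental results agree with recomputing the view from scratch.
print(v_after_insert == join(r_after_insert, s))
print(v_after_delete == join(r_after_delete, s))
```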
Page 54:
Selection and Projection Operations
Selection: Consider a view v = σ_{θ}(r).
v_new = v_old ∪ σ_{θ}(i_r)
v_new = v_old - σ_{θ}(d_r)
Projection is a more difficult operation.
R = (A, B), and r(R) = { (a, 2), (a, 3) }
Π_{A}(r) has a single tuple (a).
If we delete the tuple (a, 2) from r, we should not delete the tuple (a) from Π_{A}(r), but if we then delete (a, 3) as well, we should delete the tuple.
For each tuple in a projection Π_{A}(r), we will keep a count of how many times it was derived.
On insert of a tuple to r, if the resultant tuple is already in Π_{A}(r) we increment its count, else we add a new tuple with count = 1.
On delete of a tuple from r, we decrement the count of the corresponding tuple in Π_{A}(r).
If the count becomes 0, we delete the tuple from Π_{A}(r).
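The counting scheme for projections can be sketched with `collections.Counter`. The class name `ProjectionView` and its methods are hypothetical, chosen for this illustration; the example replays the (a, 2), (a, 3) scenario described above.

```python
from collections import Counter


class ProjectionView:
    """Maintains a projection on one attribute with a derivation count per tuple."""

    def __init__(self, attr_index):
        self.attr_index = attr_index
        self.counts = Counter()

    def insert(self, tup):
        # Increment the count, implicitly starting from 0 for a new projected value.
        self.counts[tup[self.attr_index]] += 1

    def delete(self, tup):
        key = tup[self.attr_index]
        self.counts[key] -= 1
        if self.counts[key] == 0:       # last derivation gone: drop the tuple
            del self.counts[key]

    def tuples(self):
        return set(self.counts)


view = ProjectionView(attr_index=0)     # projection on attribute A of r(A, B)
view.insert(("a", 2))
view.insert(("a", 3))
view.delete(("a", 2))
print(view.tuples())                    # {'a'}: one derivation is still present
view.delete(("a", 3))
print(view.tuples())                    # set(): the projected tuple is removed
```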
Page 55:
Aggregation Operations
count: v = _{A}G_{count(B)}(r).
When a set of tuples i_r is inserted:
for each tuple t in i_r, if the corresponding group is already present in v, we increment its count, else we add a new tuple with count = 1.
When a set of tuples d_r is deleted:
for each tuple t in d_r, we look for the group t.A in v, and subtract 1 from the count for the group.
If the count becomes 0, we delete from v the tuple for the group t.A.
sum: v = _{A}G_{sum(B)}(r)
We maintain the sum in a manner similar to count, except we add/subtract the B value instead of adding/subtracting 1 for the count.
Additionally we maintain the count in order to detect groups with no tuples. Such tuples are deleted from v.
Cannot simply test for sum = 0 (why?)
To handle the case of avg, we maintain the sum and count aggregate values separately, and divide at the end.
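A small sketch makes the "why?" concrete: values in a group can cancel out, so a sum of 0 does not mean the group is empty. The class `SumView` below is a hypothetical illustration that keeps a count and a sum per group, as the slide suggests; the group name and values are made up.

```python
class SumView:
    """Maintains sum(B) grouped by A, with a per-group count to detect empty groups."""

    def __init__(self):
        self.groups = {}                    # group key -> [count, total]

    def insert(self, key, value):
        entry = self.groups.setdefault(key, [0, 0])
        entry[0] += 1
        entry[1] += value

    def delete(self, key, value):
        entry = self.groups[key]
        entry[0] -= 1
        entry[1] -= value
        if entry[0] == 0:                   # no tuples left: remove the group
            del self.groups[key]

    def sums(self):
        return {key: total for key, (count, total) in self.groups.items()}

    def averages(self):
        # avg is derived from the separately maintained sum and count.
        return {key: total / count for key, (count, total) in self.groups.items()}


view = SumView()
view.insert("Downtown", 5)
view.insert("Downtown", -5)
print(view.sums())          # {'Downtown': 0}: sum is 0 but the group has two tuples
view.delete("Downtown", 5)
print(view.sums())          # {'Downtown': -5}
view.delete("Downtown", -5)
print(view.sums())          # {}: the count reached 0, so the group is removed
```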
Page 56:
… expensive. We have to look at the other tuples of r that are in the same group to find the new minimum.
Page 57:
Other Operations
Set intersection: v = r ∩ s
When a tuple is inserted in r, we check if it is present in s, and if so we add it to v.
If the tuple is deleted from r, we delete it from the intersection if it is present.
Updates to s are symmetric.
The other set operations, union and set difference, are handled in a similar fashion.
Outer joins are handled in much the same way as joins but with some extra work;
we leave details to you.
Page 58:
Handling Expressions
To handle an entire expression, we derive expressions for computing the incremental change to the result of each sub-expression, starting from the smallest sub-expressions.
E.g., consider E1 ⋈ E2, where each of E1 and E2 may be a complex expression.
Suppose the set of tuples to be inserted into E1 is given by D1
(computed earlier, since smaller sub-expressions are handled first).
Then the set of tuples to be inserted into E1 ⋈ E2 is given by D1 ⋈ E2.
This is just the usual way of maintaining joins.
Page 59:
Query Optimization and Materialized Views
Rewriting queries to use materialized views:
A materialized view v = r ⋈ s is available.
A user submits a query r ⋈ s ⋈ t.
We can rewrite the query as v ⋈ t.
Whether to do so depends on cost estimates for the two alternatives.
Replacing a use of a materialized view by the view definition:
A materialized view v = r ⋈ s is available, but without any index on it.
User submits a query σ_{A = 10}(v).
Suppose also that s has an index on the common attribute B, and r has an index on attribute A.
The best plan for this query may be to replace v by r ⋈ s, which can lead to the query plan σ_{A = 10}(r) ⋈ s.
Query optimizer should be extended to consider all the above alternatives and choose the best overall plan.
Page 60:
Materialized View Selection
Materialized view selection: "What is the best set of views to materialize?"
This decision must be made on the basis of the system workload.
Indices are just like materialized views; the problem of index selection is closely related to that of materialized view selection, although it is simpler.
Some database systems provide tools to help the database administrator with index and materialized view selection.
Page 61:
End of Chapter
Chapter 14: Query Optimization
Database System Concepts 5th Ed.
©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
Chapter 14: Query Optimization
Introduction
Transformation of Relational Expressions
Catalog Information for Cost Estimation
Statistical Information for Cost Estimation
Cost-based optimization
Dynamic Programming for Choosing Evaluation Plans
Materialized views
Database System Concepts - 5th Edition, Aug 27, 2005.
14.2
©Silberschatz, Korth and Sudarshan
Introduction
Alternative ways of evaluating a given query
Equivalent expressions
Different algorithms for each operation (Chapter 13)
Cost difference between a good and a bad way of evaluating a query can be
enormous
Need to estimate the cost of operations
Statistical information about relations. Examples:
number of tuples,
number of distinct values for an attribute,
Etc.
Statistics estimation for intermediate results
to compute cost of complex expressions
Introduction (Cont.)
Relations generated by two equivalent expressions have the same set of attributes
and contain the same set of tuples
although their tuples/attributes may be ordered differently.
Introduction (Cont.)
Generation of query-evaluation plans for an expression involves several
steps:
1. Generating logically equivalent expressions using equivalence rules.
2. Annotating resultant expressions to get alternative query plans.
3. Choosing the cheapest plan based on estimated cost.
The overall process is called cost-based optimization.
Transformation of Relational Expressions
Two relational algebra expressions are said to be equivalent if on every legal
database instance the two expressions generate the same set of tuples
In SQL, inputs and outputs are multisets of tuples
Note: order of tuples is irrelevant
Two expressions in the multiset version of the relational algebra are said to
be equivalent if on every legal database instance the two expressions
generate the same multiset of tuples
An equivalence rule says that expressions of two forms are equivalent
Can replace expression of first form by second, or vice versa
Equivalence Rules
1. Conjunctive selection operations can be deconstructed into a sequence of individual selections.
σ_{θ1 ∧ θ2}(E) = σ_{θ1}(σ_{θ2}(E))
2. Selection operations are commutative.
σ_{θ1}(σ_{θ2}(E)) = σ_{θ2}(σ_{θ1}(E))
3. Only the last in a sequence of projection operations is needed; the others can be omitted.
Π_{L1}(Π_{L2}(…(Π_{Ln}(E))…)) = Π_{L1}(E)
4. Selections can be combined with Cartesian products and theta joins.
a. σ_{θ}(E1 × E2) = E1 ⋈_{θ} E2
b. σ_{θ1}(E1 ⋈_{θ2} E2) = E1 ⋈_{θ1 ∧ θ2} E2
Equivalence Rules (Cont.)
5. Theta-join operations (and natural joins) are commutative.
E1 ⋈_{θ} E2 = E2 ⋈_{θ} E1
6. (a) Natural join operations are associative:
(E1 ⋈ E2) ⋈ E3 = E1 ⋈ (E2 ⋈ E3)
(b) Theta joins are associative in the following manner:
(E1 ⋈_{θ1} E2) ⋈_{θ2 ∧ θ3} E3 = E1 ⋈_{θ1 ∧ θ3} (E2 ⋈_{θ2} E3)
where θ2 involves attributes from only E2 and E3.
Pictorial Depiction of Equivalence Rules
Equivalence Rules (Cont.)
7. The selection operation distributes over the theta join operation under the following two conditions:
(a) When all the attributes in θ0 involve only the attributes of one of the expressions (E1) being joined.
σ_{θ0}(E1 ⋈_{θ} E2) = (σ_{θ0}(E1)) ⋈_{θ} E2
(b) When θ1 involves only the attributes of E1 and θ2 involves only the attributes of E2.
σ_{θ1 ∧ θ2}(E1 ⋈_{θ} E2) = (σ_{θ1}(E1)) ⋈_{θ} (σ_{θ2}(E2))
Equivalence Rules (Cont.)
8. The projection operation distributes over the theta join operation as follows:
(a) If θ involves only attributes from L1 ∪ L2:
Π_{L1 ∪ L2}(E1 ⋈_{θ} E2) = (Π_{L1}(E1)) ⋈_{θ} (Π_{L2}(E2))
(b) Consider a join E1 ⋈_{θ} E2.
Let L1 and L2 be sets of attributes from E1 and E2, respectively.
Let L3 be attributes of E1 that are involved in join condition θ, but are not in L1 ∪ L2, and
let L4 be attributes of E2 that are involved in join condition θ, but are not in L1 ∪ L2.
Π_{L1 ∪ L2}(E1 ⋈_{θ} E2) = Π_{L1 ∪ L2}((Π_{L1 ∪ L3}(E1)) ⋈_{θ} (Π_{L2 ∪ L4}(E2)))
Equivalence Rules (Cont.)
9. The set operations union and intersection are commutative:
E1 ∪ E2 = E2 ∪ E1
E1 ∩ E2 = E2 ∩ E1
(set difference is not commutative).
10. Set union and intersection are associative.
(E1 ∪ E2) ∪ E3 = E1 ∪ (E2 ∪ E3)
(E1 ∩ E2) ∩ E3 = E1 ∩ (E2 ∩ E3)
11. The selection operation distributes over ∪, ∩ and -.
σ_{θ}(E1 - E2) = σ_{θ}(E1) - σ_{θ}(E2)
and similarly for ∪ and ∩ in place of -.
Also: σ_{θ}(E1 - E2) = σ_{θ}(E1) - E2
and similarly for ∩ in place of -, but not for ∪.
12. The projection operation distributes over union.
Π_{L}(E1 ∪ E2) = (Π_{L}(E1)) ∪ (Π_{L}(E2))
Transformation Example
Query: Find the names of all customers who have an account at some branch
located in Brooklyn.
Π_{customer_name}(σ_{branch_city = "Brooklyn"}(branch ⋈ (account ⋈ depositor)))
Transformation using rule 7a:
Π_{customer_name}((σ_{branch_city = "Brooklyn"}(branch)) ⋈ (account ⋈ depositor))
Performing the selection as early as possible reduces the size of the relation to
be joined.
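The effect of rule 7a can be tried out on toy data. The sketch below models relations as lists of dictionaries, with made-up rows and hypothetical helpers `natural_join` and `select`; it evaluates the query both ways and compares the results and the size of the branch input to the join.

```python
def natural_join(r, s):
    """Join tuples that agree on all attribute names they share."""
    return [{**t, **u} for t in r for u in s
            if all(t[k] == u[k] for k in t.keys() & u.keys())]


def select(rel, pred):
    return [t for t in rel if pred(t)]


# Invented sample rows for illustration.
branch = [{"branch_name": "Downtown", "branch_city": "Brooklyn"},
          {"branch_name": "Perryridge", "branch_city": "Horseneck"}]
account = [{"account_number": "A-101", "branch_name": "Downtown"},
           {"account_number": "A-102", "branch_name": "Perryridge"}]
depositor = [{"customer_name": "Johnson", "account_number": "A-101"},
             {"customer_name": "Hayes", "account_number": "A-102"}]


def in_brooklyn(t):
    return t["branch_city"] == "Brooklyn"


# Original form: join everything, then select.
late = select(natural_join(branch, natural_join(account, depositor)), in_brooklyn)

# After rule 7a: select on branch first, then join.
brooklyn_branches = select(branch, in_brooklyn)
early = natural_join(brooklyn_branches, natural_join(account, depositor))

names_late = {t["customer_name"] for t in late}
names_early = {t["customer_name"] for t in early}
print(names_late == names_early, len(branch), len(brooklyn_branches))
```

Both forms return the same customer names, but the second feeds fewer branch tuples into the join.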
Example with Multiple Transformations
Query: Find the names of all customers with an account at a Brooklyn
branch whose account balance is over $1000.
Π_{customer_name}(σ_{branch_city = "Brooklyn" ∧ balance > 1000}(branch ⋈ (account ⋈ depositor)))
Transformation using join associativity (Rule 6a):
Π_{customer_name}((σ_{branch_city = "Brooklyn" ∧ balance > 1000}(branch ⋈ account)) ⋈ depositor)
The second form provides an opportunity to apply the "perform selections early" rule, resulting in the subexpression
σ_{branch_city = "Brooklyn"}(branch) ⋈ σ_{balance > 1000}(account)
Thus a sequence of transformations can be useful.
Multiple Transformations (Cont.)
Projection Operation Example
$\Pi_{\mathit{customer\_name}}((\sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch}) \bowtie \mathit{account}) \bowtie \mathit{depositor})$
When we compute
$(\sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch}) \bowtie \mathit{account})$
we obtain a relation whose schema is:
(branch_name, branch_city, assets, account_number, balance)
Push projections using equivalence rules 8a and 8b; eliminate unneeded attributes from
intermediate results to get:
$\Pi_{\mathit{customer\_name}}((\Pi_{\mathit{account\_number}}((\sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch})) \bowtie \mathit{account})) \bowtie \mathit{depositor})$
Performing the projection as early as possible reduces the size of the relation to be joined.
Join Ordering Example
For all relations r1, r2, and r3,
$(r_1 \bowtie r_2) \bowtie r_3 = r_1 \bowtie (r_2 \bowtie r_3)$
If $r_2 \bowtie r_3$ is quite large and $r_1 \bowtie r_2$ is small, we choose
$(r_1 \bowtie r_2) \bowtie r_3$
so that we compute and store a smaller temporary relation.
Join Ordering Example (Cont.)
Consider the expression
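$\Pi_{\mathit{customer\_name}}((\sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch})) \bowtie (\mathit{account} \bowtie \mathit{depositor}))$
Could compute $\mathit{account} \bowtie \mathit{depositor}$ first, and join result with
$\sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch})$
but $\mathit{account} \bowtie \mathit{depositor}$ is likely to be a large relation.
Only a small fraction of the bank’s customers are likely to have accounts in
branches located in Brooklyn, so
it is better to compute
$\sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch}) \bowtie \mathit{account}$
first.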
Enumeration of Equivalent Expressions
Query optimizers use equivalence rules to systematically generate expressions
equivalent to the given expression
Conceptually, generate all equivalent expressions by repeatedly executing the following
step until no more expressions can be found:
for each expression found so far, use all applicable equivalence rules
add newly generated expressions to the set of expressions found so far
The above approach is very expensive in space and time
Space requirements reduced by sharing common subexpressions:
when E1 is generated from E2 by an equivalence rule, usually only the top level
of the two are different, subtrees below are the same and can be shared
E.g. when applying join associativity
Time requirements are reduced by not generating all expressions
More details shortly
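The conceptual "apply rules until nothing new appears" loop can be sketched as a small fixpoint computation. The example below is a hedged illustration, not how a production optimizer is built: it represents a join tree as a nested Python tuple `(left, right)` with strings for base relations, and it applies only two rules (join commutativity and associativity). The function names are hypothetical.

```python
def rewrites(e):
    """Yield every expression obtained by one rule application anywhere in e."""
    if isinstance(e, str):                 # a base relation: nothing to rewrite
        return
    left, right = e
    yield (right, left)                    # join commutativity
    if not isinstance(left, str):          # (a ⋈ b) ⋈ c  ->  a ⋈ (b ⋈ c)
        a, b = left
        yield (a, (b, right))
    if not isinstance(right, str):         # a ⋈ (b ⋈ c)  ->  (a ⋈ b) ⋈ c
        b, c = right
        yield ((left, b), c)
    for new_left in rewrites(left):        # apply rules inside the subtrees
        yield (new_left, right)
    for new_right in rewrites(right):
        yield (left, new_right)

def enumerate_equivalent(expr):
    """Repeat rule application until no new expression is found."""
    found = {expr}
    frontier = [expr]
    while frontier:
        e = frontier.pop()
        for e2 in rewrites(e):
            if e2 not in found:
                found.add(e2)
                frontier.append(e2)
    return found

plans = enumerate_equivalent((("r1", "r2"), "r3"))
```

For three relations the closure contains 12 join trees. This sketch stores every tree in full; the subexpression sharing mentioned above is not implemented here.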
Cost Estimation
Cost of each operator computed as described in Chapter 13
Need statistics of input relations
Inputs can be results of sub-expressions
Need to estimate statistics of expression results
To do so, we require additional statistics
E.g. number of tuples, sizes of tuples
E.g. number of distinct values for an attribute
More on cost estimation later
Evaluation Plan
An evaluation plan defines exactly what algorithm is used for each operation, and how
the execution of the operations is coordinated.
Choice of Evaluation Plans
Must consider the interaction of evaluation techniques when choosing evaluation
plans: choosing the cheapest algorithm for each operation independently may not
yield best overall algorithm. E.g.
merge-join may be costlier than hash-join, but may provide a sorted output which
reduces the cost for an outer level aggregation.
nested-loop join may provide opportunity for pipelining
Practical query optimizers incorporate elements of the following two broad
approaches:
1. Search all the plans and choose the best plan in a cost-based fashion.
2. Use heuristics to choose a plan.
Cost-Based Optimization
Consider finding the best join-order for $r_1 \bowtie r_2 \bowtie \ldots \bowtie r_n$.
There are $(2(n-1))!/(n-1)!$ different join orders for above expression.
With n = 7, the number is 665280; with n = 10, the number is greater than
17.6 billion!
No need to generate all the join orders. Using dynamic programming, the least-cost join order for any subset of
$\{r_1, r_2, \ldots, r_n\}$ is computed only once and stored for future use.
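The join-order count $(2(n-1))!/(n-1)!$ is easy to evaluate directly. The function name below is a hypothetical helper used only for this check.

```python
from math import factorial

def num_join_orders(n):
    """Number of join orders for n relations: (2(n-1))! / (n-1)!."""
    return factorial(2 * (n - 1)) // factorial(n - 1)

orders_7 = num_join_orders(7)     # 665280
orders_10 = num_join_orders(10)   # 17643225600
```

For n = 10 this gives 17,643,225,600, which shows how quickly exhaustive enumeration becomes impractical.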
Dynamic Programming in Optimization
To find best join tree for a set of n relations:
To find best plan for a set S of n relations, consider all possible plans
of the form: $S_1 \bowtie (S - S_1)$ where $S_1$ is any non-empty subset of S.
Recursively compute costs for joining subsets of S to find the cost of
each plan. Choose the cheapest of the $2^n - 1$ alternatives.
When plan for any subset is computed, store it and reuse it when it is
required again, instead of recomputing it
Dynamic programming
Join Order Optimization Algorithm
procedure findbestplan(S)
    if (bestplan[S].cost ≠ ∞)
        return bestplan[S]
    // else bestplan[S] has not been computed earlier, compute it now
    if (S contains only 1 relation)
        set bestplan[S].plan and bestplan[S].cost based on the best way
            of accessing S
    else for each non-empty subset S1 of S such that S1 ≠ S
        P1 = findbestplan(S1)
        P2 = findbestplan(S - S1)
        A = best algorithm for joining results of P1 and P2
        cost = P1.cost + P2.cost + cost of A
        if cost < bestplan[S].cost
            bestplan[S].cost = cost
            bestplan[S].plan = “execute P1.plan; execute P2.plan;
                                join results of P1 and P2 using A”
    return bestplan[S]
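A minimal runnable sketch of the same memoized search follows. Everything beyond the control flow is an assumption made for illustration: the function name `find_best_plan`, the toy cost model (rows read from base relations plus rows produced by each join, standing in for "cost of A"), and the catalog numbers in the usage example.

```python
from itertools import combinations

def find_best_plan(relations, card, sel):
    """Return (cost, plan, rows) for joining all relations.

    card: {name: row count}; sel: {frozenset({a, b}): join selectivity}.
    The cost model is a toy assumption, not a real operator cost formula.
    """
    best = {}  # memo table: frozenset of relation names -> (cost, plan, rows)

    def rows_of(s):
        # Estimated result size: product of cardinalities and of the
        # selectivities of all join predicates that fall inside s.
        n = 1.0
        for r in s:
            n *= card[r]
        for pair, f in sel.items():
            if pair <= s:
                n *= f
        return n

    def solve(s):
        if s in best:                       # computed earlier: reuse it
            return best[s]
        if len(s) == 1:
            (r,) = s
            best[s] = (card[r], r, card[r])
            return best[s]
        result = None
        members = sorted(s)
        for k in range(1, len(members)):
            for combo in combinations(members, k):
                s1 = frozenset(combo)       # non-empty proper subset S1 of S
                c1, p1, _ = solve(s1)
                c2, p2, _ = solve(s - s1)
                cost = c1 + c2 + rows_of(s)
                if result is None or cost < result[0]:
                    result = (cost, (p1, p2), rows_of(s))
        best[s] = result
        return result

    return solve(frozenset(relations))

# Hypothetical catalog: branch-account and account-depositor are joined on
# a shared attribute; branch and depositor have no common attribute.
card = {"branch": 10, "account": 1000, "depositor": 1200}
sel = {frozenset({"branch", "account"}): 0.1,
       frozenset({"account", "depositor"}): 0.001}
cost, plan, rows = find_best_plan(card.keys(), card, sel)
```

Under these assumed numbers the search joins branch with account first and adds depositor last, because the branch-depositor pairing is a Cartesian product and the account-depositor intermediate result is slightly larger.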
Left Deep Join Trees
In left-deep join trees, the right-hand-side input for each join is a relation,
not the result of an intermediate join.
Cost of Optimization
With dynamic programming time complexity of optimization with bushy trees is
$O(3^n)$.
With n = 10, this number is 59000 instead of 17.6 billion!
Space complexity is $O(2^n)$
To find best left-deep join tree for a set of n relations:
Consider n alternatives with one relation as right-hand side input and the other
relations as left-hand side input.
Using (recursively computed and stored) least-cost join order for each
alternative on left-hand-side, choose the cheapest of the n alternatives.
If only left-deep trees are considered, time complexity of finding best join order
is $O(n 2^n)$
Space complexity remains at $O(2^n)$
Cost-based optimization is expensive, but worthwhile for queries on large
datasets (typical queries have small n, generally < 10)
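One way to see where these bounds come from, as a sketch: for each subset of size k, the bushy-tree search examines on the order of $2^k$ ways to split it, while the left-deep search examines k choices for the right-hand input. Summing over all subsets and applying the binomial theorem gives:

```latex
% Bushy trees: every subset of size k is split in about 2^k ways.
\sum_{k=0}^{n} \binom{n}{k} 2^{k} = (1 + 2)^{n} = 3^{n},
\qquad 3^{10} = 59049

% Left-deep trees: every subset of size k has k candidate right-hand inputs.
\sum_{k=0}^{n} \binom{n}{k}\, k = n\, 2^{\,n-1} = O(n\, 2^{n})
```

The space bound $O(2^n)$ is the number of subsets for which a best plan is stored.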
Interesting Orders in Cost-Based Optimization
Consider the expression $(r_1 \bowtie r_2 \bowtie r_3) \bowtie r_4 \bowtie r_5$
An interesting sort order is a particular sort order of tuples that could be useful
for a later operation.
Generating the result of $r_1 \bowtie r_2 \bowtie r_3$ sorted on the attributes common with
r4 or r5 may be useful, but generating it sorted on the attributes common
to only r1 and r2 is not useful.
Using merge-join to compute $r_1 \bowtie r_2 \bowtie r_3$ may be costlier, but may provide
an output sorted in an interesting order.
Not sufficient to find the best join order for each subset of the set of n given
relations; must find the best join order for each subset, for each interesting sort
order
Simple extension of earlier dynamic programming algorithms
Usually, number of interesting orders is quite small and doesn’t affect
time/space complexity significantly
Heuristic Optimization
Cost-based optimization is expensive, even with dynamic programming.
Systems may use heuristics to reduce the number of choices that must be made
in a cost-based fashion.
Heuristic optimization transforms the query-tree by using a set of rules that
typically (but not in all cases) improve execution performance:
Perform selection early (reduces the number of tuples)
Perform projection early (reduces the number of attributes)
Perform most restrictive selection and join operations before other similar
operations.
Some systems use only heuristics, others combine heuristics with partial
cost-based optimization.
Steps in Typical Heuristic Optimization
1. Deconstruct conjunctive selections into a sequence of single selection operations
(Equiv. rule 1.).
2. Move selection operations down the query tree for the earliest possible
execution (Equiv. rules 2, 7a, 7b, 11).
3. Execute first those selection and join operations that will produce the smallest
relations (Equiv. rule 6).
4. Replace Cartesian product operations that are followed by a selection condition
by join operations (Equiv. rule 4a).
5. Deconstruct and move as far down the tree as possible lists of projection
attributes, creating new projections where needed (Equiv. rules 3, 8a, 8b,
12).
6. Identify those subtrees whose operations can be pipelined, and execute them
using pipelining.
Structure of Query Optimizers
The System R/Starburst optimizer considers only left-deep join orders. This
reduces optimization complexity and generates plans amenable to pipelined
evaluation.
System R/Starburst also uses heuristics to push selections and projections down
the query tree.
Heuristic optimization used in some versions of Oracle:
Repeatedly pick “best” relation to join next
Starting from each of n starting points. Pick best among these.
For scans using secondary indices, some optimizers take into account the
probability that the page containing the tuple is in the buffer.
Intricacies of SQL complicate query optimization
E.g. nested subqueries
Structure of Query Optimizers (Cont.)
Some query optimizers integrate heuristic selection and the generation of
alternative access plans.
System R and Starburst use a hierarchical procedure based on the
nested-block concept of SQL: heuristic rewriting followed by cost-based
join-order optimization.
Even with the use of heuristics, cost-based query optimization imposes a
substantial overhead.
This expense is usually more than offset by savings at query-execution time,
particularly by reducing the number of slow disk accesses.
Statistical Information for Cost Estimation
$n_r$: number of tuples in a relation r.
$b_r$: number of blocks containing tuples of r.
$l_r$: size of a tuple of r.
$f_r$: blocking factor of r, i.e., the number of tuples of r that fit into one block.
V(A, r): number of distinct values that appear in r for attribute A; same as the size of
$\Pi_A(r)$.
If tuples of r are stored together physically in a file, then:
$b_r = \lceil n_r / f_r \rceil$
Histograms
Histogram on attribute age of relation person
Equi-width histograms
Equi-depth histograms
Selection Size Estimation
$\sigma_{A=v}(r)$
$n_r / V(A,r)$: number of records that will satisfy the selection
Equality condition on a key attribute: size estimate = 1
$\sigma_{A \le v}(r)$ (case of $\sigma_{A \ge v}(r)$ is symmetric)
Let c denote the estimated number of tuples satisfying the condition.
If min(A,r) and max(A,r) are available in catalog
c = 0 if v < min(A,r)
$c = n_r \cdot \frac{v - \min(A, r)}{\max(A, r) - \min(A, r)}$ otherwise
If histograms available, can refine above estimate
In absence of statistical information c is assumed to be $n_r / 2$.
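The two estimates can be written as small functions. This is a sketch under the uniform-distribution assumption; the function and parameter names are hypothetical, and the guard for v at or above the maximum is an addition not shown in the formulas here.

```python
def estimate_equality(n_r, distinct_values, is_key=False):
    """Estimated size of sigma_{A=v}(r), assuming uniformly distributed values."""
    return 1 if is_key else n_r / distinct_values

def estimate_less_equal(n_r, v, min_a=None, max_a=None):
    """Estimated size of sigma_{A<=v}(r) using catalog min/max when available."""
    if min_a is None or max_a is None:
        return n_r / 2            # no statistics: assume half the tuples qualify
    if v < min_a:
        return 0
    if v >= max_a:
        return n_r                # added guard (assumption): every tuple qualifies
    return n_r * (v - min_a) / (max_a - min_a)

eq_size = estimate_equality(10000, 50)              # 10000 / 50
range_size = estimate_less_equal(1000, 30, 0, 100)  # 1000 * 30 / 100
```

With 10,000 tuples and 50 distinct values the equality estimate is 200 tuples; with values spread over 0 to 100, a predicate A ≤ 30 is estimated to select 300 of 1,000 tuples.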
Size Estimation of Complex Selections
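The selectivity of a condition $\theta_i$ is the probability that a tuple in the relation r satisfies $\theta_i$.
If $s_i$ is the number of satisfying tuples in r, the selectivity of $\theta_i$ is given by $s_i / n_r$.
Conjunction: $\sigma_{\theta_1 \wedge \theta_2 \wedge \ldots \wedge \theta_n}(r)$. Assuming independence, estimate of
tuples in the result is:
$n_r \cdot \frac{s_1 \cdot s_2 \cdot \ldots \cdot s_n}{n_r^n}$
Disjunction: $\sigma_{\theta_1 \vee \theta_2 \vee \ldots \vee \theta_n}(r)$. Estimated number of tuples:
$n_r \cdot \left(1 - \left(1 - \frac{s_1}{n_r}\right) \cdot \left(1 - \frac{s_2}{n_r}\right) \cdot \ldots \cdot \left(1 - \frac{s_n}{n_r}\right)\right)$
Negation: $\sigma_{\neg\theta}(r)$. Estimated number of tuples:
$n_r - \mathrm{size}(\sigma_\theta(r))$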
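A short sketch of the three estimates, assuming the conditions are independent. The function names and the sample numbers are hypothetical.

```python
from math import prod

def conjunction_size(n_r, sizes):
    """n_r times the product of the individual selectivities s_i / n_r."""
    return n_r * prod(s / n_r for s in sizes)

def disjunction_size(n_r, sizes):
    """n_r times the probability that at least one condition holds."""
    return n_r * (1 - prod(1 - s / n_r for s in sizes))

def negation_size(n_r, size_theta):
    """Tuples that do not satisfy theta."""
    return n_r - size_theta

# Hypothetical relation with 1000 tuples; two conditions satisfied by
# 500 and 200 tuples respectively.
conj = conjunction_size(1000, [500, 200])
disj = disjunction_size(1000, [500, 200])
neg = negation_size(1000, 500)
```

Here the conjunction estimate is about 100 tuples (0.5 × 0.2 of the relation), the disjunction estimate about 600 tuples (1 − 0.5 × 0.8), and the negation estimate 500 tuples.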
Join Operation: Running Example
Running example:
$\mathit{depositor} \bowtie \mathit{customer}$
Catalog information for join examples:
$n_{\mathit{customer}}$ = 10,000.
$f_{\mathit{customer}}$ = 25, which implies that
$b_{\mathit{customer}}$ = 10000/25 = 400.
$n_{\mathit{depositor}}$ = 5000.
$f_{\mathit{depositor}}$ = 50, which implies that
$b_{\mathit{depositor}}$ = 5000/50 = 100.
V(customer_name, depositor) = 2500, which implies that, on average, each
customer has two accounts.
Also assume that customer_name in depositor is a foreign key on customer.
V(customer_name, customer) = 10000 (primary key!)
Estimation of the Size of Joins
The Cartesian product $r \times s$ contains $n_r \cdot n_s$ tuples; each tuple occupies $s_r + s_s$
bytes.
If $R \cap S = \emptyset$, then $r \bowtie s$ is the same as $r \times s$.
If $R \cap S$ is a key for R, then a tuple of s will join with at most one tuple from r
therefore, the number of tuples in $r \bowtie s$ is no greater than the number of
tuples in s.
If $R \cap S$ in S is a foreign key in S referencing R, then the number of tuples
in $r \bowtie s$ is exactly the same as the number of tuples in s.
The case for $R \cap S$ being a foreign key referencing S is
symmetric.
In the example query $\mathit{depositor} \bowtie \mathit{customer}$, customer_name in depositor is a
foreign key of customer
hence, the result has exactly $n_{\mathit{depositor}}$ tuples, which is 5000
Estimation of the Size of Joins (Cont.)
If $R \cap S = \{A\}$ is not a key for R or S.
If we assume that every tuple t in R produces tuples in $R \bowtie S$, the number of
tuples in $R \bowtie S$ is estimated to be:
$\frac{n_r \cdot n_s}{V(A, s)}$
If the reverse is true, the estimate obtained will be:
$\frac{n_r \cdot n_s}{V(A, r)}$
The lower of these two estimates is probably the more accurate one.
Can improve on above if histograms are available
Use formula similar to above, for each cell of histograms on the two
relations
Estimation of the Size of Joins (Cont.)
Compute the size estimates for $\mathit{depositor} \bowtie \mathit{customer}$ without using information
about foreign keys:
V(customer_name, depositor) = 2500, and
V(customer_name, customer) = 10000
The two estimates are 5000 * 10000/2500 = 20,000 and
5000 * 10000/10000 = 5000
We choose the lower estimate, which in this case, is the same as our earlier
computation using foreign keys.
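The same arithmetic as a function, as a minimal sketch; `estimate_join_size` and its parameter names are hypothetical.

```python
def estimate_join_size(n_r, n_s, v_a_r, v_a_s):
    """Size estimate for r ⋈ s on a single common attribute A that is not a key.

    Computes both n_r * n_s / V(A, s) and n_r * n_s / V(A, r) and keeps
    the lower one.
    """
    return min(n_r * n_s / v_a_s, n_r * n_s / v_a_r)

# depositor ⋈ customer with the catalog values of the running example.
size = estimate_join_size(n_r=5000, n_s=10000, v_a_r=2500, v_a_s=10000)
```

With the running-example statistics the result is 5000, matching the estimate obtained from the foreign-key argument.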
Size Estimation for Other Operations
Projection: estimated size of $\Pi_A(r)$ = V(A,r)
Aggregation: estimated size of ${}_A\mathcal{G}_F(r)$ = V(A,r)
Set operations
For unions/intersections of selections on the same relation: rewrite and use
size estimate for selections
E.g. $\sigma_{\theta_1}(r) \cup \sigma_{\theta_2}(r)$ can be rewritten as $\sigma_{\theta_1 \vee \theta_2}(r)$
For operations on different relations:
estimated size of $r \cup s$ = size of r + size of s.
estimated size of $r \cap s$ = minimum size of r and size of s.
estimated size of $r - s$ = size of r.
All the three estimates may be quite inaccurate, but provide upper bounds
on the sizes.
Size Estimation (Cont.)
Outer join:
Estimated size of r ⟕ s (left outer join) = size of $r \bowtie s$ + size of r
Case of right outer join is symmetric
Estimated size of r ⟗ s (full outer join) = size of $r \bowtie s$ + size of r + size of s
Estimation of Number of Distinct Values
Selections: $\sigma_\theta(r)$
If $\theta$ forces A to take a specified value: $V(A, \sigma_\theta(r)) = 1$.
e.g., A = 3
If $\theta$ forces A to take on one of a specified set of values:
$V(A, \sigma_\theta(r))$ = number of specified values.
(e.g., $(A = 1 \vee A = 3 \vee A = 4)$),
If the selection condition $\theta$ is of the form A op v
estimated $V(A, \sigma_\theta(r)) = V(A, r) \cdot s$
where s is the selectivity of the selection.
In all the other cases: use approximate estimate of
$\min(V(A, r), n_{\sigma_\theta(r)})$
More accurate estimate can be got using probability theory, but this one
works fine generally
Estimation of Distinct Values (Cont.)
Joins: $r \bowtie s$
If all attributes in A are from r
estimated $V(A, r \bowtie s) = \min(V(A, r), n_{r \bowtie s})$
If A contains attributes A1 from r and A2 from s, then estimated
$V(A, r \bowtie s) = \min(V(A_1, r) \cdot V(A_2 - A_1, s),\ V(A_1 - A_2, r) \cdot V(A_2, s),\ n_{r \bowtie s})$
More accurate estimate can be got using probability theory, but this one
works fine generally
Estimation of Distinct Values (Cont.)
Estimation of distinct values is straightforward for projections.
They are the same in $\Pi_A(r)$ as in r.
The same holds for grouping attributes of aggregation.
For aggregated values
For min(A) and max(A), the number of distinct values can be estimated as
min(V(A,r), V(G,r)) where G denotes grouping attributes
For other aggregates, assume all values are distinct, and use V(G,r)
Optimizing Nested Subqueries**
SQL conceptually treats nested subqueries in the where clause as functions that take
parameters and return a single value or set of values
Parameters are variables from outer level query that are used in the nested
subquery; such variables are called correlation variables
E.g.
    select customer_name
    from borrower
    where exists (select *
                  from depositor
                  where depositor.customer_name = borrower.customer_name)
Conceptually, nested subquery is executed once for each tuple in the cross-product
generated by the outer level from clause
Such evaluation is called correlated evaluation
Note: other conditions in where clause may be used to compute a join (instead of a
cross-product) before executing the nested subquery
Optimizing Nested Subqueries (Cont.)
Correlated evaluation may be quite inefficient since
a large number of calls may be made to the nested query
there may be unnecessary random I/O as a result
SQL optimizers attempt to transform nested subqueries to joins where possible,
enabling use of efficient join techniques
E.g.: earlier nested query can be rewritten as
select customer_name
from borrower, depositor
where depositor.customer_name = borrower.customer_name
Note: above query doesn’t correctly deal with duplicates, can be modified to do
so as we will see
In general, it is not possible/straightforward to move the entire from clause of the
nested subquery into the from clause of the outer level query
A temporary relation is created instead, and used in body of outer level query
Optimizing Nested Subqueries (Cont.)
In general, SQL queries of the form below can be rewritten as shown
Rewrite:
    select …
    from L1
    where P1 and exists (select *
                         from L2
                         where P2)
To:
    create table t1 as
        select distinct V
        from L2
        where P21
    select …
    from L1, t1
    where P1 and P22
P21 contains predicates in P2 that do not involve any correlation variables
P22 reintroduces predicates involving correlation variables, with
relations renamed appropriately
V contains all attributes used in predicates with correlation variables
Optimizing Nested Subqueries (Cont.)
In our example, the original nested query would be transformed to
    create table t1 as
        select distinct customer_name
        from depositor
    select customer_name
    from borrower, t1
    where t1.customer_name = borrower.customer_name
The process of replacing a nested query by a query with a join (possibly with a
temporary relation) is called decorrelation.
Decorrelation is more complicated when
the nested subquery uses aggregation, or
when the result of the nested subquery is used to test for equality, or
when the condition linking the nested subquery to the outer
query is not exists,
and so on.
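The effect of decorrelation on duplicates can be tried out with an in-memory SQLite database. This is only an illustration: the sample rows and the extra columns `loan_number` and `account_number` are made up, a temporary table stands in for t1, and the selected column is qualified with its table name because SQLite rejects the unqualified form as ambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table borrower(customer_name text, loan_number text);
    create table depositor(customer_name text, account_number text);
    insert into borrower values ('Hayes', 'L-15'), ('Jones', 'L-17');
    insert into depositor values
        ('Hayes', 'A-102'), ('Hayes', 'A-305'), ('Smith', 'A-215');
""")

# Original nested query with a correlated subquery.
nested = conn.execute("""
    select customer_name from borrower
    where exists (select * from depositor
                  where depositor.customer_name = borrower.customer_name)
""").fetchall()

# Naive join rewrite: Hayes appears once per matching depositor row.
naive = conn.execute("""
    select borrower.customer_name from borrower, depositor
    where depositor.customer_name = borrower.customer_name
""").fetchall()

# Decorrelated form: join with a temporary relation of distinct names.
conn.execute("create temp table t1 as select distinct customer_name from depositor")
decorrelated = conn.execute("""
    select borrower.customer_name from borrower, t1
    where t1.customer_name = borrower.customer_name
""").fetchall()
```

On this data the nested query and the decorrelated form both return Hayes once, while the naive join returns Hayes twice, once for each of the two accounts.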
Materialized Views**
A materialized view is a view whose contents are computed and stored.
Consider the view
    create view branch_total_loan(branch_name, total_loan) as
        select branch_name, sum(amount)
        from loan
        group by branch_name
Materializing the above view would be very useful if the total loan amount is
required frequently
Saves the effort of finding multiple tuples and adding up their amounts
Materialized View Maintenance
The task of keeping a materialized view up-to-date with the underlying data is
known as materialized view maintenance
Materialized views can be maintained by recomputation on every update
A better option is to use incremental view maintenance
Changes to database relations are used to compute changes to materialized
view, which is then updated
View maintenance can be done by
Manually defining triggers on insert, delete, and update of each relation in the
view definition
Manually written code to update the view whenever database relations are
updated
Supported directly by the database
Incremental View Maintenance
The changes (inserts and deletes) to a relation or expression are referred to as
its differential
Set of tuples inserted to and deleted from r are denoted $i_r$ and $d_r$
To simplify our description, we only consider inserts and deletes
We replace updates to a tuple by deletion of the tuple followed by insertion of
the updated tuple
We describe how to compute the change to the result of each relational operation,
given changes to its inputs
We then outline how to handle relational algebra expressions
Join Operation
Consider the materialized view $v = r \bowtie s$ and an update to r
Let $r^{old}$ and $r^{new}$ denote the old and new states of relation r
Consider the case of an insert to r:
We can write $r^{new} \bowtie s$ as $(r^{old} \cup i_r) \bowtie s$
And rewrite the above to $(r^{old} \bowtie s) \cup (i_r \bowtie s)$
But $(r^{old} \bowtie s)$ is simply the old value of the materialized view, so the
incremental change to the view is just
$i_r \bowtie s$
Thus, for inserts $v^{new} = v^{old} \cup (i_r \bowtie s)$
Similarly for deletes $v^{new} = v^{old} - (d_r \bowtie s)$
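The insert and delete rules for a join view can be checked against full recomputation on a toy example. The sketch assumes set semantics, relations r(A, B) and s(B, C) joined on B, and made-up data; the `join` helper is hypothetical.

```python
def join(r, s):
    """Natural join of r(A, B) and s(B, C) on B, as a set of (A, B, C) tuples."""
    return {(a, b, c) for (a, b) in r for (b2, c) in s if b == b2}

r_old = {(1, "x"), (2, "y")}
s = {("x", 10), ("y", 20), ("y", 30)}
v_old = join(r_old, s)

i_r = {(3, "y")}                                 # tuples inserted into r
v_after_insert = v_old | join(i_r, s)            # v_new = v_old ∪ (i_r ⋈ s)

d_r = {(1, "x")}                                 # tuples deleted from r
v_after_delete = v_after_insert - join(d_r, s)   # v_new = v_old − (d_r ⋈ s)
```

In both steps only the differential is joined with s, yet the maintained view equals what recomputing the join from scratch would give.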
Selection and Projection Operations
Selection: Consider a view $v = \sigma_\theta(r)$.
$v^{new} = v^{old} \cup \sigma_\theta(i_r)$
$v^{new} = v^{old} - \sigma_\theta(d_r)$
Projection is a more difficult operation
R = (A,B), and r(R) = { (a,2), (a,3)}
$\Pi_A(r)$ has a single tuple (a).
If we delete the tuple (a,2) from r, we should not delete the tuple (a) from
$\Pi_A(r)$, but if we then delete (a,3) as well, we should delete the tuple
For each tuple in a projection $\Pi_A(r)$, we will keep a count of how many times it was
derived
On insert of a tuple to r, if the resultant tuple is already in $\Pi_A(r)$ we increment its
count, else we add a new tuple with count = 1
On delete of a tuple from r, we decrement the count of the corresponding tuple in
$\Pi_A(r)$
if the count becomes 0, we delete the tuple from $\Pi_A(r)$
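The counting scheme for projections can be sketched with a `Counter`. The class name `ProjectionView` and its methods are hypothetical; the data mirrors the (a,2), (a,3) example.

```python
from collections import Counter

class ProjectionView:
    """Maintains a projection with a derivation count per result tuple."""

    def __init__(self, cols):
        self.cols = cols            # positions of the projected attributes
        self.counts = Counter()     # projected tuple -> number of derivations

    def _key(self, t):
        return tuple(t[c] for c in self.cols)

    def insert(self, t):
        self.counts[self._key(t)] += 1

    def delete(self, t):
        k = self._key(t)
        self.counts[k] -= 1
        if self.counts[k] == 0:     # last derivation gone: drop the tuple
            del self.counts[k]

    def tuples(self):
        return set(self.counts)

view = ProjectionView(cols=[0])
view.insert(("a", 2))
view.insert(("a", 3))
after_inserts = view.tuples()
view.delete(("a", 2))
after_first_delete = view.tuples()
view.delete(("a", 3))
after_second_delete = view.tuples()
```

The tuple (a) survives the first delete because its count drops from 2 to 1, and disappears only after the second delete.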
Aggregation Operations
count: $v = {}_A\mathcal{G}_{count(B)}(r)$.
When a set of tuples $i_r$ is inserted
For each tuple t in $i_r$, if the corresponding group is already present in v, we
increment its count, else we add a new tuple with count = 1
When a set of tuples $d_r$ is deleted
for each tuple t in $d_r$, we look for the group t.A in v, and subtract 1 from the count
for the group.
– If the count becomes 0, we delete from v the tuple for the group t.A
sum: $v = {}_A\mathcal{G}_{sum(B)}(r)$
We maintain the sum in a manner similar to count, except we add/subtract the B value
instead of adding/subtracting 1 for the count
Additionally we maintain the count in order to detect groups with no tuples. Such groups
are deleted from v
Cannot simply test for sum = 0 (why?)
To handle the case of avg, we maintain the sum and count
aggregate values separately, and divide at the end
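A minimal sketch of sum maintenance with an auxiliary count follows. The class `SumView` and the sample values are hypothetical. The example also illustrates one reason a sum = 0 test is not enough: B values may be negative or zero, so a non-empty group can have a sum of 0.

```python
class SumView:
    """Maintains per-group [sum, count] for a grouped sum over (A, B) tuples."""

    def __init__(self):
        self.groups = {}            # group value -> [sum of B, tuple count]

    def insert(self, a, b):
        entry = self.groups.setdefault(a, [0, 0])
        entry[0] += b
        entry[1] += 1

    def delete(self, a, b):
        entry = self.groups[a]
        entry[0] -= b
        entry[1] -= 1
        if entry[1] == 0:           # test the count, not the sum
            del self.groups[a]

    def total(self, a):
        return self.groups[a][0]

    def average(self, a):
        s, c = self.groups[a]
        return s / c                # avg derived from sum and count

view = SumView()
view.insert("g", 5)
view.insert("g", -5)
sum_is_zero = view.total("g") == 0
group_still_present = "g" in view.groups
view.delete("g", 5)
view.delete("g", -5)
group_removed = "g" not in view.groups
```

After the two inserts the group has sum 0 but still contains two tuples, so it must stay in the view; it is removed only when its count reaches 0.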
Aggregate Operations (Cont.)
min, max: $v = {}_A\mathcal{G}_{min(B)}(r)$.
Handling insertions on r is straightforward.
Maintaining the aggregate values min and max on deletions may be more
expensive. We have to look at the other tuples of r that are in the same group
to find the new minimum
Other Operations
Set intersection: $v = r \cap s$
when a tuple is inserted in r we check if it is present in s, and if so we add
it to v.
If the tuple is deleted from r, we delete it from the intersection if it is present.
Updates to s are symmetric
The other set operations, union and set difference are handled in a similar
fashion.
Outer joins are handled in much the same way as joins but with some extra work
we leave details to you.
Handling Expressions
To handle an entire expression, we derive expressions for computing the
incremental change to the result of each sub-expression, starting from the
smallest sub-expressions.
E.g. consider $E_1 \bowtie E_2$ where each of E1 and E2 may be a complex
expression
Suppose the set of tuples to be inserted into E1 is given by D1
Computed earlier, since smaller sub-expressions are handled first
Then the set of tuples to be inserted into $E_1 \bowtie E_2$ is given by
$D_1 \bowtie E_2$
This is just the usual way of maintaining joins
Query Optimization and Materialized Views
Rewriting queries to use materialized views:
A materialized view $v = r \bowtie s$ is available
A user submits a query $r \bowtie s \bowtie t$
We can rewrite the query as $v \bowtie t$
Whether to do so depends on cost estimates for the two alternatives
Replacing a use of a materialized view by the view definition:
A materialized view $v = r \bowtie s$ is available, but without any index on it
User submits a query $\sigma_{A=10}(v)$.
Suppose also that s has an index on the common attribute B, and r has an index
on attribute A.
The best plan for this query may be to replace v by $r \bowtie s$, which can lead to the
query plan $\sigma_{A=10}(r) \bowtie s$
Query optimizer should be extended to consider all above
alternatives and choose the best overall plan
Materialized View Selection
Materialized view selection: “What is the best set of views to materialize?”.
This decision must be made on the basis of the system workload
Indices are just like materialized views; the problem of index selection is closely
related to that of materialized view selection, although it is simpler.
Some database systems provide tools to help the database administrator with
index and materialized view selection.
End of Chapter
Database System Concepts 5th Ed.
©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use