کامپیوتر و IT و اینترنتعلوم مهندسی

مبانی ساختمان داده ها و الگوریتم ها 2

صفحه 1:
1 0:11:02 5ع عاناناناعبانا/1 5 2:10 120 5© 1" Focus of the course: ° Datastructuresand relatedalgorithms 5 Correctnessand(timeand space) complexity | Prerequisites 5 CS 2o0:stacks, queues,lists, binary searchtrees, ... 5 CS 4o:functions, recurrence equations,induction, ... 9 CS 60:C,C++,andUNIX

صفحه 2:
Course Organization BGrading: 5 See course web page = Policy: 5 Nolatehomeworks. 5 Cheatingandplagiaris: F gradeand disciplinary actions 9 ‏1[آأذ[ذ‎ Homepage: www.cs.ucsbh.edu/~cs730a = Email:cs730a@cs.ucsh.edu "Teaching assistants: See course web page

صفحه 3:
Introduction = Afamous quote: Program =Algorithm+ Data Structure. = ALC of youhave programmed! thus have already been exposedto algorithmsanddata structure. = Perhaps youdidn't see themas separateentities: = Perhaps yousaw data structures as simple programming constructs (provided by STL--standardtemplatelibrary). = However, data structures are quite distinct fromalgorithms, andveryimportantintheir own right.

صفحه 4:
Objectives = Themain focus of this courseis tointroduce youtoa systematic study of a(gorithmsanddata structure. = Thetwoguiding principles of the courseare:abstractionand formal analysis. @ Abstraction: Wefocus on topics that are broadly applicableto avariety of problems. = Analysis; Wewant aformal way tocompare two objects (data structures or algorithms). = Inparticular,wewill worry about ‘always correct'-ness,and worst-case bounds ontimeandmemory (space).

صفحه 5:
Textbook | Textbook for thecourseis: = DataStructuresandAlgorithmAnalysisin C++ = byMarkAllen Weiss =ButIwillusematerial fromother booksand research papers,sotheultimate source shouldbe my lectures.

صفحه 6:
Course Outline ™ C++Review (Ch. 7) © A(gorithmAnalysis (Ch. 2) Setswithinsert /delete/member: Hashing (Ch. 5) Setsingeneral: Balancedsearch trees (Ch.4and72.2) ® Setswithpriority: Heaps, priority queues (Ch. 6) ® Graphs:Shortest-pathalgorithms (Ch. 9.7.9.3.2) ® Setswithdisjoint union: Union/findtrees (Ch. 8.7.8.5) Graphs: Minimum spanning trees (Ch. 9.5) Sorting (Ch. 7)

صفحه 7:
120: AlgorithmAnalysis = Foundations of ACgorithmAnalysisandData Structures. = Analysis: © dow topredictanalgorithm’'s performance 2 How wellanalgorithmscalesup 5 4fow tocompare different algorithms for a problem = DataStructures 5 Sow to efficiently store, access,manage data Oo Data structures effect algorithm's performance

صفحه 8:
Example Algorithms = Twoalgorithms for computing the Factorial = Whichoneis better? ® int factorial (int n) ( if(n<=7) return?! efse return n *factorial(n-7)! > ™ intfactorial dnt n) ( if(n<es) returns! efse( fact=1{ for (R=2ikeon{f++) fact *- ‏ع‎ return fact! 0 )

صفحه 9:
Examples of famous algorithms ™ Constructions of Euclid = Newton's root finding = Fast Fourier Transform = Compression (Huffman, Lempel-Ziv, GIF, MPEG) = DES,RSAencryption ® Simplex algorithmfor linear programming = Shortest Path Algorithms (Dijkstra, Bel(man-Ford) = Error correcting codes (CDs, DVDs) = TCPcongestioncontrol,IP routing = Pattern matching (Genomics) ™ Search Engines

صفحه 10:
Role of ACgorithmsin Modern World = Enormousamount of data E-commerce (Amazon, Ebay) Network traffic (telecom billing, monitoring) Database transactions (Sales,inventory) Scientificmeasurements (astrophysics, geology) Sensor networks. RFID tags 8 قا ۵ 8 0 ظ Bioinformatics (genome, protein bank) = Amazonhiredfirst Chief Algorithms Officer (‘Udi Manber)

صفحه 11:
Areal-worldProblem = CommunicationintheInternet = Message (email, ftp) broken downintoIP packets. ™ Sender/receiver identified byIP address. ™ ThepacketsareroutedthroughtheInternet by special computers calledRouters. = Eachpacket is stampedwithits destinationaddress, but not the route. ™ Because the Internet topologyandnetworkloadis constantly changing, routersmust discover routes dynamically. = What should the Routing Table look like?

صفحه 12:
IP PrefixesandRouting ™ Eachrouter is really aswitch:it receives packetsat several input ports,andappropriately sends themout to output ports. = Thus, for eachpacket, the router needs to transfer the packet to that output port that getsit closer toits destination. ™ Shouldeachrouter keepa table:IP address x Output Port? ™ How bigis this table? = Whenalinkor router fails, how muchinformation wouldneed tobemodified? = A router typically forwards several million packets/sec!

صفحه 13:
Data Structures ™ TheIP packet forwardingisa Data Structureproblem! = Efficiency, scalability isveryimportant. ™ Similarly, how does Googlefind the documents matching your query sofast? ۳ Uses sophisticatedalgorithms tocreateindexstructures,which arejust data structures. = Algorithmsanddata structuresareubiquitous. = Withthe data glut created by the new technologies, theneedto organize,search,andupdate MASSIVE amounts of information FASTis more severe than ever before.

صفحه 14:
Algorithms toProcess theseData Whichare the top KselCers? Correlation between time spent at awebsiteandpurchase amount? Whichflowsat a router account for >7%traffic? DidsourceS senda packet inlast s seconds? Sendanalarmifanyinternational arrival matchesa profileinthe database Similarity matches against genome databases Etc.

صفحه 15:
Max Subsequence Problem ‎Givena sequence of integers. A7,A2,.... An findthemaximum possible value‏ "ا ‎ofa subsequence Ai,...,. Aj.‏ ‎= Numberscan be negative. ‎© Youwanta contiguous chunkwithlargest sum. ‎= Example: -2,77,-4,13,-5,-2 = Theansweris 20 (subseq. A2throughA4). ‎= Wewill discuss 4 different algorithms,withtime complexities O(n*),0(n*), O(n log n),andO(n). ‎=| Withn = 70%, algorithm7 may take>zoyears: algorithm4will takea fraction ofa second! ‎

صفحه 16:
Algorithm 7for Max Subsequence Sum "Given A,,.,A,,findthemaximumvalue of A;+A,,,+' +A, oifthemaxvalueis negative int maxSum = 0; | ou for( int i = 0; i < a.size( ); i+ +) for( int j = i; j < a.size( ‏نالل‎ nt nim OD )- 3( ۱02 2 )[- 3( j- Ja 120 ‏دز‎ ‏زه‎ jou Jou a k<=j; ‏وف‎ ‎thisSum += a[ k ]; if( thisSum > maxSum ) maxSum = thisSum; =Time ‏مکحم مدمه(‎ OU) int thisSu for( int k +) 6

صفحه 17:
Algorithm 2 BIdea:Givensumfromitoj-7,wecan compute the sum fromitojinconstant time. = This eliminates one nestedloop,andreduces the running timetoO(n’). into maxSum = 0; for( int i = 0; i < a.size( ); i+ +) int thisSum = 0; for( int j = i; j < a.size( ); i++) thisSum += a[ j J; if( thisSum > maxSum ) maxSum = thisSum; ae

صفحه 18:
Algorithm 3 | This algorithmuses divide-and-conquer paradigm. = Suppose we split theinput sequenceat midpoint. = Themax subsequence isentirely in the left half, entirelyinthe right half, orit straddles themidpoint. "Example: Ceft half | right half ‏اقا | اهب‎ -2 @ Maxinleftis 6 (A7throughA3):maxinright is 8 (Aé through A7). But straddling maxis 77(A7thruA7).

صفحه 19:
Algorithm 3(cont.) = Example: Ceft half | right half 4-35 -2| -126-2 = Max subsequencesin eachhalf found by recursion, = How dowefindthe straddling max subsequence? = Key Observation: O Left half of the straddling sequenceis themax subsequence endingwith-2. 5 Right halfisthemax subsequence beginning with-7. = aCinear scanletsus compute thesein O(n) time.

صفحه 20:
Algorithm 3: Analysis = The divideandconquer is best analyzedthrough recurrence: 1010-1 Tn) =2T(n/2) + O(n) = This recurrence solves toT(n) =O(nlogn).

صفحه 21:
Algorithms 2, 3, -2, 1, -5, 4, 1, -3, 4, -1, 2 int maxSum = 0, thisSum = 0; for( int j = 0; j < a.size( ); j++) thisSum += al j ]; if (thisSum > maxSum ) maxSum = thisSum; else if ( thisSum < 0 ) thisSum = 0; mg return maxSum; B but wny aves it Worr: 1.2. proof of correctness. ea

صفحه 22:
Proof of Correctness = Max subsequencecannot start or endatanegativeAi. = More generally, themax subsequence cannot haveaprefix with anegativesum. Ex: -2 11 -4 13 -5 -2 = Thus, if weeverfindthat Ai through Aj sums to<o,thenwecan advanceitoj+7 © Proof. Supposejis thefirst index afteri when the sumbecomes <o © Themax subsequencecannot start at anypbetweeniandj. Because A; throughA, ,ispositive,sostarting at iwowdhave beeneven better.

صفحه 23:
و ‎int maxSum =0, thisSum=o!‏ ‎for(intj=o1j<a.size()tj++)‏ thisSum+=aljli if (thisSum>maxSum) maxSum = thisSum! elseif (thisSum<o) thisSum= oi! 20 returnmaxSum * Thealgorithm resets whenever prefixis<o. Otherwise,it forms new sumsandupdates maxSumin onepass.

صفحه 24:
Why Efficient Algorithms Matter ™ Suppose N = 70° = APCcan read/process Nrecordsin 7 sec. = But ifsomealgorithmdoes N*Ncomputation, thenit takes 7M seconds = 77 days!!! ‎700City Traveling Salesman Problem.‏ "ا ‎5 A supercomputer checking 700 billion tours/sec still requires 10 years! ‎= Fast factoring algorithms can breakencryption schemes. Algorithms research determines what is safe code length. (> 700 digits) ‎

صفحه 25:
How to Measure Algorithm Performance = What metric should beused to judge algorithms? 5 Length of the program (Lines of code) 5 Ease of programming (bugs,maintenance) Memory required DRunning time =Runningtimeis the dominant standard. OD Quantifiableandeasy tocompare OD Often thecritical bottleneck

صفحه 26:
Abstraction = Analgorithmmay run differently depending on: 5 thehardwareplatform(PC,Cray,Sun) Oo theprogramming language (C, Java,C++) 5 theprogrammer (you, me, Bill Joy) = While different in detail,all hardwareandprog models are equivalent in some sense: Turingmachines. ™ It suffices tocount basicoperations. ™ Crude but valuable measure of algorithm's performance asa function ofinput size.

صفحه 27:
Average, Best,and Worst-Case = Onwhichinput instances should the algorithm's performance bejudged? ا ۲ Real world distributions difficult topredict ™ Best case: 5 Seemsunrealistic = Worst case: 5 Givesanabsolute guarantee 5 WewilCuse the worst-casemeasure.

صفحه 28:
Examples B Vector additionZ= A+B for (inti=oli<nitit++) 2111 - ۵11 +4 Tin)=cn =Vector (inner) multiplication z=A*B z=0! for (inti=oli<niit++) 2 <2+ ۵11*2111 ‏؛‎ ‎7)90( - ‏ار + أ‎

صفحه 29:
Examples = Vector (outer) multiplicationZ=A*B for (int i=oti<nii++) for (int j=o!j<nij++) Zlij] = Alt] * BU] ‏؛‎ ‎Tn) =c,n? = Aprogram does all the above Tn)=c,+¢,n+¢,n7f

صفحه 30:
Simplifying the Bound BT(n)=c,n' +c, n°’ +e, Ne? +46, NHC, Otoocomplicated 5 toomany terms O Difficult to compare two expressions, eachwith 70 or 20 terms = Dowereally needthat many terms?

صفحه 31:
Simplifications ™ Keepjust oneterm! 5 thefastest growing term (dominates theruntime) = Noconstant coefficients are kept 5 Constant coefficients affected by machines, languages, etc. = Asymtoticbehavior (as ngetslarge) is determinedentirely by theCeading term. 5 Example. Tin) =70n* +n7+40n + 800 © Ifn=7,000, then Tin) = 10,007,040,800 © erroriso.o7%ifwe drop all but the n? term 5 Inanassembly line the sCowest worker determines the throughput rate oa

صفحه 32:
080 00و Simplification = Drop theconstant coefficient 5 Does not effect the relCativeorder ca? r= 1000 n, g= 4 02, b=n?, k= 0.01 2° co 1 20 300 40 600 eco 7۵5 00

صفحه 33:
Simplification = The fastergrowing term (suchas 2") eventuallywill outgrow the sCower growing terms (e.g.,7000n) no matter what their coefficients! = Put another way,givenacertainincreasein alCocatedtime,a higher order algorithmwill not reap the benefit by solving muchlarger problem

صفحه 34:
Complexity and Tractability 10 [Olus| .0305| tus) Ins| 10۳3 105 1۳5 10° | 1ous 10° |100us Assume the computer does 1 billion ops per sec.

صفحه 35:
0 1 0 1 1 2 1 2 2 4 8 4 2 4 8 16 04 16 3 8 24 64 512 256 4 16 64 256 4056 65,536, 5 160 1,024 32,766 6 ‏سم‎ 2۰ 3 5 ani 3 en, in 5 eins can anlog n aun a ow ۲ ۰ ea 3 ۰ ‏سر‎ ‎vo AL” gh aa A com P19 Rw) (ee eet logan 0 ‏وین و 2 هيه يه‎ ۳ elt a Aa ۳ log n

صفحه 36:
Another View ™ More resources (timeand/or processing power) translateinto Cargeproblems solvedif complexity is Cow T(n) Problem size | Problem Increase in solved in 103 size solved | Problem sec in 104sec_ |size 100n 10 100 10 1000n 1 10 10 5n? 14 45 3:2 NB 10 22 2.2 2n 10 13 ‏ون‎

صفحه 37:
Asympotics Tm) keep one drop coef 3n°+4n+1 3r r 101 r°-+102 1012 r 15 ‏د15 6م‎ r an’-ntc an rv = They all havethesame “growth rate

صفحه 38:
Caveats = Follow the spirit, not theletter 5a 200onalgorithmismoreexpensive than n? algorithmwhen n < 700 ™ Other considerations: ‏لآ‎ times Daprogramrunonsmall data sets D ease of coding, porting,maintenance Omemory requirements

صفحه 39:
Asymptotic Notations ™ Big-0, ‘bounded above by”: Tin) = O(f(n)) 5 For somecand N,T(n) <c'fin) whenever n> N. ™ Big-Omega, ‘bounded below by”: Tin) = Q(fi(n)) 5 For somec>oand N, Tin) =c'fin) whenever n> N. 5 Sameasfin) =O(T(n)). = Big-Theta, ‘boundedaboveand below": Tin) = O(f(n)) 5 Tn) =Of(n)) andalso Tin) = Q(f(n)) @ Little-o, ‘strictly bounded above”: Tin) =o(fin)) 5 Tn)/fin)0 as n> 00

صفحه 40:
By Pictures = Big-Oh(most commonly used) / ‏تا‎ | پ ۱ص ‎Big-Omega‏ = ‎N‏ 07 اء 6 ان 3 = Big-Theta Dexactly Bsmall-o 5 not asexpensiveas... N 0

صفحه 41:
Example Tn) =r + 27 O(?) Q(?) ‏م۰‎ 0 ‏م زر‎ rm ‏بر‎ ‎rm @r

صفحه 42:
Asymptomic rug Si n jk Lal pa Sei Examples ee

صفحه 43:
Summary (Why O(n)?) BT(n)=c,n' +c, ,n*'+C, ,N? +..46,N4+C, = Toocomplicated Zo) Casingletermwithconstant coefficient dropped = Muchsimpler,extratermsand coefficients donot matterasymptotically "Other criteriahardto quantify

صفحه 44:
Runtime Analysis = useful rules 5 simple statements (read, write, assign) ©» O17) (constant) 5 simple operations (+- * 2 om 5 sequence of simple statements/operations © rule ofsums 5 for, do, while Coops © rules of products

صفحه 45:
Runtime Analysis (cont.) = Twoimportant rules O Rule of sums © if youdoa number of operations in sequence, the runtime is dominated by the most expensive operation Oo Rule of products ® if yourepeat anoperationa number of times, the total runtimeis the runtime of the operation multiplied by the iterationcount

صفحه 46:
Runtime Analysis (cont.) if(cond)then O(7) body, Tin) else body, Tn) endif Tn) = O(max (T,(n), T,(n))

صفحه 47:
Runtime Analysis (cont.) = Methodcalls ‏نا‎ 5 ۲ 9 © Detc. @ A sequence of operations when call sequences are flattened Tin) =max(T,,(n),T,(n), Tn)

صفحه 48:
Example for (t=7fi<n $i++) if A(i) >maxVal then max Val=A(i)! maxPos=i! Asymptotic Complexity:O(n)

صفحه 49:
Example for (t=7!i<n-7!i++) for j=n{j>=i+7! j--) if (AG-7) > AG)) then temp =AG-7)! AG- - (4 20( 2 ‏؛ مزالت‎ endif endfor endfor = Asymptotic Complexity is O(n?)

صفحه 50:
5 ۷ 11۵07 بقل BT(n)isdefinedrecursivelyin terms of T(k),k<n = The recurrence relations allow Tin) to be‘unwound” recursively into some base cases (e.g.,T(0) or T(7)). = Examples: 5 Factorial 5 Hanoi towers

صفحه 51:
Example: Factorial int factorial (int n)¢ te =Tn- +d =r 2)+d+d ‏حيرا‎ 3( + 0+ 0+ if (m<=7) return?! e(se returnn *factorial(n-7)! > (+ ‏-ه)‎ (۵ factorial (n)=n*n-7*n-2*.., #7 =e+(n- 0 n *factorial(n-7) T(n) 3 ‏ی‎ T(n-1) n-2 *factorialin-3) T(n-2 wo ۲ ( ) “oo 2 *factorial(7) T(1) 60

صفحه 52:
Example: Factorial (cont.) int factorials(int n) ¢ if (n<=7) return?! elset fact=7! for (k=2!k<=n k++) ١ a fact *=k! an returnfact! | 00 > 2 = BothalgorithmsareO(n).

صفحه 53:
Example: Hanot Towers BHanoi(n,ABC) = @ H{anoi(n-7,A,C,B)+Hanoi(7,A,B,C)+Hanoi(n-7,C,B,A) 1 =2T(n- 1)+c =?’ TUn- 2)+2c+c =2T(n- 3)+2?c+2c+e 2+1(6 ج.. +2 2) + 117 - 2+1(0 + +22 +اع)- ‎=O(2")‏

صفحه 54:
Worst Case, Best Case, and Average Case template<class T> void SelectionSort(T a[], int n) { //Early-terminating version of selection sort bool sorted = false; for (int size=n; !sorted && (size>1); size--) { int pos = 0; sorted = true; // find largest for (int i = 1; i < size; i++) if (a[pos] <= a[i]) pos = i; else sorted = false; // out of order Swap(a[pos], a[size - 1]); } = Worst Case = Best Case

صفحه 55:
f(N) (N) (N) T(N)=O(£(N)) ۱ mo=wande=7,fiNI=N TUN)=6N+4 <= ‏محال‎ - 79010: 30<-4 ‏مبهوج‎ - 030 0360 ممجكدوه NONI? 31-7 و2000 31-7 و3010 2-7 200-002 ۱ 31-37 و3010 2-00 1000000-01( 007 مدو

صفحه 56:
An Analogy: Cooking Recipes = Algorithmsare detailedandpreciseinstructions. = txample:bakeachocolatemoussecake. 5 Convert rawingredientsintoprocessedoutput. 3 Hardware (PC, supercomputer vs. oven, stove) 5 Pots,pans,pantryaredata structures. = Interplay ofhardwareandalgorithms © Different recipesfor oven, stove,microwaveetc, = Newadvances, 5 New models: clusters,Internet,workstations 5 Microwave cooking, 5-minute recipes, refrigeration 6ه

جهت مطالعه ادامه متن، فایل را دریافت نمایید.
32,000 تومان