裝配圖網(wǎng) > 圖紙專(zhuān)區(qū) > 課件教案 > 《數(shù)據(jù)庫(kù)系統(tǒng)》教學(xué)課件

《數(shù)據(jù)庫(kù)系統(tǒng)》教學(xué)課件

《數(shù)據(jù)庫(kù)系統(tǒng)》教學(xué)課件,數(shù)據(jù)庫(kù)系統(tǒng),數(shù)據(jù)庫(kù),系統(tǒng),教學(xué),課件 Implementation of Relational OperationsJianlin FengSchool of SoftwareSUN YAT-SEN UNIVERSITYIntroductionnNext topic:QUERY PROCESSINGnSome database operations are EXPENSIVEnHuge performance gained by being“smart”qWell see 1,000,000 x over nave approachnMain weapons are:qclever implementation techniques for operatorsqexploiting relational algebra“equivalences”qusing statistics and cost models to chooseSimple SQL RefreshernSELECT FROM WHERE SELECT S.name,E.cid FROM Students S,Enrolled E WHERE S.sid=E.sid AND E.grade=AA Really Bad Query OptimizernFor each Select-From-Where query blockqCreate a plan that:nForms the cross product of the FROM clausenApplies the WHERE clausen(Then,as needed:qApply the GROUP BY clauseqApply the HAVING clauseqApply any projections and output expressionsqApply duplicate elimination and/or ORDER BY)predicatestablesCost-based Query Sub-System Query ParserQuery OptimizerPlan GeneratorPlan Cost EstimatorQuery Plan EvaluatorCatalog ManagerSchemaStatisticsSelect*From Blah BWhere B.blah=blahQueriesThe Query Optimization GamenGoal is to pick a“good”planqGood=low expected cost,under cost modelqDegrees of freedom:naccess methodsnphysical operatorsnoperator ordersnRoadmap for this topic:qFirst:implementing individual operatorsqThen:optimizing multiple operatorsRelational OperationsnWe will consider how to implement:qSelection ()Select a subset of rows.qProjection ()Remove unwanted columns.qJoin ()Combine two relations.qSet-difference ()Tuples in reln.1,but not in reln.2.qUnion ()Tuples in reln.1 and in reln.2.nQ:What about Intersection?Schema for ExamplesnSailors:qEach tuple is 50 bytes long,80 tuples per page,500 pages.qS=500,pS=80.nReserves:qEach tuple is 40 bytes,100 tuples per page,1000 pages.qR=1000,pR=100.Sailors(sid:integer,sname:string,rating:integer,age:real)Reserves(sid:integer,bid:integer,day:dates,rname:string)Simple SelectionsnHow best to perform?Depends on:qwhat indexes are availableqexpected size of resultnSize of result approximated as(size of R)*selectivity qselectivity estimated via statistics we will discuss shortly.SELECT *FROM Reserves RWHERE R.rname C%Our options nIf no appropriate index exists:Must scan the whole relationcost=R.For“reserves”=1000 I/Os.nWith index on selection attribute:1.Use index to find qualifying data entries2.Retrieve corresponding data recordsTotal cost=cost of step 1+cost of step 2qFor“reserves”,if selectivity=10%(100 pages,10000 tuples):nIf clustered index,cost is a little over 100 I/Os;nIf unclustered,could be up to 10000 I/Os!unless Our options Index entriesData entriesdirect search for(Index File)(Data file)Data Recordsdata entriesData entriesData RecordsCLUSTEREDUNCLUSTEREDRefinement for unclustered indexes1.Find qualifying data entries.2.Sort the rids of the data records to be retrieved.3.Fetch rids in order.Each data page is looked at just once(though#of such pages likely to be higher than with clustering).(Index File)(Data file)Data entriesData RecordsUNCLUSTEREDGeneral Selection ConditionsnFirst,convert to conjunctive normal form(CNF):q(day8/9/94 OR bid=5 OR sid=3)AND (rname=Paul OR bid=5 OR sid=3)nWe only discuss the case with no ORsnTerminology:qA B-tree index matches terms that involve only attributes in a prefix of the search key.e.g.:qIndex on matches a=5 AND b=3,but not b=3.*(day8/9/94 AND rname=Paul)OR bid=5 OR sid=32 Approaches to General SelectionsApproach I:1.Find the cheapest access path2.retrieve tuples using it3.Apply any remaining terms that dont match the indexqCheapest access path:An index or file scan that we estimate will require the fewest page I/Os.Cheapest Access Path-Examplequery:day 8/9/94 AND bid=5 AND sid=3some options:B+tree index on day;check bid=5 and sid=3 afterward.hash index on;check day8/9/94 afterward.nHow about a B+tree on?nHow about a B+tree on?nHow about a Hash index on?Approach II:use 2 or more matching indexes.1.From each index,get set of rids2.Compute intersection of rid sets3.Retrieve records for rids in intersection4.Apply any remaining termsEXAMPLE:day8/9/94 AND bid=5 AND sid=3Suppose we have an index on day,and another index on sid.qGet rids of records satisfying day8/9/94.qAlso get rids of records satisfying sid=3.qFind intersection,then retrieve records,then check bid=5.2 Approaches to General Selections(Contd.)ProjectionnIssue is removing duplicates.nUse sorting!1.Scan R,extract only the needed attributes2.Sort the resulting set3.Remove adjacent duplicatesCost:Ramakrishnan/Gehrke writes to temp table at each step!Reserves with size ratio 0.25=250 pages.With 20 buffer pages can sort in 2 passes,so:1000+250+2*2*250+250=2500 I/OsSELECT DISTINCT R.sid,R.bidFROM Reserves RProjection-improvednAvoid the temp files,work on the fly:qModify Pass 0 of sort to eliminate unwanted fields.qModify Passes 1+to eliminate duplicates.Cost:Reserves with size ratio 0.25=250 pages.With 20 buffer pages can sort in 2 passes,so:1.Read 1000 pages2.Write 250(in runs of 40 pages each)=7 runs3.Read and merge runs(20 buffers,so 1 merge pass!)Total cost=1000+250+250=1500.Other Projection TricksIf an index search key contains all wanted attrs:nDo index-only scanqApply projection techniques to data entries(much smaller!)If a B+Tree index search key prefix has all wanted attrs:nDo in-order index-only scanqCompare adjacent tuples on the fly(no sorting required!)JoinsnJoins are very common.nR S is large;so,R S followed by a selection is inefficient.nMany approaches to reduce join cost.nJoin techniques we will cover today:1.Nested-loops join2.Index-nested loops join3.Sort-merge joinSELECT *FROM Reserves R1,Sailors S1WHERE R1.sid=S1.sidR S:foreach tuple r in R doforeach tuple s in S doif ri=sj then add to resultSimple Nested Loops JoinCost=(pR*R)*S+R=100*1000*500+1000 IOsqAt 10ms/IO,Total time:?nWhat if smaller relation(S)was“outer”?nWhat assumptions are being made here?nWhat is cost if one relation can fit entirely in memory?R S:Page-Oriented Nested Loops JoinCost=R*S+R=1000*500+1000nIf smaller relation(S)is outer,cost=500*1000+500nMuch better than nave per-tuple approach!foreach page bR in R do foreach page bS in S do foreach tuple r in bR doforeach tuple s in bSdoif ri=sj then add to resultBlock Nested Loops JoinnPage-oriented NL doesnt exploit extra buffers:(nIdea to use memory efficiently:.R&Sblock of R tuples(B-2 pages)Input buffer for SOutput buffer.Join ResultCost:Scan outer+(#outer blocks*scan inner)#outer blocks=Examples of Block Nested Loops JoinnSay we have B=100+2 memory buffersnJoin cost=outer+(#outer blocks*inner)#outer blocks=outer/100nWith R as outer(R=1000):qScanning R costs 1000 IOs(done in 10 blocks)qPer block of R,we scan S;costs 10*500 I/OsqTotal=1000+10*500.nWith S as outer(S=500):qScanning S costs 500 IOs(done in 5 blocks)qPer block of S,we can R;costs 5*1000 IOsqTotal=500+5*1000.R S:Index Nested Loops Joinforeach tuple r in R doforeach tuple s in S where ri=sj doadd to resultData entriesData RecordsINDEX on Slookup(ri)sjS:R:riR S:Cost=R+(R*pR)*cost to find matching S tuples If index uses Alt.1,cost=cost to traverse tree from root to leaf.For Alt.2 or 3:1.Cost to lookup RID(s);typically 2-4 IOs for B+Tree.2.Cost to retrieve records from RID(s);depends on clustering.qClustered index:1 I/O per page of matching S tuples.qUnclustered:up to 1 I/O per matching S tuple.Index Nested Loops Joinforeach tuple r in R doforeach tuple s in S where ri=sj doadd to resultReminder:Schema for ExamplesnSailors:qEach tuple is 50 bytes long,80 tuples per page,500 pages.qS=500,pS=80.nReserves:qEach tuple is 40 bytes,100 tuples per page,1000 pages.qR=1000,pR=100.Sailors(sid:integer,sname:string,rating:integer,age:real)Reserves(sid:integer,bid:integer,day:dates,rname:string)Sort-Merge Join Example:SELECT *FROM Reserves R1,Sailors S1WHERE R1.sid=S1.sid1.Sort R on join attr(s)2.Sort S on join attr(s)3.Scan sorted-R and sorted-S in tandem,to find matchesCost of Sort-Merge JoinnCost:Sort R+Sort S+(R+S)qBut in worst case,last term could be R*S (very unlikely!)qQ:what is worst case?Suppose B=35 buffer pages:nBoth R and S can be sorted in 2 passesnTotal join cost=4*1000+4*500+(1000+500)=7500Suppose B=300 buffer pages:nAgain,both R and S sorted in 2 passesnTotal join cost=7500Block-Nested-Loop cost=2500 15,000Other Considerations 1.An important refinement:Do the join during the final merging pass of sort!If have enough memory,can do:1.Read R and write out sorted runs2.Read S and write out sorted runs3.Merge R-runs and S-runs,while finding R S matchesCost=3*R+3*SQ:how much memory is“enough”(will answer next time)2.Sort-merge join an especially good choice if:one or both inputs are already sorted on join attribute(s)output is required to be sorted on join attributes(s)Q:how to take these savings into account?(stay tuned)SummarynA virtue of relational DBMSs:queries are composed of a few basic operatorsqThe implementation of these operators can be carefully tunednMany alternative implementation techniques for each operatorqNo universally superior technique for most operators.nMust consider available alternativesqCalled“Query optimization”-we will study this topic soon!

數(shù)據(jù)庫(kù)系統(tǒng)教學(xué)課件.zip

《數(shù)據(jù)庫(kù)系統(tǒng)》教學(xué)課件

lec 9 Query Optimization.ppt---(點(diǎn)擊預(yù)覽)

lec 9 Physical Design.ppt---(點(diǎn)擊預(yù)覽)

lec 8 Query Processing.ppt---(點(diǎn)擊預(yù)覽)

lec 7 Hash-Based Inds.ppt---(點(diǎn)擊預(yù)覽)

lec 6 External Sorting.ppt---(點(diǎn)擊預(yù)覽)

lec 5 TreeInds.ppt---(點(diǎn)擊預(yù)覽)

lec 4 SQL.ppt---(點(diǎn)擊預(yù)覽)

lec 4 calculus.ppt---(點(diǎn)擊預(yù)覽)

lec 4 algebra.ppt---(點(diǎn)擊預(yù)覽)

lec 3 Storing Data.ppt---(點(diǎn)擊預(yù)覽)

lec 3 File and Indexing.ppt---(點(diǎn)擊預(yù)覽)

lec 2 The Relational Model.ppt---(點(diǎn)擊預(yù)覽)

lec 2 The Entity-Relationship Model.ppt---(點(diǎn)擊預(yù)覽)

lec 15 final review.ppt---(點(diǎn)擊預(yù)覽)

lec 14 warehouse.ppt---(點(diǎn)擊預(yù)覽)

lec 12 recovery.ppt---(點(diǎn)擊預(yù)覽)

lec 11 transaction.ppt---(點(diǎn)擊預(yù)覽)

lec 10 norm.ppt---(點(diǎn)擊預(yù)覽)

lec 1 overview.ppt---(點(diǎn)擊預(yù)覽)

壓縮包目錄

預(yù)覽區(qū)

《數(shù)據(jù)庫(kù)系統(tǒng)》教學(xué)課件
- lec 1 overview.ppt--點(diǎn)擊預(yù)覽
- lec 10 norm.ppt--點(diǎn)擊預(yù)覽
- lec 11 transaction.ppt--點(diǎn)擊預(yù)覽
- lec 12 recovery.ppt--點(diǎn)擊預(yù)覽
- lec 14 warehouse.ppt--點(diǎn)擊預(yù)覽
- lec 15 final review.ppt--點(diǎn)擊預(yù)覽
- lec 2 The Entity-Relationship Model.ppt--點(diǎn)擊預(yù)覽
- lec 2 The Relational Model.ppt--點(diǎn)擊預(yù)覽
- lec 3 File and Indexing.ppt--點(diǎn)擊預(yù)覽
- lec 3 Storing Data.ppt--點(diǎn)擊預(yù)覽
- lec 4 algebra.ppt--點(diǎn)擊預(yù)覽
- lec 4 calculus.ppt--點(diǎn)擊預(yù)覽
- lec 4 SQL.ppt--點(diǎn)擊預(yù)覽
- lec 5 TreeInds.ppt--點(diǎn)擊預(yù)覽
- lec 6 External Sorting.ppt--點(diǎn)擊預(yù)覽
- lec 7 Hash-Based Inds.ppt--點(diǎn)擊預(yù)覽
- lec 8 Query Processing.ppt--點(diǎn)擊預(yù)覽
- lec 9 Physical Design.ppt--點(diǎn)擊預(yù)覽
- lec 9 Query Optimization.ppt--點(diǎn)擊預(yù)覽

請(qǐng)點(diǎn)擊導(dǎo)航文件預(yù)覽

編號(hào)：48634128 類(lèi)型：共享資源大小：6.17MB 格式：ZIP 上傳時(shí)間：2022-01-12

30
積分

舉報(bào)

版權(quán)申訴 word格式文檔無(wú)特別注明外均可編輯修改；預(yù)覽文檔經(jīng)過(guò)壓縮，下載后原文更清晰！ 立即下載

關(guān) 鍵詞：: 數(shù)據(jù)庫(kù)系統(tǒng) 數(shù)據(jù)庫(kù) 系統(tǒng) 教學(xué) 課件

資源描述：: 《數(shù)據(jù)庫(kù)系統(tǒng)》教學(xué)課件,數(shù)據(jù)庫(kù)系統(tǒng),數(shù)據(jù)庫(kù),系統(tǒng),教學(xué),課件

展開(kāi)閱讀全文

溫馨提示:
1: 本站所有資源如無(wú)特殊說(shuō)明，都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請(qǐng)下載最新的WinRAR軟件解壓。
2: 本站的文檔不包含任何第三方提供的附件圖紙等，如果需要附件，請(qǐng)聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶(hù)所有。
3.本站RAR壓縮包中若帶圖紙，網(wǎng)頁(yè)內(nèi)容里面會(huì)有圖紙預(yù)覽，若沒(méi)有圖紙預(yù)覽就沒(méi)有圖紙。
4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
5. 裝配圖網(wǎng)僅提供信息存儲(chǔ)空間，僅對(duì)用戶(hù)上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理，對(duì)用戶(hù)上傳分享的文檔內(nèi)容本身不做任何修改或編輯，并不能對(duì)任何下載內(nèi)容負(fù)責(zé)。
6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容，請(qǐng)與我們聯(lián)系，我們立即糾正。
7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶(hù)因使用這些下載資源對(duì)自己和他人造成任何形式的傷害或損失。

裝配圖網(wǎng)所有資源均是用戶(hù)自行上傳分享，僅供網(wǎng)友學(xué)習(xí)交流，未經(jīng)上傳用戶(hù)書(shū)面授權(quán)，請(qǐng)勿作他用。

關(guān)于本文

本文標(biāo)題：《數(shù)據(jù)庫(kù)系統(tǒng)》教學(xué)課件
鏈接地址：http://m.appdesigncorp.com/article/48634128.html

點(diǎn)擊下載此資源

《數(shù)據(jù)庫(kù)系統(tǒng)》教學(xué)課件

最新文檔

相關(guān)資源

相關(guān)搜索