《數(shù)據(jù)庫系統(tǒng)》英文教學(xué)課件
《數(shù)據(jù)庫系統(tǒng)》英文教學(xué)課件,數(shù)據(jù)庫系統(tǒng),數(shù)據(jù)庫,系統(tǒng),英文,教學(xué),課件
External SortingChapter 13.Why Sort?nA classic problem in computer science!nData requested in sorted order qe.g.,find students in increasing gpa ordernFirst step in bulk loading B+tree index.nUseful for eliminating duplicates(Why?)nUseful for summarizing groups of tuplesnSort-merge join algorithm involves sorting.nProblem:sort 100Gb of data with 1Gb of RAM.qwhy not virtual memory?So?nDont we know how to sort?qQuicksortqMergesortqHeapsortqSelection sortqInsertion sortqRadix sortqBubble sortqEtc.nWhy dont these work for databases?Example:merge sortAppleBananaBlueberryGrapefruitKiwiMangoOrangeStrawberryAppleBananaBlueberryGrapefruitKiwiMangoOrangeStrawberryAppleBananaGrapefruitOrangeBlueberryKiwiMangoStrawberryAppleBananaGrapefruitOrangeBlueberryKiwiMangoStrawberryExample:merge sortAppleBananaGrapefruitOrangeStrawberryMangoKiwiBlueberryGrapefruitAppleBananaOrangeStrawberryKiwiBlueberryMangoBlueberryAppleBananaGrapefruitStrawberryMangoKiwiOrangeIsnt that good enough?nConsider a file with N recordsnMerge sort is O(N lg N)comparisonsnWe want to minimize disk I/OsqDont want to go to disk O(N lg N)times!nKey insight:sort based on pages,not recordsqRead whole pages into RAM,not individual recordsqDo some in-memory processingqWrite processed blocks out to diskqRepeatStreaming Data Through RAMnAn important method for sorting&other DB operationsnSimple case:qCompute f(x)for each record,write out the resultqRead a page from INPUT to Input BufferqWrite f(x)for each item to Output BufferqWhen Input Buffer is consumed,read another pageqWhen Output Buffer fills,write it to OUTPUTnReads and Writes are not coordinatedqE.g.,if f()is Compress(),you read many pages per write.qE.g.,if f()is DeCompress(),you write many pages per read.f(x)RAMInputBufferOutputBufferOUTPUTINPUT2-Way Sort:Requires 3 BuffersnPass 0:Read a page,sort it,write it.qonly one buffer page is used(as in previous slide)nPass 1,2,3,etc.:qrequires 3 buffer pagesqmerge pairs of runs into runs twice as longqthree buffer pages used.Main memory buffersINPUT 1INPUT 2OUTPUTDiskDiskMerging RunsTwo-Way External Merge SortnEach pass we read+write each page in file.nN pages in the file=the number of passesnSo total cost is:nIdea:Divide and conquer:sort subfiles and mergeInput file1-page runs2-page runs4-page runs8-page runsPASS 0PASS 1PASS 2PASS 393,46,29,48,75,63,123,45,62,64,97,81,322,34,64,78,91,35,622,34,46,78,91,23,561,22,33,44,56,67,8Merging RunsGeneral External Merge SortnTo sort a file with N pages using B buffer pages:qPass 0:use B buffer pages.Produce sorted runs of B pages each.qPass 1,2,etc.:merge B-1 runs.B Main memory buffersINPUT 1INPUT B-1OUTPUTDiskDiskINPUT 2.*More than 3 buffer pages.How can we utilize them?ExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputExampleInputOutputCost of External Merge SortnNumber of passes:nCost=2N*(#of passes)nE.g.,with 5 buffer pages,to sort 108 page file:qPass 0:=22 sorted runs of 5 pages each(last run is only 3 pages)qPass 1:=6 sorted runs of 20 pages each(last run is only 8 pages)qPass 2:2 sorted runs,80 pages and 28 pagesqPass 3:Sorted file of 108 pagesFormula check:log4 22=3 +1 4 passes Number of Passes of External Sort(I/O cost is 2N times number of passes)Can I do two passes?nPass 0:sort runsnPass 1:merge runsnP pages filenGiven B buffersnNeed:qNo more than B-1 runsqEach run no longer than B pagesnCan do two passes if P B*(B-1)nQuestion:whats the largest file we can sort in three passes?N passes?Internal Sort AlgorithmnQuicksort is a fast way to sort in memory.nAn alternative is“tournament sort”(a.k.a.“heapsort”)qTop:Read in B blocksqOutput:move smallest record to output bufferqRead in a new record rqinsert r into“heap”qif r not smallest,then GOTO Outputqelse remove r from“heap”qoutput“heap”in order;GOTO TopBlocked I/O for External Merge SortnDo I/O a page at a timeqNot one I/O per recordnIn fact,read a block(chunk)of pages sequentially!nSuggests we should make each buffer(input/output)be a block of pages.qBut this will reduce fan-in during merge passes!qIn practice,most files still sorted in 2-3 passes.Theme:Amortize a random I/O across more data read.But pay for it in memory footprint Double BufferingnGoal:reduce wait time for I/O requests during mergenIdea:2 blocks RAM per run,disk reader fills one while sort merges the otherqPotentially,more passes;in practice,most files still sorted in 2-3 passes.OUTPUTOUTPUTDiskDiskINPUT 1INPUT kINPUT 2INPUT 1INPUT 2INPUT kblock sizebB main memory buffers,k-way mergeTheme:overlap I/O and CPU activity via read-ahead(prefetching)Solution:double bufferingnKeep a second set of buffersqProcess one set while waiting for disk I/O to fill the other setInputOutputSolution:double bufferingnKeep a second set of buffersqProcess one set while waiting for disk I/O to fill the other setInputOutputSolution:double bufferingnKeep a second set of buffersqProcess one set while waiting for disk I/O to fill the other setInputOutputSolution:double bufferingnKeep a second set of buffersqProcess one set while waiting for disk I/O to fill the other setInputOutputSolution:double bufferingnKeep a second set of buffersqProcess one set while waiting for disk I/O to fill the other setInputOutputSolution:double bufferingnKeep a second set of buffersqProcess one set while waiting for disk I/O to fill the other setInputOutputUsing B+Trees for SortingnScenario:Table to be sorted has B+tree index on sorting column(s).nIdea:Can retrieve records in order by traversing leaf pages.nIs this a good idea?nCases to consider:qB+tree is clusteredqB+tree is not clusteredGood idea!Could be a very bad idea!Clustered B+Tree Used for SortingnCost:root to the left-most leaf,then retrieve all leaf pages(Alternative 1)nIf Alternative 2 is used?Additional cost of retrieving data records:each page fetched just once.*Always better than external sorting!(Directs search)Data RecordsIndexData Entries(Sequence set)Unclustered B+Tree Used for SortingnAlternative(2)for data entries;each data entry contains rid of a data record.In general,one I/O per data record!(Directs search)Data RecordsIndexData Entries(Sequence set)External Sorting vs.Unclustered Index*p:#of records per page*B=1,000 and block size=32 for sorting*p=100 is the more realistic value.What did that cost us?nTraverse B+tree to left-most leaf pagenRead all leaf pagesqFor each leaf page,read data pagesnData not in B+tree:qHeight+Width+Data pagesnData in B+tree:qHeight+WidthExamplen1,000,000 records,12,500 data pagesnAssume keys are 10 bytes,disk pointers are 8 bytesqSo 300 entries per 8 KB B+tree page(if two-thirds full)nData not in B+treeq12,500 entries needed=42 leaf pagesqTwo level B+treeqTotal cost:1+42+12,500=12,543 I/Osq2 minutes versus 17 minutes for external merge sortnData in B+treeqThree level B+tree,12,500 leaf pagesqTotal cost:2+12,500=12,502 I/OsqAlso about 2 minutesWhat if the B+tree is unclustered?nWe know the proper sort order of the datanBut retrieving the data is hard!What if the B+tree is unclustered?nResult is that in the worst case,may need one disk I/O per recordqEven though we know the sort order!nUsually external merge sort is better in these casesqUnless all you need is the set of keysSummarynExternal sorting is importantnExternal merge sort minimizes disk I/O cost:qPass 0:Produces sorted runs of size B(#buffer pages).Later passes:merge runs.q#of runs merged at a time depends on B,and block size.qLarger block size means less I/O cost per page.qLarger block size means smaller#runs merged.qIn practice,#of runs rarely more than 2 or 3.Summary(cont.)nChoice of internal sort algorithm may matter:qQuicksort:Quick!qHeap/tournament sort:slower(2x),longer runsnThe best sorts are wildly fast:qDespite 40+years of research,still improving!nClustered B+tree is good for sorting;unclustered tree is usually very bad.
收藏