《數(shù)據(jù)庫(kù)系統(tǒng)》教學(xué)課件
《數(shù)據(jù)庫(kù)系統(tǒng)》教學(xué)課件,數(shù)據(jù)庫(kù)系統(tǒng),數(shù)據(jù)庫(kù),系統(tǒng),教學(xué),課件
Transactions Introductionand Concurrency ControlJianlin FengSchool of SoftwareSUN YAT-SEN UNIVERSITYcourtesy of Joe Hellerstein for some slidesConcurrency Control and RecoverynConcurrency ControlqProvide correct and highly available data access in the presence of concurrent access by many users.nRecoveryqEnsures database is fault tolerant,and not corrupted by software,system or media failureq24x7 access to mission critical datanA boon to application authors!qExistence of CC&R allows applications to be written without explicit concern for concurrency and fault tolerance.Query Optimizationand ExecutionRelational OperatorsFiles and Access MethodsBuffer ManagementDisk Space ManagementDBThese layers must consider concurrencycontrol and recovery(Transaction,Lock,Recovery Managers)Structure of a DBMSTransactions and Concurrent Execution nTransaction(“xact”):DBMSs abstract view of a user program(or activity)qA sequence of reads and writes of database objects.qBatch of work that must commit or abort as an atomic unitnTransaction Manager controls execution of xacts.nUsers program logic is invisible to DBMS!qArbitrary computation possible on data fetched from the DBqThe DBMS only sees data read/written from/to the DB.nChallenge:provide atomic xacts to concurrent users!qGiven only the read/write interface.A Sample Transaction1:Begin_Transaction2:get(K1,K2,CHF)from terminal3:Select BALANCE Into S1 From ACCOUNT Where ACCOUNTNR=K1;4:S1:=S1-CHF;5:Update ACCOUNT Set BALANCE=S1 Where ACCOUNTNR=K1;6:Select BALANCE Into S2 From ACCOUNT Where ACCOUNTNR=K2;7:S2:=S2+CHF;8:Update ACCOUNT Set BALANCE=S2 Where ACCOUNTNR=K2;9:Insert Into BOOKING(ACCOUNTNR,DATE,AMOUNT,TEXT)Values(K1,today,-CHF,Transfer);10:Insert Into BOOKING(ACCOUNTNR,DATE,AMOUNT,TEXT)Values(K2,today,CHF,Transfer);12:If S10 Then Abort_Transaction11:End_TransactionConcurrency:Why bother?nThe latency argumentqResponse time:the average time taken to complete an xact.nA short xact could get stuck behind a long xact,leading to unpredictable delays in response time.nThe throughput argumentqThroughput:the average number of xacts completed in a given time.nOverlapping I/O and CPU activity reduces the amount of time disks and CPU are idle.ACID properties of Transaction Executionsn nA A tomicity:qAll actions in the Xact happen,or none happen.n nC C onsistency:qIf the DB(Database)starts consistent,it ends up consistent at end of Xact.n nI I solation:qExecution of one Xact is isolated from that of other Xacts.n nD D urability:qIf an Xact commits,its effects persist.Implications of Atomicity and DurabilitynA transaction ends in one of two ways:qcommit after completing all its actionsn“commit”is a contract with the caller of the DBqabort(or be aborted by the DBMS)after executing some actions.nOr system crash while the xact is in progress;treat as abort.nAtomicity means the effect of aborted xacts must be removednDurability means the effects of a committed xact must survive failures.nDBMS ensures the above by logging all actions:qUndo the actions of aborted/failed xacts.qRedo actions of committed xacts not yet propagated to disk when system crashes.Transaction ConsistencynXacts preserve DB consistencyqGiven a consistent DB state,produce another consistent DB statenDB consistency expressed as a set of declarative Integrity Constraints qCREATE TABLE/ASSERTION statementsnXacts that violate ICs are abortedqThats all the DBMS can automatically check!Isolation(Concurrency)nDBMS interleaves actions of many xactsqActions=reads/writes of DB objectsnUsers should be able to understand an xact without considering the effect of other concurrently executing xacts.nEach xact executes as if it were running by itself.qConcurrent accesses have no effect on an xacts behaviorqNet effect must be identical to executing all xacts for some serial order.Schedule of Executing TransactionsnA schedule isqa list of actions(READ,WRITE,ABORT,or COMMIT)from a set of xacts,qand the order in which two actions of an xact T appears in a schedule must be the same as the order in which they appear in T.nA complete schedule isqA schedule that contains either an abort or a commit for each xact.A Schedule Involving Two Transactions:An Incomplete ScheduleSerial SchedulenSerial scheduleqEach xact runs from start to finish,without any intervening actions from other xacts.nan example:T1;T2.Serializable SchedulenTwo schedules are equivalent if:qThey involve the same actions of the same xacts,qand they leave the DB in the same final state.nA serializable schedule over a set S of xacts is qa schedule whose effect on any consistent database instance is guaranteed to be identical to that of some complete serial schedule over the set of committed xacts in S.A Serializable Schedule of Two TransactionsThe result of this schedule is equivalent to the result of the serial schedule:T1;T2.Important Points of SerializabilitynExecuting xacts serially in different orders may produce different results,qbut all are presumed to be acceptable;qthe DBMS makes no guarantees about which of them will be the outcome of an interleaved execution.nUncommitted xacts can appear in a serializable schedule S,but their effects are cancelled out by UNDO.Conflicting ActionsnNeed an easier check for equivalence of schedulesqUse notion of“conflicting”actionsnAnomalies with interleaved execution are simply caused by conflicting actions.nTwo actions are said conflict if:qThey are by different xacts,qthey are on the same object,qand at least one of them is a write.qThree kinds of conflicts:Write-Read(WR)conflict,Read-Write(RW)and Write-Write(WW)conflicts.Conflicts:Anomalies with Interleaved ExecutionnReading Uncommitted Data(WR Conflicts,“dirty reads”):nUnrepeatable Reads(RW Conflicts):T1:R(A),W(A),R(B),W(B),AbortT2:R(A),W(A),CommitT1:R(A),R(A),W(A),CommitT2:R(A),W(A),CommitConflicts(Continued)nOverwriting Uncommitted Data(WW Conflicts):T1:W(A),W(B),CommitT2:W(A),W(B),CommitSchedules Involving Aborted XactsnSerializability relies on UNDOing aborted xacts completely,which may be impossible in some situations.An Unrecoverable Schedule:T2 has already committed,and so can not be undone.Recoverable SchedulenIn a recoverable schedule,xacts commit only after(and if)all xacts whose changes they read commit.qIf xacts read only the changes of committed xacts,not only is the schedule recoverable,qbut also can avoid cascading aborts.Conflict Serializable SchedulesnTwo schedules are conflict equivalent if:qThey involve the same set of actions of the same xacts,qand they order every pair of conflicting actions of two committed xacts in the same way.nSchedule S is conflict serializable if:qS is conflict equivalent to some serial schedule.nNote,some serializable schedules are NOT conflict serializable.qA price we pay to achieve efficient enforcement.Conflict Serializability IntuitionnA schedule S is conflict serializable if:qYou can transform S into a serial schedule by swapping consecutive non-conflicting operations of different xacts.nExample:R(A)R(B)W(A)W(B)R(A)W(A)R(B)W(B)R(A)R(B)W(A)W(B)R(A)W(A)R(B)W(B)Serializable Schedule That is Not Conflict Serializable This schedule is equivalent to the serial schedule:T1;T2;T3.However it is not conflict quivalent to the serial schedule because the writes of T1 and T2 are ordered differently.Dependency GraphnWe use a dependency graph,also called a precedence graph,to capture all potential conflicts between the xacts in a schedule.nThe dependency graph for a schedule S contains:qA node for each committed xactqAn edge from Ti to Tj if an action of Ti precedes and conflicts with one of Tjs actions.nTheorem:Schedule is conflict serializable if and only if its dependency graph is acyclic.Example of Dependency GraphTwo-Phase Locking(2PL)nThe most common scheme for enforcing conflict serializability.n“Pessimistic”qSets locks for fear of conflictqThe alternative scheme is called Optimistic Concurrency Control.Two-Phase Locking(2PL)rules:qAn xact must obtain a S(shared)lock before reading,and an X(exclusive)lock before writing.qAn xact cannot request additional locks once it releases any lock.SXSXLockCompatibilityMatrixTwo-Phase Locking(2PL),cont.2PL guarantees conflict serializabilitytime#locks heldrelease phaseacquisition phaseBut,does not prevent Cascading Aborts.Strict 2PLnProblem:Cascading AbortsnExample:rollback of T1 requires rollback of T2!nStrict Two-phase Locking(Strict 2PL)protocol:Same as 2PL,except:Locks released only when an xact completes i.e.,either:(a)the xact has committed(commit record on disk),or (b)the xact has aborted and rollback is complete.T1:R(A),W(A),R(B),W(B),AbortT2:R(A),W(A)Strict 2PL(continued)#locks heldacquisition phasetimerelease all locks at end of xactNext.nA few examplesLock_X(A)Read(A)Lock_S(A)A:=A-50Write(A)Unlock(A)Read(A)Unlock(A)Lock_S(B)Lock_X(B)Read(B)Unlock(B)PRINT(A+B)Read(B)B:=B+50Write(B)Unlock(B)Non-2PL,A=1000,B=2000,Output=?Lock_X(A)Read(A)Lock_S(A)A:=A-50Write(A)Lock_X(B)Unlock(A)Read(A)Lock_S(B)Read(B)B:=B+50Write(B)Unlock(B)Unlock(A)Read(B)Unlock(B)PRINT(A+B)2PL,A=1000,B=2000,Output=?Lock_X(A)Read(A)Lock_S(A)A:=A-50Write(A)Lock_X(B)Read(B)B:=B+50Write(B)Unlock(A)Unlock(B)Read(A)Lock_S(B)Read(B)PRINT(A+B)Unlock(A)Unlock(B)Strict 2PL,A=1000,B=2000,Output=?Venn Diagram for SchedulesAll SchedulesAvoid Cascading AbortSerialView SerializableConflict Serializable Which schedules does Strict 2PL allow?All SchedulesAvoid Cascading AbortSerialView SerializableConflict SerializableLock ManagementnLock and unlock requests handled by Lock Manager.nLM keeps an entry for each currently held lock.nEntry contains:qList of xacts currently holding lockqType of lock held(shared or exclusive)qQueue of lock requestsLock Management(Contd.)nWhen lock request arrives:qDoes any other transaction hold a conflicting lock?nIf no,grant the lock.nIf yes,put requestor into wait queue.nLock upgrade:qA transaction with shared lock can request to upgrade to exclusive.Lock_X(A)Lock_S(B)Read(B)Lock_S(A)Read(A)A:=A-50Write(A)Lock_X(B)ExampleDeadlocksnDeadlock:Cycle of transactions waiting for locks to be released by each other.nWays of dealing with deadlocks:qpreventionqdetectionqavoidancenMany systems just punt and use TimeoutsqWhat are the dangers with this approach?Deadlock PreventionnCommon technique in operating systemsnStandard approach:resource orderingqScreen Network Card PrinternWhy is this problematic for Xacts in a DBMS?Deadlock DetectionnCreate and maintain a“waits-for”graphnPeriodically check for cycles in graphDeadlock Detection(Continued)Example:T1:S(A),S(D),S(B)T2:X(B),X(C)T3:S(D),S(C),X(A)T4:X(B)T1T2T4T3Deadlock AvoidancenAssign priorities based on timestamps.nSay Ti wants a lock that Tj holds Two policies are possible:Wait-Die:If Ti has higher priority,Ti waits for Tj;otherwise Ti aborts.Wound-wait:If Ti has higher priority,Tj aborts;otherwise Ti waits.nWhy do these schemes guarantee no deadlocks?nImportant detail:If a transaction re-starts,make sure it gets its original timestamp.-Why?Locking GranularitynHard to decide what granularity to lock(tuples vs.pages vs.tables).nwhy?Multiple-Granularity LocksnShouldnt have to make same decision for all transactions!nData“containers”are nested:TuplesTablesPagesDatabasecontainsSolution:New Lock Modes,ProtocolnAllow Xacts to lock at each level,but with a special protocol using new“intent”locks:nStill need S and X locks,but before locking an item,Xact must have proper intent locks on all its ancestors in the granularity hierarchy.vIS Intent to get S lock(s)at finer granularity.vIX Intent to get X lock(s)at finer granularity.vSIX mode:Like S&IX at the same time.Why useful?TuplesTablesPagesDatabaseMultiple Granularity Lock ProtocolnEach Xact starts from the root of the hierarchy.nTo get S or IS lock on a node,must hold IS or IX on parent node.qWhat if Xact holds S on parent?SIX on parent?nTo get X or IX or SIX on a node,must hold IX or SIX on parent node.nMust release locks in bottom-up order.Protocol is correct in that it is equivalent to directly settinglocks at the leaf levels of the hierarchy.TuplesTablesPagesDatabaseLock Compatibility MatrixvIS Intent to get S lock(s)at finer granularity.vIX Intent to get X lock(s)at finer granularity.vSIX mode:Like S&IX at the same time.ISIXSIXISIXSIXSXSX-TuplesTablesPagesDatabaseExamples 2 level hierarchynT1 scans R,and updates a few tuples:qT1 gets an SIX lock on R,then get X lock on tuples that are updated.nT2 uses an index to read only part of R:qT2 gets an IS lock on R,and repeatedly gets an S lock on tuples of R.nT3 reads all of R:qT3 gets an S lock on R.qOR,T3 could behave like T2;can use lock escalation to decide which.qLock escalation dynamically asks for coarser-grained locks when too manylow level locks acquiredIS IXSIXISIXSIX SXSXTuplesTablesJust so youre aware:Optimistic CCnBasic idea:let all transactions run to completionqMake tentative updates on private copies of dataqAt commit time,check schedules for serializabilityqIf you cant guarantee it,restart transactionelse“install”updates in DBMSnPros&ConsqNo waiting or lock overhead in serializable casesqRestarted transactions waste work,slow down othersnOCC a loser to 2PL in traditional DBMSsqPlays a secondary role in some DBMSsnGeneralizations:qMulti-version and Timestamp CC manage the multiple copies in a permanent wayJust So Youre Aware:Indexesn2PL on B+-tree pages is a rotten idea.qWhy?nInstead,do short locks(latches)in a clever wayqIdea:Upper levels of B+-tree just need to direct traffic correctly.Dont need to be serializably handled!qDifferent tricks to exploit thisnNote:this is pretty complicated!Just So Youre Aware:PhantomsnSuppose you query for sailors with rating between 10 and 20,using a B+-treeqTuple-level locks in the Heap FilenI insert a Sailor with rating 12nYou do your query againqYikes!A phantom!qProblem:Serializability assumed a static DB!nWhat we want:lock the logical range 10-20qImagine that lock table!nWhat is done:set locks in indexes cleverlySummarynCorrectness criterion for isolation is“serializability”.qIn practice,we use“conflict serializability,”which is somewhat more restrictive but easy to enforce.nTwo Phase Locking and Strict 2PL:Locks implement the notions of conflict directly.qThe lock manager keeps track of the locks issued.qDeadlocks may arise;can either be prevented or detected.nMulti-Granularity Locking:qAllows flexible tradeoff between lock“scope”in DB,and locking overhead in RAM and CPUnMore to the storyqOptimistic/Multi-version/Timestamp CCqIndex“l(fā)atching”,phantoms
收藏