《數(shù)據(jù)庫系統(tǒng)》英文教學(xué)課件
《數(shù)據(jù)庫系統(tǒng)》英文教學(xué)課件,數(shù)據(jù)庫系統(tǒng),數(shù)據(jù)庫,系統(tǒng),英文,教學(xué),課件
Database SystemsLecture#1Guifeng ZHENGSchool of Software,SYSUFall,2010Slide 2Big Names in Database SystemsCompanyProductRemarksOracle Oracle 8i,9i,etc.Worlds 2nd largest software companyCEO,Larry Ellison,worlds 2nd richestIBMDB2Worlds 2nd largest after Informix acquisitionMicrosoftAccess,SQL ServerAccess comes with MS OfficeSybaseAdaptive ServerCEO John Chen,grown up in HKInformixDynamic ServerAcquired by IBM in 2001Samuel J.Palmisano Steve Ballmer John Chen Larry Ellison Slide 3What Is a Database?A large,integrated collection of datawhich models a real-world enterprise:Entitiesstudents,courses,instructors,TAs.RelationshipsHillary is currently taking DB course.Barack is currently teaching DB course.John is currently TA-ing DB course but took it last year.A Database Management System(DBMS)is a software package that stores and manages DBsSlide 4Databases are everywhere:non-webPolice stationTracking crime stats.Airline bookingsRetailers:Wal-Mart,etc.when to re-order,purchase patterns,data-miningSlide 5Databases are everywhere:webRetail:Amazon,etc.Search enginesSearchable DBs:IMDB,,etc.Web2.0 sites:flickr=images+tagsCMS systems(Wikis,blog&forum software,etc.)Slide 6Databases involved in ordering a pizza?1.Pizza Huts DB2.Credit card records3.CC approval by credit agencies4.Phone companys records5.Caller IDnError-checking,anticrimeSlide 7Your wallet is full of DB recordsDrivers licenseCredit cardsSYSU CardMedical insurance cardMoney(serial numbers)Etc“You may not be interested in databases,but databases are interested in you.”-TrotskySlide 8Example of a Traditional DB AppSuppose we build a system We store:checking accountssavings accountsaccount holdersstate of each of each persons accountsSlide 9Can we do without a DBMS?Sure!Start by storing the data in files:checking.txt savings.txt customers.txtNow write C or Java programs to implement specific tasksSlide 10Doing it without a DBMS.Transfer$100 from Georges savings to checking:Read savings.txtFind&update the line w/“George”balance-=100Write savings.txtRead checking.txtFind&update the line w/“George”balance+=100Write checking.txtWrite a C program to do the following:Slide 11Problems without an DBMS.1.System crashes:Same problem even if reorderedHigh-volume (Rare frequent)2.Simultaneous access by many usersGeorge and Dick visit ATMs at same timeLock checking.txt before each usewhat is the problem?Read savings.txtFind&update the line w/“George.”Write savings.txtRead checking.txtFind&update the line w/“George”Write checking.txtCRASH!Slide 12Problems without a DBMS.3.Large data sets(100s of GBs,or TBs,)No indicesFinding“George”in huge flatfile is expensiveModifications intractable without better data structures“George”“Georgie”is very expensiveDeletions are very expensiveSlide 13Problems without an DBMS.5.Security?File system may lack security featuresFile system security may be coarse6.Application programming interface(API)?nInterfaces,interoperability7.How to query the data?Slide 14In homebrew system,must supportfailover/recoveryconcurrent usedeal with large datasets?securityinterop(互動互動)?querying in what?DBMS as applicationQ:How does a DBMS solve these problems?A:See third part of course,but for nowSlide 15One big issue:Transaction processingGrouping of several queries(or other DB operations)into one transactionACID test properties Atomicityall or nothingConsistencyconstraints on relationshipsIsolationconcurrency controlsimulated solipsism(自閉)DurabilityCrash recoverySlide 16Atomicity&DurabilityAvoiding inconsistent stateA DBMS prevents this outcomexacts are all or nothingOne simple idea:log progress of and plans for each xactDurability:changes stay made(with log)Atomicity:entire xact is committed at onceSlide 17IsolationMany users concurrent executionDisk access is slow(compared to CPU)dont waste CPU keep runningInterweaving actions of different user programs but can lead to inconsistency:e.g.,two programs simultaneously withdraw from the same accountFor each user,should look like a single-user systemSimulated solipsismSlide 18IsolationContrast with a file in two NotepadsStrategy:ignore multiple userswhichever saves last winsfirst save is overwrittenContrast with a file in two WordsStrategy:blunt(生硬的)isolationOne can editTo the other its read-onlySlide 19ConsistencyEach xact(on a consistent DB)must leave it in a consistent statecan define integrity constraintschecks that the defined claims about the dataOnly xacts obeying them are allowedSlide 20A level up:data models(數(shù)據(jù)模型數(shù)據(jù)模型)Any DBMS uses a data model:collection of concepts for describing dataRelational data model:basically universalOracle,DB2,SQL Server,other SQL DBMSsRelations:table of rows&columnsa rels schema(關(guān)系模式關(guān)系模式)defines its fieldsThough some have OO extensionsSlide 21Data Schemas(模式模式)Schema:description of particular set of data,using some data model“Physical schema”Physical files on diskSchemaSet of relations/tables,with structureViews(“external schema”)Virtual tables(虛擬表)generated for user typesviewlogicalphysicalView definitionsLogical schema Physical schemaApplicationSlide 23Schema e.g.:college registrarSchema:Students(ssn:string,name:string,login:string,age:int,gpa:real)Courses(cid:string,cname:string,credits:int)Enrolled(sid:string,cid:string,grade:string)Physical schema:Relations stored as unordered text files.Indices on first column of each relViews:My_courses(cname:string,grade:string,credits:int)Course_info(ssn:string,name:string,status:string)Slide 24How the programmer sees the DBMSStart with SQL DDL to create tables:Continue with SQL to populate tables:CREATE TABLE Students(Name CHAR(30)SSN CHAR(9)PRIMARY KEY NOT NULL,Category CHAR(20);INSERT INTO StudentsVALUES(Hillary,123456789,undergraduate);Slide 25How the programmer sees the DBMSUltimately files,but complexStudents:Courses:Takes:Slide 26Querying:Structured Query LanguageFind all the students who have taken C20.0046:Find all the students who C20.0046 previously:Find the students names:SELECT SSN FROM TakesWHERE CID=C20.0046;SELECT SSN FROM TakesWHERE CID=C20.0046 AND Semester=Fall,2005;SELECT Name FROM Students,TakesWHERE Students.SSN=Takes.SSN AND CID=C20.0046 AND Semester=Fall,2005;Slide 27Database Industry(工業(yè),行業(yè),產(chǎn)業(yè)工業(yè),行業(yè),產(chǎn)業(yè))Relational databases are based on set theoryCommercial DBMSs:Oracle,IBMs DB2,Microsofts SQL Server,etc.Opensource:MySQL,PostgreSQL,etc.DBAs manage theseProgrammers write apps(CRUD,etc.)XML(“semi-structured data”)also importantSlide 28The Study of DBMSPrimary aspects:Data modelingSQLDB programmingDBMS implementationThis course covers all four(tho less of#4)Also will look at some more advanced areasXML,websearch,column-oriented DBs,RAID,RegExs,MapReduceSlide 29Course outlineDatabase design:Entity/Relationship modelsModeling constraintsThe relational model:Relational algebraTransforming E/R models to relational schemasSQLDDL&query languageSlide 30Course outlineProgramming for databasesSome DB implementationIndexes,sorting,xactsAdvanced topicsMay change as course progressespartly in response to audienceAlso“current events”Slashdot/whatever,Database Blog,etc.Slide 31SQL ReadingsMany SQL references available onlineGood online(free)SQL tutorials include:A Gentle Introduction to SQL(http:/ for Web Nerds(http:/ 32The Project:design end-to-end DB web appdata modelIdentify entities&their relationships relationscreation of DB in Oracle/MySQLPopulation with real(alistic)dataweb app for accessing/modifying dataIdentification of“interesting”questions&actionsProduce DBMS interfaceWork in pairs(/threes)Choose topic on your ownStart forming your group today!Slide 33Collaboration modelHomework and exams done individuallyProject done with your team members only,though can in general use any toolsNon-cited use of others problem solutions,code,etc.=plagiarismSee academic honesty policyContact me if youre at all unclear before a particular caseCite any materials used if youre at all unclear afterSlide 34On-going FeedbackDont be afraid to ask questionsSome parts will be abstract/mathematicalTopic selection will be partly based on student interestSlide 35So what is this course about,really?Languages:SQL(some XML)Data modelingSome theory!(rereading)Functional dependencies,normal formse.g.,how to find most efficient schema for dataSome DBMS implementation(algs&data structs)Algorithms and data structures(in the latter part)e.g.,indices make data much faster to find how?Lots of DB implementation and hacking for the projectSlide 36For next timeGet the bookSkim chapter 1Start reading chapter 2
收藏