《數(shù)據(jù)庫系統(tǒng)》教學(xué)課件
《數(shù)據(jù)庫系統(tǒng)》教學(xué)課件,數(shù)據(jù)庫系統(tǒng),數(shù)據(jù)庫,系統(tǒng),教學(xué),課件
SQL:The Query LanguageJianlin FengSchool of SoftwareSUN YAT-SEN UNIVERSITYThe important thing is not tostop questioning.Albert EinsteinReviewRelational Algebra(Operational Semantics)Given a query,how to mix and match the relational algebra operators to answer itUsed for query optimizationRelational Calculus(Declarative Semantics)Given a query,what do I want my answer set to include?Algebra and safe calculus are simple and powerful models for query languages for relational modelHave same expressive powerSQL can express every query that is expressible in relational algebra/calculus.(and more)Relational Query LanguagesSQL QueryRel.Algebra Query 1Rel.Algebra Query 2Rel.Algebra Query n.Pick the cheapest oneRelational Query LanguagesnTwo sublanguages:qDDL Data Definition LanguagenDefine and modify schema(at all 3 levels)qDML Data Manipulation LanguagenQueries can be written intuitively.nDBMS is responsible for efficient evaluation.qThe key:precise semantics for relational queries.qOptimizer can re-order operationsnWont affect query answer.qChoices driven by“cost model”The SQL Query LanguagenThe most widely used relational query language.nStandardized(although most systems add their own“special sauce”-including PostgreSQL)nWe will study SQL92-a basic subsetExample Databasesidsnameratingage1Fred7222Jim2393Nancy827Sailorssidbidday11029/1221029/13Reservesbidbnamecolor101Ninared102Pintablue103Santa MariaredBoatsThe SQL DDLCREATE TABLE Sailors(sid INTEGER,sname CHAR(20),rating INTEGER,age REAL,PRIMARY KEY sid);CREATE TABLE Boats(bid INTEGER,bname CHAR(20),color CHAR(10)PRIMARY KEY bid);CREATE TABLE Reserves(sid INTEGER,bid INTEGER,day DATE,PRIMARY KEY(sid,bid,day),FOREIGN KEY sid REFERENCES Sailors,FOREIGN KEY bid REFERENCES Boats);sidsnameratingage1Fred7222Jim2393Nancy827bidbnamecolor101Ninared102Pintablue103Santa Mariaredsidbidday11029/1221029/13 The SQL DMLnFind all 18-year-old sailors:SELECT*FROM Sailors SWHERE S.age=18 To find just names and ratings,replace the first line:SELECT S.sname,S.ratingsidsnameratingage1Fred7222Jim2393Nancy827Sailors Querying Multiple RelationsSELECT S.snameFROM Sailors S,Reserves RWHERE S.sid=R.sid AND R.bid=102sidsnameratingage1Fred7222Jim2393Nancy827Sailorssidbidday11029/1221029/13ReservesBasic SQL QuerySELECT DISTINCT target-listFROM relation-listWHERE qualificationrelation-list:List of relation names,possibly with a range-variable after each nametarget-list:List of expressions over attributes of tables in relation-listDISTINCT:optional.Answer should not contain duplicates.SQL default:duplicates are not eliminated!(Result a“multiset”)qualification:Comparisons combined using AND,OR and NOT.Comparisons are Attr op const or Attr1 op Attr2,where op is one of,etc.1.FROM:compute cross product of tables.2.WHERE:Check conditions,discard tuples that fail.3.SELECT:Delete unwanted fields.4.DISTINCT(optional):eliminate duplicate rows.Note:Probably the least efficient way to compute a query!qQuery optimizer will find more efficient ways to get the same answer.Query SemanticsSELECT DISTINCT target-listFROM relation-listWHERE qualificationFind sailors whove reserved at least one boatnWould DISTINCT make a difference here?nWhat is the effect of replacing S.sid by S.sname in the SELECT clause?qWould DISTINCT make a diff to this variant of the query?S.sidSailors S,Reserves RS.sid=R.sidSELECTFROMWHEREAbout Range VariablesnNeeded when ambiguity could arise.qe.g.,same table used multiple times in FROM (“self-join”)SELECT x.sname,x.age,y.sname,y.ageFROM Sailors x,Sailors yWHERE x.age y.agesidsnameratingage1Fred7222Jim2393Nancy827SailorsArithmetic ExpressionsSELECT S.age,S.age-5 AS age1,2*S.age AS age2FROM Sailors SWHERE S.sname=dustinSELECT S1.sname AS name1,S2.sname AS name2FROM Sailors S1,Sailors S2WHERE 2*S1.rating=S2.rating-1String Comparisons _ stands for any one character and%stands for 0 or more arbitrary characters.SELECT S.snameFROM Sailors SWHERE S.sname LIKE B_%BFind sids of sailors whove reserved a red or a green boatSELECT R.sidFROM Boats B,Reserves RWHERE R.bid=B.bid AND (B.color=red OROR B.color=green)SELECT R.sidFROM Boats B,Reserves RWHERE R.bid=B.bid AND B.color=red UNIONUNION SELECT R.sidFROM Boats B,Reserves RWHERE R.bid=B.bid AND B.color=green.or:SELECT R.sidFROM Boats B,Reserves RWHERE R.bid=B.bid AND (B.color=red ANDAND B.color=green)Find sids of sailors whove reserved a red and a green boatFind sids of sailors whove reserved a red and a green boatSELECT S.sidFROM Sailors S,Boats B,Reserves RWHERE S.sid=R.sid AND R.bid=B.bid AND B.color=redINTERSECTINTERSECTSELECT S.sidFROM Sailors S,Boats B,Reserves RWHERE S.sid=R.sid AND R.bid=B.bid AND B.color=greenCould use a self-join:SELECT R1.sidFROM Boats B1,Reserves R1,Boats B2,Reserves R2WHERE R1.sid=R2.sid AND R1.bid=B1.bid AND R2.bid=B2.bid AND(B1.color=red ANDAND B2.color=green)Find sids of sailors whove reserved a red and a green boatFind sids of sailors who have not reserved a boatSELECT S.sidFROM Sailors SEXCEPTEXCEPTSELECT S.sidFROM Sailors S,Reserves RWHERE S.sid=R.sid Nested Queries:INSELECT S.snameFROM Sailors SWHERE S.sid ININ (SELECT R.sid FROM Reserves R WHERE R.bid=103)Names of sailors whove reserved boat#103:SELECT S.snameFROM Sailors SWHERE S.sid NOT INNOT IN (SELECT R.sid FROM Reserves RWHERE R.bid=103)Names of sailors whove not reserved boat#103:Nested Queries:NOT INNested Queries with CorrelationnSubquery must be recomputed for each Sailors tuple.qThink of subquery as a function call that runs a querySELECT S.snameFROM Sailors SWHERE EXISTSEXISTS (SELECT *FROM Reserves R WHERE R.bid=103 AND S.sid=R.sid)Names of sailors whove reserved boat#103:More on Set-Comparison Operatorsnweve seen:IN,EXISTSncan also have:NOT IN,NOT EXISTSnother forms:op ANY,op ALLnFind sailors whose rating is greater than that of some sailor called Horatio:SELECT*FROM Sailors SWHERE S.rating ANY (SELECT S2.rating FROM Sailors S2 WHERE S2.sname=Horatio)Conceptual SQL EvaluationSELECT DISTINCT target-listFROM relation-listWHERE qualificationGROUP BY grouping-listHAVING group-qualificationSELECTRelation cross-product Apply selections(eliminate rows)Project away columns(just keep those used in SELECT,GBY,HAVING)WHEREFROMGROUP BYForm groups&aggregateHAVINGEliminate groupsDISTINCTEliminate duplicatesSorting the Results of a QuerynORDER BY column ASC|DESC,.nCan order by any column in SELECT list,including expressions or aggs:SELECT S.rating,S.sname,S.ageFROM Sailors S,Boats B,Reserves RWHERE S.sid=R.sid AND R.bid=B.bid AND B.color=redORDER BY S.rating,S.sname;SELECT S.sid,COUNT(*)AS redrescntFROM Sailors S,Boats B,Reserves RWHERE S.sid=R.sid AND R.bid=B.bid AND B.color=redGROUP BY S.sidORDER BY redrescnt DESC;Null ValuesnField values are sometimes unknown(e.g.,a rating has not been assigned)or inapplicable(e.g.,no spouses name).qSQL provides a special value null for such situations.nThe presence of null complicates many issues.E.g.:qSpecial operators needed to check if value is/is not null.qIs rating8 true or false when rating is equal to null?What about AND,OR and NOT connectives?qWe need a 3-valued logic (true,false and unknown).Joins Explicit join semantics needed unless it is an INNER join(INNER is default)SELECT(column_list)FROM table_name INNER|LEFT|RIGHT|FULL OUTER JOIN table_name ON qualification_listWHERE Inner JoinOnly rows that match the qualification are returned.SELECT s.sid,s.name,r.bidFROM Sailors s INNER JOIN Reserves rON s.sid=r.sidReturns only those sailors who have reserved boats.SELECT s.sid,s.name,r.bidFROM Sailors s INNER JOIN Reserves rON s.sid=r.sid Left Outer JoinReturns all matched rows,plus all unmatched rows from the table on the left of the join clause(use nulls in fields of non-matching tuples)SELECT s.sid,s.name,r.bidFROM Sailors s LEFT OUTER JOIN Reserves rON s.sid=r.sidSELECT s.sid,s.name,r.bidFROM Sailors s LEFT OUTER JOIN Reserves rON s.sid=r.sid Right Outer JoinRight Outer Join returns all matched rows,plus all unmatched rows from the table on the right of the join clauseSELECT r.sid,b.bid,b.nameFROM Reserves r RIGHT OUTER JOIN Boats bON r.bid=b.bidSELECT r.sid,b.bid,b.nameFROM Reserves r RIGHT OUTER JOIN Boats bON r.bid=b.bid Full Outer JoinFull Outer Join returns all(matched or unmatched)rows from the tables on both sides of the join clause SELECT r.sid,b.bid,b.nameFROM Reserves r FULL OUTER JOIN Boats bON r.bid=b.bidSELECT r.sid,b.bid,b.nameFROM Reserves r FULL OUTER JOIN Boats bON r.bid=b.bid Note:in this case it is the same as the ROJ!bid is a foreign key in reserves,so all reservations musthave a corresponding tuple in boats.Views:Defining External DB SchemasCREATE VIEW view_nameAS select_statementMakes development simplerOften used for securityNot“materialized”CREATE VIEW RedsAS SELECT B.bid,COUNT(*)AS scount FROM Boats B,Reserves R WHERE R.bid=B.bid AND B.color=red GROUP BY B.bidSELECT bname,scount FROM Reds R,Boats B WHERE R.bid=B.bidAND scount 10RedsCREATE VIEW RedsAS SELECT B.bid,COUNT(*)AS scount FROM Boats B,Reserves R WHERE R.bid=B.bid AND B.color=red GROUP BY B.bidViews Instead of Relations in Queries Discretionary Access Control GRANT privileges ON object TO users WITH GRANT OPTIONObject can be a Table or a ViewPrivileges can be:SelectInsertDeleteReferences(cols)allow to create a foreign key that references the specified column(s)AllCan later be REVOKEdUsers can be single users or groupsSee Chapter 17 for more details.Two more important topicsnConstraintsnSQL embedded in other languagesIntegrity Constraints(Review)nAn IC describes conditions that every legal instance of a relation must satisfy.qInserts/deletes/updates that violate ICs are disallowed.qCan ensure application semantics(e.g.,sid is a key),or prevent inconsistencies(e.g.,sname has to be a string,age must be=1 AND rating=10)CREATE TABLE Reserves(sname CHAR(10),bid INTEGER,day DATE,PRIMARY KEY (bid,day),CONSTRAINT noInterlakeResCHECK (Interlake(SELECT B.bnameFROM Boats BWHERE B.bid=bid)Writing Applications with SQLnSQL is not a general purpose programming language.+Tailored for data retrieval and manipulation+Relatively easy to optimize and parallelize-Cant write entire apps in SQL aloneOptions:Make the query language“Turing complete”Avoids the“impedance mismatch”but,loses advantages of relational language simplicityAllow SQL to be embedded in regular programming languages.Q:What needs to be solved to make the latter approach work?Embedded SQLnDBMS vendors traditionally provided“host language bindings”qE.g.for C or COBOLqAllow SQL statements to be called from within a programqTypically you preprocess your programs qPreprocessor generates calls to a proprietary DB connectivity librarynGeneral patternqOne call to connect to the right database(login,etc.)qSQL statements can refer to host variables from the languagenTypically vendor-specificqWe wont look at any in detail,well look at standard stuffnProblemqSQL relations are(multi-)sets,no a priori bound on the number of records.No such data structure in C.qSQL supports a mechanism called a cursor to handle this.Just to give you a flavorEXEC SQL SELECT S.sname,S.ageINTO:c_sname,:c_ageFROM Sailors SWHERE S.sid=:c_sidDatabase APIs:Alternative to embeddingnRather than modify compiler,add a library with database calls(API)qspecial objects/methodsqpasses SQL strings from language,presents result sets in a language-friendly wayqODBC a C/C+standard started on WindowsqJDBC a Java equivalentqMost scripting languages have similar thingsnE.g.For Perl there is DBI,“oraPerl”,other packagesnMostly DBMS-neutral qat least try to hide distinctions across different DBMSsArchitecturenA lookup service maps“data source names”(“DSNs”)to drivers qTypically handled by OSnBased on the DSN used,a“driver”is linked into the app at runtimenThe driver traps calls,translates them into DBMS-specific codenDatabase can be across a networknODBC is standard,so the same program can be used(in principle)to access multiple database systemsnData source may not even be an SQL database!Application ODBC driverData SourceODBC/JDBCnVarious vendors provide drivers qMS bundles a bunch into WindowsqVendors like DataDirect and OpenLink sell drivers for multiple OSesnDrivers for various data sourcesqRelational DBMSs(Oracle,DB2,SQL Server,etc.)q“Desktop”DBMSs(Access,Dbase,Paradox,FoxPro,etc.)qSpreadsheets(MS Excel,Lotus 1-2-3,etc.)qDelimited text files(.CSV,.TXT,etc.)nYou can use JDBC/ODBC clients over many data sourcesqE.g.MS Query comes with many versions of MS Office(msqry32.exe)nCan write your own Java or C+programs against xDBCJDBCnPart of Java,easy to usenJava comes with a JDBC-to-ODBC bridgeqSo JDBC code can talk to any ODBC data sourceqE.g.look in your Windows Control Panel or MacOS Utilities folder for JDBC/ODBC drivers!nJDBC tutorial onlineqhttp:/ SummaryAPIs are needed to interface DBMSs to programming languagesnEmbedded SQL uses“native drivers”and is usually faster but less standardnODBC(used to be Microsoft-specific)for C/C+nJDBC the standard for JavanScripting languages(PHP,Perl,JSP)are becoming the preferred technique for web-based systemsSummarynRelational model has well-defined query semanticsnSQL provides functionality close to basic relational model(some differences in duplicate handling,null values,set operators,)nTypically,many ways to write a queryqDBMS figures out a fast way to execute a query,regardless of how it is written.
收藏