In my (admittedly brief) sojurn as an SQL programmer I've often admired "outer joins" in textbooks but never really understood their use. I've finally come across a problem that I think is served by an outer join. -- This table stores the answer to each test question CREATE TABLE TestResults ( studentId varchar (15) NOT NULL, testId int NOT NULL REFERENCES Tests(testId), qId int NOT NULL REFERENCES TestQuestions(qId), response int NOT NULL REFERENCES TestDistractors(dId), CONSTRAINT PK_TestResults PRIMARY KEY NONCLUSTERED (testId, studentId, qId), ) -- This table defines which questions are on which tests CREATE TABLE TestQuestions_Tests ( testId int NOT NULL REFERENCES Tests(testId), qId int NOT NULL REFERENCES TestQuestions(qId), ) (Table Tests contains housekeeping information about a particular test, TestQuestions defines individual questions, TestDistractors lists the possible responses.) In schematic form, the simplest form of my problem is to find all the questions that haven't been answered. That would be: SELECT tqt.qId FROM TestResults AS tr RIGHT OUTER JOIN TestQuestions_Tests AS tqt ON tr.testId = tqt.testId AND tr.qId = tqt.qId WHERE tr.qId is NULL So far I think this is pretty straightforward and an efficient solution. Agreed? But my real problem is a little bit more complex. What I really want to know is "for a given student, on a given test, which questions haven't been answered?" So now I have: SELECT tqt.qId FROM TestResults AS tr RIGHT OUTER JOIN TestQuestions_Tests AS tqt ON tr.testId = tqt.testId AND tr.qId = tqt.qId WHERE tqt.testId = '1' AND tr.studentId = '7' AND tr.qId IS NULL Is this the canonical form of the solution to my problem? It seems to me like it is generating a whole slew of rows and then filtering them. Is there a more elegant or efficient way to do it? -- Rick
In article <MPG.19a5704699c54a76989699@news.newsguy.com>, GMann@dublin.com says... [quoted text, click to view] > So now I have: > > SELECT tqt.qId > FROM TestResults AS tr > RIGHT OUTER JOIN TestQuestions_Tests AS tqt > ON tr.testId = tqt.testId AND tr.qId = tqt.qId > WHERE tqt.testId = '1' AND tr.studentId = '7' AND tr.qId IS NULL > > Is this the canonical form of the solution to my problem? It seems to > me like it is generating a whole slew of rows and then filtering them.
(Yes, I'm talking to myself) Whether it is or is not the canonical form, it's not a solution. <sigh> If tr.qId IS NULL, then so is tr.studentId.... If I remove the tr.studentId = '7' clause then I get the unanswered questions for *all* students. I just want to get a list of the questions on a particular test and emit a list of the ones that haven't been answered by a particular student. Don't tell me I'm going to have to resort to old-fashioned 3-gen procedural programming to solve this problem. Any ideas?
In article <Xns93D7E4944DFE6Yazorman@127.0.0.1>, sommar@algonet.se says... [quoted text, click to view] > This is how I would write it: > > SELECT tqt.qId > FROM TestQuestions_Tests tqt > WHERE NOT EXISTS (SELECT * > FROM TestResults tr > WHERE tr.studentId = @studentid > AND tr.testId = @testid > AND tqt.testId = tr.testId) > AND tqt.testId = @testid
Being a translation of mine, that had the same problem mine had -- it returned no rows. (I know, I know, I didn't supply any create statements. I have some but I didn't want to clutter up what I thought was a conceptually simple problem.) Here's what I made work: SELECT tqt.qid FROM ( SELECT testId, qId FROM testResults WHERE studentId = @studentId and testid = @testId ) AS tr RIGHT OUTER JOIN TestQuestions_Tests AS tqt ON tr.qId = tqt.qId WHERE tr.qId IS NULL AND tqt.testId = @testId By prefiltering the data I avoid retrieving *everyone's* unanswered questions. How would I convert that one to "NOT EXISTS?"
Guinness Mann (GMann@dublin.com) writes: [quoted text, click to view] > But my real problem is a little bit more complex. What I really want to > know is "for a given student, on a given test, which questions haven't > been answered?"
This is how I would write it: SELECT tqt.qId FROM TestQuestions_Tests tqt WHERE NOT EXISTS (SELECT * FROM TestResults tr WHERE tr.studentId = @studentid AND tr.testId = @testid AND tqt.testId = tr.testId) AND tqt.testId = @testid While outer joins can be used for NOT EXISTS logic, I prefer NOT EXISTS, as it more clearly states what the query is all about. Performancewise, outer join can sometimes be better, but it might just as well be slower. -- Erland Sommarskog, SQL Server MVP, sommar@algonet.se Books Online for SQL Server SP3 at
[quoted text, click to view] Guinness Mann wrote: > In my (admittedly brief) sojurn as an SQL programmer I've often admired > "outer joins" in textbooks but never really understood their use. I've > finally come across a problem that I think is served by an outer join. > > -- This table stores the answer to each test question > CREATE TABLE TestResults ( > studentId varchar (15) NOT NULL, > testId int NOT NULL REFERENCES Tests(testId), > qId int NOT NULL REFERENCES TestQuestions(qId), > response int NOT NULL REFERENCES TestDistractors(dId), > CONSTRAINT PK_TestResults > PRIMARY KEY NONCLUSTERED (testId, studentId, qId), > ) > > -- This table defines which questions are on which tests > CREATE TABLE TestQuestions_Tests ( > testId int NOT NULL REFERENCES Tests(testId), > qId int NOT NULL REFERENCES TestQuestions(qId), > ) >
Why is TestQuestions not just Questions (or Question -- I find singular table names usually work out better, if only because I can always make the primary key <table name>ID). If it was just Questions, then TestQuestions_Tests would just be TestQuestions. And why TestDistractors, rather than just Answers? Anyway, TestQuestions_Tests should have its own key field. That would simplify TestResults, and eliminate the possibility that the student is recorded as answering a question that wasn't on the test. If you ever delete from any of these tables, you'll want to make sure all the foreign key fields are indexed. Otherwise, for instance, any time you delete a row from TestQuestions_Tests, a table scan of TestResults will be done to make sure the delete can be allowed. Bill
[quoted text, click to view] Guinness Mann wrote: > > Being a translation of mine, that had the same problem mine had -- it > returned no rows. (I know, I know, I didn't supply any create > statements. I have some but I didn't want to clutter up what I thought > was a conceptually simple problem.) > > Here's what I made work: > > SELECT tqt.qid > FROM > ( SELECT testId, qId > FROM testResults > WHERE studentId = @studentId and testid = @testId > ) > AS tr > RIGHT OUTER JOIN TestQuestions_Tests AS tqt > ON tr.qId = tqt.qId > WHERE tr.qId IS NULL AND tqt.testId = @testId > > By prefiltering the data I avoid retrieving *everyone's* unanswered > questions. How would I convert that one to "NOT EXISTS?" >
SELECT TestQuestions_Tests.qid FROM TestQuestions_Tests WHERE TestQuestions_Tests = @TestId AND NOT EXISTS ( SELECT * FROM testResults WHERE studentId = @studentId AND testid = @testId AND TestQuestions_Tests.qid = TestResults.qID ) BTW, I'm not sure I've ever seen a right join in production code, only left. Conceptually, it's the same thing, but I always think of queries left-to-right, so it would throw me off when reading it. Bill
In article <3F52CC05.4000103@Ameritech.Net>, WCleveland@Ameritech.Net says... [quoted text, click to view] > > -- This table stores the answer to each test question > > CREATE TABLE TestResults ( > > studentId varchar (15) NOT NULL, > > testId int NOT NULL REFERENCES Tests(testId), > > qId int NOT NULL REFERENCES TestQuestions(qId), > > response int NOT NULL REFERENCES TestDistractors(dId), > > CONSTRAINT PK_TestResults > > PRIMARY KEY NONCLUSTERED (testId, studentId, qId), > > ) > > > > -- This table defines which questions are on which tests > > CREATE TABLE TestQuestions_Tests ( > > testId int NOT NULL REFERENCES Tests(testId), > > qId int NOT NULL REFERENCES TestQuestions(qId), > > ) > > > > Why is TestQuestions not just Questions (or Question -- I find > singular table names usually work out better, if only because I can > always make the primary key <table name>ID). If it was just Questions, > then TestQuestions_Tests would just be TestQuestions. And why > TestDistractors, rather than just Answers?
I didn't even mention TestQuestions_Distractors, TextResources, TestQuestions_AudioResources, TestQuestions_TextResources or AudioResources. :-) Technically (hey, I'm not a content-provider, I'm just the implementer) the "answer" is the distractor that represents the correct response. Therefore we have a table of distractors and elsewhere the distractorId of the correct response is noted. [quoted text, click to view] > If you ever delete from any of these tables, you'll want to make > sure all the foreign key fields are indexed. Otherwise, for > instance, any time you delete a row from TestQuestions_Tests, > a table scan of TestResults will be done to make sure the delete > can be allowed.
Actually, I go the other way on that. I delete from the Tests or TestQuestions table, and it cascades into the intersection tables. Thanks for the help with the "NOT EXISTS" part. -- Rick
Don't see what you're looking for? Try a search.
|