sql server data mining:
Hello, Scott This should do the job: SELECT * FROM Table1 T1 WHERE EXISTS ( SELECT * FROM Table2 T2 WHERE T1.A=T2.A AND T1.B=T2.B AND T1.C=T2.C AND T2.D=2 ) Razvan
I'm attempting to write a query that pulls data from two tables in a really complex way, and I can't figure a way around it. Because of the architecture in use, it has to be done in a single query, without using a stored procedure. This is a simplified version of what I'm doing: Let's say I have two tables, each with four columns, named A, B, C and D: Table 1 A,B,C,D 1,2,3,5 3,5,7,9 3,9,2,6 3,6,7,1 9,8,3,0 2,5,8,1 Table 2 A,B,C,D 5,3,2,3 4,8,7,1 3,5,7,2 1,4,5,6 8,9,3,2 2,6,2,2 Each of the tables has thousands of records. I need a query that does this: For each row in table 1, look up the values in table 2. If a row can be found in table 2 whose columns A, B and C match columns A, B and C in table 1, and whose column D equals 2, then return the row from table 1. Do not return a row unless all three columns A, B and C match, and Table2.D equals 2. Using this rule on the table values above, it should be returning only the row "3,5,7,9" from Table 1 because there is only one row in Table 2 that has columns A,B,C,D equal to "3,5,7,2" Here's what I'm trying now: SELECT * FROM Table1 WHERE Table1.A IN (SELECT DISTINCT(A) FROM Table2 WHERE Table2.D = 2) AND Table1.B IN (SELECT DISTINCT(B) FROM Table2 WHERE Table2.D = 2) AND Table1.C IN (SELECT DISTINCT(C) FROM Table2 WHERE Table2.D = 2) This returns 3,5,7,9, 3,9,2,6 and 3,6,7,1 from Table 1, when it should only be returning 3,5,7,9. It does this because column A matches (3), Column B matches multiple records in Table 2 with a Table2.D=2, and so does Column C. Even though I am specifying AND for the three columns, it is actually acting as an OR, because each SELECT DISTINCT is separate, and does not take into consideration the other SELECTs. The first SELECT DISTINCT pulls a result set, and Column A is compared against it. The second SELECT DISTINCT pulls a different result set, and Column B is compared against it. And so on. This results in an additive comparison. What I need, in effect, is for the SELECT DISTINCT(B) to be more like "Take the SELECT DISTINCT(A) result set and pare it down further using the SELECT DISTINCT(B) values". Then The SELECT DISTINCT(C) needs to be more like "Take the SELECT DISTINCT(B) result set and pare it down even further using the SELECT DISTINCT(C) values". I hope this makes sense, and that someone can see what I'm missing. I've spent hours trying to figure out a way of doing this, without success.
Scott, Try this: SELECT A, B, C, D FROM [Table 1] AS T1 WHERE EXISTS ( SELECT * FROM [Table 2] AS T2 WHERE T2.A = T1.A AND T2.B = T1.B AND T2.C = T1.C AND T2.D = 2 ) or SELECT T1.A, T1.B, T1.C, T1.D FROM [Table 1] AS T1 JOIN [Table 2] AS T2 ON T2.A = T1.A AND T2.B = T1.B AND T2.C = T1.C WHERE T2.D = 2 The latter query will return duplicate rows, so it could need DISTINCT. Use whichever is faster (hopefully at least one has no typos). Steve Kass Drew University [quoted text, click to view] Scott MacLean wrote: >I'm attempting to write a query that pulls data from two tables in a really >complex way, and I can't figure a way around it. Because of the architecture >in use, it has to be done in a single query, without using a stored >procedure. This is a simplified version of what I'm doing: > >Let's say I have two tables, each with four columns, named A, B, C and D: > >Table 1 > >A,B,C,D >1,2,3,5 >3,5,7,9 >3,9,2,6 >3,6,7,1 >9,8,3,0 >2,5,8,1 > >Table 2 > >A,B,C,D >5,3,2,3 >4,8,7,1 >3,5,7,2 >1,4,5,6 >8,9,3,2 >2,6,2,2 > >Each of the tables has thousands of records. I need a query that does this: > >For each row in table 1, look up the values in table 2. If a row can be >found in table 2 whose columns A, B and C match columns A, B and C in table >1, and whose column D equals 2, then return the row from table 1. Do not >return a row unless all three columns A, B and C match, and Table2.D equals >2. > >Using this rule on the table values above, it should be returning only the >row "3,5,7,9" from Table 1 because there is only one row in Table 2 that has >columns A,B,C,D equal to "3,5,7,2" > >Here's what I'm trying now: > >SELECT * FROM Table1 >WHERE > Table1.A IN (SELECT DISTINCT(A) FROM Table2 WHERE Table2.D = 2) >AND > Table1.B IN (SELECT DISTINCT(B) FROM Table2 WHERE Table2.D = 2) >AND > Table1.C IN (SELECT DISTINCT(C) FROM Table2 WHERE Table2.D = 2) > >This returns 3,5,7,9, 3,9,2,6 and 3,6,7,1 from Table 1, when it should only >be returning 3,5,7,9. It does this because column A matches (3), Column B >matches multiple records in Table 2 with a Table2.D=2, and so does Column C. >Even though I am specifying AND for the three columns, it is actually acting >as an OR, because each SELECT DISTINCT is separate, and does not take into >consideration the other SELECTs. The first SELECT DISTINCT pulls a result >set, and Column A is compared against it. The second SELECT DISTINCT pulls a >different result set, and Column B is compared against it. And so on. This >results in an additive comparison. What I need, in effect, is for the SELECT >DISTINCT(B) to be more like "Take the SELECT DISTINCT(A) result set and pare >it down further using the SELECT DISTINCT(B) values". Then The SELECT >DISTINCT(C) needs to be more like "Take the SELECT DISTINCT(B) result set >and pare it down even further using the SELECT DISTINCT(C) values". > >I hope this makes sense, and that someone can see what I'm missing. I've >spent hours trying to figure out a way of doing this, without success. > > >
Thanks Steve and Razvan, this did the trick. Once I saw this I thought; "of COURSE" (hand slap on forehead). Thanks again. [quoted text, click to view] "Razvan Socol" <rsocol@gmail.com> wrote in message news:1110347824.072514.140200@z14g2000cwz.googlegroups.com... > Hello, Scott > > This should do the job: > > SELECT * FROM Table1 T1 > WHERE EXISTS ( > SELECT * FROM Table2 T2 > WHERE T1.A=T2.A AND T1.B=T2.B AND T1.C=T2.C > AND T2.D=2 > ) > > Razvan >
Don't see what you're looking for? Try a search.
|