Is this an example of a real problem? If this is a realistic representation of your data then my solution would be to fix the design. Delete the duplicated data and add a natural key. Of course, it may be that your example isn't realistic in which case it would help us answer you better if you could correct it. -- David Portas SQL Server MVP --
Thanks for the help Ross I came up with the similar idea, but instead of self joining I used the following technique: select * from ( select b.CustomerID, b.ID, 'B' as BID, ( select count(*) from TestCountJoin b2 where b2.CustomerID = b.CustomerID and b2.ItemCode = 'B' and b2.ID <= b.ID ) as seq from ( -- get all Bs select distinct b.CustomerID, b.ID from TestCountJoin b where b.ItemCode = 'B' ) b ) b inner join ( select a.CustomerID, a.ID, 'A' as BID, ( select count(*) from TestCountJoin a2 where a2.CustomerID = a.CustomerID and a2.ItemCode = 'A' and a2.ID <= a.ID ) as seq from ( -- get all Bs select distinct b.CustomerID, b.ID from TestCountJoin b where b.ItemCode = 'A' ) a ) a on a.CustomerID = b.CustomerID and a.Seq = b.Seq do you think there would be any performance benifit between either method? Thanks again -=- Brett [quoted text, click to view] "Ross Presser" <rpresser@NOSPAMgmail.com.invalid> wrote in message news:1kc4mmdxlmnh6.ydurg2lf9rs1.dlg@40tude.net... > On Wed, 8 Jun 2005 10:52:50 +0100, Brett Gerhardi wrote: > >> Can anyone work out how I can do this in a set-based way? I am totally >> stumped on this one. >> >> Given the following: >> > [snip DDL and inserts] >> I want to return all instances where ItemCode 'B' can be paired with an >> ItemCode 'A' for the same CustomerID (can be in PK order) >> >> By "paired" I mean that each ItemCode B can only match a line with >> ItemCode >> 'A' once, then both lines are excluded from any further matching. >> >> So if a single customer has 4 Bs and 3 A's, only the first 3 Bs should be >> returned from the query, as there is no 4th A to match >> >> So the results from the above should be >> >> CustomerID, B_ID, B_ItemCode, A_ID, A_ItemCode >> 1, 4, B, 1, A >> 1, 5, B, 2, A >> 1, 6, B, 3, A >> 2, 9, B, 8, A >> >> 3 pairs for customer 1, 1 pair for customer 2 and none from customer 3. >> >> Thanks for any help >> -=- Brett > > Thanks for the DDL and INSERTs. > > You'll have to use a self-join to number the rows, then another self-join > to filter them. Put the first self-join in a view for convenience: > > CREATE View TestView AS > > SELECT > x.CustomerID, x.ID, x.ItemCode, count(y.ID) RowNum > FROM > TestCountJoin x > INNER JOIN TestCountJoin y > ON x.CustomerID = y.CustomerID AND x.ItemCode=y.ItemCode > AND x.ID >= y.ID > GROUP BY > x.CustomerID, x.ID, x.ItemCode > > GO > > Now, the filtering is easy: > > > SELECT > a.CustomerID, > b.ID B_ID, b.ItemCode B_ItemCode, > a.ID A_ID, a.ItemCode A_ItemCode > FROM > TestView a, TestView b > WHERE > a.ItemCode = 'A' AND b.ItemCode = 'B' > AND a.CustomerID = b.CustomerID > AND a.RowNum = b.RowNum > > CustomerID B_ID B_ItemCode A_ID A_ItemCode > ----------- ----------- ---------- ----------- ---------- > 1 4 B 1 A > 1 5 B 2 A > 1 6 B 3 A > 2 9 B 8 A > > Make sure you add indexes for performance.
[quoted text, click to view] On Wed, 8 Jun 2005 10:52:50 +0100, Brett Gerhardi wrote: > Can anyone work out how I can do this in a set-based way? I am totally > stumped on this one. > > Given the following: > [snip DDL and inserts] > I want to return all instances where ItemCode 'B' can be paired with an > ItemCode 'A' for the same CustomerID (can be in PK order) > > By "paired" I mean that each ItemCode B can only match a line with ItemCode > 'A' once, then both lines are excluded from any further matching. > > So if a single customer has 4 Bs and 3 A's, only the first 3 Bs should be > returned from the query, as there is no 4th A to match > > So the results from the above should be > > CustomerID, B_ID, B_ItemCode, A_ID, A_ItemCode > 1, 4, B, 1, A > 1, 5, B, 2, A > 1, 6, B, 3, A > 2, 9, B, 8, A > > 3 pairs for customer 1, 1 pair for customer 2 and none from customer 3. > > Thanks for any help > -=- Brett
Thanks for the DDL and INSERTs. You'll have to use a self-join to number the rows, then another self-join to filter them. Put the first self-join in a view for convenience: CREATE View TestView AS SELECT x.CustomerID, x.ID, x.ItemCode, count(y.ID) RowNum FROM TestCountJoin x INNER JOIN TestCountJoin y ON x.CustomerID = y.CustomerID AND x.ItemCode=y.ItemCode AND x.ID >= y.ID GROUP BY x.CustomerID, x.ID, x.ItemCode GO Now, the filtering is easy: SELECT a.CustomerID, b.ID B_ID, b.ItemCode B_ItemCode, a.ID A_ID, a.ItemCode A_ItemCode FROM TestView a, TestView b WHERE a.ItemCode = 'A' AND b.ItemCode = 'B' AND a.CustomerID = b.CustomerID AND a.RowNum = b.RowNum CustomerID B_ID B_ItemCode A_ID A_ItemCode ----------- ----------- ---------- ----------- ---------- 1 4 B 1 A 1 5 B 2 A 1 6 B 3 A 2 9 B 8 A
Can anyone work out how I can do this in a set-based way? I am totally stumped on this one. Given the following: create table TestCountJoin ( ID int identity(1,1) primary key, CustomerID int, ItemCode varchar(1) ) insert into TestCountJoin (CustomerID, ItemCode) values (1, 'A') insert into TestCountJoin (CustomerID, ItemCode) values (1, 'A') insert into TestCountJoin (CustomerID, ItemCode) values (1, 'A') insert into TestCountJoin (CustomerID, ItemCode) values (1, 'B') insert into TestCountJoin (CustomerID, ItemCode) values (1, 'B') insert into TestCountJoin (CustomerID, ItemCode) values (1, 'B') insert into TestCountJoin (CustomerID, ItemCode) values (1, 'B') insert into TestCountJoin (CustomerID, ItemCode) values (2, 'A') insert into TestCountJoin (CustomerID, ItemCode) values (2, 'B') insert into TestCountJoin (CustomerID, ItemCode) values (3, 'A') insert into TestCountJoin (CustomerID, ItemCode) values (3, 'A') I want to return all instances where ItemCode 'B' can be paired with an ItemCode 'A' for the same CustomerID (can be in PK order) By "paired" I mean that each ItemCode B can only match a line with ItemCode 'A' once, then both lines are excluded from any further matching. So if a single customer has 4 Bs and 3 A's, only the first 3 Bs should be returned from the query, as there is no 4th A to match So the results from the above should be CustomerID, B_ID, B_ItemCode, A_ID, A_ItemCode 1, 4, B, 1, A 1, 5, B, 2, A 1, 6, B, 3, A 2, 9, B, 8, A 3 pairs for customer 1, 1 pair for customer 2 and none from customer 3. Thanks for any help -=- Brett
I have tried it myself and Ross's joining technique is much more efficient than my calculated field method. Thinking about it I'm not surprised. In my real world scenario I also found putting the output from the pre-rownum'ed into a table variable (with unique constraints for indexes), then doing the self join on the table variable improved the performance significantly. Thanks again for the help. Regards -=- Brett [quoted text, click to view] "Brett Gerhardi" <brett.gerhardi@nildram.net> wrote in message news:Po6dndp1tpY9szrfRVnyiw@pipex.net... > Thanks for the help Ross > > I came up with the similar idea, but instead of self joining I used the > following technique: > > select * > from > ( > select b.CustomerID, b.ID, 'B' as BID, > ( > select count(*) > from TestCountJoin b2 > where b2.CustomerID = b.CustomerID > and b2.ItemCode = 'B' > and b2.ID <= b.ID > ) as seq > from > ( > -- get all Bs > select distinct b.CustomerID, b.ID > from TestCountJoin b > where b.ItemCode = 'B' > ) b > ) b > inner join > ( > select a.CustomerID, a.ID, 'A' as BID, > ( > select count(*) > from TestCountJoin a2 > where a2.CustomerID = a.CustomerID > and a2.ItemCode = 'A' > and a2.ID <= a.ID > ) as seq > from > ( > -- get all Bs > select distinct b.CustomerID, b.ID > from TestCountJoin b > where b.ItemCode = 'A' > ) a > ) a on a.CustomerID = b.CustomerID > and a.Seq = b.Seq > > > do you think there would be any performance benifit between either method? > > Thanks again > -=- Brett > > "Ross Presser" <rpresser@NOSPAMgmail.com.invalid> wrote in message > news:1kc4mmdxlmnh6.ydurg2lf9rs1.dlg@40tude.net... >> On Wed, 8 Jun 2005 10:52:50 +0100, Brett Gerhardi wrote: >> >>> Can anyone work out how I can do this in a set-based way? I am totally >>> stumped on this one. >>> >>> Given the following: >>> >> [snip DDL and inserts] >>> I want to return all instances where ItemCode 'B' can be paired with an >>> ItemCode 'A' for the same CustomerID (can be in PK order) >>> >>> By "paired" I mean that each ItemCode B can only match a line with >>> ItemCode >>> 'A' once, then both lines are excluded from any further matching. >>> >>> So if a single customer has 4 Bs and 3 A's, only the first 3 Bs should >>> be >>> returned from the query, as there is no 4th A to match >>> >>> So the results from the above should be >>> >>> CustomerID, B_ID, B_ItemCode, A_ID, A_ItemCode >>> 1, 4, B, 1, A >>> 1, 5, B, 2, A >>> 1, 6, B, 3, A >>> 2, 9, B, 8, A >>> >>> 3 pairs for customer 1, 1 pair for customer 2 and none from customer 3. >>> >>> Thanks for any help >>> -=- Brett >> >> Thanks for the DDL and INSERTs. >> >> You'll have to use a self-join to number the rows, then another self-join >> to filter them. Put the first self-join in a view for convenience: >> >> CREATE View TestView AS >> >> SELECT >> x.CustomerID, x.ID, x.ItemCode, count(y.ID) RowNum >> FROM >> TestCountJoin x >> INNER JOIN TestCountJoin y >> ON x.CustomerID = y.CustomerID AND x.ItemCode=y.ItemCode >> AND x.ID >= y.ID >> GROUP BY >> x.CustomerID, x.ID, x.ItemCode >> >> GO >> >> Now, the filtering is easy: >> >> >> SELECT >> a.CustomerID, >> b.ID B_ID, b.ItemCode B_ItemCode, >> a.ID A_ID, a.ItemCode A_ItemCode >> FROM >> TestView a, TestView b >> WHERE >> a.ItemCode = 'A' AND b.ItemCode = 'B' >> AND a.CustomerID = b.CustomerID >> AND a.RowNum = b.RowNum >> >> CustomerID B_ID B_ItemCode A_ID A_ItemCode >> ----------- ----------- ---------- ----------- ---------- >> 1 4 B 1 A >> 1 5 B 2 A >> 1 6 B 3 A >> 2 9 B 8 A >> >> Make sure you add indexes for performance. > >
My real world problem is far more complex than just this - this is the only part of the problem that I didn't have a good solution to and was interested in other peoples approaches to it. Briefly, the real-world scenario is that I have the task of building part of a discount system. The discount system has is able to automatically provide a discount on service type sales lines based on other service type sales lines. So essentially if a customer has a service A we want to provide him a discount if he has service B. However if the customer has 2 As but only 1 B, we only want to give 1 A the discount. So this is an secondary requirement away from the 'natural' purpose of the service 'sales' system, which is to record orders for services and provision them. There are further complexities such as start and end dates which remove lines from being eligable, which makes the run-time match calculation desirable. Regards -=- Brett [quoted text, click to view] "David Portas" <REMOVE_BEFORE_REPLYING_dportas@acm.org> wrote in message news:_didnfLU-PvhojrfRVn-qw@giganews.com... > Is this an example of a real problem? If this is a realistic > representation of your data then my solution would be to fix the design. > Delete the duplicated data and add a natural key. Of course, it may be > that your example isn't realistic in which case it would help us answer > you better if you could correct it. > > -- > David Portas > SQL Server MVP > -- > >
Don't see what you're looking for? Try a search.
|