Groups | Blog | Home
all groups > sql server programming > october 2004 >

sql server programming : percentages of top 10 tallies



Tom Moreau
10/5/2004 9:51:42 PM
I hate to recommend temp tables but this is a case where they would be
acceptable. Consider the fact that you're asking the server to do a lot by
creating this result set. Then, you'll have to do it again in order to get
the percentages. Here's a solution:

create table #t
(
brand int primary key -- I assumed an int, since you didn't
provide DDL
, tally int not null
)

insert #t
select top 10 bb.brand, count(bb.brand) as tally
from participant p, beerbrands bb
where p.favbeerid = bb.beerid
group by bb.brand
order by tally desc

select
t1.*
, 100.0 * t1.tally / (select sum (tally) from #t) as Percentage
from
#t

drop table #t

--
Tom

----------------------------------------------------
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinnaclepublishing.com
..
[quoted text, click to view]
i have the following query:

select top 10 bb.brand, count(bb.brand) as tally
from participant p, beerbrands bb
where p.favbeerid = bb.beerid
group by bb.brand
order by tally desc

it will give me the top 10 brand, and count of that brand. Fine. But I
need a 3rd column which is the percentage of that brand's tally, out
of the sum of the top 10 tallies.

anyone?... anyone?...
a NO[at]SPAM b.com
10/5/2004 11:07:24 PM
i have the following query:

select top 10 bb.brand, count(bb.brand) as tally
from participant p, beerbrands bb
where p.favbeerid = bb.beerid
group by bb.brand
order by tally desc

it will give me the top 10 brand, and count of that brand. Fine. But I
need a 3rd column which is the percentage of that brand's tally, out
of the sum of the top 10 tallies.

anyone?... anyone?...
Steve Kass
10/5/2004 11:31:38 PM
Tom,

Here's a way to avoid recreating the result set that uses ROLLUP to
get the total into the result set, and a sneaky join with the table of
all brands to get the brands with the total in the same row. Here's an
example from Northwind - it gets the top 10 customers by number of
orders descending (with ties, since there is a tie for 10th), and gives
the percentages each customer's orders account for from among those top
10 (actually 12) customers listed.

I would be hard pressed to say this is better, since with only 10
rows, the temp table solution is hardly a problem, but I'm posting it
wondering if anyone can make it a bit simpler. Somehow I don't think it
needs three levels of aggregation.

select
coalesce(T.CustomerID,C.CustomerID) as CustomerID,
min(T.Orders) as Orders,
cast(min(T.Orders)*100.00/max(T.Orders) as decimal(5,2)) as [%]
from (
select
CustomerID,
sum(Orders) as Orders
from (
select top 10 with ties
CustomerID,
count(OrderID) as Orders
from Northwind..Orders
group by CustomerID
order by count(OrderID) desc
) T
group by CustomerID, Orders
with rollup
having grouping(CustomerID) = grouping(Orders)
) T left outer join (
select CustomerID
from Northwind..Customers
) C
on coalesce(T.CustomerID,C.CustomerID) = C.CustomerID
group by coalesce(T.CustomerID,C.CustomerID)
having count(*) = 2
go

Steve Kass
Drew University

[quoted text, click to view]
Tom Moreau
10/6/2004 7:36:45 AM
Interesting. I don't get the same percentages as you:

create table #t
(
CustomerID nchar (5) primary key
, Orders int not null
)

insert #t
select top 10
CustomerID
, count (*)
from
Orders
group by
CustomerID
order by
count (*) desc

select
t1.*
, 100.0 * t1.Orders / (select sum (Orders) * 1.0 from #t) as Percentage
from
#t t1
go

drop table #t
go


--
Tom

----------------------------------------------------
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinnaclepublishing.com
..
[quoted text, click to view]
Tom,

Here's a way to avoid recreating the result set that uses ROLLUP to
get the total into the result set, and a sneaky join with the table of
all brands to get the brands with the total in the same row. Here's an
example from Northwind - it gets the top 10 customers by number of
orders descending (with ties, since there is a tie for 10th), and gives
the percentages each customer's orders account for from among those top
10 (actually 12) customers listed.

I would be hard pressed to say this is better, since with only 10
rows, the temp table solution is hardly a problem, but I'm posting it
wondering if anyone can make it a bit simpler. Somehow I don't think it
needs three levels of aggregation.

select
coalesce(T.CustomerID,C.CustomerID) as CustomerID,
min(T.Orders) as Orders,
cast(min(T.Orders)*100.00/max(T.Orders) as decimal(5,2)) as [%]
from (
select
CustomerID,
sum(Orders) as Orders
from (
select top 10 with ties
CustomerID,
count(OrderID) as Orders
from Northwind..Orders
group by CustomerID
order by count(OrderID) desc
) T
group by CustomerID, Orders
with rollup
having grouping(CustomerID) = grouping(Orders)
) T left outer join (
select CustomerID
from Northwind..Customers
) C
on coalesce(T.CustomerID,C.CustomerID) = C.CustomerID
group by coalesce(T.CustomerID,C.CustomerID)
having count(*) = 2
go

Steve Kass
Drew University

[quoted text, click to view]
Tom Moreau
10/6/2004 8:33:25 AM
OK, got it. My solution uses less IO and yours has the lowest query cost.
The Orders table in Northwind is quite small. I'm curious as to what the
performance would be with a larger table.

--
Tom

---------------------------------------------------------------
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinnaclepublishing.com


[quoted text, click to view]

Hi Tom,

That's because Steve uses with ties and you don't.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Tom Moreau
10/6/2004 12:39:48 PM
Hugo is using what is called a derived table. You must specify an alias -
in this case, T - or you will get an error.

--
Tom

---------------------------------------------------------------
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinnaclepublishing.com


[quoted text, click to view]
thanks for the help guys.
a quick question: what is the capital T for in Hugo's query? It seems
I can change it to anything and it still works, but if I remove it, it
doesn't.


On Wed, 6 Oct 2004 08:33:25 -0400, "Tom Moreau"
[quoted text, click to view]
Hugo Kornelis
10/6/2004 12:49:48 PM
[quoted text, click to view]

Hi Steve,

I'm not sure if this is simpler. But is is different.

select CustomerID, Orders, cast(100.0 * Orders / (
select sum(Orders)
from (
select top 10 with ties
CustomerID,
count(OrderID) as Orders
from Northwind..Orders
group by CustomerID
order by count(OrderID) desc
) T) as decimal(5,2)) as [%]
from (
select top 10 with ties
CustomerID,
count(OrderID) as Orders
from Northwind..Orders
group by CustomerID
order by count(OrderID) desc
) T


Best, Hugo
--

Hugo Kornelis
10/6/2004 2:21:52 PM
[quoted text, click to view]

Hi Tom,

That's because Steve uses with ties and you don't.

Best, Hugo
--

a NO[at]SPAM b.com
10/6/2004 3:51:16 PM
thanks for the help guys.
a quick question: what is the capital T for in Hugo's query? It seems
I can change it to anything and it still works, but if I remove it, it
doesn't.


On Wed, 6 Oct 2004 08:33:25 -0400, "Tom Moreau"
[quoted text, click to view]
Steve Kass
10/6/2004 5:01:54 PM
Hugo,

My goal was to avoid two scans of the Orders table without using a
temp table. While I succeeded, I had to bring in the Customers table,
and it's hard to guess when that amounts to a worthwhile tradeoff.

SK

[quoted text, click to view]
a NO[at]SPAM b.com
10/6/2004 8:00:39 PM

ok thanks for clearing that up and provinding me with a reference.


On Wed, 06 Oct 2004 20:48:12 +0200, Hugo Kornelis
[quoted text, click to view]
Hugo Kornelis
10/6/2004 8:48:12 PM
[quoted text, click to view]

Hi Mortar,

If you use tables (or views) in the FROM clause, the syntax is:

FROM { tablename | viewname } [ [AS] table_alias ]

You don't have to provide an alias (in which case you can reference
columns from a table (or view) by prefixing the column name with the
tablename (or viewname). If you do provide an alias, you'll have to prefix
columns with the alias instead of the tablename (or viewname).


If you use a derived table, you use the following syntax:

FROM (subquery) [ AS ] table_alias [ ( column_alias [, ...n ] ) ]

In this case, the table_alias is mandatory. This makes sense, as there
must be a way te reference columns in the result set of the subquery; it
has no name of it's own, so you have to supply it - even if all references
to columns from the derived table are done with column name only. Required
by the syntax.


For clarity and readability, I usually don't omit the optional keyword AS
and I try to choose a descriptive mnemonic as table alias. In this case, I
copied and adapted Steve's code and didn't change the alias.

Best, Hugo
--

Hugo Kornelis
10/7/2004 12:33:10 AM
[quoted text, click to view]

Hi Steve,

As always - only tests will tell what's the best choice for the OP.

Best, Hugo
--

AddThis Social Bookmark Button