Groups | Blog | Home
all groups > sql server new users > august 2005 >

sql server new users : calculating percentile


Loane Sharp
8/30/2005 12:00:00 AM
Hi there
I picked up Ken Henderson's book on T-SQL which has been really enlightening
about what is possible, but a bit over my head in terms of
operationalization. However he doesn't seem to cover calculating
percentiles. Is there another useful reference, or can you point me
essentially in the right direction?
Best regards
Loane

Ross Presser
8/30/2005 1:39:12 PM
[quoted text, click to view]

http://www.sqlteam.com/item.asp?ItemID=16480 covers the subject in detail.

Basically, the percentile rank of a given score X is the percent of scores
in the set that are less than X. So:

SELECT
convert(float, SUM(case when T.Value <= @X then 1 else 0 END)) / COUNT(*)
AS Percentile
FROM MyTable T

is the quick answer to the question. The convert(float, ) is necessary
because otherwise the division would be between two integers, which would
Ross Presser
8/30/2005 3:26:40 PM
[quoted text, click to view]

D'oh. Of course you need to multiply that by 100 to get a percentile;
otherwise you get a fraction between zero and one. Sorry.

SELECT
100.0 * convert(float,
SUM(case when T.Value <= @X then 1 else 0 END)) / COUNT(*)
AS Percentile
Loane Sharp
8/31/2005 9:15:27 PM
thanks very much ross

[quoted text, click to view]

Loane Sharp
9/1/2005 12:00:00 AM
Hi Steve, Much appreciated. I changed your formulation to exclude the CROSS
JOIN, since the query was then looping through literally billions of rows.
In this new formulation, for 1,000 rows the query works perfectly, but for
more than that (specifically 135,000 rows) the query takes a heck of a long
time to run. I tried using the index tuning wizard, but this doesn't make
any recommendations. Eventually the server spontaneously reboots. I'm using
an IBM IntelliStation 3.0GHz with 2GB RAM and SQL Server 2000, which should
be OK for most purposes. Have I done something wrong in reformulating the
query?
Best regards
Loane

use candidate
update candtemp
set commscore=x.CommScore
from
(
select
c1.admref,
cast(
coalesce((select count(CommT)
from candtemp as c2
where c2.commt <= c1.commt),0.00) as float)/
cast(
coalesce((select count(CommT)
from candtemp),0.00) as float)
* 100 as CommScore
from candtemp as c1
) as x


[quoted text, click to view]

Loane Sharp
9/1/2005 12:06:42 AM
Hi Ross
This is no doubt going beyond the call of duty, but how would I calculate
the percentile for each of, say, 1,000 records (ie. not using a single value
assigned to the %X local variable)?
Best regards
Loane


[quoted text, click to view]

Steve Kass
9/1/2005 1:14:44 AM
Here is how you might find the % rank of each Freight value in the
Northwind..Orders table:


select
O1.Freight,
cast(
coalesce((select count(*)*100.0/TC.totalCount
from Orders as O2
where O2.Freight < O1.Freight),0.00) as decimal(5,2)) as
Percentile
from Orders as O1
cross join (
select count(*) as totalCount
from Orders
) as TC
group by O1.Freight, TC.totalCount
order by Percentile

Steve Kass
Drew University

[quoted text, click to view]
Loane Sharp
9/2/2005 12:00:00 AM
Hi Steve, Much appreciated. I changed your formulation to exclude the CROSS
JOIN, since the query was then looping through literally billions of rows.
In this new formulation, for 1,000 rows the query works perfectly, but for
more than that (specifically 135,000 rows) the query takes a heck of a long
time to run. I tried using the index tuning wizard, but this doesn't make
any recommendations. Eventually the server spontaneously reboots. I'm using
an IBM IntelliStation 3.0GHz with 2GB RAM and SQL Server 2000, which should
be OK for most purposes. Have I done something wrong in reformulating the
query?
Best regards
Loane

use candidate
update candtemp
set commscore=x.CommScore
from
(
select
c1.admref,
cast(
coalesce((select count(CommT)
from candtemp as c2
where c2.commt <= c1.commt),0.00) as float)/
cast(
coalesce((select count(CommT)
from candtemp),0.00) as float)
* 100 as CommScore
from candtemp as c1
) as x

AddThis Social Bookmark Button