Groups | Blog | Home
all groups > sql server data mining > december 2004 >

sql server data mining : Summarizing data, grouping by time period.


Nevyn Twyll
12/31/2004 1:44:01 PM
I have a table of student incidents, including absences from school, listing
the student ID, the date of an incident, and whether it was an absence.

I want to get a result set from this table showing me ON ONE ROW, the
studentID, and the absences split up by term.

Table: [Incidents]
Pertinent Columns:
StudentID [int]
IncidentDate [datetime]
Absence [bool]

Let's say there are 3 terms: 9/1/2004-12/31/2004, 1/10/2005-3/18/2005,
3/28/2005-6/20/2005.

I know I could get a list of student absences total by saying:
SELECT StudentID, COUNT(*) FROM Incidents WHERE Absence = 1 GROUP BY
StudentID

But how would I get it so the absences were tallied by term, and put into
their own columns, like:
StudentID, Term1Absences, Term2Absences, Term3Absences.

Help!

Thanks.

- Nevyn

Ryan
12/31/2004 2:17:02 PM
Try a Case statement:
SELECT StudentID,
COUNT(Case When IncidentDate >= '9/1/2004' and IncidentDate <= '12/31/2004'
then 1 end) FallAbsences,
COUNT(Case When IncidentDate >= '1/10/2005' and IncidentDate <= '3/18/2005'
then 1 end) SpringAbsences
FROM Incidents WHERE Absence = 1
GROUP BY StudentID

You can swap out the COUNT for SUM. I'm not sure which is faster, I think
the count, but either way it's the same.

Ryan


[quoted text, click to view]
Nevyn Twyll
12/31/2004 2:33:15 PM
Oh, man, that ROCKS!

Thanks so much, man.

[quoted text, click to view]

Steve Kass
12/31/2004 5:09:07 PM
Nevyn,

A report writer will probably do the job more easily than SQL, but you
can do something like this:

select
StudentID,
(select count(*) from Incidents as I2
where I2.StudentID = I1.StudentID
and I2.IncidentDate between '20040901' and '20041231'
and I2.Absence = 1) as Term1Absences,
(select count(*) from Incidents as I2
where I2.StudentID = I1.StudentID
and I2.IncidentDate between '20050110' and '20040318'
and I2.Absence = 1) as Term2Absences,
-- and so on
from (
select distinct StudentID
from Incidents
) as I1


If you have a table of students and want students with no
absences listed also, you can use the students table instead
of the derived table aliased as I1 here.

Also, if the datetime values for IncidentDate are not constrained
to have a time part of midnight, you might want to do this instead:

and IncidentDate >= '20040901'
and IncidentDate < '20050101'
....

Hardcoding the term dates makes this code hard to maintain,
and short of designing a more flexible database, you could use
some variables:

declare @term1start datetime set @term1start = '20040901'
....
and I2.IncidentDate between @term1start and @term1end
....

Steve Kass
Drew University



[quoted text, click to view]
AddThis Social Bookmark Button