Groups | Blog | Home
all groups > sql server (microsoft) > march 2005 >

sql server (microsoft) : I need to find total for rows with a certain value: how?


someone
3/29/2005 8:25:32 PM
I have a table that contains a column that collects discreet data (1 for on
and 0 for off, just those two values) and a date column which is a timestamp
containing a date and time. There are other columns in this table but they
are unimportant in this discussion.

A process inserts rows into this table about every second or so, depending
on the state of an object.

It has been requested of me that I prepare a report. Among other items,
this report must show the total time an object is turned on. In other
words, for every row that is set to 1, I need to add up the time and let the
users know how long that item was set to on for a given period of time, say
24 hours i.e. how long was that object in an on state during the last 24
hours.

I consider myself to be pretty handy at SQL but I must confess I'm at a loss
to come up with a query for this.

Any ideas and examples would be greatly appreciated?

Thanks in advance,
Chris Smith

sienko
3/30/2005 1:14:22 AM
It might be helpful for you to provide sample data, especially about
this timestamp field. When a row is inserted with the on/off flag set
to ON, what value is in the timestamp column? The way I see it, there
are 3 options: -

1. The time the object entered the ON state only
2. The time the object left the ON state only
3. Both the time the objected entered the ON state + the time it left
or the duration it was in this state.

The 3rd option is easy because all you need to do is sum the duration
(or calculate it from the Left - Entered Time and sum) for each state
in a given time period.

The 1st and 2nd options are a bit tough since you have to form a sort
of JOIN to find the related state from which the object toggled from or
to. e.g. if I store the time the object entered the ON state, to find
out how long it stayed in this state, I need to find the immediate NEXT
time it entered the OFF state. Likewise, if it is the end time that is
stored, I need to find the immediate PREVIOUS time it left the OFF
state.

Another thing you will have to consider, though this has to do with
data integrity, how do u ensure no overlaps occur in the records (such
that your data says at a given point in time, the object was both ON
and OFF)? Do you allow time gaps in which you can't tell whether the
object was ON or OFF? if you don't how do u ensure there are no gaps?
If you do, how do you interpret the time where there are no records?

Please clarify...
sienko
3/30/2005 1:25:09 AM
It might be helpful for you to provide sample data, especially about
this timestamp field. When a row is inserted with the on/off flag set
to ON, what value is in the timestamp column? The way I see it, there
are 3 options: -

1. The time the object entered the ON state only
2. The time the object left the ON state only
3. Both the time the objected entered the ON state + the time it left
or the duration it was in this state.

The 3rd option is easy because all you need to do is sum the duration
(or calculate it from the Left - Entered Time and sum) for each state
in a given time period.

The 1st and 2nd options are a bit tough since you have to form a sort
of join to find the related state from which the object toggled from or
to. e.g. if I store the time the object entered the ON state, to find
out how long it stayed in this state, I need to find the NEXT time it
entered the OFF state. Likewise, if it is the end time that is stored,
I need to find the PREVIOUS time it left the OFF state.

Another thing you will have to consider, though this has to do with
integrity, how do u ensure no overlaps occur in the records? Do you
allow time gaps in which you can't tell whether the object was ON or
OFF? if you don't how do u ensure there are no gaps? If you do, how do
you interpret the time where there are no records?

Please clarify...
--CELKO--
4/2/2005 9:58:14 AM
You usually model time as durations, so you would have

CREATE TABLE Events
(event_id CHAR(10) NOT NULL,
start_time DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
end_time DATETIME, -- null means current
.. );
someone
4/2/2005 12:49:01 PM
Hi Sienko,

I'll try to answer your questions as best as I can.

First of all, it should be noted this isn't a typical database. This is
actually a database accessed through MS SQL 2000 called InSQL. InSQL allows
you to store real time data more efficiently than SQL Server while allowing
for more transactions per second and storing the data in a manner that
allows for smaller files on the harddrive than what SQL Server itself would
allow. InSQL itself is actually made up of extension tables to SQL Server
2000. Along with this, InSQL provides other things such as providing a
discreete data type which allows you to store values that tells you if a
device is on or off where 1=on/open and 0=off/close. This probably isn't
important for you to know but it might be. For the most part, T-SQL is
still valid so any solution you can help me find should still work.

A typical row will include a column for a tagname, a discreete value column
(contains a value of 1 or 0, that is all as far as I know of), and a
timestamp which is just a datetime column (this column is equal to the time
the discreete value was retrieved from a device, not the time the row was
created). There are other columns but they aren't important to this
discussion as far as I know of. An example row might look like:

tagname_here ---- discreete value ---- datetime

There are rows inserted for each tag about every second or so. We'll say a
row is inserted each second to keep this simple. The rows will be the same,
just the fact the discreete value is different along with the time. For
example, a valve maybe open at the time a sensor takes a reading so this
data is retrieved (the state of the valve along with the datetime the
reading was taken) and inserted into a table (discreetehistory is the name
of the table I think, we'll use that for the sake of this conversation). As
long as the valve is open, a row will be inserted into this table where that
row will have a 1 in the discreete column and the datetime that value was
taken. When the valve is closed, rows will be inserted into the table where
the discreete value is now 0 and it has a datetime with it. Again, as long
as the valve is closed, rows will be inserted into the table with a
discreete value of 0 along with it's datetime. In other words, we have a
process that is looking at equipment 24/7/365. This process determines the
state of devices and inserts rows into a table every second or so.
Everytime a row is written, that row represents the state of the device and
a datetime is stored in that row to let us know when the sensor reading was
taken.

I don't think the table will allow null values nor do I believe a null value
will ever occur but am not certain. There is a transition period between
opening and closing valves. After all, a valve doesn't instantly open or
close. Depending on the size of size valve, it might take 1 to 5 seconds to
switch states. But, I don't believe this transition state is being
recorded. I wish I was at work answering this. I could tell you for sure
then.

Again, the problem is I need a query where I can report how long a valve was
left open in the last two hours. In other words, for a time from 12:00PM to
2:00 PM, the valve might be open at first but close at 12:30. Then, the
valve might open at 12:40. Then, the valve might be close at 1:00PM and
then open again at 1:10 PM. In other words, the valve was opened and
closeded multiple times during this two hour block. In total, the valve was
open 1hr and 40 minutes during this two hour block with the valve being
closed for 20 minutes. I need a query that will read this table and
everytime the valve is open which is represented with a discreete value of 1
(where 1 = open), then it will sum up all the times and report this to me.

I thought I was pretty good with T-SQL but I've never done anything like
this before when it comes to time. I must admit I don't have the slightest
idea how to proceed with this one.

Any thoughts would be greatly appreciated. If you need more information,
please let me know.

Thanks for trying to help me out. I apologize for taking a while to respond
to you but I was busy and simply forgot about this.

Thanks again!

Chris Smith

[quoted text, click to view]

someone
4/2/2005 8:39:01 PM
The thing is I'm not creating a table with multiple columns for dates. A
table is already present with one datecolumn only. I need to find all rows
within a timespan that have a value equal to 1 in the discreete value
column. Once I find all the rows, I need to add them all somehow such that
it will tell me the total time all the rows were set to a 1 in this column.
Any ideas on that?

Thanks!

Chris Smith

[quoted text, click to view]

sienko
4/4/2005 1:27:33 AM
Your reply was quite elaborate! I think it clears most of my queries.
First, just so as we confirm that I got the problem as it is, the table
contains the poll status of the device with a resolution of 1 second.
I.e. each second or so, it checks that status of the device and stores
the device tag (supposedly), the status the device is in at the time of
polling, and the time the device was polled. Here's an example

(tag, state, datetime,...)

MAC-1, 1, 2005-04-04 12:00:00.021,....
MAC-1, 1, 2005-04-04 12:00:01.634,....
MAC-1, 1, 2005-04-04 12:00:02.301,....
.....
MAC-1, 0, 2005-04-04 12:10:56.041,....
MAC-1, 0, 2005-04-04 12:10:57.776,....
.....
MAC-1, 1, 2005-04-04 12:24:45.321,....


So here's how I would do it.

First, I'll assume that an ID is not stored for each record and even if
it is, I'll assume it is not contiguous (I'll need a contiguous ID in
my query). As such, I'll create a temp table with an identity column to
facilitate numbering the rows for a given device in order from the
earliest to latest. I hope the DBMS you are using (InSQL) has a
facility to create temporary tables.

create table #tmp (sid int identity(1,1), timeid datetime,
state bit)

Next, I'll populate this table with the appropriate values i.e. from a
particular machine in a particular time frame. I have to order by time
so that the sid identity column can be ordered by time.

insert into #tmp (timeid, state)
select timeid, state from poll
where tag='MAC-1' and timeid between @query_start and @query_end
order by timeid

After that, you do your query grouping by state so that you can sum the
time difference

select p.state, sum(datediff(ss,p2.timeid,p.timeid)) as
time_in_state
from #tmp p
inner join #tmp p2 on p.sid-1=p2.sid
group by p.state

(You get something like this....
state time_in_state
------ ------------------
0 234
1 879
)
Note that I'm joining the current record (#tmp as p) with the record
inserted just before it (#tmp as p2). So from the above sample, a
joined record would be like: -

MAC-1, 0, 2005-04-04 12:10:57.776,...., MAC-1, 0, 2005-04-04
12:10:56.041,....

So I make an assumption that, when recorded state at 2005-04-04
12:10:57.776 is 0, this implies that the machine has been in state 0
from the last reported time (2005-04-04 12:10:56.041) till this time (
2005-04-04 12:10:57.776). So it has been in state 0 for 1.735 seconds.
Note that this previous time might also have been a 0 state but all we
care about is the current polling status.

So all we do is sum this 1. and 0. seconds to get the total time the
device was in that state. The summed value can only be as accurate as
the resolution time. So, given that your resolution time is to poll
every 1 second, the error margin for each will be in the order of
microseconds. As you sum, the error will of course increase. So if you
want accurate summations per second, you might consider increasing the
polling interval. Otherwise, if a summation in minutes or hours is good
enough, with an error margin of a few seconds being ok, then this will
work for you.

I used random data to test this query and it worked fine. Try it with
your live data and post any issues you might find. All the same I hope
this works for you.
AddThis Social Bookmark Button