Friday, March 30, 2012

Peak During Time Period

Hi

I am hoping someone might be able to help me out with this.

I am writing a helpdesk system which records agents logging in and out
of the system.

I need to write a stored procedure which will show the peak number of
agents logged in concurrently during a specified time period. Within
the time period, the person viewing the report should be able to
specify intervals at which to see the data.

Eg. There is already a table in the system which holds logged
in/logged out data like

22/11/2004 14:02 - 22/11/2004 17:30
22/11/2004 09:00 - 22/11/2004 17:12
22/11/2004 10:25 - 22/11/2004 16:30
22/11/2004 11:02 - 22/11/2004 12:30
22/11/2004 16:00 - 22/11/2004 17:30

The report user can then say for example they want to view data
between 10th November and 12th November broken down into 15 minutes
intervals which would produce a table like this:

10/11/2004 00:00 - 10/11/2004 00:15
10/11/2004 00:15 - 10/11/2004 00:30
10/11/2004 00:30 - 10/11/2004 00:45
10/11/2004 00:45 - 10/11/2004 01:00 etc etc

Against each of these time slots, I need to work out the peak number
of concurrent agents logged in using the first table.

Can anyone make any suggestions? The time period the report user can
choose are either 15 mins, 30 mins, 45 mins, 1 hour and 1 day.

Thanks in advance[posted and mailed, please reply in news]

Dave (dave@.court-technologies.com) writes:
> I need to write a stored procedure which will show the peak number of
> agents logged in concurrently during a specified time period. Within
> the time period, the person viewing the report should be able to
> specify intervals at which to see the data.
> Eg. There is already a table in the system which holds logged
> in/logged out data like
> 22/11/2004 14:02 - 22/11/2004 17:30
> 22/11/2004 09:00 - 22/11/2004 17:12
> 22/11/2004 10:25 - 22/11/2004 16:30
> 22/11/2004 11:02 - 22/11/2004 12:30
> 22/11/2004 16:00 - 22/11/2004 17:30
> The report user can then say for example they want to view data
> between 10th November and 12th November broken down into 15 minutes
> intervals which would produce a table like this:
> 10/11/2004 00:00 - 10/11/2004 00:15
> 10/11/2004 00:15 - 10/11/2004 00:30
> 10/11/2004 00:30 - 10/11/2004 00:45
> 10/11/2004 00:45 - 10/11/2004 01:00 etc etc
> Against each of these time slots, I need to work out the peak number
> of concurrent agents logged in using the first table.

The normal recommendation for this sort of post is to include:

o CREATE TABLE statements for the involved tables.
o INSERT statements with sample data.
o The desired output given the sample.

This makes it easy to post a tested solution, since the dirty work is
already set up, and it's only to cut and paste.

This time I did it for you, because the problem seemed interesting enough.
First I set up a table of numbers. This is a one-column table with numbers
1 to whatever the limit (80000 in this case, that's 55 days). The I packed
the actual query in a stored procedure to easily permit for parameters.
@.len is the length of the reporting interval in minutes.

The query has a number of nested derived tables. The innermost gives
the number of agents logged in at any given minute. The middle table,
normalizes the minute to the start of the reporting interval, and
the outermost, get the maximum count for each interval.

Further testing is recommended!

CREATE TABLE sessions (start datetime NOT NULL,
stop datetime NULL)
go
SET DATEFORMAT dmy
go
SELECT TOP 80000 n = identity(int, 1, 1)
INTO numbers
FROM Northwind..Orders a
CROSS JOIN Northwind..Orders b
go
INSERT sessions (start, stop)
SELECT '22/11/2004 14:02', '22/11/2004 17:30' UNION
SELECT '22/11/2004 09:00', '22/11/2004 17:12' UNION
SELECT '22/11/2004 10:25', '22/11/2004 16:30' UNION
SELECT '22/11/2004 11:02', '22/11/2004 12:30' UNION
SELECT '22/11/2004 16:00', '22/11/2004 17:30' UNION
SELECT '22/11/2004 16:00', '22/11/2004 16:05' UNION
SELECT '22/11/2004 16:06', '22/11/2004 16:10'
go
CREATE PROCEDURE get_peaks @.start datetime,
@.stop datetime,
@.len smallint AS

SELECT intstart, intstop = dateadd(mi, @.len, intstart), MAX(cnt)
FROM (SELECT intstart = dateadd(mi, @.len *
(datediff(mi, @.start, a.minute) / @.len), @.start),
a.cnt
FROM (SELECT mi.minute, cnt = COUNT(s.start)
FROM (SELECT minute = dateadd(mi, n, @.start)
FROM numbers
WHERE n <= datediff(mi, @.start, @.stop)) AS mi
LEFT JOIN sessions s
ON mi.minute BETWEEN s.start AND s.stop
GROUP BY mi.minute) AS a
) AS b
GROUP BY intstart
ORDER BY intstart
go
EXEC get_peaks '20041122 08:00', '20041122 18:00', 15
go
DROP TABLE numbers
DROP TABLE sessions
DROP PROCEDURE get_peaks

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||>> I am writing a helpdesk system which records agents logging in and
out of the system. <<

Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, datatypes, etc. in
your schema are. Sample data is also a good idea, along with clear
specifications. Does your boss, who is paying you, hide this
information and expect you to get your job done?

You might also want to learn that the only format for temporal data in
Standard SQL is ISO-8601 (yyyy-mm-dd hh:mm:ss.sss...) and start using
it; you can never tell, other systems just might follow iSO standards
:)

>> I need to write a stored procedure which will show the peak number
of agents logged in concurrently during a specified time period. <<

Why not VIEWs? SQL is a non-procdural language after all. If you had
followed minimal netiquette, would this table lok liket his?

CREATE TABLE HelpDeskLogs
(agent_id CHAR(5) NOT NULL
REFERENCES Agents(agent_id)
ON UPDATE CASCADE
ON DELETE CASCADE,
start_time DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
finish_time DATETIME, -- null means still active
CHECK (start_time < finish_time),
PRIMARY KEY (agent_id, start_time));

>> The report user can then say for example they want to view data
between 10th November and 12th November broken down into 15 minutes
intervals which would produce a table like this: <<

Let's fill up a table of ranges:

CREATE TABLE ReportPeriods
(period_scale CHAR(2) DEFAULT '15' NOT NULL,
CHECK (period_scale IN ('15', '30', '45', 'HR', 'DY')
start_time DATETIME NOT NULL,
finish_time DATETIME NOT NULL, -- null means still active
CHECK (start_time < finish_time),
PRIMARY KEY (period_scale, start_time));

In standard SQL, we have a predicate for durations like this:

SELECT COUNT(agent_id) AS active_agents
FROM ReportPeriods AS P, HelpDeskLogs AS L
WHERE (P.start_time, P.finish_time)
OVERLAPS (L.start_time, L.finish_time);

That predicate gets translated into this:

(P.start_time > L.start_time
AND NOT (P.start_time >= L.finish_time
AND P.finish_time >= L.finish_time))
OR (L.start_time > P.start_time
AND NOT (L.start_time >= P.finish_time
AND L.finish_time >= P.finish_time))
OR (P.start_time = L.start_time
AND (P.finish_time <> L.finish_time
OR P.finish_time = L.finish_time))

Yes, it is a bit weird because it has to handle NULLs in the general
case.

You might also want to look up Rick Snodgrass at the University of
Arizona. he has a copy of his book on Temporal quereis in SQL in PDF
on his university website.|||"Dave" <dave@.court-technologies.com> wrote in message
news:7bbc1b13.0411250202.57f40780@.posting.google.c om...
> Hi
> I am hoping someone might be able to help me out with this.
> I am writing a helpdesk system which records agents logging in and out
> of the system.
> I need to write a stored procedure which will show the peak number of
> agents logged in concurrently during a specified time period. Within
> the time period, the person viewing the report should be able to
> specify intervals at which to see the data.
> Eg. There is already a table in the system which holds logged
> in/logged out data like
> 22/11/2004 14:02 - 22/11/2004 17:30
> 22/11/2004 09:00 - 22/11/2004 17:12
> 22/11/2004 10:25 - 22/11/2004 16:30
> 22/11/2004 11:02 - 22/11/2004 12:30
> 22/11/2004 16:00 - 22/11/2004 17:30
> The report user can then say for example they want to view data
> between 10th November and 12th November broken down into 15 minutes
> intervals which would produce a table like this:
> 10/11/2004 00:00 - 10/11/2004 00:15
> 10/11/2004 00:15 - 10/11/2004 00:30
> 10/11/2004 00:30 - 10/11/2004 00:45
> 10/11/2004 00:45 - 10/11/2004 01:00 etc etc
> Against each of these time slots, I need to work out the peak number
> of concurrent agents logged in using the first table.
> Can anyone make any suggestions? The time period the report user can
> choose are either 15 mins, 30 mins, 45 mins, 1 hour and 1 day.
> Thanks in advance

CREATE TABLE LoginPeriods
(
agent_id VARCHAR(20) NOT NULL,
time_in DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
CHECK (time_in <= CURRENT_TIMESTAMP),
time_out DATETIME NOT NULL DEFAULT '99991231'
CHECK (time_out <= CURRENT_TIMESTAMP OR time_out = '99991231'),
PRIMARY KEY (time_in, agent_id),
CHECK (time_in < time_out)
)

-- Your sample data
INSERT INTO LoginPeriods (agent_id, time_in, time_out)
VALUES ('A1', '20041122 14:02', '20041122 17:30')
INSERT INTO LoginPeriods (agent_id, time_in, time_out)
VALUES ('A2', '20041122 09:00', '20041122 17:12')
INSERT INTO LoginPeriods (agent_id, time_in, time_out)
VALUES ('A3', '20041122 10:25', '20041122 16:30')
INSERT INTO LoginPeriods (agent_id, time_in, time_out)
VALUES ('A4', '20041122 11:02', '20041122 12:30')
INSERT INTO LoginPeriods (agent_id, time_in, time_out)
VALUES ('A5', '20041122 16:00', '20041122 17:30')

-- Digits 0-9
CREATE VIEW Digits (d)
AS
SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL
SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL
SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL
SELECT 9

-- Nonnegative integers to some suitable upper bound
-- Used in representing the sequence of time periods from
-- begin to end datetimes
CREATE VIEW NonnegativeIntegers (n)
AS
SELECT Ones.d + 10 * Tens.d
FROM Digits AS Ones
CROSS JOIN
Digits AS Tens

-- For each time period between begin and end datetimes,
-- return login periods that overlap
CREATE FUNCTION LoginPeriodsBetween
(@.begin_time DATETIME, @.end_time DATETIME, @.period_mins INT)
RETURNS TABLE
AS
RETURN(
SELECT DATEADD(MINUTE, I.n * @.period_mins, @.begin_time) AS begin_time,
DATEADD(MINUTE, (I.n+1) * @.period_mins, @.begin_time) AS end_time,
agent_id,
CASE WHEN time_in <=
DATEADD(MINUTE, I.n * @.period_mins, @.begin_time)
THEN DATEADD(MINUTE, I.n * @.period_mins, @.begin_time)
ELSE time_in
END AS time_in,
CASE WHEN time_out <=
DATEADD(MINUTE, (I.n+1) * @.period_mins, @.begin_time)
THEN time_out
ELSE DATEADD(MINUTE, (I.n+1) * @.period_mins, @.begin_time)
END AS time_out
FROM NonnegativeIntegers AS I
LEFT OUTER JOIN
LoginPeriods AS LP
ON time_out > DATEADD(MINUTE, I.n * @.period_mins, @.begin_time) AND
time_in < DATEADD(MINUTE, (I.n+1) * @.period_mins, @.begin_time)
WHERE I.n < DATEDIFF(MINUTE, @.begin_time, @.end_time) / @.period_mins
)

-- Maximum number of concurrent agent logins per time period
CREATE FUNCTION MaxConcurrentAgents
(@.begin_time DATETIME, @.end_time DATETIME, @.period_mins INT)
RETURNS TABLE
AS
RETURN(
SELECT begin_time, end_time, MAX(concurrent_agents) AS concurrent_agents_tally
FROM (SELECT LP1.begin_time, LP1.end_time,
LP1.agent_id,
LP1.time_in, LP1.time_out,
COUNT(LP2.agent_id) AS concurrent_agents
FROM LoginPeriodsBetween(@.begin_time, @.end_time, @.period_mins) AS LP1
LEFT OUTER JOIN
LoginPeriodsBetween(@.begin_time, @.end_time, @.period_mins) AS LP2
ON LP1.begin_time = LP2.begin_time AND
LP1.end_time = LP2.end_time AND
LP1.time_in >= LP2.time_in AND
LP1.time_in < LP2.time_out
GROUP BY LP1.begin_time, LP1.end_time, LP1.agent_id,
LP1.time_in, LP1.time_out) AS CA
GROUP BY begin_time, end_time
)

-- Maximum number of concurrent agent logins for each 30 minute
-- period between the specified begin and end datetimes
-- Note that no logins for a time period will be indicated by a 0 tally
SELECT begin_time, end_time, concurrent_agents_tally
FROM MaxConcurrentAgents('20041122 09:00', '20041122 18:00', 30)
ORDER BY begin_time

begin_time end_time concurrent_agents_tally
2004-11-22 09:00:00.000 2004-11-22 09:30:00.000 1
2004-11-22 09:30:00.000 2004-11-22 10:00:00.000 1
2004-11-22 10:00:00.000 2004-11-22 10:30:00.000 2
2004-11-22 10:30:00.000 2004-11-22 11:00:00.000 2
2004-11-22 11:00:00.000 2004-11-22 11:30:00.000 3
2004-11-22 11:30:00.000 2004-11-22 12:00:00.000 3
2004-11-22 12:00:00.000 2004-11-22 12:30:00.000 3
2004-11-22 12:30:00.000 2004-11-22 13:00:00.000 2
2004-11-22 13:00:00.000 2004-11-22 13:30:00.000 2
2004-11-22 13:30:00.000 2004-11-22 14:00:00.000 2
2004-11-22 14:00:00.000 2004-11-22 14:30:00.000 3
2004-11-22 14:30:00.000 2004-11-22 15:00:00.000 3
2004-11-22 15:00:00.000 2004-11-22 15:30:00.000 3
2004-11-22 15:30:00.000 2004-11-22 16:00:00.000 3
2004-11-22 16:00:00.000 2004-11-22 16:30:00.000 4
2004-11-22 16:30:00.000 2004-11-22 17:00:00.000 3
2004-11-22 17:00:00.000 2004-11-22 17:30:00.000 3
2004-11-22 17:30:00.000 2004-11-22 18:00:00.000 0

--
JAG

No comments:

Post a Comment