Sunday, February 19, 2012

Get Max(ondate) but not as aggregate

I know there must be a better way....
What I have done in the past is to make each column a subselect
,ca.ondate = (select max(ondate) from ....
,cs.ondate = (select max(ondate from...
This appears to be a poor way and performance would suffer.
So, onto a new way, but it's not working for me...
Say I have a contacts, calendar and history tables, I want all records c.amp
=
'MIC' and if they exist the latest pending & completed appointment or sale f
or
each.
I want a left outer join to cal & hist.
I want the most recent calendar (pending) appointment if it exists where the
code = 'R' from the calendar table.
I want the most recent calendar (pending) sale if it exists where the code =
'R'
from the calendar table.
I want the most recent history appointment where the code = 'R'.
I want the most recent history sale where the code = 'R'.
What happens is say if there are no pending appt's, then I get no records
because of the exclusive where.
Here's the trash that I have so far... (remember I want all records but onl
y a
date value if it exists.
select c.contact
ca.ondate
cs.ondate
ha.ondate
hs.ondate
from contacts c
left outer join calendar ca on ca.id = c.id and ca.code = 'R'
left outer join calendar cs on cs.id = c.id and cs.code = 'R'
left outer join history ha on ha.id = c.id and ha.code = 'R'
left outer join history hs on hs.id = c.id and hs.code = 'R'
where ca.ondate in(select max(ondate) from calendar where id = ca.id and cod
e =
'R')
and cs.ondate in(select max(ondate) from calendar where id = cs.id and code
= 'R')
and ha.ondate in(select max(ondate) from history where id = ha.id and code =
'R')
and hs.ondate in(select max(ondate) from history where id = hs.id and code =
'R')
where c.amp = 'MIC'
TIA
JeffP...Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, data types, etc. in
your schema are. Sample data is also a good idea, along with clear
specifications. It is very hard to debug code when you do not let us
see it.
The nature of time is a continuum, Remember Zeno, Einstein and all
those guys? You need to model the start and endding times of a state
or an event, something like this:
CREATE TABLE Events
(event_id INTEGER NOT NULL,
start_time DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (event_id, start_time),
end_time DATETIME, -- null means still active
CHECK (start_time < end_time),
..);
Now use predicates like this:
my_date BETWEEN start_time
AND COALESCE(end_time, CURRENT_TIMESTAMP),
and a VIEW that finds the rows where (end_time IS NULL).
You will find that having a Calendar table will be useful. What you are
calling a Calendar seems to have some vague id instead of a calendar
date for a key and some kidn of equally vague code!|||In previous positions I worked with PeopleSoft quite a bit, and nearly every
table in the database had an effective date (EFFDT) column in it, and you
always had to take the record with the last EFFDT <= @.SomeDate.
Often this would mean 5 or more subselects, one for every table, usually
with @.SomeDate equal to the EFFDT from one of the other tables in the query.
The performance was surprisingly fast, as long as the table had the proper
index on (PrimaryKey, DateField).
With the proper indexes, I think the aproach you use below (with "=", not
"in") should be fine.
Of course, if you have already checked your indexes and you are still having
performance issues, then ignore this.
"JDP@.Work" <JPGMTNoSpam@.sbcglobal.net> wrote in message
news:uDgtq02JGHA.1192@.TK2MSFTNGP11.phx.gbl...
> I know there must be a better way....
> What I have done in the past is to make each column a subselect
> ,ca.ondate = (select max(ondate) from ....
> ,cs.ondate = (select max(ondate from...
> This appears to be a poor way and performance would suffer.
> So, onto a new way, but it's not working for me...
> Say I have a contacts, calendar and history tables, I want all records
c.amp =
> 'MIC' and if they exist the latest pending & completed appointment or sale
for
> each.
> I want a left outer join to cal & hist.
> I want the most recent calendar (pending) appointment if it exists where
the
> code = 'R' from the calendar table.
> I want the most recent calendar (pending) sale if it exists where the code
= 'R'
> from the calendar table.
> I want the most recent history appointment where the code = 'R'.
> I want the most recent history sale where the code = 'R'.
> What happens is say if there are no pending appt's, then I get no records
> because of the exclusive where.
> Here's the trash that I have so far... (remember I want all records but
only a
> date value if it exists.
> select c.contact
> ca.ondate
> cs.ondate
> ha.ondate
> hs.ondate
> from contacts c
> left outer join calendar ca on ca.id = c.id and ca.code = 'R'
> left outer join calendar cs on cs.id = c.id and cs.code = 'R'
> left outer join history ha on ha.id = c.id and ha.code = 'R'
> left outer join history hs on hs.id = c.id and hs.code = 'R'
> where ca.ondate in(select max(ondate) from calendar where id = ca.id and
code =
> 'R')
> and cs.ondate in(select max(ondate) from calendar where id = cs.id and
code
> = 'R')
> and ha.ondate in(select max(ondate) from history where id = ha.id and
code =
> 'R')
> and hs.ondate in(select max(ondate) from history where id = hs.id and
code =
> 'R')
> where c.amp = 'MIC'
> TIA
> JeffP...
>|||you where clause effectively coverts your outer joins into inner
jouins.
Here is the correct way:
select 1 id
into #contacts
union all
select 2 id
union all
select 3 id
go
select 1 contact_id, '20060101' contact_date
into #dates1
union all
select 1, '20060103'
go
select 2 contact_id, '20060101' contact_date
into #dates2
union all
select 2, '20060103'
go
select c.id, cd1, cd2
from #contacts c
left outer join
(select contact_id, max(contact_date) cd1
from #dates1 group by contact_id) d1
on c.id = d1.contact_id
left outer join
(select contact_id, max(contact_date) cd2
from #dates2 group by contact_id) d2
on c.id = d2.contact_id
id cd1 cd2
-- -- --
1 20060103 NULL
2 NULL 20060103
3 NULL NULL
(3 row(s) affected)
go
drop table #contacts
drop table #dates1
drop table #dates2|||Thanks to all, I have very little control over the indexes, and rather than
pure
performance I was looking for this later answer using union by Alexander.
Also thanks to Jim as I have done similar on smaller bits of data, however t
his
qry will eventually be derived from a number of regional databases.
TIA
JeffP....
"Alexander Kuznetsov" <AK_TIREDOFSPAM@.hotmail.COM> wrote in message
news:1138830588.972004.263810@.g49g2000cwa.googlegroups.com...
> you where clause effectively coverts your outer joins into inner
> jouins.
> Here is the correct way:
> select 1 id
> into #contacts
> union all
> select 2 id
> union all
> select 3 id
> go
> select 1 contact_id, '20060101' contact_date
> into #dates1
> union all
> select 1, '20060103'
> go
> select 2 contact_id, '20060101' contact_date
> into #dates2
> union all
> select 2, '20060103'
> go
> select c.id, cd1, cd2
> from #contacts c
> left outer join
> (select contact_id, max(contact_date) cd1
> from #dates1 group by contact_id) d1
> on c.id = d1.contact_id
> left outer join
> (select contact_id, max(contact_date) cd2
> from #dates2 group by contact_id) d2
> on c.id = d2.contact_id
> id cd1 cd2
> -- -- --
> 1 20060103 NULL
> 2 NULL 20060103
> 3 NULL NULL
> (3 row(s) affected)
> go
> drop table #contacts
> drop table #dates1
> drop table #dates2
>|||Celko,
Where is there a more in-depth explanation of this? I really don't
understand the benefit to this aproach, but then again I don't really
understand the aproach itself. The only time I have seen this implemented
it was very difficult to retrieve records within a date range. On the other
hand I find the single date column to note the start of an event to be much
easier to work with.
Dates shown are in the format (month/day/year)
For example :
EmpID Event Effective Date Jobcode payrate
department
0034 Hire 1/15/2000 123
25.00 H23
0034 Promotion 3/20/2002 S15 36.00
H23
0034 Transfer 6/30/2005 S15 36.00
NTO
0034 Termination 8/30/2005 S15 36.00
NTO
keys would be EmpID and Effective Date (yes, they are out of order, its just
an example)
The event "Hire" takes place on 1/15/2000, and is not a date range, but a
single day. The employee is active from this day until the day before they
are terminated (8/29/2005). The termination date would mark the first day
that the employee is no longer active with the company.
There is a promotion event that occurs on 3/20/2002. Again the promotion
occurs that day, not over a period of time. Granted, the employee's
jobcode, department, etc, will be valid over a period of time, but the event
which causes the data to change is instantaneous.
I assume your aproach would require an entirely different table structure in
order to work, but I don't know how that would work.
"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:1138828873.001227.108770@.z14g2000cwz.googlegroups.com...
> Please post DDL, so that people do not have to guess what the keys,
> constraints, Declarative Referential Integrity, data types, etc. in
> your schema are. Sample data is also a good idea, along with clear
> specifications. It is very hard to debug code when you do not let us
> see it.
> The nature of time is a continuum, Remember Zeno, Einstein and all
> those guys? You need to model the start and endding times of a state
> or an event, something like this:
> CREATE TABLE Events
> (event_id INTEGER NOT NULL,
> start_time DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
> PRIMARY KEY (event_id, start_time),
> end_time DATETIME, -- null means still active
> CHECK (start_time < end_time),
> ..);
> Now use predicates like this:
> my_date BETWEEN start_time
> AND COALESCE(end_time, CURRENT_TIMESTAMP),
> and a VIEW that finds the rows where (end_time IS NULL).
> You will find that having a Calendar table will be useful. What you are
> calling a Calendar seems to have some vague id instead of a calendar
> date for a key and some kidn of equally vague code!
>

No comments:

Post a Comment