sql server - SQL join to correlated subquery where tables are related by overlapping ranges -
i have following table structure:
item
id | name -------- 1  | apple 2  | pear  3  | banana 4  | plum 5  | tomato event
itemstart | itemend | eventtype | eventdate --------------------------------------------      1    |    2    |  planted  | 2014-01-01      1    |    3    |  picked   | 2014-01-02      3    |    5    |  eaten    | 2014-01-05 the 2 tables linked primary key of item , range of itemstart , itemend (inclusive) in event. events refer contiguous sequences of items, not events given item have same range. events never occur on same date given item.
the query i'd produce follows:
list items, , each item show recent event
sample output:
id | name   |   event | date ---------------------------- 1  | apple  |  picked | 2014-01-02 (planted picked) 2  | pear   |  picked | 2014-01-02 (planted picked) 3  | banana |  eaten  | 2014-01-05 (picked eaten) 4  | plum   |  eaten  | 2014-01-05 (eaten) 5  | tomato |  eaten  | 2014-01-05 (eaten) this seems reasonable enough on face of it, , if there traditional foreign-key relationships in place (imagine itemid instead of itemstart , itemend) i'd join correlated subquery this:
select name, eventtype, eventdate item     inner join (         select itemid, eventtype, eventdate         event e         eventdate = (select max(eventdate) event e_max e_max.itemid = e.itemid)     ) latest_events on i.id = latest_events.itemid however, range relationship in place i'm stuck, want more doesn't work:
select name, eventtype, eventdate item     inner join (         select itemstart, itemend, eventtype, eventdate         event e         eventdate = (select max(eventdate) event e_max i.id >= e_max.itemstart , i.id <= e_max.itemend)     ) latest_events on i.id >= latest_events.itemstart , i.id <= latest_events.itemend i error i.id >= e_max.itemstart , i.id <= e_max.itemend on line 6, because can't reference i within part of join.  wanted (which isn't necessary in simpler example) because when i'm constructing subquery no longer have single id link - overlapping ranges mean there many possible ways of including single item, , want refer directly item, id available in top-level item table.
i hope makes sense.
i'm using sql server 2008 r2. report run overnight, speed not important be, there many items (100s of million); while there multiple events against each item, use of large ranges means there much fewer event records.
things i've thought about:
- somehow expanding out item/event relationship events recorded against individual items. result in significant increase in amount of data under consideration, allow simpler query approach.
- somehow process events constrain or consolidate ranges - if knew given item events had same start , end perhaps simplify things. not thought through.
how can produce query? in advance!
you can utilizing cte , row_number().
;with cte  (     select *,         row_number() on (partition i.id order e.eventdate desc) rnum     item     join event e         on i.id between e.itemstart , e.itemend )  select id,   name,    eventtype,   eventdate cte rnum = 1 basically cte has joined item , event , added new column rownumber , partitioned on item.id. here's screenshot of looks like. here select rnum = 1 should max event date each item.id.

Comments
Post a Comment