sql server - SQL join to correlated subquery where tables are related by overlapping ranges -


i have following table structure:

item

id | name -------- 1  | apple 2  | pear  3  | banana 4  | plum 5  | tomato 

event

itemstart | itemend | eventtype | eventdate --------------------------------------------      1    |    2    |  planted  | 2014-01-01      1    |    3    |  picked   | 2014-01-02      3    |    5    |  eaten    | 2014-01-05 

the 2 tables linked primary key of item , range of itemstart , itemend (inclusive) in event. events refer contiguous sequences of items, not events given item have same range. events never occur on same date given item.

the query i'd produce follows:

list items, , each item show recent event

sample output:

id | name   |   event | date ---------------------------- 1  | apple  |  picked | 2014-01-02 (planted picked) 2  | pear   |  picked | 2014-01-02 (planted picked) 3  | banana |  eaten  | 2014-01-05 (picked eaten) 4  | plum   |  eaten  | 2014-01-05 (eaten) 5  | tomato |  eaten  | 2014-01-05 (eaten) 

this seems reasonable enough on face of it, , if there traditional foreign-key relationships in place (imagine itemid instead of itemstart , itemend) i'd join correlated subquery this:

select name, eventtype, eventdate item     inner join (         select itemid, eventtype, eventdate         event e         eventdate = (select max(eventdate) event e_max e_max.itemid = e.itemid)     ) latest_events on i.id = latest_events.itemid 

however, range relationship in place i'm stuck, want more doesn't work:

select name, eventtype, eventdate item     inner join (         select itemstart, itemend, eventtype, eventdate         event e         eventdate = (select max(eventdate) event e_max i.id >= e_max.itemstart , i.id <= e_max.itemend)     ) latest_events on i.id >= latest_events.itemstart , i.id <= latest_events.itemend 

i error i.id >= e_max.itemstart , i.id <= e_max.itemend on line 6, because can't reference i within part of join. wanted (which isn't necessary in simpler example) because when i'm constructing subquery no longer have single id link - overlapping ranges mean there many possible ways of including single item, , want refer directly item, id available in top-level item table.

i hope makes sense.

i'm using sql server 2008 r2. report run overnight, speed not important be, there many items (100s of million); while there multiple events against each item, use of large ranges means there much fewer event records.

things i've thought about:

  • somehow expanding out item/event relationship events recorded against individual items. result in significant increase in amount of data under consideration, allow simpler query approach.
  • somehow process events constrain or consolidate ranges - if knew given item events had same start , end perhaps simplify things. not thought through.

how can produce query? in advance!

you can utilizing cte , row_number().

sql fiddle demo

;with cte  (     select *,         row_number() on (partition i.id order e.eventdate desc) rnum     item     join event e         on i.id between e.itemstart , e.itemend )  select id,   name,    eventtype,   eventdate cte rnum = 1 

basically cte has joined item , event , added new column rownumber , partitioned on item.id. here's screenshot of looks like. here select rnum = 1 should max event date each item.id.

enter image description here


Comments

Popular posts from this blog

javascript - how to protect a flash video from refresh? -

visual studio 2010 - Connect to informix database windows form application -

android - Associate same looper with different threads -