R: Sorting a data.frame by a column and conditionally deleting rows
I have a data.frame in R. The sample data looks like this:
dat <- data.frame(
  name           = c("name1", "name1", "name1", "name1", "name2", "name2", "name2", "name2"),
  survey_year    = c(1947, 1958, 1978, 1987, 1963, 1991, 2004, 1993),
  reference_year = c(1934, 1947, 1974, 1947, 1944, 1987, 1993, 1987),
  value          = c(10, 15, 13, 20, -2, 7, 12, -19)
)

dat
   name survey_year reference_year value
1 name1        1947           1934    10
2 name1        1958           1947    15
3 name1        1978           1974    13
4 name1        1987           1947    20
5 name2        1963           1944    -2
6 name2        1991           1987     7
7 name2        2004           1993    12
8 name2        1993           1987   -19
How can I first sort by reference_year within each name (from lowest to highest):
   name survey_year reference_year value
1 name1        1947           1934    10
2 name1        1958           1947    15
3 name1        1987           1947    20
4 name1        1978           1974    13
5 name2        1963           1944    -2
6 name2        1991           1987     7
7 name2        1993           1987   -19
8 name2        2004           1993    12
and then, if the year in reference_year is the same, delete from dat the row that covers the longer period (from reference_year to survey_year), and write the deleted rows to a new data.frame?
For the sample data, the data.frame should look like this in the end:
   name survey_year reference_year value
1 name1        1947           1934    10
2 name1        1958           1947    15
3 name1        1978           1974    13
4 name2        1963           1944    -2
5 name2        1991           1987     7
6 name2        2004           1993    12
BondedDust left an elegant answer. My answer is far longer than his, but let me leave it here.
library(dplyr)

dat %>%
  arrange(reference_year) %>%
  mutate(gap = survey_year - reference_year) %>%  # length of the period each row covers
  arrange(reference_year, gap) %>%                # shortest period first within each reference_year
  group_by(name, reference_year) %>%
  filter(gap == gap[1]) %>%                       # keep only the shortest period per group
  arrange(name, reference_year)

#    name survey_year reference_year value gap
# 1 name1        1947           1934    10  13
# 2 name1        1958           1947    15  11
# 3 name1        1978           1974    13   4
# 4 name2        1963           1944    -2  19
# 5 name2        1991           1987     7   4
# 6 name2        2004           1993    12  11
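The pipeline above keeps the shorter-period rows, but it does not store the deleted rows, which the question also asks for. Here is a minimal sketch of one way to get both, written in base R rather than dplyr so that it runs without extra packages. The names sorted, is_dup, kept and deleted are only illustrative and do not come from the question. Note one assumption: if two rows in the same group cover exactly the same period, this sketch keeps only the first of them, whereas the filter(gap == gap[1]) step above would keep both.

```r
dat <- data.frame(
  name           = c("name1", "name1", "name1", "name1", "name2", "name2", "name2", "name2"),
  survey_year    = c(1947, 1958, 1978, 1987, 1963, 1991, 2004, 1993),
  reference_year = c(1934, 1947, 1974, 1947, 1944, 1987, 1993, 1987),
  value          = c(10, 15, 13, 20, -2, 7, 12, -19)
)

# Length of the period each row covers (reference_year to survey_year).
dat$gap <- dat$survey_year - dat$reference_year

# Sort by name, then reference_year, then gap, so that the shortest
# period comes first within each (name, reference_year) pair.
sorted <- dat[order(dat$name, dat$reference_year, dat$gap), ]

# After sorting, every repeated (name, reference_year) pair is a row
# covering a longer period than the first one in its group.
is_dup <- duplicated(sorted[c("name", "reference_year")])

kept    <- sorted[!is_dup, ]  # rows that stay
deleted <- sorted[is_dup, ]   # rows that were removed

# For the sample data, deleted holds the original rows 4 and 8
# (survey_year 1987 and 1993).
```

If the helper column is not wanted in the result, it can be dropped afterwards with kept$gap <- NULL and deleted$gap <- NULL.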