Count unique values for each column per ID in R


I have a dataset that is quite big (140,000 observations × 125 attributes). Each observation is associated with an ID (which may or may not be unique). I want to count the unique values of each attribute (column) per ID.

I tried aggregate(. ~ id, mydata, function(x) length(unique(x))). It doesn't work. Given the size of the data frame, I feel that even if it works it may take a long time. Does anyone know a better way to do it?

The dataset:

id  attr1  attr2  attr3  attr125
1   a      x      y      123
1   b      z      y      345
1   b      x      y      134
2   a      z      y      abc
2   c      y      y      def
3   d      y      n      xyz
4   b      z      y      789

The result I want:

id  attr1  attr2  attr3  attr125
1   2      2      1      3
2   2      2      1      2
3   1      1      1      1
4   1      1      1      1

I hesitated about posting this because it is similar to @mgriebe's answer, but it is a different way to use data.table. I find data.table useful for these operations (however, the aggregate call worked fine for me):

# Load the data.table package
library(data.table)

# First copy your data.frame to a data.table
dt <- data.table(mydata)

# Count the unique values in each column for every id,
# using data.table's .SD (Subset of Data) operator
dt[, lapply(.SD, function(x) length(unique(x))), by = id, .SDcols = 2:5]

#    id attr1 attr2 attr3 attr125
# 1:  1     2     2     1       3
# 2:  2     2     2     1       2
# 3:  3     1     1     1       1
# 4:  4     1     1     1       1
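For comparison, here is a minimal base R sketch of the same count on the sample data from the question. The data frame is rebuilt by hand, so the `"a"` entries and the character column types are assumptions about the original data. It uses the `by =` form of `aggregate` rather than the formula form from the question, so that character columns are passed to the function unchanged.

```r
# Sample data from the question, rebuilt by hand (assumed to be character columns)
mydata <- data.frame(
  id      = c(1, 1, 1, 2, 2, 3, 4),
  attr1   = c("a", "b", "b", "a", "c", "d", "b"),
  attr2   = c("x", "z", "x", "z", "y", "y", "z"),
  attr3   = c("y", "y", "y", "y", "y", "n", "y"),
  attr125 = c("123", "345", "134", "abc", "def", "xyz", "789"),
  stringsAsFactors = FALSE
)

# Group every column except id by id and count the distinct values in each group
res_base <- aggregate(
  mydata[-1],
  by  = list(id = mydata$id),
  FUN = function(x) length(unique(x))
)

print(res_base)
```

On 140,000 rows this should be noticeably slower than the data.table call, which is the main reason to prefer the latter here.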

Remember to adjust .SDcols to the column numbers where your attributes are stored.
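With 125 attribute columns, counting positions by hand is easy to get wrong. One possible variation, sketched below, selects the columns by name instead and uses data.table's `uniqueN()` in place of `length(unique(x))`. The sample data is rebuilt by hand (the `"a"` entries are an assumption), `attr_cols` is just an illustrative variable name, and `uniqueN()` needs a reasonably recent data.table release.

```r
library(data.table)

# Sample data from the question, rebuilt by hand
dt <- data.table(
  id      = c(1, 1, 1, 2, 2, 3, 4),
  attr1   = c("a", "b", "b", "a", "c", "d", "b"),
  attr2   = c("x", "z", "x", "z", "y", "y", "z"),
  attr3   = c("y", "y", "y", "y", "y", "n", "y"),
  attr125 = c("123", "345", "134", "abc", "def", "xyz", "789")
)

# Every column except the grouping column, selected by name rather than position
attr_cols <- setdiff(names(dt), "id")

# uniqueN(x) counts the distinct values of x within each group
res <- dt[, lapply(.SD, uniqueN), by = id, .SDcols = attr_cols]

print(res)
```

If every non-grouping column should be counted, `.SDcols` can also be left out altogether, since `.SD` then holds all columns except those named in `by`.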

