 # R | Data Selection and Manipulation

The functions below give a bit of background on data selection and data manipulation in R.

• which.max(x) returns the index of the greatest element of x
• which.min(x) returns the index of the smallest element of x
• rev(x) reverses the elements of x
• sort(x) sorts the elements of x in increasing order; to sort in decreasing order: rev(sort(x)) or sort(x, decreasing = TRUE)
• cut(x,breaks) divides x into intervals (factors); breaks is the number of cut intervals or a vector of cut points
• match(x, y) returns a vector of the same length as x giving, for each element of x, the position of its first match in y (NA if there is no match)
• which(x == a) returns the indices of x for which the comparison is TRUE, in this example the values of i for which x[i] == a (the argument of this function must be of mode logical)
• choose(n, k) computes the number of combinations of k items chosen from n = n!/[(n−k)!k!]
• na.omit(x) removes the observations with missing data (NA) (removes the corresponding row if x is a matrix or a data frame)
• na.fail(x) returns an error message if x contains at least one NA
• unique(x) if x is a vector or a data frame, returns a similar object with the duplicate elements removed
• table(x) returns a table with the counts of the different values of x (typically for integers or factors)
• subset(x, …) returns a selection of x with respect to criteria (…, typically comparisons: x$V1 < 10); if x is a data frame, the option select gives the variables to be kept, or dropped using a minus sign
• sample(x, size) randomly draws size elements from the vector x without replacement; the option replace = TRUE draws with replacement
• prop.table(x, margin=) expresses the entries of table x as fractions of the marginal totals given by margin (of the grand total if margin is omitted)
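As a rough illustration of how several of the functions above behave, here is a minimal sketch. The vector x, the data frame df, and all other variable names are made up for this example and are not part of any dataset discussed here.

```r
# Hypothetical example data; the values are chosen only for illustration.
x <- c(12, 7, 3, 7, NA, 25)

i_max <- which.max(x)               # index of the largest value (NA is skipped)
i_min <- which.min(x)               # index of the smallest value
desc  <- sort(x, decreasing = TRUE) # sort() drops NA by default
pos   <- match(c(7, 99), x)         # position of the first match in x, NA if absent
hits  <- which(x == 7)              # indices where the comparison is TRUE

# Drop the missing value; as.vector() strips the attributes na.omit() adds.
clean  <- as.vector(na.omit(x))
counts <- table(clean)              # frequency of each distinct value
shares <- prop.table(counts)        # the same counts as proportions
groups <- cut(clean, breaks = c(0, 5, 10, 30))  # factor of intervals

# subset() on a data frame: keep rows with V1 < 10 and drop column V2.
df    <- data.frame(V1 = c(4, 15, 8), V2 = c("a", "b", "c"))
small <- subset(df, V1 < 10, select = -V2)

set.seed(1)                                 # make the random draws repeatable
draw <- sample(clean, 3)                    # 3 values without replacement
boot <- sample(clean, 10, replace = TRUE)   # 10 values with replacement

n_comb <- choose(5, 2)                      # 5! / (3! * 2!)
```

Note that match() and which() return positions rather than values, so x[hits] is needed to get the matching elements back.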

## Functions for Manipulating Character Variables

• nchar(x) returns a vector of the lengths of each value in x
• paste(a, b, sep="_") concatenates character values, using sep between them
• substr(x, start, stop) extracts the characters from positions start to stop from x
• strsplit(x, split) splits each value of x into a vector of strings using split as the delimiter, and returns the results as a list
• grep(pattern, x) returns the indices of the elements of x that contain pattern (with value = TRUE, the elements themselves)
• grepl(pattern, x) returns a logical vector indicating whether each element of x contains pattern
• regexpr(pattern, x) returns the integer position of the first occurrence of pattern in each element of x (-1 if there is none)
• gsub(pattern, replacement, x) replaces each occurrence of pattern with replacement
• tolower(x) converts x to all lower case
• toupper(x) converts x to all upper case
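The character functions above can be tried on a small vector. This is only a sketch: the vector s and the result names are invented for illustration.

```r
# Hypothetical character vector used only for illustration.
s <- c("alpha_one", "beta_two", "gamma")

lens    <- nchar(s)                    # number of characters in each string
ids     <- paste("id", 1:3, sep = "_") # "id_1" "id_2" "id_3"
heads   <- substr(s, 1, 3)             # first three characters of each string
parts   <- strsplit(s, "_")            # list with one character vector per string
idx     <- grep("_", s)                # indices of elements containing "_"
vals    <- grep("_", s, value = TRUE)  # the matching elements themselves
has_us  <- grepl("_", s)               # one logical value per element
first_a <- regexpr("a", s)             # position of the first "a" in each string
swapped <- gsub("a", "A", s)           # replace every "a" with "A"
upper   <- toupper(s)                  # all upper case
lower   <- tolower("ABC")              # all lower case
```

By default pattern is treated as a regular expression; passing fixed = TRUE to grep(), grepl(), regexpr(), gsub(), or strsplit() matches it literally instead.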

## Logical Operators

• == is equal to
• != is not equal to
• > greater than
• >= greater than or equal to
• < less than
• <= less than or equal to
• %in% is in the list
• ! not (reverses TRUE and FALSE)
• & and
• | or

Damian Mingle