Q: Date and time functions in R

The following functions were part of library(survival) in older releases; in current R they are provided by the separate package date (library(date)):

  • as.date

    Converts any of the following character forms to a Julian date: 8/31/56, 8-31-1956, 31 8 56, 083156, 31Aug56, or August 31 1956.

    Example:

    > as.date(c("1jan1960", "2jan1960", "31mar1960", "30jul1960"))
    [1] 1Jan60 2Jan60 31Mar60 30Jul60
  • mdy.date

    Given a month, day, and year, returns the number of days since January 1, 1960.

    Example:

    > mdy.date(3, 10, 53)
    [1] 10Mar53
  • date.mdy

    Convert a vector of Julian dates to a list of vectors with the corresponding values of month, day and year, and optionally weekday.

    Example:

    > a1<-mdy.date(month = 8, day = 7, year = 1960)
    > a1
    [1] 7Aug60
    > date.mdy(a1)
    $month
    [1] 8

    $day
    [1] 7

    $year
    [1] 1960
  • date.mmddyy

    Given a vector of Julian dates, this returns them in the form ``10/11/89'', ``28/7/54'', etc.

    Example:

    > date.mmddyy(mdy.date(3, 10, 53))
    [1] "3/10/53"
  • date.ddmmmyy

    Given a vector of Julian dates, this returns them in the form ``10Nov89'', ``28Jul54'', etc.

    Example:

    > date.ddmmmyy(mdy.date(3, 10, 53))
    [1] "10Mar53"
  • date.mmddyyyy

    Given a vector of Julian dates, this returns them in the form ``10/11/1989'', ``28/7/1854'', etc.

    Example:

    > date.mmddyyyy(mdy.date(3, 10, 53))
    [1] "3/10/1953"
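
For comparison, the same kinds of conversion can be sketched with base R's Date class, which needs no extra package. This is an alternative to the functions listed above, not how they are implemented, and all variable names are illustrative only.

```r
# Base R sketch: parse a date, count days since 1 January 1960,
# then split and format it (variable names are illustrative).
d <- as.Date("3/10/1953", format = "%m/%d/%Y")   # month/day/year input
origin <- as.Date("1960-01-01")
days_since_1960 <- as.numeric(d - origin)        # negative: the date is before 1960
parts <- list(month = as.integer(format(d, "%m")),
              day   = as.integer(format(d, "%d")),
              year  = as.integer(format(d, "%Y")))
mmddyyyy <- format(d, "%m/%d/%Y")                # zero-padded, unlike date.mmddyyyy
print(days_since_1960)
print(parts)
print(mmddyyyy)
```

Note that a two-digit year passed to as.Date with "%y" is interpreted by a fixed century rule, so four-digit years are the safer input.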
Q: How to call C functions in R?

In R, we can call C functions. For example, we have the following toy C function in the file test.c.

    /*****
     The function calculates the Hadamard (element-wise) product of two
     matrices: out[i][j] = x[i][j] * y[i][j].
     The inputs x, y and the output out are vectors, not matrices. So in R,
     you need to transform the input matrices into vectors and transform
     the output vector back into a matrix.
    *****/
    void myHadamaProduct(double *x, double *y, int *nrow, int *ncol, double *out)
    {
      int i, j, r, c;

      r = *nrow;
      c = *ncol;
      for(i = 0; i < r; i ++)
      {
        for(j = 0; j < c; j ++)
        {
          out[i*c+j] = x[i*c+j] * y[i*c+j];
        }
      }
      return;
    }

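  1. First, we need to compile the file test.c to create a shared library, test.so say, by using the GNU C compiler:

    gcc -fpic -shared -fno-gnu-linker -o test.so test.c

    Recent versions of gcc may not accept the option -fno-gnu-linker. In that case drop it, or let R choose the compiler flags with

    R CMD SHLIB test.c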
  2. Next, we need to use the R function dyn.load to load the shared library test.so.

    if(!is.loaded("myHadamaProduct")){
      dyn.load("./test.so")
    }

    The R function is.loaded checks whether the C function myHadamaProduct has already been loaded into R. If it has, we do not need to load it again.
  3. Next, we use the R function .C to call the C function myHadamaProduct. For example,

    x<-matrix(1:10,nrow=5, ncol=2) # get a 5x2 matrix
    y<-matrix(1:10,nrow=5, ncol=2) # get a 5x2 matrix
    # In R, a matrix is stored by column in the memory.
    # However, in C, a matrix is stored by row in the memory.
    # So we need to transpose the matrix x, namely t(x), before
    # transforming it to a vector.
    xx<-as.double(as.vector(t(x)))
    yy<-as.double(as.vector(t(y)))
    nr<-as.integer(nrow(x))
    nc<-as.integer(ncol(x))
    n<-nr*nc # total number of elements in the output
    res<-.C("myHadamaProduct", xx, yy, nr, nc, out=as.double(rep(0.0, n)))
    # In C, matrix is stored by row. So when transforming back, we need to
    # specify byrow=T.
    mat<-matrix(res$out, ncol=nc, byrow=T)
    cat("Hadamard product >>\n")
    print(mat)
  4. If you do not need to use the shared library test.so any more, you can use the R function dyn.unload to unload it.

    if(is.loaded("myHadamaProduct")){
      dyn.unload("./test.so")
    }
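
Before compiling anything, the conversion between R's column-major matrices and the row-major vectors used in step 3 can be checked in R alone. The sketch below imitates the C loop with an element-wise multiplication of the two vectors and compares the result with x*y; it does not call the C function, and the name expected is illustrative only.

```r
# Pure R check of the row-major round trip used with .C in step 3.
x <- matrix(1:10, nrow = 5, ncol = 2)
y <- matrix(1:10, nrow = 5, ncol = 2)
xx <- as.double(as.vector(t(x)))   # row-major copy of x
yy <- as.double(as.vector(t(y)))   # row-major copy of y
out <- xx * yy                     # what the C loop computes, element by element
mat <- matrix(out, ncol = ncol(x), byrow = TRUE)
expected <- x * y                  # R's own element-wise product
print(all.equal(mat, expected))
```

Because this particular product is element-wise, it would also come out right without the transposition, as long as the inputs and the output use the same layout. The transposition matters for C code whose result depends on the position of each element.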

Note:

  • The C function called by R must be of void type. For the example above, the function myHadamaProduct has to have the form

    void myHadamaProduct(double *x, double *y, int *nrow, int *ncol, double *out)

    rather than

    double *myHadamaProduct(double *x, double *y, int *nrow, int *ncol)

    You have to let the function return values through its arguments, e.g. "double *out" in the above example. Because the arguments are pointers, any values that the function writes through them (e.g. through out) are available to R after the call: .C returns them in its result list.

  • All arguments in the C function have to be passed by addresses instead of values. That is, all arguments have to be pointers. For the example above, you cannot change "int *nrow" to "int nrow".

    If the function changes the values that the pointers refer to, the changed values are what R gets back after the call. So be careful when using pointers as function arguments.

  • Any values returned by C functions which are called in R must be initialized and must have the format:

    # if the variable is defined as double in the C function
    variablename=as.double(initialized values)
    # if the variable is defined as integer in the C function
    variablename=as.integer(initialized values)

    in the ".C" function (e.g. "out=as.double(rep(0.0, n))" in the above example).

    The input values must also be initialized and must have the above format. However, they can be formatted before the ".C" function (e.g. "nr<-as.integer(nrow(x))" in the above example).

  • If the output is not written in the variablename=value format (e.g. out=as.double(rep(0.0, n)) in the above example), you can still get the results. However, you have to use res[[5]] to refer to out in the above example. In fact, the .C function returns a list containing all arguments of the C function myHadamaProduct. Since out is the 5th argument, you can use res[[5]] to refer to the 5th element of the list.
  • It is okay that the file test.c contains the main function.
  • Sometimes, the command dyn.load("test.so") gives an error message. This is probably because the environment variable "$PATH" was not set correctly. You can either add the following line to the file ".bashrc" in your home directory:

    export PATH=$PATH:.:

    or use the command

    dyn.load("./test.so")
Q: How to print the R graphics directly in R?

To print the R graphics directly in R, use the command dev.print.

The default for `dev.print' is to produce and print a postscript copy, if `options("printcmd")' is set suitably.

`dev.print' is most useful for producing a postscript print (its default) when the following applies. Unless `file' is specified, the plot will be printed. Unless `width', `height' and `pointsize' are specified the plot dimensions will be taken from the current device, shrunk if necessary to fit on the paper. (`pointsize' is rescaled if the plot is shrunk.) If `horizontal' is not specified and the plot can be printed at full size by switching its value this is done instead of shrinking the plot region.

If `dev.print' is used with a specified device (even `postscript') it sets the width and height in the same way as `dev.copy2eps'.

For `dev.copy2eps', `width' and `height' are taken from the current device unless otherwise specified. If just one of `width' and `height' is specified, the other is adjusted to preserve the aspect ratio of the device being copied. The default file name is `Rplot.eps'.

Example:

plot(hist(rnorm(100)))            # plot histogram
options("printcmd"="lpr -Poptra") # set default printer
dev.print()                       # print the histogram to printer optra
Q: How to add straight lines through the current plot in R?

You can use the command abline in R. For example, suppose you want to add to the current plot a horizontal line at y=1, a vertical line at x=3, and a line with intercept 2 and slope 0.5. You can use the following commands:

abline(h=1)
abline(v=3)
abline(a=2, b=0.5)
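
As a fuller sketch, abline needs an existing plot, and it also accepts a fitted model in place of a and b. The example below draws on a null PDF device so that no file or window is needed; the data are made up to lie exactly on the line with intercept 2 and slope 0.5.

```r
pdf(NULL)                          # null device, so no file is written
x <- 0:5
y <- 2 + 0.5 * x                   # made-up data lying exactly on a line
plot(x, y, xlim = c(0, 5), ylim = c(0, 5))
abline(h = 1, lty = 2)             # dashed horizontal line at y = 1
abline(v = 3, col = "red")         # red vertical line at x = 3
fit <- lm(y ~ x)                   # least-squares fit of y on x
abline(fit, col = "blue")          # line drawn from the coefficients of fit
invisible(dev.off())
```

In an interactive session you would leave out pdf(NULL) and dev.off() and draw on the screen device instead.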
Q: How do I install a package in R/Splus in UNIX?

Below are some instructions for installing an R package into your own disk space (so that you don't need to ask the busy systems manager).

To install a package (in Unix) which is not part of the main distribution of R, follow the steps below.

  • go to http://www.r-project.org or a local mirror site http://cran.stat.sfu.ca and get the source as a gzipped tar file, an example is the package sspir_0.2.3.tar.gz for State Space Models in R
  • extract from the tar file with (say under your ~/tmp directory)

    mkdir ~/tmp # if directory doesn't already exist
    mv ~/sspir_0.2.3.tar.gz ~/tmp
    cd ~/tmp
    tar xzvf sspir_0.2.3.tar.gz

  • compile the source (typically C and/or Fortran routines) and install into your local directory (say ~/Rlib) with the following Unix command line (where you replace $HOME with your home directory, which is the output of 'echo $HOME').

    R CMD INSTALL --library=$HOME/Rlib sspir

    To use the package, you need the lib.loc option for library()

    > library(sspir,lib.loc="$HOME/Rlib")
    > library(help=sspir,lib.loc="$HOME/Rlib")

  • With newer versions of R, the above can be combined into one step

    R CMD INSTALL --library=$HOME/Rlib sspir_0.2.3.tar.gz

  • If you decide later you don't want the package, then from the Unix command line

    R CMD REMOVE --library=$HOME/Rlib sspir

  • If you want to let others use your locally added packages, just set permissions appropriately to $HOME/Rlib, e.g.

    chmod -R og+rX $HOME/Rlib
  • Note that there are different flavors of Unix on our network. A package compiled on one computer (e.g. 32-bit Linux) should work on another computer with the same architecture. If you work on different servers on our network, you will have to compile a separate version for each architecture. One possibility is to keep separate subdirectories such as Rlib_linux32, Rlib_linux64 and Rlib_solaris64 under your $HOME.
  • Alternatively, after testing the package, you can ask the systems manager to install it (be clear whether you want the Solaris or Linux version) by providing the following instructions (where you replace $MYRPKGTARDIR with the directory where the tar file is unpacked, and replace $PKG with the name of the package):

    cd $MYRPKGTARDIR
    R CMD INSTALL $PKG
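
Instead of passing lib.loc to every library() call, a private library directory can be put at the front of R's search path for the session with .libPaths. The following is a minimal sketch; the directory under tempdir() is only a hypothetical stand-in for $HOME/Rlib, and old_paths is an illustrative name.

```r
lib <- file.path(tempdir(), "Rlib")        # hypothetical stand-in for $HOME/Rlib
dir.create(lib, showWarnings = FALSE)      # .libPaths ignores missing directories
old_paths <- .libPaths()
.libPaths(c(lib, old_paths))               # search the private library first
print(.libPaths()[1])
```

After this, library(sspir) would look in the private directory first without a lib.loc argument, assuming the package has been installed there.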

As a final note, if you want to install a package in Windows, the simplest thing to do is:

  • go to http://www.r-project.org/ and get the zip file with the Windows compiled version
  • unzip the file under $RHOME/library, where $RHOME is the folder where you installed R
  • if you don't have write access to $RHOME, just unzip the package anywhere and use library() with lib.loc argument to load the package later
  • Alternatively, use the installer in the R console menu.
Q: R Miscellaneous Add-Ons
Q: How to use R to read data stored by Minitab, S, SAS, SPSS, Stata, ...?

The foreign package provides several functions to read data stored by Minitab, S, SAS, SPSS, Stata, ...:

data.restore    Read an S3 Binary File
lookup.xport    Lookup information on a SAS XPORT format library
read.dta        Read Stata binary files
read.epiinfo    Read Epi Info data files
read.mtp        Read a Minitab Portable Worksheet
read.S          Read an S3 Binary File
read.spss       Read an SPSS data file
read.ssd        Obtain a data frame from a SAS permanent dataset, via read.xport
read.xport      Read a SAS XPORT format library
SModeNames      Read an S3 Binary File
write.dta       Write files in Stata binary format
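
As a small illustration of two of these functions, the sketch below writes a data frame to a temporary Stata file with write.dta and reads it back with read.dta. It assumes the foreign package is installed, and the data frame and file name are made up for the example.

```r
library(foreign)

# Made-up data frame written to a temporary Stata file and read back.
df <- data.frame(id = 1:3, x = c(26.6, 28.1, 32.0))
f <- tempfile(fileext = ".dta")
write.dta(df, f)        # write in Stata binary format
back <- read.dta(f)     # read the file back into a data frame
print(back)
unlink(f)               # remove the temporary file
```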
Q: How to read data files containing time data by R?

For example, the data file test.dat is:

4188418000628; 1 ; 05-19-2002 ; 06-23-2002 ; 26.6; 3.71; 3.03; 0
4188418000628; 1 ; 05-19-2002 ; 07-15-2002 ; 28.1; 3.41; 2.79; 0
4188418000628; 1 ; 05-19-2002 ; 08-15-2002 ; 32.0; 3.43; 3.30; 0

The fields are separated by ";" and the third and fourth columns contain dates.

  1. Read time data as characters.

    y<-read.table("test.dat", sep=";", as.is=T, strip.white=T)

    and we can get

    > y$V3
    [1] "05-19-2002" "05-19-2002" "05-19-2002"

    The argument strip.white is used only when sep has been specified, and allows the stripping of leading and trailing white space from `character' fields (`numeric' fields are always stripped). If strip.white=F, then we will get

    > y1<-read.table("test.dat", sep=";", as.is=T)
    > y1$V3
    [1] " 05-19-2002 " " 05-19-2002 " " 05-19-2002 "

    In versions of R before 4.0.0, the default behavior of read.table is to convert character variables (which are not converted to logical, numeric or complex) to factors. When as.is=T, read.table will not convert character variables to factors.

  2. Convert characters to a Julian date by using the function as.date in library(survival).

    > library(survival) # in current R, as.date is in library(date)
    > a<-as.date(y$V3)
    > b<-as.date(y$V4)
    > a
    [1] 19May2002 19May2002 19May2002
    > b
    [1] 23Jun2002 15Jul2002 15Aug2002
    > a[1]-b[1]
    [1] -35
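
The same two steps can also be sketched with base R only, using as.Date with an explicit format in place of as.date. This is an alternative to the approach above; the data are the three lines of test.dat, given inline through the text argument so that the example runs without the file.

```r
# The three lines of test.dat, supplied inline instead of read from a file.
txt <- "4188418000628; 1 ; 05-19-2002 ; 06-23-2002 ; 26.6; 3.71; 3.03; 0
4188418000628; 1 ; 05-19-2002 ; 07-15-2002 ; 28.1; 3.41; 2.79; 0
4188418000628; 1 ; 05-19-2002 ; 08-15-2002 ; 32.0; 3.43; 3.30; 0"

y <- read.table(text = txt, sep = ";", as.is = TRUE, strip.white = TRUE)
a <- as.Date(y$V3, format = "%m-%d-%Y")   # month-day-year with 4-digit year
b <- as.Date(y$V4, format = "%m-%d-%Y")
diff_days <- as.numeric(a - b)            # differences in days
print(diff_days)
```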
Q: How to format output of R?

You can use the R command format. For example,

round(0.10000, 3)

produces

0.1

instead of

0.100

because trailing zeros are not printed. To get the desired format, we can use the following command:

format(round(0.10000,3), nsmall=3, digits=3)

Note that the return value of the above command is a character string "0.100", not a numerical value.
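
Two other base R functions, sprintf and formatC, give the same fixed number of decimals and are sketched below next to format for comparison. All three return character strings; the variable names are illustrative.

```r
v <- 0.10000
s1 <- format(round(v, 3), nsmall = 3)      # at least 3 digits after the decimal point
s2 <- sprintf("%.3f", v)                   # C-style fixed-point formatting
s3 <- formatC(v, format = "f", digits = 3) # same idea through formatC
print(c(s1, s2, s3))
```

Use as.numeric on the result if a number is needed again; the trailing zeros are then lost when the number is printed.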
