Q: How to read SAS data files into Splus?

Method 1. 

If you are using SAS version 6.12 in Unix... 


Step 1 
Output SAS data to the desired directory as usual. The file extension using SAS version 6.12 should be .ssd01. Suppose a SAS data file named myfile.ssd01 is located under /home/user/data/

Step 2 
In Splus 3.4 or above, use the function sas.get(). For example, if you want to name the data set as "mydata" in Splus, type the following at the Splus prompt: 

      mylibrary <- "/home/user/data/" 
      mydata <- sas.get(mylibrary,"myfile") 

For more detail, do help(sas.get)

OR 

In Splus 5 or above, use the function importData(). For example, if you want to name the data set as "mydata" in Splus, type the following at the Splus prompt: 

      mydata <- importData("/home/user/data/myfile.ssd01",type="SAS1") 

For more detail, do help(importData)

If you are using SAS version 7 in Unix... 

Step 1 
Output SAS data to the desired directory as usual. The file extension using SAS version 7 should be .sas7bdat. Suppose a SAS data file named myfile.sas7bdat is located under /home/user/data/

Step 2 
In Splus 5 or above, use the function importData(). For example, if you want to name the data set as "mydata" in Splus, type the following at the Splus prompt: 

      mydata <- importData("/home/user/data/myfile.sas7bdat",type="SAS7") 

For more detail, do help(importData)

If you are using SAS version 8 in Unix... see Method 2. 


Method 2 (recommended). Create a transport file from SAS (this works for SAS version 6.12 or above) 

Step 1 
In the SAS code, include the following line when defining SAS libraries: 

      libname sasdata xport "/home/user/data/file.tpt"; 
 

  • "libname" and "xport" are required in the syntax
  • "sasdata" is the user-defined library name
  • "/home/user/data/file.tpt" is the full path name of the transport file (Note the .tpt extension!). "file.tpt" is the file you want to output from SAS and later to be read into Splus


Don't forget to name the file to be outputted in a SAS procedure as "sasdata.file.tpt" in the SAS code!

Step 2 
In Splus 5 or above, use the function importData(). For example, if you want to name the data set as "mydata" in Splus, type the following at the Splus prompt: 

      mydata <- importData("/home/user/data/file.tpt",type="SAS_TPT") 

For more detail, do help(importData)

Note: 
1. Splus 3.4 does not have the importData() function. 
2. The sas.get() function may not read data files created by SAS of version 7 or above. 
3. The importData() function may not read data files from SAS of version 8 even though the file suffix can be the same as version 7 files (.sas7bdat). 
4. For Splus 4+ in Windows, use the import.data() function instead.

Q: Examples with SQL and databases (by Harry Joe)

Examples with SQL and databases (by Harry Joe)

Q: SAS Macro Language



  • Introduction

    The macro facility is a powerful tool for extending and customizing the SAS System. It is particularly useful when you need to repeat common tasks, for example in the input data step. Writing a macro in SAS is like writing a function in Splus. With the macro facility, you can assign a name to a character string or to a group of SAS programming statements. When you want to run those programming statements, you just refer to the name assigned.

    When you use a macro facility name in a SAS program, the macro facility generates SAS statements and commands as needed. The rest of the SAS System receives those statements and uses them in the same way it uses the ones you enter in the standard manner.

    The macro facility has two components:

    1. The macro processor - the portion of the SAS System that does the work. 
    2. The macro language - the syntax that you use to communicate with the macro processor.

    When the SAS System compiles program text, two delimiters trigger macro processor activity:

       &name     refers to a macro variable. The form &name is called a macro variable reference.

       %name     refers to a macro.

  • Macro Definition

    The syntax for defining a macro is as follows:    

    %macro macro-name;
    macro-definition 
    %mend macro-name;
     

    Note that in defining a macro, you need to give it a distinct name. You also need to begin and end the macro with the %macro and %mend statements, respectively. The macro-name specified in the %mend statement must match the macro-name in the %macro statement. Examples of macros follow in the later sections.

  • Defining Macro Variables

    Macro variables are an efficient way of substituting text strings in SAS code. One way of defining a macro variable is to use the %let statement, which assigns a name and a value to the macro variable.

    For example,

    %let subject=Statistics;



    subject is the macro variable name and Statistics is its value. Assigning a text string to a macro variable does not require quotation marks around the string, as opposed to the standard way of assigning text strings to non-macro variables in SAS.

    Later, when you want to create a title for the data set you use for the Statistics 200 course, you would like the text Statistics to appear. To refer to the variable subject, precede the variable name with an ampersand (&) (for macro variable reference):

    title "Data for &subject 200 class";



    The macro processor resolves the reference to the macro variable subject, and the statement becomes

    title "Data for Statistics 200 class";
     
    An important note: the macro processor resolves macro variable references within DOUBLE quotation marks but NOT within single quotation marks.

  • Commenting in Macros

    There are two ways of inserting comments in Macros:

    1. begin with /* and end with */ 
    2. begin with a %* and end with a ;

    An example:


    %macro comment;
      
    /* the first type of commenting */

    %let myvar=abc;

    %* the second type of commenting;
    %let myvar2=xyz;

    %mend comment;

  • Invoking a Macro

    The general rule for invoking a macro is to precede the name of the macro with a percent sign (%). Unlike typing a SAS statement, the statement for macro invocation does NOT end in a semicolon.

    In the previous section, we have defined a macro named comment. To invoke the macro, simply do



        %comment
     

  • Passing Information into a Macro Using Parameters

    Here we will see the similarity between a macro and a user-defined function in other programming languages such as Splus.

    Suppose we are interested in defining a macro which takes in the x and y variables and plots a graph of y versus x. The following macro myplot will do the job:

    %macro myplot(x=,y=);
      proc plot;
        plot &y * &x;
        title "plot of &y versus &x";
      run;
    %mend myplot;
     

    A macro variable defined in parentheses in a %macro statement is a macro parameter. Macro parameters allow you to pass information into a macro. They are like input arguments in routines in Splus, C/C++, etc.

    To invoke the macro myplot in the above example, you must provide values for the two parameters x and y. Say, you want a graph of income versus age, then do:



        %myplot(x=age,y=income)
     

    where the variables age and income already exist in a SAS data set. The macro processor matches the values specified in the macro call to the parameters in the macro definition. Macro execution produces the following code:


      proc plot;
        plot income * age;
        title "plot of income versus age";
      run;
     

  • Generating Repetitive Pieces of Text Using %DO Loops

    To generate repetitive pieces of text, one can make use of an iterative %DO loop. Suppose you want to list a series of data sets or variables whose names share a certain pattern. The following macro can reduce the amount of typing in listing the data sets or variables:



    %macro names(name=,count=);
      %do n=1 %to &count;
        &name&n
      %end;
    %mend names;

    Suppose you want to list 5 data sets named data1, data2, data3, data4 and data5 in your data step. You can do:

      data %names(name=data,count=5);
     

    Macro execution produces the following code:


      data data1 data2 data3 data4 data5;
     

    Note that to concatenate the text string passed to the macro parameter name and the value of the loop counter n, we only need to juxtapose the macro variable references &name and &n.

    However, to concatenate a text string passed to a macro parameter and another text string (one that does not come from a macro parameter), we may need the period character (.). See the section Combining Macro Variable Reference with Text for more detail.

  • Combining Macro Variable Reference with Text

    Let us first look at the following piece of SAS code with the use of macros:



    %let name=sales;

    data AB&name.dat;
      set save.&name;
      if units>100;
    run;
     

    Macro execution produces the following code:


    data ABsalesdat;
      set save.sales;
      if units>100;
    run;
     

    The student who wrote the above code was planning to create a data set named ABsalesdat which was a subset of the data saved in savesales where the units variable in savesales was larger than 100. But the macro execution of his code did not produce his desired output. In particular, save.sales was shown instead of the desired savesales.

    Here is the trick for concatenating text and macro variables:

    1. To prefix the value of a macro variable mystring with a text string "string1", simply juxtapose the text string (without quotation marks) and the macro variable reference &mystring, i.e., do

        string1&mystring

    2. To add a suffix "sufstring" to a text string assigned to a macro variable mystring, you will need the period character (.), i.e., do

        &mystring.sufstring

    You can combine the two rules to concatenate more than 2 text strings involving a macro variable. In the above example, AB&name.dat resolves to a string in which the text "AB" precedes the value of &name, which is followed by the suffix "dat"; the period only marks the end of the macro variable name and does not appear in the result. In save.&name, however, the period comes before the macro variable reference, so it is treated as ordinary text: it stays part of the string "save." and is concatenated with the string "sales" obtained by resolving &name.

  • Referencing Macro Variables Indirectly

    Let us consider the following macro:

    %let city1=New York;
    %let city2=Boston;
    %let city3=Seattle;
    %let city4=Las Vegas;

    %macro listcity(count=);
      %do n=1 %to &count;
        &city&n
      %end;
    %mend listcity;
     

    The above macro listcity was written by a student who wanted to save some typing when listing names of US cities. Unfortunately, when he calls the macro to list the first three cities using

      %listcity(count=3)

    the SAS macro processor gives an error message saying "macro variable city is not resolved." To understand what has gone wrong, we need to know how the SAS macro processor interprets

      &city&n

    The macro processor tries to resolve and concatenate the two macro variable references &city and &n, while the student wants the text string "city" and the value of &n to be concatenated first, before the processor resolves the resulting variables city1, city2 and city3.

    To solve the problem, precede &city&n by another ampersand sign (&), i.e.

      &&city&n

    The double ampersand && is first resolved to a single & (this & is set aside to be resolved in a later pass), and the text string "city" is concatenated with the value of &n before the variables city1, city2 and city3 are resolved.

  • Scope of Macro Variables

    Let us consider the following macro:



    %let datanew=inventory;

    %macro conditn;
      %let dataold=sales;
      %let cond=cases > 0;
    %mend conditn;

    %macro name;
      %let datanew=report;
      %let dataold=warehse;
      %conditn
      data &datanew;
        set &dataold;
        if &cond;
      run;
    %mend name;
     

    The very first %let statement defines the datanew variable globally. In contrast, a %let statement within a macro defines a variable locally unless a variable of that name already exists in an enclosing scope, for example because it was defined in open code (i.e., outside any macro) before the macro ran. In the above, the variable datanew reappears in the macro named name. Since this variable is defined earlier in open code, it remains global throughout the code. The variables dataold and cond are local.

    When we invoke the macro name using

        %name

    the SAS macro processor will give an error message saying "The macro variable cond is not resolved." The problem is related to the scope of the macro variables. When the macro name is invoked, the first %let statement assigns the text string "report" to the global datanew variable (it replaces the originally assigned string "inventory" with "report"). The second %let statement defines a local variable named dataold and assigns it the text string "warehse". Next, the macro conditn is invoked. Within this macro, the variable cond is defined locally, and the variable dataold, which already exists in the calling macro name, is reassigned the text string "sales". However, after the execution of the statement %conditn, the variable cond no longer exists (because it is local to the macro conditn only!), so in the if statement in the later data step, the macro variable cond cannot be resolved. The other two variables datanew and dataold in the data step can be resolved because they still exist within the scope of the macro name.

    Indeed, the SAS macro processor interprets the above code upon the invocation of the macro name as follows:

    data report; 
        set sales; 
        if &cond; <-- this gives an error! 
    run;

  • Forcing a Macro Variable to be Local

    Sometimes it is useful to force a variable to be local within a macro when a (global) variable of the same name has been defined earlier in open code, e.g. when you do not want to alter the value of the global variable. Of course, by avoiding variable names that are already used by global variables, we avoid accidentally altering the values of global variables.

    The following shows an example when a local definition is necessary:

    %let n=North State Industries;

    %macro namelst(name,number); 
        %do n=1 %to &number; 
            &name&n 
        %end; 
    %mend namelst;

    proc print; 
        var %namelst(dept,5); 
        title "Quarterly Report for &n"; 
    run;

    The macro namelst makes use of a %DO loop (as discussed earlier) for generating repetitive pieces of text (here, the repetitive pieces are name1, name2 and so on). The variable n is global, as defined in the first %let statement. Within the macro namelst, the variable n serves as a counter. However, when the macro namelst is invoked later in the proc print step, the value of n changes during the execution of the %DO loop. In particular, it is the value of the global variable n that is changed. So in the title statement, which references the variable n, the SAS macro processor will print the title

        "Quarterly Report for 6".

    The value 6 is the result of running the %DO loop five times after the invocation of the macro namelst: the counter n is incremented once more before the loop ends.

    One can imagine the desired title should be "Quarterly Report for North State Industries", which can be obtained by forcing the variable n to be local within the macro namelst. The %local statement is used to keep a variable local; just add the line

        %local n;

    before the %DO loop in the macro namelst. The value of the global variable n (the text string "North State Industries") will not be affected by the change of values of the locally defined variable n within the macro.

  • Creating Global Macro Variables

    The %local statement allows one to create local macro variables. Similarly, the %global statement creates a global macro variable if a variable with the same name does not already exist.

    Referring to the example we gave earlier when we discussed the scope of macro variables, the macro variable cond is not resolved during the execution of the data step within the macro name. This is because the variable cond is defined locally within the macro conditn and it does not exist after the execution of this macro. To avoid problems in referencing the macro variable cond in the data step, we can use the %global statement in defining the macro conditn, as follows:



    %macro conditn;
      %global cond;
      %let dataold=sales;
      %let cond=cases > 0;
    %mend conditn;

    Invoking the macro name now generates the following statements:


    data report;
      set sales;
      if cases > 0;
    run;
     

    Note: You CANNOT use the %global statement to make an existing local variable global!

    If you want to put the data step outside the macro name, then all the macro variables have to be global for the macro processor to resolve the references. You cannot add the macro variable dataold to the %global statement within the macro conditn, since the %let statement in the macro name has already created dataold as a variable local to name by the time conditn begins to execute.

Reference:

SAS Macro Language: Reference, First Edition. SAS Institute Inc., Cary, NC, USA, 1997.

Q: Where can I use SAS?

Server host SAS: unixlab.stat.ubc.ca
OS: Unix Solaris
Hardware: SUN 280R, 2x 1.4Mhz CPUs, 4Gb of Memory
Location: Undergraduate Network, Room LSK 121
Access mode: Via LSK 121 lab or SSH for remote login.

Q: How to read a Microsoft Excel format (.xls) data file in R?

There seems to be no direct way to read an .xls format file (see http://maths.newcastle.edu.au/~rking/R/help/00b/2519.html).

However, there are some ways to read an .xls file indirectly. For example, you can save the .xls file in .csv (comma separated value) format and then use R's function read.csv to read it.

The reason for saving the .xls file as a .csv file is that .xls files often contain columns of strings with embedded white space. If you save the .xls file in a white space delimited or TAB delimited format instead, it is still difficult to read the file into R.
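
To illustrate the point about embedded white space, the following shell sketch counts the fields in one record written both ways. The record itself (a city name and a count) and the use of awk are assumptions made only for illustration; they are not part of the original how-to.

```shell
# A hypothetical record with two columns: a city name containing a space, and a count.
csv_line='New York,10'
space_line='New York 10'

# Split on commas: the two columns are recovered as two fields.
csv_fields=$(printf '%s\n' "$csv_line" | awk -F',' '{ print NF }')

# Split on white space: the city name is broken apart, giving three fields.
space_fields=$(printf '%s\n' "$space_line" | awk '{ print NF }')

echo "comma-separated fields: $csv_fields"
echo "space-separated fields: $space_fields"
```

With the comma as delimiter the record keeps its two columns, whereas splitting on white space turns the same record into three pieces.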

Q: How to call Fortran subroutines in R?

In R, we can call Fortran subroutines. For example, suppose we have the following toy Fortran subroutine in the file test.f:

CCCCCCCCCCCCCCCCC
C The subroutine is to calculate the Hadamard product of two matrices.
C out[i][j]=x[i][j]*y[i][j].
C Both R and Fortran store matrix by column.
CCCCCCCCCCCCCCCCC
CCCCCCCCC Fortran program (f77) has to be between 7-th and 72-th column.
CCCCCCCCC The 6-th column is for continuation marker.
      subroutine myHadamaProduct(x, y, nrow, ncol, mo)
      integer i, j, nrow, ncol
CCCCCCC In Fortran, you don't need to specify the second dimension for matrix
      double precision x(nrow, *), y(nrow, *), mo(nrow, *)
      do i = 1, nrow
        do j = 1, ncol
          mo(i,j)=x(i,j)*y(i,j)
        enddo
      enddo
      return
      end

  1. First, we need to compile the file test.f to create a shared library, test.so say, by using the GNU Fortran compiler:

    g77 -fpic -shared -fno-gnu-linker -o test.so test.f

  2. Next, we need to use the R function dyn.load to load the shared library test.so.

    if(!is.loaded("myhadamaproduct")){
      dyn.load("./test.so")
    }

    The R function is.loaded checks whether the Fortran subroutine myHadamaProduct has already been loaded into R. If it has, we do not need to load it again.

  3. Next, we use the R function .Fortran to call the Fortran subroutine myHadamaProduct. For example,

    x<-matrix(1:10,nrow=5, ncol=2) # get a 5x2 matrix
    y<-matrix(1:10,nrow=5, ncol=2) # get a 5x2 matrix
    out<-matrix(0, nrow=5, ncol=2) # initialize output matrix

    # to format matrix or array, use function storage.mode()
    storage.mode(x)<-"double"
    storage.mode(y)<-"double"
    storage.mode(out)<-"double"
    nr<-as.integer(nrow(x))
    nc<-as.integer(ncol(x))

    # Fortran is *NOT* case-sensitive, so the compiler changes all characters
    # to lower case. Thus, to call Fortran subroutines with .Fortran, you
    # have to type the name in lower case. Otherwise, R will give an error message.
    res<-.Fortran("myhadamaproduct", x, y, nr, nc, out=out)
    cat("Hadamard product >>\n")
    print(res$out)

  4. If you do not need to use the shared library test.so any more, you can use the R function dyn.unload to unload it.

    if(is.loaded("myhadamaproduct")){
      dyn.unload("./test.so")
    }

Note:

  • The Fortran code called by R must be a subroutine, not a function. In the example above, myHadamaProduct is defined as a subroutine:

    subroutine myHadamaProduct(x, y, nrow, ncol, mo)

    The arguments of Fortran subroutines are passed by address instead of by value. And unlike the C language, there is no "pointer" concept in Fortran 77.

  • When you use ".Fortran" to call Fortran subroutines, the name of the Fortran subroutine must be in lower case.
  • Any values returned by Fortran subroutines which are called in R must be initialized and must have the format:

    # if the variable is defined as double in the Fortran subroutine
    variablename=as.double(initialized values) # inside .Fortran
    # if the variable is defined as integer in the Fortran subroutine
    variablename=as.integer(initialized values) # inside .Fortran
    # if the output is double precision matrix or array
    storage.mode(variablename)<-"double" # before .Fortran
    variablename=variablename # inside .Fortran

    The input values must also be initialized and must have the above format. However, they can be formatted before the ".Fortran" function.

  • If the output is not written in the variablename=variablename format (e.g. out=out in the above example), you can still get the results. However, you have to use res[[5]] to refer to out in the above example. In fact, the .Fortran function returns a list containing all arguments of the Fortran subroutine myHadamaProduct. Since out is the 5-th argument, you can use res[[5]] to refer to the 5-th element of the list.
  • It is okay that the file test.f contains the main program.
  • Sometimes, the command dyn.load("test.so") gives an error message. This is probably because the environment variable "$PATH" was not set correctly. You can either add the following line to the file ".bashrc" in your home directory:

    export PATH=$PATH:.:

    or use the command

    dyn.load("./test.so")

Q: How to install a command-line editor in Splus?

R has a command-line editor which allows us to retrieve and edit commands we entered before. We can also install a command-line editor in Splus.

STEP 1

If you use Bourne, Korn, Bash or Z-Shell, type the following lines into your .bashrc file which is in your home directory:

export EDITOR="/usr/local/bin/vim"
export S_CLEDITOR="/usr/local/bin/vim"
export VISUAL="/usr/local/bin/vim"

If you use C-Shell or TC-Shell, then type the following lines into your .cshrc file which is in your home directory:

setenv EDITOR "/usr/local/bin/vim"
setenv S_CLEDITOR "/usr/local/bin/vim"
setenv VISUAL "/usr/local/bin/vim"

If you want to use emacs instead of vi, simply replace vim with emacs in the lines above.

STEP 2

In your home directory, type the command

source .bashrc

or

source .cshrc
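
To confirm that the settings took effect, one can print the variables in the current shell. The sketch below sets them by hand, using the same /usr/local/bin/vim path as in STEP 1, only so that it is self-contained; after sourcing the startup file, the echo lines alone would be enough. It uses Bourne-style syntax, so it does not apply as written to C-Shell or TC-Shell.

```shell
# Same settings as in STEP 1 (Bourne-style shells).
export EDITOR="/usr/local/bin/vim"
export S_CLEDITOR="/usr/local/bin/vim"
export VISUAL="/usr/local/bin/vim"

# Print each variable; an empty value means the setting was not picked up.
echo "EDITOR=$EDITOR"
echo "S_CLEDITOR=$S_CLEDITOR"
echo "VISUAL=$VISUAL"
```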

STEP 3

Type

Splus -e

to enable the command-line editor when you start an Splus session.

The most useful editing commands are summarized in the following table:

COMMAND              emacs     vi
backward character   Ctrl-B    Esc, h
forward character    Ctrl-F    Esc, l
previous line        Ctrl-P    Esc, k
next line            Ctrl-N    Esc, j
beginning of line    Ctrl-A    Esc, ^ (Shift-6)
end of line          Ctrl-E    Esc, $ (Shift-4)
forward word         Esc, f    Esc, w
backward word        Esc, b    Esc, b
kill char            Ctrl-D    Esc, x
kill line            Ctrl-K    Esc, Shift-d
delete word          Esc, d    Esc, dw
search backward      Ctrl-R    Esc, ?
yank                 Ctrl-Y    Esc, Shift-y
transpose chars      Ctrl-T    Esc, xp

You can type the Splus command

?Command.edit

to get the above table.

Q: How to run an R/Splus program in the background so that it continues running after you log out?

The syntax:

    nohup R --no-save < input.R > output &
    nohup Splus < input.s > output &

where input.R contains your R code and input.s contains your Splus code.
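
The same pattern can be tried without R or Splus by using a stand-in command. In this sketch the file names job.sh and job.log and the scratch directory are hypothetical, and sh plays the role of R or Splus reading its input file.

```shell
# Use a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)

# Stand-in for input.R / input.s: a tiny job that writes one line.
printf 'echo "job finished"\n' > "$workdir/job.sh"

# nohup keeps the job running after logout; the trailing & puts it in the background.
nohup sh < "$workdir/job.sh" > "$workdir/job.log" &

# $! holds the process ID of the background job; wait blocks until it ends.
# (In real use you would simply log out instead of waiting.)
wait $!

cat "$workdir/job.log"
```

As in the commands above, the output of the job ends up in the file given after the > sign, which can be inspected at the next login.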

Q: How to call C functions or Fortran subroutines in Splus?

There are different versions of Splus installed in the department computing system. Splus is installed in Hajek, Newton, Emily, and Statlab. Splus 5 and Splus 6 are installed on all servers.

  • For Splus 3.4, the function to load shared libraries is dyn.load.shared instead of dyn.load. The function dyn.load is used to load object files such as test.o, obtained by using the command "g77 -c test.f". There is also no dyn.unload function in Splus 3.4.
  • For Splus 5 and Splus 6, the functions dyn.load and dyn.load.shared are obsolete. Splus 5 and Splus 6 use the function dyn.open to load shared libraries and the function dyn.close to unload them.
