Q: SAS Interactive Matrix Language (IML)

  • Introduction

    The SAS IML language is a programming language used mainly to manipulate numeric and character matrices and vectors. Its grammar is quite similar to that of Splus/R.

    The SAS IML language supplements the SAS procedures. If you can solve your problem with SAS procedures, then you do not need IML.

    The SAS IML language can read, create, and write SAS data sets. It can also read and write external files and produce graphics.

    SAS IML statements must appear inside the IML procedure, that is, after the statement proc iml; and before the statement quit;. Thus the statements cannot contain other SAS procedures (e.g. proc print).

  • SAS/IML software

    The SAS/IML software provides a dynamic, interactive environment for programming in the SAS/IML language.

    To run the SAS/IML software,

    1. Log in to statlab.
    2. Type the command sas&

      to run SAS.

    3. In the window "SAS: PROGRAM EDITOR", type the command: proc iml;
    4. Press the function key F3 to submit the above SAS command.

      In the window "SAS: LOG", you will see "IML Ready".

      Now you can program in SAS/IML. The programming process is interactive, much as in Splus/R.

      • In Splus/R, you type your statement after the prompt ">". Once you finish typing the statement, you hit the "Enter" key to submit the statement.
      • In SAS, you type your statement in the window "SAS: PROGRAM EDITOR". Once you finish typing the statement, you hit the "F3" key to submit the statement.
    5. If you want to print the final result of each statement you submit, type reset print; in the window "SAS: PROGRAM EDITOR" before your statement. If you do not want this feature later, type reset noprint; to turn it off. The default is noprint.
    6. If you want to print both the intermediate and the final results of each statement you submit, type reset printall; in the window "SAS: PROGRAM EDITOR" before your statement. If you do not want this feature later, type reset noprintall; to turn it off. The default is noprintall.
    7. In the window "SAS: PROGRAM EDITOR", type quit; and press the F3 key to quit the SAS/IML interactive programming process.

    As with Splus/R programs, you can also first write the SAS/IML source code in a file, then load it into the window "SAS: PROGRAM EDITOR" (select the File menu and the Open submenu to load source code files), and press F3 to submit it.

    The SAS/IML source code should begin with

    proc iml; and end with quit;
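
    As a minimal sketch of such a source file (the matrix names x and y and their values are made up for illustration), the following program starts IML, turns on automatic printing, creates two matrices, and quits:

    ```sas
    proc iml;
       reset print;        /* print the result of each assignment */
       x = {1 2, 3 4};     /* hypothetical 2x2 matrix */
       y = t(x) * x;       /* the result is printed automatically */
       reset noprint;      /* back to the default */
    quit;
    ```

    Submitting the file with F3 should show X and Y in the output window without any explicit print statement.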

  • Some tips for using SAS
    • SAS is NOT case-sensitive. So the name AB is the same as Ab, aB, or ab.
    • In SAS Version 6, names cannot exceed 8 characters (later versions allow longer names). Names must begin with a letter or an underscore. The remaining characters can be letters, underscores, and digits.
    • A number can be expressed in scientific notation such as 1e-6.
    • Inside a matrix literal (curly brackets), a character string that contains no blanks or special characters does not need quotes. Note that unquoted strings are converted to uppercase.

      If a character string contains blanks or special characters, it must be enclosed in either single quotes (') or double quotes (").

      If a string enclosed in single quotes itself contains a single quote (e.g. Can't), then you have to double it (e.g. 'Can''t').

    • A missing value is denoted by a period ".".
    • Every SAS statement ends with a semicolon ";".
    • Comments in SAS programs are enclosed by the pair "/*" and "*/".

      For example,

      /* To print out final results, turn on PRINT */
      reset print;
    • Short-cut keys
      KEY DESCRIPTION
      F1 Invoke the online help window.
      F2 Invoke the window to show the short-cut keys defined in SAS.
      F3 Submit SAS statements edited in the "SAS: PROGRAM EDITOR".
      F4 Recall SAS statements from the buffer to the "SAS: PROGRAM EDITOR".

  • SAS/IML Basics

    The data objects of SAS/IML are matrices.

    • Defining a matrix

      An n by p matrix is an n by p table. That is, an n by p matrix has n rows and p columns.

      • The elements of a matrix should have the same type, i.e. either numeric or character, but not mixed.
      • Like Splus/R, the element (the ij-th element say) of a matrix (A say) is expressed by A[i, j]

        The i-th row is expressed by

        A[i,] and the j-th column is expressed by A[,j]

        The expression of the submatrix consisting of the 1, 3 rows and the 2, 4 columns of the matrix A is

        A[{1 3}, {2 4}] or A[{1, 3}, {2, 4}]

      • If A is a p by 1 column vector or a 1 by p row vector, then the i-th element is expressed as A[i]
      • You cannot use statements like the following to show the values of submatrices/subvectors of a matrix/vector:

        print A[1, 2];
        print A[1,];
        print A[,1];
        print A[{1 3}, {2 4}];

        You have to first assign the submatrices/subvectors to variables, then print the variables:

        b1=A[1,2];
        b2=A[1,];
        b3=A[,1];
        b4=A[{1 3}, {2 4}];
        print b1, b2, b3, b4, A;

        Alternatively, the PRINT statement accepts an expression enclosed in parentheses, e.g. print (A[1,2]);
    • Matrix Operations (creation, combining, etc.)
      • To create a scalar (1 by 1 matrix), you don't need to use curly brackets "{}". For example,

        a=348;          /* numeric value */
        b="Tom";        /* character value */
        c="Yes or No";  /* character value */
        d=.;            /* missing value */
      • To create a row vector, use curly brackets to enclose its elements. Elements are separated by blanks. For example,

        a={1.239 43.23 29};
        b={"Tom" "Mark" "SAM"};
        print a, b;

        The results look like:

        A
        1.239 43.23 29

        B
        Tom Mark SAM
      • To create a column vector, use curly brackets to enclose its elements. Elements are separated by commas. For example,

        a={1.239, 43.23, 29};
        b={"Tom", "Mark", "SAM"};
        print a, b;

        The results look like:

        A
        1.239
        43.23
        29

        B
        Tom
        Mark
        SAM
      • To create a matrix, type the elements by rows and separate the rows with commas. For example,

        a={1 2, 3 4, 5 6};
        print a;

        The result looks like:

        A
        1 2
        3 4
        5 6
      • To create an identity matrix, use the SAS/IML built-in function I. The syntax is: I(dimension)

        For example,

        I5=I(5);
        print I5;
      • To create a matrix whose elements are all equal, use the SAS/IML built-in function J. The syntax is: J(nrow<,ncol<,value>>) where ncol and value are optional. By default, the statement J(nrow) produces an nrow by nrow matrix whose elements are all 1.

        For example,

        a=J(5);         /* a 5x5 matrix whose elements are all 1. */
        b=J(5, 3, 21);  /* a 5x3 matrix whose elements are all 21. */
        print a, b;
      • To create a block diagonal matrix, use the SAS/IML built-in function BLOCK. The syntax is: BLOCK(matrix1<,matrix2<,...,matrix15>>)

        For example,

        a=J(1,3);       /* a 1x3 matrix whose elements are all equal to 1. */
        b=J(2,1,0.5);   /* a 2x1 matrix whose elements are all equal to 0.5. */
        c=block(a,b);
        print a, b, c;

        The results look like:

        A
        1 1 1

        B
        0.5
        0.5

        C
        1 1 1 0
        0 0 0 0.5
        0 0 0 0.5
      • Like in Splus/R, we can produce index vectors in SAS/IML by using the index operator ":". The result is a row vector. For example,

        a=1:6;        /* produce {1 2 3 4 5 6} */
        b=90:85;      /* produce {90 89 88 87 86 85} */
        c='a1':'a5';  /* produce {'a1' 'a2' 'a3' 'a4' 'a5'} */
      • SAS/IML has a function (the DO function) like the function "seq" in Splus/R. The syntax for the function DO is DO(from, to, by);

        For example,

        a=DO(2, 3, 0.5); /* produce the row vector {2 2.5 3} */
      • The SAS/IML function "SHAPE" is similar to the Splus/R function "matrix", except that SHAPE fills the new matrix row by row. The syntax is SHAPE(matrix<,nrow<,ncol<,pad-value>>>);

        For example,

        a=SHAPE(1:10, 5, 2); /* produce a 5x2 matrix */
        print a;

        The result is

        A  5 rows  2 cols  (numeric)
        1  2
        3  4
        5  6
        7  8
        9 10

        If the argument matrix does not provide enough elements to create an nrow by ncol matrix, the function SHAPE fills in the remaining elements with the pad-value. For example,

        a=SHAPE(1:3, 5, 2, 0); /* the pad-value is 0 */
        print a;

        The result of the above statements is

        A
        1 2
        3 0
        0 0
        0 0
        0 0

        If no pad-value is provided, then the function SHAPE cycles back and repeats the values to fill in the matrix. For example,

        a=SHAPE(1:3, 5, 2); /* No pad-value is provided */
        print a;

        The result of the above statements is

        A
        1 2
        3 1
        2 3
        1 2
        3 1

      • The SAS/IML operators "||" and "//" concatenate matrices horizontally and vertically respectively (they are similar to the Splus/R functions cbind and rbind respectively). For example,

        a=(1:3)||(4:6);
        print a;
        b=(1:3)//(4:6);
        print b;

        The results are

        A  1 row  6 cols  (numeric)
        1 2 3 4 5 6

        B  2 rows  3 cols  (numeric)
        1 2 3
        4 5 6
      • To change the value of an element of a matrix, just assign a new value to the element. For example, a[3,9]=0.5;
      • To create a diagonal matrix, you can use the SAS/IML function DIAG. The syntax is DIAG(square matrix/vector); For example,

        b=diag({1 2 3});     /* 3x3 diagonal matrix with diagonal elements 1, 2, 3 */
        b=diag({1 2, 3 4});  /* keeps the diagonal elements 1 and 4; off-diagonal elements become 0 */
        b=diag(3);           /* is equivalent to b=3. */
    • Matrix Algebra (+, -, *, transpose, eigenvalues, etc.)

      • In SAS/IML, the basic matrix algebra such as "+, -, *" is simple. For example,

        STATEMENT DESCRIPTION
        c=(A+B); matrix addition
        c=(A-B); matrix subtraction
        c=(A*B); matrix multiplication
        c=(A#B); elementwise multiplication cij=aij*bij.
        c=(A**2); matrix power. A**2 = A*A. Thus, A should be a square matrix.
        c=(A##2); elementwise power. cij=aij*aij.
        c=(A/B); elementwise division. cij=aij/bij.
        c=(A<>B); elementwise maximum. cij=max(aij, bij).
        c=(A><B); elementwise minimum. cij=min(aij, bij).
        c=(A>=B); elementwise greater than or equal to. cij=1 if aij>= bij. cij=0 if aij<bij;
        c=(A<=B); elementwise less than or equal to. cij=1 if aij<= bij. cij=0 if aij>bij;
        c=(A>B); elementwise greater than. cij=1 if aij> bij. cij=0 if aij<=bij;
        c=(A<B); elementwise less than. cij=1 if aij< bij. cij=0 if aij>=bij;
        c=(A^=B); elementwise not equal to. cij=1 if aij is not equal to bij. cij=0 if aij = bij;
        c=(A=B); elementwise equal to. cij=1 if aij=bij. cij=0 if aij is not equal to bij;
        c=(A#(A>0)); cij=aij if aij>0. cij=0 if aij<=0.

      • The following table lists some commonly used matrix algebra.

        STATEMENT DESCRIPTION
        t(A); or A`; Return the transpose of the matrix A. Note the quote is a backquote.
        det(A); Return the determinant of the matrix A. Note the matrix A must be a square matrix.
        trace(A); Return the trace of the matrix A. Note the matrix A must be a square matrix.
        inv(A); Return the inverse of the matrix A. Note the matrix A must be a nonsingular square matrix.
        eigval(A); Return the eigenvalues of the symmetric matrix A.
        eigvec(A); Return a matrix whose columns are the orthonormal eigenvectors of the symmetric matrix A.
        call eigen(vals, vecs, A); Calculate the eigenvalues and the corresponding orthonormal eigenvectors of the symmetric matrix A. vals stores the eigenvalues and the columns of the matrix vecs store the eigenvectors.
        root(A); Perform the Cholesky decomposition of a symmetric and positive definite matrix A, where A=t(B)*B and B is an upper triangular matrix. The statement "root(A);" returns the matrix B.
        solve(A, b); Solve the system of linear equations A*X=b for X, where A is an n by n nonsingular square matrix and b is an n by p matrix.
        nrow(A); Return the number of rows of the matrix A.
        ncol(A); Return the number of columns of the matrix A.

      • SAS/IML provides matrix subscript reduction operators to simplify matrix operation. The following table shows their usage:

        STATEMENT DESCRIPTION
        A[+, ]; Get a row vector {sum(A[,1]) sum(A[,2]) ... sum(A[,p])} whose elements are the sums of each column of the matrix A.
        A[ ,+]; Get a column vector {sum(A[1,]), sum(A[2,]), ..., sum(A[n,])} whose elements are the sums of each row of the matrix A.
        A[#, ]; Get a row vector whose elements are the products of each column of the matrix A.
        A[ ,#]; Get a column vector whose elements are the products of each row of the matrix A.
        A[<>, ]; Get a row vector {max(A[,1]) max(A[,2]) ... max(A[,p])} whose elements are the maxima of each column of the matrix A.
        A[ ,<>]; Get a column vector {max(A[1,]), max(A[2,]), ..., max(A[n,])} whose elements are the maxima of each row of the matrix A.
        A[><, ]; Get a row vector {min(A[,1]) min(A[,2]) ... min(A[,p])} whose elements are the minima of each column of the matrix A.
        A[ ,><]; Get a column vector {min(A[1,]), min(A[2,]), ..., min(A[n,])} whose elements are the minima of each row of the matrix A.
        A[<:>, ]; Get a row vector whose elements are indices of the maximum of each column of the matrix A.
        A[ ,<:>]; Get a column vector whose elements are indices of the maximum of each row of the matrix A.
        A[>:<, ]; Get a row vector whose elements are indices of the minimum of each column of the matrix A.
        A[ ,>:<]; Get a column vector whose elements are indices of the minimum of each row of the matrix A.
        A[:, ]; Get a row vector whose elements are the mean of each column of the matrix A.
        A[ ,:]; Get a column vector whose elements are the mean of each row of the matrix A.
        A[##, ]; Get a row vector whose elements are the sum of squares of each column of the matrix A.
        A[ ,##]; Get a column vector whose elements are the sum of squares of each row of the matrix A.

        You can use these operators in the row position and the column position at the same time. In that case the row reduction is done first. You can also apply reduction operators repeatedly. The following table gives some examples.

        STATEMENT DESCRIPTION
        A[+,<>] Get the maximum of the column sums of the matrix A.
        A[{2 6},<>] Get the row maxima of the submatrix A[{2 6},], i.e. of rows 2 and 6 of A.
        A[ ,<>][+, ] Get the sum of the row maxima of the matrix A.

        These reduction operators can also be used with a single subscript. For example,

        A[+]; /* A[+] is the sum of all elements of A. It is equivalent to "A[+,];" if A is a column vector and to "A[,+];" if A is a row vector. */
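
    The functions and reduction operators in the tables above can be tried together in a short program. This is only a sketch: the matrix A and the vector b are made-up values chosen so that A is symmetric and positive definite, as root and eigen require.

    ```sas
    proc iml;
       A = {4 2, 2 3};              /* hypothetical symmetric positive definite matrix */
       b = {2, 1};                  /* hypothetical right-hand side */

       x     = solve(A, b);         /* solves A*x = b */
       check = A*x - b;             /* should be zero up to rounding error */
       U     = root(A);             /* upper triangular, A = t(U)*U */
       call eigen(vals, vecs, A);   /* eigenvalues and eigenvectors of A */

       colsum  = A[+, ];            /* 1x2 row vector of column sums */
       rowmean = A[ , :];           /* 2x1 column vector of row means */
       total   = A[+];              /* sum of all elements of A */
       print x, check, U, vals, vecs, colsum, rowmean, total;
    quit;
    ```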
  • Modules

    SAS/IML is a programming language, so you can define your own modules. There are two kinds of modules: functions and subroutines.

    • Functions

      The basic structure of a function in SAS/IML is:

      start function name (arguments);
         function body
         return(value of the function);
      finish function name;

      For example,

      /* define a function */
      start test (x,y);
         z=3;
         x=4;
         y=5;
         p=6;
         return(x+y-z-p);
      finish test;

      a=1;
      b=2;
      p = -0.2143;
      c=test(a, b);
      print a, b, p, c;

      The results are:

      A 4
      B 5
      P -0.2143
      C 0

      • The variables (e.g. p and z in the above example) created within a function are local variables. They are not available outside the function.

        Note that although a variable p is defined before calling the function test, the variable p inside test is different from the one defined outside test. SAS/IML allocates memory for them separately. The memory for the variable inside test is temporary and is released after the call to test returns.

      • The arguments (e.g. a and b in the above example) of a function are similar to pointers in the C language. If you change their values inside the function, the values remain changed outside the function. In the above example, the values of the variables a and b are changed inside the function test. After calling the function test, the values of the variables a and b remain changed.
      • A function can return only one data object. Although you can "return" more than one data object by defining the outputs as arguments of the function, it is better to use a subroutine instead of a function in this case.
    • Subroutines

      Subroutines are similar to functions. The main differences are:

      • Subroutines do not contain a return statement that returns a value.
      • To call a subroutine, you have to use one of the following two forms: call subroutine name (arguments); or run subroutine name (arguments);

      The basic structure of a subroutine in SAS/IML is:

      start subroutine name (arguments);
         subroutine body
      finish subroutine name;

      For example,

      /* define a subroutine */
      start test2 (c, x,y);
         z=3;
         x=4;
         y=5;
         p=6;
         c=x+y-z-p;
      finish test2;

      a=1;
      b=2;
      p = -0.2143;
      call test2(c, a, b);
      print a, b, p, c;

      The results are:

      A 4
      B 5
      P -0.2143
      C 0

    Note that

    • If the name of a function you define is the same as the name of a built-in function of SAS/IML, then SAS/IML will use the built-in function instead of the function you defined.
    • The difference between run and call lies in how they handle the case where the subroutine you defined has the same name as a built-in subroutine. If you use run, SAS/IML will call the subroutine you defined. If you use call, SAS/IML will call the built-in subroutine.
    • Functions and/or subroutines can call other functions and/or subroutines.
    • If some arguments are used as output, then it is better to put them before the arguments which are used as input.
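
    The points above can be illustrated with a small sketch. The module names sumsq and colctr and the data are hypothetical; they were chosen so as not to collide with the names of built-in functions.

    ```sas
    proc iml;
       /* function module: returns a single value */
       start sumsq(v);
          return( sum(v##2) );          /* sum of squares of all elements */
       finish sumsq;

       /* subroutine module: output arguments (xc, xbar) come before the input (x) */
       start colctr(xc, xbar, x);
          xbar = x[:, ];                /* row vector of column means */
          xc   = x - J(nrow(x), 1, 1)*xbar;   /* subtract the column means */
       finish colctr;

       x = {1 2, 3 4, 5 6};             /* made-up data */
       run colctr(xc, xbar, x);         /* RUN resolves to the user-defined module */
       ss = sumsq(xc);                  /* a module result can feed another module */
       print xbar, xc, ss;              /* xbar = {3 4}, ss = 16 */
    quit;
    ```

    Because arguments are passed like pointers, xc and xbar do not need to exist before the subroutine is run; they are created by the assignments inside colctr.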
  • Flow Controls

    Like any other programming language, SAS/IML allows the user to control the path of the execution.

    • IF-THEN/ELSE

      The syntax is

      IF expression THEN statement1;
      ELSE statement2;

      For example,

      if a[k] < mymax then ;   /* null statement */
      else mymax=a[k];

      Note that IF statements can be nested into other IF statements.

    • Do groups

      • Iterative Do statement.

        The syntax is

        DO variable=start TO stop BY increment;
           statements;
        END;

        For example,

        DO i=1 to n by 1;
           a[i]=i;
        END;

        By default the increment is 1.

      • DO WHILE statement.

        The syntax is

        DO WHILE(expression); statements; END;

        The statements are executed repeatedly as long as the expression is true. For example,

        loop=2;
        a=1;
        do while(loop < 100);
           a=a//loop;   /* concatenate vertically */
           loop=loop+1;
        end;

        The above statements produce the column vector {1, 2, ..., 99}.

        Note that the "do while" statement evaluates the expression at the beginning of the loop, so the statements may not be executed at all.

      • DO UNTIL statement.

        The syntax is

        DO UNTIL(expression); statements; END;

        The statements are executed repeatedly until the expression becomes true. For example,

        loop=2;
        a=1;
        do until(loop < 100);
           a=a//loop;   /* concatenate vertically */
           loop=loop+1;
        end;

        The above statements produce the column vector {1, 2}.

        Note that the "do until" statement evaluates the expression at the bottom of the loop, so the loop always executes at least once.

      • DO DATA statement.

        The "DO DATA" statement is used to read data from external files. It can also be used to process SAS data sets. The syntax is

        DO DATA; statements; END;

        For example,

        infile 'abc.txt';   /* open abc.txt */
        do data;
           input x;         /* read a data value */
           y = y//x;
        end;
        print y;

    • Pause, Resume, Stop and Abort

      • PAUSE statement.

        The syntax is:

        PAUSE <message><*>;

        The PAUSE statement pauses the execution of the program. While execution is paused, you can enter more statements. Type "RESUME;" to continue execution at the place where the most recent PAUSE statement was executed.

        PAUSE must be used within a module.

        Examples:

        pause "variable x should be numeric! Assign correct value to x, then type RESUME;";
        pause *;   /* suppress printing any message. */
      • STOP statement.

        The syntax is

        STOP;

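        The "STOP" statement is similar to PAUSE in that it stops the execution of the program without exiting from IML, so you can still enter more statements. Unlike PAUSE, execution cannot be continued with RESUME.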

      • ABORT statement.

        The syntax is

        ABORT;

        The "ABORT" statement stops execution and exits from IML.
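
    The IF-THEN and DO statements described in this section are often combined. As a small sketch with made-up data, the following program finds the maximum of a row vector with an iterative DO loop, and then uses DO WHILE to count how many leading elements are needed for the running total to reach 10:

    ```sas
    proc iml;
       a = {3 9 4 12 7};              /* hypothetical data */

       /* iterative DO with IF-THEN */
       mymax = a[1];
       do k = 2 to ncol(a);
          if a[k] > mymax then mymax = a[k];
       end;

       /* DO WHILE: the condition is tested before each pass */
       n = 0;
       total = 0;
       do while(total < 10);
          n = n + 1;
          total = total + a[n];
       end;

       print mymax n total;           /* mymax = 12, n = 2, total = 12 */
    quit;
    ```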

  • Read, edit and create SAS data sets

    SAS/IML can create matrices/vectors by reading data from SAS datasets. SAS/IML also can edit and create SAS data sets.

    • Read SAS data sets

      To read a SAS data set, abc say, you need to

      1. open the data set by the "USE" statement. For example, use abc;

        The syntax of the USE statement is

        USE SAS data set name <VAR variables> <WHERE(expression)>;

      2. read data into a matrix by using the "READ" statement. The syntax of the READ statement is

        READ <range> <VAR variables> <WHERE (expression)> <INTO name>;

        where range specifies which observations (rows) in the SAS data set you want to use, and variables specifies which variables (columns) in the SAS data set you want to use.

        For example,

        /* "point {1 3}" specifies that the 1st and 3rd observations (rows)
           of the data set will be read. x, y and sex are variables
           in the data set. */
        read point {1 3} var {x y} where(sex="F") into a;

        /* "all" specifies that all observations (rows) of the data set
           will be read. */
        read all var {x y} into a;
      3. close the data set by the "CLOSE" statement. For example, close abc;

      To show information about the data set, you can use the "SHOW" and "LIST" statements. For example,

      show datasets;   /* shows which data sets are open in IML and which one is the current data set */
      show contents;   /* shows the contents (e.g. type of variables) of the current data set. */
      list all;        /* list all observations and all variables of the current data set. */
      list point {3 6} var {x} where(sex="F");

    • Edit SAS data sets

      To edit SAS data sets, you need to

      1. use the "EDIT" statement to open the SAS data set for both input and output and make it the current data set. The syntax of the EDIT statement is

        EDIT SAS data set name <VAR variables> <WHERE(expression)>;

        For example,

        edit test;

      2. find the observations you want to update by using the "FIND" statement. For example,

        /* find the row numbers of all observations for Tom and
           store the row numbers in the matrix pos. */
        find all where(name="Tom") into pos;
        print pos;        /* list the value of pos */
        list point pos;   /* list the observation */

      3. update the values. For example,

        age=20;           /* replace Tom's age with 20 */
        score=90;         /* replace Tom's score with 90 */
        replace;          /* update the values in the SAS data set */
        list point pos;   /* list the observation again to check if it is updated */

      You can delete observations in the SAS data set by using the "DELETE" statement. For example,

      delete;                           /* delete the current observation */
      delete point {1 3};               /* delete the 1st and 3rd observations */
      delete all where (name="Tom");    /* delete all observations of Tom */
    • Create SAS data sets

      You can create SAS data sets from matrices. The syntax is as follows:

      CREATE SAS data set name FROM matrix name <[COLNAME=column-name ROWNAME=row-name]>;

      For example,

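      a=t(1:100);                      /* a 100x1 column vector: one column per variable */
      create test2 from a [colname="id"];
      append from a;                   /* write the rows of a into the data set */
      close test2;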
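
    To see the CREATE, APPEND, USE and READ statements work together, the following sketch writes a small matrix to a data set and reads part of it back. The matrix m, the variable names id and score, and the threshold 20 are made up for illustration.

    ```sas
    proc iml;
       /* build a 5x2 numeric matrix: one row per observation */
       m = t(1:5) || {10, 20, 30, 40, 50};

       /* create the data set and write the rows of m into it */
       create test2 from m [colname={"id" "score"}];
       append from m;
       close test2;

       /* read back only the observations with score > 20 */
       use test2;
       read all var {id score} where(score > 20) into big;
       close test2;

       print big;      /* rows (3 30), (4 40), (5 50) */
    quit;
    ```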
  • Import from and export to external files

    • Import from external files

      The following steps import an external file into SAS/IML:

      1. Assign an alias to the external file by using a FILENAME statement. For example,

        /* "testfile" is an alias of the external file "abc.txt". */
        filename testfile 'abc.txt';
      2. Open the external file for input by using an INFILE statement. For example,

        infile testfile;
      3. Set the length of any character variables by assigning them a string of the desired length. For example,

        name="12345678";   /* name can hold up to 8 characters */
        sex="1";           /* sex can hold 1 character */
      4. Create a new SAS data set. For example,

        /* the name of the new SAS data set is "test". */
        create test var {name sex score};
      5. Read data by using the "DO DATA" statement and the "INPUT" statement. For example,

        do data;
           input name $ sex $ score;
           append;
        end;
      6. Close the external file by using the "CLOSEFILE" statement. For example,

        closefile testfile;
    • Export to external files

      To write a matrix to an external file, you can follow the steps below:

      1. Assign an alias to the external file by using a FILENAME statement. For example,

        /* "testfile" is an alias of the external file "abc.txt". */
        filename testfile 'abc.txt';
      2. Open the external file for output by using a FILE statement. For example,

        file testfile;
      3. Output the matrix to the external file by using DO loops and the "PUT" statement. For example,

        n=nrow(A);
        p=ncol(A);
        do i=1 to n;
           do j=1 to p;
              /* Output A[i,j] using SAS format 6.4. There are 2 spaces
                 between A[i,j] and A[i,j+1]. The symbol "@" instructs IML to
                 hold the current record (row) so that IML can write
                 more to the same record (row). */
              put (A[i,j]) 6.4 +2 @;
           end;
           /* the symbol "/" instructs IML to start a new record (row) */
           put /;
        end;

      4. Close the external file by using the "CLOSEFILE" statement. For example,

        closefile testfile;
  • Graphics

    You can draw plots in SAS/IML. To draw a plot in SAS/IML, you can follow the steps below.

    1. Start graphics.

      /* This is similar to the Splus/R function x11() or motif(),
         but no window pops up after call gstart. */
      call gstart;
    2. Start a new graph.

      /* instruct IML to erase the old graph instead of plotting
         the new graph on top of the old one. */
      call gopen;
    3. Define the display window.

      /* define the coordinates of the lower-left and upper-right corners.
         The range of the data should lie within this area;
         IML will not draw data points outside this area.
         The values are concatenated with "||" because variable names
         cannot be used inside curly brackets. */
      x1=0; y1=0; x2=200; y2=200;
      call gwindow(x1 || y1 || x2 || y2);
    4. Draw points.

      /* Draw a scatter plot of y versus x. Points have diamond shape and green color. */
      call gpoint(x, y, "diamond", "green");
      /* Connect the points by a solid green line. */
      call gdraw(x, y, 1, "green");
    5. Display the graph.

      /* A graph window will pop up and show the graph you specified above. */
      call gshow;
    Note that
    • By default, the graph does not show a border or axes. You have to draw them by using the "GPOLY", "GXAXIS" and "GYAXIS" statements.
    • You can add labels to the graph by using the "GSET", "GTEXT" and "GVTEXT" statements.
    • To add a title, you can use the "GSCENTER" statement.
    Details can be found in the reference book, which is in the computing room of the department.
  • Misc. (loading and storing matrices, etc.)

    To save memory, you can first store matrices into library storage, then release these matrices. If you want to use these matrices again, you can load them from library storage.

    To store matrices, use the "STORE" statement. For example,

    store a b; /* store matrices a and b */

    To release the memory of matrices, use the "FREE" statement. For example,

    free a b;

    To show which matrices are in the library storage, use the "SHOW" statement. For example,

    show storage;

    To load matrices from the library storage, use the "LOAD" statement. For example,

    load a b;
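
    Put together, the four statements form a simple store/release/reload cycle. The following is a sketch with made-up matrices a and b:

    ```sas
    proc iml;
       a = {1 2, 3 4};          /* hypothetical numeric matrix */
       b = {"Tom" "Mark"};      /* hypothetical character matrix */

       store a b;               /* copy a and b into library storage */
       free a b;                /* release the memory held by a and b */
       show storage;            /* A and B should be listed as stored matrices */

       load a b;                /* bring both matrices back into memory */
       print a, b;
    quit;
    ```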

Reference:

SAS IML Software: Usage and Reference, Version 6, First Edition. SAS Institute Inc., Cary, NC, USA, 1990.

Q: SAS online for teaching

SAS provides an online version of its software that you can use for teaching.
Details on how to set it up and run it are provided by Dr. Harry Joe: http://www.ugrad.stat.ubc.ca/~hjoe/sas

Q: How to read SAS data files into Splus?

Method 1. 

If you are using SAS version 6.12 in Unix... 


Step 1 
Output SAS data to the desired directory as usual. The file extension using SAS version 6.12 should be .ssd01. Suppose a SAS data file named myfile.ssd01 is located under /home/user/data/

Step 2 
In Splus 3.4 or above, use the function sas.get(). For example, if you want to name the data set as "mydata" in Splus, type the following at the Splus prompt: 

      mylibrary <- "/home/user/data/" 
      mydata <- sas.get(mylibrary,"myfile") 

For more detail, do help(sas.get)

OR 

In Splus 5 or above, use the function importData(). For example, if you want to name the data set as "mydata" in Splus, type the following at the Splus prompt: 

      mydata <- importData("/home/user/data/myfile.ssd01",type="SAS1") 

For more detail, do help(importData)

If you are using SAS version 7 in Unix... 

Step 1 
Output SAS data to the desired directory as usual. The file extension using SAS version 7 should be .sas7bdat. Suppose a SAS data file named myfile.sas7bdat is located under /home/user/data/

Step 2 
In Splus 5 or above, use the function importData(). For example, if you want to name the data set as "mydata" in Splus, type the following at the Splus prompt: 

      mydata <- importData("/home/user/data/myfile.sas7bdat",type="SAS7") 

For more detail, do help(importData)

If you are using SAS version 8 in Unix... see Method 2. 


Method 2 (recommended). Create a transport file from SAS (this works for SAS version 6.12 or above) 

Step 1 
In the SAS code, include the following line when defining SAS libraries: 

      libname sasdata xport "/home/user/data/file.tpt"; 
 

  • "libname" and "xport" are required in the syntax
  • "sasdata" is the user-defined library name
  • "/home/user/data/file.tpt" is the full path name of the transport file (Note the .tpt extension!). "file.tpt" is the file you want to output from SAS and later to be read into Splus


Don't forget to write the data set to this library in the SAS code, i.e. refer to the output data set by a two-level name that begins with "sasdata."!
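
As an illustration of Step 1, the following sketch creates a small data set and copies it into the transport file. The data set name mydata and its variables id and score are hypothetical; the path is the one used above. Names are kept to 8 characters or fewer, which the transport format requires.

```sas
libname sasdata xport "/home/user/data/file.tpt";

/* hypothetical data set to be exported */
data work.mydata;
   input id score;
   datalines;
1 90
2 85
;
run;

/* writing to the sasdata library stores the data in file.tpt */
data sasdata.mydata;
   set work.mydata;
run;
```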

Step 2 
In Splus 5 or above, use the function importData(). For example, if you want to name the data set as "mydata" in Splus, type the following at the Splus prompt: 

      mydata <- importData("/home/user/data/file.tpt",type="SAS_TPT") 

For more detail, do help(importData)

Note: 
1. Splus 3.4 does not have the importData() function. 
2. The sas.get() function may not read data files created by SAS of version 7 or above. 
3. The importData() function may not read data files from SAS of version 8 even though the file suffix can be the same as version 7 files (.sas7bdat). 
4. For Splus 4+ in Windows, use the import.data() function instead.

Q: Examples with SQL and databases (by Harry Joe)

Examples with SQL and databases (by Harry Joe)

Q: SAS Macro Language

SAS Macro Language



  • Introduction

    The macro facility is a powerful tool for extending and customizing the SAS System. It is particularly useful if you have to repeat common tasks, for example in the data input step. Writing a macro in SAS is like writing a function in Splus. With the macro facility, you can assign a name to a character string or to a group of SAS programming statements. When you want to run those programming statements, you just refer to the name assigned.

    When you use a macro facility name in a SAS program, the macro facility generates SAS statements and commands as needed. The rest of the SAS System receives those statements and uses them in the same way it uses the ones you enter in the standard manner.

    The macro facility has two components:

    1. The macro processor - the portion of the SAS System that does the work. 
    2. The macro language - the syntax that you use to communicate with the macro processor.

    When the SAS System compiles program text, two delimiters trigger macro processor activity:

       &name     refers to a macro variable. The form &name is called a macro variable reference.

       %name     refers to a macro.


  • Macro Definition

    The syntax for defining a macro is as follows:    

    %macro macro-name;
    macro-definition 
    %mend macro-name;
     

    Note that in defining a macro, you need to give it a distinct name. You also need to begin and end the macro with the %macro and %mend statements, respectively. The macro-name specified in the %mend statement must match the macro-name in the %macro statement. Examples of macros follow in the later sections.


  • Defining Macro Variables

    Macro variables are an efficient way of substituting text strings in SAS code. One way of defining a macro variable is to use the %let statement, which gives the macro variable a name and a value.

    For example,

    %let subject=Statistics;



    subject is the macro variable name and Statistics is its value. Assigning a text string to a macro variable does not require quotation marks around the string, unlike the standard assignment of text strings to non-macro variables in SAS.

    Later, when you want to create a title for the data set you use for the Statistics 200 course, you would like the text Statistics to appear. To refer to the variable subject, precede the variable name with an ampersand (&) (for macro variable reference):

    title "Data for &subject 200 class";



    The macro processor resolves the reference to the macro variable subject, and the statement becomes

    title "Data for Statistics 200 class";
     
    An important note: the macro processor resolves macro variable references within DOUBLE quotation marks but NOT within single quotation marks.
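
    The difference between the two kinds of quotation marks can be checked directly. The following is a minimal sketch, assuming the macro variable subject defined above; the use of title2 for the single-quoted line is only an illustration choice.

    ```sas
    %let subject=Statistics;

    /* Double quotes: &subject is resolved, giving "Data for Statistics 200 class" */
    title "Data for &subject 200 class";

    /* Single quotes: the text &subject is kept literally, unresolved */
    title2 'Data for &subject 200 class';

    /* %put writes the resolved value to the SAS log, which is handy for checking */
    %put The value of subject is: &subject;
    ```

    The %put statement is a convenient way to inspect a macro variable without running a procedure.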


  • Commenting in Macros

    There are two ways of inserting comments in Macros:

    1. begin with /* and end with */ 
    2. begin with a %* and end with a ;

    An example:


    %macro comment;
      
    /* the first type of commenting */

    %let myvar=abc;

    %* the second type of commenting;
    %let myvar2=xyz;

    %mend comment;


  • Invoking a Macro

    The general rule for invoking a macro is to precede the name of the macro with a percent sign (%). Unlike typing a SAS statement, the statement for macro invocation does NOT end in a semicolon.

    In the previous section, we have defined a macro named comment. To invoke the macro, simply do



        %comment
     


  • Passing Information into a Macro Using Parameters

    Here we will see the similarity between a macro and a user-defined function in other programming languages such as Splus.

    Suppose we are interested in defining a macro which takes in the x and y variables and plots a graph of y versus x. The following macro myplot will do the job:

    %macro myplot(x=,y=);
      proc plot;
        plot &y * &x;
        title "plot of &y versus &x";
      run;
    %mend myplot;
     

    A macro variable defined in parentheses in a %macro statement is a macro parameter. Macro parameters allow you to pass information into a macro. They are like input arguments in routines in Splus, C/C++, etc.

    To invoke the macro myplot in the above example, you must provide values for the two parameters x and y. Say, you want a graph of income versus age, then do:



        %myplot(x=age,y=income)
     

    where the variables age and income already exist in a SAS data set. The macro processor matches the values specified in the macro call to the parameters in the macro definition. Macro execution produces the following code:


      proc plot;
        plot income * age;
        title "plot of income versus age";
      run;
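
    To see for yourself which statements a macro generates, the MPRINT system option writes them to the SAS log. The following is a minimal sketch; the data set demo and its values are hypothetical and serve only to make the example self-contained.

    ```sas
    /* Echo the statements generated by macros to the SAS log */
    options mprint;

    /* Hypothetical data set with the two variables used below */
    data demo;
      input age income;
      datalines;
    25 30000
    35 45000
    45 52000
    ;
    run;

    %macro myplot(x=,y=);
      proc plot;
        plot &y * &x;
        title "plot of &y versus &x";
      run;
    %mend myplot;

    /* proc plot uses the most recently created data set, here demo */
    %myplot(x=age,y=income)
    ```

    With MPRINT on, the log should show the proc plot step with age and income substituted for &x and &y.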
     


  • Generating Repetitive Pieces of Text Using %DO Loops

    To generate repetitive pieces of text, one can make use of an iterative %DO loop. Suppose you want to list a series of data sets or variables whose names share a certain pattern. The following macro can reduce the amount of typing in listing the data sets or variables:



    %macro names(name=,count=);
      %do n=1 %to &count;
        &name&n
      %end;
    %mend names;

    Say you want to list 5 data sets named data1, data2, data3, data4 and data5 in your data step; you can do:

      data %names(name=data,count=5);
     

    Macro execution produces the following code:


      data data1 data2 data3 data4 data5;
     

    Note that to concatenate the text string passed to the macro parameter name with the current value of the loop counter n, we only need to juxtapose the macro variable references &name and &n.

    However, to concatenate a text string passed to a macro parameter and another text string (one that does not come from a macro parameter), we may need the period character (.). See the section Combining Macro Variable Reference with Text for more detail.
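
    One way to check what a text-generating macro such as names produces is to write its result to the SAS log with %put. The following is a minimal sketch, assuming the macro names defined above; the prefix x and the count 3 are arbitrary illustration values.

    ```sas
    %macro names(name=,count=);
      %do n=1 %to &count;
        &name&n
      %end;
    %mend names;

    /* Write the generated text to the log: the names x1, x2 and x3 */
    %put %names(name=x,count=3);

    /* The same macro can also generate a variable list, e.g. inside a procedure:
         var %names(name=x,count=3);                                            */
    ```

    Because the macro only generates text, it can be used wherever a list of names is allowed, not only in a data statement.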


  • Combining Macro Variable Reference with Text

    Let us first look at the following piece of SAS code with the use of macros:



    %let name=sales;

    data AB&name.dat;
      set save.&name;
      if units>100;
    run;
     

    Macro execution produces the following code:


    data ABsalesdat;
      set save.sales;
      if units>100;
    run;
     

    The student who wrote the above code was planning to create a data set named ABsalesdat, a subset of the data saved in savesales containing the observations whose units variable is larger than 100. But the macro execution of his code did not produce the desired output: it generated save.sales instead of the desired savesales.

    Here is the trick for concatenating text and macro variables:

    1. To put a text string "string1" in front of the string assigned to a macro variable mystring, simply juxtapose the text string (without quotation marks) and the macro variable reference &mystring, i.e., do

        string1&mystring

    2. To add a suffix "sufstring" to a text string assigned to a macro variable mystring, you will need the period character (.), i.e., do

        &mystring.sufstring

    You can combine the two rules to concatenate more than 2 text strings involving a macro variable. In the above example, AB&name.dat resolves to the text string "AB", followed by the value of the macro variable name, followed by the suffix "dat"; the period only marks the end of the macro variable name and does not appear in the result. In save.&name, however, the period comes before the macro variable reference, so it is treated as ordinary text: it is part of the string "save.", which is then concatenated with the string "sales" obtained by resolving &name. To get savesales, the student should have written save&name.
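
    The two rules can be tried out with %put, which writes the resolved text to the SAS log. The following is a minimal sketch; the macro variable lib is hypothetical and is introduced only to show how to obtain a literal period after a macro variable reference (two periods are needed, since the first one is consumed as the delimiter).

    ```sas
    %let name=sales;
    %let lib=save;      /* hypothetical helper variable for the last example */

    /* Prefix: plain text followed by the reference, resolves to ABsales */
    %put AB&name;

    /* Suffix: the period ends the variable name, resolves to salesdat */
    %put &name.dat;

    /* Both rules combined, resolves to ABsalesdat */
    %put AB&name.dat;

    /* What the student wanted, resolves to savesales */
    %put save&name;

    /* A literal period after a reference needs two periods, resolves to save.sales */
    %put &lib..&name;
    ```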


  • Referencing Macro Variables Indirectly

    Let us consider the following macro:

    %let city1=New York;
    %let city2=Boston;
    %let city3=Seattle;
    %let city4=Las Vegas;

    %macro listcity(count=);
      %do n=1 %to &count;
        &city&n
      %end;
    %mend listcity;
     

    The above macro listcity was written by a student who wanted to save some typing when listing names of US cities. Unfortunately, when he calls the macro to list the first three cities using

      %listcity(count=3)

    the SAS macro processor gives an error message saying "macro variable city is not resolved." To understand what has gone wrong, we need to know how the SAS macro processor interprets

      &city&n

    The macro processor tries to resolve and concatenate the two macro variable references &city and &n, and no macro variable named city exists. What the student wants is for the text string "city" and the value of &n to be concatenated first, and only then for the resulting variables city1, city2 and city3 to be referenced.

    To solve the problem, precede &city&n by another ampersand sign (&), i.e.

      &&city&n

    The double ampersands && are first resolved to a single & (which is set aside to be resolved in a later pass), so the text string "city" is concatenated with the value of &n before the resulting references &city1, &city2 and &city3 are resolved.
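
    The corrected macro can be sketched as follows. This is an illustration under a few assumptions: it writes each city to the SAS log with %put instead of generating bare text, and it declares the counter n with %local as a precaution; neither is part of the student's original macro.

    ```sas
    %let city1=New York;
    %let city2=Boston;
    %let city3=Seattle;
    %let city4=Las Vegas;

    %macro listcity(count=);
      %local n;   /* precaution: keep the loop counter local to this macro */
      %do n=1 %to &count;
        /* First pass:  && -> &, city stays, &n -> 1, giving &city1 */
        /* Second pass: &city1 -> New York                          */
        %put City &n is &&city&n;
      %end;
    %mend listcity;

    %listcity(count=3)
    ```

    For count=3 the log lines should read City 1 is New York, City 2 is Boston and City 3 is Seattle.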


  • Scope of Macro Variables

    Let us consider the following macro:



    %let datanew=inventory;

    %macro conditn;
      %let dataold=sales;
      %let cond=cases > 0;
    %mend conditn;

    %macro name;
      %let datanew=report;
      %let dataold=warehse;
      %conditn
      data &datanew;
        set &dataold;
        if &cond;
      run;
    %mend name;
     

    The very first %let statement defines the datanew variable globally. In contrast, a %let statement within a macro defines a variable locally unless a variable of the same name already exists in an enclosing scope, i.e., in open code (not within any macro) or in a macro that called the current one. In the above, the variable datanew reappears in the macro named name. Since this variable was defined earlier in open code, it remains global throughout the code. The variables dataold and cond are local.

    When we invoke the macro name using

        %name

    the SAS macro processor will give an error message saying "The macro variable cond is not resolved." The problem is related to the scope of the macro variables. When the macro name is invoked, the first %let statement assigns the text string "report" to the global datanew variable (it replaces the originally assigned string "inventory" with "report"). The second %let statement defines a local variable named dataold and assigns it the text string "warehse". Next, the macro conditn is invoked. Within this macro, the variable cond is defined locally, and the variable dataold, which already exists in the calling macro name, is reassigned the text string "sales". However, after the execution of the statement %conditn, the variable cond no longer exists (because it is local to the macro conditn only!), so the macro variable cond in the if statement of the later data step cannot be resolved. The other two variables in the data step, datanew and dataold, can be resolved because they still exist within the scope of the macro name.

    Indeed, the SAS macro processor interprets the above code upon the invocation of the macro name as follows:

    data report; 
        set sales; 
        if &cond; <-- this gives an error! 
    run;
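
    The scope of a macro variable can be inspected with the %put statement and the keywords _local_ and _global_. The following is a minimal sketch, assuming that no global macro variable named cond has been defined earlier in the session.

    ```sas
    %macro conditn;
      %let cond=cases > 0;
      /* List the macro variables local to conditn; cond should appear here */
      %put _local_;
    %mend conditn;

    %conditn

    /* Back in open code: list the global macro variables; cond should be absent */
    %put _global_;
    ```

    Comparing the two listings in the log shows that cond disappears as soon as conditn finishes executing.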


  • Forcing a Macro Variable to be Local

    Sometimes it is useful to force a variable to be local within a macro when a (global) variable of the same name has been defined earlier in open code, e.g. when you do not want the macro to alter the value of the global variable. Of course, if we avoid using a variable name that is the same as that of some global variable, we will not accidentally alter the values of global variables.

    The following shows an example when a local definition is necessary:

    %let n=North State Industries;

    %macro namelst(name,number); 
        %do n=1 %to &number; 
            &name&n 
        %end; 
    %mend namelst;

    proc print; 
        var %namelst(dept,5); 
        title "Quarterly Report for &n"; 
    run;

    The macro namelst makes use of a %DO loop (as discussed earlier) for generating repetitive pieces of text (here, the repetitive pieces are name1, name2 and so on). The variable n is global, as defined in the first %let statement. Within the macro namelst, the variable n serves as a counter. However, when the macro namelst is invoked later in the proc print step, the value of n changes during the execution of the %DO loop. In particular, it is the value of the global variable n that is changed. So in the title statement, where the variable n is referenced, the SAS macro processor will produce the title

        "Quarterly Report for 6".

    The value 6 is the result of running the %DO loop five times after the invocation of the macro namelst: the counter n is incremented once more after the fifth iteration, which ends the loop.

    One can imagine that the desired title is "Quarterly Report for North State Industries", which can be obtained by forcing the variable n to be local within the macro namelst. The %local statement is used to keep a variable local; just add the line

        %local n;

    before the %DO loop in the macro namelst. The value of the global variable n (the text string "North State Industries") will not be affected by the change of values of the locally defined variable n within the macro.
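
    Put together, the corrected macro can be sketched as follows. The two %put statements are an illustration added here to check the result in the SAS log; they are not part of the original example.

    ```sas
    %let n=North State Industries;

    %macro namelst(name,number);
      %local n;   /* the loop counter is now local to namelst */
      %do n=1 %to &number;
        &name&n
      %end;
    %mend namelst;

    /* The generated variable list: dept1 through dept5 */
    %put %namelst(dept,5);

    /* The global n is untouched: Quarterly Report for North State Industries */
    %put Quarterly Report for &n;
    ```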


  • Creating Global Macro Variables

    The %local statement allows one to create local macro variables. Similarly, the %global statement creates a global macro variable if a variable with the same name does not already exist.

    Referring to the example we gave earlier when we discussed the scope of macro variables, the macro variable cond is not resolved during the execution of the data step within the macro name. This is because the variable cond is defined locally within the macro conditn and does not exist after the execution of this macro. To avoid problems in referencing the macro variable cond in the data step, we can use the %global statement in defining the macro conditn, as follows:



    %macro conditn;
      %global cond;
      %let dataold=sales;
      %let cond=cases > 0;
    %mend conditn;

    Invoking the macro name generates the following statements:


    data report;
      set sales;
      if cases>0;
    run;
     

    Note: You CANNOT use the %global statement to make an existing local variable global!

    If you want to put the data step outside the macro name, then all the macro variables have to be global for the macro processor to resolve the references. You cannot add the macro variable dataold to the %global statement within the macro conditn, since the %let statement in the macro name has already created dataold as a variable local to name by the time conditn begins to execute.


Reference:

SAS Macro Language: Reference, First Edition. SAS Institute Inc., Cary, NC, USA, 1997.

Q: Where can I use SAS?

Server host SAS: unixlab.stat.ubc.ca
OS: Unix Solaris
Hardware: SUN 280R, 2x 1.4Mhz CPUs, 4Gb of Memory
Location: Undergraduate Network, Room LSK 121
Access mode: Via LSK 121 lab or SSH for remote login.

Q: How to read Microsoft Excel format (.xls) data file by R?

There seems to be no direct way to read an .xls format file (see http://maths.newcastle.edu.au/~rking/R/help/00b/2519.html).

However, there are some ways to read an .xls file indirectly. For example, you can save the .xls file in .csv (comma separated value) format and then use R's function read.csv to read it.

The reason for saving the .xls file as a .csv file is that an .xls file usually has some columns of strings containing white space. If the .xls file is saved as a white-space-delimited or TAB-delimited file, it is still difficult to read the file into R.

Q: How to call Fortran subroutines in R?

In R, we can call Fortran subroutines. For example, we have the following toy Fortran subroutine in the file test.f.

CCCCCCCCCCCCCCCCC
C The subroutine is to calculate Hadamard product of two matrices.
C out[i][j]=x[i][j]*y[i][j].
C Both R and Fortran store matrix by column.
CCCCCCCCCCCCCCCCC
CCCCCCCCC Fortran program (f77) has to be between 7-th and 72-nd column.
CCCCCCCCC The 6-th column is for continuation marker.
      subroutine myHadamaProduct(x, y, nrow, ncol, mo)
      integer i, j, nrow, ncol
CCCCCCC In Fortran, you don't need to specify the second dimension for matrix
      double precision x(nrow, *), y(nrow, *), mo(nrow, *)
      do i = 1, nrow
        do j = 1, ncol
          mo(i,j)=x(i,j)*y(i,j)
        enddo
      enddo
      return
      end

  1. First, we need to compile the file test.f to create a shared library, test.so say, by using the GNU Fortran compiler:

    g77 -fpic -shared -fno-gnu-linker -o test.so test.f
  2. Next, we need to use the R function dyn.load to load the shared library test.so.

    if(!is.loaded("myhadamaproduct")){
      dyn.load("./test.so")
    }

    The R function is.loaded checks whether the Fortran subroutine myHadamaProduct has already been loaded into R. If it has, we do not need to load it again.
  3. Next, we use the R function .Fortran to call the Fortran subroutine myHadamaProduct. For example,

    x<-matrix(1:10,nrow=5, ncol=2) # get a 5x2 matrix
    y<-matrix(1:10,nrow=5, ncol=2) # get a 5x2 matrix
    out<-matrix(0, nrow=5, ncol=2) # initialize output matrix
    # to format matrix or array, use function storage.mode()
    storage.mode(x)<-"double"
    storage.mode(y)<-"double"
    storage.mode(out)<-"double"
    nr<-as.integer(nrow(x))
    nc<-as.integer(ncol(x))
    # Fortran is *NOT* case-sensitive. So it will change all characters
    # to lower case. Thus, to use .Fortran to call Fortran subroutines, you
    # have to type lower case. Otherwise, R will prompt an error message.
    res<-.Fortran("myhadamaproduct", x, y, nr, nc, out=out)
    cat("Hadamard product >>\n")
    print(res$out)
  4. If you do not need to use the shared library test.so any more, you can use the R function dyn.unload to unload it.

    if(is.loaded("myhadamaproduct")){
      dyn.unload("./test.so")
    }

Note:

  • The Fortran program called by R must be a subroutine, not a function. In the example above, myHadamaProduct is defined as a subroutine:

    subroutine myHadamaProduct(x, y, nrow, ncol, mo)

    The arguments of Fortran subroutines are passed by address instead of by value. Unlike the C language, there is no "pointer" concept in Fortran 77.

  • When you use ".Fortran" to call Fortran subroutines, the name of the Fortran subroutines must be in lower case.
  • Any values returned by Fortran subroutines which are called in R must be initialized and must have the format:

    # if the variable is defined as double in the Fortran subroutine
    variablename=as.double(initialized values) # inside .Fortran
    # if the variable is defined as integer in the Fortran subroutine
    variablename=as.integer(initialized values) # inside .Fortran
    # if the output is double precision matrix or array
    storage.mode(variablename)<-"double" # before .Fortran
    variablename=variablename # inside .Fortran

    The input values must also be initialized and must have the above format. However, they can be formatted before the ".Fortran" call.

  • If the output is not written in the variablename=variablename format (e.g. out=out in the above example), you can still get the results. However, you have to use res[[5]] to refer to out in the above example. In fact, the .Fortran function returns a list containing all arguments of the Fortran subroutine myHadamaProduct. Since out is the 5-th argument, you can use res[[5]] to refer to the 5-th element of the list.
  • It is okay that the file test.f contains the main program.
  • Sometimes the command dyn.load("test.so") produces an error message. This is probably because the environment variable "$PATH" was not set correctly. You can either add the following line to the file ".bashrc" in your home directory:

    export PATH=$PATH:.:

    or use the command

    dyn.load("./test.so")
Q: How to install command-line editor in Splus?

R has a command-line editor which allows us to retrieve and edit commands we entered before. We also can install a command-line editor in Splus.

STEP 1

If you use Bourne, Korn, Bash or Z-Shell, type the following lines into your .bashrc file which is in your home directory:

export EDITOR="/usr/local/bin/vim"
export S_CLEDITOR="/usr/local/bin/vim"
export VISUAL="/usr/local/bin/vim"

If you use C-Shell or TC-Shell, then type the following lines into your .cshrc file which is in your home directory:

setenv EDITOR "/usr/local/bin/vim"
setenv S_CLEDITOR "/usr/local/bin/vim"
setenv VISUAL "/usr/local/bin/vim"

If you want to use emacs instead of vi, then simply replace vim with emacs in the lines above.

STEP 2

In your home directory, type the command

source .bashrc

or

source .cshrc

STEP 3

Type

Splus -e

to invoke the command-line editor when you start an Splus session.

The most useful editing commands are summarized in the following table:

COMMAND              emacs     vi
backward character   Ctrl-B    Esc, h
forward character    Ctrl-F    Esc, l
previous line        Ctrl-P    Esc, k
next line            Ctrl-N    Esc, j
beginning of line    Ctrl-A    Esc, ^ (Shift-6)
end of line          Ctrl-E    Esc, $ (Shift-4)
forward word         Esc, f    Esc, w
backward word        Esc, b    Esc, b
kill char            Ctrl-D    Esc, x
kill line            Ctrl-K    Esc, Shift-d
delete word          Esc, d    Esc, dw
search backward      Ctrl-R    Esc, ?
yank                 Ctrl-Y    Esc, Shift-y
transpose chars      Ctrl-T    Esc, xp

You can type the Splus command

?Command.edit

to get the above table.
