Virginia Tech
Advanced Research Computing


The R Statistical Software Application

R is a GNU project that provides a language and environment for statistical computing and graphics.

R can be run in interactive mode by entering the command R at a command prompt. A greater-than sign (">") prompt appears when R is ready to accept your commands. You can then enter R commands; results, if any, are displayed on the screen. To exit from R in interactive mode, enter the R quit command: q()

Note: The above assumes that the R command is in a directory included in your default search path. If you receive an error message indicating that the command R could not be found, give the full directory path and file name when you invoke the command. For example, on the VT ARC SGI Systems, you could enter the following command to invoke R in interactive mode: /apps/bin/R
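As a rough illustration of an interactive session, the commands below could be typed one line at a time at the ">" prompt. The vector values are made up for this sketch and do not come from any data file; the quit command is left as a comment so that the block can also be run as a script.

```r
# Assign three illustrative values to a vector; an assignment prints nothing
x <- c(34, 40, 45)

# Typing an expression at the prompt prints its value
mean(x)

# summary() prints the minimum, quartiles, mean, and maximum
summary(x)

# Enter q() at the prompt to leave R
# q()
```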

Alternatively, you can run R in batch mode by redirecting a file containing your R commands to R's standard input. For example, to use R to process the commands in the file "test.r" and write the results to the file "test.out", you could use the following command:

    R  --vanilla  <test.r  >test.out
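A command file is just a text file of R commands. The sketch below shows what a small one might contain; it is not the file used later in this document, and the variable names are hypothetical. It relies on the base R function interactive(), which returns FALSE when R is not being run interactively, for example when commands are redirected from a file.

```r
# Report whether R is being driven from a terminal or from a command file
run_mode <- if (interactive()) "interactive" else "batch"
cat("R is running in", run_mode, "mode\n")

# Illustrative values only; in batch mode the printed result
# goes to the redirected output file instead of the screen
x <- c(2, 4, 6)
mean(x)
```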

To begin learning R commands, see An Introduction to R;  for additional information, see The R Manuals.

Note:  the R language is case sensitive!
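A minimal sketch of what case sensitivity means in practice is given below. The object names age and AGE are chosen only for illustration. The misspelled function name Mean is used deliberately to show the resulting error, which is caught with tryCatch() so that the example keeps running.

```r
# Two distinct objects whose names differ only in case
age <- 34
AGE <- 40
identical(age, AGE)   # FALSE: they are separate objects with different values

# Function names are case sensitive too: mean() exists, Mean() does not
msg <- tryCatch(Mean(c(1, 2, 3)),
                error = function(e) conditionMessage(e))
print(msg)            # an error message reporting that "Mean" was not found
```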

The remainder of this document consists of a sample R program file and the results obtained from running it.

An Example R Program

An example R program file is given below. This program completes the following steps:

  1. Reads Data from the file "survey.dat"
  2. Creates Summary Statistics
  3. Runs a Linear Regression of INC on AGE

$ cat test.r
survey <- read.table("survey.dat",header=TRUE)
summary(survey)

reg_data <- lm(INC~AGE,data=survey)
summary(reg_data)

Here is a sample data file that could be used with the above R program:
 OBS    ID    SEX      AGE     INC     R1      R2      R3
   1     1      F       34      17      7       2       2
   2    17      M       40      14      5       5       3
   3    33      M       45       6      7       2       7
   4    49      M       24      14      7       5       7
   5    65      F       52       9      4       7       7
   6    81      M       45      11      7       7       7
   7     2      F       24      17      6       5       3
   8    18      F       40      14      7       5       2
   9    34      F       45       6      6       5       6
  10    50      M       34      17      5       7       5
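To try the same steps without creating a separate data file, the sample data can be embedded in the script itself. This is only a sketch: it assumes a version of R in which read.table() accepts a text argument, and the names survey_text and fit are introduced here for illustration. The printed summary may be formatted differently from the output shown in this document, depending on the R version.

```r
# The sample data from above, embedded as a string instead of "survey.dat"
survey_text <- "OBS ID SEX AGE INC R1 R2 R3
1   1 F 34 17 7 2 2
2  17 M 40 14 5 5 3
3  33 M 45  6 7 2 7
4  49 M 24 14 7 5 7
5  65 F 52  9 4 7 7
6  81 M 45 11 7 7 7
7   2 F 24 17 6 5 3
8  18 F 40 14 7 5 2
9  34 F 45  6 6 5 6
10 50 M 34 17 5 7 5"

# header = TRUE takes the column names from the first line
survey <- read.table(text = survey_text, header = TRUE)

nrow(survey)          # 10 observations
mean(survey$AGE)      # 38.3

# Regress INC on AGE and show only the two fitted coefficients
fit <- lm(INC ~ AGE, data = survey)
round(coef(fit), 4)   # intercept 25.5866, AGE slope -0.3417
```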

Example R Program Output

$ cat test.out
R : Copyright 2005, The R Foundation for Statistical Computing
Version 2.1.1  (2005-06-20), ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for a HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

> survey <- read.table("survey.dat",header=TRUE)

> summary(survey)
      OBS              ID        SEX        AGE            INC
 Min.   : 1.00   Min.   : 1.00   F:5   Min.   :24.0   Min.   : 6.00
 1st Qu.: 3.25   1st Qu.:17.25   M:5   1st Qu.:34.0   1st Qu.: 9.50
 Median : 5.50   Median :33.50         Median :40.0   Median :14.00
 Mean   : 5.50   Mean   :35.00         Mean   :38.3   Mean   :12.50
 3rd Qu.: 7.75   3rd Qu.:49.75         3rd Qu.:45.0   3rd Qu.:16.25
 Max.   :10.00   Max.   :81.00         Max.   :52.0   Max.   :17.00
       R1             R2            R3
 Min.   :4.00   Min.   :2.0   Min.   :2.0
 1st Qu.:5.25   1st Qu.:5.0   1st Qu.:3.0
 Median :6.50   Median :5.0   Median :5.5
 Mean   :6.10   Mean   :5.0   Mean   :4.9
 3rd Qu.:7.00   3rd Qu.:6.5   3rd Qu.:7.0
 Max.   :7.00   Max.   :7.0   Max.   :7.0

> reg_data <- lm(INC~AGE,data=survey)
> summary(reg_data)

Call:
lm(formula = INC ~ AGE, data = survey)

Residuals:
    Min      1Q  Median      3Q     Max
-4.2107 -2.6361  0.9852  2.0809  3.0307

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  25.5866     4.3569   5.873 0.000373 ***
AGE          -0.3417     0.1109  -3.082 0.015074 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.085 on 8 degrees of freedom
Multiple R-Squared: 0.5428,     Adjusted R-squared: 0.4857
F-statistic: 9.498 on 1 and 8 DF,  p-value: 0.01507
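The numbers in the regression summary can be related to one another with a little arithmetic. The sketch below uses only values copied from the output above; the helper predict_inc and the other variable names are hypothetical and exist only for this illustration. Because the inputs are already rounded, the results agree with the printed output only approximately.

```r
# Fitted coefficients as printed in the output above
intercept <- 25.5866
slope     <- -0.3417

# Hypothetical helper: fitted INC for a given AGE under the printed model
predict_inc <- function(age) intercept + slope * age
predict_inc(40)                  # about 11.92

# A t value is the estimate divided by its standard error
t_age <- slope / 0.1109          # about -3.08

# Adjusted R-squared from R-squared, n = 10 observations, p = 1 predictor
r2 <- 0.5428
n  <- 10
p  <- 1
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)   # about 0.486
```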


VT-ARC is a Unit within the Office of the Vice President of Information Technology
© 2007-2008 Virginia Polytechnic Institute and State University
Page Last Updated: February 1st, 2008