Monday, June 08, 2015

Getting Started with R

The easiest way to get started with R is to use RStudio. You will need to install both R and RStudio for your operating system:

Windows
Mac OSX

You will need internet access to download R packages. If you are behind a proxy, you can configure it from RStudio by editing your R startup file:

file.edit('~/.Rprofile')

In that file, you will need to add the following settings:

Mac OSX:

Sys.setenv(http_proxy="http://proxy:8080")


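Windows (this NAME=value line is not R code, so it belongs in ~/.Renviron, opened with file.edit('~/.Renviron'), rather than in ~/.Rprofile):

http_proxy=http://proxy:8080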
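
To check that a proxy setting has taken effect, you can inspect the environment variables from the R console. The following is a minimal sketch, not a definitive recipe: it sets the variables for the current session only, reuses the placeholder address http://proxy:8080 from above, and assumes that HTTPS traffic goes through the same proxy (the https_proxy line is an assumption, not part of the original instructions).

```r
# Set the proxy for the current R session only.
# "http://proxy:8080" is the placeholder address used above; replace it with your own.
Sys.setenv(http_proxy = "http://proxy:8080")

# Assumption: HTTPS downloads use the same proxy host and port.
Sys.setenv(https_proxy = "http://proxy:8080")

# Read the values back to confirm what R will use for downloads.
proxy_settings <- Sys.getenv(c("http_proxy", "https_proxy"))
print(proxy_settings)
```

If the printed values are empty after restarting RStudio, the startup file was probably not read, which usually means it was saved somewhere other than your home directory.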

These instructions were based on the following articles:

Once you have set up internet access, you should be able to install R packages. The yhat blog lists 10 R packages they wish they'd known about earlier.
  • sqldf (for selecting from data frames using SQL)
  • forecast (for easy forecasting of time series)
  • plyr (data aggregation)
  • stringr (string manipulation)
  • Database connection packages: RPostgreSQL, RMySQL, RMongo, RODBC, RSQLite
  • lubridate (time and date manipulation)
  • ggplot2 (data visualization)
  • qcc (statistical quality control and QC charts)
  • reshape2 (data restructuring)
  • randomForest (random forest predictive models)
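
The packages listed above can be installed from the R console with install.packages(). The sketch below is one way to do it, under a few assumptions: the helper name missing_packages is hypothetical, RSQLite stands in for the database packages (choose the one that matches your database), and the install line is left commented out because it needs internet access.

```r
# Packages from the list above, with RSQLite chosen as an example database driver.
wanted <- c("sqldf", "forecast", "plyr", "stringr", "RSQLite",
            "lubridate", "ggplot2", "qcc", "reshape2", "randomForest")

# Hypothetical helper: return the packages in `pkgs` that are not installed yet.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

to_install <- missing_packages(wanted)
print(to_install)

# Uncomment to download and install whatever is missing (needs internet access):
# if (length(to_install) > 0) install.packages(to_install)
```

Once a package is installed, load it in each session with library(), for example library(ggplot2).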