R can be downloaded from http://cran.r-project.org.
> 2+2
[1] 4
> 3-5
[1] -2
> 3*2
[1] 6
> 7/3
[1] 2.333333
> 8^2
[1] 64
> pi
[1] 3.141593
Storing values in variables
> radius <- 24
> area <- pi*radius^2
> area
[1] 1809.557
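A variable stores the value computed at the moment of assignment, not the formula. The short sketch below illustrates this; `diameter` and `stale_diameter` are names made up for this example and do not appear in the notes above.

```r
radius <- 24
diameter <- 2 * radius       # computed once, from the current radius (48)

radius <- 10                 # reassigning radius does not touch diameter
stale_diameter <- diameter   # still 48, based on the old radius

diameter <- 2 * radius       # recompute to pick up the new radius (20)
```

Printing `stale_diameter` and `diameter` after running this shows the old and the recomputed value side by side.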
Vectors
> math <- c(60,90,34)
> science <- c(56,98,76)
> english <- c(34,98,22)
> avg_grades <- (math + science + english) / 3
> avg_grades
[1] 50.00000 95.33333 44.00000
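Arithmetic on vectors works element by element, so the first entry of avg_grades is the average of the first entries of the three vectors, and so on. As an alternative sketch, the same averages can be obtained by binding the vectors into a matrix; the names `grades`, `avg_by_student` and `avg_by_subject` are assumptions introduced here for illustration.

```r
math <- c(60, 90, 34)
science <- c(56, 98, 76)
english <- c(34, 98, 22)

# One column per subject, one row per student
grades <- cbind(math, science, english)

avg_by_student <- rowMeans(grades)   # same values as (math + science + english) / 3
avg_by_subject <- colMeans(grades)   # average of each subject across students
```

`rowMeans` and `colMeans` are convenient once there are more than a few subjects, because the formula no longer has to list every vector by hand.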
Graphical summaries
- For a single categorical variable, we use bar plots and dot plots
- For a single numerical variable, we use histograms and boxplots
- For two numerical variables, we use scatterplots
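The plot types listed above can be sketched with base R graphics as follows. The data are made up for illustration: `major`, `height` and `weight` are hypothetical variables, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed so the made-up data are reproducible

# Hypothetical data, for illustration only
major  <- c("bio", "math", "bio", "stat", "math", "bio")  # categorical
height <- rnorm(50, mean = 170, sd = 10)                  # numerical
weight <- 0.5 * height + rnorm(50, sd = 5)                # numerical

# Single categorical variable: count each category first
major_counts <- table(major)
barplot(major_counts)                                             # bar plot
dotchart(as.numeric(major_counts), labels = names(major_counts))  # dot plot

# Single numerical variable
hist(height)      # histogram
boxplot(height)   # boxplot

# Two numerical variables
plot(height, weight)   # scatterplot
```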
Histogram
A histogram is a special kind of bar plot used to visualize the distribution of a numerical variable. When drawn on a density scale, the area of each bar equals the proportion of observations that fall in its interval. The height of a bar is therefore the density (the proportion divided by the interval width), and the total area of all bars is 1, that is, 100%.
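The density-scale property can be checked numerically. This is a minimal sketch, assuming simulated chi-squared data and an arbitrary seed; `h` and `bar_areas` are names introduced here for illustration.

```r
set.seed(42)                       # arbitrary seed, only for reproducibility
simdata <- rchisq(100, 8)

h <- hist(simdata, freq = FALSE)   # freq = FALSE draws densities instead of counts

# Area of each bar = height (density) times width of the interval
bar_areas <- h$density * diff(h$breaks)

bar_areas        # proportion of observations in each interval
sum(bar_areas)   # total area of all bars
```

Each entry of `bar_areas` matches the corresponding count divided by the number of observations, `h$counts / sum(h$counts)`.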
Type the following for help on histograms:
> ?hist
Example
> par(mfrow=c(2,2))
> simdata <- rchisq(100,8)
> hist(simdata)
> hist(simdata,breaks=2)
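When breaks is a single number, hist treats it only as a suggestion and picks "pretty" break points, so the number of bars may differ from what was requested. The sketch below fills the 2 x 2 grid with a few variations; the seed and the names `old_par`, `explicit_breaks` and `h_exact` are assumptions made for this example.

```r
set.seed(42)                      # arbitrary seed, only for reproducibility
simdata <- rchisq(100, 8)

old_par <- par(mfrow = c(2, 2))   # 2 x 2 grid of plots; remember the old settings

hist(simdata)                     # default choice of bins
hist(simdata, breaks = 2)         # a single number is only a suggestion
hist(simdata, breaks = 20)        # suggests finer bins

# A vector of break points is used exactly as given; it must span the data range
explicit_breaks <- seq(floor(min(simdata)), ceiling(max(simdata)), length.out = 11)
h_exact <- hist(simdata, breaks = explicit_breaks)

par(old_par)                      # restore the previous plot layout
```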
The mean is an appropriate summary for distributions that are fairly symmetric.
> mean(math)
[1] 61.33333
The median is the middle value of the sorted data: half of the values are greater than the median and the other half are smaller. The median is usually a more appropriate summary for skewed distributions.
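The difference between the two summaries shows up as soon as the data contain an extreme value. This is a small sketch; `incomes` is a made-up vector used only for illustration.

```r
math <- c(60, 90, 34)
mean(math)     # 61.33333
median(math)   # 60, the middle value of 34, 60, 90

# Hypothetical right-skewed data: one value is far larger than the rest
incomes <- c(20, 22, 25, 27, 30, 250)
mean(incomes)     # 62.33333, pulled upward by the single large value
median(incomes)   # 26, the average of the two middle values 25 and 27
```

Five of the six values in `incomes` lie below the mean, which is why the median describes a typical value better here.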