Statistics with R

Descriptive Statistics
Descriptive statistics is used to summarize a collection of data in a clear and understandable manner. Measurements from an experiment can be summarized numerically or graphically. The numerical approach computes quantities such as the mean and standard deviation. The graphical approach uses displays such as box plots and stem-and-leaf displays. The numerical approach is generally more objective and precise, while the graphical approach is more useful for identifying patterns in the data.

Descriptive statistics is a way of looking at the data before any formal analysis.
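
As a minimal sketch of the numerical and graphical summaries described above, the following R code uses only base functions. The vector `times` and its values are made up for illustration and do not come from any real experiment.

```r
# Hypothetical 100 m times in seconds (made-up values for illustration)
times <- c(11.2, 12.5, 10.9, 13.1, 12.0, 11.7, 12.8, 11.4)

# Numerical summaries
mean_time <- mean(times)   # arithmetic mean
sd_time   <- sd(times)     # sample standard deviation (n - 1 denominator)
summary(times)             # minimum, quartiles, median, mean, maximum

# Graphical summaries
boxplot(times, horizontal = TRUE, xlab = "Time (s)")  # box plot
stem(times)                # stem-and-leaf display printed to the console
```

The numerical results are single values that can be compared directly, while the box plot and stem-and-leaf display show the shape of the data at a glance.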

Inferential Statistics
Inferential statistics is used to draw inferences about a population from a sample. Statistical inferences can be made either by estimation or by hypothesis testing. In estimation, the sample is used to estimate a population parameter and to build a confidence interval around that estimate. In hypothesis testing, we are interested in finding whether the sample gives enough evidence to reject a null hypothesis.
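
Both routes to inference can be sketched with a single call to `t.test()` from base R. The data, the null value of 12 seconds, and the 95% confidence level are assumptions chosen only for illustration. A one-sample t-test is just one of many possible tests.

```r
# Hypothetical sample of 100 m times in seconds (made-up values)
times <- c(11.2, 12.5, 10.9, 13.1, 12.0, 11.7, 12.8, 11.4)

# One-sample t-test of the null hypothesis: population mean = 12 seconds
result <- t.test(times, mu = 12, conf.level = 0.95)

result$estimate   # estimation: point estimate of the population mean
result$conf.int   # estimation: 95% confidence interval for the mean
result$p.value    # testing: reject the null at the 5% level if below 0.05
```

The same object holds the estimate, the confidence interval, and the p-value, which shows how closely estimation and hypothesis testing are related.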

Variable
Variables are characteristics or attributes that can vary across individuals, for example age, height, and gender.

Experimental unit
Experimental units are the objects or individuals on which a variable is measured. Suppose we measure the time it took each runner in a group to run 100 meters. The runners are the experimental units, and the variable is the time each runner took to complete the 100-meter run.

Univariate, bivariate, and multivariate data
In the runners example, we are only measuring the time. This is an example of univariate data, since a single variable is measured on each experimental unit. If we measure the time and height of each runner, it is bivariate data, since we are measuring two variables per experimental unit. Multivariate data has more than two variables per experimental unit.
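
In R, this distinction corresponds to the number of columns kept for each row of a data frame. The sketch below assumes a hypothetical `runners` data frame with made-up values. The column names are illustrative only.

```r
# Hypothetical runners (made-up values); each row is one experimental unit
runners <- data.frame(
  time_s    = c(11.2, 12.5, 10.9, 13.1),  # 100 m time in seconds
  height_cm = c(178, 182, 171, 190),      # height in centimeters
  age       = c(24, 31, 27, 22)           # age in years
)

univariate   <- runners["time_s"]                  # one variable per runner
bivariate    <- runners[c("time_s", "height_cm")]  # two variables per runner
multivariate <- runners                            # three variables per runner

ncol(univariate); ncol(bivariate); ncol(multivariate)

# A summary that only makes sense for two variables measured together
cor(runners$time_s, runners$height_cm)
```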

Population and sample
A population is the set of all measurements of an entire group. A sample is a subset of these measurements.
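
The relationship between a population and a sample can be sketched with `sample()`. The population here is simulated rather than observed: the 1000 values, their mean and standard deviation, and the sample size of 30 are all assumptions made for illustration.

```r
set.seed(42)  # fixed seed so the illustration is reproducible

# Hypothetical population: simulated 100 m times for 1000 runners
population <- rnorm(1000, mean = 12, sd = 0.8)

# A simple random sample of 30 measurements, drawn without replacement
smpl <- sample(population, size = 30)

mean(population)  # computed from every measurement in the population
mean(smpl)        # computed from the subset only
```

The two means will usually differ slightly, because the sample contains only part of the population.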

Categorical or qualitative:
These variables are measured on a nominal scale. They have category names but no ordering, e.g. black bear, polar bear, grizzly bear. They are typically summarized by the frequency of each category.
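
A nominal variable is usually stored in R as a factor and summarized by frequency. The following sketch uses made-up bear observations. The vector `bears` is hypothetical.

```r
# Hypothetical observations of bear species (made-up values)
bears <- factor(c("black", "polar", "grizzly", "black", "black", "polar"))

levels(bears)                # category names; their order carries no meaning
counts <- table(bears)       # frequency of each category
prop.table(counts)           # relative frequency of each category
barplot(counts, ylab = "Count")  # bar chart of the frequencies
```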

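Numerical or quantitative:
These variables are measured on an ordinal (e.g. good, better, best), interval, or ratio scale, although ordinal variables are often treated as ordered categories instead. They are typically summarized by measures of center and spread. Numerical variables can be either discrete (countable values, such as the count of something) or continuous (any value within a range, such as speed). For contrast, favorite football player, favorite singer, and favorite color are qualitative variables, whereas speed and the count of something are quantitative variables. A variable can also be classified as independent (explanatory) or dependent (response).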
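
The sketch below shows one way these ideas might look in R: a discrete count, a continuous measurement summarized by center and spread, and an ordinal scale stored as an ordered factor. All values and variable names are made up for illustration.

```r
# Hypothetical data (made-up values for illustration)
goals <- c(0L, 2L, 1L, 3L, 1L, 0L)              # discrete: integer counts
speed <- c(8.93, 8.00, 9.17, 7.63, 8.33, 8.55)  # continuous: speed in m/s

# Center
mean(speed); median(speed)
# Spread
sd(speed); IQR(speed); range(speed)

# An ordinal scale stored as an ordered factor
rating <- factor(c("good", "best", "better", "good"),
                 levels = c("good", "better", "best"), ordered = TRUE)
rating[1] < rating[2]   # comparisons respect the declared level order

table(goals)            # discrete values can also be tabulated
```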