Ultimate Statistics Glossary

-A-

Aspect Ratio – Ratio of the longer dimension of a shape to its shorter dimension.

Average – A single value that summarizes the central tendency of a set of values that are not all equal; most often the arithmetic mean.

-B-

Beta Distribution – Family of continuous probability distributions defined on the interval [0, 1] and controlled by two positive shape parameters.

Beta Risk – Risk of concluding that there is no difference in a given procedure when a difference really exists; the probability of a Type II error.

Box-and-Whisker Plot – Simple way of graphing a set of numerical data through its five-number summary: the smallest value, first quartile, median, third quartile, and largest value.
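
The number summary behind a box-and-whisker plot can be sketched in Python as follows. The function name and the sample data are illustrative, and the "inclusive" quartile method is only one of several conventions in use.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # method="inclusive" is an assumption: it treats the data as the whole
    # population, so the quartiles stay inside the observed range.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]

data = [1, 3, 5, 7, 9, 11, 13, 15, 17]  # made-up example values
print(five_number_summary(data))
```

The box would span Q1 to Q3 with a line at the median, and the whiskers would reach toward the minimum and maximum.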

-C-

Central Limit Theorem – States that, under general conditions, the sum of a large number of independent observations has an approximately normal distribution.
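
A small simulation can illustrate the theorem. The sketch below assumes a uniform(0, 1) source, 30 observations per sample, and a fixed seed; all of these choices, and the function name, are illustrative only.

```python
import random
import statistics

def simulate_sample_means(n_samples=2000, sample_size=30, seed=0):
    """Return the means of many samples drawn from a uniform(0, 1) source."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.random() for _ in range(sample_size)) / sample_size
        for _ in range(n_samples)
    ]

means = simulate_sample_means()
center = statistics.mean(means)
spread = statistics.stdev(means)
# Share of sample means within one standard deviation of the center;
# a normal distribution would put roughly 68% of them there.
share = sum(abs(m - center) <= spread for m in means) / len(means)
print(round(center, 3), round(spread, 3), round(share, 3))
```

Although the source values are uniform rather than normal, the sample means should cluster around 0.5 in a roughly bell-shaped pattern.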

Chi-Square Distribution – Distribution of a sum of squared independent standard normal variables, used directly or indirectly to test for significance among proportions.

Chi-Square Test – Used to compare observed data with the data expected under a hypothesis.
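
The test statistic itself is simple to compute. The following is a minimal sketch with made-up counts; turning the statistic into a p-value also requires the chi-square distribution, which is not shown here.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 22, 20]  # hypothetical counts seen in three categories
expected = [20, 20, 20]  # counts expected if all categories are equally likely
stat = chi_square_statistic(observed, expected)
print(stat)
```

A larger statistic means the observed counts sit further from what the hypothesis predicts.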

Confidence Interval – Range of values, calculated from a sample data set, that is expected to contain an unknown population parameter with a stated level of confidence.
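
As a rough illustration, an interval for a population mean might be computed as below. The sketch assumes the normal approximation (for small samples a t-based interval is usually preferred), and the function name and data are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(len(sample))
    # z is the standard normal quantile that leaves (1 - confidence) / 2
    # in each tail, about 1.96 for a 95% interval.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err

sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 5.1, 4.9]  # made-up measurements
low, high = mean_confidence_interval(sample)
print(round(low, 3), round(high, 3))
```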

Contingency Coefficient – Measure of association between two categorical variables, computed from the chi-square statistic of their contingency table.

Correlation – Degree of connection between two different variables, usually described by a single numerical value.

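Correlation Coefficient – Measure between -1 and 1 of the strength and direction of the linear relationship between two variables, for example how closely a trend in predicted values matches the trend in actual past values.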

Covariance – Measure of the degree to which two variables change together.
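
Covariance and the correlation coefficient are closely related, as this sketch suggests. It uses the sample (n - 1) form of covariance and the Pearson coefficient; the function names and data are illustrative.

```python
import math

def covariance(xs, ys):
    """Sample covariance: average product of deviations, using n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance rescaled to lie between -1 and 1."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]  # exactly twice xs, so the relationship is perfectly linear
print(covariance(xs, ys), correlation(xs, ys))
```

Covariance depends on the units of the two variables, while the correlation coefficient does not.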

-D-

Degrees of Freedom – Number of values that are free to vary in the final calculation of a statistic.

Dependent Variable – Variable that is measured during an experiment and is expected to change in response to the independent variable.

Distribution – Describes the possible values of a variable and how often, or with what probability, each one occurs.

-E-

Exponential Distribution – Class of probability distributions used to describe times between Poisson process events.

Extreme Value Distribution – Limiting distribution of the minimum or maximum of a large number of independent, identically distributed variables.

-F-

Factor – Controlled independent variable in an experiment whose levels are set by whoever is arranging the experiment.

Frequency – The number of times an event occurs during an experiment.

-G-

Gamma Distribution – Two-parameter family of continuous probability distributions.

Geometric Distribution – Distribution of the number of trials needed to obtain the first success in a series of independent Bernoulli trials.
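
A sketch of the geometric probabilities, where p is the chance on each independent trial that the event of interest occurs. The function name and the value p = 0.25 are illustrative.

```python
def geometric_pmf(k, p):
    """Probability that the event first occurs on trial k (k = 1, 2, ...)."""
    if k < 1 or not 0 < p <= 1:
        raise ValueError("need k >= 1 and 0 < p <= 1")
    # The first k - 1 trials must all miss, then trial k must hit.
    return (1 - p) ** (k - 1) * p

p = 0.25
probabilities = [geometric_pmf(k, p) for k in range(1, 201)]
# Truncated at 200 trials; the remaining tail is negligible for p = 0.25.
expected_trials = sum(k * q for k, q in enumerate(probabilities, start=1))
print(round(sum(probabilities), 6), round(expected_trials, 6))
```

The expected number of trials comes out close to 1 / p.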

Geometric Mean – Measure of the center of a set of data that uses multiplication rather than addition: the nth root of the product of n positive values.
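
One way to compute a geometric mean is sketched below. It averages logarithms instead of multiplying directly, which gives the same result for positive values while avoiding very large products; the function name is illustrative.

```python
import math

def geometric_mean(values):
    """nth root of the product of n positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    # exp(mean of logs) equals the nth root of the product.
    return math.exp(sum(math.log(v) for v in values) / len(values))

print(geometric_mean([2, 8]))         # close to 4, since 2 * 8 = 16
print(geometric_mean([1, 10, 100]))   # close to 10
```

Python 3.8 and later also provide `statistics.geometric_mean` for the same purpose.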

-H-

Histogram – Display of frequencies as adjacent rectangles, one per interval of values, that summarizes how the data are distributed.

Hypothesis Tests – Procedures that use sample data to decide whether to reject a hypothesis, an assumption about a population parameter that may or may not be true.

-I-

Independent Variable – Variable presumed to determine the value of a dependent variable.

Intercept – Coordinate at which a line, curve, or surface crosses an axis.

Interquartile Range – Measure of statistical dispersion equal to the difference between the third and first quartiles.

-L-

Lambda – Symbol commonly used for a parameter of a distribution, such as the rate of a Poisson or exponential distribution.

Least Squares Means – Group means estimated from a model fitted by least squares, the method of fitting a curve to data points by minimizing the sum of squared residuals.

Leverage – Measure used to identify observations that may have a large effect on the results of a regression model.

Linear Model – Model in which the response is a linear function of the parameters; linear regression models are the most familiar subclass.

Logistic Distribution – Continuous probability distribution found in growth models and used in logistic regression.

Logistic Model – Used to predict the probability of an occurrence by fitting data to a logistic curve.
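
The logistic curve maps any real number to a value between 0 and 1, which is why it can be read as a probability. In this sketch the intercept and slope are placeholder values; in practice they would be estimated from data.

```python
import math

def logistic(x, intercept=0.0, slope=1.0):
    """Logistic curve: 1 / (1 + exp(-(intercept + slope * x)))."""
    return 1 / (1 + math.exp(-(intercept + slope * x)))

for x in (-4, -2, 0, 2, 4):
    # Predicted probability rises smoothly from near 0 to near 1.
    print(x, round(logistic(x), 3))
```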

-M-

Mean – Average of a set of values, found by adding the numbers and then dividing by how many numbers are in the set.

Median – Middle value in a data set, found by sorting the numbers into increasing order; with an even count, the average of the two middle values.

Mode – Value that occurs most frequently in a set of numbers; if all the values are different, there is no mode.
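
The mean, median and mode can be compared on one small data set using Python's standard `statistics` module. The helper `mode_or_none` is a hypothetical name; it follows this glossary's convention of reporting no mode when no value stands out.

```python
import statistics

def mode_or_none(values):
    """Return the list of most frequent values, or None if there is no mode."""
    modes = statistics.multimode(values)
    # If every distinct value is tied for most frequent, nothing stands out.
    return None if len(modes) == len(set(values)) else modes

data = [2, 3, 3, 5, 7]
print(statistics.mean(data))    # sum 20 divided by 5 values
print(statistics.median(data))  # middle value of the sorted list
print(mode_or_none(data))       # 3 appears twice, more than any other value
print(mode_or_none([1, 2, 3]))  # all values differ
```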

Multiple Regression Model – Extension of a simple linear regression model to include multiple explanatory variables.

-N-

Negative Binomial Distribution – Probability distribution of the number of successes in a series of independent Bernoulli trials before a specified number of failures occurs.

Nonlinear Models – Models that are not linear in their parameters; they may contain logarithms, exponents and various other functions of an independent variable and the parameters.

Normal Distribution – Symmetric, bell-shaped probability distribution centered on the mean and fully described by its mean and standard deviation.
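
Python's standard library includes a `NormalDist` class that can be used to explore a normal distribution. The mean of 100 and standard deviation of 15 below are arbitrary example values.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)  # hypothetical mean and standard deviation

# Probability of a value falling within one standard deviation of the mean.
within_one_sd = dist.cdf(115) - dist.cdf(85)
# The curve is symmetric, so half of the probability lies below the mean.
below_mean = dist.cdf(100)
print(round(within_one_sd, 4), below_mean)
```

Roughly 68% of values fall within one standard deviation of the mean, whatever the mean and standard deviation are.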

-O-

Observation – A single recorded value or measurement in a data set; more generally, obtaining knowledge through the senses or instruments and recording the related data.

Ordinal – Level or scale of measurement in which values can be ranked in order but the distances between them are not necessarily meaningful.

Outlier – Observation that lies numerically far from the remaining data.
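
There is no single rule for deciding what counts as an outlier. The sketch below uses one common convention, flagging values more than 1.5 interquartile ranges outside the quartiles; the multiplier, the quartile method, and the function name are all assumptions made for illustration.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

data = [10, 12, 11, 13, 12, 11, 10, 12, 40]  # made-up readings
print(iqr_outliers(data))
```

Here only the value 40 falls outside the fences, so it is the only one flagged.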

-P-

P Value – Probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true.
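
As an illustration, the p value for a test statistic that follows a standard normal distribution under the null hypothesis (a z-test) could be computed as below. The two-sided form and the function name are assumptions of this sketch.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability of a standard normal value at least as far from 0 as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(round(two_sided_p_value(1.96), 3))  # close to the familiar 0.05 cutoff
print(two_sided_p_value(0.0))             # a statistic of 0 is not extreme at all
```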

Parameter – Quantity that describes an aspect of a population, such as its mean or variance.

Percentiles – Values of a variable below which a specific percentage of the observations fall.

Pooled Variance – Estimate of a common variance obtained by combining the variances of several samples or repeated tests.

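Population – In statistics, the entire group of individuals, items, or measurements about which conclusions are to be drawn; in biology, a group of organisms of the same species living in a relatively isolated group.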

Probability – Number between 0 and 1 that expresses how likely it is that a certain event will occur or has occurred.

-Q-

Quantiles – Points taken at regular intervals from the cumulative distribution function of a variable, dividing the data into groups of equal size.

Quartiles – The three values that divide an ordered set of data into four equal parts.

-R-

R-Squared – Measure of how well a regression line fits real data points; the proportion of the variation in the data that the model explains.
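
R-squared can be computed from the actual values and the values predicted by a model. This is a minimal sketch with made-up numbers; the function name is illustrative.

```python
def r_squared(actual, predicted):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((y - mean_actual) ** 2 for y in actual)
    ss_residual = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total

actual = [1, 2, 3, 4]
predicted = [1.5, 1.5, 3.5, 3.5]  # hypothetical model output
print(r_squared(actual, predicted))
```

A value of 1 means the predictions match the data exactly; values near 0 mean the model does little better than always predicting the mean.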

Random Sampling – Method of selecting a sample in which every member of the population has an equal chance of being chosen.

Range – The length of the smallest interval that contains all the given data; the largest value minus the smallest.

Regression Model – Used to predict one variable from one or more other variables.

Residual – Difference between an observed value and the value predicted by a model.

-S-

S-Curve Model – Model based on an S-shaped (sigmoid) curve, used to analyze the transition between low and high performance.

Sample – Subset of a population that is collected and analyzed in order to draw conclusions about the whole population.

Significance Level – Criterion used to decide whether the null hypothesis should be rejected; the probability of a Type I error that is accepted before the test.

Simple Regression – Least squares estimation with a single predictor variable.
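
A least squares line through a set of points can be sketched as follows, tying together the slope, intercept and residual entries of this glossary. The function name and data are illustrative.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the means
    return slope, intercept

xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]  # made-up points that lie exactly on y = 1 + 2x
slope, intercept = fit_line(xs, ys)
# Residual: observed value minus the value predicted by the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(slope, intercept, residuals)
```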

Skewness – Measure of the asymmetry of the probability distribution of a random variable.
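
Several formulas for skewness are in use. The sketch below assumes the simple moment-based version (third central moment divided by the cube of the population standard deviation), without any small-sample correction.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # population variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

print(skewness([1, 2, 3, 4, 5]))  # symmetric data
print(skewness([1, 1, 1, 10]))    # long tail to the right
```

Symmetric data give a skewness of zero, a long right tail gives a positive value, and a long left tail gives a negative one.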

Skychart – Representation of certain positions of celestial bodies in the sky.

Slope – Steepness or grade of a line: the vertical change divided by the horizontal change between two different points on it.

Standard Deviation – Square root of the variance of a statistical data set.

Standard Error – Standard deviation of the sampling distribution of a statistic, used to estimate how precisely the statistic measures a population value.
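
The most commonly quoted standard error is that of the sample mean, which can be sketched as the sample standard deviation divided by the square root of the sample size. The function name and data below are illustrative.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Sample standard deviation divided by the square root of n."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up measurements
print(round(standard_error_of_mean(sample), 4))
```

Larger samples give a smaller standard error, because the mean of a large sample varies less from sample to sample.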

Sum of Squares – Measure of variability equal to the sum of the squared deviations from the mean.

Symmetric Distribution – Distribution whose two sides mirror each other about the center, so that it has no skewness.

-T-

T-Test – Determines whether the means of two groups are statistically different from one another.
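
There are several forms of the t-test. As one example, the sketch below computes the t statistic for two independent groups without assuming equal variances (often called Welch's form); the function name and data are hypothetical, and converting the statistic to a p value requires the t distribution, which is not shown.

```python
import math
import statistics

def welch_t_statistic(a, b):
    """Difference in means divided by its estimated standard error."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))

group_a = [5, 6, 7, 8, 9]  # made-up scores
group_b = [1, 2, 3, 4, 5]
print(welch_t_statistic(group_a, group_b))
```

The further the statistic is from zero, the stronger the evidence that the two group means differ.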

Type I Error – Occurs when a null hypothesis is rejected when in fact it is true, so a difference is reported when there is no difference.

Type II Error – Occurs when a null hypothesis is not rejected when in fact it is false.

-U-

Uncertainty Coefficient – Measure of association between nominal variables.

Uniform Distribution – Distribution in which all intervals of the same length within its range are equally probable.

-V-

Variable – Quantity or characteristic that is capable of changing or taking different values.

Variance – Expected value of the squared deviation of a variable from its mean.
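
The variance, sum of squares and standard deviation entries are linked, as this sketch shows. The function names are illustrative; the `sample` flag switches between dividing by n - 1 (sample variance) and by n (population variance).

```python
import math

def sum_of_squares(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def variance(values, sample=True):
    """Sum of squares divided by n - 1 (sample) or n (population)."""
    divisor = len(values) - 1 if sample else len(values)
    return sum_of_squares(values) / divisor

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values with mean 5
print(sum_of_squares(data))
print(variance(data, sample=False))
print(math.sqrt(variance(data, sample=False)))  # standard deviation
```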