Overview
Definition
Measures of central tendency are single values that attempt to describe a data set by identifying the central or “typical” value for that data set.
Data distribution and measures of dispersion
Mean, Median, and Mode
Mean
Definition: The mean is the sum of all measurements in a data set divided by the number of measurements in that data set.
Equation:
$$ Mean = \frac{Sum\ of\ all\ values\ in\ the\ data\ set}{Total\ number\ of\ values\ in\ the\ data\ set} $$
$$ Mean = \frac{x_{1}+x_{2}+x_{3}+\dots+x_{n}}{n} $$
Example: Find the mean of the following data set: 1, 1, 1, 3, 5, 5, 7, 19.
Answer: There are 8 numbers in this data set. To calculate the mean, add up all the numbers and divide by 8:
$$ Mean = \frac{1+1+1+3+5+5+7+19}{8}=\frac{42}{8}=5.25 $$
Median
Definition: After arranging the data from lowest to highest, the median is the middle value, separating the lower half from the upper half of the data set.
Equation: To find the median, arrange the values from lowest to highest, then use the following equation to determine which “position” in that order represents the median:
$$ Median\ position = \frac{n+1}{2} $$
where n = the number of values in the data set.
Example: Find the median of the following data set: 1, 5, 1, 19, 3, 1, 7, 5.
Answer: There are 8 numbers in this data set. To find the median, first arrange the numbers in order: 1, 1, 1, 3, 5, 5, 7, 19. Next, determine which “position” represents the median using the formula (n + 1) / 2. Here n = 8, so the position is (8 + 1) / 2 = 4.5. Because 4.5 is not a whole number, the median lies halfway between the 4th and 5th numbers, which are 3 and 5. The median is their mean: (3 + 5) / 2 = 4.
Mode
Definition: The mode is the value that occurs most frequently in the data set.
Example: Find the mode of the following data set: 1, 5, 1, 19, 3, 1, 7, 5.
Answer: Identify the number that appears most often. This can be done by setting up a frequency table:
Table: Frequency table
Value    Frequency
1        3
3        1
5        2
7        1
19       1
The value 1 appears most often (3 times), so the mode is 1.
Mnemonic: MOde is the value that is in the set MOst often.
Summary
Table: Summary of the mean, median, and mode
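The worked examples for the mean, median, and mode can be checked with a short script. The following is a minimal sketch using Python's standard `statistics` module on the same data set; the helper name `median_by_position` is hypothetical and simply follows the (n + 1) / 2 position rule from the median example.

```python
import statistics

# Data set from the worked examples (unsorted on purpose).
data = [1, 5, 1, 19, 3, 1, 7, 5]


def median_by_position(values):
    """Median via the (n + 1) / 2 position rule (hypothetical helper)."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position; ends in .5 when n is even
    if position.is_integer():
        return ordered[int(position) - 1]
    # Position falls between two values: average the two neighbors.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


mean_value = sum(data) / len(data)        # sum of values / number of values
median_value = median_by_position(data)   # middle of the sorted data
mode_value = statistics.mode(data)        # most frequent value

print(mean_value, median_value, mode_value)
```

The library functions `statistics.mean` and `statistics.median` should give the same mean and median as the hand calculation; the manual helper is shown only to mirror the position rule.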
Measures of Dispersion: Percentiles and Standard Deviations
Dispersion describes how widely the values in a data set are spread. Measures of dispersion include the range, quantiles (e.g., quartiles or percentiles), and the standard deviation.
Based on quantiles
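As a rough illustration of quantile-based measures, the sketch below computes the range, the quartiles, and the interquartile range (third quartile minus first quartile). Reusing the data set from the mean and median examples is an assumption made here for continuity. The sketch relies on Python's `statistics.quantiles` with its default interpolation method; other tools use different conventions, so the first and third quartiles may differ slightly between packages.

```python
import statistics

# Assumed data: the same set used in the mean/median examples.
data = [1, 1, 1, 3, 5, 5, 7, 19]

# Range: distance between the largest and smallest values.
data_range = max(data) - min(data)

# Quartiles: the three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1

print("range:", data_range)
print("quartiles:", q1, q2, q3)
print("IQR:", iqr)
```

The second quartile is the median, so it should match the value found in the median example regardless of the quartile convention used.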
Graphical depiction of quartiles, important percentiles, and the interquartile range
Image by Lecturio. License: CC BY-NC-SA 4.0
Based on the mean: standard deviations
Definition: The standard deviation (SD) is a measure of how far each observed value is from the mean in a data set.
Demonstration of the percentages associated with each standard deviation away from the mean:
Equation: Mathematically, the SD can be calculated using the following equation:
$$ \sigma = \sqrt{\frac{\sum (x_{i}-\mu )^{2}}{N}} $$
σ = population standard deviation
x_i = each individual value in the population
μ = population mean
N = number of values in the population
Calculations (using the equation):
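One way to follow the equation step by step is sketched below. The function name `population_sd` is hypothetical, and applying it to the data set from the mean example is an assumption made here for illustration; the source does not give a worked SD example for these numbers.

```python
import math
import statistics


def population_sd(values):
    """Population SD: square root of the mean squared deviation from the mean."""
    n = len(values)
    mu = sum(values) / n                                   # population mean
    squared_deviations = [(x - mu) ** 2 for x in values]   # (x_i - mu)^2
    variance = sum(squared_deviations) / n                 # divide by N
    return math.sqrt(variance)


# Assumed data: the same set used in the mean example (mean = 5.25).
data = [1, 1, 1, 3, 5, 5, 7, 19]
sd = population_sd(data)

print(round(sd, 2))
print(round(statistics.pstdev(data), 2))  # library equivalent, for comparison
```

Note that `statistics.pstdev` divides by N as in the equation, whereas `statistics.stdev` divides by N - 1 and is intended for samples rather than whole populations.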
Data Distribution
Data distribution describes how your data cluster (or don’t cluster). Data tend to cluster in certain patterns, known as distribution patterns. There is a “normal” distribution pattern, and there are multiple nonnormal patterns. Different statistical tests are used for different distribution patterns.
Data distribution
Normal distributions differ according to their mean and variance, but share the following characteristics:
Nonnormal distribution
Many processes follow a nonnormal distribution, which can be due to natural variation or to errors in the data.
Common distributions:
Types of nonnormal distributions
Image by Lecturio. License: CC BY-NC-SA 4.0
Reasons why data may have a nonnormal distribution:
Which measure of central tendency separates the higher half from the lower half of the sample data?
The median: the numerical value separating the higher half of a sample, a population, or a probability distribution from the lower half.
Which of the following is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution?
The median is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. You can think of the median as "the middle" value in a set of numbers, where the middle is determined by counting the values in order rather than by taking the midpoint of their numeric range.
Which measure of central tendency splits the data in half with half the scores being higher than this number and half the scores being lower than this number?
The median is the middlemost number. In other words, it's the number that divides the distribution exactly in half such that half the cases are above the median, and half are below.
Which measure of central tendency refers to the middle value that splits the dataset in half?
The median is the middle value. It is the value that splits the dataset in half, making it a natural measure of central tendency. To find the median, order your data from smallest to largest, and then find the data point that has an equal number of values above it and below it.