Given any data value, we can identify how far it lies from the mean simply by subtracting: x − μ. This value will be positive if the data value lies above (to the right of) the mean, and negative if it lies below (to the left of) the mean. But what we'd really like to know is: relative to the spread of our data set, how far is x from μ? Remember that the standard deviation σ gives us a measure of how spread out our entire set of individual data values is.
The z-score for any single data value can be found by the formula (in English):
z-score = (data value − mean) / (standard deviation)
or with symbols (as seen before!):
z = (x − μ) / σ
A z-score will be positive if the data value lies above (to the right of) the mean, and negative if the data value lies below (to the left of) the mean.
Example 6.1: Calculating and Graphing z-Values
Given a normal distribution with μ = 48 and σ = 5, convert an x-value of 45 to a z-value and indicate where this z-value would be on the standard normal distribution.
Solution
Begin by finding the z-score for x = 45 as follows.
z = (x − μ) / σ = (45 − 48) / 5 = −0.60
Now draw each of the distributions, marking a standard score of z = −0.60 on the standard normal distribution. The distribution on the left is a normal distribution with a mean of 48 and a standard deviation of 5. The distribution on the right is a standard normal distribution with a standard score of z = −0.60 indicated.
Z-scores measure the distance of any data point from the mean in units of standard deviations and are useful because they allow us to compare the relative positions of data values in different samples. In other words, the z-score allows us to standardize two or more normal distributions, or more appropriately, to put them on the same scale. Therefore, we'll be able to compare relative positions of data values within their own distributions to determine which data values are closer to or farther from the mean.
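The z-score formula is simple enough to check by hand, but a short sketch in Python may help make the arithmetic concrete. The function name z_score is an illustrative choice, not something from the text; the numbers are those of Example 6.1.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies above (+) or below (-) mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Example 6.1: mean 48, standard deviation 5, data value 45.
z = z_score(45, 48, 5)
print(round(z, 2))  # -0.6
```

A negative result means the data value lies to the left of the mean, matching the sign rule described above.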
A prime example of this is comparing the test scores of two students, one who scored a 28 on the ACT (scores range from 1 to 36) and another who scored a 1280 on the SAT (scores range from 400 to 1600). Who, relative to their associated exam, scored better?
Example
Suppose you are enrolled in three classes, statistics, biology, and kayaking, and you just took the first exam in each. You receive a grade of 82 on your statistics exam, where the mean grade was 74 and the standard deviation was 12. You receive a grade of 72 on your biology exam, where the mean grade was 65 and the standard deviation was 10. Finally, you receive a grade of 91 on your kayaking exam, where the mean grade was 88 and the standard deviation was 6. Although your highest test score was 91 (kayaking), in which class did you score the best, relative to the rest of the class? We can answer this using a z-score!
Statistics: z = (82 − 74) / 12 ≈ 0.67
Biology: z = (72 − 65) / 10 = 0.70
Kayaking: z = (91 − 88) / 6 = 0.50
Your statistics exam score was 0.67 standard deviations better than the class average; your biology score was 0.7 standard deviations better than the class average; your kayaking score was only 0.5 standard deviations better than the class average. Therefore, even though your actual score on the biology exam was the lowest of the three exam scores, relative to the distribution of all class exam scores, your biology exam score was the highest relative grade.
Finding an Area (Proportion) Given a Specific Z-Value
To determine the area under the N(0, 1) curve for any data value that does not fall exactly 1, 2, or 3 standard deviations above or below the mean actually requires some calculus. Lucky for us, areas under the N(0, 1) curve can be obtained in numerous other ways, including technology (TI-83/84, Excel) and a table of values. Search the Internet for "standard normal table" and you'll find hundreds of tables illustrating z-scores and their associated areas. The majority of these methods report the area to the left of the specified z-score z, no matter where it lies.
This comes from a calculus operation of integration, which finds an area from the start of a distribution (i.e., the far left tail) up to the z-score. There are three types of area calculations that you will be performing, each requiring slightly different work:
Finding a Z-Value Given an Area
This is a slightly more challenging task than calculating an area, because you basically work "backwards" from an algebraic standpoint. It's important to realize that a Standard Normal Table has two parts: (1) the top and side margins, which form the tenths and hundredths of a z-score, and (2) the body of the table, which contains the area (probability) values. Also, remember that the Standard Normal Table only provides us information on the area (probability) to the left of a z-score.
A small excerpt of Table B from Appendix A is shown below. Notice that the z-values given in the table are rounded to two decimal places. The first decimal place of each z-value is listed in the left column, with the second decimal place in the top row. Where the appropriate row and column intersect, we find the amount of area under the standard normal curve to the left of that particular z-value.
Example: Finding Area to the Left of a Positive z-Value Using a Cumulative Normal Table
Find the area under the standard normal curve to the left of z = 1.37.
Solution
To read the table, we must break the given z-value (1.37) into two parts: one containing the first decimal place (1.3) and the other containing the second decimal place (0.07). So, in Table B from Appendix A, look across the row labeled 1.3 and down the column labeled 0.07. The row and column intersect at 0.9147. Thus, the area under the standard normal curve to the left of z = 1.37 is 0.9147.
Using a TI-83/84 Plus calculator, we can find a value of the area to the left of a z-score. To obtain the solution using a TI-83/84 Plus calculator, perform the following steps.
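Apart from the table and the calculator, the same left-tail area can be cross-checked in software. The following is a minimal sketch, assuming Python 3.8 or later, whose standard library provides statistics.NormalDist; the variable names are illustrative only.

```python
from statistics import NormalDist

# The standard normal distribution N(0, 1).
standard_normal = NormalDist(mu=0, sigma=1)

# cdf(z) returns the area under the curve to the LEFT of z,
# the same quantity a cumulative normal table reports.
area_left = standard_normal.cdf(1.37)
print(round(area_left, 4))  # 0.9147

# The area to the right follows by subtracting from 1.
area_right = 1 - area_left
```

Like the table, cdf always reports the area to the left, so right-tail and "between" areas still require a subtraction.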
If we are given an area (or probability) value, we need to first locate it in the body of a table, then track our way up and to the left in order to piece together the z-score that relates to the specified area. Keep in mind that you may not find the exact area value in the body of the table, so just use the closest value you can find, and then identify the proper z-score. One calculation that will be used frequently in the coming chapters is to identify the two z-scores that separate a specific area in the middle of the standard normal distribution.
Example
Suppose we want to know which two z-scores separate out the middle 95% of the data. From the empirical rule, we already know the z-scores that do this are approximately ±2 (2 standard deviations on either side of the mean). In reality, it's not exactly ±2, but that is close enough for rough calculations. To find the exact two z-scores, we use the following logic: If the middle portion is 95% = 0.95, then how much area lies outside of the middle (to the left and right)? A simple subtraction solves this! 1 − 0.95 = 0.05. The "outside" area, 0.05, must be split equally between the two tails (because of symmetry!). Therefore, dividing 0.05 by two gives us an area of 0.025 in each tail. Using a standard normal table "backwards," we first look through the body of the table to find an area closest to 0.025. The z-score corresponding to a left-tail area of 0.025 is z = −1.96. Therefore, the upper z-score will be z = 1.96, by the symmetry property of the standard normal distribution. You could also discover the upper z-score by looking up the area/probability value 0.025 + 0.95 = 0.975 in the body of the table and finding the associated z-value. By the end of the class, you will be extremely familiar with the z-scores that define a central 90% (z = ±1.645), 95% (z = ±1.96), and 99% (z = ±2.576).
Example: Find and interpret the probability of a random Normal variable
Suppose you just purchased a 2005 Honda Insight with automatic transmission.
Using www.fueleconomy.gov, you determine that 2005 Honda Insights have a mean highway gas mileage of 56 miles per gallon with a standard deviation of 3.2. The distribution of this data is bell-shaped and normal. You want to know the following:
a) How likely is it that your Honda Insight with automatic transmission will get better than 60 miles per gallon on the highway?
b) How likely is it that your Honda Insight with automatic transmission will get less than 50 miles per gallon on the highway?
c) How likely is it that your Honda Insight with automatic transmission will get between 58 and 62 miles per gallon on the highway?
Solution
This problem deals with data that is normally distributed with mean 56 and standard deviation 3.2, i.e., N(56, 3.2).
(a) In symbols, we are asked to calculate P(X > 60). Sketching a normal curve and shading the area corresponding to greater than 60 gives us the graph shown. In order to calculate the appropriate area in the upper (right) tail, we must first convert our data to the standard normal distribution. The z-score for x = 60 is:
z = (60 − 56) / 3.2 = 1.25
This means that 60 is 1.25 standard deviations above the mean. Notice how lining the two normal curves up as shown illustrates how the two areas are the same: P(X > 60) = P(Z > 1.25). Using z = 1.25, we go to Table IV to find that the area to the left of z = 1.25 is 0.8943. Since we need the area to the right, we simply take 1 − 0.8943 = 0.1057. (With technology, normcdf(1.25,1E99,0,1) returns this right-tail area directly, up to rounding.) Therefore, P(X > 60) = 0.1057 = 10.57%. There are a couple ways to interpret this answer:
(b) In symbols, we are asked to calculate P(X < 50). Sketching a normal curve N(56, 3.2) and shading the area corresponding to less than 50 gives us the graph shown to the right. In order to calculate the appropriate area in the lower (left) tail, we must first convert our data to the standard normal distribution. The z-score for x = 50 is:
z = (50 − 56) / 3.2 = −1.875 ≈ −1.88
Thus, the value 50 MPG is 1.88 standard deviations below the mean. In symbols we see: P(X < 50) = P(Z < −1.88). Using z = −1.88, we go to Table IV (or use normcdf(−1E99,−1.88,0,1)) to find that the area to the left of z = −1.88 is 0.0301. Therefore, P(X < 50) = 0.0301 = 3.01%. There are a couple ways to interpret this answer:
(c) In symbols, we are asked to calculate P(58 < X < 62). Sketching a normal curve N(56, 3.2) and shading the area corresponding to greater than 58 but less than 62 gives us the graph shown. In order to calculate the appropriate area, we must first convert both data values to the standard normal distribution. The z-score for x = 58 is:
z = (58 − 56) / 3.2 = 0.625 ≈ 0.63
and the z-score for x = 62 is:
z = (62 − 56) / 3.2 = 1.875 ≈ 1.88
In terms of probability, we can now say: P(58 < X < 62) = P(0.63 < Z < 1.88). Using z = 1.88, we go to Table IV (or use technology) to find that the area to the left of z = 1.88 is 0.9699. Now, we need to remove (subtract) the area left of z = 0.63, which is 0.7357. Therefore, P(58 < X < 62) = 0.9699 − 0.7357 = 0.2342, or 23.42%. There are a couple ways to interpret this answer:
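The three probabilities in parts (a) through (c) can also be cross-checked in software. The sketch below assumes Python 3.8 or later and its standard-library statistics.NormalDist; the variable names are illustrative. Because it works with the unrounded data values rather than z-scores rounded to two decimal places, its results differ slightly from the table-based answers (for part (c), roughly 0.236 rather than 0.2342).

```python
from statistics import NormalDist

# Highway mileage model from the example: mean 56 mpg, standard deviation 3.2.
mpg = NormalDist(mu=56, sigma=3.2)

# (a) P(X > 60): cdf gives the area to the left, so subtract from 1.
p_above_60 = 1 - mpg.cdf(60)

# (b) P(X < 50): a left-tail area, read directly from cdf.
p_below_50 = mpg.cdf(50)

# (c) P(58 < X < 62): area left of 62 minus area left of 58.
p_58_to_62 = mpg.cdf(62) - mpg.cdf(58)

print(round(p_above_60, 4), round(p_below_50, 4), round(p_58_to_62, 4))
```

Each line mirrors one of the three kinds of area calculation: right tail, left tail, and the area between two values.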
The calculation in part (c) can be done with either normcdf(0.63,1.88,0,1) or normcdf(58,62,56,3.2); the two results agree up to the rounding of the z-scores.
Find the Value of a Random Variable Knowing a Probability Value
In these types of problems, we need to work "backwards." Starting with a specified probability, we find the corresponding z-score, then work our way back to the random variable. The tables of standard normal values are not a "one-way" tool! What do we mean by that? So far you've started with a value for a random variable (like a gas mileage value in the previous problem), turned it into a z-score, and then looked up the associated probability value for that z-score. We can use this table to work backwards! We can start with a known probability value in the body of a table, identify the z-score corresponding to that area by moving our fingers to the associated row and column, then reverse the algebraic transformation from a z-score to a random variable. If this sounds confusing, think back to the steps we took in the preceding example: If, however, we are given an area/probability, then to work our way back to the original data value, we must first identify the appropriate z-score, and then "unstandardize" the z-score to arrive (finally!) back at the data value. How do we algebraically "undo" the z-score? Easy: just solve for the data value X:
(X − μ) / σ = Z
Multiply both sides by σ to remove it from the denominator on the left side:
X − μ = Z⋅σ
Finally, add the value of μ to both sides to isolate the value of the random variable X:
X = Z⋅σ + μ
Example: Finding the value of a normal random variable
Suppose instead you want to know the gas mileage that corresponds to a particular probability. Find the gas mileage at which your 2005 Honda Insight gets better gas mileage than 97% of all other 2005 Honda Insights with automatic transmission.
Solution
This problem again deals with data that is normally distributed with mean 56 and standard deviation 3.2, i.e., N(56, 3.2).
To find the 97th percentile gas mileage, we need to find the specific miles per gallon X that separates the bottom 97% of all gas mileages from the top 3%. So for this problem we are given a percentage/area. Sketching the normal curve gives the graph shown. Using Table IV, we look for 0.97 in the body of the table, and then identify the z-score of 1.88. Notice that the exact area 0.97 is not in the table, but the closest area of 0.9699 has the z-score of 1.88. Now we unstandardize the z-score of 1.88. In English this means we need to identify the specific gas mileage that is 1.88 standard deviations above the mean of 56. Solving for X in the Z transform gives:
X = Z⋅σ + μ = 1.88⋅3.2 + 56 = 62.016 ≈ 62
Therefore, if your 2005 Honda Insight with an automatic transmission gets 62 mpg, it gets better miles per gallon than 97% of all 2005 Honda Insight cars with an automatic transmission.
What is the standard deviation of the Z distribution?
The standard normal distribution, also called the z-distribution, is a special normal distribution where the mean is 0 and the standard deviation is 1.
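The "backwards" lookups described earlier (from an area to a z-score, then from a z-score back to a data value) can be sketched with the standard normal distribution just described. This is a minimal illustration, assuming Python 3.8 or later and its standard-library statistics.NormalDist; the helper name central_z is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def central_z(middle_area):
    """Return the positive z-score bounding the given central area."""
    tail = (1 - middle_area) / 2          # area left over in each tail
    return standard_normal.inv_cdf(1 - tail)


# Central 90%, 95%, and 99%: about 1.645, 1.96, and 2.576.
for area in (0.90, 0.95, 0.99):
    print(area, round(central_z(area), 3))

# 97th percentile of N(56, 3.2): find z, then unstandardize with X = Z*sigma + mu.
z97 = standard_normal.inv_cdf(0.97)
x97 = z97 * 3.2 + 56
print(round(x97, 1))  # 62.0
```

Here inv_cdf plays the role of reading the table from the body outward: it takes a left-tail area and returns the z-score, without the rounding to the nearest table entry.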
What is the meaning of Z in relationship with the mean and standard deviation and vice versa?
A z-score (z-value) is the number of standard deviations that a score or value x lies from the mean. It does not measure the dispersion of the data itself; rather, it uses the standard deviation as its unit of distance. Technically, a z-score tells how many standard deviations below or above the population mean (μ) a value x is.
Why is the mean 0 and the standard deviation 1 for z?
We might have to do a little math to convert our data from one unit of measurement to another, but the thing we are measuring remains unchanged. When we convert our data into z-scores, the mean will always end up being zero (it is, after all, zero steps away from itself) and the standard deviation will always be one.
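The claim that standardized data always has mean 0 and standard deviation 1 can be checked numerically. The sketch below uses a small made-up list of scores (not data from the text) and the population standard deviation from Python's standard-library statistics module.

```python
from statistics import mean, pstdev

# Hypothetical exam scores, invented purely for illustration.
scores = [62, 70, 74, 81, 88]

mu = mean(scores)
sigma = pstdev(scores)  # population standard deviation

# Convert every score to a z-score.
z_scores = [(x - mu) / sigma for x in scores]

# Up to floating-point error, the z-scores have mean 0 and standard deviation 1.
z_mean = mean(z_scores)
z_sd = pstdev(z_scores)
```

Subtracting the mean shifts the center to 0, and dividing by the standard deviation rescales the spread to 1, whatever the original units were.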
What does a standard Z value mean?
Simply put, a z-score (also called a standard score) gives you an idea of how far from the mean a data point is. But more technically, it's a measure of how many standard deviations below or above the population mean a raw score is. A z-score can be placed on a normal distribution curve.
