Solutions Statistics for Business and Economics 10 Ed. Anderson. Chapter 16

16.1 Consider the following data for two variables, x and y....
a. Develop an estimated regression equation for the data of the form ŷ = b0 + b1x.
b. Using the results from part (a), test for a significant relationship between x and y; use α = .05.
c. Develop a scatter diagram for the data. Does the scatter diagram suggest an estimated regression equation of the form ŷ = b0 + b1x + b2x²? Explain.
d. Develop an estimated regression equation for the data of the form ŷ = b0 + b1x + b2x².
e. Refer to part (d). Is the relationship between x, x², and y significant? Use α = .05.

f. Predict the value of y when x = 25.
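
The second-order fit in parts (d) and (f) can be sketched in a few lines. This is a minimal illustration, not the textbook's solution: the exercise's data table is not reproduced in this listing, so the data below are hypothetical, and the helper names `fit_quadratic` and `predict` are made up for the example. The fit solves the normal equations by Gauss-Jordan elimination.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2 via the normal equations."""
    # Power sums of x (orders 0..4) and cross sums with y (orders 0..2).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix for (X'X) b = X'y.
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def predict(coeffs, x):
    """Evaluate the fitted equation at x."""
    b0, b1, b2 = coeffs
    return b0 + b1 * x + b2 * x * x


# Hypothetical data (NOT the exercise's table): generated from y = 1 + 2x + 0.5x^2.
xs = [0, 1, 2, 3, 4, 5]
ys = [1 + 2 * x + 0.5 * x * x for x in xs]
coeffs = fit_quadratic(xs, ys)
print("b0, b1, b2 =", [round(b, 6) for b in coeffs])
print("prediction at x = 25:", round(predict(coeffs, 25), 6))
```

With the exercise's actual data substituted for `xs` and `ys`, the same call to `predict(coeffs, 25)` would give the part (f) prediction.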

16.2 Consider the following data for two variables, x and y....
a. Develop an estimated regression equation for the data of the form ŷ = b0 + b1x. Comment on the adequacy of this equation for predicting y.
b. Develop an estimated regression equation for the data of the form ŷ = b0 + b1x + b2x². Comment on the adequacy of this equation for predicting y.
c. Predict the value of y when x = 20.

16.3 Consider the following data for two variables, x and y....
a. Does there appear to be a linear relationship between x and y? Explain.
b. Develop the estimated regression equation relating x and y.
c. Plot the standardized residuals versus ŷ for the estimated regression equation developed in part (b). Do the model assumptions appear to be satisfied? Explain.
d. Perform a logarithmic transformation on the dependent variable y. Develop an estimated regression equation using the transformed dependent variable. Do the model assumptions appear to be satisfied by using the transformed dependent variable? Does a reciprocal transformation work better in this case? Explain.

16.4 A highway department is studying the relationship between traffic flow and speed. The following model has been hypothesized....Where...The following data were collected during rush hour for six highways leading out of the city.
a. Develop an estimated regression equation for the data.
b. Using α = .01, test for a significant relationship.

16.5 In working further with the problem of exercise 4, statisticians suggested the use of the following curvilinear estimated regression equation....
a. Use the data of exercise 4 to compute the coefficients of this estimated regression equation.
b. Using α = .01, test for a significant relationship.
c. Estimate the traffic flow in vehicles per hour at a speed of 38 miles per hour.

16.6 A study of emergency service facilities investigated the relationship between the number of facilities and the average distance traveled to provide the emergency service. The following table gives the data collected....
a. Develop a scatter diagram for these data, treating average distance traveled as the dependent variable.
b. Does a simple linear model appear to be appropriate? Explain.
c. Develop an estimated regression equation for the data that you believe will best explain the relationship between these two variables.

16.7

16.8 Corvette, Ferrari, and Jaguar produced a variety of classic cars that continue to increase in value. The following data, based upon the Martin Rating System for Collectible Cars, show the rarity rating (1–20) and the high price ($1000) for 15 classic cars (http://www.businessweek.com, February 2006)....
a. Develop a scatter diagram of the data using the rarity rating as the independent variable and price as the dependent variable. Does a simple linear regression model appear to be appropriate?
b. Develop an estimated multiple regression equation with x = rarity rating and x² as the two independent variables.
c. Consider the nonlinear relationship shown by equation (16.7). Use logarithms to develop an estimated regression equation for this model.
d. Do you prefer the estimated regression equation developed in part (b) or part (c)? Explain.
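
Part (c) can be sketched as follows. Equation (16.7) is not reproduced in this listing; the sketch assumes it is the exponential model E(y) = β0·β1^x, so that taking logarithms gives a model that is linear in x. The data are hypothetical (not the 15 classic cars), and `fit_line` and `fit_exponential` are illustrative helper names.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def fit_exponential(xs, ys):
    """Fit y = b0 * b1**x by regressing log(y) on x (requires y > 0)."""
    a0, a1 = fit_line(xs, [math.log(y) for y in ys])
    # Undo the log transform: log(y) = log(b0) + x * log(b1).
    return math.exp(a0), math.exp(a1)


# Hypothetical data (NOT the exercise's table): generated from y = 2 * 1.5**x.
xs = [1, 2, 3, 4, 5]
ys = [2.0 * 1.5 ** x for x in xs]
b0, b1 = fit_exponential(xs, ys)
print("b0 =", round(b0, 6), "b1 =", round(b1, 6))
```

Comparing this fit with the quadratic fit from part (b) is best done on the original price scale, since R² values computed on log(y) and on y are not directly comparable.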

16.9

16.10 In a regression analysis involving 27 observations, the following estimated regression equation was developed....For this estimated regression equation SST = 1550 and SSE = 520.
a. At α = .05, test whether x1 is significant.
Suppose that variables x2 and x3 are added to the model and the following regression equation is obtained....For this estimated regression equation SST = 1550 and SSE = 100.
b. Use an F test and a .05 level of significance to determine whether x2 and x3 together contribute significantly to the model.
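
The two test statistics in this exercise follow directly from the sums of squares given above. The sketch below assumes the first equation (elided in this listing) contains only x1, so the reduced model has one independent variable and the full model has three. The function names are illustrative.

```python
def overall_f(sst, sse, n, p):
    """F statistic for overall significance of a model with p independent variables."""
    ssr = sst - sse
    return (ssr / p) / (sse / (n - p - 1))


def partial_f(sse_reduced, sse_full, n, p_full, q):
    """F statistic for testing q variables added to a reduced model."""
    return ((sse_reduced - sse_full) / q) / (sse_full / (n - p_full - 1))


n = 27
# Part (a): model with x1 only (assumed), SST = 1550, SSE = 520.
f_a = overall_f(1550, 520, n, p=1)
# Part (b): adding x2 and x3 drops SSE from 520 to 100.
f_b = partial_f(520, 100, n, p_full=3, q=2)
print("part (a) F =", round(f_a, 2), "with 1 and", n - 2, "degrees of freedom")
print("part (b) F =", round(f_b, 2), "with 2 and", n - 4, "degrees of freedom")
```

Each statistic is then compared with the tabled F value at α = .05 for the stated degrees of freedom.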

16.11 In a regression analysis involving 30 observations, the following estimated regression equation was obtained....For this estimated regression equation SST = 1805 and SSR = 1760.
a. At α = .05, test the significance of the relationship among the variables.
Suppose variables x1 and x4 are dropped from the model and the following estimated regression equation is obtained....For this model SST = 1805 and SSR = 1705.
b. Compute SSE(x1, x2, x3, x4).
c. Compute SSE(x2, x3).
d. Use an F test and a .05 level of significance to determine whether x1 and x4 contribute significantly to the model.
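
The arithmetic for parts (b) through (d) can be worked from the values given above, assuming the full model contains exactly the four independent variables x1 through x4 (as the notation in part (b) suggests) and n = 30:

```latex
\begin{align*}
\text{SSE}(x_1,x_2,x_3,x_4) &= \text{SST} - \text{SSR} = 1805 - 1760 = 45 \\
\text{SSE}(x_2,x_3) &= 1805 - 1705 = 100 \\
F &= \frac{\left[\text{SSE}(x_2,x_3) - \text{SSE}(x_1,x_2,x_3,x_4)\right]/2}
          {\text{SSE}(x_1,x_2,x_3,x_4)/(30 - 4 - 1)}
   = \frac{55/2}{45/25} = \frac{27.5}{1.8} \approx 15.28
\end{align*}
```

This F statistic has 2 numerator and 25 denominator degrees of freedom and is compared with the tabled value at α = .05.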

16.12 The Ladies Professional Golfers Association (LPGA) maintains statistics on performance and earnings for members of the LPGA Tour. Year-end performance statistics for the 30 players who had the highest total earnings in LPGA Tour events for 2005 appear on the data disk in the file named LPGATour (http://www.lpga.com, 2006). Earnings ($1000) is the total earnings in thousands of dollars; Scoring Avg. is the average score for all events; Greens in Reg. is the percentage of time a player is able to hit the green in regulation; Putting Avg. is the average number of putts taken on greens hit in regulation; and Sand Saves is the percentage of time a player is able to get “up and down” once in a greenside sand bunker. A green is considered hit in regulation if any part of the ball is touching the putting surface and the difference between the value of par for the hole and the number of strokes taken to hit the green is at least 2.
a. Develop an estimated regression equation that can be used to predict the average score for all events given the average number of putts taken on greens hit in regulation.
b. Develop an estimated regression equation that can be used to predict the average score for all events given the percentage of time a player is able to hit the green in regulation, the average number of putts taken on greens hit in regulation, and the percentage of time a player is able to get “up and down” once in a greenside sand bunker.
c. At the .05 level of significance, test whether the two independent variables added in part (b), the percentage of time a player is able to hit the green in regulation and the percentage of time a player is able to get “up and down” once in a greenside sand bunker, contribute significantly to the estimated regression equation developed in part (a). Explain.

16.13 Refer to exercise 12.
a. Develop an estimated regression equation that can be used to predict the total earnings for all events given the average number of putts taken on greens hit in regulation.
b. Develop an estimated regression equation that can be used to predict the total earnings for all events given the percentage of time a player is able to hit the green in regulation, the average number of putts taken on greens hit in regulation, and the percentage of time a player is able to get “up and down” once in a greenside sand bunker.
c. At the .05 level of significance, test whether the two independent variables added in part (b), the percentage of time a player is able to hit the green in regulation and the percentage of time a player is able to get “up and down” once in a greenside sand bunker, contribute significantly to the estimated regression equation developed in part (a). Explain.
d. In general, lower scores should lead to higher earnings. To investigate this option for predicting total earnings, develop an estimated regression equation that can be used to predict total earnings for all events given the average score for all events. Would you prefer to use this equation to predict total earnings or the estimated regression equation developed in part (b)? Explain.

16.14 A 10-year study conducted by the American Heart Association provided data on how age, systolic blood pressure, and smoking relate to the risk of strokes. Data from a portion of this study follow. Risk is interpreted as the probability (times 100) that a person will have a stroke over the next 10-year period. For the smoker variable, 1 indicates a smoker and 0 indicates a nonsmoker.......
a. Develop an estimated regression equation that can be used to predict the risk of stroke given the age and blood-pressure level.
b. Consider adding two independent variables to the model developed in part (a), one for the interaction between age and blood-pressure level and the other for whether the person is a smoker. Develop an estimated regression equation using these four independent variables.
c. At a .05 level of significance, test to see whether the addition of the interaction term and the smoker variable contributes significantly to the estimated regression equation developed in part (a).

16.15

16.16 A study provided data on variables that may be related to the number of weeks a manufacturing worker has been jobless. The dependent variable in the study (Weeks) was defined as the number of weeks a worker has been jobless due to a layoff. The following independent variables were used in the study....The data are available on the CD that accompanies this text in the file named Layoffs.
a. Develop the best one-variable estimated regression equation.
b. Use the stepwise procedure to develop the best estimated regression equation. Use values of .05 for p-value to Enter and p-value to Leave.
c. Use the forward selection procedure to develop the best estimated regression equation. Use a value of .05 for p-value to Enter.
d. Use the backward elimination procedure to develop the best estimated regression equation. Use a value of .05 for p-value to Leave.

16.17 The Ladies Professional Golfers Association (LPGA) maintains statistics on performance and earnings for members of the LPGA Tour. Year-end performance statistics for the 30 players who had the highest total earnings in LPGA Tour events for 2005 appear on the data disk in the file named LPGATour2 (http://www.lpga.com, 2006). Earnings ($1000) is the total earnings in thousands of dollars; Scoring Avg. is the average score for all events; Drive Average is the average length of a player’s drive in yards; Greens in Reg. is the percentage of time a player is able to hit the green in regulation; Putting Avg. is the average number of putts taken on greens hit in regulation; and Sand Saves is the percentage of time a player is able to get “up and down” once in a greenside sand bunker. A green is considered hit in regulation if any part of the ball is touching the putting surface and the difference between the value of par for the hole and the number of strokes taken to hit the green is at least 2. Let DriveGreens denote a new independent variable that represents the interaction between the average length of a player’s drive and the percentage of time a player is able to hit the green in regulation. Use the methods in this section to develop the best estimated multiple regression equation for estimating a player’s average score for all events.

16.18 Jeff Sagarin has been providing sports ratings for USA Today since 1985. In baseball his predicted RPG (runs/game) statistic takes into account all of a player’s offensive statistics, and is claimed to be the best measure of a player’s true offensive value. The following data show the RPG and a variety of offensive statistics for the 2005 Major League Baseball (MLB) season for 20 members of the New York Yankees (http://www.usatoday.com, March 3, 2006). The labels on columns are defined as follows: RPG, predicted runs per game statistic; H, hits; 2B, doubles; 3B, triples; HR, home runs; RBI, runs batted in; BB, bases on balls (walks); SO, strikeouts; SB, stolen bases; CS, caught stealing; OBP, on-base percentage; SLG, slugging percentage; and AVG, batting average....Let the dependent variable be the RPG statistic.
a. Develop the best one-variable estimated regression equation.
b. Use the methods in this section to develop the best estimated multiple regression equation for estimating a player’s RPG.

16.19 Refer to exercise 14. Using age, blood pressure, whether a person is a smoker, and any interaction involving those variables, develop an estimated regression equation that can be used to predict risk. Briefly describe the process you used to develop an estimated regression equation for these data.

16.20 Consider a completely randomized design involving four treatments: A, B, C, and D. Write a multiple regression equation that can be used to analyze these data. Define all variables.
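
One common way to set this up uses three dummy variables for the four treatments, giving E(y) = β0 + β1·xB + β2·xC + β3·xD. The sketch below is an illustration under that assumption: treatment A is chosen as the baseline (any treatment could serve), and `treatment_dummies` is a hypothetical helper name.

```python
def treatment_dummies(treatment, levels=("A", "B", "C", "D")):
    """Return k-1 indicator values for a k-level factor; the first level is the baseline."""
    if treatment not in levels:
        raise ValueError(f"unknown treatment: {treatment!r}")
    # One indicator per non-baseline level: 1 if the observation received it, else 0.
    return [1 if treatment == level else 0 for level in levels[1:]]


# Predictor rows (xB, xC, xD) for one observation from each treatment.
for trt in ("A", "B", "C", "D"):
    print(trt, treatment_dummies(trt))
```

Under this coding, β0 is the mean response for treatment A, and β1, β2, β3 are the differences between the means of B, C, D and the mean of A, so equal treatment means correspond to β1 = β2 = β3 = 0.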

16.21 Consider a completely randomized design involving four treatments: A, B, C, and D. Write a multiple regression equation that can be used to analyze these data. Define all variables.

16.22 Write a multiple regression equation that can be used to analyze the data for a two-factorial design with two levels for factor A and three levels for factor B. Define all variables.
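
A possible coding for this design uses one dummy variable for factor A, two for factor B, and two product terms for the interaction: E(y) = β0 + β1·x1 + β2·x2 + β3·x3 + β4·x1x2 + β5·x1x3. The sketch below assumes level 1 of each factor is the baseline; the function name `factorial_row` and the numeric level labels are illustrative.

```python
def factorial_row(a_level, b_level):
    """Predictor row [x1, x2, x3, x1*x2, x1*x3] for a 2 x 3 factorial design."""
    if a_level not in (1, 2) or b_level not in (1, 2, 3):
        raise ValueError("factor A has levels 1-2, factor B has levels 1-3")
    x1 = 1 if a_level == 2 else 0  # factor A: level 2 versus baseline level 1
    x2 = 1 if b_level == 2 else 0  # factor B: level 2 versus baseline level 1
    x3 = 1 if b_level == 3 else 0  # factor B: level 3 versus baseline level 1
    return [x1, x2, x3, x1 * x2, x1 * x3]


# One predictor row for each of the six treatment combinations.
for a in (1, 2):
    for b in (1, 2, 3):
        print((a, b), factorial_row(a, b))
```

The six treatment combinations map to six distinct rows, so the five coefficients plus the intercept can represent any pattern of six cell means.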

16.23 The Jacobs Chemical Company wants to estimate the mean time (minutes) required to mix a batch of material on machines produced by three different manufacturers. To limit the cost of testing, four batches of material were mixed on machines produced by each of the three manufacturers. The times needed to mix the material follow....
a. Write a multiple regression equation that can be used to analyze the data.
b. What are the best estimates of the coefficients in your regression equation?
c. In terms of the regression equation coefficients, what hypotheses must we test to see whether the mean time to mix a batch of material is the same for all three manufacturers?
d. For an α = .05 level of significance, what conclusion should be drawn?

16.24 Four different paints are advertised as having the same drying time. To check the manufacturers’ claims, five samples were tested for each of the paints. The time in minutes until the paint was dry enough for a second coat to be applied was recorded for each sample. The data obtained follow....
a. Use α = .05 to test for any significant differences in mean drying time among the paints.
b. What is your estimate of mean drying time for paint 2? How is it obtained from the computer output?

16.25 An automobile dealer conducted a test to determine whether the time needed to complete a minor engine tune-up depends on whether a computerized engine analyzer or an electronic analyzer is used. Because tune-up time varies among compact, intermediate, and full-sized cars, the three types of cars were used as blocks in the experiment. The data (time in minutes) obtained follow....Use α = .05 to test for any significant differences.

16.26 A mail-order catalog firm designed a factorial experiment to test the effect of the size of a magazine advertisement and the advertisement design on the number (in thousands) of catalog requests received. Three advertising designs and two sizes of advertisements were considered. The following data were obtained. Test for any significant effects due to type of design, size of advertisement, or interaction. Use α = .05....

16.27 The following data show the daily closing prices (in dollars per share) for IBM for November 3, 2005, through December 1, 2005 (Compustat, February 26, 2006)....
a. Define the independent variable Period, where Period = 1 corresponds to the data for November 3, Period = 2 corresponds to the data for November 4, and so on. Develop the estimated regression equation that can be used to predict the closing price given the value of Period.
b. At the .05 level of significance, test for any positive autocorrelation in the data.
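
The Durbin-Watson statistic used in part (b) is computed from the residuals of the fitted equation as d = Σ(e_t − e_{t−1})² / Σe_t². The sketch below uses hypothetical residuals, not the IBM closing prices (which are not reproduced in this listing).

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e * e for e in residuals)
    return num / den


# Hypothetical residuals that drift slowly from positive to negative and back,
# the kind of pattern positive autocorrelation tends to produce.
residuals = [1.0, 0.8, 0.5, -0.2, -0.6, -0.9, -0.4, 0.3]
d = durbin_watson(residuals)
print("d =", round(d, 4))
```

Values of d well below 2 point toward positive autocorrelation; the formal test compares d with the tabled lower and upper bounds d_L and d_U for the sample size and number of independent variables.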

16.28 Refer to the Cravens data set in Table 16.5. In Section 16.3 we showed that the estimated regression equation involving Accounts, AdvExp, Poten, and Share had an adjusted coefficient of determination of 88.1%. Use the .05 level of significance and apply the Durbin-Watson test to determine whether positive autocorrelation is present.

16.29 Lower prices for color laser printers make them a great alternative to inkjet printers. PC World reviewed and rated 10 color laser printers. The following data show the price, printing speed for color graphics in pages per minute (ppm), and the overall PC World rating for each printer tested (PC World, December 2005)....
a. Develop a scatter diagram of the data using the printing speed as the independent variable. Does a simple linear regression model appear to be appropriate?
b. Develop an estimated multiple regression equation with x = speed and x² as the two independent variables.
c. Consider the nonlinear model shown by equation (16.7). Use logarithms to transform this nonlinear model into an equivalent linear model, and develop the corresponding estimated regression equation. Does the estimated regression equation provide a better fit than the estimated regression equation developed in part (b)?

16.30

16.31 A study investigated the relationship between audit delay (Delay), the length of time from a company’s fiscal year-end to the date of the auditor’s report, and variables that describe the client and the auditor. Some of the independent variables that were included in this study follow....A sample of 40 companies provided the following data.......
a. Develop the estimated regression equation using all of the independent variables.
b. Did the estimated regression equation developed in part (a) provide a good fit? Explain.
c. Develop a scatter diagram showing Delay as a function of Finished. What does this scatter diagram indicate about the relationship between Delay and Finished?
d. On the basis of your observations about the relationship between Delay and Finished, develop an alternative estimated regression equation to the one developed in (a) to explain as much of the variability in Delay as possible.

16.32 Refer to the data in exercise 31. Consider a model in which only Industry is used to predict Delay. At a .01 level of significance, test for any positive autocorrelation in the data.

16.33 Refer to the data in exercise 31.
a. Develop an estimated regression equation that can be used to predict Delay by using Industry and Quality.
b. At the .05 level of significance, test for any positive autocorrelation in the data.

16.34 A study was conducted to investigate browsing activity by shoppers. Shoppers were classified as nonbrowsers, light browsers, and heavy browsers. For each shopper in the study, a measure was obtained to determine how comfortable the shopper was in the store. Higher scores indicated greater comfort. Assume that the following data are from this study. Use a .05 level of significance to test for differences in comfort levels among the three types of browsers....

16.35 Money magazine reported price and related data for 418 of the most popular vehicles of the 2003 model year. One of the variables reported was the vehicle’s resale value, expressed as a percentage of the manufacturer’s suggested retail price. The data were classified according to size and type of vehicle. The following table shows the resale value for 10 randomly selected small cars, 10 randomly selected midsize cars, 10 randomly selected luxury cars, and 10 randomly selected sports cars (Money, March 2003)....Use α = .05 and test for any significant difference in the mean resale value among the four types of vehicles.