STAT200 – Assignment #2: Descriptive Statistics Analysis and Writeup

Introduction:

Use the same scenario you submitted for the first assignment, with modifications based on your instructor’s feedback, if needed. Include Table 1: Variables Selected for the Analysis, which you used in Assignment #1, to show the variables you selected for analysis.

Discussion and Conclusion.

Briefly discuss each variable in the same sequence as presented in the results. Which variable has the highest expenditure? Which variable has the lowest expenditure? If you were to recommend a place to save money, which expenditure would it be and why? Note: The section should be no more than 2 paragraphs.

While not included in the above analysis, the medians of the four expenditure values were $64,339.50 for total household expenditures, $7,817 for food, $105.50 for entertainment, and $273.50 for education. Of the three individual expenditures, food had the highest median value (as would be expected), while entertainment had the lowest.

Without additional information about what is actually being purchased with these expenditures, it is difficult to determine where the most effective savings could be realized. For instance, the roughly $7,800 median food expenditure could include a significant amount of dining out, which is usually costly compared to home cooking. If that is the case, then perhaps some of this expenditure should be reclassified as entertainment. Of the three, the entertainment expenditure has the lowest median value, but it is also perhaps the least important of the three, as there are ways to entertain oneself without spending money. With the limited information available, it would seem that cutting entertainment expenditures would be the better choice.
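As a small illustration of how medians like those quoted above can be obtained, the following Python sketch computes a median with the standard library. The expenditure values are invented for illustration and are not the assignment data.

```python
from statistics import median

# Hypothetical annual entertainment expenditures (in dollars) for six households.
# These values are invented for illustration; they are not the assignment data.
entertainment = [40, 95, 100, 120, 180, 350]

# With an even number of observations, the median is the mean of the
# two middle values of the sorted data: (100 + 120) / 2.
entertainment_median = median(entertainment)
print(f"Median entertainment expenditure: ${entertainment_median:,.2f}")
```

Because the median depends only on the middle of the sorted data, the single large value (350) does not pull it upward the way it would pull the mean.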

STAT 200: Introduction to Statistics Homework #7

1. (3 points): Table 1 contains the value of the house and the amount of rental income in a year that the house brings in (“Capital and rental,” 2013). Create a scatter plot and find a regression equation between house value and rental income. Then use the regression equation to find the rental income for a house worth $230,000 and for a house worth $400,000. Which rental income that you calculated do you think is closer to the true rental income? Why?

Table 1: Data of House Value versus Rental
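As a rough sketch of the regression step in Question 1, the following Python code fits a least-squares line and reports whether each prediction falls inside the observed range of house values. The data values and the names fit_line, house_value, and rental_income are hypothetical; the actual Table 1 data would need to be substituted.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data (dollars); NOT the values from Table 1.
house_value = [100_000, 150_000, 200_000, 250_000, 300_000]
rental_income = [6_000, 8_500, 10_500, 13_500, 15_000]

slope, intercept = fit_line(house_value, rental_income)

for value in (230_000, 400_000):
    predicted = intercept + slope * value
    # A prediction outside the observed x-range is an extrapolation.
    inside = min(house_value) <= value <= max(house_value)
    kind = "interpolation" if inside else "extrapolation"
    print(f"House value ${value:,}: predicted rent ${predicted:,.0f} ({kind})")
```

Predictions made inside the range of the observed house values are generally more trustworthy than extrapolations, which is the idea behind the last part of the question.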

2. (3 points): The World Bank collected data on the percentage of GDP that a country spends on health expenditures (“Health expenditure,” 2013) and also the percentage of women receiving prenatal care (“Pregnant woman receiving,” 2013). The data for the countries where this information is available for the year 2011 are in Table 2.

Create a scatter plot of the data and find a regression equation between percentage spent on health expenditure and the percentage of women receiving prenatal care. Then use the regression equation to find the percentage of women receiving prenatal care for a country that spends 5.0% of GDP on health expenditure and for a country that spends 12.0% of GDP.

Which prenatal care percentage that you calculated do you think is closer to the true percentage? Why?

Table 2: Data of Health Expenditure versus Prenatal Care

3. (3 points): Table 1 (in Question 1) contains the value of the house and the amount of rental income in a year that the house brings in (“Capital and rental,” 2013).

Find the correlation coefficient and coefficient of determination and then interpret both.

4. (3 points): The World Bank collected data on the percentage of GDP that a country spends on health expenditures (“Health expenditure,” 2013) and also the percentage of women receiving prenatal care (“Pregnant woman receiving,” 2013). The data for the countries where this information is available for the year 2011 are in Table 2 (in Question 2).

Find the correlation coefficient and coefficient of determination and then interpret both.
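The correlation coefficient r and the coefficient of determination r² requested in these questions can be sketched as follows. The data and the names correlation, health_pct_gdp, and prenatal_pct are hypothetical stand-ins for the Table 2 values.

```python
from math import sqrt


def correlation(x, y):
    """Return the Pearson correlation coefficient r between x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return ss_xy / sqrt(ss_xx * ss_yy)


# Hypothetical data; NOT the values from Table 2.
health_pct_gdp = [3.0, 4.5, 6.0, 7.5, 9.0]
prenatal_pct = [70.0, 82.0, 80.0, 91.0, 97.0]

r = correlation(health_pct_gdp, prenatal_pct)
r_squared = r ** 2

print(f"r = {r:.3f}")
print(f"r^2 = {r_squared:.3f}")
```

Here r describes the direction and strength of the linear relationship, while r² is the proportion of the variation in the response variable that is explained by the linear relationship with the explanatory variable.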

5. (3 points): Table 1 (in Question 1) contains the value of the house and the amount of rental income in a year that the house brings in (“Capital and rental,” 2013).

a.) Test at the 5% level for a positive correlation between house value and rental amount.

b.) Find the standard error of the estimate.

c.) Compute a 95% prediction interval for the rental income on a house worth $230,000.
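Parts a.) and b.) above can be sketched in Python as shown below. This is only an illustration under stated assumptions: the data are hypothetical (not Table 1), the function name correlation_test_summary is invented here, and the critical value 2.353 is the upper 5% point of the t distribution with 3 degrees of freedom, taken from a t table to match the five hypothetical data points.

```python
from math import sqrt


def correlation_test_summary(x, y):
    """Return (r, t_stat, s_e) for a simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    r = ss_xy / sqrt(ss_xx * ss_yy)

    # Standard error of the estimate: sqrt(SSE / (n - 2)).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = sqrt(sse / (n - 2))

    # Test statistic for H0: rho = 0, with n - 2 degrees of freedom.
    t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)
    return r, t_stat, s_e


# Hypothetical data (dollars); NOT the values from Table 1.
house_value = [100_000, 150_000, 200_000, 250_000, 300_000]
rental_income = [6_000, 8_500, 10_500, 13_500, 15_000]

r, t_stat, s_e = correlation_test_summary(house_value, rental_income)

# One-tailed critical value from a t table: df = 5 - 2 = 3, alpha = 0.05.
T_CRIT = 2.353
reject_h0 = t_stat > T_CRIT

print(f"r = {r:.4f}, t = {t_stat:.2f}, s_e = {s_e:.2f}")
print("Reject Ho" if reject_h0 else "Fail to reject Ho")
```

With real data, the degrees of freedom (n − 2) and therefore the critical value would change with the number of houses in the table.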

6. (3 points): The World Bank collected data on the percentage of GDP that a country spends on health expenditures (“Health expenditure,” 2013) and also the percentage of women receiving prenatal care (“Pregnant woman receiving,” 2013). The data for the countries where this information is available for the year 2011 are in Table 2 (in Question 2).

a.) Test at the 5% level for a correlation between percentage spent on health expenditure and the percentage of women receiving prenatal care.

b.) Find the standard error of the estimate.

c.) Compute a 95% prediction interval for the percentage of women receiving prenatal care for a country that spends 5.0% of GDP on health expenditure.
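A 95% prediction interval such as the one requested in part c.) can be sketched as follows, using the usual formula ŷ ± t* · s_e · sqrt(1 + 1/n + (x0 − x̄)² / SS_x). The data are hypothetical (not Table 2), the function name prediction_interval is invented here, and the critical value 3.182 is the two-tailed 95% value from a t table for 3 degrees of freedom, matching the five hypothetical data points.

```python
from math import sqrt


def prediction_interval(x, y, x0, t_crit):
    """Return (lower, upper) prediction limits for a new y at x = x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar

    # Standard error of the estimate from the residuals.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = sqrt(sse / (n - 2))

    y_hat = intercept + slope * x0
    margin = t_crit * s_e * sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_xx)
    return y_hat - margin, y_hat + margin


# Hypothetical data; NOT the values from Table 2.
health_pct_gdp = [3.0, 4.5, 6.0, 7.5, 9.0]
prenatal_pct = [70.0, 82.0, 80.0, 91.0, 97.0]

# Two-tailed 95% critical value from a t table: df = 5 - 2 = 3.
lower, upper = prediction_interval(health_pct_gdp, prenatal_pct, x0=5.0, t_crit=3.182)
print(f"95% prediction interval at 5.0% of GDP: ({lower:.1f}%, {upper:.1f}%)")
```

The (x0 − x̄)² term makes the interval wider as x0 moves away from the mean of the observed x-values, so predictions far from the center of the data are less precise.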

7. (3 points): Researchers watched groups of dolphins off the coast of Ireland in 1998 to determine what activities the dolphins partake in at certain times of the day (“Activities of dolphin,” 2013). The numbers in Table 3 represent the number of groups of dolphins that were partaking in an activity at certain times of day.

Is there enough evidence to show that the activity and the time period are independent for dolphins? Why or why not? Test at the 1% level.

Table 3: Dolphin Activity
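The chi-square test for independence used in this question can be sketched as follows. The counts below are hypothetical (a 2 x 3 table invented for illustration, not the dolphin data), the function name chi_square_independence is an assumption, and the critical value 9.210 is the 1% value from a chi-square table for 2 degrees of freedom.

```python
def chi_square_independence(table):
    """Return (statistic, df) for a chi-square test of independence.

    `table` is a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts: rows are two activities, columns are three time periods.
observed_counts = [
    [20, 30, 50],
    [30, 30, 40],
]

statistic, df = chi_square_independence(observed_counts)

# Critical value from a chi-square table: df = 2, alpha = 0.01.
CHI2_CRIT = 9.210
reject_h0 = statistic > CHI2_CRIT

print(f"chi-square = {statistic:.3f}, df = {df}")
print("Reject Ho (independence)" if reject_h0 else "Fail to reject Ho (independence)")
```

The same function applies to any table of counts, but the critical value must be looked up again for the table's own degrees of freedom and significance level.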

8. (3 points): A person’s educational attainment and age group were collected by the U.S. Census Bureau in 1984 to see if age group and educational attainment are related. The counts in thousands are in Table 3 (“Education by age,” 2013). Do the data show that educational attainment and age are independent? Why or why not? Test at the 5% level.

Table 3: Educational Attainment by Age Group

9. (3 points): In Africa in 2011, the numbers of deaths of females from cardiovascular disease for different age groups are in Table 4 (“Global health observatory,” 2013). In addition, the proportions of deaths of females from all causes for the same age groups are also in Table 4.

Do the data show that deaths from cardiovascular disease are in the same proportions as all deaths for the different age groups? Why or why not? Test at the 5% level.

Table 4: Deaths of Females for Different Age Groups

10. (3 points): A project conducted by the Australian Federal Office of Road Safety asked people many questions about their cars. One question was the reason that a person chooses a given car, and those data are in Table 5 (“Car preferences,” 2013). Do the data show that the frequencies observed substantiate the claim that the reasons for choosing a car are equally likely? Why or why not? Test at the 5% level.

Table 5: Reason for Choosing a Car
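Questions 9 and 10 both call for a chi-square goodness-of-fit test: Question 9 compares observed counts with given proportions, and Question 10 with equal proportions. A minimal Python sketch follows. The counts and proportions are hypothetical (not the Table 4 or Table 5 data), the function name is invented here, and 7.815 is the 5% critical value from a chi-square table for 3 degrees of freedom.

```python
def chi_square_goodness_of_fit(observed, expected_proportions=None):
    """Return (statistic, df) for a chi-square goodness-of-fit test.

    If no proportions are given, all categories are assumed equally likely.
    """
    n = sum(observed)
    k = len(observed)
    if expected_proportions is None:
        expected_proportions = [1 / k] * k

    expected = [n * p for p in expected_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, k - 1


# Hypothetical counts for four reasons for choosing a car (equally likely claim).
reason_counts = [30, 20, 25, 25]
statistic_equal, df_equal = chi_square_goodness_of_fit(reason_counts)

# Critical value from a chi-square table: df = 3, alpha = 0.05.
CHI2_CRIT = 7.815
reject_equal = statistic_equal > CHI2_CRIT
print(f"Equal proportions: chi-square = {statistic_equal:.3f}, df = {df_equal}")

# Hypothetical counts compared with given (hypothetical) proportions.
deaths_by_age = [45, 35, 20]
all_cause_proportions = [0.5, 0.3, 0.2]
statistic_given, df_given = chi_square_goodness_of_fit(deaths_by_age, all_cause_proportions)
print(f"Given proportions: chi-square = {statistic_given:.3f}, df = {df_given}")
```

In both cases the degrees of freedom are the number of categories minus one, and a statistic larger than the critical value leads to rejecting the claimed proportions.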

STAT200 Introduction to Statistics Assignment #3: Inferential Statistics Analysis and Writeup

Purpose:

The purpose of this assignment is to develop and carry out an inferential statistics analysis plan and write up the findings. There are two main parts to this assignment:

● Part A: Inferential Statistics Data Plan and Analysis

● Part B: Write up of Results

Part A: Prepare Data Plan, Analyze Data, and Complete Part A of the Assignment #3 Template

➢Task 1: Select Variables. Review the variables you used for assignments #1 and #2. Select your qualitative socioeconomic variable as your grouping variable and the two expenditure variables from the variables used in these previous assignments. Fill in Table 1: Variables Selected for Analysis with name, description, and type of variable (i.e., qualitative or quantitative).

➢Task 2: Select and Run a One Sample Confidence Interval Analysis. For one expenditure variable, select and run the appropriate method for estimating a parameter, based on a statistic (i.e., confidence interval method). Complete Table 2: Confidence Interval Information and Results, which follows the format outlined by Kozak and the course’s problem-solving approach, including:

○ Random variable stated in words

○ Confidence interval method, including rationale and assumptions

○ Method used for analyzing data (e.g., web applets, Excel, TI calculator).

○ Results obtained

○ Interpretation
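As an illustration of Task 2, the sketch below computes a one-sample t confidence interval for a mean. It assumes the t interval is the appropriate method for the chosen variable, which depends on the data. The sample values are hypothetical, the function name t_confidence_interval is invented here, and 2.306 is the two-tailed 95% critical value from a t table for 8 degrees of freedom, matching the nine hypothetical observations.

```python
from math import sqrt
from statistics import mean, stdev


def t_confidence_interval(sample, t_crit):
    """Return (lower, upper) confidence limits for the population mean."""
    n = len(sample)
    x_bar = mean(sample)
    # Margin of error: critical value times the standard error of the mean.
    margin = t_crit * stdev(sample) / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical annual food expenditures (dollars) for nine households.
food = [7400, 7600, 7700, 7900, 8000, 8100, 8300, 8400, 8600]

# Two-tailed 95% critical value from a t table: df = 9 - 1 = 8.
lower, upper = t_confidence_interval(food, t_crit=2.306)
print(f"95% confidence interval for mean food expenditure: (${lower:,.2f}, ${upper:,.2f})")
```

With a different sample size or confidence level, the critical value would need to be looked up again (or obtained from Excel, a TI calculator, or a web applet).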

➢Task 3: Select Two Sample Hypothesis Test. Using the second expenditure variable (with the socioeconomic variable as the grouping variable), select and run the appropriate method for making decisions about two parameters relative to observed statistics (i.e., two sample hypothesis test method). Complete Table 3: Two Sample Hypothesis Test Analysis, which follows the format outlined by Kozak and the course’s problem-solving approach, including:

○ Hypotheses (null and alternative).

○ Two sample hypothesis testing method, including rationale and assumptions

○ Method used for analyzing data (e.g., web applets, Excel, TI calculator).

○ Results obtained.

○ Interpretation (i.e., reject the null hypothesis or fail to reject the null hypothesis)
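One common choice for Task 3 is a two-sample t-test for the difference between two independent means. The sketch below uses the unequal-variance (Welch) form, which is an assumption here; the assignment does not prescribe a specific test. The two groups, their values, and the function name welch_t are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return (t_stat, df) for a two-sample t-test with unequal variances."""
    n1, n2 = len(sample1), len(sample2)
    # Squared standard error contributed by each group.
    v1 = variance(sample1) / n1
    v2 = variance(sample2) / n2

    t_stat = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Hypothetical annual entertainment expenditures (dollars) for two groups.
group_1 = [120, 150, 130, 160, 140]
group_2 = [100, 110, 90, 120, 105]

t_stat, df = welch_t(group_1, group_2)

# Two-tailed 5% critical value from a t table, using df rounded down to 7.
T_CRIT = 2.365
reject_h0 = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.3f}, df = {df:.1f}")
print("Reject Ho" if reject_h0 else "Fail to reject Ho")
```

Rounding the degrees of freedom down before using a t table is a conservative choice; software such as Excel or a TI calculator reports a p-value directly instead.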

Part B: Write Up Results and Complete Part B of the Assignment #3 Template

For this 1 to 2 page section, refer to the inferential statistics data plan and computations done for Part A of this assignment. Address the following areas:

➢Introduction. Based on the scenario you submitted for the second assignment, provide a brief description of the scenario, including the variables that were used in this analysis. Include a completed “Table 1: Variables Selected for Analysis” to show the variables you selected for analysis.

➢Data Set Description and Method Used for Analysis. Briefly describe the data set, using the information provided with the data set and your write-up in Assignment #2. Also describe what method(s) (e.g., free web applets, Excel, TI calculator) you used to analyze the data.

➢Results. In this section, you will report the results of your inferential statistics data analysis.

For the Confidence Interval Analysis, write one paragraph that includes:

o Statistical method used, including rationale and whether assumptions were met.

o Statistical Interpretation. The statistical interpretation is that the confidence interval has a probability (1−α, where α is the complement of the confidence level) of containing the population parameter.

o Real World Interpretation. Explain the results in everyday language. Recommend reviewing the text and information from the classroom for examples on how to report results in everyday language.
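The statistical interpretation described above (intervals built this way contain the population parameter in about 1−α of repeated samples) can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the population is normal with a known standard deviation, so a z interval is used instead of a t interval, and the population values, seed, and function name coverage_rate are all hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage_rate(true_mean, sigma, n, confidence, trials, seed=0):
    """Return the fraction of simulated intervals that contain true_mean."""
    rng = random.Random(seed)
    alpha = 1 - confidence
    # z critical value for a two-sided interval with known sigma.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)

    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / trials


# Hypothetical population: mean $8,000, standard deviation $400.
rate = coverage_rate(true_mean=8000, sigma=400, n=30, confidence=0.95, trials=2000)
print(f"Share of 95% intervals containing the true mean: {rate:.3f}")
```

The reported share should be close to 0.95, which is what the confidence level describes: the long-run success rate of the method, not a statement about any single interval.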

For the Two Sample Hypothesis Test Analysis, write one paragraph that includes:

o Hypotheses that were assessed. See the table below for an example format:

Example Format for Writing Null and Alternative Hypotheses, in Words

Null Hypothesis: There is no significant difference in [insert variable name] between [insert group 1 name] and [insert group 2 name] households.

Alternative Hypothesis:

➢For two-tailed (≠): There is a significant difference in [insert variable name] between [insert group 1 name] and [insert group 2 name] households.

➢For one-tailed (>): [Insert group 1 name] has statistically significantly higher [insert variable name] than [insert group 2 name].

➢For one-tailed (<): [Insert group 1 name] has statistically significantly lower [insert variable name] than [insert group 2 name].

o Statistical method used, including rationale and whether assumptions were met. See the table below for an example format:

Example Format for Writing Statistical Method with Rationale

To determine whether there was a difference in [insert household expenditure] between [insert names of two groups], a [insert name of hypothesis test used] was used. It was the appropriate statistical method, because [insert rationale]. The assumptions were assessed [insert information about the assumptions assessed and whether they were met].

o Conclusion from the Results. This is where you state whether to reject Ho or fail to reject Ho, including the p-value that was obtained. The rule is: if the p-value < α, then reject Ho. If the p-value ≥ α, then fail to reject Ho.

o Real World Interpretation. Explain, in everyday language, the results. If any of the assumptions were not met, describe how it might affect conclusions. Address issues of Type I and/or Type II Error, where appropriate. Recommend reviewing the text and information from the classroom for examples on how to report results in everyday language.
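The decision rule stated in the Conclusion from the Results item (reject Ho when the p-value < α, otherwise fail to reject Ho) can be written as a short Python function. The function name decide and the example p-values are hypothetical.

```python
def decide(p_value, alpha=0.05):
    """Apply the p-value decision rule at significance level alpha."""
    if p_value < alpha:
        return "Reject Ho"
    # A p-value equal to alpha falls on the "fail to reject" side of the rule.
    return "Fail to reject Ho"


# Hypothetical p-values for illustration.
for p in (0.003, 0.05, 0.21):
    print(f"p-value = {p}: {decide(p)}")
```

Rejecting Ho risks a Type I error (rejecting a true null hypothesis), while failing to reject it risks a Type II error (not rejecting a false one), which is why the write-up asks that both be addressed where appropriate.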

➢Discussion. Write one paragraph that summarizes your findings and how they may be helpful to the person described in the scenario when making a household budget.

Assignment Submission: Name the file that contains your completed Assignment #3 Inferential Statistics Analysis – Template using the following format: “Assignment3-StudentLastName.” Submit it via the Assignments area in the LEO classroom in the “Assignment #3: Inferential Statistics Analysis and Writeup” folder.