# Generate a scatterplot for income ($1,000) versus credit balance

Course

Project: AJ DAVIS DEPARTMENT STORES

Introduction


AJ DAVIS is a department store chain that has many credit customers and wants to find out more about them. A sample of 50 credit customers is selected, with data collected on the following five variables.

Location (rural, urban, suburban)

Income (in $1,000s; be careful with this)

Size (household size, meaning the number of people living in the household)

Years (the number of years that the customer has lived in the current location)

Credit balance (the customer’s current credit card balance on the store’s credit card, in $)

The data is available in Doc Sharing Course Project Data Set as an Excel file. You are to copy and paste the data set into a Minitab worksheet.

PROJECT PART A: Exploratory Data Analysis


Open the file MATH533 Project Consumer.xls from the Course Project Data Set folder in Doc Sharing.

For each of the five variables, process, organize, present, and summarize the data. Analyze each variable by itself using graphical and numerical techniques of summarization. Use Minitab as much as possible, explaining what the printout tells you. You may wish to use some of the following graphs: stem-and-leaf diagram, frequency or relative frequency table, histogram, boxplot, dotplot, pie chart, bar graph. Caution: Not all of these are appropriate for each of these variables, nor are they all necessary. More is not necessarily better. In addition, be sure to find the appropriate measures of central tendency and measures of dispersion for the above data. Where appropriate, use the five-number summary (the Min, Q1, Median, Q3, Max). Once again, use Minitab as appropriate, and explain what the results mean.
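As a cross-check on the Minitab printout, the numerical summaries described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `sizes` list is hypothetical and is not taken from the project data set, and the quartile rule used by Python's `statistics.quantiles` may differ slightly from the one your software applies.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    ordered = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    # Different packages interpolate quartiles differently, so small
    # discrepancies against other software are expected.
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical household sizes, NOT values from the project data set.
sizes = [2, 3, 1, 4, 2, 5, 3, 2, 6]

summary = five_number_summary(sizes)
mean = statistics.mean(sizes)
stdev = statistics.stdev(sizes)  # sample standard deviation (n - 1 divisor)

print("min, Q1, median, Q3, max:", summary)
print("mean:", round(mean, 2), "sample std dev:", round(stdev, 2))
```

The mean and standard deviation suit a roughly symmetric quantitative variable; for a skewed one, the median and the quartiles from the five-number summary are usually the more informative pair.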

Analyze the connections or relationships between the variables. There are 10 pairings here (location and income, location and size, location and years, location and credit balance, income and size, income and years, income and credit balance, size and years, size and credit balance, years and credit balance). Use graphical as well as numerical summary measures. Explain what you see. Be sure to consider all 10 pairings. Some variables show clear relationships, while others do not.
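The two kinds of pairing call for different numerical summaries: a pairing that includes location (categorical) can be summarized by group means, while a pairing of two quantitative variables can be summarized by a correlation coefficient. The sketch below shows both. The function names and all data values are hypothetical, chosen only for illustration.

```python
import math
from collections import defaultdict

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def mean_by_group(groups, values):
    """Mean of `values` within each category in `groups`."""
    buckets = defaultdict(list)
    for group, value in zip(groups, values):
        buckets[group].append(value)
    return {group: sum(vs) / len(vs) for group, vs in buckets.items()}

# Hypothetical records, NOT values from the project data set.
location = ["rural", "urban", "suburban", "urban", "rural", "suburban"]
income = [30, 55, 62, 48, 35, 70]   # in $1,000s
size = [2, 4, 5, 3, 2, 6]           # people in household

print("mean income by location:", mean_by_group(location, income))
print("r(income, size):", round(pearson_r(income, size), 3))
```

A value of r near +1 or -1 suggests a strong linear relationship and a value near 0 suggests little linear relationship; a scatterplot should always accompany the number.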

Prepare your report in Microsoft Word (or some other word processing package), integrating your graphs and tables with text explanations and interpretations. Be sure that you have graphical and numerical backup for your explanations and interpretations. Be selective in what you include in the report. I’m not looking for a 20-page report on every variable and every possible relationship (that’s 15 things to do). Rather, what I want you to do is highlight what you see for three individual variables (no more than one graph for each, one or two measures of central tendency and variability, as appropriate, and two or three sentences of interpretation). For the 10 pairings, identify and report on only three, again using graphical and numerical summaries (as appropriate), with interpretations. Please note that at least one of your pairings must include location and at least one must not include location.

All DeVry University policies are in effect, including the plagiarism policy.

Project Part A report is due by the end of Week 2.

Project Part A is worth 100 total points. See the grading rubric below.

Submission: The report from Part 4, including all relevant graphs and numerical analysis along with interpretations

Format for report:

Brief introduction

Discuss your first individual variable, using graphical, numerical summary, and interpretation

Discuss your second individual variable, using graphical, numerical summary, and interpretation

Discuss your third individual variable, using graphical, numerical summary, and interpretation

Discuss your first pairing of variables, using graphical, numerical summary, and interpretation

Discuss your second pairing of variables, using graphical, numerical summary, and interpretation

Discuss your third pairing of variables, using graphical, numerical summary, and interpretation

Conclusion

Project Part A: Grading Rubric


| Category | Points | % | Description |
|---|---|---|---|
| Three individual variables (12 points each) | 36 | 36 | Graphical analysis, numerical analysis (when appropriate), and interpretation |
| Three relationships (15 points each) | 45 | 45 | Graphical analysis, numerical analysis (when appropriate), and interpretation |
| Communication skills | 19 | 19 | Writing, grammar, clarity, logic, cohesiveness, adherence to the above format |
| Total | 100 | 100 | |

A quality paper will meet or exceed all of the above requirements.

Project Part B: Hypothesis Testing and Confidence Intervals

Your manager has speculated the following.

a. The average (mean) annual income was greater than $45,000.

b. The true population proportion of customers who live in a suburban area is less than 45%.

c. The average (mean) number of years lived in the current home is greater than 8 years.

d. The average (mean) credit balance for rural customers is less than $3,200.

Using the sample data, perform the hypothesis test for each of the above situations to see whether there is evidence to support your manager’s belief in each case A–D. In each case, use the Seven Elements of a Test of Hypothesis in Section 6.2 of your textbook with α = .05, and explain your conclusion in simple terms. Also, be sure to compute the p-value and interpret it.

Follow this up by computing 95% confidence intervals for each of the variables described in A–D, and again interpret these intervals.
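To make the mechanics concrete, here is a minimal sketch of the test and interval for speculation B (a population proportion), using the large-sample normal approximation. The count of 15 suburban customers is a hypothetical placeholder, not a figure from the data set, and the function names are illustrative; the actual analysis should come from Minitab on the real sample.

```python
import math
from statistics import NormalDist

def proportion_z_test_lower(successes, n, p0):
    """Large-sample z test of H0: p = p0 against Ha: p < p0."""
    p_hat = successes / n
    std_error = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / std_error
    p_value = NormalDist().cdf(z)             # lower-tail area
    return z, p_value

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# ASSUMPTION: 15 suburban customers out of 50 is a made-up count.
z, p_value = proportion_z_test_lower(successes=15, n=50, p0=0.45)
low, high = proportion_confidence_interval(successes=15, n=50)

alpha = 0.05
print("z =", round(z, 3), "p-value =", round(p_value, 4))
print("reject H0" if p_value < alpha else "fail to reject H0")
print("95% CI for p:", (round(low, 3), round(high, 3)))
```

The decision rule is the same for all four speculations: reject the null hypothesis when the p-value falls below α. The tests on means (A, C, and D) would typically use a t statistic in place of z.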

Write a report to your manager about the results, distilling them in a way that would be understandable to someone who does not know statistics. Clear explanations and interpretations are critical.

All DeVry University policies are in effect, including the plagiarism policy.

Project Part B report is due by the end of Week 6.

Project Part B is worth 100 total points. See the grading rubric below.

Submission: The report from Part 3 and all of the relevant work done in the hypothesis testing (including Minitab) in Part 1 and the confidence intervals (Minitab) in Part 2 as an appendix

Format for report:

Summary report (about one paragraph on each of the speculations, A–D)

Appendix with all of the steps in hypothesis testing (the format of the Seven Elements of a Test of Hypothesis, in Section 6.2 of your textbook) for each speculation A–D, as well as the confidence intervals, including all Minitab output

Project Part B: Grading Rubric

| Category | Points | % | Description |
|---|---|---|---|
| Addressing each speculation (20 points each) | 80 | 80 | Hypothesis test, interpretation, confidence interval, and interpretation |
| Summary report | 20 | 20 | One paragraph on each of the speculations |
| Total | 100 | 100 | |

A quality paper will meet or exceed all of the above requirements.

Project Part C: Regression and Correlation Analysis

Using Minitab, perform the regression and correlation analysis for the data on income (Y), the dependent variable, and credit balance (X), the independent variable, by answering the following.

1. Generate a scatterplot for income ($1,000) versus credit balance ($), including the graph of the best fit line. Interpret.

2. Determine the equation of the best fit line, which describes the relationship between income and credit balance.

3. Determine the coefficient of correlation. Interpret.

4. Determine the coefficient of determination. Interpret.

5. Test the utility of this regression model (use a two-tailed test with α = .05). Interpret your results, including the p-value.

6. Based on your findings in 1–5, what is your opinion about using credit balance to predict income? Explain.

7. Compute the 95% confidence interval for beta-1 (the population slope). Interpret this interval.

8. Using an interval, estimate the average income for customers that have a credit balance of $4,000. Interpret this interval.

9. Using an interval, predict the income for a customer that has a credit balance of $4,000. Interpret this interval.

10. What can we say about the income for a customer that has a credit balance of $10,000? Explain your answer.
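The quantities asked for in items 2–4, and the caution behind item 10, can be sketched by hand as follows. This is a minimal least-squares illustration on hypothetical numbers, not the project data; `fit_line` and `within_observed_range` are illustrative helper names, and Minitab remains the intended tool for the actual output.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x; also returns r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r

def within_observed_range(x_new, xs):
    """True if x_new lies inside the range of x values used for the fit."""
    return min(xs) <= x_new <= max(xs)

# Hypothetical values, NOT taken from the project data set.
balance = [3000, 3500, 4000, 4500, 5000]  # credit balance in $ (X)
income = [30, 38, 41, 50, 56]             # income in $1,000s (Y)

intercept, slope, r = fit_line(balance, income)
predicted_at_4000 = intercept + slope * 4000

print("income =", round(intercept, 2), "+", round(slope, 4), "* balance")
print("r =", round(r, 4), " r^2 =", round(r * r, 4))
print("predicted income at $4,000:", round(predicted_at_4000, 1))
print("is $10,000 inside the observed range?",
      within_observed_range(10000, balance))
```

Because income is recorded in $1,000s and credit balance in dollars, the slope reads as thousands of dollars of income per dollar of balance. Whether $10,000 lies outside the observed balances in the real sample has to be checked against the worksheet; if it does, a prediction there is an extrapolation and should be treated with caution.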

In an attempt to improve the model, we now fit a multiple regression model predicting income based on credit balance, years, and size.

11. Using Minitab, run the multiple regression analysis using the variables credit balance, years, and size to predict income. State the equation for this multiple regression model.

12. Perform the global test for utility (F-test). Explain your conclusion.

13. Perform the t-test on each independent variable. Explain your conclusions and clearly state how you should proceed. In particular, state which independent variables we should keep and which should be discarded.

14. Is this multiple regression model better than the linear model that we generated in parts 1–10? Explain.
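For items 12 and 14, it may help to recall the standard textbook forms of the global F statistic and the adjusted coefficient of determination. The notation is an assumption introduced here rather than taken from the assignment: $n$ is the number of observations (50 in this sample), $k$ is the number of independent variables (3 in this model), and $R^2$ is the coefficient of determination reported by Minitab.

$$
F = \frac{R^2 / k}{(1 - R^2) / [\,n - (k + 1)\,]},
\qquad
R_a^2 = 1 - \frac{n - 1}{n - (k + 1)}\,(1 - R^2)
$$

Under the null hypothesis that all slope coefficients are zero, $F$ has $k$ numerator and $n - (k + 1)$ denominator degrees of freedom. Because $R^2$ never decreases when predictors are added, $R_a^2$ is usually the fairer basis for comparing the multiple regression model with the one-predictor model.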

All DeVry University policies are in effect, including the plagiarism policy.

Project Part C report is due by the end of Week 7.

Project Part C is worth 100 total points. See the grading rubric below.

Summarize your results from 1–14 in a report that is 3 pages or less in length and explains and interprets the results in ways that are understandable to someone who does not know statistics.

Submission: The summary report plus all of the work done in 1–14 (Minitab output and interpretations) as an appendix

Format:

Summary report

Points 1–14 addressed with appropriate output, graphs, and interpretations. Be sure to number each point 1–14.

Project Part C: Grading Rubric

| Category | Points | % | Description |
|---|---|---|---|
| Questions 1–12 and 14 (5 points each) | 65 | 65 | Addressed with appropriate output, graphs, and interpretations |
| Question 13 | 15 | 15 | Addressed with appropriate output, graphs, and interpretations |
| Summary | 20 | 20 | Writing, grammar, clarity, logic, and cohesiveness |
| Total | 100 | 100 | |

A quality paper will meet or exceed all of the above requirements.