
# APPLIED STATISTICS PROJECT TWO

QUESTION ONE (ex 22.22 from class text “The Statistical Sleuth”)
When she died in 1817, the English novelist Jane Austen had not yet finished the novel
Sanditon, but she did leave notes on how she intended to conclude the book. The novel was
completed by a ghost writer, who attempted to emulate Austen’s style. In 1978, a researcher
reported counts of some words found in chapters of books written by Austen and in chapters
written by the emulator. These are reproduced in the table below (Data from A. Q. Morton,
Literary Detection: How to Prove Authorship and Fraud in Literature and Documents, New
York: Charles Scribner’s Sons, 1978). Did the emulator do a good job in terms of matching the
relative rates of occurrence of these six words? In particular, did the emulator match the
relative rates at which Austen used these words in the first part of Sanditon?
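
One possible way to frame this question is as a test of homogeneity in a word-by-author table of counts, using a Poisson log-linear model. The sketch below assumes that framing. The word-count table is not reproduced in this text, so the counts are simulated placeholders rather than Morton's data, and the names `dat`, `word`, `author` and `count` are hypothetical.

```r
# Simulated placeholder counts. These are NOT the Morton (1978) data.
set.seed(1)
dat <- expand.grid(word = paste0("w", 1:6),
                   author = c("Austen", "Emulator"))
dat$count <- rpois(nrow(dat), lambda = 40)

# Independence model: the relative word rates are the same for both authors.
fit_indep <- glm(count ~ word + author, family = poisson, data = dat)

# Adding word:author gives the saturated model (deviance 0), so the residual
# deviance of the independence model is the drop-in-deviance statistic.
lr_stat <- deviance(fit_indep)
lr_df   <- df.residual(fit_indep)   # (6 - 1) * (2 - 1) = 5
p_lr    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Pearson chi-squared test on the same table, for comparison.
tab     <- xtabs(count ~ word + author, data = dat)
pearson <- chisq.test(tab)
```

With the real table, a small p-value would suggest that the emulator's relative word rates differ from Austen's. Restricting the rows to the two parts of Sanditon would address the second, more specific question.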
QUESTION TWO (ex 22.23 from class text “The Statistical Sleuth”)
On January 27, 1986, the night before the Space Shuttle Challenger exploded, an engineer
recommended to the National Aeronautics and Space Administration (NASA) that the shuttle
not be launched in the cold weather. The forecasted temperature for the Challenger launch
was 30 degrees F – the coldest launch ever. After an intense 3-hour telephone conference,
officials decided to proceed with the launch. Shown in Display 22.15 are the launch
temperatures and the number of O-ring problems in 24 shuttle launches prior to the
Challenger. Do these data offer evidence that the number of incidents increases with
decreasing temperature?
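
A Poisson log-linear regression of incident counts on launch temperature is one natural way to approach this question. The sketch below assumes that model. Display 22.15 is not reproduced in this text, so the data are simulated placeholders, and the variable names `temp` and `incidents` and the simulation parameters are hypothetical.

```r
# Simulated placeholder data. These are NOT the values in Display 22.15.
set.seed(2)
n_launches <- 24
temp       <- round(runif(n_launches, min = 50, max = 80))   # degrees F
incidents  <- rpois(n_launches, lambda = exp(3 - 0.06 * temp))

# Poisson regression: log(mean incidents) is linear in temperature.
fit      <- glm(incidents ~ temp, family = poisson)
coef_tab <- summary(fit)$coefficients
z_temp   <- coef_tab["temp", "z value"]

# One-sided Wald p-value for the alternative that the slope is negative,
# i.e. that incidents become more frequent as temperature decreases.
p_one_sided <- pnorm(z_temp)
```

A drop-in-deviance test via `anova(fit, test = "Chisq")` is an alternative to the Wald test, and may be preferable when counts are small.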
QUESTION THREE
A sample of size $n = 3$ is taken to estimate the “squared mean” IQ of undergraduate students
at the ANU. The sampled values are 120, 140, and 100. The “squared mean” is defined to be
$\mu^2$, where $\mu$ is the usual population mean.
a) Using all possible unique bootstrap samples of the observed data, compute the
bootstrap estimate of the standard error of $\hat{\mu}^2$.
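
With only three observations, the bootstrap distribution can be enumerated exactly rather than simulated. The sketch below lists all $3^3 = 27$ ordered resamples, collapses them to the 10 unique unordered samples with their probabilities, and computes the standard deviation of the squared sample mean over that distribution. The variable names are hypothetical. It also assumes the divisor is the full probability weight (the "ideal" bootstrap) rather than $B - 1$.

```r
iq <- c(120, 140, 100)

# All 27 ordered resamples of size 3; each has probability 1/27.
idx    <- expand.grid(1:3, 1:3, 1:3)
sorted <- t(apply(idx, 1, function(j) sort(iq[j])))

# Statistic of interest: the squared sample mean of each resample.
theta_star <- rowMeans(sorted)^2

# Collapse ordered resamples to unique unordered samples with weights.
key     <- apply(sorted, 1, paste, collapse = ",")
prob    <- tapply(rep(1 / 27, 27), key, sum)          # 10 unique samples
theta_u <- tapply(theta_star, key, function(v) v[1])

# Exact bootstrap mean and standard error of the squared mean.
boot_mean <- sum(prob * theta_u)
boot_se   <- sqrt(sum(prob * (theta_u - boot_mean)^2))
```

Printing `data.frame(prob, theta_u)` shows each unique sample, its probability, and its squared mean, which is the table a hand calculation would use.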
After the initial study of size $n = 3$, a further sample of 27 IQ measurements from ANU
undergraduates is taken. The vector of 30 IQ values, called IQvalues, was then analysed in
the statistical package R. Output of this analysis follows:
> bootres<-rep(0,3000)
>
> for(i in 1:3000) {
+
+ bootvalues<-sample(IQvalues,size=30,replace=TRUE)
+ bootres[i]<-mean(bootvalues)^2
+
+ }
>
> quantile(bootres,c(0.025,0.05,0.95,0.975))
2.5% 5% 95% 97.5%
11548.91 11664.00 12882.25 12980.80
> mean(IQvalues)
[1] 110.8
> mean(bootres)
[1] 12287.92
> var(bootres)
[1] 140130.5
Based on this output, answer the following questions:
b) Obtain 90% confidence intervals for both $\mu^2$ and $\mu$.
c) Obtain a bootstrap estimate of the bias of $\hat{\mu}^2$.
[Note: the bias is equal to the estimate
based on the original data less the average of the bootstrap replicates]
d) Does $\mu^2 = 12985$ seem plausible? Please provide reasons for your answer.
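
The arithmetic behind parts (b) to (d) can be sketched directly from the printed output. The sketch below assumes percentile bootstrap intervals, uses the sign convention for bias given in the note to part (c), and reads the value in part (d) as a hypothesised squared mean of 12985. All variable names are hypothetical.

```r
# Values copied from the R output above.
q <- c("2.5%" = 11548.91, "5%" = 11664.00, "95%" = 12882.25, "97.5%" = 12980.80)
xbar      <- 110.8
boot_mean <- 12287.92
boot_var  <- 140130.5

# (b) 90% percentile interval for the squared mean, and for the mean itself.
# Taking square roots is valid because the bootstrap means are positive,
# so the transformation is monotone and quantiles map to quantiles.
ci90_mu2 <- q[c("5%", "95%")]
ci90_mu  <- sqrt(ci90_mu2)

# (c) Bias, using the convention in the note: original estimate minus the
# average of the bootstrap replicates.
theta_hat <- xbar^2
bias_note <- theta_hat - boot_mean
boot_se   <- sqrt(boot_var)

# (d) Is 12985 inside the 95% percentile interval?
inside_95 <- 12985 >= q[["2.5%"]] && 12985 <= q[["97.5%"]]
```

Note that many texts define bootstrap bias with the opposite sign (average of the replicates minus the original estimate), so the convention should be stated alongside any reported value.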
QUESTION FOUR
A logistic regression model was fitted to investigate whether four continuous explanatory
variables (X1,X2,X3,X4) were related to the probability that a response (Y) takes the value 1
(the categorical response can take the value “1” or “0”). The results of a logistic regression
model fitted to these data using R are provided in the output below. Based on this output,
answer the following questions:
> summary(fit1)
Call:
glm(formula = Y ~ X1 + X2 + X3 + X4, family = binomial(link = logit))
Deviance Residuals:
Min 1Q Median 3Q Max
-2.0977 -0.8558 0.4788 0.7529 1.6552
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.31293 0.64259 3.599 0.000319 ***
X1 -0.02975 0.01350 -2.203 0.027577 *
X2 -0.40879 0.59900 -0.682 0.494954
X3 0.30525 0.60413 0.505 0.613362
X4 -1.57475 0.50162 -3.139 0.001693 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 122.32 on 97 degrees of freedom
Residual deviance: 101.05 on 93 degrees of freedom
> anova(fit1)
Analysis of Deviance Table
Model: binomial, link: logit
Response: Y
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev
NULL 97 122.32
X1 1 7.4050 96 114.91
X2 1 1.8040 95 113.11
X3 1 ?????? 94
X4 1 10.4481 93 101.05
a)  What is the estimated probability that Y takes the value 1 when X1=40, X2=1,
X3=0 and X4=0, everything else held constant? If possible, provide a 95%
b)  Compute the missing value labelled ?????? in the above Analysis of Deviance
Table. Based on this table, is it possible to test whether X1, X3, and X4 are
needed in a model that contains X2?
c)  If the above model were re-fitted with the Y variable coded differently so that
“1” became “0” and “0” became “1” how would the coefficient estimates
above change? You must provide a reason for your answer.
d)  Below you are given some further R output relating to the four covariates used
to fit the above model. Based on this information discuss the suitability of the
fitted model and any recommendations you might have for changes to this
model.
> cor(cbind(X1,X2,X3,X4))
X1 X2 X3 X4
X1 1.0000000 -0.2177187 -0.2093691 -0.1553068
X2 -0.2177187 1.0000000 0.9945411 0.1110746
X3 -0.2093691 0.9945411 1.0000000 0.1221508
X4 -0.1553068 0.1110746 0.1221508 1.0000000
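
Several quantities in parts (a), (b) and (d) follow from arithmetic on the printed output. The sketch below works from the rounded values shown above, so the results are approximate. The variable names are hypothetical, and the pairwise inflation factor is used only as a rough indicator of collinearity, not as the full variance inflation factor for the four-covariate model.

```r
# Coefficients copied from summary(fit1).
beta <- c("(Intercept)" = 2.31293, X1 = -0.02975, X2 = -0.40879,
          X3 = 0.30525, X4 = -1.57475)

# (a) Fitted probability at X1 = 40, X2 = 1, X3 = 0, X4 = 0.
x_new <- c(1, 40, 1, 0, 0)
eta   <- sum(beta * x_new)   # linear predictor on the logit scale
p_hat <- plogis(eta)         # inverse logit
# A standard error for eta needs the coefficient covariance matrix,
# vcov(fit1), which is not part of the printed output.

# (b) Missing deviance for X3: the residual deviance after adding X3 is the
# final residual deviance plus the drop attributed to X4.
resid_dev_x3 <- 101.05 + 10.4481
dev_x3       <- 113.11 - resid_dev_x3

# (d) X2 and X3 are almost perfectly correlated. 1 / (1 - r^2) is the
# variance inflation factor that would apply if they were the only two
# predictors in the model.
r_23          <- 0.9945411
pairwise_infl <- 1 / (1 - r_23^2)
```

The very large inflation factor is consistent with the large standard errors reported for X2 and X3 in the coefficient table, which suggests that only one of the two is likely to be needed.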