- [2006.02.06] After additions and corrections, the final grades are on their way to Osiris, at last.

For the 'logopedisten' (speech therapists) I will send a paper certificate to the Academie Gezondheidszorg Utrecht, HB 4.05.
- [2006.02.06] Added hyperlink to applets for normal probabilities, by Gary McClelland, Univ of Colorado at Boulder, CO [hyperlink by Jan Willem Chevalking].
- [2006.01.28] It's almost impossible to believe, but the **grades** are finally here. First, I'd like to apologise again for the long delay, due to serious understaffing.

As you know, I could not grade each assignment separately. Assignments 1 and 2 had already been graded during the course. From the other assignments (3 to 8) I've selected *two* at random for grading. These four graded assignments determine the assignment-part of the final grade (70%). If you missed one assignment, then this part of the grade was lowered by one grade point.

The total grade was calculated by adding the weighted assignment-part (70%) and final-part (30%).

If you find a question mark or hyphen in the grades matrix, then this means that I could not retrieve or open your work. Please send me your document once again, by email to `hquene@gmail.com`. If you think that I've made a mistake, then let me know as well.

Finally, thanks for your effort in writing and reviewing assignments, and for your cooperation and patience. Best -- Hugo
- [2005.10.22] Added explanation about reliability, in session 5.
- [2005.10.17] Summary text about effect size translated into English.
- [2005.09.22] For students who need additional training in statistics, there will be some additional class hours: Wednesdays, 15:00-17:00h, Kromme Nieuwegracht 66, room 0.08.
- [2005.08.23] Instead of WebCT Vista, we will use a Yahoo bulletin board to exchange information in this course. The bulletin board or 'group' for this course is at uk.groups.yahoo.com/group/mer200506uu. Please subscribe to (join) this group, by visiting the group webpage and following instructions given there. More instructions are available below, see the first session (16 Sept) in the schedule. PS: This is a members-only group, meaning that only members can access its contents. The teacher has to approve your subscription for membership.
- [2005.04.08] Added link to GraphPad easy online statistical calculators [hyperlink by Min Que].

Trans 10, room 1.31

office hours Tue 14:00-15:30 and by appointment

course 2005-2006, period 1 (sept-nov) | | |
---|---|---|
Fri | 13:00-16:00 | Drift 15, room 0.02 |

The course will be taught in English.

- make assignments about the topics covered in the last meeting;
- hand in your assignments (see below), by Tuesdays 18:00h at the latest;
- review and judge the assignments of a fellow student, by Thursday 12:00h at the latest;
- study new materials.

Put your work in one document per week, preferably in PDF, since that ensures correct display of figures and tables. Place your document in a folder on your personal webpage at UU — check the CIM help pages for instructions. Make sure your document is accessible over the web.

Now send a short message to the group bulletin board, with a short description of your work and with a working hyperlink (URI) to your document on your webpage. (You don't have to enclose the URI in tags, just copying the right

Retrieve the document of your victim for this week, and write a review of her/his work in a separate document. Place your review on your personal webpage, and again announce its location on the group bulletin board, by Thursday 12:00h at the latest.

Notice that strict timing is required to make this schedule work!

Peer review, commenting on the work of a peer or colleague, is serious business. You can learn more about it through these web pages:

- You Lost Me In The Third Paragraph, about "gracious criticism" (Writing Center, George Mason Univ, Fairfax, VA);
- Responding To Other People's Writing (Writing Center, Univ North Carolina, Chapel Hill, NC).

Experimenting. General methodology. Design of experiment.

Peer review.

- Maxwell, S. E., & Delaney, H. D. (2004). Designing Experiments and Analyzing Data: A model comparison perspective (2nd ed.). Mahwah, NJ: Lawrence Erlbaum. Chapter 1 "The Logic of Experimental Design", pp.3-35.

- This course requires and presumes that you already have previous knowledge of statistics, equivalent to an introductory course in statistics. You may test yourself by means of this exam (tentamen) of the Statistiek course.
- Make sure that you have an account on the Solis UU network.
- Join the course bulletin board for this class. By way of experiment, we'll use a Yahoo! group for exchanging documents and information. You'll need a Yahoo! ID to join this group, so you may need to sign up or register as a Yahoo! user. The homepage for this class group is uk.groups.yahoo.com/group/mer200506uu. Visit this homepage, and join the group [just press the Join button], so you can tell others about your assignments, reviews, etc.

More information about how to submit and distribute your work is given above.
- Browse the various websites listed below. Make sure to browse the Research Methods Knowledge Base.

Your elaborations on the questions below have to be placed on your personal webpage, and announced on the group webpage, as described above. Write clearly, correctly, and concisely. Make a PDF (preferred) or PS or HTML document, or plain ASCII text file, with a maximum length of about 2000 words. (I've made a short explanation about how to make a non-proprietary document, in Dutch.)

- Visit the Letterenbibliotheek. Take a recent issue (2004 or 2005) of an experimental journal (in phonetics, psycholinguistics, etc.), such as Language and Speech, Journal of Phonetics, Speech Communication, Phonetica, etc.

(a) Which questions does the study attempt to answer?

(b) Which independent and dependent variables are involved in the study?

(c) Describe the design of the experiment.
- A researcher wants to know whether stressed vowels have a longer duration than unstressed vowels. There are two groups of participants, and the researcher is interested in their difference (e.g. L1 and L2 speakers). The target vowels occur in the first vs. the third syllable of three-syllable words. To prevent strategic behavior (what's that?), a speaker may not produce words with different stress patterns: all words produced by a single speaker need to have the same stress pattern.

Provide a possible design for this experiment. Indicate which factors are between or within subjects, dependent or independent, etc. Make a graph or table to illustrate your design.
- Answer the following questions in the chapter by Maxwell & Delaney, Chapter 1: Exercises 1, 5, 6, 7, 10.
- This last assignment is not for peer review but for independent study. Now is the perfect time to brush up your statistical skills. Answer the exams (tentamina) of my Statistics course (see above). Afterwards, check your answers against those provided on the course webpage. Determine which parts of your statistics proficiency are still deficient. Design a plan of action to remedy your shortcomings during this teaching period.

- Research Methods Knowledge Base by William M. Trochim, Cornell University, Ithaca, NY (The Web Center for Social Research Methods)
- Statistics Every Writer Should Know by Robert Niles, journalist at the Los Angeles Times.
- Rice Virtual Lab in Statistics by David Lane, Rice University, Houston, TX. This website also contains HyperStat Online, an online introduction to statistics.
- WISE Project, Web Interface for Statistics Education, at Claremont Graduate University, Claremont, CA.
- StatPages.Net by John C. Pezzullo, Georgetown University, Washington DC. A treasure trove of helpful links and programs.
- web-based tools for statistical computation, by Richard Lowry, Vassar College, Poughkeepsie, NY.
- GraphPad QuickCalcs, easy online statistical calculators.
- Lucian Freud, Grosses Interieur W 11 (nach Watteau) (1981/'83).

ANOVA: general, interaction, fixed vs random factors, error terms.

**Readings**:
Ferguson, G. A., & Takane, Y. (1989). Statistical Analysis in Psychology and Education (6th ed.). New York: McGraw-Hill. Chapter 16 "Analysis of Variance: Two-Way Classification", pp. 272-296.

Your answers and solutions to the questions below have to be handed in as described above. As always, write clearly, correctly, and concisely.

- Answer the following questions in Chapter 16 by Ferguson & Takane: Exercises 1, 5, 8.
- In a study of cardiovascular risk factors, joggers who run at least 15 miles per week were compared with a control group described as "generally sedentary". Both men and women participated in this study. The design is a 2x2 between-subjects ANOVA, with Group and Sex as factors. There were 200 participants for each combination of factors. One of the dependent variables is the rate of heartbeat of a participant, after 6 minutes on a treadmill, expressed in beats per minute.

Data from this study are available here in SPSS format, or as plain text (the latter file contains variable names in the first line).

(a) Which auxiliary theories (à la Meehl) are needed for this study? Comment on construct validity.

(b) Is it allowed to conduct an analysis of variance on these data? Motivate your answer with relevant statistical considerations.

(c) Conduct a two-way ANOVA on these data.

(d) Write a summary of the results of this study, and draw your conclusions clearly.

(e) From each cell (combination of factors), draw a random sample of n=20 individuals, out of the 200 in that cell. Explain how you have performed the random sampling. Repeat the two-way ANOVA on this smaller data set.

(f) Discuss the similarities and differences in results between the full-sample ANOVA in (c) and the subsample ANOVA in (e).

This exercise is adapted from: Moore, D.S., & McCabe, G.P. (2003). Introduction to the Practice of Statistics (4th ed.). New York: Freeman. Example 13.8, pp.813-816.
- In a **fictitious** study, the effect of a growing potion was investigated. The growing potion was administered in 5 different dosages (of 1, 3, 5, 10, and 20 units per day), to 10 men and to 10 women for each dosage, during 15 days. The dependent variable is the increase in body length of a participant, after 15 days, in cm.

Data from this study are available here in SPSS format, or as plain text (the latter file contains variable names in the first line).

(a) Import these data into SPSS or a statistical package of your choice. Make a graph of the increase in body length, for each of the 10 conditions. (Hint: In SPSS use a "clustered boxplot".) Discuss what the graph shows.

(b) Conduct a two-way ANOVA on these data, with Sex and Dosage as two "fixed" factors.

(c) What is the range of generalisation over dosages, in the ANOVA in (b)? Discuss the external validity of the dosage factor.

(d) Conduct a two-way ANOVA, but now with Dosage as "random" factor. (Hint: SPSS does not handle "mixed" models like this one very well. It's probably easiest to calculate the F-ratios by hand, using the ANOVA results obtained under (b) above.)

(e) What is now the range of generalisation over dosages, in the ANOVA in (d)? Again discuss the external validity of the dosage factor.

(f) Discuss the similarities and differences in results between the two ANOVAs in this assignment. Does the growing potion have a different effect on men and women?
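For part (d), the F-ratios can be recomputed by hand from the mean squares of the fixed-effects ANOVA in (b). The sketch below assumes a common textbook rule for a two-way design with one fixed factor (Sex) and one random factor (Dosage): the fixed factor is tested against the interaction, while the random factor and the interaction are tested against the within-cell error. Check this rule against your own textbook before relying on it. The function name and the mean squares are made-up placeholders, not results from the course data.

```python
# Hand calculation of F-ratios for a mixed model (Sex fixed, Dosage random).
# The mean squares below are hypothetical placeholders; substitute the values
# from your own fixed-effects ANOVA table.

def mixed_model_f(ms_fixed, ms_random, ms_interaction, ms_within):
    """Return F-ratios for a two-way mixed model.

    Assumed rule (restricted mixed model):
      fixed factor  -> tested against the interaction mean square
      random factor -> tested against the within-cell mean square
      interaction   -> tested against the within-cell mean square
    """
    return {
        "fixed": ms_fixed / ms_interaction,
        "random": ms_random / ms_within,
        "interaction": ms_interaction / ms_within,
    }

# Hypothetical mean squares: Sex, Dosage, Sex x Dosage, within cells.
f_ratios = mixed_model_f(ms_fixed=120.0, ms_random=300.0,
                         ms_interaction=40.0, ms_within=10.0)

# Degrees of freedom for 2 sexes, 5 dosages, 10 participants per cell.
df_sex, df_dosage = 2 - 1, 5 - 1
df_interaction = df_sex * df_dosage
df_within = 2 * 5 * (10 - 1)

print(f"F_sex({df_sex},{df_interaction}) = {f_ratios['fixed']:.2f}")
print(f"F_dosage({df_dosage},{df_within}) = {f_ratios['random']:.2f}")
print(f"F_interaction({df_interaction},{df_within}) = {f_ratios['interaction']:.2f}")
```

Note how the test of Sex now has only 4 denominator degrees of freedom instead of 90, which is the price of generalising over dosages.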

ANOVA: Repeated Measures, post-hoc tests.

- Ferguson, G. A., & Takane, Y. (1989). Statistical Analysis in Psychology and Education (6th ed.). New York: McGraw-Hill. Chapter 19 "Repeated Measurement and Other Experimental Designs", pp.347-390.
- my manual about ANOVA in SPSS (now in English!).

Compare these notes from similar courses in experimental research methods, at other universities:

- Repeated Measures ANOVA, from Durham University, Durham, England — with instructions for SPSS.
- Repeated Measures ANOVA, from University of Guelph, Canada — with instructions for SPSS.
- Repeated Measures ANOVA Using SPSS Manova, Univ of Texas, Austin, TX — with solid background.
- MANOVA and GLM in SPSS, from UCLA, Los Angeles, CA — about differences between these two methods for repeated measures ANOVA.

Your answers and solutions to the questions below have to be handed in as described above. As always, write clearly, correctly, and concisely.

- Answer the following questions in Chapter 19 by Ferguson & Takane: Exercises 5, 8, 9, 11, 12, 14. Use your own words. For exercise 14, answer the what-question as well as the why-question.
- Using SPSS, re-calculate the example of Ferguson & Takane §19.9. Import the data correctly; you'll need an additional column for factor R or Group. Describe a study that could plausibly have generated these data, and treat the data as if they are the results of that study. Discuss all relevant aspects, including possible violations of assumptions (§19.10). Draw clear conclusions.

If we are comparing *two* group means, as in a pairwise *t* test, then the effect size *d* is defined as: *d* = (m_{1}-m_{2})/s, where m_{1} and m_{2} are the two means and s is their common standard deviation (Cohen, 1969, p.18).

A value of *d*=.2 is regarded as small, *d*=.5 as medium, *d*=.8 as large. It is left to the researcher to classify intermediate values (ibid., p.23-25).

The difference in body length between girls of 15 and 16 years old has a small effect size, just as male-female differences in sub-tests of an IQ test. "A medium effect size is conceived as one large enough to be visible to the naked eye," e.g. the difference in body length between girls of ages 14 and 18. Large effect sizes are "grossly perceptible", e.g. the difference in body length between girls of ages 13 and 18, or the difference in IQ between PhD graduates and freshman students.

If we are comparing *k* group means, as in an *F* test (ANOVA), then the effect size *f* is defined as: *f* = s_{m}/s, where s_{m} in turn is defined as the standard deviation of the *k* different group means (ibid., p.268). If *k*=2, then *d*=2*f* (ibid., p.278). These rules apply only if all groups are of the same size; otherwise different criteria apply.

A value of *f*=.10 is regarded as small, *f*=.25 as medium, *f*=.40 as large.
Again, it is left to the researcher to classify intermediate values (ibid., p.278-281).
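The two definitions above can be tried out with a few lines of Python. This is only a sketch with invented means and an invented common standard deviation; the function names `cohen_d` and `cohen_f` are illustrative, and s_{m} is taken as the standard deviation of the group means with divisor *k*, which is assumed to match Cohen's definition for equal group sizes.

```python
from statistics import pstdev

def cohen_d(m1, m2, s):
    """Effect size d for two group means with common standard deviation s."""
    return (m1 - m2) / s

def cohen_f(group_means, s):
    """Effect size f: standard deviation of k equal-sized group means, over s."""
    # pstdev divides by k (not k - 1); assumed here to match Cohen's s_m.
    return pstdev(group_means) / s

# Hypothetical example: two groups with means 105 and 100, common s = 10.
d = cohen_d(105, 100, 10)
f = cohen_f([105, 100], 10)
print(d, f, d == 2 * f)
```

With two groups the relation *d* = 2*f* can be checked directly, since the two means lie half their difference away from the grand mean.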

Small-sized effects can also be meaningful or interesting. Large differences may correspond to small effect sizes, due to measurement error, disruptive side effects, etc. Medium effect sizes are observed in IQ differences between house painters, mechanics, carpenters, butchers. Large effect sizes are observed in IQ differences between house painters, mechanics, carpenters, (railroad) engine drivers, and lab technicians.

Adapted from: Cohen, J. (1969). Statistical Power Analysis for the Behavioral Sciences (1st ed.). New York: Academic Press.

my answer with annotations.

input data for SPSS, my answer with annotations.

There will be no class this week. Use this extra time to catch up on reading materials, assignments, to study reviews, etc. You'll need this preparation for the final assignment!

When: 19:00-21:45. Where: Drift 21, room 0.03.

Signal Detection Theory.

**Reading**:

Gelfand, S.A. (1990). Hearing: An Introduction to Psychological and Physiological Acoustics (2nd ed.). New York: Marcel Dekker. Chapter 8 "Theory of Signal Detection", pp. 207-217. [ISBN 0-8247-8368-9].

- An Introduction to Signal Detection Theory, good tutorial with a nifty SDT Applet, from the WISE Project at Claremont Graduate University, Claremont, CA. [hyperlink by Ivana Brasileiro Reis Pereira]
- Normal Probability Calculator, converts between *Z* and *p* [hyperlink by Marco 'mvenus82']

Your answers and solutions to the questions below have to be handed in as described above. As always, write clearly, correctly, and concisely.

- In a so-called "Yes/No" perception test, two types of stimuli (*N* and *N+S* respectively) are both perceived according to a stochastic process with standard normal distribution. The distribution of "Yes" and "No" responses is given below for each stimulus type. Calculate d' in Z units.

  stimulus | *N* | *N+S* |
  ---|---|---|
  "Yes" | .10 | .80 |
  "No" | .90 | .20 |

- In a so-called "Yes/No" perception test, two types of stimuli (*N* and *N+S* respectively) are both perceived according to a stochastic process with standard normal distribution. The results show that d'=2. Calculate the expected proportion of "false alarms", if P(hit) = 90.15%.
- True or false?
  - The points on one ROC curve correspond to different d' values, at the same criterion.
  - The points on one ROC curve correspond to different criteria, at the same d' value.
  - The points on one ROC curve correspond to various d' values, at the same hit rate.

- We have conducted two "Yes/No" perception tests, in Experiments I and II. Two stimuli, *N* and *N+S*, were both perceived according to a standard normal distribution. The following proportions of "hits" and "false alarms" were observed in these experiments.

  proportion | Exp.I | Exp.II |
  ---|---|---|
  P(hit) | .6915 | .9713 |
  P(false alarm) | .3085 | .4602 |

- In order to investigate the perception of two types of stimuli, two methods are available: by means of a so-called "Yes/No" judgement about each stimulus, or by means of pairwise comparison for all possible pairs of stimuli. What are the advantages and disadvantages of these two methods?
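The conversion between proportions and Z units used in these exercises can be sketched in Python as follows. The proportions in the example are invented and not taken from the exercises; the function names are illustrative, and the criterion measure c follows the usual definition in detection theory, which is assumed here rather than quoted from the readings.

```python
from statistics import NormalDist

def d_prime(p_hit, p_false_alarm):
    """Sensitivity d' = z(hit rate) - z(false-alarm rate), in Z units."""
    z = NormalDist().inv_cdf   # inverse of the standard normal CDF
    return z(p_hit) - z(p_false_alarm)

def criterion(p_hit, p_false_alarm):
    """Response bias c = -(z(H) + z(F)) / 2; zero means no bias."""
    z = NormalDist().inv_cdf
    return -(z(p_hit) + z(p_false_alarm)) / 2

# Hypothetical proportions of "Yes" responses to N+S and to N.
p_hit, p_fa = 0.8413, 0.1587
print(round(d_prime(p_hit, p_fa), 2), round(criterion(p_hit, p_fa), 2))
```

Going the other way, `NormalDist().cdf(z)` turns a Z value back into a proportion, which is what is needed when d' and the hit rate are given and the false-alarm rate is asked for.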

In medical applications, a terminology inconsistent with that of [signal] detection theory is sometimes used: The hit rate is called "sensitivity", and the correct-rejection rate "specificity".

Macmillan, N.A., & Creelman, C.D. (1991). Detection Theory: A user's guide. Cambridge: Cambridge University Press [ISBN 0-521-36359-4]. p 32.

regression, error of measurement, reliability.

- Ferguson, G. A., & Takane, Y. (1989). Statistical Analysis in Psychology and Education (6th ed.). New York: McGraw-Hill. Chapter 24 "Errors of Measurement", pp.466-478.
- **optional**: Trochim, W.M. (2002). Measurement. In: Research Methods Knowledge Base (Web Center for Social Research Methods).

Your answers and solutions to the questions below have to be handed in as described above. As always, write clearly, correctly, and concisely.

[Update 2005.10.21] Let us assume that we have 2 observations for each of 5 persons. These observations are about the perceived body weight, as judged by two 'raters' or judges, x1 and x2. The data are as follows:

person | x1 | x2 |
---|---|---|
1 | 60 | 62 |
2 | 70 | 68 |
3 | 70 | 71 |
4 | 65 | 65 |
5 | 65 | 63 |

Because we have only two measures (variables), there is only one pair of measures to compare in this example. Very often, however, there are more than two judges involved, and hence many more pairs.

First, let us calculate the correlation between these two variables x1 and x2. This can be done in SPSS with the `Correlations` command (Analyze > Correlate > Bivariate, check Pearson correlation coefficient).
This yields r=.904, and the average r (over 1 pair of judges) is the same.

If you need to compute r manually, one method is to first convert x1 and x2 to Z-values [(x-mean)/s], yielding z1 and z2. Then r = SUM(z1×z2) / (n-1).

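This value of r corresponds to a standardized Cronbach's Alpha of (2×.904)/(1+.904) = .950 (with N=2 judges).
Cronbach's Alpha can be obtained in SPSS by choosing Analyze > Scale > Reliability Analysis. Select the "items" (or judges) x1 and x2, and select model Alpha.
The output states: `Reliability Coefficients [over] 2 items, Alpha = .9459` [etc.] SPSS computes this value from the variances and covariance of x1 and x2 rather than from their correlation, which is why it differs slightly from the .950 obtained above.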

If the same average correlation r=.904 had been observed over 4 judges (i.e. over 4×3/2 = 6 pairs of judges), then that would have indicated an even higher inter-rater reliability, viz. alpha = (4×.904)/(1+3×.904) = .974.
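The calculations in this example can be reproduced outside SPSS. The sketch below uses the five pairs of observations given above; the function names are illustrative. It computes r by the z-score method, the alpha implied by an average correlation over k judges, and Cronbach's alpha from the variances of the judges and of their sum, which is assumed to be the quantity that the SPSS output reports.

```python
from statistics import mean, stdev, variance

x1 = [60, 70, 70, 65, 65]
x2 = [62, 68, 71, 65, 63]

def pearson_r(a, b):
    """r = SUM(z_a * z_b) / (n - 1), using sample standard deviations."""
    ma, mb, sa, sb = mean(a), mean(b), stdev(a), stdev(b)
    products = ((u - ma) / sa * (v - mb) / sb for u, v in zip(a, b))
    return sum(products) / (len(a) - 1)

def standardized_alpha(r_mean, k):
    """Alpha implied by the average inter-judge correlation over k judges."""
    return k * r_mean / (1 + (k - 1) * r_mean)

def raw_alpha(*items):
    """Cronbach's alpha from item variances and the variance of the sum."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

r = pearson_r(x1, x2)
print(round(r, 3))
print(round(standardized_alpha(r, 2), 3))
print(round(raw_alpha(x1, x2), 4))
print(round(standardized_alpha(0.904, 4), 3))
```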

Exactly the same reasoning applies if the data are not provided by 2 raters judging the same 5 objects, but by 2 test items "judging" a property of the same 5 persons. Both approaches are common in language research. Although SPSS only mentions items and inter-item reliability, the analysis is equally applicable to raters or judges, and inter-rater reliability.

Note that both judges (items) may be inaccurate. A priori, we do not know how good each judge is, nor which judge is better. We know, however, that their reliability of judging the same thing (true body weight, we hope) increases with their mutual correlation.

Now, let's regard the same data, but in a different context. We have one measuring instrument of the abstract concept x that we try to measure. The same 5 objects are measured twice (test-retest), yielding the data given above. In this test-retest context, there is always just one correlation, and the idea of inter-rater reliability does not apply in this context. We find that r_{xx}=.904.

This reliability coefficient is defined as r = s^{2}_{T} / s^{2}_{x}, the ratio of true-score variance to total observed variance. This provides us with an *estimate* of how much of the total variance is due to variance in the underlying, unknown, "true" scores. In this example, 90.4% of the total variance is estimated to be due to variance of the true scores. The complementary part, 9.6% of the total variance, is estimated to be due to measurement error. If there were no measurement error, then we would predict perfect correlation (r=1); if the measurements contained only error (and no true score component at all), then we would predict zero correlation (r=0) between x1 and x2.

In this example, we find that

s_{e} = s_{x} × sqrt(1-.904) = sqrt(15.484) × sqrt(.096) = 1.219

check: s^{2}_{x} = 15.484 = s^{2}_{T} + s^{2}_{e} =
s^{2}_{T} + (1.219)^{2},

so s^{2}_{T} = 15.484 - 1.486 = 13.997

and indeed r = .904 = s^{2}_{T} / s^{2}_{x} = 13.997 / 15.484.

Supposedly, x1 and x2 measure the same property x. To obtain s^{2}_{x}, the total observed variance of x (as needed above), we cannot use x1 exclusively nor x2 exclusively. The total variance is obtained here from the two standard deviations:

s^{2}_{x} = s_{x1} × s_{x2}

s^{2}_{x} = 4.18330 × 3.70135 = 15.484
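As a check on the arithmetic above, the same quantities can be computed in a few lines of Python. This sketch takes r = .904 from the text as given and follows the definition of the total variance used here (the product of the two standard deviations); the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

x1 = [60, 70, 70, 65, 65]
x2 = [62, 68, 71, 65, 63]
r_xx = 0.904                          # reliability coefficient from the text

var_x = stdev(x1) * stdev(x2)         # total observed variance, as defined above
s_e = sqrt(var_x) * sqrt(1 - r_xx)    # standard error of measurement
var_true = var_x - s_e ** 2           # estimated true-score variance

print(round(var_x, 3), round(s_e, 3), round(var_true, 3))
print(round(var_true / var_x, 3))     # recovers the reliability coefficient
```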

In general, a reliability coefficient smaller than .5 is regarded as low, between .5 and .8 as moderate, and over .8 as high.

Multiple regression, multivariate analyses.

- Moore, D.S., & McCabe, G.P. (2003). Introduction to the Practice of Statistics (4th ed.). New York: Freeman. [ISBN 0-7167-9657-0]. Chapter 11 "Multiple Regression", pp. 708-745. [book companion website].
- **optional**: Devore, Jay & Peck, Roxy (2001) Statistics: The Exploration and Analysis of Data (4th ed.). Pacific Grove, CA: Duxbury. [ISBN 0-534-35867-5.] Chapter 14 "Multiple Regression Analysis", pp. 553-610.

Your answers and solutions to the questions below have to be handed in as described above. As always, write clearly, correctly, and concisely.

- Answer the following questions:
Moore & McCabe, Chapter 11: Exercises 2, 3, 16, 33.

Data for the last two questions are available here in plain text format (the first line of this file contains variable names).

For questions 16 and 33 the FORWARD method is most appropriate.
This means that you start with an empty model (only intercept b_{0})
to which predictors are added step by step. After each addition of a predictor,
you check whether the model performs significantly better than before
(e.g. by checking whether the increase in R^{2} is significant).

The questions are about the **increment** in R^{2} by **adding** a predictor.
The relevant information is easier to find in the SPSS output if you specify
the FORWARD method.

As a bonus, you could check what happens if you exclude case #51 from the
data set, e.g. by marking it as a missing value. This is quite easy if you
keep the regression command in a Syntax window for repeated use.
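The test on the increment in R^{2} at each forward step can also be done by hand. The sketch below uses the standard F statistic for comparing two nested regression models; the function name and the R^{2} values are hypothetical and only illustrate the calculation, they are not results from the exercise data.

```python
def r2_increment_f(r2_reduced, r2_full, n, p_full, q=1):
    """F statistic for the gain in R^2 when q predictors are added.

    n       number of cases
    p_full  number of predictors in the larger model (excluding intercept)
    q       number of predictors added in this step
    """
    numerator = (r2_full - r2_reduced) / q
    denominator = (1 - r2_full) / (n - p_full - 1)
    return numerator / denominator

# Hypothetical R^2 values after 0, 1, 2 and 3 predictors, with n = 100 cases.
steps = [0.00, 0.20, 0.26, 0.265]
for p, (before, after) in enumerate(zip(steps, steps[1:]), start=1):
    f = r2_increment_f(before, after, n=100, p_full=p)
    print(f"step {p}: R2 {before:.3f} -> {after:.3f}, "
          f"F(1, {100 - p - 1}) = {f:.2f}")
```

In this made-up example the third predictor raises R^{2} only marginally, so its F value is small and a forward procedure would stop after two predictors.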

For admission to a university, two things are taken into account: (a) your average grades in the final years of high school (HSM, HSS, HSE), and (b) your score in a national admissions exam, the Scholastic Aptitude Test (SAT), which is comparable to the Dutch CITO test. Top-class universities, like Harvard, Yale, Stanford, etc., use both parameters in selection. You have to be the best in your class (but your classmates are strongly competing for this honor), plus you need a minimal score on your SAT.

During your academic study, all your grades and results contribute to your Grade Point Average (GPA), a weighted average grade. This GPA is generally used as an indication of academic achievement and success. The authors attempt to predict the GPA from the previously obtained indicators (a) and (b).

Take a sample of fathers, and note their body length (X). Wait for one full generation, and measure the body length of each father's oldest adult son (Y). Make a scattergram of X and Y. The best-fitting line through the observations has a slope of less than 1 (typically about .65). This is because the sons' length Y tends to "regress to the mean": outlier fathers tend to produce average sons, and average fathers also tend to produce average sons. Galton called this phenomenon "regression towards mediocrity". Thus the best-fitting line is a "regression" line because it shows the degree of regression to the mean, from one generation to the next. (Note that any slope larger than 0 suggests a hereditary component in the sons' body length, Y.)

Questions: Which variable has the larger variance, X or Y? Does the variation in body length increase or decrease (regress) over generations? Why?
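Regression to the mean is easy to see in a small simulation. The sketch below is not based on real data: it assumes a father-son correlation of .65, and it assumes (as one possible scenario) that fathers and sons have the same mean and the same spread in body length. Under those assumptions the least-squares slope comes out close to the correlation, well below 1.

```python
import random
from statistics import mean

random.seed(1)
R = 0.65            # assumed father-son correlation (hypothetical)
MEAN, SD = 178, 7   # assumed mean and SD of body length in cm

fathers, sons = [], []
for _ in range(5000):
    zf = random.gauss(0, 1)
    # The son's z-score takes over a fraction R of the father's z-score,
    # plus independent noise scaled so that sons vary as much as fathers.
    zs = R * zf + (1 - R ** 2) ** 0.5 * random.gauss(0, 1)
    fathers.append(MEAN + SD * zf)
    sons.append(MEAN + SD * zs)

mx, my = mean(fathers), mean(sons)
sxy = sum((x - mx) * (y - my) for x, y in zip(fathers, sons))
sxx = sum((x - mx) ** 2 for x in fathers)
slope = sxy / sxx   # least-squares slope of sons' length on fathers' length
print(round(slope, 2))
```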

- Why is it called regression, by Ann Lehman and John Sall (SAS Institute).
- Galton, Pearson, and the Peas, by Jeffrey M. Stanton (Syracuse University).

Non-normal data, transformations, nonparametric testing.

- Ferguson, G. A., & Takane, Y. (1989). Statistical Analysis in Psychology and Education (6th ed.). New York: McGraw-Hill. §15.11 "Assumptions underlying the Analysis of Variance" and §15.12 "Transformation of Data", pp. 261-271.
- Ferguson, G. A., & Takane, Y. (1989). Statistical Analysis in Psychology and Education (6th ed.). New York: McGraw-Hill. Chapter 22 "Nonparametric Tests of Significance", pp. 431-448.
- **optional**: Devore, Jay & Peck, Roxy (2001) Statistics: The Exploration and Analysis of Data (4th edition). Pacific Grove, CA: Duxbury. ISBN 0-534-35867-5. §7.4 "Checking for Normality and Normalizing Transformations", pp. 266-278. §11.4 "Distribution-Free Procedures for Inferences...", pp. 445-454. [provides a different explanation of the Wilcoxon rank sum test, equivalent to the Mann-Whitney test.]
- **optional**: Vocht, Alphons de (2000) Basishandboek SPSS 10 voor Windows 98/ME/2000. Utrecht: Bijleveld Press. ISBN 90-5548-113-0. Chapter 15 "Niet-parametrische toetsen" (nonparametric tests), pp. 223-240 [practical information, in Dutch, about how to use these tests in SPSS]. Back flap, "Overzicht Statistische Technieken" (overview of statistical techniques), p.256 [overview table about when to use which test].
- **optional**: Maxwell, S. E., & Delaney, H. D. (1990). Designing Experiments and Analyzing Data: A model comparison perspective. Belmont, CA: Wadsworth. Chapter 15 "Robust ANOVA and ANCOVA", pp.695-723.

- StatSoft Electronic Textbook, look under "Nonparametric Statistics".

Your answers and solutions to the questions below have to be handed in as described above. As always, write clearly, correctly, and concisely.

- Answer the following questions: Ferguson & Takane, Chapter 22: Exercises 1, 2, 4, 5, 9, 10, 11.
- Memorine is a nonexistent drug that is supposed to increase verbal memory. The effect of memorine on verbal memory was investigated in a **fictitious** experiment. Listeners (n=100) were presented with a spoken text of 1000 words, and afterwards they had to repeat the words they could remember from that spoken text. The same participants were observed in a control condition (first column, "control") and after swallowing 100 mg memorine ("test"). The numbers of remembered words per listener are available in plain text format (with comma between observations).

(a) What are H_{0} and H_{a} for this study?

To continue, use α=.05 for all statistical tests in this assignment.

(b) Assume that the numbers of remembered words are normally distributed. Use a pairwise t-test to see whether there is a difference between the two conditions. Draw your conclusions clearly.

(c) As a bonus, and to check on your analysis, use a repeated-measures ANOVA to see whether there is a difference between the two conditions. The t-test and F-test are equivalent, and should produce the same result here, with F = t^{2}. Check.

(d) Inspect whether the dependent variables are indeed distributed normally. Include relevant diagnostics and figures in your report. Discuss the validity of your previous conclusions.

Note: In real life, these checks on assumptions are done *before* statistical testing, and not afterwards.

As explained in class, there are two options if your data are not normal: transformation, or using nonparametric tests.

(e) transformation. Construct (COMPUTE) two new variables, containing the *square root* of the raw observations. Inspect whether the new variables are distributed normally. Use a pairwise t-test on the transformed data. Draw your conclusions clearly.

(f) non-parametric testing. Use a nonparametric, pairwise test to investigate your hypotheses. Again, draw your conclusions clearly.

(g) Discuss the similarities and differences in conclusions at parts (b-c), (e), and (f). Your text should mention key concepts: validity, testing, power, significance, assumption.
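The equivalence F = t^{2} mentioned in part (c) can be verified by hand on a small data set. The sketch below uses invented scores for six listeners, not the memorine data, and computes the pairwise t statistic and the repeated-measures F from their textbook definitions; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical numbers of remembered words for 6 listeners (not course data).
control_scores = [12, 15, 9, 20, 14, 11]
test_scores = [14, 18, 9, 25, 15, 15]
n = len(control_scores)

# Pairwise t-test: mean difference divided by its standard error.
diffs = [t - c for c, t in zip(control_scores, test_scores)]
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))

# Repeated-measures ANOVA with one within-subject factor of two levels.
all_scores = control_scores + test_scores
grand = mean(all_scores)
ss_cond = n * sum((mean(col) - grand) ** 2
                  for col in (control_scores, test_scores))
ss_subj = 2 * sum(((c + t) / 2 - grand) ** 2
                  for c, t in zip(control_scores, test_scores))
ss_total = sum((x - grand) ** 2 for x in all_scores)
ss_error = ss_total - ss_cond - ss_subj
f_stat = (ss_cond / 1) / (ss_error / (n - 1))   # df = 1 and n - 1

print(round(t_stat, 3), round(t_stat ** 2, 3), round(f_stat, 3))
```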

my answers [in Dunglish], with annotations.

logistic regression, GLM, modelling.

- Moore, D.S., & McCabe, G.P. (2003). Introduction to the Practice of Statistics (4th ed.). New York: Freeman. [ISBN 0-7167-9657-0]. Chapter 15 "Logistic Regression", only available here. [book companion website].
- **optional**: Generalized Linear Models, from StatSoft, Inc. Be warned that this is not an easy text! Concentrate on the first part, until "Types of Analyses". The sections on matrix algebra may be skipped. Make notes about your questions and problems with this text.

- GLM in SPSS, from UCLA, Los Angeles, CA.
- Generalized Linear Models (GLZ), by Edward F. Connor, San Francisco State University, CA.
- GLM in SPSS, from Universiteit Gent.
- Logistic Regression, by Michael T. Brannick, University of South Florida, Tampa, FL.

Your answers and solutions to the questions below have to be handed in as described above, by Tuesday 18:00. As always, write clearly, correctly, and concisely.

- Answer the following questions: Moore & McCabe, Chapter 15: Exercises 8, 10, 12, 25.

In order to speed up your work on exercise 15.25 in SPSS, I've put the data on the web, in a plain text data file. The first line contains the names of the variables. Data (N=2900) start on line 2, and are coded as follows: hospital: 0=hosp.A, 1=hosp.B; outcome: 0=died, 1=survived; condition: 0=poor, 1=good.

Variables are separated by commas.

In your logistic regression, the variables `hospital` and `condition` must be treated as categorical variables. For easier interpretation of the results, I prefer to use the zero codes as references or baselines (in SPSS choose `Reference: First`).

SPSS does not provide you with 95% confidence intervals; you need to calculate these by hand. The Wald statistic in the SPSS output is the same as the test statistic for β as defined on p.46 in the reading material.
- For this week, there will be **no review** to write, since we're pressed for time in the final week of this teaching period. The answers provided below may help you in spotting errors in your submission.
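The confidence intervals for exercise 15.25 that have to be calculated by hand follow from a coefficient b and its standard error. The sketch below uses an invented coefficient and standard error, not output from the hospital data; the function name is illustrative. It assumes the usual normal approximation (b ± 1.96 × SE) and assumes that the Wald value is the squared ratio of b to its standard error, so check both against the reading material.

```python
from math import exp

def wald_summary(b, se, z_crit=1.96):
    """Wald chi-square, odds ratio and 95% CIs from a coefficient and its SE."""
    ci_b = (b - z_crit * se, b + z_crit * se)
    return {
        "wald": (b / se) ** 2,                         # assumed: squared z
        "odds_ratio": exp(b),
        "ci_b": ci_b,                                  # interval for the slope
        "ci_odds_ratio": (exp(ci_b[0]), exp(ci_b[1])), # interval for exp(b)
    }

# Hypothetical coefficient and standard error from a logistic regression.
result = wald_summary(b=0.50, se=0.20)
for name, value in result.items():
    print(name, value)
```

If the interval for the odds ratio includes 1 (or, equivalently, the interval for b includes 0), the predictor is not significant at the 5% level.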

As promised, here are my annotated answers for session 8.

As always, the revised paper should be (as much as possible) a running text, not a collection of incomplete sentences and statistical output.

In the revised version you have to accommodate the comments of your reviewer — if you agree of course. Also use the reading materials and hyperlinks provided.

You may discuss the reviewer's comments in the text of your revised version. But perhaps you find it easier to write a coherent (revised) text on your own, plus a second document in which you discuss the reviewer's comments explicitly, stating which comments you have taken into account, which comments you have ignored, and why. Such a separate document is called a "cover letter to the Editor".

**Deadline is Thursday 17th November 2005, 23:55 h**.

- Maxwell, S.E. & Delaney, H.D. (2004) Designing Experiments and Analyzing Data: A model comparison perspective (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates. ISBN 0-8058-3718-3. [very good, but not an easy book].
- Levin, I.P. (1998) Relating Statistics and Experimental Design: An introduction. Thousand Oaks, CA: Sage. Sage University Papers Series on Quantitative Applications in the Social Sciences; 07-125. ISBN 0-7619-1472-2.
- Carver, R.H. & Nash, J.G. (2005) Doing Data Analysis with SPSS version 12.0. Belmont, CA: Brooks/Cole. ISBN 0-534-46551-x.
- Kirkpatrick, L.A. & Feeney, B.C. (2005) A Simple Guide to SPSS for Windows/ for Version 12.0. Belmont, CA: Thomson Wadsworth. ISBN 0-534-61006-4.
- StatSoft, Inc. (2004) Electronic Statistics Textbook. Tulsa, OK: StatSoft. URL: http://www.statsoft.com/textbook/stathome.html [clear and concise chapters about most statistical topics].
- Also check the hyperlinks listed under session 1.
- Also check the webpage of my statistiek course [in Dutch].

© 2003-2006 HQ 2006.02.06