course code 200800180
2009-10, period 1, September-November
 [2009.11.09]
Added a few post-mortem remarks, highly recommended.
 [2009.11.09]
The grades for the first and second parts of the weekly assignments (together counting for 70% of your grade) are now available.
 [2009.10.28]
There are now two options for the final assignment.
 [2009.09.22]
Added link to Java Applets for Power and Sample Size by Russell V. Lenth, University of Iowa, Iowa City, IA. Highly recommended, but you should read the whole web page first, before using the applets!
 [2009.09.01]
We will use a SurfNet user group to exchange information in this course.
The bulletin board or group for this course is at
www.surfgroepen.nl/sites/mer0910blok1.
Most students are already enrolled in this group.
Open your Solis (UU) mail and follow the instructions from SurfGroepen to confirm your membership.
 [2009.08.26]
Note that there are no class meetings on 9 Sept and 14 Oct, and no lab meetings on 25 Sept and 16 Oct, because of conference visits. The lab session of 23 Oct will be used for a class meeting.
Teacher
Hugo Quené
email h dot quene AT uu nl,
Trans 10, room 1.17
office hours Tue 14:00-16:00 and by appointment
Readings

Rietveld, T. & Van Hout, R. (2005).
Statistics in Language Research: Analysis of Variance. Berlin: Mouton de Gruyter. ISBN 3110185814.
Further abbreviated as RvH.

Additional reading materials;
these will be distributed online or through a pigeonhole (postvak) at Trans 10.
Schedule
course 2009-10, period 1 (Sept-Nov) 

Wed 13:15-17:00  class  ADD: 102 
Fri 09:00-10:45  data lab  KNG 80: 113 
Prerequisites
This course requires a basic knowledge of statistics and of data analysis, including hypothesis testing, t tests, and analysis of variance.
The course will have weekly class meetings on Wednesdays.
The focus is on independent study, assignments, and peer review.
In addition, the first weeks there will also be computer lab sessions, in which we will practice and rehearse data analysis techniques.
The course will be taught in English.
Before each class meeting you'll have to do the following:
 complete the assignments on the topics covered in the last meeting;
 hand in your assignments (see below), by Friday 18:00h at the latest;
 review and judge the assignments of a fellow student, by Monday 18:00h at the latest;
 read and study new materials.
During a class meeting we will discuss your work,
using your mutual reviews, and new topics will be introduced.
After each class meeting, assignments have to be handed to the group bulletin board, so all information is available to all.
Put your work in one document per week; this document has to be in PDF format (why PDF?).
Name your document as LASTNAMEassN.pdf (use your last name and assignment number N).
Upload your document to the group bulletin board at
www.surfgroepen.nl/sites/mer0910blok1. Place your document in the Documents section, in the N'th folder: folder "one" for assignment 1, folder "two" for assignment 2, etc.
All this should be done by Friday 18:00h at the latest.
Retrieve the document of your selected peer student for this week, and write a review of her/his work in a separate document.
Name your review document as LASTNAMErevNREVIEWED.pdf (replace with your last name and assignment number N and name of reviewed student).
Place your review on the group bulletin board in the same folder as the assignments, by Monday 18:00h at the latest.
Before the Wednesday class session, you should read the review of your assignment.
Notice that everybody's cooperation is required to make this schedule work! Failure to meet deadlines will cause problems "downstream", so make sure to finish and upload your work on time.
In the first weeks of the course, there will also be a "data lab" on Fridays, to practice and rehearse your skills in data analysis. More details will follow soon.
Peer review, i.e. commenting on the work of a peer or colleague, is serious business.
You can learn more about it through these web pages:

Peer Review, by Laura Guertin (Science Education Research Center, Carleton College, Northfield, MN);

Peer Review (Manoa Writing Program, Univ of Hawai'i, Honolulu, HI);

You Lost Me In The Third Paragraph, about "gracious criticism" (Writing Center, George Mason Univ, Fairfax, VA);

Responding To Other People's Writing (Writing Center, Univ North Carolina, Chapel Hill, NC).
Your final grade is determined by the weekly assignments (35%+35%) and the final assignment (30%).
Your collected works and class participation in the first part of the course will be graded halfway through the course (weight 35%), and similarly for the second part of the course (also weight 35%). This means that your weekly assignments and reviews will not be graded weekly! You should therefore assess your weekly progress with the help of other students' assignments, peer reviews, and class discussions.
The final assignment determines 30% of your final grade. Due to the limited time in period 1, the final grades will only be available after the end of the course.
Wed 9 Sept
no class (conference visit)
lab 1: Fri 11 Sept
Introduction. Practicalities. Working with SPSS. Working with R. Descriptive statistics. Inferential statistics: t tests and ANOVA.
In this course we will introduce and support two programs for data analysis.
SPSS can be used in the computer labs, and it can be obtained for a low fee under the UU campus license, from the surfspot web store.
R is a more recent program, more flexible than SPSS. R is quickly gaining in popularity, and becoming the standard in academic research. It can be obtained as open-source software from
www.r-project.org;
for an introduction see my
tutorial.
We will use this toy data set (created by this
R script).
session 1: Wed 16 Sept
Experimentation. General methodology. Design of experiment. How to peer-review.
Reading:
 RvH: Chapters 1 and 2 (and 3).

H. Quené (2009). How to design and analyze language acquisition studies.
To appear in: S. Unsworth & E. Blom (Eds.) Experimental Methods in Language Acquisition Research. Amsterdam: Benjamins.
[PDF].
Before:

This course requires and presumes that you already have previous knowledge of statistics, equivalent to an introductory course in statistics. You may test yourself by means of this exam (tentamen) of the Statistiek course.

Make sure that you have an account on the Solis UU network.
 Browse the various websites listed below.
Make sure to browse the
Research Methods Knowledge Base.
Assignments:
Your elaborations on the questions below have to be placed on your personal webpage, and announced on the group webpage, as described
above. Write clearly, correctly, and concisely.
Make a document
in PDF format with a maximum length of about 2000 words. (I've made a short explanation about how to make a
non-proprietary document, in Dutch.)

Visit the University library — you could even do this physically. The location at Drift 27 is convenient and holds excellent collections.
Take a recent printed issue (2008 or 2009) of an experimental linguistics journal (in phonetics, psycholinguistics, speech pathology, etc.), such as Journal of Phonetics, Journal of Memory and Language, Phonetica, JLSHR, and select an article that reports an experimental study.
(a) Which questions does the study attempt to answer?
(b) Which independent and dependent variables are involved in the study?
(c) Describe the design of the experiment.

A researcher wants to know whether the vowel duration in stressed vowels is longer than in unstressed vowels. There are two groups of participants, and the researcher is interested in their difference (e.g. L1 and L2 speakers). The target vowels occur in the first vs. the third syllable of three-syllable words. To prevent strategic behavior (what's that?), a speaker may not produce words with different stress patterns: all words produced by a single speaker need to have the same stress pattern.
Provide a possible design for this experiment. Indicate which factors are between or within subjects, dependent or independent, etc. Make a graph or table to illustrate your design.

Answer the following questions in RvH section 2.9: Exercises 2, 3, 4, 5, 6, 8.
 In preparation for next week, also read RvH Chapter 3.

This last assignment is not for peer review but for independent study. Now is the perfect time to brush up your statistical skills. Work through the exams (tentamina) of my Statistics course (see above). Afterwards, check your answers against those provided on the course webpage. Determine which parts of your statistics proficiency are still deficient. Design a plan of action to remedy your shortcomings during this teaching period.
Links:

Research Methods Knowledge Base by William M. Trochim, Cornell University, Ithaca, NY (Web Center for Social Research Methods)

Statistics Every Writer Should Know by Robert Niles, journalist at the Los Angeles Times.

Rice Virtual Lab in Statistics by David Lane, Rice University, Houston, TX. This website also contains HyperStat Online, an online introduction to statistics.

WISE Project, Web Interface for Statistics Education, at Claremont Graduate University, Claremont, CA.

StatPages.Net by John C. Pezzullo, Georgetown University, Washington DC. A treasure trove of helpful links and programs.

Web-based tools for statistical computation, by Richard Lowry, Vassar College, Poughkeepsie, NY.

GraphPad QuickCalcs, easy online statistical calculators.


To give you some idea about how to make the assignment, there is
an example available. Note that results
of statistical analysis should be described, and not copied and pasted into the document.

Highly recommended: Java Applets for Power and Sample Size by Russell V. Lenth, University of Iowa, Iowa City, IA.
Read the whole web page first, before using the applets!
lab 2: Fri 18 Sept
Topics TBA.
session 2: Wed 23 Sept
Experimental design.
Readings:
Additional readings:
Assignments:
For this assignment you have to provide the experimental design of a prospective (future) study of your own. You could, for example, select an idea for your master's thesis, or a research project for one of your classes, or a follow-up study building on a previous experiment. Your prospective study should in principle be suitable for publication in a top peer-reviewed journal in your field; this means that not only the question being addressed, but the design and methodology need to be very good too! Your experimental design and methods should be adequate to provide answers to your question.
Give a brief introduction about the issues your study attempts to answer, and describe and motivate the experimental design and methods. Which are the dependent and independent variables? Discuss the construct validity of your manipulations (treatments) and observations. Describe and classify your design according to the schemes in the reading materials (within-subject, split-plot, etc.). Can you give some estimate of the expected effect size? And if so, what would be the power of your study? How many units (children, participants, sentences, items) do you need to achieve that power? Think about plausible alternative explanations, and other threats to the validity of your study, and how to neutralize these threats in your design.
As before, your elaborations have to result in a PDF document to be placed (or announced) on the group webpage (see above). Write clearly, correctly, and concisely (you'll probably need about 2 or 3 pages of text).
Fri 25 Sept
No lab session (conference visit).
session 3: Wed 30 Sept
ANOVA: general principles, one-way, post-hoc tests, power.
Readings:
RvH: Chapters 3 and 4.
Assignments:
Your answers and solutions to the questions below have to be handed in
as described
above.
As always, write clearly, correctly, and concisely.
 Answer the following questions in RvH, section 3.10: Exercises 2, 5.
 Answer the following questions in RvH, section 4.11: Exercises 1, 2, 3, 4, 7, 8.
lab 3: Fri 02 Oct

Perform one-way ANOVA of the mini-dataset given in RvH Table 4.1 part B.
First enter the data in an SPSS worksheet, in a data format with N=9 rows and separate columns for each variable.

Perform one-way ANOVA of the datasets warpbreaks (number of breaks, wool type, tension condition) and Pitt_Shoaf1.txt (participant ID, condition, reaction time) that are provided in the Surfnet group (under Shared Documents > Extra).

Explore one-way ANOVA by means of this Java
applet.
For example, what happens if you add outliers or change variances?
Adjusting t or adjusting df?
If two variables have unequal variances, then the t test statistic may become inflated. The computed t value is larger than it should be. Consequently H0 may be rejected while in fact it should not be rejected. This is known as a Type I error. To prevent this error, we should decrease the t test statistic by some amount. However, in practice it is easier to decrease not the t value itself, but its associated degrees of freedom.
In this way we pretend that the t value is based on fewer observations than it was. Thus we are more conservative while testing our hypotheses.
The figure below shows the critical values of t (on the vertical axis) for a range of df (on horizontal axis).
As you can see, decreasing the value of the t statistic with unchanged df (down arrow) yields a similar effect as decreasing the df with unchanged t (left arrow). Both adjustments would result here in a non-significant outcome, and H0 would not be rejected.
Because it's easier to compute the adjustment in df (length of left arrow) than the adjustment in t (length of down arrow), we commonly adjust the degrees of freedom, and not the t value, if we need to be more conservative.
We will encounter the same reasoning with F values used in ANOVA; those adjustments are known as the Huynh-Feldt and Greenhouse-Geisser corrections to the degrees of freedom.
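As a minimal sketch of the df adjustment described above, the Python snippet below computes reduced degrees of freedom for a two-sample t test with unequal variances. The Welch-Satterthwaite formula is not named in the text above; it is assumed here as one common way to make this adjustment. The function name and the sample variances and sizes are made up for illustration.

```python
def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite approximation to the df of a two-sample t test."""
    a = var1 / n1  # squared standard error contributed by sample 1
    b = var2 / n2  # squared standard error contributed by sample 2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Made-up example: two samples of 10 observations each.
n1, n2 = 10, 10
pooled_df = n1 + n2 - 2                    # df if equal variances are assumed
equal_df = welch_df(4.0, n1, 4.0, n2)      # equal variances: no reduction
unequal_df = welch_df(1.0, n1, 25.0, n2)   # unequal variances: fewer df

print(pooled_df, round(equal_df, 2), round(unequal_df, 2))
```

With equal variances and equal group sizes the adjusted df equals the usual n1+n2-2; the more the variances differ, the further the df drops, which makes the test more conservative.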
session 4: Wed 07 Oct
ANOVA: multifactorial designs, interaction, fixed vs random factors, error terms.
Readings:
RvH: Chapters 5 and 6.
Assignments:
Your answers and solutions to the questions below have to be handed in
as described
above.
As always, write clearly, correctly, and concisely.
 Answer the following questions in RvH, section 5.13: Exercises 1, 3, 5, 6.
 Answer the following questions in RvH, section 6.9: Exercises 3, 4, 5.
 In a study of cardiovascular risk factors, joggers who run at least 15 miles per week were compared with a control group described as "generally sedentary". Both men and women participated in this study. The design is a 2×2 between-subjects ANOVA, with Group and Sex as factors. There were 200 participants for each combination of factors. One of the dependent variables is the rate of heartbeat of a participant, after 6 minutes on a treadmill, expressed in beats per minute.
Data from this study are available here in SPSS format, or as plain text (the latter file contains variable names in the first line).
(a) What do you think of the construct validity? Please comment.
(b) Is it allowed to conduct an analysis of variance on these data? Motivate your answer with relevant statistical considerations.
(c) Conduct a two-way ANOVA on these data.
(d) Write a summary of the results of this study, including the (partial) effect size η and η^{2}.
Draw your conclusions clearly.
(e) From each cell (combination of factors), draw a random sample of n=20 individuals, out of the 200 in that cell. Explain how you have performed the random sampling. Repeat the two-way ANOVA on this smaller data set.
(f) Discuss the similarities and differences in results between the analysis of the full data set in (c) and (d) and the analysis of the smaller data set in (e).
This exercise is adapted from:
Moore, D.S., & McCabe, G.P. (2003). Introduction to the Practice of Statistics (4th ed.). New York: Freeman. Example 13.8, pp. 813-816.
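The random sampling asked for in part (e) of the exercise above could be done along the following lines. This is only a sketch in Python: the record layout, the field names (id, group, sex) and the seed are hypothetical stand-ins, not the contents of the actual data file.

```python
import random
from collections import defaultdict

def sample_per_cell(records, n, seed):
    """Draw n records at random, without replacement, from every Group x Sex cell."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    cells = defaultdict(list)
    for rec in records:
        cells[(rec["group"], rec["sex"])].append(rec)
    sample = []
    for key in sorted(cells):
        sample.extend(rng.sample(cells[key], n))
    return sample

# Hypothetical stand-in for the real data: 200 participant IDs per cell.
records = [
    {"id": f"{g}-{s}-{i}", "group": g, "sex": s}
    for g in ("jogger", "control")
    for s in ("female", "male")
    for i in range(200)
]

subset = sample_per_cell(records, n=20, seed=2009)
print(len(subset))
```

Reporting the seed along with the procedure lets a reviewer reproduce exactly the same subsample.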
lab 4: Fri 09 Oct, Janskerkhof 13, room K.06
Introduction to R.
This lab session will be used to introduce R, an open-source package for statistical analysis, available from www.r-project.org.
We will use this
tutorial to explore the program.
Note that this lab session will be in the Phonetics lab, Janskerkhof 13, room K.06! We will use the lab's Linux system to login and to use R.
Wed 14 Oct
There will be no class this week, because the teacher is attending a conference abroad.
This gives you a chance to catch up on reading materials, etc.
There is a refresher assignment (see below) but there will be no peer review of this assignment!
Assignment:
Here is one refresher assignment, outside the normal peer-review process.
Your answers and solutions to the question below do NOT have to be handed in.
As always, write clearly, correctly, and concisely.
 In a fictitious study, the effect of a growing potion was investigated. The growing potion was administered in 5 different dosages (of 1, 3, 5, 10, and 20 units per day), to 10 men and to 10 women for each dosage, during 15 days.
The dependent variable is the increase in body length of a participant, after 15 days, in cm.
Data from this study are available here in SPSS format,
or as plain text
(the latter file contains variable names in the first line).
(a) Import these data into SPSS or a statistical package of your choice. Make a graph of the increase in body length, for each of the 10 conditions. (Hint: In SPSS use a "clustered boxplot".) Discuss what the graph shows.
(b) Conduct a two-way ANOVA on these data, with Sex and Dosage as two "fixed" factors. Include measures of effect size and of power in your report.
(c) What is the range of generalisation over dosages, in the ANOVA in (b)? Discuss the external validity of the dosage factor.
(d) Conduct a two-way ANOVA, but now with Dosage as "random" factor. (Hint: SPSS does not handle "mixed" models like this one very well. It's probably easiest to calculate the F-ratios by hand, using the ANOVA results obtained under (b) above.)
(e) What is now the range of generalisation over dosages, in the ANOVA in (d)? Again discuss the external validity of the dosage factor.
(f) Discuss the similarities and differences in results between the two ANOVAs in this assignment. Does the growing potion have a different effect on men and women?
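The hand calculation suggested in the hint for (d) can be sketched in Python as follows. In a two-way design with one fixed and one random factor, the fixed factor is tested against the interaction mean square instead of the within-cell mean square. For the other two tests the sketch assumes the so-called restricted mixed model; this is an assumption, so check which convention your reading materials use. The mean squares are made up and are not results for the growing-potion data.

```python
def mixed_f_ratios(ms_fixed, ms_random, ms_interaction, ms_within):
    """F ratios for a two-way mixed model (restricted model assumed)."""
    return {
        # fixed factor (e.g. Sex): tested against the interaction
        "fixed": ms_fixed / ms_interaction,
        # random factor (e.g. Dosage): tested against within-cell error
        "random": ms_random / ms_within,
        # interaction: tested against within-cell error
        "interaction": ms_interaction / ms_within,
    }

# Made-up mean squares, as they might be read from a fixed-effects ANOVA table.
ratios = mixed_f_ratios(ms_fixed=120.0, ms_random=80.0,
                        ms_interaction=8.0, ms_within=4.0)
print(ratios)
```

Note that the degrees of freedom of the denominator change along with the error term, so the critical F value for the fixed factor changes too.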
Fri 16 Oct
no lab session (conference visit)
session 5: Wed 21 Oct
ANOVA: Repeated Measures, post-hoc tests.
Readings:
 RvH: Chapter 8
 my manual about ANOVA in SPSS, covering the same ground as RvH Chapter 8.
Links:
Compare these notes from similar courses in experimental research methods, at other universities:
Assignments:
Your answers and solutions to the questions below have to be handed in
as described
above.
As always, write clearly, correctly, and concisely.
 Answer the following questions in RvH, section 8.15: Exercises 1, 2, 3, 4, 5.
If we are comparing two groups of means, as in a pairwise t test, then the effect size d is defined as: d = (m_{1} - m_{2})/s (Cohen, 1969, p.18; m represents mean).
A value of d=.2 is regarded as small, d=.5 as medium, d=.8 as large. It is left to the researcher to classify intermediate values (ibid., pp. 23-25).
The difference in body length between girls of 15 and 16 years old has a small effect size, just as male-female differences in subtests of an IQ test. "A medium effect size is conceived as one large enough to be visible to the naked eye," e.g. the difference in body length between girls of ages 14 and 18. Large effect sizes are "grossly perceptible", e.g. the difference in body length between girls of ages 13 and 18, or the difference in IQ between PhD graduates and freshman students.
If we are comparing k groups of means, as in an F test (ANOVA), then the effect size f is defined as: f = s_{m}/s, where s_{m} in turn is defined as the standard deviation of the k different group means (ibid., p.268). If k=2, then d=2f (ibid., p.278). These rules apply only if all groups are of the same size; otherwise different criteria apply.
A value of f=.10 is regarded as small, f=.25 as medium, f=.40 as large.
Again, it is left to the researcher to classify intermediate values (ibid., pp. 278-281).
Small-sized effects can also be meaningful or interesting. Large differences may correspond to small effect sizes, due to measurement error, disruptive side effects, etc. Medium effect sizes are observed in IQ differences between house painters, mechanics, carpenters, butchers. Large effect sizes are observed in IQ differences between house painters, mechanics, carpenters, (railroad) engine drivers, and lab technicians.
Adapted from: Cohen, J. (1969). Statistical Power Analysis for the Behavioral Sciences (1st ed.). New York: Academic Press.
Additional reading: Rosenthal, R., R. L. Rosnow, & Rubin, D.B. (2000). Contrasts and Effect Sizes in Behavioral Research: A correlational approach. Cambridge: Cambridge University Press. ISBN 0521659809.
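The definitions of d and f given above can be tried out with a few lines of Python. This is a sketch with made-up numbers (two group means of 105 and 100, and a common standard deviation of 10); the function names are hypothetical. The standard deviation of the group means is computed with denominator k, which is what makes d = 2f hold for k=2.

```python
from statistics import pstdev

def cohen_d(m1, m2, s):
    """Effect size d for two group means with common standard deviation s."""
    return (m1 - m2) / s

def cohen_f(group_means, s):
    """Effect size f: SD of the k group means (denominator k) divided by s."""
    return pstdev(group_means) / s

# Made-up example with two groups of equal size.
d = cohen_d(105.0, 100.0, 10.0)
f = cohen_f([105.0, 100.0], 10.0)

print(d, f)
```

Here d = .5 and f = .25, which are both "medium" by the rules of thumb given above.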
session 6: Fri 23 Oct
location: TBA
linear regression, error of measurement, reliability.
Reading:
 Ferguson, G. A., & Takane, Y. (1989). Statistical Analysis in Psychology and Education (6th ed.). New York: McGraw-Hill. Chapter 24 "Errors of Measurement", pp. 466-478.
 Trochim, W.M. (2002). Measurement. In: Research Methods Knowledge Base
(Web Center for Social Research Methods).
Links:
Let us assume that we have 2 observations for each of 5 persons. These observations are about the perceived body weight, as judged by two 'raters' or judges, x1 and x2.
The data are as follows:
person x1 x2
1 60 62
2 70 68
3 70 71
4 65 65
5 65 63
Because we have only two measures (variables), there is only one pair of measures to compare in this example. Very often, however, there are more than two judges involved, and hence many more pairs.
First, let us calculate the correlation between these two variables x1 and x2. This can be done in SPSS with the Correlations command (Analyze > Correlate > Bivariate, check Pearson correlation coefficient).
This yields r=.904, and the average r (over 1 pair of judges) is the same.
If you need to compute r manually, one method is to first convert x1 and x2 to Z-values [(x - mean)/s], yielding z1 and z2. Then r = SUM(z1×z2) / (n - 1).
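The manual computation of r via Z-values can be checked with a short Python sketch, using the five pairs of observations from the table above. The helper name z_scores is made up for this example.

```python
from statistics import mean, stdev

x1 = [60, 70, 70, 65, 65]
x2 = [62, 68, 71, 65, 63]

def z_scores(values):
    """Convert raw scores to Z-values, using the sample SD (denominator n-1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

z1, z2 = z_scores(x1), z_scores(x2)
# r is the sum of the products of paired Z-values, divided by n-1.
r = sum(a * b for a, b in zip(z1, z2)) / (len(x1) - 1)

print(round(r, 3))
```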
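This value of r corresponds to a standardized Cronbach's Alpha of (2×.904)/(1+.904) = .950 (with N=2 judges).
Cronbach's Alpha can be obtained in SPSS by choosing Analyze > Scale > Reliability Analysis. Select the "items" (or judges) x1 and x2, and select model Alpha.
The output states: Reliability Coefficients [over] 2 items, Alpha = .9459 [etc.] This value is slightly lower because SPSS reports the unstandardized Alpha here, which is computed from the variances and covariances of the raw scores and not from the correlation.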
If the same average correlation r=.904 had been observed over 4 judges (i.e. over 4×3/2 = 6 pairs of judges), then that would have indicated an even higher inter-rater reliability, viz. alpha = (4×.904)/(1+3×.904) = .974.
Exactly the same reasoning applies if the data are not provided by 2 raters judging the same 5 objects, but by 2 test items "judging" a property of the same 5 persons. Both approaches are common in language research. Although SPSS only mentions items, and inter-item reliability, the analysis is equally applicable to raters or judges, and inter-rater reliability.
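The relation between the average correlation, the number of judges and Cronbach's Alpha can be explored with the Python sketch below. It contains two hypothetical helper functions: one computes Alpha from the number of judges k and their average correlation, as in the formula used above; the other computes Alpha directly from the raw scores (from the item variances and the variance of the sum scores), which reproduces the value of .9459 in the SPSS output.

```python
from statistics import variance

def alpha_from_mean_r(k, mean_r):
    """Alpha from k judges (items) and their average mutual correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)

def alpha_raw(*items):
    """Alpha from raw scores: k/(k-1) * (1 - sum of item variances / variance of sums)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))

x1 = [60, 70, 70, 65, 65]
x2 = [62, 68, 71, 65, 63]

alpha_2 = alpha_from_mean_r(2, 0.904)   # two judges
alpha_4 = alpha_from_mean_r(4, 0.904)   # same average r, four judges
alpha_spss = alpha_raw(x1, x2)          # from the raw scores

print(round(alpha_2, 4), round(alpha_4, 3), round(alpha_spss, 4))
```

The two versions of Alpha differ slightly whenever the judges have unequal variances, as is the case here.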
Note that both judges (items) may be inaccurate. A priori, we do not know how good each judge is, nor which judge is better. We know, however, that their reliability of judging the same thing (true body weight, we hope) increases with their mutual correlation.
Now, let's regard the same data, but in a different context. We have one measuring instrument of the abstract concept x that we try to measure. The same 5 objects are measured twice (test-retest), yielding the data given above. In this test-retest context, there is always just one correlation, and the idea of inter-rater reliability does not apply. We find that r_{xx}=.904.
This reliability coefficient is defined as r_{xx} = s^{2}_{T} / s^{2}_{x}. It provides us with an estimate of how much of the total variance is due to variance in the underlying, unknown, "true" scores. In this example, 90.4% of the total variance is estimated to be due to variance of the true scores. The complementary part, 9.6% of the total variance, is estimated to be due to measurement error. If there were no measurement error, then we would predict perfect correlation (r=1); if the measurements contained only error (and no true score component at all), then we would predict zero correlation (r=0) between x1 and x2.
In this example, we find that
s_{e} = s_{x} × sqrt(1 - .904) = sqrt(15.484) × sqrt(.096) = 1.219
check: s^{2}_{x} = 15.484 = s^{2}_{T} + s^{2}_{e} =
s^{2}_{T} + (1.219)^{2},
so s^{2}_{T} = 15.484 - 1.486 = 13.997 (computed from unrounded values)
and indeed r = .904 = s^{2}_{T} / s^{2}_{x} = 13.997 / 15.484.
Supposedly, x1 and x2 measure the same property x. To obtain s^{2}_{x}, the total observed variance of x (as needed above), we cannot use x1 exclusively nor x2 exclusively. The total variance is obtained here from the two standard deviations:
s^{2}_{x} = s_{x1} × s_{x2}
s^{2}_{x} = 4.18330 × 3.70135 = 15.484
In general, a reliability coefficient smaller than .5 is regarded as low, between .5 and .8 as moderate, and over .8 as high.
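The computation of the error of measurement in the test-retest example above can be retraced in Python. This sketch follows the steps in the text: the total variance is taken as the product of the two standard deviations, and the reliability is taken as the rounded value r_{xx}=.904. The variable names are made up.

```python
from math import sqrt
from statistics import stdev

x1 = [60, 70, 70, 65, 65]
x2 = [62, 68, 71, 65, 63]
r_xx = 0.904                          # reliability coefficient from the text

var_x = stdev(x1) * stdev(x2)         # total variance, as s_x1 * s_x2
s_e = sqrt(var_x) * sqrt(1 - r_xx)    # standard error of measurement
var_true = var_x - s_e ** 2           # estimated true-score variance

print(round(var_x, 3), round(s_e, 3), round(var_true, 3))
```

Dividing var_true by var_x returns the reliability coefficient again, which is the check carried out in the text.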
session 6 (continued)
Assignments:
Your answers and solutions to the questions below have to be handed in
as described
above.
As always, write clearly, correctly, and concisely.
Deadline for handing in these assignments is Monday 26 Oct 18:00h; deadline for handing in your peer review is Tuesday 27 Oct 18:00.

Answer the following questions:
Ferguson & Takane, Chapter 24: Exercises 1, 2.

We have constructed a test consisting of 4 items, with an average inter-item correlation of 0.4.
a. How many inter-item correlations are there, between 4 items? (Ignore the trivial correlation of an item with itself.)
b. Compute the Cronbach Alpha reliability coefficient of this test of 4 items.
Now we add a new 5th item.
c. How many new inter-item correlations are added to the correlation matrix when a 5th item is added to the test?
Unfortunately the coding of this item happens to be incorrect, that is, the scale was reversed for this new item. The inter-item correlation of this 5th item with each of the 4 older items is -0.4 (note the negative sign).
d. What is the average inter-item correlation after adding this 5th test item?
e. Compute the Cronbach Alpha coefficient of the longer test of 5 items.
f. Compare and discuss the reliability and usefulness of the shorter and of the longer test.

A student weighs an object 6 times. The object is known to weigh 10 kg. She obtains readings on the scale of 9, 12, 5, 12, 10, and 12 kg. Describe the systematic error and the random errors characterizing the scale's performance.
Adapted from: R.L. Rosnow & R. Rosenthal (2002). Beginning Behavioral Research: A conceptual primer (4th ed.). Upper Saddle River, NJ: Prentice Hall. Ch.6, Q.7, p.159.

Let us assume that in this course, in addition to writing a peer review, you would also have to grade each other's work as part of the peer review process. Grades would have to be on the Dutch scale from 1 (bad) to 10 (good).
Discuss the reliability and validity of this method to assess student performance. What are the possible threats to reliability and validity, and how could these be reduced?
session 7: Wed 28 Oct
Multiple regression, multivariate analyses.
Readings:
 Moore, D.S., & McCabe, G.P. (2003). Introduction to the Practice of Statistics (4th ed.). New York: Freeman. [ISBN 0716796570]. Chapter 11 "Multiple Regression", pp. 708-745.
[book companion website].
 optional: Devore, Jay & Peck, Roxy (2001). Statistics: The Exploration and Analysis of Data (4th ed.). Pacific Grove, CA: Duxbury. [ISBN 0534358675.] Chapter 14 "Multiple Regression Analysis", pp. 553-610.
 optional: chapter on Multiple Regression,
from the excellent online statistics textbook at StatSoft, Inc.
Assignments:
Your answers and solutions to the questions below have to be handed in
as described
above.
As always, write clearly, correctly, and concisely.
File your work in folder
"seven" of the group webpage.
 Answer the following questions:
Moore & McCabe, Chapter 11: Exercises 2, 3, 16, 33.
Data for the last two questions are available here in plain text format (the first line of this file contains variable names).
Forward or Backward?
For questions 16 and 33 the FORWARD method is most appropriate.
This means that you start with an empty model (only intercept b_{0})
to which predictors are added step by step. After each addition of a predictor,
you check whether the model performs significantly better than before
(e.g. by checking whether R^{2} increases).
The questions are about the increment in R^{2} by adding a predictor.
The relevant information is easier to find in the SPSS output if you specify
the FORWARD method.
As a bonus, you could check what happens if you exclude case #51 from the
data set, e.g. by marking it as a missing value. This is quite easy if you
keep the regression command in a Syntax window for repeated use.
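The idea of adding predictors step by step and checking the increment in R^{2} can be sketched in Python for the case of two candidate predictors. The sketch uses the standard formula for R^{2} with two predictors, expressed in the three pairwise correlations. All data and the predictor names x_a and x_b are made up; they are not the Moore & McCabe data. Unlike the FORWARD method in SPSS, the sketch does not test whether the increment is significant.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r2_one(y, x):
    """R-squared of a model with a single predictor."""
    return pearson(y, x) ** 2

def r2_two(y, x1, x2):
    """R-squared of a model with two predictors, from pairwise correlations."""
    ry1, ry2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    return (ry1 ** 2 + ry2 ** 2 - 2 * ry1 * ry2 * r12) / (1 - r12 ** 2)

# Made-up toy data: one outcome and two candidate predictors.
y = [2.0, 2.5, 3.1, 3.0, 3.8, 3.6]
candidates = {"x_a": [1, 2, 3, 4, 5, 6], "x_b": [3, 1, 4, 2, 6, 5]}

# Step 1: enter the predictor that explains most variance on its own.
first = max(candidates, key=lambda name: r2_one(y, candidates[name]))
step1 = r2_one(y, candidates[first])
# Step 2: add the other predictor and inspect the increment in R-squared.
step2 = r2_two(y, candidates["x_a"], candidates["x_b"])

print(first, round(step1, 3), round(step2 - step1, 3))
```

The increment printed last corresponds to the "R Square Change" that the questions ask about.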
HSS, SAT, GPA??
The chapter by Moore & McCabe draws heavily on typically American concepts. In the USA, your achievements are all that counts, in life as well as in study. The US grading system ranges from A+ (excellent) to F (fail).
For admission to a university, two things are taken into account:
(a) your average grades in the final years of high school (HSM, HSS, HSE), and
(b) your score in a national admissions exam, like the Dutch CITO test (Scholastic Aptitude Test, SAT).
Top-class universities, like Harvard, Yale, Stanford, etc., use both parameters in selection. You have to be the best in your class (but your classmates are strongly competing for this honor), plus you need a minimal score on your SAT.
During your academic study, all your grades and results contribute to your Grade Point Average (GPA), a weighted average grade. This GPA is generally used as an indication of academic achievement and success. The authors attempt to predict the GPA from the previously obtained indicators (a) and (b).
regression
Why is it "regression"? This has to do with heredity, the field of biology where regression was first developed by Francis Galton (cousin of Charles Darwin) in the late 19th century.
Take a sample of fathers, and note their body length (X). Wait for one full generation, and measure the body length of each father's oldest adult son (Y). Make a scattergram of X and Y. The best-fitting line through the observations has a slope of less than 1 (typically about .65). This is because the sons' length Y tends to "regress to the mean": outlier fathers tend to produce average sons, and average fathers also tend to produce average sons. Galton called this phenomenon "regression towards mediocrity". Thus the best-fitting line is a "regression" line because it shows the degree of regression to the mean, from one generation to the next. (Note that any slope larger than 0 suggests a hereditary component in the sons' body length, Y.)
Questions: Which variable has the larger variance, X or Y? Does the variation in body length increase or decrease (regress) over generations? Why?
partial correlation
The partial correlation between X_{1} and X_{2}, with X_{3} removed from both, is given by:
r_{12.3} = ( r_{12} - r_{13} r_{23} ) / sqrt[ (1 - r^{2}_{13})(1 - r^{2}_{23}) ]
 Ferguson, G. A., & Takane, Y. (1989). Statistical Analysis in Psychology and Education (6th ed.). New York: McGraw-Hill. p. 495.
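The formula for the partial correlation given above translates directly into a small Python function. The function name and the three correlations in the example are made up for illustration.

```python
from math import sqrt

def partial_corr(r12, r13, r23):
    """Partial correlation between X1 and X2, with X3 removed from both."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# If X3 is unrelated to both X1 and X2, nothing is removed.
unchanged = partial_corr(0.5, 0.0, 0.0)
# If X3 correlates strongly with both, little of r12 remains.
reduced = partial_corr(0.6, 0.7, 0.8)

print(round(unchanged, 3), round(reduced, 3))
```

In the second example a correlation of .6 between X1 and X2 shrinks to about .09 once the common influence of X3 has been removed.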
Fri 30 Oct
no lab session; prepare final assignment.
Assignment 5
The shorthand notation F1xF2 is misleading.
This notation means that the treatment effect is regarded as significant if it is significant in an ANOVA by subjects (F1) and in an ANOVA by items (F2). It does not mean that the values of F1 and F2 should be multiplied! The shorthand notation F1 & F2 would be more appropriate.
Assignment 7
A regression model can also be evaluated by inspecting its residuals, i.e. the observed score minus the predicted score. If the model is approximately correct, then the residuals should be normally distributed around zero, for the whole range of observations (because of the assumption that errors are independent, and distributed normally, with mean zero). This can be inspected by specifying residual plots in the SPSS Regression menu:
REGRESSION /DEPENDENT GPA /METHOD=FORWARD IQ SC
/SCATTERPLOT=(*ZRESID ,GPA)
/RESIDUALS NORM(ZRESID).
The first subcommand produces the following scattergram, which shows that the residuals tend to be more negative for lower GPAs (i.e. predictions are too high for lower GPAs). The worst-performing students obtain GPAs that are lower than predicted from their IQ and SC. Can you imagine why?
The second subcommand produces a Q-Q normal plot as we've seen before. If normally distributed, then the residuals should be on a straight line:
This plot confirms that the residuals are indeed not quite normally distributed.
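What these residual plots contain can be illustrated outside SPSS. The sketch below, in Python, makes several simplifying assumptions: the data are invented, only one predictor (IQ) is used instead of IQ and SC, the residuals are standardized by simply dividing by their sample standard deviation (which need not be identical to SPSS's *ZRESID), and the plotting positions (i - 0.5)/n are one common convention for a Q-Q plot.

```python
import statistics
from statistics import NormalDist


def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def standardized_residuals(x, y):
    """Residuals (observed minus predicted), divided by their sample SD."""
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sd = statistics.stdev(residuals)
    return [r / sd for r in residuals]


def qq_points(z):
    """Pairs (theoretical normal quantile, observed value) for a Q-Q plot."""
    n = len(z)
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(z)))


# Hypothetical data, invented for this sketch
iq = [90, 95, 100, 105, 110, 115, 120, 125]
gpa = [4.1, 5.9, 6.2, 6.8, 7.1, 7.3, 7.9, 8.2]

z = standardized_residuals(iq, gpa)
for g, r in zip(gpa, z):
    print(f"GPA {g:.1f}  standardized residual {r:+.2f}")
for theoretical, observed in qq_points(z):
    print(f"expected {theoretical:+.2f}  observed {observed:+.2f}")
```

If the residuals were normally distributed, the expected and observed values in the second list would be roughly equal, which corresponds to points on a straight line in the Q-Q plot.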
final assignment
For your final assignment you can choose either one of the two assignments described below.
Deadline is Thursday 5th Nov 2009, 23:59 h.
option one
This final assignment is to submit a revised or improved version of one previous assignment of this course. You're free to choose which one you want to revise.
As always, the revised paper should be (as much as possible) a running text, not a collection of incomplete sentences and statistical output.
In the revised version you have to accommodate the comments of your reviewer (if you agree with them, of course). Also use the reading materials and hyperlinks provided.
You may discuss the reviewer's comments in the text of your revised version. But perhaps you find it easier to write a coherent (revised) text on your own, plus a second document with revision notes, in which you discuss the reviewer's comments explicitly, stating which comments you have taken into account, which comments you have ignored, and why.
option two
There are considerable similarities between analysis of variance (ANOVA) and multiple regression (MR), especially in designs without repeated measurements. You can read more about these similarities in the sources given below.
Your assignment is to analyze a given dataset with both methods, and to discuss the differences and similarities between the two methods. The ANOVA must use a single independent variable (factor) named opleiding (type of study: 1=alfa, 2=beta, 3=gamma). The MR must use the so-called dummy factors named isalfa, isbeta, isgamma (0=false, 1=true, for each dummy factor), or a subset of these dummies. (Note that the given dataset already contains the categorical factor as well as the associated dummy factors.) Each row or unit represents a single participant of a fictional survey about students' work load.
The dependent variable studietijd represents the time (in hours/week) a student spends on study-related activities.
In your analyses, do not forget to inspect all relevant relationships between the factor(s) and the DV, to test whether assumptions are met, and to inspect residuals of all models.
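The correspondence between the two methods can be previewed on a toy example. The sketch below, in Python, uses invented data (nine fictional students, not the course dataset) and takes alfa as the reference category, which is just one possible choice; with an intercept in the model, only two of the three dummies can be entered, because the three dummies always sum to 1. The helper functions solve, ols, anova_f and regression_f are written for this sketch only.

```python
import statistics


def solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients; each row already includes the intercept column."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def anova_f(groups, y):
    """F ratio of a one-way ANOVA (between-groups MS / within-groups MS)."""
    grand = statistics.fmean(y)
    cells = {g: [v for gg, v in zip(groups, y) if gg == g] for g in set(groups)}
    ss_between = sum(len(v) * (statistics.fmean(v) - grand) ** 2 for v in cells.values())
    ss_within = sum((x - statistics.fmean(v)) ** 2 for v in cells.values() for x in v)
    return (ss_between / (len(cells) - 1)) / (ss_within / (len(y) - len(cells)))


def regression_f(rows, y, coefs):
    """F ratio of the regression model (model MS / residual MS)."""
    grand = statistics.fmean(y)
    fitted = [sum(b * x for b, x in zip(coefs, r)) for r in rows]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - grand) ** 2 for yi in y)
    return ((ss_tot - ss_res) / (len(coefs) - 1)) / (ss_res / (len(y) - len(coefs)))


# Hypothetical data: 1=alfa, 2=beta, 3=gamma; studietijd in hours/week
opleiding = [1, 1, 1, 2, 2, 2, 3, 3, 3]
studietijd = [30.0, 34.0, 32.0, 40.0, 42.0, 38.0, 35.0, 33.0, 37.0]

# Columns: intercept, isbeta, isgamma (alfa is the reference category)
rows = [[1.0, float(g == 2), float(g == 3)] for g in opleiding]
coefs = ols(rows, studietijd)

print("intercept, b(isbeta), b(isgamma):", [round(b, 3) for b in coefs])
print("F from ANOVA:     ", round(anova_f(opleiding, studietijd), 3))
print("F from regression:", round(regression_f(rows, studietijd, coefs), 3))
```

In this toy example the intercept equals the mean of the reference group, each dummy coefficient equals the difference between that group's mean and the reference mean, and the two F ratios coincide.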
Sources:
Please remember to evaluate this course. Go to
www.let.uu.nl/oce,
log in with your SolisID, and fill in the evaluation about this course.

Butler, Ch. (1985) Statistics in Linguistics. s.l.: Blackwell.
[out of print, but see the
web version].

Carver, R.H. & Nash, J.G. (2005) Doing Data Analysis with SPSS version 12.0. Belmont, CA: Brooks/Cole. ISBN 053446551x.

Gelman, A. & Hill, J. (2007) Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press. ISBN 9780521686891.

Johnson, K. (2008) Quantitative Methods in Linguistics. Malden, MA: Blackwell. ISBN 9781405144254.

Maxwell, S.E. & Delaney, H.D. (2004) Designing Experiments and Analyzing Data: A model comparison perspective (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates. ISBN 0805837183. [very good, but not an easy book].

Kirkpatrick, L.A. & Feeney, B.C. (2005) A Simple Guide to SPSS for Windows/ for Version 12.0. Belmont, CA: Thomson Wadsworth. ISBN 0534610064.

Levin, I.P. (1998) Relating Statistics and Experimental Design: An introduction. Thousand Oaks, CA: Sage. Sage University Papers Series on Quantitative Applications in the Social Sciences; 07125. ISBN 0761914722.

Rosenthal, R., & Rosnow, R.L. (2008) Essentials of Behavioral Research: Methods and Data Analysis. Boston: McGraw Hill. ISBN 0073531960.

Rosenthal, R., Rosnow, R.L., & Rubin, D.B. (2000) Contrasts and Effect Sizes in Behavioral Research: A correlational approach. Cambridge: Cambridge University Press. ISBN 0521659809.

StatSoft, Inc. (2004) Electronic Statistics Textbook. Tulsa, OK: StatSoft.
URL: http://www.statsoft.com/textbook/stathome.html [clear and concise chapters about most statistical topics].

Also check the hyperlinks listed under session 1.

Also check the webpage of my former statistiek course [in Dutch].
© 2003-2009
HQ
2009.11.09