The critical values for t-tests and F-tests can be found, for instance, at this site:
http://www.statsoft.com/textbook/distribution-tables/ , or in most econometrics textbooks.
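As a rough cross-check on table look-ups, the sketch below computes standard normal critical values with Python's standard library. It assumes that, with several hundred observations, t critical values are close to their normal counterparts; the printed tables remain the reference for exact values. The function name `normal_critical_value` is a hypothetical helper, not part of the exercise.

```python
from statistics import NormalDist


def normal_critical_value(alpha: float, two_sided: bool = True) -> float:
    """Standard normal critical value for significance level alpha.

    Assumption: used here only as a large-sample approximation to the
    t distribution; exact t critical values depend on degrees of freedom.
    """
    # Split alpha across both tails for a two-sided test.
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


# Two-sided test at the 1% level.
print(round(normal_critical_value(0.01, two_sided=True), 3))
# One-sided test at the 5% level.
print(round(normal_critical_value(0.05, two_sided=False), 3))
```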
You have data on British year 11 pupils, and their performance on a test of computer skills. The summary statistics are as follows:
Variable  |  Obs |   Mean | Std. Dev. | Min | Max
----------+------+--------+-----------+-----+-----
tot       |  322 | 22.043 |    5.4657 |   8 |  41
mat       |  321 | 5.6137 |    1.2249 |   2 |   7
eng       |  322 | 5.6211 |    1.1299 |   2 |   7
female    |  322 | 0.5124 |    0.5006 |   0 |   1
csgcse    |  318 | 0.1509 |    0.3586 |   0 |   1
onlaptop  |  305 | 1.4262 |    1.2429 |   0 | 3.5
ondesktop |  301 | 0.9003 |    1.1525 |   0 | 3.5
The variables are: tot = total score in the computer skills test; mat = predicted GCSE math result; eng = predicted English result (both coded 2-7, where 2=E, 3=D, 4=C, 5=B, 6=A, 7=A*); female = dummy for a girl; csgcse = dummy for whether the pupil is taking Computer Science GCSE; onlaptop and ondesktop = typical daily hours spent using a laptop or a desktop PC, respectively, as reported by the pupil.
You estimate two models with the following results (estimated standard errors are in parentheses):
(Model 1) tot = 13.91 + 1.52*mat + 0.13*eng - 2.11*female
                (1.54)  (0.31)     (0.33)     (0.58)
          N = 321, R2 = 0.188

(Model 2) tot = 10.47 + 1.55*mat + 0.22*eng - 1.19*female
                (1.72)  (0.31)     (0.34)     (0.62)
              + 2.35*csgcse + 0.54*onlaptop + 1.29*ondesktop
                (0.83)        (0.24)          (0.28)
          N = 280, R2 = 0.280
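The t-ratios needed for significance tests follow directly from the reported estimates: each is the coefficient minus its hypothesised value, divided by the standard error. A minimal sketch, using only the numbers reported above; the dictionary layout and the helper name `t_ratio` are illustrative assumptions, not part of the exercise.

```python
# Reported (coefficient, standard error) pairs, copied from the two models.
model1 = {
    "const": (13.91, 1.54),
    "mat": (1.52, 0.31),
    "eng": (0.13, 0.33),
    "female": (-2.11, 0.58),
}
model2 = {
    "const": (10.47, 1.72),
    "mat": (1.55, 0.31),
    "eng": (0.22, 0.34),
    "female": (-1.19, 0.62),
    "csgcse": (2.35, 0.83),
    "onlaptop": (0.54, 0.24),
    "ondesktop": (1.29, 0.28),
}


def t_ratio(coef: float, se: float, null: float = 0.0) -> float:
    """t-ratio for H0: coefficient == null (hypothetical helper)."""
    return (coef - null) / se


for name, model in (("Model 1", model1), ("Model 2", model2)):
    print(name)
    for var, (coef, se) in model.items():
        print(f"  {var:<10} t = {t_ratio(coef, se):7.3f}")
```

Each ratio is then compared with the critical value for the chosen significance level and for a one- or two-sided alternative, as appropriate.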
1. Test whether each variable in Model 1 is statistically significant at the 1% level, and interpret the results.
2. Use the results from both Model 1 and Model 2 to discuss whether girls have weaker computer skills than boys, and why (1-2 paragraphs is enough).
3. Test the null hypothesis that the coefficient on female in Model 2 is zero against the alternative hypothesis that it is negative, at the 5% significance level.
4. Name one reason why you might not want to directly compare the estimated coefficients for female in Models 1 and 2. Elaborate if necessary (1-2 paragraphs is enough).
Model 2 could be developed further: instead of using the variables mat and eng, one could create a separate dummy variable for each predicted grade and use those in the estimation. Let's call those variables mt2, mt3, mt4, mt5, mt6, mt7 and en2, en3, en4, en5, en6, en7.
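To make the construction concrete, here is a minimal sketch of how such grade dummies could be built from a coded grade variable. The grades in `mat` are made-up illustration values, not the exam data, and `grade_dummies` is a hypothetical helper.

```python
def grade_dummies(grades, prefix, levels=range(2, 8)):
    """Return one 0/1 indicator column per grade level.

    Keys are named prefix + level, e.g. 'mt6' for a predicted grade of 6 (A).
    """
    return {
        f"{prefix}{level}": [1 if g == level else 0 for g in grades]
        for level in levels
    }


# Made-up predicted math grades for six pupils (illustration only).
mat = [2, 5, 6, 7, 4, 6]
mt = grade_dummies(mat, "mt")

for name, column in mt.items():
    print(name, column)

# For every pupil exactly one of mt2..mt7 equals 1.
row_sums = [sum(col[i] for col in mt.values()) for i in range(len(mat))]
print(row_sums)
```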
5. If you ran 'Model 3', where mat and eng are replaced by mt3, mt4, mt5, mt6, mt7 and en3, en4, en5, en6, en7, what would be the interpretation of the estimated coefficient for mt6?
6. What could happen to R2 in the hypothetical 'Model 3' compared with R2 in Model 2?