What is the relationship between the length of the complement and the choice of going to/gonna?
gonna_plot
Linear fit
The relationship is not linear.
gonna_plot + geom_smooth(method = "lm", se = FALSE, color = "darkolivegreen4")
The residuals are not normally distributed.
Not all predicted values are possible: probabilities lie between 0 and 1, but a straight line is unbounded.
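The out-of-range problem can be seen in a minimal sketch. The data below are made up for illustration and are not the gt data set; the names toy, n_going_to and m_lin are hypothetical.

```r
# Toy data (made up for illustration, NOT the gt data set):
# 10 tokens per complement length, 1 = "going to", 0 = "gonna".
n_going_to <- c(10, 10, 10, 8, 3, 0, 0, 0)
toy <- data.frame(
  comp_length = rep(1:8, each = 10),
  variant = unlist(lapply(n_going_to, function(k) rep(c(1, 0), c(k, 10 - k))))
)

# Straight-line fit to a 0/1 outcome
m_lin <- lm(variant ~ comp_length, data = toy)

# Predicted "probabilities" at the shortest and longest complements
pred <- predict(m_lin, newdata = data.frame(comp_length = c(1, 8)))
pred
```

With these toy counts the line predicts a value above 1 for the shortest complement and a value below 0 for the longest, neither of which can be a probability.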
Logistic fit
The fit has an S shape.
gonna_plot + geom_line(aes(y = fit1), color = "goldenrod", size = 2)
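The slides do not show how fit1 was computed. One plausible way to obtain such a column is to take the fitted probabilities of a logistic model. The sketch below uses made-up toy data, not the gt data set, and the names toy, n_going_to and m_toy are hypothetical.

```r
# Toy data (made up for illustration, NOT the gt data set):
# 10 tokens per complement length, 1 = "going to", 0 = "gonna".
n_going_to <- c(10, 10, 10, 8, 3, 0, 0, 0)
toy <- data.frame(
  comp_length = rep(1:8, each = 10),
  variant = unlist(lapply(n_going_to, function(k) rep(c(1, 0), c(k, 10 - k))))
)

# Logistic regression with complement length as the only predictor
m_toy <- glm(variant ~ comp_length, family = binomial(logit), data = toy)

# Fitted probabilities of "going to", one per row (assumed analogue of fit1)
toy$fit <- predict(m_toy, type = "response")
range(toy$fit)
```

Unlike the straight line, the fitted values stay strictly between 0 and 1, which is what gives the curve its S shape.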
Model
summary(m2)
Call:
glm(formula = variant ~ comp_length + register, family = binomial(logit),
    data = gt)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.9876  -0.2053  -0.0209   0.1067   2.1787

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)      11.582      2.899    4.00  6.5e-05 ***
comp_length      -2.651      0.613   -4.32  1.5e-05 ***
registerformal    3.284      1.068    3.08   0.0021 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 138.63  on 99  degrees of freedom
Residual deviance:  34.02  on 97  degrees of freedom
AIC: 40.02

Number of Fisher Scoring iterations: 7
Interpretation
Intercept
log odds of the outcome (“going to”) when all predictors are at 0 or at their reference level (comp_length = 0, register = informal).
odds = exp(11.582) = 107138
prob = odds/(odds+1) \(\approx\) 1
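The intercept arithmetic can be checked in R. This sketch uses the rounded estimate printed in the summary, so the odds come out at roughly 107,000 rather than matching the figure above to the last digit; the variable names are illustrative.

```r
# Rounded intercept from the model summary (log odds of "going to")
b0 <- 11.582

# Log odds -> odds -> probability
odds <- exp(b0)
prob <- odds / (odds + 1)

# plogis() is the inverse logit and gives the same probability in one step
c(odds = odds, prob = prob, plogis = plogis(b0))
```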
Coefficients
log odds ratios: a positive coefficient increases the chances of “going to”, a negative one those of “gonna”.
odds ratio
of comp_length = exp(-2.651) = 0.071
of register = exp(3.284) = 26.676
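The odds of going to vs gonna in the formal register are 26.676 times those in the informal register, holding complement length constant. Likewise, each one-unit increase in complement length multiplies the odds of going to by 0.071, holding register constant.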
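A short sketch of how the odds ratios translate into predicted probabilities. It uses the rounded estimates from the summary, so results agree with the figures above only to about three significant digits. The complement length of 5 is an arbitrary value chosen for illustration.

```r
# Rounded estimates copied from the model summary
coefs <- c("(Intercept)" = 11.582, comp_length = -2.651, registerformal = 3.284)

# Odds ratios: exponentiate the slopes
odds_ratios <- exp(coefs[-1])
round(odds_ratios, 3)

# Linear predictor (log odds) at a hypothetical complement length of 5
eta_informal <- coefs[["(Intercept)"]] + coefs[["comp_length"]] * 5
eta_formal <- eta_informal + coefs[["registerformal"]]

# Predicted probabilities of "going to" in each register
probs <- plogis(c(informal = eta_informal, formal = eta_formal))
probs
```

The ratio of the two odds, exp(eta_formal) / exp(eta_informal), equals the odds ratio for register whatever complement length is chosen. The difference between the two probabilities does depend on the length.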
Probabilities, odds and logit
Probabilities
probs.
\(P\)
0 - 0.5 - 1
Number of successes divided by number of trials.
Odds
odds
\(\frac{P}{1-P}\)
0 - 1 - \(\infty\)
Probability of success divided by the probability of failure.
Undefined for \(P=1\).
Logit
logit
\(\log\left(\frac{P}{1-P}\right)\)
\(-\infty\) - 0 - \(\infty\)
If positive, success is more likely; if negative, failure is more likely.
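The three scales can be converted into one another in base R. A minimal sketch with arbitrary example probabilities:

```r
# Example probabilities (arbitrary values for illustration)
p <- c(0.1, 0.5, 0.9)

# Probability -> odds -> logit
odds <- p / (1 - p)
logit <- log(odds)

# qlogis() is the logit function, plogis() its inverse
data.frame(p = p, odds = odds, logit = logit,
           qlogis = qlogis(p), back_to_p = plogis(logit))
```

A probability of 0.5 corresponds to odds of 1 and a logit of 0, and probabilities that are symmetric around 0.5 have logits of equal size and opposite sign.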