In rare cases, the rounding, e.g. of the ‘Std. Error’. column in printCoefmat() is undesirably rounding too much and then depicting inaccurate values. This is from the recent R-devel mailing list thread starting at https://stat.ethz.ch/pipermail/r-help/2023-July/477688.html
Author
Martin Maechler
Background
Printing of e.g. lm() results; but also glm(), anova(), etc; not just package {stats} but used by other packages as well, e.g., lme4.
Problem statement
Slightly modified from Shu Fai Cheung’s example on the R-devel mailing list:
Call:
lm(formula = y ~ x1 + x2, data = dat)
Residuals:
Min 1Q Median 3Q Max
-0.102097 -0.022598 -0.001331 0.023538 0.121632
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.761e-07 1.115e-03 0.0 1
x1 4.992e-01 1.128e-03 442.4 <2e-16 ***
x2 6.010e-01 1.108e-03 542.3 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.03524 on 997 degrees of freedom
Multiple R-squared: 0.9981, Adjusted R-squared: 0.9981
F-statistic: 2.677e+05 on 2 and 997 DF, p-value: < 2.2e-16
The standard print is fine, but we would like the intercept to be “zapped to zero”. So we try zap.ind = 1 (which is passed to printCoefmat by print.summary.lm):
print(sfm, zap.ind =1)
Call:
lm(formula = y ~ x1 + x2, data = dat)
Residuals:
Min 1Q Median 3Q Max
-0.102097 -0.022598 -0.001331 0.023538 0.121632
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.000000 0.001115 0.0 1
x1 0.499200 0.001128 442.4 <2e-16 ***
x2 0.601000 0.001108 542.3 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.03524 on 997 degrees of freedom
Multiple R-squared: 0.9981, Adjusted R-squared: 0.9981
F-statistic: 2.677e+05 on 2 and 997 DF, p-value: < 2.2e-16
Here, we’d rather like to see correctly rounded coefficients
The use of zapsmall() there, can round too much in these cases.
One option is to extend the meaning of the zap.ind argument and allow negative indices to mean that the zapping should not be done with zapsmall() but with an alternative function (to be defined inside printCoefmat()):
---anova.R (revision 84998)+++anova.R (working copy)@@-105,7+105,13@@ ok <-!(ina <-is.na(xm))## zap before deciding any formats-for (i in zap.ind) xm[, i] <-zapsmall(xm[, i], digits)+ zapOnlysmall <-function(x, dig) { x[abs(x) <=10^-dig] <-0 ; x }+for (i in zap.ind) { ai <-abs(i) # i < 0 [zapOnly..] vs i > 0 [zapsmall]:+ xm[, ai] <- (if(i <0) zapOnlysmall else zapsmall)(xm[, ai], digits)+## problem: if one of these is a cs.ind (below) and is of larger order of magn., it is+## ------- rounded too much here --> shows trailing 00 afterwards.+## ==> TODO: detect and *fix* rather than needing negative zap.ind for zapOnlysmall+ }
which “solved” the problem, but actually also changes other examples (not shown here) where a change is not necessary.
Also, see my “TODO” above, I think we could try to make this automatic, so the zap.ind = 1 example above would just do “the right thing”.
Testing
This is not yet a strict test, but code you can/should use this code to get a broader picture. Note that this produces a few pages of R console output :
cf <-coef(sfm)for(dig inc(3:5, 7, 10)) {cat("\ndigits = ", dig,":\n================\n")for(f in10^c(-6, -3:3, 6, 9)) {cat("cf[,1:2] := cf0[,1:2] * f, f=", formatC(f),":\n") fcf <- cf; fcf[,1:2] <- f*cf[,1:2]printCoefmat(fcf, signif.legend=FALSE, digits=dig) # using e-07 for intercept --> everything in sci.formatcat(" ... and with zap.ind = 1:\n")printCoefmat(fcf, signif.legend=FALSE, digits=dig, zap.ind =1)cat("-------------------------\n\n") }}
Project requirements
Almost none. Some understanding about floating point arithmetic, notably rounding.
Project outcomes
A patch to the R source. Code changes only in the first file (anova.R) but a really useful patch also updates the documentation (*.Rd) notably adding an example there.
As I have seen (with my own first fix), that it may change output of other examples, and for some of these, the tests/Examples/<pkg>-Ex.Rout.save files also need patching ( tests/Examples/stats-Ex.Rout.save will need an update in any case if we modify/add examples on the help page). Files to change: