The test count was off because, without warnings disabled, warning lines
were also being counted as tests.
The `head -n -2` keeps everything except the last two lines, which contain a
summary count (not sure why that count isn't used directly). If you run
without `--disable-warnings`, the output also includes any warnings that
occur during test collection, and those extra lines get counted as tests,
which we don't want in this case.
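
A minimal sketch of the effect, using simulated output rather than a real test run. The function names, test ids, and warning line are made up for illustration, and the negative count in `head -n -2` assumes GNU coreutils.

```shell
# Simulated collection output with warnings disabled (hypothetical test ids):
# two test ids, a blank line, then the summary count.
clean_output() {
  printf '%s\n' \
    'tests/test_a.py::test_one' \
    'tests/test_a.py::test_two' \
    '' \
    '2 tests collected in 0.01s'
}

# Same output, but with an illustrative warning line and an extra blank
# line before the summary.
noisy_output() {
  printf '%s\n' \
    'tests/test_a.py::test_one' \
    'tests/test_a.py::test_two' \
    '' \
    'warning: example warning raised during collection' \
    '' \
    '2 tests collected in 0.01s'
}

# Drop the last two lines, then count what is left.
clean=$(clean_output | head -n -2 | wc -l | tr -d ' ')
inflated=$(noisy_output | head -n -2 | wc -l | tr -d ' ')

echo "without warnings: $clean"    # 2
echo "with warnings:    $inflated" # 4
```

In the second case `head -n -2` still only removes the final two lines, so the blank separator and the warning line are counted alongside the two real tests.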