We sometimes see rendering errors in the error page itself, which then
cause another attempt at rendering the error page. I'm not sure _exactly_
how the loop is occurring, but it looks something like this:
1. An error is raised in a view or middleware and is not caught by
application code
2. Django catches the error and calls the registered uncaught error
handler
3. Our handler tries to render an error page
4. The rendering code raises an error
5. GOTO 2 (until some sort of server limit is reached)
By catching all errors raised during error-page render and substituting in
a hardcoded string, we can reduce server resource usage, avoid logging
massive sequences of recursive stack traces, and still give the user *some*
indication that yes, there was a problem.
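
As a rough illustration of the catch-and-substitute idea (a standalone sketch, not the code in this change), the pattern looks something like the following. `render_error_page_safely`, `render_page`, and `FALLBACK_BODY` are hypothetical names made up for this example, and the fallback text is a placeholder rather than the string the fix actually uses.

```python
import logging

log = logging.getLogger(__name__)

# Hypothetical fallback text; a placeholder, not the string used by the fix.
FALLBACK_BODY = "An error occurred, and the error page could not be displayed."


def render_error_page_safely(render_page):
    """Return a (status, body) pair for a server-error response.

    `render_page` is a hypothetical zero-argument callable standing in for
    the normal error-template rendering.
    """
    try:
        return 500, render_page()
    except Exception:
        # Deliberately broad: unknown rendering errors must be caught too,
        # otherwise the failure escapes and error handling is triggered again.
        log.exception("Error page failed to render; using hardcoded fallback")
        return 500, FALLBACK_BODY


def broken_template():
    # Mimics a template expression that raises while rendering.
    return None * 7


status, body = render_error_page_safely(broken_template)
```

The broad `except Exception` is the point of the sketch: the rendering failure is logged once and the response falls back to plain text that needs no further rendering.
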
This should help address https://github.com/openedx/edx-platform/issues/35151
At least one of these rendering errors is known to be due to a translation
error. There's a separate issue for restoring translation quality so that
we avoid such errors in the future (https://github.com/openedx/openedx-translations/issues/549),
but in general we should catch all rendering errors, including unknown
ones.
Testing:
- In `lms/envs/devstack.py` change `DEBUG` to `False` to ensure that the
usual error page is displayed (rather than the debug error page).
- Add line `1/0` to the top of the `student_dashboard` function in
`common/djangoapps/student/views/dashboard.py` to make that view error.
- In `lms/templates/static_templates/server-error.html` replace
`static.get_platform_name()` with `None * 7` to make the error template
itself produce an error.
- Visit <http://localhost:18000/dashboard>.
Without the fix, the response takes 10 seconds and produces a 6 MB,
85k-line set of stack traces, and the page displays "A server error
occurred. Please contact the administrator."
With the fix, the response takes less than a second and produces three
stack traces (one of which contains the error page's rendering error).