The mobile app is getting unexpected 403s from
/oauth2/exchange_access_token/, but we have been unable
to pinpoint from where they are coming. This commit
introduces a temporary exception handler to provide stack info
for 403s on this endpoint to try to track down the source.
Requires the ENABLE_403_MONITORING setting to be set to
True to enable the logging.
ARCHBOM-1667
We use django-ratelimit to limit per IP login attempts, and then we use
django-ratelimit-backend to limit per username login attempts. This
change replaces the usage of django-ratelimit-backend with another
instance of django-ratelimit so that both limits can be managed by one
library.
This is the first step in being able to fully excise
django-ratelimit-backend from edx-platform. Note that we're still using
the `RateLimitMixin` in openedx/core/djangoapps/oauth_dispatch/dot_overrides/backends.py
because studio and the admin UI still relies on that for rate limiting.
Those login paths will have to be updated before we can remove the mixin
from our auth backend.
Mobile apps load HTML (and other) XBlocks individually using the
render_xblock endpoint. This is an attmept to reduce the number
of requests and JS processing needed to do so by detecting when
we have math content in HTMLBlocks and only adding the Mathjax
resources when necessary.
This is controlled by the "courseware.optimized_render_xblock"
CourseWaffleFlag. For maximum safety, we currently only optimize
in this way when directly hitting HTMLBlocks, and not for
ProblemBlock or VerticalBlock.
This was made as part of edX's Hackathon XXV.
The rate limiting library computes the rate limit by chunking time since
the epoch into chunks of whatever your period is. It then adds some
consistent offset based on your key. This means that at certain times,
you are closer to the end of your rate limit time period than others.
So moving 1 minute into the future would put you into the next time
chunk and your rate limit would be reset.
I updated the test to test rate limit at the same time as the initial
call to ensure that we don't end up on the other side of a time chunk
boundary by accident. We were seeing times in CI where it
would occasionally fail because time chunking wasn't in our favor.
Now that we always return an existing value from the DB rather than trusting that ID generation is deterministic and constant over time, we're free to change the generation algorithm.
Our long term goal is to switch to random IDs, but we need to first investigate the uses of save=False. In the meantime, this is a good opportunity to move away from MD5, which has a number of cryptographic weaknesses. None of the known vulnerabilities are considered exploitable in this location, given the limited ability to control the input to the hash, but we should generally be moving away from it everywhere for consistency.
This change should not be breaking even for save=False callers, since those calls are extremely rare (1 in 100,000) and should only occur after a save=True call, at which point they'll use the stored value. Even if this were not true, for a save=False/True pair of calls to result in a mismatch in output, the first of the calls would have to occur around the time of the deploy of this code.
Co-authored-by: Tim McCormack <tmccormack@edx.org>
Co-authored-by: Tim McCormack <tmccormack@edx.org>
This deprecates `save=False` for several functions and removes all known
usages of the parameter but does not actually remove the parameter.
Instead, it will emit a deprecation warning if the parameter is used.
We can remove the parameter as soon as we feel sure nothing is using it.
Now that we have refactored `anonymous_id_for_user` to always prefer
retrieving an existing ID from the database -- and observed that only a
small fraction of calls pass save=False -- we can stop respecting
save=False. This opens the door for future improvements, such as generating
random IDs or switching to the external user ID system.
Metrics: I observe that 1 in 16 requests for new, non-request-cached
anon user IDs are made with save=False. But 71% of all calls are served
from the request cache, and 99.7% of the misses are served from the DB.
save=False only appear to come from intermittent spikes as reports are
generated and are low in absolute number.
Also document usage/risk/rotation of secret in anonymous user ID
generation as indicated by `docs/decisions/0008-secret-key-usage.rst`
ADR on `SECRET_KEY` usage.
ref: ARCHBOM-1683
Currently it's hard to see the content of an error without knowing how
to cause an existing view to make that error in production. Adding
these default paths should make that a lot easier.
See context here: https://django-ratelimit.readthedocs.io/en/latest/cookbook/429.html#context
For now we continue to fall back to django's default 403 handler for 403
but provide a new 429 template that we use for ratelimit exceptions.
This commit also updates a logistration test that relied on the old 403
behavior of django-ratelimit instead of the newly added 429 behavior.