This brings an important security improvement -- codejail won't default to
running in unsafe mode, which can happen if certain configuration errors
are present.
Properly configured installations shouldn't be affected. We just need to
adjust some unit tests to opt into unsafe mode.
Changes:
- Update `edx-codejail` dependency to [version 4.0.0](https://github.com/openedx/codejail/blob/master/CHANGELOG.rst#400---2025-06-13)
- Define a `use_unsafe_codejail` decorator that allows running a unit test (or entire TestCase class) in unsafe mode
- Use that decorator as needed, based on which tests started failing
For darklaunch comparisons where the two sides have different Python
versions, we'll want a more comprehensive list of normalizers.
- Expand the default list to include patterns discovered during a Python
3.8 vs. 3.12 comparison.
- Append the setting value by default, rather than replacing (but still
allow replacing).
- Use default normalizers if custom ones can't be loaded.
- Add log message when loading normalizers fails.
- Validate the replacement pattern, not just the search pattern.
This is intended to make logs more or less a standalone source for
analyzing mismatches.
- Only log mismatches or exceptions
- Merge local and remote log messages into one so they can be correlated
more easily
- Different log messages for mismatch vs. unexpected exceptions
- Include course ID (as limit overrides context) in log message
- Fix bug where we were overwriting `remote_emsg` with None, and add test
that would have caught it.
- Suppress differences due solely to the codejail sandbox directory name
differing (in stack traces), and add test for this. Configurable because
we'll need to add an additional search/replace pair for the sandbox venv
paths.
- Add a variety of custom attributes, replacing existing ones. The attrs
now have a prefixed naming scheme to simplify searching.
- Add slug to log output so we can more readily correlate traces and logs,
as well as logs across services.
- Fix typo in error message.
- Fix existing import sort order lint.
- Separate test for misconfiguration
- Add helper method for generic dark launch testing
- Test two darklaunch scenarios: Globals interference, and error that would
previously have caused the remote side not to run
- Rename mocks to have our usual `mock_` prefix
- Catch all exceptions, not just Exception, to better prevent errors from
interfering with mainline responses.
- Introduce a separate try block around the monitoring code so that bugs
there don't cause issues.
- Print exception information as well for both sides (but only if not a
SafeExecException, which is redundant with emsg).
Some formatting changes to log messages as well.
Example outputs:
For `1/0`:
```
2025-04-14 17:26:34,239 INFO 10232 [xmodule.capa.safe_exec.safe_exec] [user 3] [ip 172.18.0.1] safe_exec.py:240 - Remote execution in darklaunch mode produces globals={'expect': None, 'ans': '1/0'}, emsg=None, exception=None
2025-04-14 17:26:34,239 INFO 10232 [xmodule.capa.safe_exec.safe_exec] [user 3] [ip 172.18.0.1] safe_exec.py:245 - Local execution in darklaunch mode produces globals={'expect': None, 'ans': '1/0'}, emsg='ZeroDivisionError: division by zero', exception=None
```
For `raise BaseException("hi")`:
```
2025-04-14 17:26:13,359 INFO 10232 [xmodule.capa.safe_exec.safe_exec] [user 3] [ip 172.18.0.1] safe_exec.py:240 - Remote execution in darklaunch mode produces globals={'expect': None, 'ans': 'raise BaseException("hi")'}, emsg=None, exception=None
2025-04-14 17:26:13,359 INFO 10232 [xmodule.capa.safe_exec.safe_exec] [user 3] [ip 172.18.0.1] safe_exec.py:245 - Local execution in darklaunch mode produces globals={'expect': None, 'ans': 'raise BaseException("hi")'}, emsg='hi', exception=BaseException('hi')
```
With codejail-service down, and `out = 1 + 2`:
```
2025-04-14 17:30:28,597 INFO 12484 [xmodule.capa.safe_exec.safe_exec] [user 3] [ip 172.18.0.1] safe_exec.py:241 - Remote execution in darklaunch mode produces globals={'expect': None, 'ans': 'out = 1 + 2', 'out': 3, 'cfn_return': {'input_list': [{'ok': True, 'msg': 'Output:\n3', 'grade_decimal': 1}]}}, emsg=None, exception=CodejailServiceUnavailable('Codejail API Service is unavailable. Please try again in a few minutes.')
2025-04-14 17:30:28,597 INFO 12484 [xmodule.capa.safe_exec.safe_exec] [user 3] [ip 172.18.0.1] safe_exec.py:246 - Local execution in darklaunch mode produces globals={'expect': None, 'ans': 'out = 1 + 2', 'out': 3, 'cfn_return': {'input_list': [{'ok': True, 'msg': 'Output:\n3', 'grade_decimal': 1}]}}, emsg=None, exception=None
```
During dark launch of remote codejail, we want to ensure we always run both
local and remote execution -- otherwise we're missing data for the remote
side in an important situation.
This will help answer the question of whether the unexpected exception
happens on both sides, even though it may not look exactly the same due to
differences in how unexpected errors are handled.
An example input that provokes this in unsafe execution mode is
`raise BaseException("hi")`; in safe execution mode, printing to
`sys.__stdout__` should also produce an appropriate error.
Example output from running `import matplotlib; 1/0`, before and after the change:
```diff
--- tmp/before 2025-03-28 03:34:06.633689552 +0000
+++ tmp/after 2025-03-28 03:34:37.268688891 +0000
@@ -1,6 +1,5 @@
-Matplotlib created a temporary cache directory at /tmp/codejail-hveq16ah/tmp/matplotlib-tv0c_vzt because the default path (/home/sandbox/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Traceback (most recent call last):
File "jailed_code", line 19, in <module>
exec(code, g_dict)
File "<string>", line 1, in <module>
ZeroDivisionError: division by zero
```
This makes it easier to run matplotlib in codejail, and should prevent a
number of other issues in the future with other packages that need to
create tempfiles.
No change is required for existing codejail installations, but after this
change operators may be able to tighten their apparmor configuration to
prevent write access to global temp or cache dirs.
Manual testing instructions: Create a codejail problem that runs
`import matplotlib` and confirm that it runs without error. (Unit tests
aren't feasible here because this requires a fully configured codejail in
order for the tmp subdirectory to exist.)
Also: Add comment for `OPENBLAS_NUM_THREADS` and numpy support.
Currently, ./xmodule/ unit tests are only run with LMS settings. However,
./common/ and ./xmodule/ are run twice: once with LMS settings and once with
CMS settings.
Just like ./common/ and ./openedx/, the unit tests in ./xmodule/ validate
behavior in both LMS and CMS. So, order to fully test ./xmodule/, we should to
run its tests with CMS settings too.
This will enable us to better validate certain LibraryContentBlocks behaviors
being touched by https://github.com/openedx/edx-platform/pull/33263 which can't
be expressed under LMS settings.
Also in this commit:
* refactor: rename the shards to be clear whether they're running under LMS or CMS
* docs: correct comments regarding conditions under which codejail's
test_cant_do_something_forbidden is skipped.
* test: update a unit test which was using the now-deleted library_sourced block to use
library_content block instead.
* get rid of six.text_type(s)
* get rid of six.b()
* get rid of six.string_types
* get rid of six.PY2/six.PY3
* get rid of six.iteritems() and six.viewvalues()
As part of dissolving our sub-projects in edx-platform, we are moving this package under the xmodule directory.
We have fixed all the occurences of import of this package and also fixed all documents related references.
This might break your platform if you have any reference of `import capa` or `from capa import` in your codebase or in any Xblock.
Ref: https://openedx.atlassian.net/browse/BOM-2582