From 6f5efb62031193702fc787bf30469c94f8f94070 Mon Sep 17 00:00:00 2001
From: Kelly Buchanan <kbuchanan@2u.com>
Date: Fri, 25 Mar 2022 14:06:51 -0400
Subject: [PATCH] docs: Add ADR for not sanitizing html in bulk email.

[MICROBA-1666]
- Update ADR to explain why we will not be sanitizing the contents of bulk email messages.
---
 .../001-bulk-email-content-sanitization.rst   | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/lms/djangoapps/bulk_email/docs/decisions/001-bulk-email-content-sanitization.rst b/lms/djangoapps/bulk_email/docs/decisions/001-bulk-email-content-sanitization.rst
index 7bc53a835a..4741225259 100644
--- a/lms/djangoapps/bulk_email/docs/decisions/001-bulk-email-content-sanitization.rst
+++ b/lms/djangoapps/bulk_email/docs/decisions/001-bulk-email-content-sanitization.rst
@@ -1,6 +1,6 @@
-====================================
-Bulk Email HTML Content Sanitization
-====================================
+=============================================
+Bulk Email HTML Content Will Not Be Sanitized
+=============================================
 
 Status
 ------
@@ -17,16 +17,15 @@ It is considered good security practice to scan and sanitize user-provided conte
 Decision
 --------
 
-We will sanitize the HTML content received through the bulk course email tool before sending the messages to any recipients.
+We will not sanitize the HTML content received through the bulk course email tool on the back end. The content that instructors create for courses also allows unfiltered html, which is arguably a larger risk than email, which will block the execution of certain types of code. In the future, if a standard is set for filtering course content, the same standard could be applied here.
 
-We will use the `bleach`_ Python package to sanitize the data.
 
-We will introduce a new configuration setting called ``BULK_COURSE_EMAIL_ALLOWED_HTML_TAGS`` that acts as an *allowlist* of HTML tags permitted for use within the body of a message authored by the bulk course email tool. A default list of options is provided in the ``lms/envs/common.py`` `configuration file`_. Offending data will be escaped (converted to plaintext) over being stripped out.
+Rejected solutions
+------------------
 
-Consequences
-------------
+Bleach with allowlist
+*********************
 
-A message sent through the Bulk Course Email tool that includes any restricted HTML content will not appear as intended to the recipients of the message. The restricted HTML content will be converted to plaintext and will not render.
+It has been standard practice for us to use `Bleach`_ with an allowlist to sanitize user provided content within the Open edX ecosystem. Santization using blocklists is vulnerable to obfuscation attacks, and the industry standard is to use an allowlist and explicitly enumerate all supported values. Given that this tool has been live for many years with no sanitization in place, a highly restrictive allowlist would be difficult to roll out to users, resulting in broken email templates and angry instructors who are used to having free rein. A permissive allowlist would potentially address this, but presents the non-trivial problem of assembling and maintaining a comprehensive list.
 
 .. _bleach: https://bleach.readthedocs.io/en/latest/
-.. _configuration file: https://github.com/openedx/edx-platform/blob/e608db847c39c2e3d723ef81f7dac66f63663a28/lms/envs/common.py#L4965