From ae5c897ef6dfa6d012ac2880dd93c06bfe4a92bf Mon Sep 17 00:00:00 2001
From: Alison Hodges
Date: Fri, 27 Jun 2014 14:29:46 -0400
Subject: [PATCH] Completed edits for Sylvia's review. Added data czar to glossary

---
 .../source/getting_started/glossary.rst | 12 +++
 .../internal_data_formats/credentials.rst | 84 ++++++++++---------
 2 files changed, 56 insertions(+), 40 deletions(-)

diff --git a/docs/en_us/course_authors/source/getting_started/glossary.rst b/docs/en_us/course_authors/source/getting_started/glossary.rst
index f071c487dd..cc083f9faa 100644
--- a/docs/en_us/course_authors/source/getting_started/glossary.rst
+++ b/docs/en_us/course_authors/source/getting_started/glossary.rst
@@ -143,6 +143,18 @@ C
 D
 ****
 
+.. _Data Czar_g:
+
+**Data Czar**
+
+  A data czar is the single representative at a partner institution who is
+  responsible for receiving course data from edX, and transferring it securely
+  to researchers and other interested parties after it is received.
+
+  See `edX Research Guide`_.
+
+.. _edX Research Guide: http://edx.readthedocs.org/projects/devdata/en/latest/
+
 .. _Discussion Forum:
 
 **Discussion Forum**
diff --git a/docs/en_us/data/source/internal_data_formats/credentials.rst b/docs/en_us/data/source/internal_data_formats/credentials.rst
index aea7810bef..a77154fbfb 100644
--- a/docs/en_us/data/source/internal_data_formats/credentials.rst
+++ b/docs/en_us/data/source/internal_data_formats/credentials.rst
@@ -8,10 +8,10 @@ EdX transfers course data to the data czars at our partner institutions in
 regularly generated data packages. Data packages can only be accessed by a
 single contact at each university, referred to as the "data czar".
 
-The data czar who is selected at each institution sets up encryption "keys"
-for securely transferring files from edX to the partner institution. Meanwhile,
-the Analytics team at edX sets up credentials so that the data czar can log in
-to the site where data packages are stored.
+The data czar who is selected at each institution sets up keys for securely
+transferring files from edX to the partner institution. Meanwhile, the
+Analytics team at edX sets up credentials so that the data czar can log in to
+the site where data packages are stored.
 
 .. image:: ../Images/Data_Czar_Initialization.png
  :alt: Flowchart of data czar creating public and private keys and sending the
@@ -20,31 +20,31 @@ to the site where data packages are stored.
      the data czar
 
 After these steps for setting up credentials are complete, the data czar can
-download data packages.
+download data packages on an ongoing basis.
 
 ****************************************************************
 Data Czar: Create Keys for Encryption and Decryption
 ****************************************************************
 
 To assure the security of data packages, the edX Analytics team encrypts all
-files before transferring them to a partner institution. As a result, when you
-receive a data package (or any other file from the edX Analytics team), you must
-decrypt the data before it can be used in any way.
+files before making them available to a partner institution. As a result, when
+you receive a data package (or other files) from the edX Analytics team, you
+must decrypt the files that it contains before you use them.
+
+The cryptographic processes of encrypting and decrypting data files require that you create a pair of keys: the public key in the pair is used to encrypt data, and the corresponding private key is used to decrypt any files that have been encrypted with the public key.
 
 To create the keys needed for this encryption and decryption process, you use
 GNU Privacy Guard (GnuPG or GPG). Essentially, you install a cryptographic
 application on your local computer and supply your email address and a secret
-passphrase (a password). The application uses this information to create both a
-private key for you to use for *decrypting* files from edX and also the unique
-public key that you send to edX to use in *encrypting* your data packages and
-files. Each data czar creates his or her own private and public key pair to use
-with edX files.
+passphrase (a password).
 
 .. note:: The email address that you supply when you create your keys must be your official email address at your edX partner institution.
 
-Creating these keys is a one-time process that you coordinate with your edX
-program manager. Instructions for creating the keys on Windows or Macintosh
-follow.
+The result is the public key that you send to edX to use in encrypting data
+files for your institution, and the private key which you keep secret and use
+to decrypt the encrypted files that you receive. Creating these keys is a one-
+time process that you coordinate with your edX program manager. Instructions
+for creating the keys on Windows or Macintosh follow.
 
 For more information about GPG encryption and creating key pairs, see the
 `Gpg4win Compendium`_.
@@ -94,7 +94,7 @@ Create Keys: Macintosh
 
 #. When the download is complete, click the .dmg file to begin the installation.
 
-#. When installation is complete, GPG Keychain Access opens a web page with
+   When installation is complete, GPG Keychain Access opens a web page with
    `First Steps`_ and a dialog box.
 
 #. Enter your name and email address. Be sure to enter your official university
@@ -111,35 +111,39 @@ Create Keys: Macintosh
 
    a. Specify a file name and location to save the file.
 
-   b. Make sure that **Format** is ASCII.
+   b. Make sure that **Format** is set to ASCII.
 
    c. Make sure that **Allow secret key export** is cleared.
+
+   When you click **Save**, only the public key is saved in the resulting .asc
+   file. Do not share your private key with edX or any third party.
 
 #. Compose an e-mail message to your edX program manager. Attach the .asc
-   file that you saved in the previous step to the message then send the
+   file that you saved in the previous step to the message, then send the
    message.
 
 .. _GPG Tools: https://gpgtools.org/
 .. _First Steps: http://support.gpgtools.org/kb/how-to/first-steps-where-do-i-start-where-do-i-begin#setupkey
 
 ****************************************************************
-edX: Create and Deliver Credentials for Accessing Data Storage
+EdX: Deliver Credentials for Accessing Data Storage
 ****************************************************************
 
 The data packages that edX prepares for each partner organization are uploaded
-to the Amazon Web Service (AWS) Simple Storage Service (S3). The edX Analytics
-team creates an individual account to access this storage service for each data
-czar. The credentials for accessing this account are called an Access Key
-and a Secret Key.
+to the Amazon Web Service (AWS) Simple Storage Service (Amazon S3). The edX
+Analytics team creates an individual account to access this storage service for
+each data czar. The credentials for accessing this account are called an Access
+Key and a Secret Key.
 
-After the edX Analytics team creates these access credentials for you, they are
-encrypted (using the public encryption key that you sent your program manager)
-into a **credentials.csv.gpg** file. This file is then sent to you as an email
-attachment.
+After the edX Analytics team creates these access credentials for you, they use
+the public encryption key that you sent your program manager to encrypt the
+credentials into a **credentials.csv.gpg** file. The edX Analytics team then
+sends the file to you as an email attachment.
 
 The **credentials.csv.gpg** file is likely to be the first file that you
 decrypt with your private GPG key. You use the same process to decrypt the data
-package files that you retrieve from Amazon S3.
+package files that you retrieve from Amazon S3. See `Decrypt an Encrypted
+File`_.
 
 .. image:: ../Images/Access_AmazonS3.png
  :alt: Flowchart of edX collecting files for the data package and then
@@ -149,9 +153,9 @@ package files that you retrieve from Amazon S3.
 
 .. _Decrypt an Encrypted File:
 
-==========================================
+****************************************************************
 Decrypt an Encrypted File
-==========================================
+****************************************************************
 
 To work with an encrypted .gpg file, you use the same GNU Privacy Guard program
 that you used to create your public/private key pair. You use your private key
@@ -161,24 +165,24 @@ to decrypt the Amazon S3 credentials file and the files in your data packages.
 
 #. On a Windows computer, open Windows Explorer. On a Macintosh, open Finder.
 
-#. Navigate to the file and right-click on it.
+#. Navigate to the file and right-click it.
 
-#. On a Windows computer, select **Decrypt and verify** and then click
-   **Decrypt/Verify**. On a Macintosh, select **Services** and then click
+#. On a Windows computer, select **Decrypt and verify**, then click
+   **Decrypt/Verify**. On a Macintosh, select **Services**, then click
    **OpenPGP: Decrypt File**.
 
 #. Enter your passphrase. The GNU Privacy Guard program decrypts the file.
 
 For example, when you decrypt the credentials.csv.gpg file the result is a
-credentials.csv file. When you open the credentials.csv file it contains your
-email address, your Access Key, and your Secret Key.
+credentials.csv file. Open the decrypted credentials.csv file to see that it
+contains your email address, your Access Key, and your Secret Key.
 
 .. image:: ../Images/AWS_Credentials.png
- :alt: A csv file, open in Notepad, with the access key value and the secret key value underlined
+ :alt: A csv file, open in Notepad, with the Access Key value and the Secret Key value underlined
 
-============================================
+****************************************************************
 Access Amazon S3 and Download Data Packages
-============================================
+****************************************************************
 
 To connect to Amazon S3, you must have your decrypted credentials. You may want
 to have a third-party tool that gives you a user interface for managing files
@@ -204,7 +208,7 @@ Browser. Alternatively, you can use the `AWS Command Line Interface`_.
    Event tracking data is in a file named {date}-{organization}-tracking.tar.
    Database data files are in a file named {organization}-{date}.zip.
 
-#. Download the files. These files can become very large, sometimes several
+#. Download the files. These files can be very large, sometimes several
    gigabytes in size.
 
 #. Extract the files from the compressed .tar and the .zip files. All of the