Data Flows — MITx Pro

Generated 2026-06-24 16:33 UTC · c4gen dev

Each scenario below replays one interaction as a C4 Dynamic diagram. Amber steps are asynchronous (queued / scheduled / event-driven).

How to read these diagrams

These are C4 model diagrams (C4-PlantUML). Read them top-down: System Context (the whole SOA) → Container (one system's runtime units) → Dynamic (a single data flow, step by step).

  • People are rounded boxes; systems and containers are rectangles; databases and queues have distinct shapes.
  • Each arrow is a data flow labelled with what moves.
  • Solid arrows are synchronous (request/response, caller blocks).
  • Amber dashed arrows are asynchronous (queued, scheduled, or event-driven — caller does not block).
  • Drag to pan, scroll to zoom. Boxes with a link drill into the next level.

A learner pays through CyberSource Secure Acceptance. xPRO signs a payload and redirects the browser to CyberSource; CyberSource posts a signed result back to xPRO's OrderFulfillmentView, which fulfills the order, enrolls the learner in Open edX, syncs the deal to HubSpot, and emails a receipt.
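The sign-and-verify round trip described above can be sketched as follows. Secure Acceptance signs a comma-joined list of `name=value` pairs, taken in the order given by `signed_field_names`, with HMAC-SHA256 and base64-encodes the digest. The function names, the example field values, and the secret key below are illustrative assumptions, not xPRO's actual code.

```python
import base64
import hashlib
import hmac


def sign_fields(fields: dict[str, str], secret_key: str) -> str:
    """Return the base64 HMAC-SHA256 signature over the signed fields."""
    # Only the fields named in signed_field_names are covered, in that order.
    names = fields["signed_field_names"].split(",")
    message = ",".join(f"{name}={fields[name]}" for name in names)
    digest = hmac.new(
        secret_key.encode("utf-8"), message.encode("utf-8"), hashlib.sha256
    ).digest()
    return base64.b64encode(digest).decode("utf-8")


def verify_response(fields: dict[str, str], secret_key: str) -> bool:
    """Check the signature on a posted-back result before fulfilling."""
    expected = sign_fields(fields, secret_key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, fields.get("signature", ""))


# Hypothetical payload; real requests carry many more fields.
payload = {
    "signed_field_names": "amount,reference_number,signed_field_names",
    "amount": "10.00",
    "reference_number": "xpro-example-1",
}
payload["signature"] = sign_fields(payload, "not-a-real-secret")
```

The same check applies in both directions: xPRO signs the outgoing payload, and the view receiving the posted result recomputes the signature and rejects the request on a mismatch before any fulfillment step runs.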

Open edX SSO & user provisioning (synchronous)

xPRO authenticates learners through Open edX via social-auth. On first login/signup xPRO creates the corresponding edX user and an OpenEdxApiAuth access token, used later for enrollment and grade reads.

Courseware sync & certificates (asynchronous)

RedBeat-scheduled Celery tasks repair failed enrollments and faulty edX users, sync course-run data and grades from Open edX, and generate course certificates. External vendor courses (Emeritus / Global Alumni) are synced from report APIs on a daily cron.
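A minimal sketch of what one scheduled repair pass might look like, written as plain Python so it runs without Celery. The `Enrollment` class, the `edx_enrolled` flag, and the `enroll_in_edx` callable are hypothetical stand-ins for xPRO's models and edX client. The point is only that a failure leaves the record untouched, so the next scheduled run picks it up again.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Enrollment:
    """Hypothetical stand-in for an xPRO enrollment record."""
    user: str
    course_run: str
    edx_enrolled: bool = False


def repair_failed_enrollments(
    enrollments: Iterable[Enrollment],
    enroll_in_edx: Callable[[str, str], None],
) -> list[Enrollment]:
    """Retry edX enrollment for records not yet enrolled; return the repaired ones."""
    repaired = []
    for enrollment in enrollments:
        if enrollment.edx_enrolled:
            continue  # already in sync, nothing to repair
        try:
            enroll_in_edx(enrollment.user, enrollment.course_run)
        except Exception:
            # Leave the flag unset so the next scheduled run retries it.
            continue
        enrollment.edx_enrolled = True
        repaired.append(enrollment)
    return repaired
```

In a real deployment a function like this would be wrapped in a Celery task and registered in the beat schedule; that wiring is omitted here.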

Google Sheets coupon/refund/deferral ops (asynchronous)

Staff manage coupon assignment, refund, and deferral requests in Google Sheets. Drive push notifications and a polling beat task both trigger Celery to read the sheets via a service account. The task applies coupons and enrollments, emails bulk enrollment codes, and writes status back to the sheet.
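Because a push notification and the polling task can both fire for the same sheet, row processing needs to be safe to repeat. The sketch below shows one way to get that property: skip any row that already has a status, and write a status for every row handled. The row layout, the `status` column, and the `apply_request` callable are assumptions for illustration, not the actual sheet schema.

```python
from typing import Callable


def process_request_rows(
    rows: list[dict], apply_request: Callable[[dict], None]
) -> list[dict]:
    """Apply each unprocessed request row and record the outcome on the row."""
    for row in rows:
        if row.get("status"):
            continue  # handled by an earlier run; do not apply twice
        try:
            apply_request(row)
        except ValueError as exc:
            # Surface the problem to staff in the sheet instead of failing the run.
            row["status"] = f"error: {exc}"
        else:
            row["status"] = "complete"
    return rows
```

Running the function twice over the same rows applies each request at most once, which is what makes the combination of webhook and polling harmless.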

MIT Learn catalog ingestion (cross-service, asynchronous)

MIT Learn's ETL pulls xPRO's course/program catalog from the public REST API and content files from S3 to surface xPRO offerings in discovery. This is a one-directional pull owned by MIT Learn's Celery scheduler.
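The pull side of this flow can be sketched as a generator that follows pagination links until the catalog is exhausted. This assumes a paginated JSON response with `results` and `next` keys, which is common for REST APIs but is not confirmed by this page. The URL and the `fetch_page` callable are hypothetical; injecting the fetcher keeps the sketch runnable without network access.

```python
from typing import Callable, Iterator


def iter_catalog(
    fetch_page: Callable[[str], dict], first_url: str
) -> Iterator[dict]:
    """Yield every catalog item, following 'next' links page by page."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("results", [])
        url = page.get("next")  # None or missing on the last page ends the loop


# Hypothetical in-memory pages standing in for HTTP responses.
FAKE_PAGES = {
    "https://xpro.example/api/courses/": {
        "results": [{"id": 1}, {"id": 2}],
        "next": "https://xpro.example/api/courses/?page=2",
    },
    "https://xpro.example/api/courses/?page=2": {
        "results": [{"id": 3}],
        "next": None,
    },
}

courses = list(iter_catalog(FAKE_PAGES.__getitem__, "https://xpro.example/api/courses/"))
```

Since the pull is one-directional, nothing in this loop writes back to xPRO; the consumer only reads and transforms.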

Ingestion sources (ETL)

Every external source the edx_content / default Celery workers pull from, with transport and cadence. ⚠️ marks brittle linkages (HTML/token scrapes, hardcoded URLs).

| Source | Transport | Cadence | Data | Source of truth |
|---|---|---|---|---|
| Emeritus (external courses) | REST report API | daily cron | external course/run batch reports | courses/sync_external_courses/external_course_sync_api.py |
| Global Alumni (external courses) | REST report API | daily cron | external course/run batch reports | courses/sync_external_courses/external_course_sync_api.py |
| ⚠️ Google Sheets (coupon/refund/deferral) | Sheets/Drive API + push webhook | every SHEETS_MONITORING_FREQUENCY | coupon assignment, refund, deferral requests | sheets/tasks.py |
| Open edX course-run data | REST (edx-api-client) | daily cron | course-run titles & dates | courses/tasks.py:109 |
| Open edX grades | REST (edx-api-client) | daily cron | course-run grades for certificates | courses/tasks.py:31 |