Data Flows — MITx Online Open edX

Generated 2026-06-24 17:02 UTC · c4gen dev

Each scenario below replays one interaction as a C4 Dynamic diagram. Amber steps are asynchronous (queued / scheduled / event-driven).

How to read these diagrams

These are C4 model diagrams (C4-PlantUML). Read them top-down: System Context (the whole SOA) → Container (one system's runtime units) → Dynamic (a single data flow, step by step).

  • People are rounded boxes; systems and containers are rectangles; databases and queues have distinct shapes.
  • Each arrow is a data flow labelled with what moves.
  • Solid arrows are synchronous (request/response, caller blocks).
  • Amber dashed arrows are asynchronous (queued, scheduled, or event-driven — caller does not block).
  • Drag to pan, scroll to zoom. Boxes with a link drill into the next level.
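
The difference between a solid (synchronous) and an amber dashed (asynchronous) arrow can be illustrated with a minimal sketch. It uses Python's standard-library `queue` as a stand-in for a real broker; the names `handle_sync`, `worker`, and `run_demo` are hypothetical and not part of this deployment.

```python
import queue
import threading


def handle_sync(request: str) -> str:
    """Synchronous step: the caller blocks until this returns."""
    return f"response to {request}"


def worker(tasks: queue.Queue, results: list) -> None:
    """Consumer side of an asynchronous step; runs independently of the caller."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: stop the worker
            tasks.task_done()
            break
        results.append(f"processed {item}")
        tasks.task_done()


def run_demo() -> tuple:
    tasks: queue.Queue = queue.Queue()
    results: list = []
    consumer = threading.Thread(target=worker, args=(tasks, results))
    consumer.start()

    # Solid arrow: request/response, the caller waits for the value.
    sync_result = handle_sync("GET /course")

    # Amber dashed arrow: the caller only enqueues and moves on.
    tasks.put("reindex course")

    # Shut down cleanly so the demo is deterministic.
    tasks.put(None)
    tasks.join()
    consumer.join()
    return sync_result, results
```

In the sketch, `tasks.put(...)` returns immediately, which is the property the amber arrows denote: the work happens later, on the consumer's schedule.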

Learner takes a course (synchronous)

A learner works through courseware in the LMS. The LMS loads course structure from the modulestore (MongoDB), reads/writes state in MySQL, proxies forum activity to the comments service, and stores annotations in the Notes API.

Course authoring & publish (synchronous + async)

An author edits a course in Studio. Studio writes structure to the modulestore, stores uploaded assets and, on demand, course export archives in S3, and indexes content into edx-search.

MITx Online enrollment, grade & certificate sync (cross-service, mixed)

MITx Online pushes user creation and enrollments to the LMS REST API at courses.learn.mit.edu and reads grades; the LMS posts enrollment/certificate webhooks back. A periodic MITx Online Celery task retries failed enrollments and repairs faulty edX users. MicroMasters batch-reads enrollments, certificates, and current grades on a Friday cadence.
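
The periodic retry described above can be sketched as a plain function. This is an illustration under stated assumptions, not the actual MITx Online task: the `Enrollment` record, its `synced` flag, and the `push_to_lms` callable are hypothetical, and in the real system the loop would run inside a scheduled Celery task.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass
class Enrollment:
    user_id: int
    course_id: str
    synced: bool = False  # hypothetical flag: True once the LMS accepted it


def retry_failed_enrollments(
    enrollments: Iterable[Enrollment],
    push_to_lms: Callable[[Enrollment], None],
) -> Tuple[int, int]:
    """Re-push every unsynced enrollment; return (repaired, still_failing)."""
    repaired = 0
    still_failing = 0
    for enrollment in enrollments:
        if enrollment.synced:
            continue  # already in the LMS, nothing to do
        try:
            push_to_lms(enrollment)
        except Exception:
            # Leave the record unsynced so the next scheduled run retries it.
            still_failing += 1
            continue
        enrollment.synced = True
        repaired += 1
    return repaired, still_failing
```

Keeping the failure state on the record rather than in the task means a crashed run loses nothing: the next run simply picks up whatever is still unsynced.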

MITx Online SSO + Keycloak federation (synchronous)

MITx Online authenticates learners against this Open edX deployment as the OAuth2 identity provider (mitxonline-oauth2). The provider verifies the login, issues an access token, and persists OAuth state in MySQL. The LMS itself federates learner login to Keycloak (sso.ol.mit.edu).
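
The three provider steps (verify the login, issue an access token, persist OAuth state) can be sketched as follows. This is a simplified illustration only: an in-memory SQLite database stands in for MySQL, and the `access_tokens` table, `make_store`, `issue_access_token`, and the `verify_login` callable are all hypothetical names, not the schema or API of the real provider.

```python
import secrets
import sqlite3
import time
from typing import Callable, Optional


def make_store() -> sqlite3.Connection:
    """Create a stand-in token store (SQLite here; the deployment uses MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE access_tokens ("
        "token TEXT PRIMARY KEY, username TEXT NOT NULL, expires_at REAL NOT NULL)"
    )
    return conn


def issue_access_token(
    conn: sqlite3.Connection,
    username: str,
    password: str,
    verify_login: Callable[[str, str], bool],
    ttl_seconds: int = 3600,
) -> Optional[str]:
    """Verify the login, then mint and persist a token; None if login fails."""
    if not verify_login(username, password):
        return None
    token = secrets.token_urlsafe(32)  # unguessable random token
    conn.execute(
        "INSERT INTO access_tokens (token, username, expires_at) VALUES (?, ?, ?)",
        (token, username, time.time() + ttl_seconds),
    )
    conn.commit()
    return token
```

A real OAuth2 provider also handles client registration, authorization codes, scopes, and refresh tokens; the sketch covers only the verify, issue, and persist sequence named in the text.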

Ingestion sources (ETL)

Every external source the edx_content / default Celery workers pull from, with transport and cadence. ⚠️ marks brittle linkages (HTML/token scrapes, hardcoded URLs).

Source | Transport | Cadence | Data | Source of truth
MITx Online (enrollment/grade API consumer) | REST OAuth2 | on enroll / scheduled | enrollments + current grades | openedx/api.py
MicroMasters (batch enrollment/cert/grade refresh) | REST OAuth2 (edx-api-client) | Fridays every 6h | enrollments, certificates, current grades | dashboard/tasks.py
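
The rows above can also be captured as a small typed registry, which makes questions like "which sources carry certificates?" easy to answer in code. The values are copied from the table; the `IngestionSource` class and the `sources_with_data` helper are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class IngestionSource:
    name: str
    transport: str
    cadence: str
    data: Tuple[str, ...]
    source_of_truth: str  # module named in the "Source of truth" column


SOURCES: List[IngestionSource] = [
    IngestionSource(
        name="MITx Online",
        transport="REST OAuth2",
        cadence="on enroll / scheduled",
        data=("enrollments", "current grades"),
        source_of_truth="openedx/api.py",
    ),
    IngestionSource(
        name="MicroMasters",
        transport="REST OAuth2 (edx-api-client)",
        cadence="Fridays every 6h",
        data=("enrollments", "certificates", "current grades"),
        source_of_truth="dashboard/tasks.py",
    ),
]


def sources_with_data(kind: str) -> List[IngestionSource]:
    """Return every source whose data column includes the given kind."""
    return [source for source in SOURCES if kind in source.data]
```

For example, `sources_with_data("certificates")` returns only the MicroMasters entry, while `sources_with_data("enrollments")` returns both.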