Data Flows — MicroMasters

Generated 2026-06-24 13:33 UTC · c4gen dev

Each scenario below replays one interaction as a C4 Dynamic diagram. Amber steps are asynchronous (queued / scheduled / event-driven).

How to read these diagrams

These are C4 model diagrams (C4-PlantUML). Read them top-down: System Context (the whole SOA) → Container (one system's runtime units) → Dynamic (a single data flow, step by step).

  • People are rounded boxes; systems and containers are rectangles; databases and queues have distinct shapes.
  • Each arrow is a data flow labelled with what moves.
  • Solid arrows are synchronous (request/response, caller blocks).
  • Amber dashed arrows are asynchronous (queued, scheduled, or event-driven — caller does not block).
  • Drag to pan, scroll to zoom. Boxes with a link drill into the next level.

Learner dashboard load (synchronous)

The learner loads the SPA dashboard. Django authenticates via the edX/MITx Online OAuth session, hydrates the program/course state from Postgres (backed by cached edX data), and returns it to the React app.

edX enrollment/grade refresh (asynchronous)

Celery Beat triggers batch_update_user_data every 6 hours on Fridays. The worker takes a Redis lock, pulls enrollments, certificates, and current grades from edX.org and MITx Online via edx-api-client, and upserts them into the Postgres caches; the resulting index updates flow to OpenSearch.
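The lock-then-upsert pattern described above can be sketched in plain Python. This is a minimal illustration, not the project's task code: InMemoryLock stands in for the Redis lock, and refresh_user_caches, the fetch callable, the lock key, and the cache key layout are all hypothetical names chosen for the example.

```python
class InMemoryLock:
    """Stand-in for a Redis-backed lock (hypothetical, illustration only)."""

    def __init__(self):
        self._held = set()

    def acquire(self, key):
        # Non-blocking: report failure instead of waiting for the holder.
        if key in self._held:
            return False
        self._held.add(key)
        return True

    def release(self, key):
        self._held.discard(key)


def refresh_user_caches(user_ids, fetch, cache, lock,
                        lock_key="batch_update_user_data"):
    """Refresh cached edX data for each user unless another run holds the lock.

    `fetch(user_id)` is assumed to return a dict such as
    {"enrollments": [...], "certificates": [...], "current_grades": [...]},
    where every row carries a "course_id".
    Returns the number of users refreshed, or None if the run was skipped.
    """
    if not lock.acquire(lock_key):
        return None  # an overlapping run is still in progress
    try:
        refreshed = 0
        for user_id in user_ids:
            for kind, rows in fetch(user_id).items():
                for row in rows:
                    # Upsert: insert the row, or overwrite the cached copy.
                    cache[(user_id, kind, row["course_id"])] = row
            refreshed += 1
        return refreshed
    finally:
        lock.release(lock_key)
```

Skipping when the lock is already held, rather than blocking, keeps a slow run from stacking up behind the next scheduled trigger; whether the real task skips or waits is not stated in this map.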

Grade freeze & exam authorization (asynchronous)

Once a day, during the 16:00 UTC hour, the worker freezes final grades; every hour it authorizes learners with passing grades for upcoming exam runs. Both jobs coordinate via Redis and persist to Postgres.
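As a rough sketch of the two cadences and the authorization filter, under stated assumptions: the job names, the FinalGrade record, and the already_authorized argument are hypothetical, and the real schedule lives in the Celery Beat configuration rather than in a function like this.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FinalGrade:
    """Hypothetical record of one learner's final grade in one course."""
    user_id: int
    course_id: str
    passed: bool


def due_jobs(now):
    """Return the job names due in the UTC hour containing `now`.

    `now` must be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    jobs = ["authorize_exam_runs"]        # runs every hour
    if now.hour == 16:
        jobs.append("freeze_final_grades")  # runs once a day, 16:00 UTC hour
    return jobs


def learners_to_authorize(grades, course_id, already_authorized):
    """User ids with a passing grade in `course_id` who are not yet authorized."""
    passed = {g.user_id for g in grades
              if g.course_id == course_id and g.passed}
    return sorted(passed - set(already_authorized))
```

For example, with grades for users 1 (passed), 2 (failed), and 3 (passed) in the same course and user 3 already authorized, learners_to_authorize returns only user 1.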

MIT Learn catalog + page ingestion (asynchronous)

MIT Learn's daily ETL pulls MicroMasters' live program catalog and Wagtail program pages over the public REST API — the SOA peer-pull this map exists to capture.
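A peer-pull of this kind might look like the sketch below. It is an assumption-laden illustration: the response shape (either a plain JSON list or an object with "results" and "next" keys) is assumed, not taken from the MicroMasters API, and pull_catalog and http_get_json are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def http_get_json(url):
    """Fetch `url` and decode the body as JSON (standard library only)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def pull_catalog(start_url, get_json=http_get_json):
    """Yield catalog records starting at `start_url`.

    Assumes the endpoint returns either a JSON list, or an object with
    "results" (a list) and "next" (the next page URL, or null).
    `get_json` is injectable so the loop can be exercised without a network.
    """
    url = start_url
    while url:
        payload = get_json(url)
        if isinstance(payload, list):
            yield from payload  # unpaginated response: a single page
            return
        yield from payload.get("results", [])
        url = payload.get("next")
```

Passing a fake get_json (for instance a dict's __getitem__ over canned pages) makes the pagination loop testable offline.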

Ingestion sources (ETL)

Every external source the edx_content / default Celery workers pull from, with transport and cadence. ⚠️ marks brittle linkages (HTML/token scrapes, hardcoded URLs).

Source | Transport | Cadence | Data | Source of truth
MIT Learn (course runs) | REST JSON | daily 02:00 UTC | course-run metadata by readable_id | courses/mit_learn_api.py:21
MITx Online (enrollments/grades) | REST OAuth2 (staff token) | Fridays every 6h | enrollment + current-grade caches | dashboard/api_edx_cache.py
⚠️ Open Exchange Rates (currency) | REST JSON | scheduled | financial-aid currency rates | micromasters/settings.py:613
edX.org (enrollments/grades/certs) | REST OAuth2 (edx-api-client) | Fridays every 6h | enrollments, certificates, current grades | dashboard/tasks.py
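The same inventory can be held as data so that brittle linkages and shared cadences are easy to query. The sketch below simply transcribes the four rows above; the Source class, its field names, and the two helper functions are hypothetical and not part of the MicroMasters codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """One ingestion source; fields mirror the table columns."""
    name: str
    transport: str
    cadence: str
    data: str
    defined_in: str        # the "Source of truth" column
    brittle: bool = False  # True for rows flagged with the warning marker


SOURCES = [
    Source("MIT Learn (course runs)", "REST JSON", "daily 02:00 UTC",
           "course-run metadata by readable_id",
           "courses/mit_learn_api.py:21"),
    Source("MITx Online (enrollments/grades)", "REST OAuth2 (staff token)",
           "Fridays every 6h", "enrollment + current-grade caches",
           "dashboard/api_edx_cache.py"),
    Source("Open Exchange Rates (currency)", "REST JSON", "scheduled",
           "financial-aid currency rates",
           "micromasters/settings.py:613", brittle=True),
    Source("edX.org (enrollments/grades/certs)",
           "REST OAuth2 (edx-api-client)", "Fridays every 6h",
           "enrollments, certificates, current grades",
           "dashboard/tasks.py"),
]


def brittle_sources(sources):
    """Names of sources flagged as brittle linkages."""
    return [s.name for s in sources if s.brittle]


def by_cadence(sources):
    """Group source names by cadence, preserving table order."""
    groups = {}
    for s in sources:
        groups.setdefault(s.cadence, []).append(s.name)
    return groups
```

Grouping by cadence shows that the MITx Online and edX.org pulls share the Friday six-hour schedule, which matches the enrollment/grade refresh flow described earlier.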