System Context — OL Data Platform
Generated 2026-06-24 13:33 UTC · c4gen dev
The widest view: OL Data Platform and every external actor and system it exchanges data with. Edges shown are curated and code-verified; raw graph-derived candidates are listed under Dependencies & Cycles.
External systems & peers
| System | Role |
|---|---|
| HashiCorp Vault | Source-DB and SaaS credentials; every Dagster code location authenticates at startup. |
| MIT Learn | Discovery platform. Its Postgres is ingested into raw; the OL Data Platform POSTs HMAC-signed content/OVS webhooks back to MIT Learn's Django API to trigger ingest. |
| MITx Online | Course/enrollment platform. App Postgres + its Open edX MySQL/forum are ingested. Heavy edX-sync/certificate ETL still runs in its own Celery workers. |
| MITx Pro (xPRO) | App Postgres + Open edX MySQL/forum ingested into raw. |
| MicroMasters | App Postgres ingested into raw (courses, programs, certificates). |
| OCW Studio | App Postgres ingested into raw; OCW site JSON also flows via S3. |
| ODL Video Service | Video/transcript metadata ingested via API; OVS webhooks pushed back. |
| Bootcamps | Bootcamps app Postgres ingested into raw. |
| edX.org Archives & BigQuery | edX.org course tarballs and tracking logs from GCS/S3, plus Emeritus/IRX BigQuery exports, landed via dlt and Airbyte. |
| SaaS Sources (Salesforce / Mailgun / feeds) | Salesforce and Mailgun via Airbyte; MIT Climate, MIT Professional Ed, Open Learning Library, and podcast RSS via dlt. |
| Hightouch | Third-party reverse-ETL SaaS. Reads curated models from the warehouse (Starburst) and syncs rows into operational systems, notably writing ProgramCertificate records into MIT Learn's Postgres. Operated outside this repo (no Hightouch code lives here); it connects to the warehouse as an external consumer. |
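
The MIT Learn row mentions HMAC-signed webhooks. As a rough illustration of what signing and verifying such a request can look like, here is a minimal sketch using only the Python standard library. The hash algorithm (SHA-256), the `X-Signature` header name, the payload fields, and all function names are assumptions made for this example; the platform's actual webhook contract is not described in this document and may differ.

```python
import hashlib
import hmac
import json


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC digest of the raw request body (SHA-256 assumed)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def build_signed_request(secret: bytes, payload: dict) -> tuple[bytes, dict[str, str]]:
    """Serialize the payload once and sign exactly the bytes that will be sent."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-Signature": sign_payload(secret, body),  # hypothetical header name
    }
    return body, headers


def verify_signature(secret: bytes, body: bytes, received: str) -> bool:
    """Receiver side: recompute the digest and compare in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, received)


# Hypothetical event payload; field names are illustrative only.
body, headers = build_signed_request(b"shared-secret", {"event": "content_updated", "id": 1})
is_valid = verify_signature(b"shared-secret", body, headers["X-Signature"])
```

The signature is computed over the serialized bytes rather than the dictionary, so the receiver can verify the request body as received without re-serializing it. In this sketch the shared secret is a literal; per the HashiCorp Vault row, credentials in the real system would be expected to come from Vault.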