Back to Blog
S

SQLExplain

How to Design a Database Schema for an Online Learning Platform in PostgreSQL

Person studying on a laptop with notebooks and coffee on a desk

How to Design a Database Schema for an Online Learning Platform

Introduction

Platforms like Udemy, Coursera, and Teachable are deceptively complex from a data modelling perspective. On the surface you have courses and students — but behind that are structured lesson hierarchies with sections and sort orders, per-lesson progress tracking, quiz engines with scoring and attempt limits, assignment submissions with instructor feedback, and certificate generation once a student reaches 100% completion. Get the schema wrong and you end up either over-normalising (a separate table per role, per content type) or under-normalising (progress buried inside an enrollment JSON blob).

In this guide we'll build a complete PostgreSQL schema for an online learning platform from scratch. Along the way we'll cover:

  • Why a single users table with a role column beats separate students and instructors tables
  • How to model the course → section → lesson hierarchy without adding a sections table
  • The difference between lesson_progress (per-lesson) and enrollments.progress_pct (aggregate) and why you need both
  • How DISTINCT ON gives you the most recent quiz attempt per student in one clean query
  • Five production queries covering progress dashboards, quiz results, instructor revenue, and certificate eligibility

Step 1: Identify Your Entities

An online learning platform needs to manage content (what's being taught), access (who's enrolled), activity (what progress has been made), and outcomes (grades, certificates).

TableDescription
usersStudents, instructors, and admins — single table with a role column
coursesCourse listings with category, difficulty, price, and publish status
lessonsIndividual lessons, ordered within named sections inside a course
enrollmentsStudent–course relationship tracking status and aggregate progress
lesson_progressPer-lesson completion record for each enrollment
quizzesQuiz definitions attached to a course, with passing score and attempt limit
quiz_questionsIndividual questions with type and correct answer
quiz_attemptsStudent quiz submissions recording score and pass/fail
assignmentsWritten or project-based tasks with a due date and max score
assignment_submissionsStudent submissions with grade and instructor feedback
reviewsStar ratings and comments — one per student per course
certificatesIssued on course completion, with a URL to the generated credential

Step 2: Define the Relationships

  • An instructor (user with role INSTRUCTOR) creates many courses; each course has many lessons and may have many quizzes and assignments
  • A student (role STUDENT) enrolls in many courses; each enrollment tracks overall progress_pct
  • Each enrollment has many lesson_progress rows — one per lesson — recording whether that lesson is complete
  • A quiz belongs to a course and has many quiz_questions; a student can make multiple quiz_attempts up to max_attempts
  • An assignment belongs to a course; a student has at most one assignment_submission per assignment (enforced by a UNIQUE constraint)
  • A review is written by a student for a course — one per student per course (UNIQUE constraint)
  • A certificate is issued per enrollment once the student reaches full completion; UNIQUE on enrollment_id ensures it's issued only once

DBA Tip: A single users table with a role column is almost always the right choice over separate students and instructors tables. Users can change roles (a student becomes an instructor), and you avoid duplicated contact/auth columns. If role-specific data grows significantly, extend with a instructor_profiles or student_profiles satellite table rather than splitting the core identity table.


Step 3: Build the Schema

Key design choices throughout:

  • UUID primary keys via gen_random_uuid() — safe for distributed deployments and public-facing API IDs
  • VARCHAR + CHECK instead of ENUM for role, content_type, status — easy to add new values without a schema migration
  • lesson_progress is insert-only — a row is created when a student completes a lesson; is_completed defaults to TRUE. The enrollment's progress_pct is the derived aggregate
  • quiz_questions.correct_answer TEXT — stores the option key for MCQ ('A', 'B') or the expected text for short-answer; complex scoring lives in the application layer
  • UNIQUE(assignment_id, user_id) on submissions and UNIQUE(user_id, course_id) on reviews — both enforced at DB level, not just application level
-- Enable UUID generation
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- ============================================================
-- USERS  (students, instructors, admins in one table)
-- ============================================================
CREATE TABLE users (
  user_id     UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  full_name   VARCHAR(150) NOT NULL,
  email       VARCHAR(255) UNIQUE NOT NULL,
  password_hash TEXT NOT NULL,
  role        VARCHAR(20) NOT NULL DEFAULT 'STUDENT'
    CHECK (role IN ('STUDENT', 'INSTRUCTOR', 'ADMIN')),
  bio         TEXT,
  avatar_url  TEXT,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- ============================================================
-- COURSES
-- ============================================================
CREATE TABLE courses (
  course_id     UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  instructor_id UUID NOT NULL REFERENCES users(user_id),
  title         VARCHAR(200) NOT NULL,
  description   TEXT,
  category      VARCHAR(100),         -- e.g. 'Data Science', 'Web Development'
  difficulty    VARCHAR(20) NOT NULL DEFAULT 'Beginner'
    CHECK (difficulty IN ('Beginner', 'Intermediate', 'Advanced')),
  price         NUMERIC(10, 2) NOT NULL DEFAULT 0,
  thumbnail_url TEXT,
  is_published  BOOLEAN NOT NULL DEFAULT FALSE,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at    TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- ============================================================
-- LESSONS  (ordered within named sections)
-- ============================================================
CREATE TABLE lessons (
  lesson_id    UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  course_id    UUID NOT NULL REFERENCES courses(course_id) ON DELETE CASCADE,
  section_name VARCHAR(150),           -- logical grouping, e.g. 'Module 1: Foundations'
  title        VARCHAR(200) NOT NULL,
  content_type VARCHAR(20) NOT NULL DEFAULT 'Video'
    CHECK (content_type IN ('Video', 'Article', 'Audio', 'Interactive')),
  content_url  TEXT,                   -- video URL, article body, etc.
  duration_min INT,                    -- NULL for articles/interactive lessons
  sort_order   INT NOT NULL DEFAULT 0, -- global order within the course
  is_preview   BOOLEAN NOT NULL DEFAULT FALSE,  -- free preview without enrollment
  created_at   TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- ============================================================
-- ENROLLMENTS  (student ↔ course, with aggregate progress)
-- ============================================================
CREATE TABLE enrollments (
  enrollment_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id       UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
  course_id     UUID NOT NULL REFERENCES courses(course_id),
  status        VARCHAR(20) NOT NULL DEFAULT 'ACTIVE'
    CHECK (status IN ('ACTIVE', 'COMPLETED', 'DROPPED', 'EXPIRED')),
  progress_pct  NUMERIC(5, 2) NOT NULL DEFAULT 0
    CHECK (progress_pct BETWEEN 0 AND 100),
  enrolled_at   TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  completed_at  TIMESTAMPTZ,
  UNIQUE (user_id, course_id)  -- one enrollment per student per course
);

-- ============================================================
-- LESSON PROGRESS  (granular per-lesson completion)
-- ============================================================
CREATE TABLE lesson_progress (
  progress_id   UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  enrollment_id UUID NOT NULL REFERENCES enrollments(enrollment_id) ON DELETE CASCADE,
  lesson_id     UUID NOT NULL REFERENCES lessons(lesson_id) ON DELETE CASCADE,
  is_completed  BOOLEAN NOT NULL DEFAULT TRUE,
  completed_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE (enrollment_id, lesson_id)  -- one completion record per lesson per enrollment
);

-- ============================================================
-- QUIZZES
-- ============================================================
CREATE TABLE quizzes (
  quiz_id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  course_id        UUID NOT NULL REFERENCES courses(course_id) ON DELETE CASCADE,
  title            VARCHAR(200) NOT NULL,
  description      TEXT,
  passing_score_pct NUMERIC(5, 2) NOT NULL DEFAULT 70
    CHECK (passing_score_pct BETWEEN 0 AND 100),
  max_attempts     INT NOT NULL DEFAULT 3,  -- -1 = unlimited
  time_limit_min   INT,                     -- NULL = no time limit
  created_at       TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- ============================================================
-- QUIZ QUESTIONS
-- ============================================================
CREATE TABLE quiz_questions (
  question_id    UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  quiz_id        UUID NOT NULL REFERENCES quizzes(quiz_id) ON DELETE CASCADE,
  question_text  TEXT NOT NULL,
  question_type  VARCHAR(20) NOT NULL DEFAULT 'MCQ'
    CHECK (question_type IN ('MCQ', 'True/False', 'Short Answer')),
  option_a       TEXT,  -- MCQ options; NULL for short-answer questions
  option_b       TEXT,
  option_c       TEXT,
  option_d       TEXT,
  correct_answer TEXT NOT NULL,  -- e.g. 'A', 'True', or expected short text
  points         INT NOT NULL DEFAULT 1,
  sort_order     INT NOT NULL DEFAULT 0
);

-- ============================================================
-- QUIZ ATTEMPTS
-- ============================================================
CREATE TABLE quiz_attempts (
  attempt_id   UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  quiz_id      UUID NOT NULL REFERENCES quizzes(quiz_id),
  user_id      UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
  score_pct    NUMERIC(5, 2) NOT NULL,
  passed       BOOLEAN NOT NULL,
  started_at   TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  submitted_at TIMESTAMPTZ
);

-- ============================================================
-- ASSIGNMENTS
-- ============================================================
CREATE TABLE assignments (
  assignment_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  course_id     UUID NOT NULL REFERENCES courses(course_id) ON DELETE CASCADE,
  title         VARCHAR(200) NOT NULL,
  instructions  TEXT,
  due_date      TIMESTAMPTZ,
  max_score     INT NOT NULL DEFAULT 100,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- ============================================================
-- ASSIGNMENT SUBMISSIONS  (one per student per assignment)
-- ============================================================
CREATE TABLE assignment_submissions (
  submission_id  UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  assignment_id  UUID NOT NULL REFERENCES assignments(assignment_id) ON DELETE CASCADE,
  user_id        UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
  submission_url TEXT,               -- link to file, repo, or written response
  grade          NUMERIC(6, 2),      -- NULL until graded
  feedback       TEXT,               -- instructor comments
  submitted_at   TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  graded_at      TIMESTAMPTZ,
  UNIQUE (assignment_id, user_id)    -- one submission per student
);

-- ============================================================
-- REVIEWS  (one per student per course)
-- ============================================================
CREATE TABLE reviews (
  review_id    UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id      UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
  course_id    UUID NOT NULL REFERENCES courses(course_id),
  rating       SMALLINT NOT NULL CHECK (rating BETWEEN 1 AND 5),
  comment      TEXT,
  reviewed_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE (user_id, course_id)        -- one review per student per course
);

-- ============================================================
-- CERTIFICATES  (issued once on completion)
-- ============================================================
CREATE TABLE certificates (
  certificate_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  enrollment_id  UUID UNIQUE NOT NULL REFERENCES enrollments(enrollment_id),
  user_id        UUID NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
  course_id      UUID NOT NULL REFERENCES courses(course_id),
  certificate_url TEXT,              -- URL to generated PDF or image credential
  issued_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- ============================================================
-- INDEXES
-- ============================================================
CREATE INDEX idx_courses_instructor       ON courses(instructor_id);
CREATE INDEX idx_courses_category         ON courses(category, is_published);
CREATE INDEX idx_lessons_course_order     ON lessons(course_id, sort_order);
CREATE INDEX idx_enrollments_user         ON enrollments(user_id);
CREATE INDEX idx_enrollments_course       ON enrollments(course_id);
CREATE INDEX idx_lesson_progress_enroll   ON lesson_progress(enrollment_id);
CREATE INDEX idx_quiz_attempts_user_quiz  ON quiz_attempts(user_id, quiz_id);
CREATE INDEX idx_reviews_course           ON reviews(course_id);
CREATE INDEX idx_certificates_user        ON certificates(user_id);

Step 4: ER Diagram

The schema organises cleanly around three concerns — content, access, and outcomes:

Scroll or pinch to zoom · drag to pan · double-click to zoom in
Entity-relationship diagram for the online learning platform
Entity-relationship diagram — online learning platform

The enrollments table is the load-bearing centre of the student experience — it links a student to a course and anchors both lesson_progress rows (granular) and the certificate (final outcome).


Step 5: Example Queries

Query 1 — Top-Rated Courses by Category

Ranks courses within each category using a window function, so you can show "Best in Data Science" and "Best in Web Development" in a single query.

SELECT
  category,
  title,
  instructor_name,
  avg_rating,
  total_reviews,
  RANK() OVER (
    PARTITION BY category
    ORDER BY avg_rating DESC, total_reviews DESC
  ) AS rank_in_category
FROM (
  SELECT
    c.course_id,
    c.category,
    c.title,
    u.full_name                           AS instructor_name,
    ROUND(AVG(r.rating), 2)               AS avg_rating,
    COUNT(r.review_id)                    AS total_reviews
  FROM courses c
  JOIN users u   ON c.instructor_id = u.user_id
  LEFT JOIN reviews r ON c.course_id = r.course_id
  WHERE c.is_published = TRUE
  GROUP BY c.course_id, c.category, c.title, u.full_name
  HAVING COUNT(r.review_id) >= 3
) ranked
ORDER BY category, rank_in_category;

Query 2 — Student Course Progress Breakdown

Shows exactly how many lessons a student has completed vs the total, alongside their aggregate progress percentage and time since enrollment.

SELECT
  c.title                                              AS course_title,
  e.status,
  e.progress_pct,
  COUNT(l.lesson_id)                                   AS total_lessons,
  COUNT(lp.progress_id)                                AS completed_lessons,
  ROUND(
    COUNT(lp.progress_id) * 100.0
    / NULLIF(COUNT(l.lesson_id), 0), 1
  )                                                    AS calculated_pct,
  EXTRACT(DAY FROM NOW() - e.enrolled_at)::INT         AS days_since_enrollment
FROM enrollments e
JOIN courses c    ON e.course_id   = c.course_id
JOIN lessons l    ON c.course_id   = l.course_id
LEFT JOIN lesson_progress lp
  ON e.enrollment_id = lp.enrollment_id
  AND l.lesson_id    = lp.lesson_id
WHERE e.user_id = 'your-student-uuid-here'
GROUP BY c.title, e.status, e.progress_pct, e.enrolled_at
ORDER BY e.enrolled_at DESC;

Analyst Tip: Notice calculated_pct (computed from lesson_progress) vs progress_pct (stored on enrollments). The stored value is kept for fast list-page queries. If they ever diverge, calculated_pct is the source of truth — use it to audit and repair progress_pct in a maintenance job.

Query 3 — Latest Quiz Attempt Per Student

DISTINCT ON is a PostgreSQL gem — it returns the first row per group after ordering, giving you each student's most recent attempt in a single scan without a self-join or subquery.

SELECT DISTINCT ON (qa.user_id, qa.quiz_id)
  u.full_name,
  u.email,
  qz.title          AS quiz_title,
  qa.score_pct,
  qa.passed,
  qa.submitted_at   AS last_attempt_at,
  COUNT(*) OVER (
    PARTITION BY qa.user_id, qa.quiz_id
  )                 AS total_attempts
FROM quiz_attempts qa
JOIN users u    ON qa.user_id = u.user_id
JOIN quizzes qz ON qa.quiz_id = qz.quiz_id
WHERE qz.course_id = 'your-course-uuid-here'
ORDER BY qa.user_id, qa.quiz_id, qa.submitted_at DESC;

Query 4 — Instructor Revenue Stats

Joins enrollments to courses to aggregate total earnings per course. FILTER on status breaks down active vs dropped enrollments without needing a subquery.

SELECT
  c.title                                                   AS course_title,
  c.price,
  COUNT(e.enrollment_id)                                    AS total_enrollments,
  COUNT(e.enrollment_id) FILTER (WHERE e.status = 'ACTIVE')      AS active_students,
  COUNT(e.enrollment_id) FILTER (WHERE e.status = 'COMPLETED')   AS completions,
  ROUND(
    COUNT(e.enrollment_id) FILTER (WHERE e.status = 'COMPLETED')
    * 100.0 / NULLIF(COUNT(e.enrollment_id), 0), 1
  )                                                         AS completion_rate_pct,
  ROUND(COUNT(e.enrollment_id) * c.price, 2)               AS gross_revenue
FROM courses c
LEFT JOIN enrollments e ON c.course_id = e.course_id
WHERE c.instructor_id = 'your-instructor-uuid-here'
  AND c.is_published   = TRUE
GROUP BY c.course_id, c.title, c.price
ORDER BY gross_revenue DESC;

Query 5 — Students Eligible for a Certificate

Finds students who have completed every lesson in a course AND passed at least one quiz attempt, but haven't yet received a certificate. These are the rows your certificate-generation job should process.

SELECT
  e.enrollment_id,
  u.user_id,
  u.full_name,
  u.email,
  c.title      AS course_title,
  e.progress_pct
FROM enrollments e
JOIN users u   ON e.user_id   = u.user_id
JOIN courses c ON e.course_id = c.course_id
WHERE e.status       = 'ACTIVE'
  AND e.progress_pct = 100

  -- All lessons completed
  AND NOT EXISTS (
    SELECT 1
    FROM lessons l
    LEFT JOIN lesson_progress lp
      ON lp.lesson_id     = l.lesson_id
      AND lp.enrollment_id = e.enrollment_id
    WHERE l.course_id    = c.course_id
      AND lp.progress_id IS NULL         -- lesson has no completion record
  )

  -- At least one passing quiz attempt
  AND EXISTS (
    SELECT 1
    FROM quiz_attempts qa
    JOIN quizzes qz ON qa.quiz_id = qz.quiz_id
    WHERE qa.user_id    = u.user_id
      AND qz.course_id  = c.course_id
      AND qa.passed     = TRUE
  )

  -- Certificate not yet issued
  AND NOT EXISTS (
    SELECT 1
    FROM certificates cert
    WHERE cert.enrollment_id = e.enrollment_id
  )

ORDER BY e.enrollment_id;

Extensibility

FeatureExtension
Discussion forumsAdd forum_threads and forum_replies tables linked to a course_id or lesson_id
Live sessionsAdd a live_sessions table with a scheduled time, meeting URL, and a session_attendees junction
Coupons & discountsAdd a coupons table (code, discount_type, amount, expiry) and reference it in enrollments
Learning pathsAdd a learning_paths table grouping ordered courses, with a learning_path_enrollments junction
Instructor payoutsAdd a payouts table recording transfer amounts and dates; derive payout amounts from enrollment aggregates
Course sections as a tableIf section metadata grows (section-level quizzes, section thumbnails), promote section_name into a proper course_sections table that lessons FK into

DBA Tip: lesson_progress will be your highest-insert table at scale — one row per lesson per student. Consider partitioning it by enrollment_id range or archiving completed enrollments to a cold-storage table. For real-time progress bars, a Redis counter per enrollment_id updated on each lesson completion avoids hitting PostgreSQL on every page load.


Summary

We've built a complete online learning platform schema in PostgreSQL across 12 tables covering:

  • A single users table with a role column that cleanly handles students, instructors, and admins without duplication
  • A two-level progress model — granular lesson_progress rows per lesson, plus a fast aggregate progress_pct on enrollments for dashboards
  • A quiz engine with configurable attempt limits, multiple question types, and per-attempt score tracking using DISTINCT ON to find the latest result efficiently
  • DB-enforced uniqueness on assignment submissions and reviews, so data integrity doesn't depend solely on application logic
  • Certificate eligibility driven entirely from the database — completeness checks via NOT EXISTS, pass checks via EXISTS, all in a single query your background job can poll

The schema gives you a working foundation you can extend incrementally — forums, live sessions, coupons, and learning paths all slot in without restructuring the core tables.

Try this schema in a live sandbox

Create a free SQLExplain account and land directly in a sandbox pre-loaded with the Online Learning Schema from this post — no password, no setup.