
Explainable AI (XAI)#
This is a three-day, instructor-led program: 24 hours of scheduled content across 12 modules, six hands-on labs, and a team capstone with presentations. Typical classroom hours are 09:00–18:00 each day (including breaks and lunch).
Software policy: All hands-on labs and recommended tooling for this course are free and open-source (FOSS) under recognized OSI-style licenses. The syllabus does not require proprietary SaaS, paid MLOps platforms, or commercial-only explainability products for any exercise.
Data policy: Every dataset used in labs, workshops, model-card exercises, and the capstone must be free to obtain and free to use for teaching and learning under its published terms (for example public domain, CC BY or similarly permissive terms, U.S. or other government open data, or repository rules that allow no-cost educational use). Paid commercial datasets, clinical data behind fees or restricted-use agreements, and data gated by NDAs are not required. Where a real-world scenario would normally rely on restricted data, instructors substitute open benchmarks (for example UCI Machine Learning Repository entries with suitable licenses, Folktables, MedMNIST, or Hugging Face Datasets splits that clearly permit classroom use).
Course at a glance#
| Item | Detail |
|---|---|
| Format | In-person or live online (same schedule) |
| Duration | 3 days |
| Total scheduled content | 24 hours |
| Modules | 12 |
| Hands-on labs | 6 |
| Daily hours | 09:00–18:00 |
Session formats#
| Label | What it means |
|---|---|
| Lecture / concept | Instructor-led theory and methods |
| Hands-on lab | Guided exercises in the lab environment |
| Workshop / discussion | Cohort discussion, debates, or structured group work |
| Capstone / presentation | Team project work and final presentations |
Prerequisites#
- Python proficiency and comfort with pandas and scikit-learn (pipelines, metrics, train/validation splits).
- Prior exposure to supervised learning (classification and regression) and at least one tree-based or neural model in practice.
- Helpful but not mandatory: basic PyTorch for labs using Captum and transformer explanations; basic computer vision and NLP vocabulary for Days 2–3 modules.
Learning outcomes#
By the end of the program, participants should be able to:
- Choose and justify intrinsic vs post-hoc explanations for a given stakeholder and risk level.
- Run a full tabular SHAP workflow (including global and local views) and interpret dependence plots responsibly; a minimal sketch follows this list.
- Apply LIME, anchors, and DiCE-style counterfactuals where appropriate, including awareness of limitations.
- Use Grad-CAM (and related tools) for vision models and token-level attributions for NLP/transformers, without mistaking attention for causal explanation.
- Connect fairness metrics and mitigation approaches to explanation artifacts (including common failure modes).
- Use PDP, ICE, and explanation-guided analysis for debugging and slice-based evaluation.
- Map organizational work to GDPR, EU AI Act, NIST AI RMF, and India sector guidance at a high level, and draft or critique model cards and datasheets.
- Describe how XAI fits into MLOps, monitoring, and stakeholder communication—including explanation drift and human-in-the-loop review.
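The SHAP outcome above maps to a short, repeatable recipe. Below is a minimal sketch of that workflow on synthetic stand-in data: the column names (`age`, `utilization`, and so on) are hypothetical, and the gradient-boosted model is an illustrative choice, not the lab's prescribed one.

```python
import pandas as pd
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Placeholder data standing in for an openly licensed credit-style benchmark.
X, y = make_classification(n_samples=600, n_features=5, random_state=0)
cols = ["age", "income", "utilization", "tenure", "inquiries"]  # hypothetical names
X = pd.DataFrame(X, columns=cols)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# TreeSHAP: exact Shapley values on the model's log-odds output.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

shap.summary_plot(shap_values, X_test)                    # global beeswarm view
shap.dependence_plot("utilization", shap_values, X_test)  # per-feature dependence view
print(dict(zip(cols, shap_values[0])))                    # local view: one row's contributions
```

In Lab 2, the same three calls (global summary, dependence plot, single-row values) are repeated on the real openly licensed benchmark.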
Tools and platforms (referenced in labs and lectures)#
All items below are open-source (or free developer libraries) suitable for local or self-hosted use.
- Interpretable models: InterpretML (Explainable Boosting Machines and related interpretable components), scikit-learn (linear models, trees), scorecard-style linear structures where applicable.
- Attribution / explanation: SHAP (TreeSHAP, KernelSHAP, DeepSHAP), LIME, Captum (integrated gradients and related), SHAP for text / transformers (partition or custom explainers as appropriate), TCAV (concept-based explanations via the open-source TCAV library).
- Anchors: Alibi Explain (`alibi` on PyPI; Apache License 2.0) for Anchor tabular, text, and image explainers where used in the cohort stack.
- Counterfactuals: DiCE-ML (open-source counterfactual library).
- Fairness: Fairlearn and AI Fairness 360 (AIF360), used for overview and exercises as scheduled.
- Deep learning and NLP stack (labs): PyTorch, Hugging Face Transformers (for example open BERT checkpoints), Captum for attributions.
- Production / MLOps (open-source examples): MLflow (tracking, models, registry), Evidently (reports and monitoring), WhyLogs (data and model profiling), TensorBoard (experiment visualization), DVC (data and pipeline versioning). Cohort environments may combine a subset of these; nothing listed requires a paid subscription.
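As a concrete illustration of the last bullet, here is a minimal MLflow tracking sketch; the experiment name `xai-demo` and the commented-out artifact file name are hypothetical placeholders.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Tiny stand-in model so the sketch is self-contained.
X, y = make_classification(n_samples=200, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

mlflow.set_experiment("xai-demo")  # hypothetical experiment name
with mlflow.start_run():
    mlflow.log_param("model_type", "logreg")
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, "model")
    # In a lab, a saved explanation plot can be attached the same way:
    # mlflow.log_artifact("shap_summary.png")
```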
Example open datasets (non-exhaustive)#
Instructors pick concrete files that satisfy the data policy above. Typical choices include:
- Tabular: UCI-style default-risk or German credit-style benchmarks (verify each file’s license before shipping to a cohort), Folktables (Census-derived, open license), or other Hugging Face Datasets tabular splits marked for research and educational use.
- Vision: MedMNIST subsets (including chest X-ray–style classification), MNIST/Fashion-MNIST, or similar permissively licensed image benchmarks—not paid PACS or hospital-only archives.
- NLP: SST-2, IMDb, or other sentiment sets loaded via Hugging Face Datasets with licenses compatible with classroom redistribution norms.
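Loading any of these typically takes one call via the `datasets` library. A minimal sketch, assuming the public GLUE/SST-2 and IMDb hosting on the Hugging Face Hub (always check each dataset card's license before classroom redistribution):

```python
from datasets import load_dataset

# Openly hosted sentiment benchmarks used in the NLP labs.
sst2 = load_dataset("glue", "sst2")   # SST-2 via the GLUE config
imdb = load_dataset("imdb")           # movie-review sentiment corpus

print(sst2["train"][0])               # one labeled sentence
print(imdb["train"].features)         # schema of the IMDb split
```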
Assessment and capstone#
- Days 1–2: Lab completion and participation in workshops (fairness debate, debugging lab).
- Day 3: Capstone — teams build an end-to-end XAI pipeline on an instructor-approved dataset that meets the data policy (120-minute facilitated work block), then present (10-minute presentation plus 5-minute Q&A per team) against a peer scoring rubric, with instructor feedback.
- Closing: Short Module 12 on emerging topics, then certificates and curated next-step resources.
Day 1 — Foundations and core methods (8 hours)#
| Time | Duration | Format | Session | Topics |
|---|---|---|---|---|
| 09:00–09:30 | 30 min | Workshop | Bootcamp kickoff and learning objectives | Overview, logistics, cohort introductions |
| 09:30–10:45 | 75 min | Lecture | Module 1 — Foundations of Explainable AI | Why XAI matters; interpretability vs explainability; taxonomy; global vs local; accuracy–interpretability trade-offs |
| 10:45–11:00 | 15 min | — | Break | — |
| 11:00–12:00 | 60 min | Lecture | Module 2 — Interpretable models by design | Linear models, decision trees, GAMs, EBMs; scorecard models in credit and medicine |
| 12:00–12:45 | 45 min | Lab | Lab 1 — Building an interpretable EBM | Fit an Explainable Boosting Machine on an openly licensed tabular default-risk or loan-style benchmark (for example from UCI or Hugging Face Datasets); inspect shape functions |
| 12:45–13:30 | 45 min | — | Lunch | — |
| 13:30–14:45 | 75 min | Lecture | Module 3 — Feature importance and SHAP | Permutation importance; Shapley theory; TreeSHAP, KernelSHAP, DeepSHAP; waterfall and beeswarm plots |
| 14:45–15:30 | 45 min | Lab | Lab 2 — SHAP analysis on tabular data | Full SHAP pipeline on an openly licensed credit-style tabular benchmark; dependence plots; global vs local explanations |
| 15:30–15:45 | 15 min | — | Break | — |
| 15:45–16:45 | 60 min | Lecture | Module 4 — LIME, anchors, and counterfactuals | LIME; anchor rules; DiCE counterfactuals; contrastive explanations |
| 16:45–17:30 | 45 min | Lab | Lab 3 — Counterfactuals for a hiring model | Generate and evaluate counterfactual explanations with DiCE-ML on an openly licensed hiring- or admissions-style tabular benchmark (no proprietary HR feeds); a DiCE sketch follows this table |
| 17:30–18:00 | 30 min | Workshop | Day 1 recap and Q&A | Key takeaways; preview of Day 2 |
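For orientation, Lab 3's counterfactual step has roughly the shape below. This is a sketch on synthetic stand-in data: the "hiring-style" column names and the random-forest model are hypothetical, not the cohort's actual lab assets.

```python
import pandas as pd
import dice_ml
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder hiring-style data; the real lab uses an openly licensed benchmark.
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
cols = ["experience", "test_score", "interview", "referrals"]  # hypothetical names
df = pd.DataFrame(X, columns=cols)
df["hired"] = y

model = RandomForestClassifier(random_state=0).fit(df[cols], df["hired"])

# Wrap data and model for DiCE, then ask for counterfactuals that flip one prediction.
data = dice_ml.Data(dataframe=df, continuous_features=cols, outcome_name="hired")
ml_model = dice_ml.Model(model=model, backend="sklearn")
explainer = dice_ml.Dice(data, ml_model, method="random")

query = df[cols].iloc[[0]]  # one applicant's feature row
cfs = explainer.generate_counterfactuals(query, total_CFs=3, desired_class="opposite")
cfs.visualize_as_dataframe(show_only_changes=True)
```

Evaluating which feature changes are actionable (and which are implausible) is the substance of the lab discussion.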
Day 2 — Vision, NLP, fairness, and debugging (8 hours)#
| Time | Duration | Format | Session | Topics |
|---|---|---|---|---|
| 09:00–09:15 | 15 min | Workshop | Day 1 recap and warm-up | Ten-question rapid-fire review |
| 09:15–10:30 | 75 min | Lecture | Module 5 — XAI for computer vision | Saliency maps; Grad-CAM; LIME for images; TCAV concept-based explanations |
| 10:30–11:15 | 45 min | Lab | Lab 4 — Grad-CAM on a chest X-ray classifier | Class activation maps; compare with LIME superpixels on an open medical-style image benchmark (for example MedMNIST's ChestMNIST subset or another permissively licensed teaching set, not paid clinical PACS data); a Grad-CAM sketch follows this table |
| 11:15–11:30 | 15 min | — | Break | — |
| 11:30–12:30 | 60 min | Lecture | Module 6 — XAI for NLP and transformers | Attention as explanation (pitfalls); SHAP for text and transformers; integrated gradients via Captum; probing classifiers; explaining LLMs (framing) |
| 12:30–13:00 | 30 min | Lab | Lab 5 — Explaining a BERT sentiment classifier | Token-level attributions with Captum on a public sentiment corpus (for example SST-2 or IMDb via Hugging Face Datasets); compare SHAP vs attention visualizations |
| 13:00–13:45 | 45 min | — | Lunch | — |
| 13:45–14:45 | 60 min | Lecture | Module 7 — Fairness, bias, and responsible AI | Fairness definitions; bias sources; Fairlearn and AIF360; mitigation strategies; case study from public literature (for example recidivism risk tools) paired with hands-on data from Folktables or another openly licensed fairness benchmark |
| 14:45–15:15 | 30 min | Workshop | The fairness impossibility debate | Small groups: demographic parity vs equalized odds in a hiring scenario |
| 15:15–15:30 | 15 min | — | Break | — |
| 15:30–16:30 | 60 min | Lecture | Module 8 — Model debugging and robustness | PDP and ICE; error analysis; slice-based evaluation; adversarial robustness; explanation-guided hardening |
| 16:30–17:15 | 45 min | Lab | Lab 6 — Debugging a biased image classifier | Grad-CAM and SHAP together on an open image benchmark to surface spurious correlations; propose fixes |
| 17:15–18:00 | 45 min | Workshop | Day 2 recap and Q&A | Key takeaways; capstone briefing; preview of Day 3 |
Day 3 — Governance, production, and capstone (8 hours)#
| Time | Duration | Format | Session | Topics |
|---|---|---|---|---|
| 09:00–09:15 | 15 min | Workshop | Day 2 recap and warm-up | Quick review; capstone check-in |
| 09:15–10:15 | 60 min | Lecture | Module 9 — Regulatory landscape and AI governance | GDPR Article 22; EU AI Act; NIST AI RMF; RBI / SEBI / IRDAI themes; model cards and datasheets |
| 10:15–11:00 | 45 min | Workshop | Drafting a model card | Teams draft a model card for a provided case study bundled with an openly licensed dataset; peer review in pairs |
| 11:00–11:15 | 15 min | — | Break | — |
| 11:15–12:00 | 45 min | Lecture | Module 10 — Communicating explanations to stakeholders | Audience design; XAI dashboards; cognitive load; misleading explanations and the Rashomon effect |
| 12:00–12:45 | 45 min | Lecture | Module 11 — XAI in production and MLOps | CI/CD hooks for explanations; explanation drift monitoring; human-in-the-loop; open-source stack (for example MLflow, Evidently, WhyLogs, TensorBoard, DVC); a drift-report sketch follows this table |
| 12:45–13:30 | 45 min | — | Lunch | — |
| 13:30–15:30 | 120 min | Capstone | Capstone project work session | End-to-end XAI pipeline on a team-chosen dataset that meets the data policy; instructors facilitate; lab open |
| 15:30–15:45 | 15 min | — | Break | — |
| 15:45–17:15 | 90 min | Capstone | Capstone presentations | Ten minutes per team + five minutes Q&A; peer rubric; instructor feedback |
| 17:15–17:45 | 30 min | Lecture | Module 12 — Emerging frontiers | Mechanistic interpretability; self-explaining networks; XAI for generative AI; open research challenges |
| 17:45–18:00 | 15 min | Workshop | Closing and certification | Takeaways; next steps; resources; certificates |
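For Module 11, a minimal drift-report sketch in the spirit of the open-source stack above, assuming Evidently's 0.4-era `Report` API (the library's interface has changed across major versions):

```python
import pandas as pd
from sklearn.datasets import make_classification
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Synthetic stand-ins for a reference window and a live window.
X_ref, _ = make_classification(n_samples=300, n_features=4, random_state=0)
X_cur, _ = make_classification(n_samples=300, n_features=4, random_state=1)
ref = pd.DataFrame(X_ref, columns=[f"f{i}" for i in range(4)])
cur = pd.DataFrame(X_cur, columns=[f"f{i}" for i in range(4)])

# The same pattern extends to "explanation drift": profile columns of
# per-feature attribution values instead of the raw inputs.
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=ref, current_data=cur)
report.save_html("drift_report.html")
```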
Module index (quick reference)#
| Module | Title |
|---|---|
| 1 | Foundations of Explainable AI |
| 2 | Interpretable models by design |
| 3 | Feature importance and SHAP |
| 4 | LIME, anchors, and counterfactuals |
| 5 | XAI for computer vision |
| 6 | XAI for NLP and transformers |
| 7 | Fairness, bias, and responsible AI |
| 8 | Model debugging and robustness |
| 9 | Regulatory landscape and AI governance |
| 10 | Communicating explanations to stakeholders |
| 11 | XAI in production and MLOps |
| 12 | Emerging frontiers |
