Validation Report — summaries vs. source slides
Generated by a per-lecture automated cross-check (one Sonnet agent per lecture) comparing each summary against its official slide deck. L7 has no slide deck, so it was reviewed for internal consistency and against well-established public facts only.
Overall verdict: the summaries are conceptually faithful to the slides. No core framework was found to be wrong. The two recurring caveats are:
- Image-only slides. Many decks embed their tables/figures as images, so specific numbers (e.g. Kosinski's accuracy table, Hinds & Joinson's figures, the LBA parameter formulas) could not be auto-verified from slide text. They are likely correct (they match the papers) but should be eyeballed against the original papers if you intend to quote exact figures.
- A few additions worth knowing — listed below per lecture.
Fixes already applied are marked ✅. Items left for you to verify are marked 🔎.
L1 — Intro & IOS
- 🔎 Third pillar name. Summary uses "Equity and Diversity"; the slide diagram text reads "Equality and diversity." Also, "Open Cities" may sit under Transitions & Wellbeing in the slide Venn, not under the third pillar. Diagram is image-based, so this is medium-confidence — verify the platform→pillar layout against the L1 slides before quoting it. A caveat note has been added to the lecture page. ✅(note added)
- ✅ Verified: TRUST acronym + per-letter definitions and the central course question match the slides verbatim. Correction (second pass): the platform names and the platform-to-pillar mapping do not match cleanly; the slides give two conflicting layouts (see the second-pass addendum below). Only the headline "15 platforms" figure (slide 15) is stable.
- 🔎 The four Open Society bases, the North-ian institutions quote, and the CDR Venn intersection labels are not on extractable slide text (they come from the Elliott paper / IOS position paper). Correct, but slide-unverifiable.
L2 — Decision Making (LBA, Palada)
- ✅ Added: iBorderCtrl — the slides feature this as a named negative applied-AI example (EU border lie-detection via micro-expressions; ~50 ms windows, huge individual variability, contested validity) plus a SWOT of cognitive-model integration. A brief note was added (good essay fodder for "risks of algorithmic-level integration").
- 🔎 The LBA four-parameter formulas (U[0,A], N(v,s), b, t₀), the "~7.5 algorithmic-level" score, the Palada "~2.6→2.2 s" RT figures, and the Newell exponent ranges are image-only / from the paper, not extractable slide text. Conceptually right; verify figures against the paper if quoting.
- ✅ Verified: three integration levels, LBA principle, difficulty→drift / instruction→threshold, individual-difference axes (cautiousness/efficiency/execution time).
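To make the difficulty→drift / instruction→threshold mapping concrete, here is a minimal simulation sketch of a basic LBA race built from the four parameters named above (start point from U[0,A], drift from N(v,s), threshold b, non-decision time t₀). The function names, every parameter value, and the redraw-if-nobody-finishes convention are illustrative assumptions, not taken from the slides or from Palada's paper.

```python
import random


def simulate_lba_trial(drift_means, A, b, s, t0, rng):
    """Simulate one trial of a basic LBA race; return (winner_index, rt_seconds)."""
    while True:
        finish_times = []
        for v in drift_means:
            start = rng.uniform(0, A)  # start point ~ U[0, A]
            drift = rng.gauss(v, s)    # drift rate ~ N(v, s)
            # An accumulator with non-positive drift never reaches the threshold.
            finish_times.append((b - start) / drift if drift > 0 else float("inf"))
        fastest = min(finish_times)
        if fastest != float("inf"):
            # Observed RT = non-decision time + time for the winner to reach b.
            return finish_times.index(fastest), t0 + fastest
        # Assumed convention: redraw the trial if no accumulator can finish.


def mean_rt(drift_means, A, b, s, t0, n_trials=5000, seed=1):
    """Average RT over n_trials simulated trials with a fixed seed."""
    rng = random.Random(seed)
    total = sum(
        simulate_lba_trial(drift_means, A, b, s, t0, rng)[1] for _ in range(n_trials)
    )
    return total / n_trials


# Hypothetical parameter values, chosen only for illustration.
baseline = mean_rt(drift_means=(1.0, 0.6), A=0.5, b=1.0, s=0.3, t0=0.2)
lower_threshold = mean_rt(drift_means=(1.0, 0.6), A=0.5, b=0.8, s=0.3, t0=0.2)
harder_task = mean_rt(drift_means=(0.8, 0.6), A=0.5, b=1.0, s=0.3, t0=0.2)
print(f"baseline {baseline:.2f} s | lower threshold {lower_threshold:.2f} s | "
      f"harder task {harder_task:.2f} s")
```

Because the baseline and lower-threshold runs share a seed, they draw identical start points and drifts, so the lower threshold necessarily gives a shorter mean RT. The harder-task comparison (lower drift mean for the first accumulator) is only statistical and will vary with the seed.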
L3 — Autonomy (Rahwan, digital traces)
- ✅ Fixed: Tay "<24h" softened to "within a day" — the precise window isn't on the slides (the slides just say users fed it racist/misogynistic content).
- 🔎 Highest-risk for figures: Kosinski's accuracy table and the Hinds & Joinson comparison (computer .56 ≈ spouse .58, etc.) are image-only on the slides — verify against Kosinski 2013 / Hinds & Joinson 2019 before quoting exact numbers.
- ✅ Verified: Rahwan 3 scales × 4 domains, digital-trace advantages + self-selection bias, Kosinski N=58,466 / likes→SVD→regression, Kramer N=689,003 + tiny-effect framing.
- 🔎 Minor: slides also note "no difference between emotions" / "doesn't require nonverbal cues" (Kramer) and a formal PNAS editorial expression of concern — not in the summary.
L4 — Collective Patterns (HK, Douven & Hegselmann)
- ✅ Added: Douven & Hegselmann observation (iv) — the "changing confidence" variation (updating ε from peers further reduces free-rider impact; larger τ–ρ distance can increase effect on free riders). A note was added.
- 🔎 Nuance: the "subtle disinformer beats a bold one" finding is specifically about disinformation (impeding belief in truth), and non-monotonicity is mentioned parenthetically in the slide's five-concept list. Word both points precisely when quoting.
- ✅ Verified: emergence + Coleman's bathtub, the HK update rule (exact form), three agent types + update rules, mis/disinformation definitions and the logical relation.
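As a reminder of what the HK update rule does, the sketch below implements the basic Hegselmann-Krause bounded-confidence step: each agent moves to the average of all opinions within ε of its own (itself included). It deliberately covers only the basic rule; the Douven & Hegselmann agent types (truth seekers, campaigners, free riders) and their τ and ρ parameters are not modelled here, and the function name and example values are illustrative assumptions.

```python
def hk_step(opinions, eps):
    """One synchronous HK update: average over all opinions within eps."""
    updated = []
    for x_i in opinions:
        # Peers are all agents (including i itself) inside the confidence bound.
        peers = [x_j for x_j in opinions if abs(x_i - x_j) <= eps]
        updated.append(sum(peers) / len(peers))
    return updated


# Hypothetical starting opinions on a [0, 1] scale.
opinions = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]
after_one_step = hk_step(opinions, eps=0.25)
print([round(x, 3) for x in after_one_step])
```

With these values the two groups never see each other (their gap exceeds ε), so one step collapses them into two separate clusters near 0.1 and 0.9, a small instance of group-level structure emerging from a purely local rule.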
L5 — Linguistic Models (van der Vegt)
- ✅ Added: the base-rate fallacy. The slides devote a multi-slide worked example to it (99%-accurate model on 100M people at 0.01% base rate → ~999,900 false positives). This directly answers mock Q4 (rare-event prediction), so it's high-value. Also added the Support / Human-in-the-loop / Full Replacement AI-use taxonomy and the GenAI-hallucination + EU-AI-Act-prohibition challenges. Notes added.
- ✅ Verified: all key numbers (n=22 leaders, 1,909,844 tweets, 33 platforms, 11,717,516 posts, 172-word dictionary), the six Perspective measures, the identity-attack mis-classification caution, CTAP-25.
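The base-rate arithmetic from the worked example can be reproduced in a few lines. This sketch assumes that "99%-accurate" means both sensitivity and specificity are 0.99; that reading reproduces the ~999,900 figure, but the slides may define accuracy differently. The function and key names are illustrative.

```python
def screening_outcomes(population, base_rate, sensitivity, specificity):
    """Count true and false positives when screening a whole population."""
    positives = round(population * base_rate)  # people who truly are cases
    negatives = population - positives         # everyone else
    true_positives = round(positives * sensitivity)
    false_positives = round(negatives * (1 - specificity))
    # Precision: the share of flagged people who are real cases.
    precision = true_positives / (true_positives + false_positives)
    return {
        "positives": positives,
        "true_positives": true_positives,
        "false_positives": false_positives,
        "precision": precision,
    }


# 100M people, 0.01% base rate; sensitivity = specificity = 0.99 (assumed).
result = screening_outcomes(100_000_000, 0.0001, 0.99, 0.99)
print(result["false_positives"])      # 999900
print(f"{result['precision']:.2%}")   # 0.98%
```

Under these assumptions there are 10,000 real cases, of which 9,900 are caught, against 999,900 false alarms, so fewer than 1 in 100 flagged people is a true case. That ratio is the core of the rare-event argument in mock Q4.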
L6 — Medical AI & Digital Twins (Van Rooij + Bontje, Wang)
- ✅ HIGH confidence — no factual errors against either deck. ADHD figures exact (N=700, 77.1%, sens 75%, spec 80%, AUC 0.82); SWOT exact; digital-twin "bi-directional" definition exact.
- ✅ Wang et al. (2023) section is appropriately caveated as a title-based reconstruction (the paper isn't in the source folder). Don't quote it as authoritative.
- 🔎 Minor omission: Bontje's closing "a DT is more than a simulation — connected to the real world" framing (could appear in a definition question). Note added.
L7 — Trust in AI (Grimmelikhuijsen & Meijer) — NO SLIDES
- ✅ Fixed: internal lecture-number inconsistency — the body said "L6 sits squarely…" in an L7 file; corrected to "This lecture."
- ✅ Likely-correct: paper citation (confirmed via student manual reading list), Mayer–Davis–Schoorman trust triad, Dietvorst 2015 aversion, Logg 2019 appreciation, input/throughput/output legitimacy, toeslagenaffaire core facts + Rutte III resignation (Jan 2021).
- 🔎 Check against the paper: the exact wording and ordering of the six threats (the author admits glossing them) and the threat→mitigation mappings. The Liefooghe cognitive-psych section is explicitly an inferred sketch — verify against any Teams slides.
- ℹ️ Note: the course's own numbering puts Grimmelikhuijsen in lecture slot 6; the site keeps Trust as L7 for consistency. Don't be thrown if the real exam references a different lecture number.
L8 — Synthesis & Mock (Van Rooij)
- ✅ Verified: mock Q1, Q3, Q5 wording + correct answers match the slides (Q5 = A, C, D ✓). Four-level scaling framework matches (lecture-number assignments are the author's inference — slides give the four levels without numbers).
- 🔎 Mock Q2 numbering: the slide labels it "Lecture 3"; the site calls Van Maanen's decision-making lecture L2. Same course-vs-site numbering offset as L7 — content is right, only the label differs.
- 🔎 Mock Q4: the slides give no model answer; the summary's "strong answer template" (class imbalance → van der Vegt → CTAP-25) is our suggested answer, not official. It's sound (and now reinforced by the L5 base-rate addition) but flagged as author-constructed.
Bottom line for studying
Trust the frameworks and arguments in these summaries — they held up. Before quoting an exact number from L2/L3 in an essay, glance at the original paper (those tables are image-only on the slides). The single most useful slide-derived addition this pass surfaced is the base-rate fallacy (L5) — it's the backbone of mock Q4.
Second audit pass (2026-06-20): lectures 1, 3 to 8 and the mock, against the obligatory papers
A second pass checked every lecture except L2 against the obligatory papers (now all present except the paywalled Wang 2023), the slide decks, and the student manual. The numbering decision was kept: the site stays on Medical AI / Digital Twins = L6 and Trust = L7, with a note on each page that the manual numbers them the other way. Corrections applied this pass:
- L7 six threats (highest impact). The materials listed six themes (deskilling, opacity, bias, privacy, accountability, value erosion) and called them the paper's. The paper's actual six are two per legitimacy type (Tables 2 to 4, Fig 1, p.240): input (erosion of democratic control; limited responsiveness), throughput (fails procedural fairness; insufficient checks and balances), output (ineffective and inefficient; undesirable outcomes). Rewritten in lecture_07, flashcards, core, and mock Q6, with the old themes kept as "concrete manifestations" mapped onto the real six.
- L7 external content. The trust triad (Mayer, Davis & Schoorman), algorithm aversion/appreciation, and the toeslagenaffaire are accurate but are not in the Grimmelikhuijsen & Meijer paper; each is now flagged as lecture/external material.
- Mock Q2 (L2 holdout). Still said "air-traffic control" and "workload lowers both drift and threshold." Corrected to the simulated UAV surveillance task and selective influence (workload lowers only the threshold; difficulty lowers the drift rate).
- L4 free riders. Metric corrected from "MSE" to sum of squared errors (SSE); the "only when campaigners are subtle" overstatement softened (the paper shows the effect generally; the subtle case is the illustrative one, small effect, eta-squared about 0.004).
- L3 Kosinski table. Split the conflated "AUC / Pearson r" column into a categorical AUC table and a continuous Pearson r table; intelligence and extraversion corrected to r = 0.39 and r = 0.40 (the 0.78 / 0.75 were ceiling-relative benchmark bars, not the correlations).
- L5 attribution. The Dutch-politician study and the identity-attack example are from the slides / a separate van der Vegt study, not the 2023 commentary (which is the four cautions plus VISOR-P); re-attributed across lecture_05, mock Q5, and core. CTAP-25 expansion hedged.
- L6 Wang. Reconstruction hedging strengthened (paper paywalled and unavailable); the federated-learning mechanics are flagged as unverified conjecture in lecture_06, flashcards, core, and mock Q7.
- Logistics and numbering. Mock practice date fixed (the exam is Fri 29 May, so the suggested sit is the Friday before, not 19 June); the in-class mock is clarified as 5 questions (covering L1, L3, L4, L5), distinct from the 9-question mock_exam.md; the four-level scaling note now states the lecture-number mapping is editorial.
- L1. The platform "Verify against the slides" box no longer claims the 15-platform set is "confirmed correct"; it now records that the per-pillar slides and the circular overview give two conflicting layouts. TRUST decomposition re-attributed to Shaw (2020: 176); the "North-ian" label flagged as the note's own gloss.