An interactive guide to understanding, using, and thinking critically about Generative AI. Based on training materials by Pavol Chvala, Director, AI Innovation & Systems Development.
12 hrs
Today's agentic task horizon (50% success) — Claude Opus 4.6
4–7mo
Doubling time for agentic task capability (METR, 2026)
9–900×
Annual reduction in LLM inference cost per token
Module 1–2
How It Works
AI hierarchy, training, inference, model growth and cost trends.
Module 3
Capabilities
Agentic benchmarks, industry comparisons, and what's coming next.
Module 4
Use Cases
Create, Augment, and Collaborate — with live prompting examples.
Module 5
Prompting Techniques
Zero-shot, few-shot, role prompting, step-back, reusable prompts and more.
Module 6
Tools & Selection
Copilot, ChatGPT, Gemini, Claude, Ollama — choosing the right one.
Module 7
Adoption Journey
Learn → Iterate → Standardize. The path to an AI-enabled team.
Module 8
Agents & Automation
How agents work, real examples, risk areas and responsible principles.
Module 9
Challenges
Data awareness, Policy 46, bias, hallucinations, sustainability.
Key insight: Sharing use cases among colleagues is the fastest way to gain efficiency with GenAI tools. The technology is accelerating — nobody is an expert yet, and that's okay.
Module 1–2
How It Works
Understanding the AI landscape — from foundational concepts to how models generate content.
The AI Hierarchy
Generative AI sits inside a nested set of technologies. It's a type of AI capable of generating original content — commonly known as Gen AI, LLM, GPT, or just "the model."
Artificial Intelligence (AI) · Narrow AI · AGI
Machine Learning
Deep Learning
Generative AI · Diffusion · GANs
Large Language Models (LLMs) · BERT · GPT series
ChatGPT · Claude · Gemini · Copilot
How Does It Generate Content?
01
Training
Pattern Recognition
Machine learning identifies patterns in vast data (text, images, video, music). The result is a model.
"The cat sat on the ___" → "mat" "The cat ___ on the mat" → "sat" …do this trillions of times…
02
Preference Tuning
Alignment
The raw model is tweaked so that its outputs align with human preferences — making it helpful, harmless, and honest rather than just statistically likely.
03
Generation
Inference
"Hey model, I have the following data — what comes next?"
Every response is the model predicting the most probable continuation of your input.
Model Growth & Cost Trends
4.5×
Annual growth in training compute during the Deep Learning era (Epoch AI)
9–900×
Annual reduction in LLM inference prices, depending on task type
40×
Per-year cost reduction for mid-range capability models
2010
Start of the Deep Learning era — compute growth accelerated dramatically
Bottom line: Models are getting dramatically more capable while simultaneously getting far cheaper to use. This combination is what makes now such a pivotal moment for AI adoption.
Module 3
AI Capabilities
How capable are today's AI systems — and how fast are things moving?
Agentic Task Horizon (50% Success)
What is the longest software task an LLM can complete with 50% success? This horizon has been doubling every 4–7 months.
12 hrs
Today — Claude Opus 4.6 (METR, Mar 2026)
2026
Projected: 2-hour tasks routinely
2027
Projected: 1 full work day of autonomous tasks
2029
Projected: 1 full work month of autonomous tasks
4–7 month doubling time. Today's best model (Claude Opus 4.6) can complete a 12-hour software task with 50% reliability. At 80% reliability the horizon is currently ~1 hour 10 minutes. Capabilities are growing exponentially.
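Under the stated assumptions (a 12-hour horizon today and a 4–7 month doubling time, with 5.5 months taken as a midpoint), the projections above can be reproduced in a few lines. These are illustrative extrapolations, not forecasts.

```python
# Project the agentic task horizon forward from today's 12-hour figure,
# assuming it keeps doubling every ~5.5 months (midpoint of 4-7 months).
start_hours = 12.0
doubling_months = 5.5

def horizon_after(months):
    """Projected task horizon (hours) after `months` of exponential growth."""
    return start_hours * 2 ** (months / doubling_months)

for months in (12, 24, 36):
    print(f"+{months} mo: ~{horizon_after(months):,.0f} hours")
```

At roughly 160 working hours per month, the one-work-month milestone falls between the +24 and +36 month marks, consistent with the 2029 projection above.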
Agentic Task Horizon — Timeline
2020 — GPT-2 / GPT-3
Task horizon: ~4 seconds. Could answer simple questions reliably.
2022 — GPT-3.5
Task horizon: ~36 seconds. Could count words, simple lookups.
2023 — GPT-4
Task horizon: ~6 minutes. Could fix bugs in small Python libraries.
2024 — o1-preview / GPT-4o
Task horizon: ~1 hour. Could implement a simple web server, find facts on the web.
2025 — o3 / GPT-5 / Claude Opus 4.5
Task horizon: ~5 hours. Could train adversarially robust image models.
Today (Mar 2026) — Claude Opus 4.6 / GPT-5.2
Task horizon: 12 hours. Can implement complex protocols from multiple RFCs.
AI Agent vs. Industry Professional
OpenAI's GDPval benchmark measures how often AI agents outperform human industry professionals on work tasks. The dashed line represents parity (50%).
GPT-5.4
83%
83%
Claude Opus 4.6
78%
78%
GPT-5.2 (wins+ties)
71%
71%
Claude Opus 4.5
60%
60%
Gemini 3 Pro
54%
54%
Claude Sonnet 4.5
50%
50%
GPT-4o
12%
12%
■ Wins only · □ Wins + Ties · Source: evals.openai.com/gdpval · Dec 2025
What Else Can It Do?
March 2025
Google Gemini Robotics
Physical robots powered by multimodal AI — sorting items, following verbal instructions, performing delicate manipulation tasks in unstructured environments.
May 2025
Google Veo 3
AI video generation with cinematic quality — photorealistic scenes, consistent characters, and natural motion from a text description alone. deepmind.google/models/veo
January 2026
Google Project Genie
Interactive world generation — AI creates playable, interactive environments from images or descriptions. Games and simulations without a developer. deepmind.google/models/genie
Application of AI today: Knowledgebase assistants · Complex reasoners · Deep research · Agents acting on your behalf · Medical imaging · Credit risk assessment · Legal research automation · Adaptive learning platforms · Student support agents · Personalized tutors · Course design assistants
Module 4
Use Cases
The hardest part of AI adoption is identifying the right use cases. There are three core patterns.
Pattern 1
Create ✨
Compose net-new content from scratch. Tell me a joke · draft an email · explain how AI works · generate an image · write a song.
Pattern 2
Augment 🔧
Improve or ground existing content using your own sources. Summarize a transcript · check grammar/tone · highlight anomalies in data · review an email.
Pattern 3
Collaborate 🤝
Co-think to brainstorm, compare options, and decide. Draft a business case · analyze sales data · write executive summaries · explore trade-offs.
Prompting Example: Welcome Email
See how the quality of a prompt transforms the output — starting with too little context, then with more specifics, then using augmentation.
Your Prompt
Write me an email welcoming a new hire
AI Output — Generic & Too Long
Subject: Welcome to Our Team!
Dear [New Hire's Name], I hope this email finds you well. On behalf of the entire team, I want to extend a warm welcome to you as our newest team member…
[Full of placeholder text, generic advice, far too verbose. Not useful without significant editing.]
Your Prompt (Better)
I'm the hiring manager for a software developer position and I've selected a candidate whom I've already interviewed. Please write me a concise welcome email for Cathy, this new hire. The start date is June 1st and she will be working in EC2.
AI Output — Better But Still Generic
Subject: Welcome to Our Team, Cathy!
Dear Cathy, I am thrilled to welcome you to our team as a software developer! Your start date is June 1st, and you'll be working with our EC2 team.
[Still uses some generic filler paragraphs about company culture. The format is imposed by the AI, not you.]
Your Prompt (Best — Augment your own draft)
Hi Cathy,
Welcome to the team! We are very excited to have you. As we discussed in the interview, the start date is June 1st and the team works out of EC2. Work hours are 8:30am – 4:30pm. We will do a full onboarding and orientation on your first day so you don't need to worry about preparing anything ahead of time.
Thank you and looking forward to seeing you on June 1st!
---- Here is a welcome email for a new hire. Can you please check it for grammar and tone?
AI Output — ✅ Polished, Your Voice, Right Length
Hi Cathy,
Welcome aboard! We're thrilled to have you join our team. As we discussed during the interview process, your start date is June 1st, and our team operates out of EC2. Your work hours will be from 8:30 am to 4:30 pm. On your first day, we'll provide a comprehensive onboarding and orientation, so there's no need to prepare anything in advance.
Feel free to reach out if you have any questions or need assistance before your start date. Looking forward to meeting you in person on June 1st!
Best regards,
The lesson: The best version starts with Create (you write the draft) and ends with Augment (AI polishes it). GenAI tools do best when given examples and boundaries, and your voice is preserved. This is also why sharing use cases with colleagues accelerates learning for your entire team.
⚠️ A Warning About Accuracy
Asked to generate a chronological list of influential GenAI tools, an AI confidently produced dates — some wrong, some out of order. It correctly listed Claude 3.5 Sonnet (Jun 2024) after Claude 2 (Jul 2023), yet placed Claude 3 (Mar 2024) after Claude 3.5 Sonnet. Models can confabulate plausible-sounding but incorrect information.
Always verify AI outputs — especially dates, statistics, names, and citations.
Module 5
Prompting Techniques
Getting better results from AI tools is a skill. Here are six techniques to use immediately.
Vague prompts get vague answers. Use positive instructions rather than negatives. Specify format, length, and audience.
❌ Bad
Tell me about climate change.
Write something about AI.
Don't make it too long or too technical. Don't use jargon.
✅ Good
Summarize the main causes and effects of climate change on coastal cities with recent scientific data.
Write a 150-word blog post explaining how generative AI is being used in university classrooms in 2025.
Create a 10-slide presentation for a general audience on AI basics. Keep language simple, use visuals, and aim for a story-driven flow.
Zero-Shot
No examples
Quick and easy. High reliance on the model's own knowledge. Best for straightforward tasks.
One-Shot
One example
A bit more effort, but useful for bounding the response type and format. Guides the model with a single reference.
Few-Shot
3–5 examples
Most powerful for complex tasks. Bounds context, format, patterns. Requires quality examples. Best for structured outputs.
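The few-shot pattern is ultimately just string assembly: a handful of example input/output pairs bound the format before the real input arrives. A minimal sketch; the sentiment-labelling task and example pairs are hypothetical, not from the source.

```python
# Few-shot prompt assembly: examples first, real input last, so the
# model's continuation is bounded to the demonstrated format.
examples = [
    ("The lecture was clear and engaging.", "positive"),
    ("I couldn't follow the second half.", "negative"),
    ("It covered the syllabus material.", "neutral"),
]

def few_shot_prompt(new_input):
    lines = ["Classify the sentiment of each review as positive, negative, or neutral.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # End mid-pattern so the model completes the label.
    lines.append(f"Review: {new_input}")
    lines.append("Sentiment:")
    return "\n".join(lines)

print(few_shot_prompt("Great examples, but the pace was too fast."))
```

Ending the prompt mid-pattern ("Sentiment:") is the key move: the model's most probable continuation is now a one-word label in your format.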
Tell the model who it is and what situation it's in. The more context, the more tailored the response.
Role prompting
"You are a helpful writing assistant…" | "You are a senior software developer…" | "You are a travel agent…"
Context prompting
Context: You are building a website for the University of Waterloo which will promote an event for international students arriving in September.
Suggest some design ideas for the landing page layout, keeping accessibility and mobile-first design in mind.
Get the model to answer a general question about your ask first, then feed that context into your specific question.
Step 1 — Direct attempt (often not optimal)
"You are an HR assistant. Give me a job description for a senior web developer based on the following format {attached document}"
Step 2 — Step back to the general question
"What are some of the most important skills that a senior web developer should possess, both technical and non-technical?"
Step 3 — Use the general answer as context for your specific ask
"Context: {response from Step 2, maybe modified} Give me a job description for a senior web developer based on the following format {attached document}"
If you reuse a prompt often with slight changes, define a variables section at the top.
Template
VARIABLES
{project} = "student information system upgrade"
{stakeholder} = "faculty administrators"
PROMPT
You are a senior project manager.
Provide a status update for the {project} tailored to {stakeholder}.
Highlight current progress, key risks, and upcoming milestones for the {project}.
Then recommend communication strategies to keep {stakeholder} engaged and informed throughout the remainder of the {project}.
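The VARIABLES section maps directly onto string templating: define the placeholders once, then reuse the prompt with new values. A minimal Python sketch using the same template and variables as above.

```python
# Reusable prompt: named placeholders are filled per use.
# str.format substitutes every occurrence, including repeats.
TEMPLATE = (
    "You are a senior project manager.\n"
    "Provide a status update for the {project} tailored to {stakeholder}.\n"
    "Highlight current progress, key risks, and upcoming milestones "
    "for the {project}.\n"
    "Then recommend communication strategies to keep {stakeholder} "
    "engaged and informed throughout the remainder of the {project}."
)

prompt = TEMPLATE.format(
    project="student information system upgrade",
    stakeholder="faculty administrators",
)
print(prompt)
```

Swapping in a new project or stakeholder is now a two-line change rather than a rewrite of the whole prompt.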
Prompt Generation
Ask AI to write your prompts
"I need to generate a new job description for a senior web developer. Give me 10 different prompts I can use with an LLM."
"Give me the strengths and drawbacks of these prompts."
Code
You don't need to be a developer
"I have an Excel table with columns: name, count, price. How do I highlight anomalous values?"
"I copy a file from a network share every morning. Help me write code to automate this. Now walk me through what I do with it."
Key principle: GenAI tools do well when provided with examples and boundaries (dos and don'ts). Don't just ask — show the model what success looks like.
Module 6
Tools & Selection
Choosing the right tool matters. Understanding tool types and service models will help you work smarter.
Tool Types
Assistant
Simple Q&A — text, images, music, video, code. Answers right away without extended thinking.
Reasoner
Complex multi-step thinking. "Think deeper" mode. Deep research. Slower but more accurate for hard problems.
Agent
Any combination of the above, plus the ability to act using tools. Can browse the web, run code, read files, send emails.
AI-Enabled Apps
Browsers, media generators, specialized productivity tools with AI built into an existing workflow.
Service Types
Free SaaS
Good for getting started and exploring. Typically restricted in message limits, model access, or features.
Paid SaaS
More advanced features and/or higher capacities. Access to latest models and reasoning modes.
Open Source
Free with full control over infrastructure and software. High complexity — requires technical expertise to deploy and maintain.
The Main Tools
🪟
Microsoft Copilot
✓ UW Approved
office.com/chat · Enterprise data protection applies. University-approved with restricted data support. Auto, Quick response, and Think deeper modes.
Assistant · Reasoner · Agent · Public ✓ · Restricted ✓
💬
OpenAI ChatGPT
Public Only
chatgpt.com · Current model: GPT-5.2. Modes: Instant (quick), Thinking (complex), Pro (research-grade). Message limits apply on the free tier.
Assistant · Reasoner · Agent
✨
Google Gemini
Public Only
gemini.google.com · Current model: Gemini 3. Fast, Thinking (complex problems), and Pro (advanced math/code) modes. Multimodal: image and music creation.
Assistant · Reasoner · AI-Enabled Apps
🤖
Anthropic Claude
Public Only
claude.ai · Models: Haiku 4.5 (fastest), Sonnet 4.6 (everyday tasks), Opus 4.6 (most capable). Extended thinking mode available. Currently #1 on agentic benchmarks.
Assistant · Reasoner · Agent
🦙
Ollama (Open Source)
Self-hosted
ollama.com · Run open-source models locally: Qwen3, LLaMA 3.1, Phi4, and many more. Full control — your data never leaves your machine. High technical complexity to set up.
Assistant · Reasoner · Open Source
UW Policy 46 — Data Classification Reminder: Public information → any AI system. Confidential/Restricted → University-approved systems only (Copilot). Highly-Restricted (SIN, PHI, credit card) → no AI systems. When in doubt, use Copilot.
Real example: Curricular overlap analysis using ChatGPT Agent. Given two program curricula, the agent browsed the university's course catalog, compared learning outcomes across courses, identified overlaps, and produced a structured report — all autonomously from a single prompt.
→ View the ChatGPT Agent conversation
Risk Areas & Mitigations
⚠️ Misaligned Goals
Agent optimizes the wrong objective or interprets intent incorrectly.
✅ Mitigation: Define clear objectives, constraints, and success criteria before deployment. Be specific about what the agent should not do.
⚠️ Uncontrolled Actions
Agent takes harmful or unauthorized actions without human approval.
✅ Mitigation: Limit permissions and require explicit human approval for high-impact actions. Use sandboxed environments for testing.
⚠️ Data Privacy & Security
Agent accesses or exposes sensitive information it shouldn't.
✅ Mitigation: Apply least-privilege data access principles. Continuous monitoring controls. Never feed restricted data to non-approved systems.
⚠️ Lack of Accountability
Unclear ownership when the agent causes harm or makes errors.
✅ Mitigation: Assign a named business owner, technical owner, and risk owner for every agent in production. Document decisions.
⚠️ Over-Trust & Reduced Oversight
Overreliance on the agent without sufficient human review or judgment.
✅ Mitigation: Design meaningful human review checkpoints. Train users on model limitations. Spot-check outputs regularly even when the system seems reliable.
Module 9
Challenges & Data Awareness
AI tools are exciting — but they come with real responsibilities. Understanding these challenges helps you use AI critically and safely.
🎯
Data & Accuracy
Models can hallucinate — generating confident-sounding but incorrect information. Training data reflects historical biases. Always verify outputs, especially dates, statistics, and citations.
⚖️
Ethics & Accountability
It's often impossible to explain why a model produced a specific output. Training data may include copyrighted material, and AI-generated content raises authorship questions.
Accountability · Value Alignment
🌳
Resources & Sustainability
Training and running large models consumes significant energy and water. Cost of inference is falling, but aggregate demand is rising. Sustainability is a growing concern.
Cost · Energy
UW Policy 46 — Information Classification
An AI-enabled tool is just another information system owned by a vendor. The same data rules apply.
🟢 Public
Use in ANY AI system ✓
University publications
Websites without authentication
Social media channels
University Calendar
Published RFPs
Salaries under Ontario PSSDA
🟡 Restricted / Confidential
University-approved systems only (Copilot)
Personal information (FIPPA/PIPEDA)
WatIAM username
Student/Employee Number
WatCard number
IP Address
Data that can identify a person when combined
🔴 Highly Restricted
NO AI systems — ever ✗
Social Insurance Numbers (SIN)
Protected Health Information (PHI)
Credit card / financial data
Passwords & secrets
Legal privilege information
Remember: When uncertain about data classification, default to Copilot for work tasks; it provides enterprise data protection.
Resources & Next Steps
For Work — Use Microsoft Copilot
Your university-approved, data-protected tool. Available at office.com/chat.