Core Content — Parts 1–4
Part 1 — Know What You Hold: Data Inventory & Mapping
You cannot protect, classify, or feed to AI what you cannot see. A data inventory is simply a written list of every place your organization holds data, what kind it is, who owns it, where it lives, and how long you keep it. It is the single most useful document this module produces — and most non-profits have never made one.
Walk your organization function by function and write down every data source: intake forms, case notes, the donor database, the email newsletter list, the volunteer roster, grant-reporting spreadsheets, the shared inbox, paper files in a cabinet. For each, capture where it lives (which system, cloud or on-premises or paper), who owns it (a named person, not "the team"), what's in it, and how sensitive it is. This is exactly the practice enterprise data teams call an "AI-ready data inventory" — you are doing the same thing, just lean (BigID, 2025; NonProfit PRO, 2025).
DATA INVENTORY — what feeds the map
┌─────────────┬───────────────┬───────────────┬───────────────┐
│ Source      │ Where it      │ Owner         │ Sensitivity   │
│ (what)      │ lives         │ (named)       │ (P/I/C/HS)    │
├─────────────┼───────────────┼───────────────┼───────────────┤
│ Case notes  │ Apricot       │ Program Dir.  │ Highly Sens.  │
│ Donor gifts │ Raiser's Edge │ Dev. Director │ Confidential  │
│ Newsletter  │ Mailchimp     │ Comms lead    │ Internal      │
│ Annual rpt  │ Website       │ Comms lead    │ Public        │
└─────────────┴───────────────┴───────────────┴───────────────┘
A good first result is a one-page inventory that names a human owner for every data set. Ownership is what makes a rule enforceable: "everyone is responsible" means no one is.
A common warning sign is an expensive data-governance platform bought before anyone has written down what data the organization actually holds. Tools do not create the inventory; people do. Start with a spreadsheet.
Don't aim for perfect on day one. A first inventory listing your top 10–15 data sources, made in 45 minutes, is worth more than a "complete" one that never gets finished. Mark gaps as "unknown — to investigate" and move on.
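If the inventory lives in a spreadsheet, a short script can flag the rows that still need follow-up. The sketch below is illustrative only: the column names, the sample rows, and the `find_gaps` helper are assumptions made for this example, not features of any system named above.

```python
import csv
import io

# Hypothetical inventory export; column names and rows are assumptions.
INVENTORY_CSV = """source,location,owner,sensitivity
Case notes,Apricot,Program Director,Highly Sensitive
Donor gifts,Raiser's Edge,Development Director,Confidential
Newsletter list,Mailchimp,the team,Internal
Volunteer roster,Shared drive,,unknown - to investigate
"""

VALID_LEVELS = {"Public", "Internal", "Confidential", "Highly Sensitive"}
VAGUE_OWNERS = {"", "the team", "it", "everyone"}


def find_gaps(csv_text):
    """Return (source, problem) pairs for rows that need follow-up."""
    gaps = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # An owner must be a named person, not a blank or a group label.
        if row["owner"].strip().lower() in VAGUE_OWNERS:
            gaps.append((row["source"], "no named owner"))
        # Anything outside the four levels still needs classifying.
        if row["sensitivity"].strip() not in VALID_LEVELS:
            gaps.append((row["source"], "sensitivity not classified"))
    return gaps


gaps = find_gaps(INVENTORY_CSV)
for source, problem in gaps:
    print(f"{source}: {problem}")
```

Rows marked "unknown" are reported rather than rejected, which matches the advice to record gaps and keep moving.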
4D tie-in — Delegation: You cannot decide what to safely hand to AI (Delegation) until you know what each data set contains and how sensitive it is. The inventory is the precondition for every other decision in the kit.
Part 2 — Label the Risk: Sensitivity Classification, Ownership, Access & Retention
Once you can see your data, sort it by how much harm a leak would cause. A simple four-level scale is enough for almost every non-profit:
| Level | Meaning | Non-profit examples |
|---|---|---|
| Public | Already published; no harm if seen | Annual report, public program descriptions, press releases |
| Internal | Routine internal data; mild embarrassment if leaked | Staff rosters, meeting notes, general newsletter list |
| Confidential | Real harm if leaked; donor/financial trust | Donor giving history, board minutes, budgets, contracts |
| Highly Sensitive | Could endanger a person; legal/ethical duty | Immigration status, health/mental-health records, financial distress, abuse/safety details, children's data, biometric or legal data |
The "Highly Sensitive" row is the heart of this module. These are the categories that, if exposed, can cost someone their housing, their safety, their immigration case, or their dignity. Vera Solutions' responsible-AI principles put Privacy & Data Protection at the center: collect only what you need, anonymize or encrypt sensitive data, and follow the regulations that apply to you (Vera Solutions, 2024). Data minimization — collecting only what is absolutely necessary — is the cheapest protection you have (ASU Lodestar, 2025).
For each data set, also write down two more things:
- Access — who can see it, on a strict need-to-know basis with role-based permissions, reviewed periodically (ASU Lodestar, 2025).
- Retention — how long you keep it before secure deletion. AI-ready organizations enforce retention and keep only what is necessary, accurate, and current (Transcend, 2025; Striim, 2025).
CLASSIFY → then apply the right protection
Public ───────► share freely
Internal ─────► internal access; default-deny external
Confidential ─► role-based access · encryption · NEVER public AI
Highly Sens. ─► strict need-to-know · consent check · NEVER any
                external AI · elevated protection · audit
Any data set classified Confidential or Highly Sensitive is automatically off-limits to public/consumer AI tools — no exceptions, no "just this once." Classification and the cardinal rule (Part 3) are the same decision viewed from two angles.
When in doubt, classify up, not down. It costs nothing to over-protect an internal memo; it can cost someone everything to under-protect a case note. The default for anything about a beneficiary's situation is Highly Sensitive until proven otherwise.
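The four levels and the "classify up" habit can be written down as a small ordered scale. This is a minimal sketch, assuming the levels are ranked from least to most sensitive; the names `Sensitivity`, `classify_up`, and `blocked_from_public_ai` are hypothetical and exist only for this example.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """The four levels, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    HIGHLY_SENSITIVE = 4


def classify_up(*candidates):
    """When unsure between levels, take the most protective one."""
    return max(candidates)


def blocked_from_public_ai(level):
    """Confidential and Highly Sensitive data are automatically off-limits."""
    return level >= Sensitivity.CONFIDENTIAL


# A case note that might be Internal or Highly Sensitive gets the higher level.
case_note_level = classify_up(Sensitivity.INTERNAL, Sensitivity.HIGHLY_SENSITIVE)
print(case_note_level.name, blocked_from_public_ai(case_note_level))
```

The function only encodes the automatic block. Internal data still needs a judgment call under the default-deny rule, so the sketch deliberately does not mark it as "allowed".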
Do / Don't
| Do | Don't |
|---|---|
| Tag every data set with one of four levels | Invent fifteen levels nobody will remember |
| Name a human owner per data set | Leave ownership as "IT" or "the team" |
| Set a retention date and stick to it | Keep everything forever "just in case" |
| Review access when staff leave (high turnover!) | Leave a departed staffer's login active |
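
A retention rule is easier to keep when something reminds you it has expired. The sketch below shows one way to list data sets that are past their retention period. The data sets, dates, and retention lengths are made-up examples, not recommended periods; your own periods depend on your funders and the laws that apply to you.

```python
from datetime import date, timedelta

# Hypothetical records: (data set, date collected, retention period in days).
RECORDS = [
    ("2019 intake forms", date(2019, 3, 1), 365 * 5),
    ("2024 event sign-ups", date(2024, 6, 15), 365 * 2),
]


def overdue_for_deletion(records, today):
    """Return the names of data sets whose retention period has ended."""
    overdue = []
    for name, collected_on, retention_days in records:
        delete_after = collected_on + timedelta(days=retention_days)
        if today > delete_after:
            overdue.append(name)
    return overdue


print(overdue_for_deletion(RECORDS, today=date(2025, 1, 1)))
```

Passing `today` in explicitly, rather than reading the clock inside the function, makes the check easy to rerun for any review date.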
Part 3 — The Cardinal Rule, Consent, and the Agreements Around Your Data
The cardinal rule, stated plainly: Never paste sensitive beneficiary or donor data into public AI tools. Candid frames it as the test that anyone can apply: do not enter personally identifying or confidential information, legal documents, passwords, or anything you wouldn't paste into a public website (Candid, 2025). Public AI tools can store your input and use it to train future models, which means you can lose ownership and control of that data the moment you hit send (ASU Lodestar, 2026). This is the one rule every staff member must know cold — it is the spine of the all-staff rule card (Part B and the Templates section).
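A simple pre-paste check can catch the most obvious slips before text reaches a public AI tool. The sketch below is an assumption-heavy illustration: the three patterns are hypothetical examples that only recognize common email, US phone, and US Social Security number formats. An empty result does not mean the text is safe, because names, addresses, and case details pass straight through. Human judgment remains the actual control.

```python
import re

# Illustrative patterns only; they catch obvious formats and miss names,
# addresses, case details, and anything written in free text.
PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "US-style phone number": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "US SSN-style number": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def obvious_identifiers(text):
    """Return the labels of any identifier patterns found in the text."""
    return [label for label, pattern in PATTERNS.items() if pattern.search(text)]


draft = "Client reachable at maria@example.org or 555-201-3344 after 5pm."
found = obvious_identifiers(draft)
if found:
    print("Do not paste. Found:", ", ".join(found))
```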
Consent — does your existing language cover AI? Most non-profit consent forms were written before generative AI existed, so they almost never mention it. Good consent is freely given, informed, specific, documented, and obtained in advance — and you should explain in plain terms what data you collect, why, and how it is used, offer opt-outs for non-essential collection, and renew consent periodically (GDPR principles; ASU Lodestar, 2025). Before any AI touches beneficiary data, check: does the consent the person signed actually cover this use? If not, you need new language or you do not proceed. (Module 7 covers trauma-informed, culturally appropriate consent in depth.)
Does GDPR or HIPAA apply to you? (plain-language version)
- GDPR applies to organizations established in the EU/EEA and also to organizations elsewhere that offer goods or services to, or monitor the behavior of, people located there, regardless of where your office is. International programs, EU donors, or EU beneficiaries can pull you in (Usercentrics, 2024; Foundation Group, 2025).
- HIPAA applies only if you are a "covered entity" (a health plan, a health-care clearinghouse, or a health-care provider that transmits health information electronically for standard transactions such as billing) or a "business associate" that handles protected health information on a covered entity's behalf. A health clinic or hospital foundation is typically covered; a general social-services charity usually is not. If you hold health data, treat it as Highly Sensitive regardless (Foundation Group, 2025).
- U.S. state laws are multiplying fast: by mid-2025, 13 states had comprehensive privacy laws, with more taking effect through 2025 (501c3.org, 2025). You do not need to memorize them — you need a named person responsible for staying informed (ASU Lodestar, 2025).
Data-sharing agreements with funders, partners, and contractors. Whenever data leaves your walls — to a government funder, a partner agency, a contractor, or an AI vendor — there should be a written agreement that spells out: who plays which role (data "controller" vs. "processor"), the exact purpose and the exact data shared, security requirements (encryption, access controls), breach-notification timelines, sub-processing rules, audit rights, and what happens to the data at the end (deletion or return) (ContractsCounsel, 2024; GDPR.eu DPA template). Vet every third-party provider's own data practices and put protective clauses in the contract (ASU Lodestar, 2025). For AI vendors specifically, the make-or-break clause is "this vendor will not train on our data" — that single line is what separates an enterprise tool from a consumer one (covered in depth in Module 3).
Before sharing any beneficiary data with a partner or contractor, confirm two things: (1) a signed data-sharing agreement exists, and (2) the beneficiary's consent actually covers that sharing. Missing either one means stop.
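The two checks above work as a gate: both must pass, and a failure of either one means stop. A minimal sketch follows, assuming you record each proposed share somewhere; the `SharingRequest` fields and the `sharing_decision` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class SharingRequest:
    """Hypothetical record of a proposed data share; field names are illustrative."""
    recipient: str
    signed_agreement_on_file: bool
    consent_covers_this_sharing: bool


def sharing_decision(request):
    """Apply the two checks: both must pass, otherwise stop."""
    missing = []
    if not request.signed_agreement_on_file:
        missing.append("signed data-sharing agreement")
    if not request.consent_covers_this_sharing:
        missing.append("consent that covers this sharing")
    if missing:
        return "STOP: missing " + " and ".join(missing)
    return "OK to proceed"


print(sharing_decision(SharingRequest("Partner agency", True, False)))
print(sharing_decision(SharingRequest("Evaluation contractor", True, True)))
```

Returning the reason for the stop, not just a yes or no, tells the requester exactly what to fix before asking again.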
"It's fine, we trust them." Trust is not a control. A two-page agreement and a consent check protect the relationship and the people in the data — and they protect you if something goes wrong.
4D tie-in — Diligence: Consent checks, signed agreements, and the cardinal rule are Diligence in practice — verifying and taking ownership before data moves, not apologizing after.
Part 4 — "AI-Ready Data" and the Data-Hygiene Clinic
What "AI-ready" actually means. Data is AI-ready when it is structured (in fields, not buried in free-text or paper), consistent (the same thing is recorded the same way every time), complete (few blanks, minimal duplicates), and pulled toward a single source of truth rather than scattered across disconnected systems (Alteryx; Transcend; NonProfit PRO, 2025). For a non-profit, the practical version is: one place where each fact lives, recorded the same way each time.
NonProfit PRO's five practical steps to an AI-ready data foundation map cleanly onto lean teams (NonProfit PRO, 2025):
1. ASSESS → where does data live? siloed or unified? who owns quality?
2. PRIORITIZE → pick ONE use case / pilot, not everything at once
3. CONSOLIDATE → reduce silos toward a single source of truth
4. STANDARDIZE → clean: dedupe, fix fields, agree shared metric definitions
5. GOVERN → access controls, privacy rules, named ownership
The clinic — data hygiene on a real system. Whatever you use — Salesforce Nonprofit Cloud / NPSP, Blackbaud Raiser's Edge NXT, Bonterra Apricot, or just spreadsheets — the same problems show up: duplicate records, names in ALL CAPS, misspellings, inconsistent field values, and the same metric defined three different ways by three teams (Omatic, 2024). Apricot is built around the operational case record (intake, case notes, service history); NPSP organizes data into accounts, contacts, opportunities, and campaigns; Raiser's Edge NXT includes built-in validation tools. In every one, a basic hygiene pass means: merge duplicates, standardize key fields, fill or flag critical blanks, and write down one agreed definition for each outcome metric so "served" means the same thing across every project (Omatic, 2024; PairSoft).
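For teams working from spreadsheets or exports, the basic hygiene pass can be sketched in a few lines. The example below assumes a hypothetical contact export with `name`, `email`, and `city` fields and treats two records with the same email as the same person; both are assumptions, and each system named above has its own merge tools that should be preferred where available. Automatic title-casing can also mangle some names (for example "McDonald"), so a person should review the result.

```python
# Hypothetical contact export; real field names depend on your system.
contacts = [
    {"name": "MARIA LOPEZ", "email": "Maria.Lopez@example.org ", "city": "Phoenix"},
    {"name": "Maria Lopez", "email": "maria.lopez@example.org", "city": ""},
    {"name": "james  okafor", "email": "j.okafor@example.org", "city": "Tempe"},
]


def clean(record):
    """Standardize key fields: trimmed title-case names, lowercase emails."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "city": record["city"].strip(),
    }


def merge_duplicates(records):
    """Merge records sharing an email, keeping the first non-blank value per field."""
    merged = {}
    for record in map(clean, records):
        kept = merged.setdefault(record["email"], record)
        for field, value in record.items():
            if not kept[field]:
                kept[field] = value
    return list(merged.values())


cleaned = merge_duplicates(contacts)
# Flag critical blanks instead of guessing at values.
blanks = [c["name"] for c in cleaned if not c["city"]]
print(len(cleaned), "records after merge; missing city:", blanks)
```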
Why this matters for M&E: AI-assisted impact analysis (Module 8) only works if your outcome data is tracked consistently across cohorts and over time. Vera Solutions built Salesforce-based M&E systems that track outcomes for millions of beneficiaries precisely because consistent, structured outcome data is the prerequisite for any analysis, whether by hand or by AI (Vera Solutions). A clean data catalog with shared metric definitions is what lets every team "reference the same shared understanding" (NonProfit PRO, 2025).
A sign you are getting there: your team can answer "how many people did we serve last quarter?" the same way no matter who you ask, because the metric has one written definition. That consistency is worth more to AI-readiness than any new tool.
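One way to make a metric definition stick is to write it down once, in a form everyone reuses. The sketch below assumes one possible definition of "served" (unique clients with at least one service in the period) and a hypothetical service log. Your organization's agreed definition may differ; the point is that there is exactly one.

```python
from datetime import date

# Hypothetical service log: one row per service delivered.
service_log = [
    {"client_id": "C-101", "service_date": date(2025, 1, 14)},
    {"client_id": "C-101", "service_date": date(2025, 2, 3)},
    {"client_id": "C-207", "service_date": date(2025, 3, 30)},
    {"client_id": "C-315", "service_date": date(2025, 4, 2)},
]


def people_served(log, start, end):
    """Assumed definition of "served": unique clients who received at least
    one service between start and end, inclusive."""
    return len({row["client_id"] for row in log if start <= row["service_date"] <= end})


q1_served = people_served(service_log, date(2025, 1, 1), date(2025, 3, 31))
print("People served in Q1:", q1_served)
```

Counting unique clients rather than rows is the design choice that matters here: client C-101 received two services but is one person served.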
Cleaning is not a one-time event. Schedule a recurring hygiene pass (quarterly is realistic for lean teams) and assign it to the data owner. A little maintenance beats a heroic annual cleanup nobody has time for.
4D tie-in — Description: Clean, consistently-labeled data is what lets you describe a task to AI precisely. Garbage in, confident-garbage out — Discernment can only catch so much if the underlying data is a mess.