Uncommon Insights

Ethics in AI for Business: Four Operator Gates Before Launch

9 min read · 5 January 2026

Air Canada's chatbot told a grieving customer he could book a full-fare ticket and claim the bereavement discount back as a refund afterwards. No such retroactive policy existed. The customer booked the flight, paid full price, and tried to claim the refund the bot had promised. Air Canada refused. The customer took it to British Columbia's Civil Resolution Tribunal, a small-claims body. The tribunal held the airline liable for its chatbot's hallucination, awarding roughly CA$812 in damages, interest, and fees, and explicitly rejecting the airline's argument that the chatbot was a separate legal entity from the airline.

Eight hundred and twelve dollars. That is the headline number. The legal and reputational tail is the real cost. Every chatbot deployment in the English-speaking world now operates against this precedent. Every $1M to $10M ecommerce brand running a generative-AI customer service layer or AI-drafted product copy carries the same liability exposure. Most operators do not know the precedent exists, let alone what to do about it.

The Retrofit Problem That Produces Refund Storms

The standard pattern at $1M to $10M brands is to plug generative AI into product copy, support chat, and pricing decisions, ship it, and worry about ethics if something goes wrong. That sequencing is the failure. By the time the first hallucinated product description, biased recommendation, or wrong-policy chatbot answer surfaces, the harm has already happened. The operator is now triaging a refund storm, a regulator letter, or a viral customer-complaint thread, and the ethics work that should have been done pre-launch is being done in crisis mode.

The Air Canada chatbot ruling is the precedent that anchors the operator-side risk picture. The tribunal's rejection of the "chatbot is a separate entity" defence means brands cannot disclaim AI-generated statements as if they were a third party's words. The brand owns the bot's output. The "Air Canada hallucinations" write-up walks through the operator-grade case mechanics and the refund flow that followed.

The "Moffatt v Air Canada analysis" goes deeper into the legal reasoning, which matters for the disclosure-rule gate. The tribunal found that Air Canada had not adequately disclosed that the chatbot's answers might be unreliable, and that the customer reasonably relied on the bot's stated policy. The disclosure question is now the load-bearing legal test for chatbot-driven misrepresentation.

The FTC has been running parallel enforcement on the deceptive-AI-claims side. The "FTC Operation AI Comply" announcement detailed the 2024 crackdown on deceptive AI claims, including five specific ecommerce-adjacent enforcement actions. The "FTC Operation AI Comply 2025" follow-up confirms that the enforcement focus has carried into the new administration, with continued investigation of brands making unsupported AI-capability claims in marketing.

Holland & Knight's FTC AI analysis frames the bias-test side of the picture. The FTC has signalled that AI tools producing biased outputs (in pricing, in recommendations, in customer service triage) face the same Section 5 unfairness test as any other business practice. Brands that ship AI without pre-launch bias testing are running an unfairness-claim exposure they may not see coming.

The "FTC AI enforcement" hub page is the one operators should bookmark. The agency's posture has hardened across pricing AI, recommendation AI, and customer-service AI. The pattern across enforcement actions is consistent: brands that documented their pre-launch testing and disclosure mechanics get faster, lighter outcomes; brands that did not document them get slower, heavier ones.

The Pre-Launch Guardrail Protocol

I call the fix The Pre-Launch Guardrail Protocol. It is four operator-grade gates that every customer-facing AI tool has to clear before launch, with named owners, documented evidence, and SLAs.

Gate one is bias testing. Before any AI tool ships into pricing, recommendation, support, or copy generation, run a bias test against a representative sample of customer demographics, geographies, and use cases. The output is a documented test result that demonstrates the tool's outputs do not systematically disadvantage protected classes, geographic regions, or customer cohorts. The named owner is usually the head of data or the senior engineer responsible for the AI tool. The SLA is that bias-testing evidence is filed before the launch decision is made, not after.

Gate two is the disclosure rule. Every customer-facing AI interaction includes a clear, plain-language disclosure that the customer is interacting with an AI system or seeing AI-generated content. The disclosure has to be at the point of interaction, not buried in a footer or terms-of-service document. The Air Canada precedent makes this gate non-optional. Lathrop's "AI claims" analysis walks through the transparency standard the FTC is applying. The named owner is the legal or compliance lead, with the marketing or product lead as the executor. The SLA is that the disclosure is reviewed and approved by counsel before the AI tool ships.

Gate three is the human-review threshold. For any AI output that creates a customer-facing commitment (pricing decisions, refund policies, product claims, fitment guarantees), define the threshold above which a human reviews the output before it ships to the customer. For low-stakes interactions (a generic FAQ answer, a product description on a long-tail SKU) the threshold can be high. For high-stakes interactions (a refund commitment, a custom quote, a policy interpretation) the threshold should be near-zero, with human review on every case. The Air Canada chatbot would have failed this gate. The tribunal made clear that "the AI said it" is not a defence when the customer reasonably relied on the AI's stated policy.

Gate four is incident logging. Every AI tool ships with a logged record of every output, every customer interaction, and every flagged exception. The log is the audit trail when (not if) the FTC, a state AG, a customer's lawyer, or a journalist asks how the system was tested and what it has been doing in production. The log retention follows the brand's data-retention policy, with a minimum of 12 months. The named owner is the data lead, and the SLA is that the log is queryable in under 24 hours when an incident is raised.

I have walked four ecommerce brands through The Pre-Launch Guardrail Protocol in the last 12 months. The consistent finding is that the bias-testing and disclosure-rule gates are the ones operators most often skip, and the human-review threshold is the one that most often gets watered down between drafting and launch under pressure to "ship the AI". Hold the line on all four.

Phase 1: Bias Test and Disclosure Rule (Days 1-30)

Phase 1 ships the two gates that cannot be bolted on after launch.

Days 1 to 14 are the bias-testing build. Define the protected classes, geographies, and customer cohorts that the AI tool's outputs will affect. Build a representative test set across those dimensions. For a pricing AI, that means test prompts spanning customer demographics inferable from order history, delivery geographies, and product categories. For a recommendation AI, that means test inputs across customer cohorts (new versus returning, high-LTV versus low-LTV, by gender if the product category implies it). The test output is a documented analysis showing whether the AI's outputs systematically vary across the dimensions in a way the brand cannot defend on a non-discriminatory basis.
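A minimal sketch of what that documented analysis can look like for a pricing AI, written in Python. The record fields, the cohort labels, and the 5 percent tolerance (`max_ratio=1.05`) are illustrative assumptions, not a legal standard. A real test would use a far larger sample and a threshold agreed with counsel.

```python
from collections import defaultdict
from statistics import mean


def cohort_disparity(records, cohort_key, value_key):
    """Return per-cohort means and the ratio of the highest mean to the lowest."""
    by_cohort = defaultdict(list)
    for rec in records:
        by_cohort[rec[cohort_key]].append(rec[value_key])
    means = {cohort: mean(values) for cohort, values in by_cohort.items()}
    lowest = min(means.values())
    if lowest <= 0:
        raise ValueError("sketch assumes strictly positive values, such as prices")
    return means, max(means.values()) / lowest


def bias_test(records, cohort_key, value_key, max_ratio=1.05):
    """Flag a dimension when cohort means differ by more than max_ratio (assumed tolerance)."""
    means, ratio = cohort_disparity(records, cohort_key, value_key)
    return {
        "dimension": cohort_key,
        "means": means,
        "ratio": round(ratio, 3),
        "passed": ratio <= max_ratio,
    }


# Hypothetical test set: AI-quoted prices for the same SKU across delivery regions.
sample = [
    {"region": "urban", "quoted_price": 20.0},
    {"region": "urban", "quoted_price": 20.0},
    {"region": "rural", "quoted_price": 23.0},
    {"region": "rural", "quoted_price": 23.0},
]
result = bias_test(sample, "region", "quoted_price")
print(result)
```

A failed result is not proof of unlawful bias. It is the prompt to find out whether the gap has a defensible cause (shipping cost, for example) and to write that reasoning into the evidence file.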

Days 15 to 21 are the disclosure rule. Write the customer-facing disclosure. Get counsel sign-off. Place the disclosure at the point of interaction (chat-widget header, product-description tag, recommendation block label). The disclosure language should be plain English: "This response was generated by an AI assistant" or "This product description was AI-drafted". Avoid lawyer-speak. The Air Canada ruling turned in part on whether a reasonable customer would understand they were getting AI output. Plain English is the safest answer.

Days 22 to 30 are the launch decision gate. The legal lead, the head of data, and the operations lead sit down with the bias-testing results and the disclosure rule. The output is one of three: ship (gates passed, launch authorised), iterate (specific failures identified, ship blocked until fixed), or kill (the AI tool's risk profile cannot be made acceptable inside a reasonable timeframe). The discipline at this gate is what separates brands that ship AI safely from brands that ship AI and apologise later.
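The three-way outcome can be written down as a small decision function so nobody reinterprets it in the meeting. This is a sketch in Python; the function name, its inputs, and the idea of reducing "reasonable timeframe" to a single yes-or-no flag are my assumptions for illustration.

```python
from enum import Enum


class Decision(Enum):
    SHIP = "ship"
    ITERATE = "iterate"
    KILL = "kill"


def launch_decision(bias_passed, disclosure_approved, fixable_in_timeframe=True):
    """Map the Phase 1 gate results to ship, iterate, or kill."""
    if bias_passed and disclosure_approved:
        return Decision.SHIP      # both gates passed, launch authorised
    if fixable_in_timeframe:
        return Decision.ITERATE   # specific failures identified, ship blocked until fixed
    return Decision.KILL          # risk cannot be made acceptable in a reasonable timeframe


print(launch_decision(bias_passed=True, disclosure_approved=False).value)
```

The point of the sketch is that there is no fourth branch: a tool with one failed gate never ships "with caveats".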

The KPI for Phase 1 is gate-passage rate. Every AI tool has to clear both gates before launch. Brands that let the bias test slide because "we did not have time" or skip the disclosure rule because "the bot will look less professional" are the ones who end up writing the apology letter to a regulator six months later.
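Gate-passage rate is simple enough to compute from a spreadsheet, but here is a sketch in Python for teams that track launches in code. The tool names and the `gates` dictionary layout are hypothetical. A gate with no recorded result is counted as not passed.

```python
def gate_passage_rate(tools, required=("bias_test", "disclosure")):
    """Share of tools that cleared every required gate before launch."""
    if not tools:
        raise ValueError("no tools to measure")
    cleared = sum(
        all(tool["gates"].get(gate, False) for gate in required)
        for tool in tools
    )
    return cleared / len(tools)


# Hypothetical launch register.
launches = [
    {"name": "support-chat", "gates": {"bias_test": True, "disclosure": True}},
    {"name": "pdp-copy", "gates": {"bias_test": True, "disclosure": False}},
    {"name": "pricing", "gates": {"bias_test": True}},  # disclosure never recorded
    {"name": "recs", "gates": {"bias_test": True, "disclosure": True}},
]
print(f"{gate_passage_rate(launches):.0%}")
```

Anything below 100 percent means a tool is in front of customers without its evidence on file.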

Phase 2: Human-Review Threshold and Incident Log (Days 31-60)

Phase 2 builds the production-time guardrails. These two gates have to be live before the AI tool sees its first real customer.

Days 31 to 45 are the human-review threshold. For each AI tool, document the categories of output that require human review before customer delivery. Map each category to a numeric threshold (the AI's confidence score, the dollar value of the customer-facing commitment, the policy area being interpreted). Wire the routing so that outputs that cross the threshold (confidence below the floor, or commitment value above the cap) pause for human review automatically. The named owner is the operations lead, and the review SLA depends on the use case (real-time chat: review under 60 seconds; product description batches: review within 24 hours; pricing decisions: review within the same business day).
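A sketch of that routing rule in Python. The category names, the 0.9 confidence floor, and the $50 commitment cap are placeholder assumptions; each brand sets its own numbers per tool. High-stakes categories go to a human every time, regardless of score.

```python
from dataclasses import dataclass

# Assumed high-stakes categories: reviewed on every case.
HIGH_STAKES = {"refund_commitment", "custom_quote", "policy_interpretation"}


@dataclass
class Output:
    category: str          # what kind of customer-facing output this is
    confidence: float      # model confidence score, 0.0 to 1.0
    commitment_usd: float  # dollar value of any commitment the output makes


def needs_human_review(out, min_confidence=0.9, max_commitment_usd=50.0):
    """Return True when the output must pause for a human before delivery."""
    if out.category in HIGH_STAKES:
        return True
    if out.confidence < min_confidence:
        return True
    return out.commitment_usd > max_commitment_usd


queue = [
    Output("faq_answer", 0.95, 0.0),
    Output("refund_commitment", 0.99, 35.0),
    Output("product_description", 0.72, 0.0),
]
for item in queue:
    print(item.category, "-> review" if needs_human_review(item) else "-> send")
```

Note that a confident model does not exempt a refund commitment from review. The category check runs first for exactly that reason.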

Days 46 to 55 are the incident log build. Stand up logging on every AI tool's inputs, outputs, customer interactions, and flagged exceptions. Use a structured format (JSON, columnar) that supports query within 24 hours when an incident is raised. The retention period is 12 months minimum, with quarterly archiving for high-volume tools. The data lead owns the log. The legal lead has standing query access.
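One low-effort structured format is JSON Lines: one JSON object per interaction, appended to a file or shipped to a warehouse. The sketch below uses only the Python standard library. The field names and the file path are assumptions, and a production log would also need access controls and the retention job.

```python
import json
import uuid
from datetime import datetime, timezone


def log_record(tool, customer_id, prompt, output, flagged=False, reason=None):
    """Build one structured log entry for a single AI interaction."""
    return {
        "record_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "customer_id": customer_id,
        "prompt": prompt,
        "output": output,
        "flagged": flagged,
        "reason": reason,
    }


def append_record(path, record):
    """Append the entry as one line of JSON."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def flagged_records(path, tool=None):
    """Return flagged entries, optionally for one tool: the query an incident needs."""
    with open(path, encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    return [r for r in records if r["flagged"] and (tool is None or r["tool"] == tool)]
```

Typical use: call `append_record("ai_log.jsonl", log_record(...))` on every interaction, then run `flagged_records("ai_log.jsonl", tool="support-chat")` when legal asks what the bot has been saying.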

Days 56 to 60 are the incident-response runbook. Document the steps the team takes when an AI incident is raised: who gets notified, how the AI tool is paused (kill switch within 5 minutes of the incident being raised), how the customer is contacted, how counsel is engaged, and how the log is preserved. The runbook is rehearsed at least once before the AI tool goes live and quarterly thereafter. Brands that skip the runbook end up making it up under pressure during their first incident, which is the worst possible time to design a process.
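The kill switch itself can be very small. Here is a sketch in Python of a pause flag that every AI call checks, plus a check against the 5-minute SLA. The class, the fallback message, and the use of plain epoch seconds are assumptions. In production the flag would live in shared config or a feature-flag service so one change pauses every server.

```python
import time

PAUSE_SLA_SECONDS = 5 * 60  # kill switch within 5 minutes of the incident being raised


class KillSwitch:
    def __init__(self):
        self.paused_at = None
        self.reason = None

    def pause(self, reason, now=None):
        """Pause the tool and record when and why."""
        self.paused_at = time.time() if now is None else now
        self.reason = reason

    @property
    def paused(self):
        return self.paused_at is not None

    def met_sla(self, incident_raised_at):
        """True only if the tool was paused within the SLA window."""
        return self.paused and (self.paused_at - incident_raised_at) <= PAUSE_SLA_SECONDS


def respond(switch, generate, prompt, fallback="A team member will reply shortly."):
    """Serve the AI answer unless the switch is on; then hand off to a human."""
    if switch.paused:
        return fallback
    return generate(prompt)
```

Rehearsing the runbook means actually flipping this switch in staging and timing it, not reading the document aloud.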

A subtle Phase 2 rule that pays for itself is the named-owner discipline at the threshold gate. When the human-review threshold has no named owner, reviews drift in cadence and quality, and the AI tool starts shipping outputs that should have been paused. The named owner has to be a single person, with a documented backup who picks up the review queue when the primary owner is on leave. Two-deep coverage is non-negotiable for any tool that affects customer-facing commitments.

The other rule worth holding the line on is the disclosure-rule audit. Customer-facing disclosures drift over time as marketing teams iterate on the chat widget, the PDP design, or the support-page layout. Run a quarterly audit on every active disclosure surface. If the disclosure has been buried, shrunk, or removed, it gets restored before any further AI output ships through that surface. Brands that skip the audit are not being malicious; they are letting design drift quietly erode the legal protection the disclosure was meant to provide.
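Part of that audit can be automated. The sketch below, in Python, checks whether the approved disclosure wording still appears in the rendered text of each surface. The surface names and sample text are hypothetical. It only catches removal or rewording; a disclosure that has been shrunk or pushed below the fold still needs a human to look at the page.

```python
APPROVED_DISCLOSURE = "This response was generated by an AI assistant"


def audit_surfaces(surfaces, approved=APPROVED_DISCLOSURE):
    """Return the names of surfaces whose rendered text lacks the approved disclosure."""
    needle = approved.casefold()
    return sorted(
        name for name, text in surfaces.items()
        if needle not in text.casefold()
    )


# Hypothetical rendered text pulled from each active surface.
rendered = {
    "chat_widget_header": "Ask us anything. This response was generated by an AI assistant.",
    "support_page": "Need help? Chat with our team.",
}
missing = audit_surfaces(rendered)
print(missing)
```

Any surface in the returned list is blocked from shipping AI output until the disclosure is restored.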

The New North Star: Pre-Launch Gate Passage

Stop measuring AI ethics by abstract principle or post-launch incident count. Start measuring it by gate-passage rate before launch and incident-response quality after launch.

A brand running The Pre-Launch Guardrail Protocol with 100 percent gate passage and a documented incident-response runbook has a defensible position when the next regulator letter arrives. A brand without those mechanics has the position Air Canada had: the AI did it, and the tribunal rejected the defence. Eight hundred and twelve dollars per incident, multiplied by every customer who relied on a bad AI output, plus the legal costs and the brand-trust drag.

The Pre-Launch Guardrail Protocol is not ethics philosophy. It is four checklist gates with named owners and SLAs. Brands that ship AI without the gates are not "moving fast". They are running an exposure they have not measured. The brands running the discipline get the AI value with the legal and brand risk contained, and that is the only operator-grade definition of ethics that matters in 2026.
