Uncommon Insights
AI Optimization

Voice Search Optimization Built for Transactional Intent

10 min read · 3 February 2026

Most of what gets written about voice search for ecommerce is built for the wrong device. The standard playbook chases featured snippets on Alexa and Google Home for informational queries about how to do things, what something is, or where something comes from. That work is real, but it is not where physical product brands earn revenue from voice. The actual voice money is moving through Google Assistant on Android phones and Siri on iPhones, where the shopper is asking a transactional question on a device that already has their payment credentials, their shipping address, and their app stack. The brand that ranks for "buy organic coffee beans 250g" inside Google Assistant earns the order. The brand that ranks for "what is single origin coffee" on Alexa earns nothing.

The gap between those two queries is the gap between informational and transactional intent, and it is the central blind spot in almost every voice-search guide you can find in 2026.

The Smart Speaker Trap That Wastes Voice SEO Effort

Voice-commerce research notes that pages correctly applying Google's Speakable schema have a meaningfully higher chance of being selected as a source for Google Assistant. Voice commerce 2026 walks through the schema mechanics and the ecommerce-specific build steps. Most DTC brands skip the schema markup entirely, which means the brand is leaving an open lane on transactional voice queries that the platform itself prioritises.

The reason brands skip Speakable is that the standard voice-search advice tells them to focus somewhere else. Generic checklists push FAQ pages, conversational keywords, long-tail informational targets, and snippet wins. All of that work is calibrated for smart speaker queries, which are dominated by informational and navigational intent. The shopper asking Alexa a question is rarely about to buy. The shopper asking Google Assistant on their phone, with the shopping app suite installed, often is.

Voice commerce 2026 breaks down how Alexa and Siri actually surface purchase decisions. The honest read is that smart speaker shopping has plateaued for physical goods, while in-app voice shopping through Google Assistant and Siri Shopping has continued to grow. The traffic the brand wants is on the phone, not on the speaker. The standard playbook keeps optimising for the speaker because that is where the snippet wins are easiest to measure.

Admetrics voice DTC covers the DTC-specific context and surfaces the same gap. Brands invest in voice-search SEO without separating intent layers, the work concentrates on the easiest snippet targets, and the transactional surface goes uncovered. The brand reports voice-search wins to leadership. Leadership reads voice-attributed revenue and finds none. The two reports are about different surfaces and different intents, and neither team realises until the renewal review.

The schema gap compounds the problem. Speakable schema, structured price markup, availability markup, and variant attributes are the four signals an in-app assistant needs to read a product result aloud and route the shopper into a purchase flow. Wizzy ecommerce voice documents the AI-assistant context for ecommerce voice ranking and lines up which structured data each assistant prioritises. Brands that ship the FAQ pages without shipping the structured product data are optimising the informational layer and leaving the transactional layer naked.

You might think the structured data work is something the platform takes care of automatically. It does not. Shopify ships baseline product schema. The Speakable schema, the structured FAQ markup, and the variant-level price and availability attributes are operator-side work. The brands that have done it have a working transactional voice channel. The brands that have not done it have a voice-search SEO investment that produces snippets nobody buys from.

The Voice Intent Architecture

I call the fix The Voice Intent Architecture. It is a three-layer separation of voice work, with the SEO investment concentrated where physical product purchases actually happen. Every voice-search initiative gets sorted into one of three intent layers: informational, navigational, or transactional. The brand's effort, schema, and content production weight toward the transactional layer.

The informational layer is the FAQ-and-content surface aimed at smart speakers and assistant general-knowledge queries. The work is real but the revenue contribution is small for physical products. The architecture allocates 10 to 20 percent of voice-search effort here, no more.

The navigational layer is the brand-name and product-line query surface, where the shopper asks for the brand directly and expects to be routed to the brand's site or app. The work is mostly hygiene: brand schema, accurate Google Business Profile data, app linking, and consistent NAP (name, address, phone) information across directories. The architecture allocates another 10 to 20 percent of effort here, focused on making sure the brand is reachable when asked for by name.

The transactional layer is where the architecture concentrates the remaining 60 to 80 percent of effort. This is the in-app assistant surface where Google Assistant on Android and Siri on iOS read product results aloud, including price, availability, and variant attributes. Every PDP gets Speakable schema. Every product gets structured price markup with currency. Every variant gets availability markup at the SKU level. Every product line gets the variant attributes (size, colour, capacity) marked up so the assistant can read them aloud and let the shopper specify a variant by voice.

I have walked The Voice Intent Architecture through brand stacks across apparel, supplements, and small appliances. The pattern at the start of the rebuild is consistent. The brand has a reasonable FAQ presence, decent informational ranking, and almost no Speakable or variant-level structured data. The transactional surface is empty. The fix is not more content. The fix is structured product data on the surface where in-app assistants are already trying to find it.

Evinent voice search covers the broader voice-search behaviour shift and shows how product discovery has moved off smart speakers and onto in-app assistants for transactional queries. The architecture treats that shift as the design constraint. The brand is not trying to win the smart speaker. The brand is trying to be the result Google Assistant reads aloud when the shopper says "buy lavender shampoo 250ml."

Phase 1: Intent Classification (Days 1-30)

Day 1 is the query inventory. Pull two data sources. The first is Google Search Console, filtered for queries that match conversational patterns: queries starting with "how", "what", "where", "when", "buy", "order", or "find". The second is the in-app analytics for the brand's own iOS and Android apps if they exist, or the assistant-attributed traffic in GA4 for brands without a native app. Together, the two sources surface the queries the brand is currently visible for.

Build a single spreadsheet with six columns: Query, Source Surface, Intent Layer, Current Rank, Schema Present (Y/N), Variant-Level Data (Y/N). The Intent Layer column drives the rebuild. Every query is sorted as informational, navigational, or transactional. "How do I clean my coffee grinder" is informational. "Brand name coffee beans" is navigational. "Buy 250g whole bean coffee delivered" is transactional. The sorting is not always clean, but the rough cut is enough to identify where the effort is currently going.
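A rough first pass at the Intent Layer column can be automated before a human reviews the edge cases. The sketch below is a minimal keyword-rule classifier, not a definitive method: the trigger-word lists and the brand name "acme" are hypothetical placeholders to be tuned against the brand's own inventory.

```python
import re

# Hypothetical trigger words; tune these against the brand's own query inventory.
TRANSACTIONAL_WORDS = {"buy", "order", "purchase", "reorder", "delivered", "delivery"}
INFORMATIONAL_STARTS = ("how", "what", "where", "when", "why", "which")


def classify_intent(query: str, brand_terms: set[str]) -> str:
    """Return 'transactional', 'navigational', or 'informational' for one query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return "informational"
    # Purchase words win first: "where can I buy ..." is still a purchase query.
    if TRANSACTIONAL_WORDS.intersection(words):
        return "transactional"
    # Question openers mark the informational layer.
    if words[0] in INFORMATIONAL_STARTS:
        return "informational"
    # A brand name with no purchase word or question opener is navigational.
    if brand_terms.intersection(words):
        return "navigational"
    return "informational"


queries = [
    "How do I clean my coffee grinder",
    "Acme coffee beans",
    "Buy 250g whole bean coffee delivered",
]
for q in queries:
    print(q, "->", classify_intent(q, brand_terms={"acme"}))
```

The rule order is the design choice that matters: purchase words are checked before question openers so that mixed queries land in the transactional layer, which is where the architecture wants the effort weighted.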

Omnia voice retail covers the retail-specific view of voice-search purchase intent and gives the operator a credible reference for what transactional queries look like across categories. Use the patterns in that piece as a vocabulary check against the brand's own query inventory.

Week 2 is the schema audit. For each transactional query, check the source page for four schema elements: Speakable, structured price, structured availability, and variant attributes. Most pages will have one or two of the four. Almost no pages will have all four. The gap is the work queue for Phase 2. Mastroke voice Shopify covers the Shopify-stack voice optimisation guidance and lines up which schema can be added through theme-level edits versus which require app-level injection.
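The four-element check can be scripted once the JSON-LD blocks have been extracted from each page. The sketch below assumes the blocks are already parsed into Python dictionaries. It also assumes a convention in which a variant counts as marked up when every offer carries its own sku and name. Both are illustrative assumptions, not a description of how any assistant evaluates a page.

```python
def audit_page(jsonld_blocks: list[dict]) -> dict[str, bool]:
    """Report which of the four schema elements a page's JSON-LD contains."""
    found = {
        "speakable": False,
        "price": False,
        "availability": False,
        "variant_attributes": False,
    }
    for block in jsonld_blocks:
        if "speakable" in block:
            found["speakable"] = True
        # Accept a top-level Product or one nested under mainEntity.
        product = block if block.get("@type") == "Product" else block.get("mainEntity", {})
        if not isinstance(product, dict) or product.get("@type") != "Product":
            continue
        offers = product.get("offers", [])
        if isinstance(offers, dict):  # a single offer may be an object, not a list
            offers = [offers]
        if not offers:
            continue
        if all("price" in o and "priceCurrency" in o for o in offers):
            found["price"] = True
        if all("availability" in o for o in offers):
            found["availability"] = True
        # Assumed convention: each offer identifies its variant with a sku and a name.
        if all("sku" in o and "name" in o for o in offers):
            found["variant_attributes"] = True
    return found


page = [{"@type": "Product", "name": "House Blend",
         "offers": {"@type": "Offer", "price": "12.50", "priceCurrency": "GBP"}}]
result = audit_page(page)
missing = [element for element, present in result.items() if not present]
print("Missing:", missing)
```

The list of missing elements per page is what feeds the Phase 2 work queue.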

Week 3 and Week 4 are the priority sort. Not every transactional query is worth the schema investment. Sort the inventory by query volume, current rank, and product margin. The brand should be working on the high-volume, high-margin, mid-rank queries first. Those are the queries where a schema investment can move the brand from rank 5 to rank 1 inside an in-app assistant result, and the resulting transactional voice traffic actually pays for the work. Low-volume queries and high-rank queries get worked later or not at all.
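The priority sort can be expressed as a filter plus a ranking. The sketch below is one possible scoring, under stated assumptions: the volume floor of 100, the "mid-rank" band of positions 2 to 10, and the volume-times-margin score are all hypothetical choices rather than thresholds from any platform.

```python
from dataclasses import dataclass


@dataclass
class QueryRow:
    query: str
    monthly_volume: int
    current_rank: int    # 1 is the top result
    unit_margin: float   # contribution margin per order, in the brand's currency


def prioritise(rows: list[QueryRow], min_volume: int = 100,
               rank_band: tuple[int, int] = (2, 10)) -> list[QueryRow]:
    """Keep high-volume, mid-rank queries and order them by volume x margin."""
    low, high = rank_band
    shortlist = [
        r for r in rows
        if r.monthly_volume >= min_volume and low <= r.current_rank <= high
    ]
    return sorted(shortlist, key=lambda r: r.monthly_volume * r.unit_margin, reverse=True)


inventory = [
    QueryRow("buy 250g whole bean coffee", 900, 5, 6.0),
    QueryRow("order decaf coffee pods", 2000, 1, 8.0),     # already rank 1: deferred
    QueryRow("buy coffee grinder brush", 40, 4, 9.0),      # low volume: deferred
    QueryRow("buy espresso beans 1kg", 500, 7, 12.0),
]
work_queue = prioritise(inventory)
for row in work_queue:
    print(row.query, row.monthly_volume * row.unit_margin)
```

Queries that fall outside the band are not deleted from the inventory. They simply stay out of the first work queue, matching the "later or not at all" rule above.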

Phase 2: The Transactional Layer Build (Month 2-6)

Month 2 is the Speakable schema rollout. Every PDP in the priority sort gets Speakable schema covering the product name, the price, the availability status, and the primary variant attributes. The schema is a JSON-LD block, injected into the PDP head, that Google Assistant reads as a candidate for spoken results. The work is theme-level for most Shopify brands and app-level for brands on more locked-down themes. The fields the assistant reads aloud are name, offers, priceCurrency, price, and availability. Variant-level schema lives inside the offers array, with one offer entry per variant SKU.
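As an illustration of the block described above, the sketch below generates a JSON-LD string with one Offer per variant SKU. Schema.org defines the speakable property on WebPage and Article rather than on Product, so this version wraps the Product in a WebPage via mainEntity. The CSS selectors, the example URL, and the input field names (sku, label, price, in_stock) are hypothetical and would need to match the brand's own theme and catalogue export.

```python
import json


def build_pdp_jsonld(name: str, page_url: str, currency: str, variants: list[dict]) -> str:
    """Return a JSON-LD string for one PDP with one Offer per variant SKU."""
    offers = [
        {
            "@type": "Offer",
            "sku": v["sku"],
            "name": v["label"],  # variant attribute in words, e.g. "Whole bean, 250g"
            "price": f'{v["price"]:.2f}',
            "priceCurrency": currency,
            "availability": "https://schema.org/InStock"
            if v["in_stock"] else "https://schema.org/OutOfStock",
        }
        for v in variants
    ]
    page = {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "url": page_url,
        "speakable": {
            "@type": "SpeakableSpecification",
            # Hypothetical selectors: point these at the theme's real title and price elements.
            "cssSelector": [".product-title", ".product-price"],
        },
        "mainEntity": {"@type": "Product", "name": name, "offers": offers},
    }
    return json.dumps(page, indent=2)


markup = build_pdp_jsonld(
    name="House Blend Coffee",
    page_url="https://example.com/products/house-blend",
    currency="GBP",
    variants=[
        {"sku": "HB-WB-250", "label": "Whole bean, 250g", "price": 12.5, "in_stock": True},
        {"sku": "HB-GR-250", "label": "Ground, 250g", "price": 12.5, "in_stock": False},
    ],
)
print(markup)
```

The returned string is what would sit inside a script tag of type application/ld+json in the PDP head. Validate the output with a structured data testing tool before rollout.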

Month 3 is the variant-level structured data rebuild. A coffee brand selling whole bean and ground variants of the same product needs the schema to expose the variant choice as a parameter the assistant can read aloud and accept by voice. A skincare brand selling 50ml and 100ml variants needs the same treatment. The structured data has to expose price, availability, and variant attribute at the SKU level, not the product level. Most brands expose price and availability only at the product level, which means the assistant cannot offer the shopper a variant choice without a clarification round-trip that breaks the purchase flow.

The operator example worth pulling is Sonos. The brand structures product data at the variant level, exposes availability and price for each variant, and ships Speakable schema on the high-priority PDPs. The result is that Google Assistant can read a Sonos product result aloud, including the variant choice, and route the shopper into a purchase flow without a navigation round-trip. Patagonia runs a similar discipline with structured product data on its high-margin lines. Both brands treat the structured data as a transactional asset, not a hygiene chore.

Month 4 to Month 6 is the mobile site-speed and in-app review pass. Google Assistant requires a mobile-friendly page that loads quickly before it will read a result aloud. PageSpeed targets in the green band on mobile are the floor, not the ceiling. Brands that miss the mobile site-speed target lose voice-eligibility regardless of how good the schema is. The architecture treats site-speed as a voice ranking factor, not a UX afterthought.

The in-app review is where the brand confirms the work landed. Open Google Assistant on Android. Speak the priority transactional query. Note whether the brand's product result reads aloud, whether the variant attributes are spoken, and whether the path-to-purchase routes into the Google Shopping flow or back to the brand's own app or site. Repeat on Siri Shopping on iOS. Document the result for every priority query. The voice-attributable session count in GA4 should start moving inside two to three months of the schema rebuild landing.

Phase 3: The Per-SKU Voice Scorecard

Month 7 onwards, once the Phase 2 build has landed, is the steady-state discipline. The brand runs a per-SKU voice scorecard that tracks four signals: in-app assistant visibility (the SKU surfaces in spoken results), variant readability (the assistant reads the variant attributes aloud correctly), purchase-flow routing (the shopper can complete a purchase by voice), and voice-attributable session count in GA4. The scorecard is updated quarterly, with the priority SKU list re-sorted based on which products are surfacing and which are not.
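The scorecard is simple enough to hold as one record per SKU. The sketch below is a minimal version of that record and of the quarterly re-sort. The field names are hypothetical, and the ordering rule (most failing signals first, then fewest sessions) is an assumed one.

```python
from dataclasses import dataclass


@dataclass
class SkuVoiceScore:
    sku: str
    assistant_visible: bool        # SKU surfaces in spoken results
    variant_read_correctly: bool   # variant attributes are read aloud correctly
    purchase_by_voice: bool        # shopper can complete a purchase by voice
    voice_sessions: int            # voice-attributable GA4 sessions this quarter

    def failing_signals(self) -> list[str]:
        checks = {
            "assistant_visible": self.assistant_visible,
            "variant_read_correctly": self.variant_read_correctly,
            "purchase_by_voice": self.purchase_by_voice,
            "voice_sessions": self.voice_sessions > 0,
        }
        return [name for name, ok in checks.items() if not ok]


def resort_priority(scores: list[SkuVoiceScore]) -> list[SkuVoiceScore]:
    """Put SKUs with the most failing signals first, then those with the fewest sessions."""
    return sorted(scores, key=lambda s: (-len(s.failing_signals()), s.voice_sessions))


scorecard = [
    SkuVoiceScore("A", True, True, True, 120),
    SkuVoiceScore("B", False, False, False, 0),
    SkuVoiceScore("C", True, False, True, 30),
]
for score in resort_priority(scorecard):
    print(score.sku, score.failing_signals())
```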

The scorecard exposes the schema decay that most brands miss. Theme updates can break Speakable markup. App migrations can break variant-level structured data. Shopify's product taxonomy changes can shift how variant attributes are exposed. Without the per-SKU scorecard, the schema rebuild from Phase 2 silently degrades over six to twelve months, and the brand loses the voice traffic without realising it.
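Schema decay can be caught by comparing audit snapshots from one review to the next. The sketch below assumes each snapshot maps a SKU to the set of schema elements found on its page. The snapshot format and the SKU names are hypothetical.

```python
def find_schema_decay(previous: dict[str, set[str]],
                      current: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each SKU to the schema elements it had at the last review but lacks now."""
    decayed = {}
    for sku, before in previous.items():
        # A SKU missing from the current snapshot counts as having lost everything.
        lost = before - current.get(sku, set())
        if lost:
            decayed[sku] = lost
    return decayed


last_quarter = {
    "HB-WB-250": {"speakable", "price", "availability", "variant_attributes"},
    "HB-GR-250": {"speakable", "price", "availability"},
}
this_quarter = {
    "HB-WB-250": {"price", "availability", "variant_attributes"},  # theme update dropped Speakable
    "HB-GR-250": {"speakable", "price", "availability"},
}
regressions = find_schema_decay(last_quarter, this_quarter)
print(regressions)
```

Running the comparison after every theme update or app migration, rather than only at the quarterly review, shortens the window in which broken markup goes unnoticed.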

The Metric That Replaces Snippet Wins

Stop reading voice-search performance through snippet count or smart speaker visibility. Both of those metrics are upstream of the metric that pays the bills. The metric that proves The Voice Intent Architecture is working is voice-attributable transactional sessions per month, plus the share of those sessions that completed a purchase, sustained across consecutive monthly reviews. That number is reportable in dollars. That number is owned by an operator. That number moves only when the structured data on the transactional surface is current.
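The headline metric reduces to two numbers per month and a check that they hold across reviews. The sketch below assumes the monthly session and purchase counts have already been exported from analytics. The sample figures are invented for illustration, and "sustained" is interpreted here as sessions not falling across the last few monthly reviews, which is an assumed definition.

```python
def voice_metrics(monthly: list[tuple[str, int, int]]) -> list[dict]:
    """Each tuple is (month, voice_transactional_sessions, voice_purchases)."""
    rows = []
    for month, sessions, purchases in monthly:
        share = purchases / sessions if sessions else 0.0
        rows.append({"month": month, "sessions": sessions, "purchase_share": share})
    return rows


def sustained(rows: list[dict], reviews: int = 3) -> bool:
    """True when sessions did not fall across the last `reviews` monthly reviews."""
    recent = [r["sessions"] for r in rows[-reviews:]]
    return len(recent) == reviews and all(b >= a for a, b in zip(recent, recent[1:]))


# Invented sample data, not results from any brand.
report = voice_metrics([
    ("2026-01", 200, 10),
    ("2026-02", 240, 12),
    ("2026-03", 300, 21),
])
for row in report:
    print(row["month"], row["sessions"], f'{row["purchase_share"]:.1%}')
print("Sustained:", sustained(report))
```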

The brands that complete this rebuild end up with marginally fewer informational snippet wins, materially more variant-level structured data on their PDPs, and a voice-attributable session count that grows quarter on quarter as in-app assistants surface the brand's products for spoken transactional queries. The snippet softness is the informational layer falling out of the architecture's investment priority. The session growth is the transactional layer being built where it actually earns revenue.

The Voice Intent Architecture does not abandon the smart speaker. It deprioritises it. The brand that wants the voice money concentrates the work where the assistants are routing transactional queries, ships the structured data the platforms prioritise, and reads the result in voice-attributable revenue rather than in snippet counts. That is what an honest voice-search program looks like in 2026, and it is the only configuration of the work that survives a serious cost-of-capital review at the next budget cycle.
