Opinion

{Title} goes here

Notice something off? Remind you of email spam? Time to rethink it.

We spent the last couple of weeks (technically the last 5ish years) exploring short-form copy. This time, we zoomed out, drawing parallels across the industry on how it’s crafted and delivered to users.

What is short-form copy?

Phrases of fewer than 30 words — SEO descriptions, email subject lines, button text, or taglines on a creative ad.

Why does short-form copy matter?

With so much generic content produced by AI today, brevity and on-point relevance are more important than ever. Tiny tweaks in content, such as email subjects, SEO snippets, and button text, can significantly boost traffic and conversions. Even changes as small as adding a single word or using Gen Z slang have led to gains of millions of daily active users for companies.

When you see the notification “Elon just t̶w̶e̶e̶t̶e̶d̶ posted,” remember we carefully inserted “just” because millions more opened it with that word. It was a small tweak, but it made a big impact.

Sounds like just words matter? (No pun intended.)

If these changes are clearly (and counterintuitively) very impactful, why aren’t companies going ham on copy experimentation?

As a matter of fact, 30% of A/B experiments are text changes, but let’s take it a notch further. Why can’t we have a world where every single user sees content that’s uniquely tailored to them?

Enter the long content iteration cycles

If you have worked in Marketing or Growth, you are already smirking. Every small change to content presented directly to users goes through a 1–3 month (sometimes 6 month) cycle before it’s shipped. The iteration cycle is some flavor of these steps:

Step 1: Writing copy variations

Let’s go back to our notification example: “Elon just posted …”.

Traditionally, a content writer would take 1–2 days to write variations of this copy.

“{first name} just posted”, “{first name} recently posted”, “Recent post from {first name}”. You get the idea.

With Gen AI, it takes maybe a few hours, after several prompt iterations.

Step 2: Reviewing and approving the copies

The larger team is tagged for reviews in a Google Doc, and the exchange goes like this:

Product: “Doesn’t even appeal to the user. Have you seen Duolingo notifications?”

Trust & Safety: “Remove the 32 sensitive terms and then we are gold.”

Legal: “[ACP] Let’s trade off the humor for the plain language given the reputational risk”

Marketing: “The branding is off. We sound like a machine, can we stick to our voice?”

Localization expert: “The translations don’t quite capture the nuances of French, so we can’t ship in French-speaking markets. Can we crowd-source these to native French speakers?”

Engineer: “Umm, the remaining copies don’t even work. We already experimented with these last quarter. Can we iterate again?”

The Step 2 → Step 1 → Step 2 loop continues and can eat up months.

So we asked all of these folks: how do you evaluate which variants are “good,” anyway?

“You know, it’s subjective.”

“We just eyeball and pick a few.”

“We add ad hoc rules like — don’t use exclamations.”

“Objective content quality evaluation doesn’t quite exist, we are building some quality principles.”

“This was not a problem when we had limited copies from writers. Gen AI scaled the generation of content, but it doesn’t scale quality. It’s like finding a needle in a haystack of highly generic copies.”
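Those ad hoc rules can at least be codified instead of living in reviewers’ heads. Here is a minimal sketch of a rule-based copy checker; the banned terms, the word cap, and the function names are illustrative assumptions, not anyone’s actual evaluator:

```python
# Minimal rule-based copy checker: encodes the ad hoc review rules
# (banned terms, no exclamations, a word-count cap) as reusable code.
# The specific rules below are illustrative stand-ins.

BANNED_TERMS = {"free money", "guaranteed"}  # stand-in for T&S's sensitive list
MAX_WORDS = 30  # the short-form threshold from above


def check_copy(copy: str) -> list[str]:
    """Return the rule violations for a copy variant (empty list = passes)."""
    issues = []
    lowered = copy.lower()
    for term in BANNED_TERMS:
        if term in lowered:
            issues.append(f"banned term: {term!r}")
    if "!" in copy:
        issues.append("no exclamations")
    if len(copy.split()) > MAX_WORDS:
        issues.append(f"longer than {MAX_WORDS} words")
    return issues


variants = [
    "Elon just posted",
    "Guaranteed! You will love this post",
]
passing = [v for v in variants if not check_copy(v)]
```

A checker like this only filters the obvious failures; it says nothing about brand voice or performance, which is exactly the gap the quotes above describe.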

Step 3: Integrating it with the experimentation code

Integrating every small copy change in code takes time. Here’s why —

  1. Stitching dynamic content with static content: Integrating dynamic content (e.g., {first name}) with static content (‘just posted’) requires incorporating dynamic keys during copy creation and pulling their values at runtime, so the stitched phrase stays coherent.

  2. Rotating copies across users is an unsolved problem: Determining the number of variants per experiment, balancing novelty with consistency for each user, and assessing the effectiveness of algorithms like Multi-armed Bandit is a hard nut to crack.

  3. Latency and scalability: Every copy variation needs to work with low-latency calls to serve users in real-time.
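The first two problems above can be sketched together: templates with dynamic keys filled at render time, plus an epsilon-greedy policy (one simple multi-armed bandit strategy) to rotate variants toward whichever gets opened most. The variant strings and counters here are a hypothetical toy, not a production serving system:

```python
import random

# Variant templates with dynamic keys, filled at render time (problem 1).
VARIANTS = [
    "{first_name} just posted",
    "{first_name} recently posted",
    "Recent post from {first_name}",
]

# Per-variant counters: how often shown, how often opened.
shown = [0] * len(VARIANTS)
opened = [0] * len(VARIANTS)


def pick_variant(epsilon: float = 0.1) -> int:
    """Epsilon-greedy rotation (problem 2): explore a random variant with
    probability epsilon, otherwise exploit the best observed open rate."""
    if random.random() < epsilon or not any(shown):
        return random.randrange(len(VARIANTS))
    rates = [o / s if s else 0.0 for o, s in zip(opened, shown)]
    return max(range(len(VARIANTS)), key=rates.__getitem__)


def render(i: int, **dynamic_values: str) -> str:
    """Stitch dynamic values into the static template at runtime."""
    shown[i] += 1
    return VARIANTS[i].format(**dynamic_values)


i = pick_variant()
notification = render(i, first_name="Elon")
# incrementing opened[i] when the user taps through closes the feedback loop
```

Even this toy hides hard choices — how many variants per experiment, how much per-user consistency to preserve, and how to keep `pick_variant` a low-latency call at serving scale.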

Step 4: Running the experiment

This takes 2–4 weeks, depending on traffic, for the experiment to reach statistical power.
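That multi-week figure falls out of simple power math. A rough sketch using the standard two-proportion sample-size formula at 5% significance (two-sided) and 80% power; the baseline open rate and daily traffic numbers are made up for illustration:

```python
# Approximate users needed per variant to detect a lift in open rate,
# via the standard two-proportion sample-size formula.
# Baseline rate and traffic below are illustrative assumptions.

Z_ALPHA = 1.96   # two-sided 5% significance
Z_BETA = 0.8416  # 80% power


def users_per_variant(p_base: float, p_new: float) -> int:
    """Users per variant to detect a shift from p_base to p_new."""
    variance = p_base * (1 - p_base) + p_new * (1 - p_new)
    n = (Z_ALPHA + Z_BETA) ** 2 * variance / (p_new - p_base) ** 2
    return int(n) + 1


# Detecting a 10% -> 11% open-rate lift needs roughly 15k users per variant;
# at, say, 2,000 eligible users/day split across two variants, that's weeks.
n = users_per_variant(0.10, 0.11)
days = (n * 2) / 2000
```

Note how the required sample grows as the detectable lift shrinks — which is why tiny copy tweaks need either big traffic or long experiments.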

Step 5: Analysis

The stakeholder group reviews experiment metrics and records significant insights in a Google Doc, Excel sheet, or launch email. Occasionally, a disciplined engineer may add them to a running log of learnings in yet another Excel sheet. But these valuable insights usually remain tribal knowledge within the engineering, product management, or marketing team and never programmatically inform the next content iteration.

Step -1: Starting over

It’s a fresh quarter and we are ready to run our next copy iteration. Where do we begin?

From scratch. There is no history of previously tried copies (well, it’s in some Excel sheet). The reviews were scattered across docs, meetings, and Slack. The experiment analysis was in another doc. The brand and tone need to be set all over again.

Psssttt, painful. We just spent months adding a four-letter word to a notification copy.

So, this week we are asking ourselves —

The History & Versioning

  1. What if we have a string management system that maintains the versioning history of all your copy iterations?

  2. What if it lets you create “templates” of prompts for each product so you can always speak your unique voice?

  3. What if it also allows you to instantly incorporate your experiment insights back into the next iteration?

The Quality

  1. What if it gives you a fairly objective quality evaluation of AI-generated copies (e.g., performance, factual correctness, brand and safety adherence)?

  2. What if it offers in-built aggregate industry insights for every type of copy and lets you make them more sophisticated over time?

  3. What if it can evaluate translation quality with nuanced cultural context?

The Efficiency

  1. What if it makes it really easy for non-technical folks to discover, edit, and play with dynamic parts of a copy?

  2. What if it truly lets each function collaborate effectively? What if it lets different teams share learnings across copy iterations (or differentiate them)?

The Vision

  1. Is it time to build the genetic algorithm for copy evolution?

  2. What if we create a world where every single user sees unique content that is tailored to them?

We are building justwords.ai in an attempt to answer these questions. We’d love to hear from you if you’d like to be an early partner, or even if you just want to share an opinion.

Write to us at founders@justwords.ai or find us on LinkedIn.

© 2024. Choice AI Inc. All Rights Reserved
