Design strategy for a branded AI experience

The gray polygons above represent versions of the two final prototypes put forth in our final recommendation to the client. All other confidential aspects of the case have been similarly blinded.

 

The problem

Our financial services client—even just before the recent rise of LLMs—knew it wanted to elevate its customer-facing AI. Initially, our client looked only to marketing, wondering how to promote the AI's capabilities, and was considering a human manifestation for the AI: a human 'mascot' to represent it in an ad campaign.

Instead of starting with "How should this AI be humanized?" or "Should [its] personality deviate from the financial institution's?", we advised our client to consider the dozens of questions that come before those, including:

  • What do customers actually want from a financial services AI? What don’t they want from it?

  • What “Jobs” (as in Jobs-to-be-Done) are customers trying to accomplish that AI can uniquely solve?

  • Do these needs and wants vary across segments?

  • How will the AI’s capabilities grow in the future, and what does our experience need to account for now in order to facilitate that roadmap—and foster accompanying customer expectations?

  • What is our vision for the future of AI at this institution as a whole? How will it drive the business performance metrics that matter?

  • How can our branded AI help boost overall brand appeal?

  • How big of a spotlight should it have?

  • How exactly should it be expressed or manifested?

  • Should it be humanized?

…and so on. Across customer insights, experience, technology planning, business, and brand questions, we showed our client the benefit of taking a step back to assess their vision for their branded AI experience. Ultimately, our client wanted their customers to know about, care about, and use more of their AI's capabilities, in service of shifting perceptions of the financial institution as more cutting-edge and of accelerating the brand's meaning in their customers' lives.

We showed the client that there's a range of manifestations the AI could meaningfully take: from form factors that are more lifelike and humanoid, to something more like a robot or an animal, to simply the brand's logo, to something with no visible representation at all but with diffuse functionality that enriches the entire experience. We demonstrated that the client's branded AI had the potential for felt, useful ubiquity within a consistent experience: one experience that creates outsized engagement through interactions with people, places, and things across the technology of our day, always as itself, with one consistent manifestation.

We argued that once the AI takes a human manifestation, that potential to interact with customers across touchpoints, always available, becomes greatly diminished.

When customers interact with Flo from Progressive, for example, they're not interacting with the Flo they see on the screen. They're just not. The Flo customers can chat with doesn't even call itself Flo—it calls itself "The Essence of Flo" or "Flo Chatbot." With Apple's Siri and its amorphous manifestation, on the other hand, customers really are interacting with the same Siri they see on TV, on their Apple Watch, their HomePod mini, in their car, and on the phone right there in their pocket. Always available, always there, always Siri.

Thus began our exploration into manifestations for this branded AI: a consistent, cohesive experience within the brand that holds its arms open for added capabilities in the future, never diminishing the AI's potential for what it can be to customers, while delivering what customers really want from AI.


The process

Leading with inspiration and cultural savvy to explore different manifestations, we started with pretty big-picture questions:

  • What’s the optimal role for this branded AI in the whole of the brand experience?

  • What makes for an engaging AI, in the financial services category and beyond?

  • What are customer perceptions of and expectations for AI, in the financial services category and beyond?

  • How can we harness opportunities for an engaging, positive AI experience to enhance customer and business outcomes—now and into the future?

  • What manifestations will best allow for flex across every plausible current and future touchpoint, across channels (e.g., mobile, web, advertisements, in-person experiences, social media, mixed reality), to stay relevant in a rapidly changing digital landscape?

These questions together fueled our overall approach:

Step 1) Immerse in AI: Learn the landscape of AI and develop big-picture heuristics for an engaging AI experience

Step 2) Craft territories: Craft concept territories to capture our hypotheses and guide go-forward exploratory design

Step 3) Test & learn: Build prototypes to be researched in iterative test & learn cycles to hone a final recommendation


Step 1: Immerse in AI

I led and conducted the immersion research: reviewing academic literature, going back to basics on the Nielsen usability heuristics, thinking about character design and pulling in Jungian character archetypes—and, most importantly, deeply researching over 70 competitive and comparative examples of AI in the landscape. From that work, I formed opinions on what makes for an engaging AI experience, which I discussed and refined with partners.

Looking at competitors in the financial services category, we gleaned three big takeaways that helped our client understand what they would be directly contrasted with (not shared here).

Looking at out-of-category peers, we drew takeaways that would inform what customers would inevitably compare their AI experiences to—the best-in-class, top-of-the-line AI experiences out there. I looked at the obvious ones—Siri, Alexa—as well as BMW, Clippy (what not to do), Anki’s Vector, Replika, Soul Machines, Slackbot, and many others. I even looked at a few simple point-and-click video games, where there’s the feeling you’re interacting with the game itself—like There is No Game.

Finally, we went so far as to dive into 14 pop-culture examples, too—from R2-D2 to HAL to Samantha from Her—learning how these representations of AI might shape user expectations of AI interactions. We found that when an AI is highly capable, we seem to be more comfortable with a less humanoid manifestation (think R2-D2); when an AI is both highly capable and very humanoid, we seem to be less comfortable with it and find it less approachable. We also gleaned that villainous depictions of AI center overwhelmingly on fears related to lack of control—which has implications for transparency and a high degree of user command in the experience.

From this deep immersion, I developed a general set of takeaways and heuristics that would guide our concept development.

My contribution: I led this phase in its entirety—including the research planning, execution, and synthesis into useful, client-ready takeaways. I presented my work to partners on an ongoing basis, leading discussion, receiving feedback, and refining accordingly. Minimal refinements were needed, and I received exceptionally positive feedback from both internal partners and our client.

Reflection: The audit was huge. We'd need a more strategic sampling if we were to repeat this project today—especially with the proliferation of AI products, but also to stay efficient. Though I've consistently gotten great feedback about my ability to go deep and wide and never lose the forest for the trees—to synthesize great quantities of information into sharp, incisive, useful takeaways—I've also learned the importance of being more efficient and choiceful in making fewer, more powerful strikes. That said, this audit remains one of the work products I'm most proud of, and it set the foundation for years of thought leadership for the firm and for me personally.


Select captures from my audit of over 70 examples in the then-cutting-edge AI landscape.





Step 2: Craft territories

One of four highly distinct exploration territories, created as fodder for an internal brainstorm that would inform the crafting of four Concept Territories for client review and discussion

We broke this part down into two major pieces so that we could hone the strongest possible strategic territories and guide go-forward prototype development for research.

In the first piece, we presented four crafted “exploration territories” to an internal team of designers to ideate against in full brainstorm mode.

In the second piece, with the creative input from our designers, we crafted another set of strategic Concept Territories to present to the client for feedback.


Written to be inspirational and complete with designed mockups from our team, our Concept Territories structured our client conversation. We led our client through our four unique Concept Territories, each with 2–3 distinct concepts underneath to add dimension to the discussion. These ranged from a more concrete, personable (but still non-humanlike) manifestation to the idea that the AI's name and any distinctive manifestation disappear entirely, embedding the AI's functionality throughout the financial institution's experience for a smarter, more responsive brand experience overall.

Through the discussion, our client’s decision-maker surfaced their preferences and strategic priorities. With that input, we moved forward to create three distinct prototypes for testing.

My contribution: I led the development of the exploration territories. I conceptualized each and drafted the writeups based on my deep immersion in the landscape, then took them to partners and made refinements. At the partners' suggestion, I also added references to stand-out examples from the audit that exemplified the spirit of each territory. I planned and managed the logistics of inviting designers to take part in the internal brainstorm, presented the exploration territories internally, and facilitated the brainstorm session. I then synthesized the designers' input into directions for the Concept Territories and wrote their first draft, which I finalized with a partner in real time.

Reflection: Bringing in designers with minimal prior exposure to the project gave our small team fresh eyes and out-of-the-box creative input. I derived great satisfaction from being the person in the room who could supply helpful tidbits from the research to facilitate the designers in doing their thing. Synthesizing the huge body of research into this first pass, for one more creative push of input, bolstered our strategic thinking, which honed the final Concept Territories and made them super-sharp. We received extremely positive feedback from the client about both the range and integrity of the Concept Territories. If I were to do anything differently, I might reconsider how to make the exploration territories even more digestible for the designers, although I know many also appreciated having all the texture to look through. It was important that we started and ended brainstorming together, in two separate sessions a day apart, with independent work time in between for full digestion and creative flourishing. Finding the balance between supplying the texture from research and making it super-digestible is something I know I'll always continue to develop.



 

Step 3: Test & learn

We engaged over 70 participants in deep, in-person 1:1 interviews with interactive prototypes in a test-and-learn setup to better understand how different manifestations of this one AI impact the customer's experience. Participants were all customers of the financial institution, and they ranged in age, gender, digital savviness, and wealth tier (as defined by the financial institution).

We ran three iterative sprints of research.

In each round, participants joined an independent moderator in a room with cameras capturing the overall setting as well as the participant's face and hands as they interacted with the prototype. The moderator followed an established script to meet our research goals and asked follow-up questions that we wrote to her in real time whenever we wanted to hear more. Participants interacted with our live prototypes on a tablet device.

Round 1 of testing focused on broad reactions to the three prioritized concepts, explored through the three interactive prototypes we had developed. We aimed to learn about:

  • Expectations: What do participants expect this AI to be like—in terms of personality, functionality, and how they would interact with it?

  • Affinity: How do people feel about it? What stands out about it? Abstractly, would they want to interact with it more?

Participants each interacted with two of the three prototypes and were asked questions about their expectations and feelings throughout their interactions. Participants were also encouraged to talk aloud as they clicked through the prototypes, to read aloud, and to note their expectations of what would happen before moving to the next step.

After Round 1, we gathered our learnings and adjusted our prototypes. We learned immediately that participants strongly disliked one of our prototypes in a way that traced back to the strategic concept behind it; we considered that a great success of our test-and-learn approach. Accordingly, we narrowed our exploration to the other two prototypes to learn more about what worked well.

In Round 2, we tested the two refined prototypes across a greater range of simulated environments. In addition to questions about expectations and affinity, we delved into:

  • Ubiquity: Does it make sense to consumers that this manifestation of this AI would appear across environments and applications—beyond just the app?

  • Continuity: Within a prototype, from environment to environment, does it feel like "the same [AI]"?

At the client's request, we also tested specific reactions to a possible humanlike manifestation, and ruled it out once and for all.

In Round 3, we tested the two prototypes, further refined based on learnings from the previous round, aiming to gather more evidence for our updated hypotheses to share with the client in our final recommendation.

At the close of testing, we synthesized our learnings and completed one more round of redesign to embody our takeaways, which we presented to the client along with our final recommendation for the direction of the AI's manifestation: decidedly, not a humanlike one.

My contribution: Along with a technical advisor for the A/V setup, I led the coordination with our research partner. I put forth the strategic objectives of the research, reviewed and edited their prepared discussion guide, and attended every single research session (often as the only member of our team). I took high-level notes and engaged with the moderator live to ask follow-up questions where my team wanted to hear more. I engaged in debrief sessions with our research partner, internal partners, and our client. I synthesized takeaways after each round of research and presented them to our team, including partners and designers, for discussion of what changes to make for the next round. I made myself available for any questions our designers had while making changes to the prototypes, and directed them from a strategic perspective. I led small interim presentations to our client on how the research was going. And at the end of the three rounds, I synthesized the first draft of our final takeaways deck for the client, complete with our top three learnings and final recommendation. I directed which illustrations we'd need in the deck and pulled all customer quotes from the transcripts our research partner provided.

Reflection: The synthesis into the final recommendation deck is one of the most satisfying moments of my career to date. After leading so much of such a big effort, it was incredibly rewarding to pull it all together into one concise, elegant deck for our client, with a recommendation we all really believed in. I also thoroughly enjoyed conducting in-person IDIs (in-depth interviews), working closely with the moderator during and between sessions, and engaging in truly iterative test-and-learn research.

Below is an excerpt from our final recommendation deck, blinded to maintain client confidentiality; the final recommendation itself is excluded for the same reason. That said, all of this material is still sensitive and should not be captured from this site without my explicit permission.