Biometric Authentication: Building Trust Through Voice

Industry

Digital Banking

Client

Nubank

Focus Area

Trust & Security

Timeline

2024

1. Overview

Before any financial support interaction can happen, customers must prove who they are. At Nubank, this step had become one of the most stressful and inefficient parts of the phone experience.

Static security questions and PIN-based flows produced long handling times, inconsistent execution, and frequent failures — especially in noisy or emotionally charged contexts.

This case documents how we designed Nubank’s first large-scale voice biometrics platform, transforming authentication from manual interrogation into a scalable, humane, and trustworthy system.

How do we authenticate millions of customers by voice without sacrificing privacy, inclusion, and experience?

2. Strategic Context

By the early 2020s, Nubank’s phone channel was operating at massive scale. Millions of customers relied on it for sensitive matters such as fraud resolution, account recovery, and transaction disputes.

Manual verification created systemic risks:

  • Exposure of personal data

  • Increased vulnerability to social engineering

  • High abandonment and retry rates

  • Rising operational costs

  • Agent training complexity

At the same time, biometric technologies promised efficiency but raised ethical, regulatory, and reputational concerns. The challenge was not only technical. It was emotional, legal, and organizational.

3. Role & Scope

As Product Designer, I owned the end-to-end authentication experience across the phone channel.

My scope included:

  • Enrollment and voice capture journeys

  • Re-authentication and retry logic

  • Failure recovery patterns

  • Alignment with ML confidence models

  • Coordination with Legal, Privacy, Fraud, and Risk

  • Design of agent-facing visibility tools

My responsibility was to design confidence in a system customers could not see.

4. Design Workflow

  1. Designing secure interaction: From interrogation to conversation

    Early prototypes revealed that traditional security language increased anxiety and resistance. Customers perceived the process as accusatory rather than protective. We reframed authentication as a cooperative interaction.

    Instead of demanding compliance, the system explained intent. Instead of repeating rigid scripts, it adapted to context.

    Key design interventions:

    • Conversational, supportive prompts

    • Clear explanations of why voice was being collected

    • Simplified instructions

    • Reduced repetition

    • Explicit confirmation steps

    To support real-world conditions, we introduced:

    • Structured retry guidance

    • Environment-aware messaging

    • Progressive fallback paths

    • Agent-side status indicators

    This ensured that both customers and operators could trust the outcome.
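The retry and fallback behavior described above can be sketched as a small decision function. This is an illustrative sketch only — the names (`RetryPolicy`, `next_step`), the attempt limit, and the prompt wording are assumptions for this write-up, not the production system:

```python
from dataclasses import dataclass
from enum import Enum

class Outcome(Enum):
    VERIFIED = "verified"   # voice matched the enrolled signature
    RETRY = "retry"         # guide the customer and capture again
    FALLBACK = "fallback"   # route to an alternative verification path

@dataclass
class RetryPolicy:
    max_attempts: int = 3   # assumed limit; real values were tuned in pilots

def next_step(attempt: int, matched: bool, noisy: bool,
              policy: RetryPolicy = RetryPolicy()) -> tuple:
    """Decide what happens after one voice-capture attempt."""
    if matched:
        return Outcome.VERIFIED, "Thanks, you're verified."
    if attempt < policy.max_attempts:
        # Environment-aware messaging: noisy calls get capture guidance
        # instead of a bare "try again".
        if noisy:
            return Outcome.RETRY, ("It sounds a bit noisy. If you can, move "
                                   "somewhere quieter and repeat the phrase.")
        return Outcome.RETRY, "Let's try once more. Please repeat the phrase."
    # Progressive fallback: the customer is never trapped in automation.
    return Outcome.FALLBACK, "No problem, let's verify you another way."
```

For example, a third failed attempt yields the fallback outcome, which an agent-side status indicator can surface to the operator instead of looping the customer.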


  2. Balancing accuracy, inclusion, and dignity

    Biometric systems are inherently sensitive to noise, illness, accent variation, aging, and voice changes. Optimizing only for model accuracy would systematically exclude vulnerable groups. Instead, we designed progressive verification flows that balanced:

    • Security requirements

    • Accessibility

    • User dignity

    • Operational efficiency

    Clear escape routes and alternative paths ensured that customers were never trapped by automation.
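In practice, the progressive flow amounts to mapping a model confidence score onto graded outcomes rather than a single pass/fail cut. A minimal sketch, with illustrative thresholds (the real values were negotiated between design, ML performance, and risk tolerance, and are not shown here):

```python
def verification_path(confidence: float,
                      accept: float = 0.90,
                      step_up: float = 0.60) -> str:
    """Map a biometric confidence score (0.0-1.0) to a graded outcome.

    Threshold values here are illustrative, not the production settings.
    """
    if confidence >= accept:
        return "verified"    # high confidence: no extra friction
    if confidence >= step_up:
        return "step_up"     # mid band: one lightweight additional check
    return "human_path"      # low band: escape route to an agent-led flow
```

A single hard threshold would push every mid-band customer into failure; the graded mapping keeps security strict at the top while preserving an escape route at the bottom.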

“Felipe consistently aligned product decisions with privacy and regulatory standards, making complex security topics accessible and collaborative.”

Legal & Privacy Lead

5. Trade-offs & Decisions

  1. Security vs Inclusion

    Early optimization efforts focused heavily on improving biometric accuracy. However, stricter thresholds disproportionately affected customers in noisy environments, with speech impairments, strong accents, or temporary voice changes.

    Maximizing precision alone would systematically exclude vulnerable users. I advocated for progressive verification flows that balanced security requirements with accessibility and dignity. This position required constant negotiation between UX, ML performance, and risk tolerance.


  2. Critical Trade-offs

    We deliberately accepted slightly lower automation rates to preserve inclusivity and reliability.

    • Precision vs Coverage

    • Automation vs Human Override

    • Security vs Convenience


  3. Missteps & Corrections

    Early pilots revealed high false-negative rates in noisy environments and misalignment between UX messaging and ML thresholds. These issues increased customer frustration and agent escalations. Corrections included:

    • Recalibrating thresholds

    • Simplifying capture instructions

    • Improving noise-handling guidance

    • Adjusting pacing

    These changes stabilized validation rates.

6. Experimentation

  1. Learning without compromising trust

    We launched the platform through tightly controlled pilots and continuous monitoring. Early experiments surfaced important weaknesses:

    • High noise-related failure rates

    • Selection bias in test groups

    • Misalignment between UX messaging and ML thresholds

    • Increased unanswered calls and return rates

    Rather than framing these as model limitations, we treated them as system design problems. We iterated across multiple dimensions simultaneously:

    • Prompt language

    • Timing and pacing

    • Confidence thresholds

    • Retry limits

    • Recovery logic

    Close monitoring and rapid iteration led to significant improvements, including a 44% increase in first-attempt validation.

    Because voice data is highly sensitive, governance was embedded from the start. We formalized:

    • Explicit consent flows

    • Transparent data usage communication

    • Opt-out mechanisms

    • Retention controls

    • Auditability

    Compliance with LGPD and internal privacy standards was treated as a design constraint.
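These governance requirements translate naturally into data-model constraints. Below is a hedged sketch of what a consent record might look like — the class, its field names, and the retention window are assumptions for illustration, not Nubank's actual schema:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class VoiceConsentRecord:
    """Illustrative consent record for voice-signature collection."""
    customer_id: str
    consented_at: datetime            # explicit, timestamped opt-in
    purpose: str = "phone-channel voice authentication"
    opted_out: bool = False           # opt-out must be honored immediately
    retention_days: int = 365         # assumed retention window

    def is_active(self, now: datetime) -> bool:
        """A signature may be used only with live, unexpired consent."""
        expiry = self.consented_at + timedelta(days=self.retention_days)
        return not self.opted_out and now < expiry
```

Auditability then follows from storing such records append-only alongside every verification decision.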

7. Impact

Over time, voice biometrics evolved from a pilot into a foundational identity layer. Outcomes:

  • 44% improvement in first-attempt validation

  • 29.3-second reduction in average handling time

  • 4.8-point increase in tNPS

  • More than 5.3M voice signatures collected

  • Average cost of approximately R$0.12 per verification

  • Recognition as an award-winning initiative

While some secondary metrics (such as unanswered calls and return rates) highlighted areas for refinement, the core hypothesis was validated and guided subsequent iterations.
