Biometric Authentication: Building Trust Through Voice
Industry
Digital Banking
Client
Nubank
Focus Area
Trust & Security
Timeline
2024
1. Overview
Before any financial support interaction can happen, customers must prove who they are. At Nubank, this step had become one of the most stressful and inefficient parts of the phone experience.
Static security questions and PIN-based flows produced long handling times, inconsistent execution, and frequent failures — especially in noisy or emotionally charged contexts.
This case documents how we designed Nubank’s first large-scale voice biometrics platform, transforming authentication from manual interrogation into a scalable, humane, and trustworthy system.
How do we authenticate millions of customers by voice without sacrificing privacy, inclusion, and experience?
2. Strategic Context
By the early 2020s, Nubank’s phone channel was operating at massive scale. Millions of customers relied on it for sensitive matters such as fraud resolution, account recovery, and transaction disputes.
Manual verification created systemic risks:
Exposure of personal data
Increased vulnerability to social engineering
High abandonment and retry rates
Rising operational costs
Agent training complexity
At the same time, biometric technologies promised efficiency but raised ethical, regulatory, and reputational concerns. The challenge was not only technical. It was emotional, legal, and organizational.



3. Role & Scope
As Product Designer, I owned the end-to-end authentication experience across the phone channel.
My scope included:
Enrollment and voice capture journeys
Re-authentication and retry logic
Failure recovery patterns
Alignment with ML confidence models
Coordination with Legal, Privacy, Fraud, and Risk
Design of agent-facing visibility tools
My responsibility was to design confidence in a system customers could not see.
“Felipe consistently aligned product decisions with privacy and regulatory standards, making complex security topics accessible and collaborative.”
Legal & Privacy Lead
5. Trade-offs & Decisions
Security vs Inclusion
Early optimization efforts focused heavily on improving biometric accuracy. However, stricter thresholds disproportionately affected customers in noisy environments and those with speech impairments, strong accents, or temporary voice changes.
Maximizing precision alone would systematically exclude vulnerable users. I advocated for progressive verification flows that balanced security requirements with accessibility and dignity. This position required constant negotiation between UX, ML performance, and risk tolerance.
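The progressive verification idea can be sketched in code. This is a hypothetical illustration, not Nubank's actual logic: the threshold values, factor names, and noise adjustment are all invented for the example. The point is that borderline confidence scores step up to a lightweight extra check instead of failing outright.

```python
# Hypothetical sketch of a progressive verification flow.
# Thresholds and factor names are illustrative, not production values.

from dataclasses import dataclass

ACCEPT_THRESHOLD = 0.92   # high confidence: voice alone is enough
STEP_UP_THRESHOLD = 0.70  # mid confidence: ask one extra lightweight factor

@dataclass
class VerificationResult:
    decision: str        # "accept", "step_up", or "agent_review"
    factors_used: list

def verify(voice_confidence: float, noisy_environment: bool) -> VerificationResult:
    """Route a caller based on biometric confidence instead of a single
    hard cutoff, so borderline scores step up rather than fail."""
    # In noisy conditions, raise the auto-accept bar slightly and lean on
    # the step-up factor more often, rather than rejecting the caller.
    accept_bar = ACCEPT_THRESHOLD + (0.03 if noisy_environment else 0.0)

    if voice_confidence >= accept_bar:
        return VerificationResult("accept", ["voice"])
    if voice_confidence >= STEP_UP_THRESHOLD:
        # One extra low-friction check instead of a full interrogation
        # or an immediate rejection.
        return VerificationResult("step_up", ["voice", "secondary_factor"])
    # Low confidence: hand off to a human agent with context attached.
    return VerificationResult("agent_review", ["voice"])
```

The design choice worth noticing is that no branch ends in a hard "reject": every path keeps the customer moving toward resolution, which is what preserved inclusivity at the cost of slightly lower automation.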
Critical Trade-offs
We deliberately accepted slightly lower automation rates to preserve inclusivity and reliability.
Precision vs Coverage
Automation vs Human Override
Security vs Convenience
Missteps & Corrections
Early pilots revealed high false-negative rates in noisy environments and misalignment between UX messaging and ML thresholds. These issues increased customer frustration and agent escalations. Corrections included:
Recalibrating thresholds
Simplifying capture instructions
Improving noise-handling guidance
Adjusting pacing
These changes stabilized validation rates.
7. Experimentation
Learning without compromising trust
We launched the platform through tightly controlled pilots and continuous monitoring. Early experiments surfaced important weaknesses:
High noise-related failure rates
Selection bias in test groups
Misalignment between UX messaging and ML thresholds
Increased unanswered calls and return rates
Rather than framing these as model limitations, we treated them as system design problems. We iterated across multiple dimensions simultaneously:
Prompt language
Timing and pacing
Confidence thresholds
Retry limits
Recovery logic
Close monitoring and rapid iteration led to significant improvements, including a 44% increase in first-attempt validation.
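The retry and recovery dimensions above can be sketched as a small decision function. Again this is an illustrative assumption, not the shipped implementation: the attempt limit, threshold, and near-miss margin are made up for the example. It shows how near misses receive coaching (pacing, noise guidance) rather than a cold retry, and how exhausted retries never dead-end.

```python
# Hypothetical sketch of retry limits and recovery logic.
# MAX_ATTEMPTS, the threshold, and the near-miss margin are illustrative.

MAX_ATTEMPTS = 3

def next_action(attempt: int, confidence: float, threshold: float = 0.85) -> str:
    """Decide what happens after each capture attempt: pass, retry with
    simplified guidance, or recover gracefully to an alternative path."""
    if confidence >= threshold:
        return "validated"
    if attempt < MAX_ATTEMPTS:
        # Near misses get coaching on pacing and noise, not a bare retry.
        if confidence >= threshold - 0.15:
            return "retry_with_guidance"
        return "retry"
    # Exhausted retries fall back to a non-biometric path instead of
    # stranding the caller.
    return "fallback_to_agent"
```

Tuning these branch points together with prompt language and pacing, rather than tuning the model alone, is what "treating failures as system design problems" meant in practice.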
Because voice data is highly sensitive, governance was embedded from the start. We formalized:
Explicit consent flows
Transparent data usage communication
Opt-out mechanisms
Retention controls
Auditability
Compliance with LGPD and internal privacy standards was treated as a design constraint.
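Treating compliance as a design constraint means the consent, opt-out, retention, and audit requirements listed above become properties of the data model itself. The sketch below is a hypothetical illustration of that idea; the field names and the retention window are assumptions, not Nubank's schema.

```python
# Hypothetical sketch: consent and retention as first-class record fields.
# Field names and the retention window are illustrative assumptions.

from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 365  # illustrative retention window

@dataclass
class ConsentRecord:
    customer_id: str
    consented_at: datetime
    purpose: str = "phone_authentication"  # usage is scoped and explicit
    opted_out: bool = False                # revocable at any time
    audit_log: list = field(default_factory=list)

    def is_usable(self, now: datetime) -> bool:
        """Voice data may be used only with active consent and inside
        the retention window; every check leaves an audit trail."""
        within_retention = now - self.consented_at <= timedelta(days=RETENTION_DAYS)
        usable = (not self.opted_out) and within_retention
        self.audit_log.append((now.isoformat(), "usability_check", usable))
        return usable
```

Encoding the rules this way makes opt-out and expiry enforceable by default rather than dependent on process discipline, which is the practical meaning of governance "embedded from the start."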
8. Impact
Over time, voice biometrics evolved from a pilot into a foundational identity layer. Key outcomes:
44% improvement in first-attempt validation
29.3-second reduction in average handling time
4.8-point increase in tNPS
More than 5.3M voice signatures collected
Average cost of approximately R$0.12 per verification
Recognition as an award-winning initiative
While some secondary metrics (such as unanswered calls and return rates) highlighted areas for refinement, the core hypothesis was validated and guided subsequent iterations.