TOTP Automation Testing: Complete Guide with pyotp & Selenium
TOTP authentication testing is one of the most failure-prone parts of modern QA pipelines. Expiring codes, timing drift, and UI instability can break dozens of tests overnight, often right before release. In this guide, you’ll learn how to automate TOTP authentication tests using pyotp and Selenium, along with strategies to reduce flaky tests in CI/CD environments. For more on TOTP authentication, see the OWASP TOTP Authentication guidelines; for industry-standard security practices, see the OWASP Multifactor Authentication Cheat Sheet.

How to Automate TOTP Authentication Testing
Implementation: Automating TOTP with pyotp & Selenium:
- Generate TOTP using a shared secret (pyotp)
- Sync system time with the authentication server
- Inject OTP into login flow during execution
- Validate success and failure scenarios
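The same pattern applies to Selenium, Cypress, and API testing frameworks. Note that pyotp is a Python library, so JavaScript-based tools such as Cypress need an equivalent TOTP library on their side.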
What is TOTP Authentication?
TOTP (Time-based One-Time Password) is a two-factor authentication method that generates temporary codes using a shared secret key and the current time.
Unlike SMS-based OTPs, TOTP codes:
- Are generated locally on the device
- Expire every 30–60 seconds
- Do not rely on network delivery
- Are more secure against interception attacks
Learn more from the official OWASP guide:
https://cheatsheetseries.owasp.org/cheatsheets/Time-Based_One-Time_Password_Authenticator.html
For a deeper technical explanation of how TOTP works, see this detailed guide on What is TOTP Authentication.
Why Automating TOTP Authentication Testing Matters
Automating TOTP authentication testing is essential for modern QA teams because:
- It ensures secure login flows in CI/CD pipelines
- It eliminates manual OTP entry during testing
- It reduces flaky test failures caused by timing issues
- It improves coverage for security-critical workflows
Without automation, testing TOTP flows becomes slow, error-prone, and difficult to scale.
You can also explore real-world authentication testing strategies in the OWASP Web Security Testing Guide for MFA.
Related Authentication Testing Scenarios
- Testing 2FA flows (TOTP vs SMS OTP)
- Handling OTP bypass in staging environments
- API-level authentication testing strategies
- Automating 2FA testing using Selenium
- Testing OTP authentication flows in CI/CD
- Automating MFA (Multi-Factor Authentication)
- Generating OTP using Python (pyotp)
Advanced TOTP Testing Challenges
Even mature automation setups face edge cases:
- Clock skew tolerance: Servers may allow ±30s drift, and tests must account for this
- Retry logic: OTP may expire mid-test, requiring regeneration strategies
- Backup codes: Often ignored but critical for recovery flow testing
- Rate limiting: Too many OTP attempts can lock accounts during automation
- Multi-device sync issues: Different environments may generate mismatched codes
Handling these correctly is essential for building reliable, production-grade test suites.
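The retry and rate-limiting points above can be combined in one small helper. The sketch below is illustrative only: `login_with_retry`, `generate_code`, and `submit_code` are hypothetical names, and the attempt limit should stay below whatever lockout threshold your application enforces.

```python
import time
from typing import Callable


def login_with_retry(generate_code: Callable[[], str],
                     submit_code: Callable[[str], bool],
                     max_attempts: int = 2,
                     pause_seconds: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Submit a freshly generated code, retrying a bounded number of times.

    Returns the attempt number that succeeded; raises if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        # Regenerate on every attempt so an expired code is never resubmitted
        if submit_code(generate_code()):
            return attempt
        if attempt < max_attempts:
            sleep(pause_seconds)
    # A hard cap keeps automation from tripping account lockout rules
    raise RuntimeError(f"TOTP login failed after {max_attempts} attempts")
```

With pyotp, the bound method `pyotp.TOTP(secret).now` could be passed as `generate_code`, while `submit_code` would wrap whatever UI or API call enters the code and reports success.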
Key Takeaways
- Automating TOTP authentication tests is essential for security-critical applications
- TOTP codes can be generated programmatically using libraries like pyotp
- Comprehensive test scenarios should cover valid code authentication, expired code handling, and invalid code rejection
- Clock synchronization is crucial for TOTP testing
- Integration with test frameworks like Selenium WebDriver is possible
The Real Problem: Why TOTP Tests Break
Here’s a common scenario:
Three days before release, 40+ tests fail.
Nothing changed in your backend; there was just a UI update.
What follows:
- QA engineers spend 2–3 days fixing selectors
- CI/CD pipelines get blocked
- Release timelines slip
- Engineers lose sprint velocity fixing brittle tests instead of shipping features
- Confidence in test suites drops, leading to skipped or ignored failures
In high-release environments, this cycle repeats every sprint.
Core Concept: How TOTP Works Under the Hood
TOTP is not just “time-based codes” — it’s a deterministic cryptographic process built on the HOTP standard.
Here’s what happens internally:
- A shared secret key exists on both the server and the authenticator app
- The current Unix time is divided into fixed intervals (typically 30 seconds)
- This creates a moving value called the time-step, using integer division: T = floor(currentTime / 30)
- The system applies an HMAC hashing algorithm (usually HMAC-SHA1) using the secret key and T
- The output is dynamically truncated to generate a short numeric code (e.g., 6 digits)
Because both client and server use the same inputs (secret + time), they independently generate identical codes without any network transmission.
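To make the steps above concrete, here is a minimal sketch of the computation using only the Python standard library. The function name `totp_at` is an assumption for illustration; in practice pyotp performs the same calculation. The check at the end uses the published RFC 6238 test seed.

```python
import base64
import hashlib
import hmac
import struct


def totp_at(secret_b32: str, unix_time: float, interval: int = 30, digits: int = 6) -> str:
    """Compute the TOTP code for a base32 secret at a given Unix time."""
    key = base64.b32decode(secret_b32, casefold=True)
    # Time-step counter: T = floor(unix_time / interval)
    counter = int(unix_time // interval)
    # HMAC-SHA1 over the counter packed as an 8-byte big-endian integer
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte slice
    offset = digest[-1] & 0x0F
    binary = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(binary % 10**digits).zfill(digits)


# RFC 6238 test seed (ASCII "12345678901234567890"), base32-encoded
RFC_SECRET = base64.b32encode(b"12345678901234567890").decode()
print(totp_at(RFC_SECRET, 59, digits=8))  # RFC 6238 Appendix B lists 94287082
```

Because the time is an explicit argument rather than read from the clock, the same function can produce codes for past or future moments, which is useful when testing expiry behavior.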
Time Drift & Validation Window
To handle slight clock mismatches, most systems allow a ±1 time-step window, meaning:
- Previous code (−30s)
- Current code (0s)
- Next code (+30s)
This is why a code may still work briefly after the authenticator app has rolled over to the next one.
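A simplified model of the window check is sketched below. It assumes the server compares time-step counters and accepts a difference of at most `window` steps; the function names are hypothetical, and real servers may add rules such as rejecting reused codes.

```python
def time_step(unix_time: float, interval: int = 30) -> int:
    """Return the TOTP time-step counter for a Unix timestamp."""
    return int(unix_time // interval)


def accepted_by_server(client_time: float, server_time: float,
                       interval: int = 30, window: int = 1) -> bool:
    """True if a code generated at client_time falls inside the server's window."""
    drift_in_steps = time_step(client_time, interval) - time_step(server_time, interval)
    return abs(drift_in_steps) <= window


server_now = 1_700_000_010  # hypothetical server clock reading
for skew in (-45, -20, 0, 20, 45, 75):
    # Print whether a client whose clock is off by `skew` seconds would pass
    print(skew, accepted_by_server(server_now + skew, server_now))
```

Acceptance depends on where both clocks fall relative to step boundaries, not only on the raw number of seconds of skew, so the same drift can pass at one moment and fail a few seconds later.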
Why this matters for automation
- Even minor clock drift can invalidate tests
- Codes can expire mid-execution
- Parallel tests using the same secret can create race conditions
These low-level mechanics are the root cause of flaky TOTP tests in CI/CD pipelines. The TOTP algorithm is based on RFC 6238 standards, which you can explore in this simplified breakdown: TOTP Complete Guide.
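One way to guard against codes expiring mid-execution is to check how much of the current time-step is left before generating a code. The helper below is a sketch with hypothetical names; the `clock` and `sleep` parameters exist only so the logic can be exercised without real waiting.

```python
import time


def seconds_remaining(now: float, interval: int = 30) -> float:
    """Seconds left before the current TOTP time-step rolls over."""
    return interval - (now % interval)


def wait_for_fresh_step(min_remaining: float = 5.0, interval: int = 30,
                        clock=time.time, sleep=time.sleep) -> float:
    """Sleep into the next step if the current one is about to expire.

    Returns the number of seconds slept (0.0 if no wait was needed).
    """
    remaining = seconds_remaining(clock(), interval)
    if remaining >= min_remaining:
        return 0.0
    sleep(remaining)
    return remaining
```

Calling `wait_for_fresh_step()` immediately before generating the code trades a few seconds of waiting for a code with nearly a full interval of validity. When the server already accepts the previous step, this guard is conservative rather than strictly required.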

Why TOTP Tests Fail in CI/CD Pipelines
Even with automation, TOTP tests fail due to:
- Clock drift between CI runners and auth servers
- Delayed execution causing OTP expiry
- Parallel test collisions using the same secret
- Environment-specific secrets not synced properly
To fix this:
- Use time sync (NTP) in CI environments
- Generate OTP just-in-time (not before page load)
- Use isolated secrets per test run
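Isolated secrets can be generated with the standard library, as in the sketch below. The helper names are hypothetical, and enrolling each secret against a test account is application-specific and not shown. pyotp offers `pyotp.random_base32()` for the same purpose.

```python
import base64
import secrets


def new_test_secret(num_bytes: int = 20) -> str:
    """Return a random base32 secret (20 bytes = 160 bits, the length RFC 4226 recommends)."""
    return base64.b32encode(secrets.token_bytes(num_bytes)).decode("ascii")


def secrets_for_workers(worker_ids):
    """Map each parallel worker id to its own secret so codes never collide."""
    return {worker_id: new_test_secret() for worker_id in worker_ids}


# Hypothetical worker names, e.g. from a parallel test runner
print(sorted(secrets_for_workers(["gw0", "gw1"])))
```

Giving each worker its own secret, and ideally its own test account, removes the race where two parallel tests submit the same code for the same user.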
How to Automate TOTP Authentication Tests
1. Generate TOTP Codes Programmatically
Use a library like pyotp:
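import pyotp

# Example base32 secret for illustration only; use the secret enrolled for your test account
totp = pyotp.TOTP("JBSWY3DPEHPK3PXP")
code = totp.now()  # code for the current 30-second window
print(code)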
2. Integrate with UI Automation (Selenium Example)
import pyotp
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/login")
driver.find_element(By.NAME, "username").send_keys("your_username")
driver.find_element(By.NAME, "password").send_keys("your_password")
# Generate the code right before typing it so it cannot expire while the page loads
totp_code = pyotp.TOTP("JBSWY3DPEHPK3PXP").now()  # example base32 secret; use your own
driver.find_element(By.NAME, "totp").send_keys(totp_code)
driver.find_element(By.ID, "login").click()
For implementation-level guidance and authentication workflows, refer to this TOTP Authentication Documentation.
Comparison Table
| Approach | Setup Time | Coding Required | Maintenance Burden | CI/CD Ready | Best For |
|---|---|---|---|---|---|
| Selenium | 2-4 hrs | High | Medium | Yes | Large teams |
| Cypress | 1-2 hrs | Medium | Low | Yes | Small teams |
| Manual | N/A | N/A | High | No | None |
| Record-and-Playback | 1-2 hrs | Low | High | Limited | None |
| AI/No-Code (Robonito) | 15-30 min | None | Low | Yes | All teams |
For more information on testing best practices, you can refer to Martin Fowler's blog on testing.
Real-World Scenario
A team of 5 QA engineers working on a financial platform had 200+ tests, with a failure rate of 10%. They spent 2-3 days per sprint on test maintenance. After implementing Robonito, they reduced their test maintenance time by 60-80% and improved their test coverage by 20-30%. The team was able to focus on feature development and shipping, rather than test triage.
Tools Breakdown
- Selenium: Best for large teams with complex test scenarios. Hard limitation: locator decay on design system updates.
- Cypress: Best for small teams with simple test scenarios. Hard limitation: single-tab architecture blocks multi-tab OAuth/SSO flows.
- Robonito: Best for teams of all sizes with a need for low-maintenance, high-coverage testing. Hard limitation: requires AI-driven test planning.
Advanced Best Practices
- Data-testid vs CSS vs XPath: data-testid attributes are the most robust locator strategy, but they must be added to the application markup. CSS and XPath locators tied to layout or styling are more prone to decay.
- Test isolation strategies: Give each test its own account, secret, and browser session so parallel runs cannot race on shared state.
- Quarantining flaky tests: Move known-flaky tests into a separate, non-blocking job so they keep running without blocking the pipeline while they are being fixed.
Where Teams Go Wrong
- Automating only the happy path: Edge cases and error handling, such as expired or invalid codes, are left for later and often never covered.
- Over-investing in UI tests: API tests and other critical areas end up under-tested.
- Not tagging tests by stability tier: Without tags, it is difficult to prioritize test maintenance.
When Not to Use This Approach
An AI/no-code approach may not be suitable for teams with very simple test scenarios or those who require a high degree of customization. In such cases, traditional tools like Selenium or Cypress may be a better fit.
Industry Standard for TOTP
TOTP is defined by RFC 6238, which standardizes how time-based one-time passwords are generated using HMAC and shared secrets. Most authenticator apps like Google Authenticator and Microsoft Authenticator follow this standard.
Step-by-Step Workflow
- Setup: Set up the test environment and obtain the TOTP shared secret.
- Code Generation: Use a library like pyotp to generate TOTP codes.
- Test Scenarios: Cover comprehensive test scenarios, including valid code authentication and expired code handling.
- Clock Synchronization: Ensure clock synchronization between the test environment and the system under test.
- Integration: Integrate TOTP generation into end-to-end test flows.
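The scenario step can be sketched as three small tests against a stand-in verifier. Everything here is illustrative: `FakeAuthServer` is a hypothetical substitute for the system under test, and `code_at` is a standard-library stand-in for `pyotp.TOTP(secret).at(unix_time)` so the example runs without dependencies.

```python
import base64
import hashlib
import hmac
import struct


def code_at(secret_b32, unix_time, interval=30, digits=6):
    """Stdlib stand-in for pyotp.TOTP(secret).at(unix_time)."""
    key = base64.b32decode(secret_b32, casefold=True)
    mac = hmac.new(key, struct.pack(">Q", int(unix_time // interval)), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10**digits).zfill(digits)


class FakeAuthServer:
    """Hypothetical verifier that accepts codes within +/- `window` time-steps."""

    def __init__(self, secret_b32, window=1, interval=30):
        self.secret, self.window, self.interval = secret_b32, window, interval

    def verify(self, code, now):
        candidates = (code_at(self.secret, now + k * self.interval, self.interval)
                      for k in range(-self.window, self.window + 1))
        # compare_digest avoids leaking how many characters matched via timing
        return any(hmac.compare_digest(code, c) for c in candidates)


SECRET = base64.b32encode(b"12345678901234567890").decode()  # RFC 6238 test seed
NOW = 1_700_000_000  # fixed timestamp keeps the tests deterministic
server = FakeAuthServer(SECRET)


def test_valid_code_is_accepted():
    assert server.verify(code_at(SECRET, NOW), NOW)


def test_expired_code_is_rejected():
    # A code from five minutes ago is far outside the +/-1 step window
    assert not server.verify(code_at(SECRET, NOW - 300), NOW)


def test_invalid_code_is_rejected():
    # Wrong length, so it can never match a 6-digit code
    assert not server.verify("12345", NOW)
```

Fixing the timestamp instead of reading the clock keeps these tests deterministic. Against a real application, the same three cases would submit the codes through the login form or API instead of calling `verify` directly.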
FAQ
What is TOTP authentication and how does it work?
TOTP is a 2FA method that generates short-lived codes using a shared secret and the current time.
The authenticator app and server independently create the same 6-digit code, which refreshes every 30–60 seconds.
Because codes expire quickly and work offline, TOTP is more secure than SMS OTP.
How do you automate TOTP authentication testing?
Generate OTPs using a shared secret with libraries like pyotp, then inject them into login flows during execution. Combine this with UI tools like Selenium or API-based testing frameworks.
What are the benefits of automating TOTP testing?
The benefits of automating TOTP testing include reduced test maintenance time, improved test coverage, and increased efficiency.
Can I use traditional tools like Selenium or Cypress for TOTP testing?
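Yes. Selenium and Cypress can drive TOTP logins when the code is generated programmatically from the shared secret and entered during the test. Their main limitations are maintenance-related: Selenium locators decay when the UI changes, and Cypress's single-tab architecture blocks multi-tab OAuth/SSO flows.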
How do I ensure clock synchronization between the test environment and the system under test?
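Run a time-sync service such as NTP on CI runners and test machines so their clocks match the authentication server. pyotp reads the local system clock but does not correct it, so a drifting clock produces codes the server rejects once the drift exceeds its validation window.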
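As a rough pre-flight check, the skew between a CI runner and the server can be estimated from the `Date` header of any HTTP response from that server. The sketch below only does the comparison; fetching the header is left out, and the function names and 30-second tolerance are assumptions. The header has one-second resolution and includes network latency, so treat the result as approximate.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(date_header: str, local_now: datetime) -> float:
    """Local clock minus server clock, in seconds (positive = local runs ahead)."""
    server_time = parsedate_to_datetime(date_header)
    return (local_now - server_time).total_seconds()


def assert_clock_in_sync(date_header: str, local_now: datetime,
                         tolerance: float = 30.0) -> None:
    """Raise early if the local clock is too far from the server's clock."""
    skew = clock_skew_seconds(date_header, local_now)
    if abs(skew) > tolerance:
        raise RuntimeError(f"Clock is off by {skew:+.0f}s; check NTP before running TOTP tests")


# Example with a fixed header value; in a real run, pass datetime.now(timezone.utc)
header = "Wed, 21 Oct 2015 07:28:00 GMT"
print(clock_skew_seconds(header, datetime(2015, 10, 21, 7, 28, 12, tzinfo=timezone.utc)))
```

Failing fast with a clear message makes a clock problem visible as an environment issue instead of a batch of unexplained login failures.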
What is the difference between TOTP and HOTP?
Both generate one-time codes for two-factor authentication from a shared secret using HMAC. HOTP (HMAC-based One-Time Password) derives each code from a counter that increments with every use, while TOTP (Time-based One-Time Password) uses the current time-step as that counter, so its codes expire automatically.
Conclusion
If your QA team is losing time fixing flaky TOTP tests, the cost compounds every sprint. Stabilizing authentication flows isn’t optional in modern CI/CD; it’s critical. AI-based tools including Robonito help eliminate selector breakage and reduce maintenance overhead, allowing teams to focus on shipping instead of debugging tests.
