System Usability Scale: 10 Powerful Insights You Need Now
If you’ve ever wondered how to measure the ease of use of a product or system, the System Usability Scale (SUS) is your go-to tool. Simple, reliable, and widely trusted, it’s the gold standard in usability evaluation.
What Is the System Usability Scale (SUS)?

The System Usability Scale, commonly known as SUS, is a 10-item questionnaire designed to assess the perceived usability of a system, product, or interface. Developed in the late 1980s by John Brooke at Digital Equipment Corporation, it has since become one of the most widely used tools in usability testing across industries—from software and websites to medical devices and consumer electronics.
Origins and Development of SUS
The System Usability Scale was first introduced in 1986 as a quick, reliable way to evaluate usability without requiring extensive resources. At the time, usability testing was often complex, time-consuming, and required expert observers. Brooke aimed to create a lightweight, yet effective, method that could be applied across different systems regardless of their complexity or domain.
Brooke’s original research was published in a technical report titled SUS: A Quick and Dirty Usability Scale, which laid the foundation for what would become a cornerstone in human-computer interaction (HCI) research. The beauty of the SUS lies in its simplicity: it doesn’t require observers, video recordings, or task analysis—just a short survey filled out by users after interacting with a system.
Over the decades, the System Usability Scale has been validated across countless studies and translated into multiple languages. Its robustness and adaptability have made it a favorite among UX researchers, product managers, and designers alike. You can read more about its origins on the Usability.gov website, a resource maintained by the U.S. Department of Health and Human Services.
Structure of the SUS Questionnaire
The System Usability Scale consists of 10 statements, each rated on a 5-point Likert scale ranging from “Strongly Disagree” (1) to “Strongly Agree” (5). The statements alternate between positive and negative phrasing to reduce response bias. Here are the standard 10 items:
1. I think that I would like to use this system frequently.
2. I found the system unnecessarily complex.
3. I thought the system was easy to use.
4. I think that I would need the support of a technical person to be able to use this system.
5. I found the various functions in this system were well integrated.
6. I thought there was too much inconsistency in this system.
7. I would imagine that most people would learn to use this system very quickly.
8. I found the system very cumbersome to use.
9. I felt very confident using the system.
10. I needed to learn a lot of things before I could get going with this system.

Notice how odd-numbered questions are positively worded (e.g., “easy to use”), while even-numbered ones are negatively worded (e.g., “unnecessarily complex”).
This counterbalancing helps prevent users from mechanically agreeing or disagreeing throughout the survey.
“The SUS is not just a usability metric—it’s a diagnostic tool that gives you insight into user perception with minimal effort.” — Jakob Nielsen, Nielsen Norman Group
Scoring and Interpretation of SUS Results
Scoring the System Usability Scale is straightforward but requires careful attention to the alternating positive and negative items. Here’s how it works:
- For odd-numbered items (positive): Subtract 1 from the user’s response (e.g., if they answered 4, use 3).
- For even-numbered items (negative): Subtract the user’s response from 5 (e.g., if they answered 2, use 3).
- Sum all the converted scores and multiply by 2.5 to get a final score between 0 and 100.
For example, if a user gives all 4s across the 10 items, the calculation would be:
- Items 1,3,5,7,9: (4-1) = 3 → 5 items × 3 = 15
- Items 2,4,6,8,10: (5-4) = 1 → 5 items × 1 = 5
- Total = 15 + 5 = 20 → 20 × 2.5 = 50
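The scoring steps above translate directly into code. The following sketch is a hypothetical helper (the function name and input validation are my own, not part of any official SUS tooling):

```python
def sus_score(responses):
    """Compute a SUS score from ten 1-5 Likert responses.

    Odd-numbered items (positions 1, 3, ...) are positively worded:
    contribution = response - 1. Even-numbered items are negatively
    worded: contribution = 5 - response. The summed contributions
    (0-40) are multiplied by 2.5 to yield a 0-100 score.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(not 1 <= r <= 5 for r in responses):
        raise ValueError("Each response must be between 1 and 5")
    total = sum(r - 1 if i % 2 == 0 else 5 - r
                for i, r in enumerate(responses))
    return total * 2.5

# The worked example above: a user answering 4 on every item.
print(sus_score([4] * 10))  # 50.0
```

Running this on the all-4s example reproduces the score of 50 computed by hand above.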
The resulting score of 50 sits at the scale’s midpoint, though it falls below the empirical average of 68 reported across studies. According to research by Bangor, Kortum, and Miller (2008), SUS scores can be interpreted using a grading scale:
- 90–100: Excellent
- 80–89: Good
- 70–79: Acceptable
- 60–69: Poor
- 50–59: Awful
- Below 50: Unacceptable
It’s important to note that the SUS does not diagnose specific usability problems—it measures overall perceived usability. A low score tells you something is wrong, but not exactly what. That’s why it’s often paired with qualitative feedback or observational testing.
Why the System Usability Scale Is So Widely Used
The enduring popularity of the System Usability Scale isn’t accidental. Its widespread adoption stems from a powerful combination of simplicity, reliability, and versatility. Unlike more complex usability metrics that require specialized equipment or extensive training, SUS can be administered by almost anyone, anywhere, and at any stage of product development.
Simplicity and Ease of Administration
One of the biggest advantages of the System Usability Scale is how easy it is to deploy. You don’t need a lab, eye-tracking software, or a team of moderators. All you need is a digital form or printed sheet and a few minutes of a user’s time.
Because the questionnaire is only 10 questions long, it doesn’t burden participants. This increases completion rates and reduces survey fatigue, which is crucial when testing with diverse user groups, including older adults or people with limited technical experience.
Tools like Google Forms, SurveyMonkey, or Typeform make it easy to distribute the SUS digitally. You can embed it directly into user testing sessions, post-study interviews, or even as part of a product’s onboarding flow. The low overhead makes it ideal for startups, academic researchers, and enterprise teams alike.
Reliability and Validity Across Domains
Despite its simplicity, the System Usability Scale has been rigorously tested for reliability and validity. Numerous studies have confirmed its internal consistency (Cronbach’s alpha typically above 0.9), meaning the items in the survey are measuring the same underlying construct—usability.
Research has shown that SUS performs well across a wide range of domains, including:
- Web and mobile applications
- Medical devices (e.g., insulin pumps, patient portals)
- Automotive infotainment systems
- Enterprise software (e.g., CRM, ERP systems)
- Consumer electronics (e.g., smart home devices)
A 2014 study published in the Journal of Usability Studies found that SUS scores were highly correlated with other usability metrics, including task success rates and user satisfaction. This cross-validation strengthens its credibility as a holistic usability indicator.
Moreover, the System Usability Scale has been translated into over 30 languages and adapted for use in non-Western cultures, maintaining its psychometric integrity. This global applicability makes it a truly universal tool.
Cost-Effectiveness for UX Teams
For UX teams operating under tight budgets or aggressive timelines, the System Usability Scale offers a high return on investment. It requires minimal training to administer and analyze, and the data it generates can be used to justify design decisions to stakeholders.
Imagine you’re comparing two versions of a checkout flow. Instead of running a full usability lab study, you can test both versions with 10–15 users each, collect SUS scores, and present a clear, quantifiable comparison. A difference of 15 points (e.g., 65 vs. 80) is often enough to convince product managers to move forward with the better-performing design.
Additionally, because SUS scores are standardized, they allow for benchmarking over time. You can track how usability improves across product iterations, measure the impact of a redesign, or compare your product against competitors (if you have access to their SUS data).
“SUS gives you a usability ‘vital sign’—a quick pulse check on how your product is performing from the user’s perspective.” — Tom Tullis, UX Research Expert
How to Administer the System Usability Scale Correctly
While the System Usability Scale is simple to use, getting accurate and meaningful results depends on proper administration. Even small mistakes in timing, context, or data collection can skew results and lead to incorrect conclusions.
Best Practices for Survey Timing and Context
The timing of the SUS administration is critical. It should be given immediately after the user completes a set of representative tasks with the system. If you administer it too early, users haven’t had enough interaction to form a reliable opinion. Too late, and their memory of the experience may fade.
Ideally, the SUS should follow a usability test session where participants complete core tasks (e.g., signing up, making a purchase, navigating a dashboard). This ensures their feedback is grounded in actual experience rather than first impressions.
Context also matters. If you’re testing a mobile app, make sure users are interacting with it in a realistic environment—on their own device, in a quiet space, without interruptions. Avoid administering the SUS in high-stress situations (e.g., during a timed test with penalties) as this can artificially lower scores.
Avoiding Common Administration Mistakes
Despite its simplicity, there are several pitfalls to avoid when using the System Usability Scale:
- Using it in isolation: SUS measures perception, not performance. Always pair it with behavioral data (e.g., task success, time on task) for a complete picture.
- Administering it before interaction: Never give the SUS as a pre-test. Users can’t rate usability without using the system.
- Changing the wording: The original SUS items are carefully calibrated. Modifying them (e.g., replacing “system” with “app”) can affect reliability.
- Ignoring demographic context: A SUS score of 70 from tech-savvy millennials may mean something very different than the same score from older adults with limited digital experience.

Another common error is treating SUS as a one-time metric. Usability is not static—it evolves with design changes, user learning, and context shifts. Regular SUS assessments help track trends and catch usability regressions early.
Data Collection and Sample Size Considerations
While SUS can be administered to a single user, the real power comes from aggregating scores across multiple participants. Research suggests that even small sample sizes (5–10 users) can provide reliable SUS averages, thanks to the scale’s high sensitivity.
However, for more robust statistical analysis—such as comparing two designs or calculating confidence intervals—a sample size of 15–20 is recommended. Larger samples (30+) allow for more nuanced analysis, such as segmenting by user type (e.g., new vs. returning users).
When collecting data, ensure anonymity to encourage honest feedback. Use tools that automatically calculate SUS scores to reduce human error. And always document the testing context (e.g., device type, task set, environment) so you can interpret results accurately later.
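As a sketch of the aggregation step, the snippet below computes a mean SUS score with an approximate 95% confidence interval using only the standard library. The normal-approximation critical value is an assumption on my part; very small samples would be better served by a t distribution, and the function name and sample scores are purely illustrative:

```python
from statistics import NormalDist, mean, stdev

def sus_summary(scores, confidence=0.95):
    """Mean SUS score with an approximate confidence interval.

    Uses a normal approximation for the critical value; for small
    samples a t distribution would give a wider, more conservative
    interval.
    """
    n = len(scores)
    m = mean(scores)
    se = stdev(scores) / n ** 0.5  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return m, m - z * se, m + z * se

# Hypothetical scores from a 10-participant study.
scores = [72.5, 65.0, 80.0, 70.0, 77.5, 62.5, 75.0, 85.0, 67.5, 70.0]
m, lo, hi = sus_summary(scores)
print(f"mean={m:.1f}, 95% CI=({lo:.1f}, {hi:.1f})")
```

Reporting the interval alongside the mean makes it easier to judge whether a difference between studies reflects real change or sampling noise.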
Interpreting SUS Scores: Beyond the Number
A SUS score is more than just a number—it’s a story about user experience. But interpreting it correctly requires context, comparison, and a deeper understanding of what the score represents.
Understanding the SUS Grading Scale
As mentioned earlier, the widely accepted grading scale for SUS scores is:
- 90–100: Excellent
- 80–89: Good
- 70–79: Acceptable
- 60–69: Poor
- 50–59: Awful
- Below 50: Unacceptable
But these labels are not absolute. A score of 68 might be “poor” in a consumer app but “acceptable” in a complex enterprise system used by trained professionals. Always interpret scores relative to your domain, audience, and goals.
For example, a medical device used by nurses in a high-stress ICU environment might aim for a SUS score above 80, while an internal HR tool used once a month might be deemed usable at 70.
Benchmarking Against Industry Standards
One of the most powerful uses of the System Usability Scale is benchmarking. Over the years, researchers have compiled normative data showing average SUS scores across industries.
According to data from the Nielsen Norman Group and other sources:
- Consumer websites: Average SUS ~68
- Mobile apps: Average SUS ~72
- Enterprise software: Average SUS ~65
- Medical devices: Target SUS > 80 (due to safety implications)
If your product scores significantly below the industry average, it’s a red flag. If it’s above, you’re likely delivering a better-than-average user experience.
You can also benchmark against your own past versions. A redesign that increases your SUS from 60 to 75 represents a tangible improvement in perceived usability—even if users can’t articulate exactly what changed.
Combining SUS with Qualitative Feedback
The System Usability Scale is quantitative, but it gains even more value when paired with qualitative insights. After users submit their SUS responses, ask follow-up questions like:
- What was the most frustrating part of using the system?
- Was there anything that surprised you?
- What one change would make this system easier to use?
This mixed-methods approach helps you understand not just how usable your system is, but why. For instance, a low SUS score might be traced back to a confusing navigation menu or inconsistent terminology.
Some teams even use open-ended comments to categorize common pain points and prioritize fixes. This turns SUS from a diagnostic tool into a roadmap for improvement.
Advanced Applications of the System Usability Scale
While the System Usability Scale is often used in basic usability testing, its applications go far beyond simple score tracking. Advanced teams leverage SUS for comparative analysis, longitudinal studies, and even predictive modeling.
Comparative Usability Testing
One of the most powerful uses of the System Usability Scale is comparing two or more versions of a product. This is especially useful during A/B testing, prototype evaluation, or competitive analysis.
For example, you might test:
- Old vs. new interface
- Mobile app vs. web version
- Your product vs. a competitor’s
By collecting SUS scores from the same task set across conditions, you can determine which version users perceive as more usable. A statistically significant difference (e.g., 70 vs. 85) provides strong evidence for design decisions.
Statistical tests such as the t-test or ANOVA can be used to assess whether differences are meaningful. Even without advanced statistics, a gap of 10 or more points is usually noticeable and actionable.
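As an illustration of such a comparison, the sketch below computes the mean gap and a Welch’s t statistic for two hypothetical conditions using only the standard library. The sample scores are invented; a complete analysis would also derive a p-value from the t distribution (e.g., with a statistics package such as SciPy):

```python
from statistics import mean, stdev

def compare_sus(scores_a, scores_b):
    """Mean difference and Welch's t statistic for two SUS samples.

    Welch's variant does not assume equal variances. An absolute t
    well above ~2 usually signals a meaningful difference, though a
    proper test would compute a p-value from the t distribution.
    """
    ma, mb = mean(scores_a), mean(scores_b)
    va = stdev(scores_a) ** 2 / len(scores_a)
    vb = stdev(scores_b) ** 2 / len(scores_b)
    t = (mb - ma) / (va + vb) ** 0.5
    return mb - ma, t

# Invented scores mirroring the 65-vs-80 example from earlier.
old = [62.5, 70.0, 65.0, 57.5, 72.5, 60.0, 67.5, 65.0]
new = [80.0, 85.0, 77.5, 82.5, 75.0, 87.5, 72.5, 80.0]
gap, t = compare_sus(old, new)
print(f"mean gap={gap:.1f} points, Welch t={t:.2f}")  # mean gap=15.0 points, Welch t=6.00
```

Here the 15-point gap echoes the checkout-flow example above, and the large t statistic indicates the difference is unlikely to be sampling noise.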
Longitudinal Usability Tracking
Usability isn’t a one-time event—it changes over time as users learn, systems evolve, and contexts shift. By administering the System Usability Scale repeatedly, you can track usability trends across product lifecycles.
For example:
- Measure SUS after each major release to ensure usability doesn’t degrade.
- Track SUS over time with the same users to study learning curves.
- Compare SUS scores across different user cohorts (e.g., new vs. experienced users).
This longitudinal approach helps identify whether usability improvements are sustained or whether new features introduce friction.
Some organizations build dashboards that visualize SUS trends alongside other KPIs like retention, support tickets, or Net Promoter Score (NPS), creating a holistic view of user experience.
Integration with Other UX Metrics
The System Usability Scale doesn’t exist in a vacuum. It’s most powerful when integrated with other user experience metrics. Common combinations include:
- SUS + Task Success Rate: High SUS but low task success? Users may feel confident but are failing silently.
- SUS + Time on Task: Low SUS and high time? Likely usability bottlenecks.
- SUS + Net Promoter Score (NPS): Correlate perceived usability with loyalty.
- SUS + Error Rate: High errors and low SUS? Strong indicator of design flaws.
By combining SUS with behavioral and attitudinal data, you create a multidimensional view of usability that’s far more insightful than any single metric.
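As a toy illustration of this kind of triangulation, the sketch below flags two of the patterns listed above. Every threshold used (68 for SUS, 80% task success, 10% error rate) is an assumption chosen for the example, not a published standard:

```python
def diagnose(sus, task_success, error_rate):
    """Flag combined-metric patterns from the list above.

    Thresholds (68 for SUS, 0.8 for task success, 0.1 for error
    rate) are illustrative assumptions, not standards.
    """
    flags = []
    if sus >= 68 and task_success < 0.8:
        flags.append("High SUS but low task success: users feel "
                     "confident but may be failing silently")
    if sus < 68 and error_rate > 0.1:
        flags.append("High errors and low SUS: strong indicator "
                     "of design flaws")
    return flags

# A product users like but frequently fail to complete tasks in.
print(diagnose(sus=75, task_success=0.6, error_rate=0.05))
```

In practice such rules would feed a dashboard or triage report rather than a print statement, but the principle is the same: no single metric is interpreted on its own.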
Limitations and Criticisms of the System Usability Scale
No tool is perfect, and the System Usability Scale is no exception. While it’s widely respected, it has limitations that users should be aware of to avoid misinterpretation or overreliance.
What SUS Doesn’t Measure
The System Usability Scale measures perceived usability, but it doesn’t capture everything. Specifically, it does not assess:
- Learnability over time: SUS is a snapshot after one session, not a measure of long-term learning.
- Emotional experience: It doesn’t measure delight, frustration, or emotional engagement.
- Accessibility: A high SUS score doesn’t guarantee the system is usable by people with disabilities.
- Performance: Users might rate a system as usable even if it’s slow or buggy.
For example, a user might find a system “easy to use” (high SUS) but still experience frequent crashes or long load times. This disconnect highlights the need to complement SUS with other metrics.
Subjectivity and Response Bias
Because SUS relies on self-reported data, it’s vulnerable to subjectivity and response bias. Users may:
- Give socially desirable answers (e.g., not wanting to criticize a product in front of a researcher).
- Be influenced by their mood or external factors.
- Interpret terms like “easy to use” differently based on personal experience.
The alternating positive/negative wording helps reduce acquiescence bias (the tendency to agree with statements), but it doesn’t eliminate it entirely. Cultural differences in survey-taking behavior can also affect results.
When to Use Alternatives or Complements
In some cases, other usability scales may be more appropriate than SUS:
- UMUX (Usability Metric for User Experience): A shorter, 4-item alternative grounded in the ISO 9241-11 definition of usability.
- UMUX-Lite: Just 2 items, ideal for quick mobile feedback.
- PSSUQ (Post-Study System Usability Questionnaire): More detailed, 16-item survey for in-depth analysis.
- SEQ (Single Ease Question): One question: “How easy was this task?” Useful for per-task feedback.
These alternatives offer trade-offs between brevity, depth, and specificity. The key is choosing the right tool for your research goals.
Future of the System Usability Scale in UX Research
As technology evolves, so too does the role of the System Usability Scale. While its core structure remains unchanged, new applications and integrations are expanding its relevance in modern UX research.
Adaptation to Emerging Technologies
The System Usability Scale is being successfully applied to new domains, including:
- Voice interfaces (e.g., Alexa, Google Assistant): Users rate how easy it is to complete tasks using voice commands.
- Virtual and augmented reality: SUS helps evaluate the intuitiveness of immersive experiences.
- AI-powered systems: As users interact with chatbots and AI assistants, SUS measures perceived ease of collaboration.
While the core questionnaire remains the same, researchers are exploring supplemental questions to capture domain-specific challenges, such as voice recognition accuracy or spatial navigation in VR.
Automated SUS Analysis and AI Integration
Advances in AI and data analytics are transforming how SUS data is collected and interpreted. Tools now exist that:
- Automatically calculate SUS scores in real-time.
- Use natural language processing (NLP) to analyze open-ended feedback alongside SUS.
- Integrate SUS into continuous user feedback loops within apps.
Some platforms even use machine learning to predict SUS scores based on behavioral data (e.g., click patterns, error rates), reducing the need for manual surveys.
These innovations make SUS faster, smarter, and more scalable—especially for large user bases.
The Enduring Value of a Simple Metric
In an age of big data and complex analytics, the enduring success of the System Usability Scale is a testament to the power of simplicity. It proves that a well-designed, concise metric can provide deep insights without overwhelming users or researchers.
As long as people interact with technology, there will be a need to understand how usable it feels. The System Usability Scale, with its balance of rigor and accessibility, is likely to remain a cornerstone of UX evaluation for years to come.
What is the System Usability Scale?
The System Usability Scale (SUS) is a 10-item questionnaire used to measure the perceived usability of a system, product, or interface. It produces a single score from 0 to 100, with higher scores indicating better usability.
How do you calculate a SUS score?
To calculate a SUS score: for odd-numbered items, subtract 1 from the response; for even-numbered items, subtract the response from 5. Sum all adjusted scores and multiply by 2.5 to get a final score between 0 and 100.
What is a good SUS score?
A SUS score above 68 is considered above average. Scores of 80+ are good, and 90+ are excellent. However, what’s “good” depends on the context, industry, and user population.
Can I modify the SUS questionnaire?
It’s strongly recommended not to modify the wording of the SUS items, as this can affect its reliability and validity. If you need a shorter version, consider using UMUX-Lite instead.
How many users do I need for a reliable SUS score?
As few as 5–10 users can provide a reliable average SUS score. For comparative studies or statistical analysis, 15–20 users per condition are recommended.
The System Usability Scale remains one of the most trusted and versatile tools in usability evaluation. Its simplicity, reliability, and broad applicability make it indispensable for UX professionals. While it has limitations, its value is amplified when used alongside qualitative feedback and other metrics. Whether you’re testing a website, app, or emerging technology, SUS offers a quick, effective way to gauge user experience. As technology evolves, so too will the applications of SUS—proving that even in a complex digital world, simple metrics can deliver powerful insights.