Allen Institute for AI
Semantic Scholar Usability Testing
Product description
Semantic Scholar is an online, AI-powered research tool for peer-reviewed, scientific literature. Semantic Scholar aims to more quickly connect scholars to the most relevant articles within their field.
Study goals
Our purpose in this usability study is to understand how scholars use the search feature, in particular how they narrow search results with the current filters and sort options.
Role: Researcher (facilitator, note taker, observer)
Team: Nick Gorlovski, Robin Marsh, Crystina McShay, and Megan Peaslee
What: Usability Testing, Analysis and Recommendations
Duration: 7 weeks
Study Participants
Screening Criteria:
Participants needed to be academics or professionals working within the sciences, who read 2-3 scholarly articles per week for their profession.
Participants:
8 participants total:
3 experienced users
5 novice users
Fields of study:
Political Science, Social Science, Speech and Hearing Science, Medicine, HCI, Academia, and Structural Engineering.
Methodology and Procedures
Task-based usability test
Measured task success with qualitative and quantitative methods
Session details:
A mix of remote and in-person interviews
Recorded sessions on Zoom
Finding severity ratings:
Interaction Flow - Search and Save
Executive Summary
Generally, the product is usable: average satisfaction was 3.75 (1-5 scale) and the SUS score was 74 (0-100 scale)
How Semantic Scholar populates results is unclear to users, which may impact product credibility
The majority of participants expected to rely on the search bar as their primary means of narrowing results
Filters were not all located in the same area and proved difficult to discover
Identified several filters that could be useful when using Semantic Scholar
What’s working well today
Overall site functionality works well: search behaves as expected, and results are easy to skim
Users were pleasantly surprised by the content on the article detail page and by being able to access content without logging in
Sort options: users found them and used them appropriately. No users commented that they behaved in any way other than expected
Search bar is the primary method to refine results
Search results algorithm is not transparent
Users struggled to find all filters
Filter options do not match users' expectations and needs
Task metrics
System Usability Scale (SUS) Score
The System Usability Scale (SUS) was developed in 1986
Calculated from a set of 10 questions focused on perceived usability at a moment in time
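For readers unfamiliar with how a SUS score is derived, the standard scoring procedure can be sketched as below. This is a minimal illustration, not the team's actual analysis script: the function names sus_score and study_sus are hypothetical, and the sample responses are made up for demonstration and are not data from this study. It assumes the standard 10-item questionnaire answered on a 1-5 scale, with odd-numbered items worded positively and even-numbered items worded negatively.

```python
from statistics import mean


def sus_score(responses):
    """Return the SUS score (0-100) for one participant's 10 answers (each 1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item_number, r in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd-numbered (positively worded) items contribute response - 1.
            total += r - 1
        else:
            # Even-numbered (negatively worded) items contribute 5 - response.
            total += 5 - r
    # The summed contributions (0-40) are scaled to a 0-100 range.
    return total * 2.5


def study_sus(all_responses):
    """Average the per-participant SUS scores into one study-level score."""
    return mean(sus_score(r) for r in all_responses)


# Hypothetical responses for two participants (illustrative only).
sample = [
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
print(study_sus(sample))
```

A study-level score such as the 74 reported above is typically the mean of the per-participant scores computed this way.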
Summary of Design Recommendations
Semantic Scholar Implementations
We received word back from Semantic Scholar that several of our team’s recommendations were taken into account, with improvements to relevance of search results in progress.
Other changes noted post recommendations:
UI elements:
Greater emphasis on the search bar, with increased size and prominence on the screen.
A brighter, contrasting blue that improves the visibility of filters.
Reflection
Recruitment
Allow more time to recruit experienced participants, given their relatively low response rate
Study Structure
Replicate tasks in each participant's search engine of choice to gather comparison data
With more time, we might have tried different methods, such as a diary study in which new users use Semantic Scholar over the course of a week or two
Potentially create more complex tasks to reflect real-life scenarios of finding and consuming scholarly research
Ensure search queries are spelled correctly, since incorrect spelling may have inadvertently affected the task experience