
Allen Institute for AI

Semantic Scholar Usability Testing

 
 

Product description

Semantic Scholar is an online, AI-powered research tool for peer-reviewed scientific literature. It aims to connect scholars more quickly to the most relevant articles in their field.

Study goals

The purpose of this usability study was to understand how scholars use the search feature, in particular how they narrow search results with the current filters and sort options.


Role: Researcher (facilitator, note taker, observer)

Team: Nick Gorlovski, Robin Marsh, Crystina McShay, and Megan Peaslee

What: Usability Testing, Analysis and Recommendations

Duration: 7 weeks

 

Study Participants


Screening Criteria:

  • Participants needed to be academics or professionals working in the sciences who read 2-3 scholarly articles per week for their work.

Participants: 

  • 8 participants total: 

    • 3 experienced users 

    • 5 novice users

  • Fields of study: 

    • Political Science, Social Science, Speech and Hearing Science, Medicine, HCI, Academia, and Structural Engineering.

 

Methodology and Procedures

  • Task-based usability test 

  • Measured task success with qualitative and quantitative methods 

  • Session details: 

    • A mix of remote and in-person interviews

    • Recorded sessions on Zoom

Finding severity ratings:

 

Interaction Flow - Search and Save

Executive Summary

  • Generally, the product is usable: average satisfaction was 3.75 (1-5 scale) and the SUS score was 74 (0-100 scale)

  • How Semantic Scholar populates results is not transparent to users, which may hurt product credibility

  • A majority of participants expected to rely on the search bar as their primary means of narrowing results

  • Filters were not all located in the same area and proved difficult to discover

  • Identified several filters that could be useful when using Semantic Scholar

 

What’s working well today

  • Overall site functionality works well: search behaves as expected, and results are easy to skim

  • Users were pleasantly surprised by the content on the article detail page and by being able to access content without logging in

  • Sort options: users found them and used them appropriately. No user commented that they behaved in any way other than expected


Search bar is the primary method to refine results


Search results algorithm is not transparent


Users struggled to find all filters


Filter options do not meet users’ expectations and needs


Task metrics


System Usability Scale (SUS) Score

  • The System Usability Scale (SUS) was developed in 1986

  • Calculated from responses to a set of 10 questions about perceived usability at a single point in time
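
For readers unfamiliar with how a SUS score is derived, here is a minimal sketch of the standard SUS scoring rule. Each of the 10 items is answered on a 1-5 scale; odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to give a 0-100 score. The function names and the sample responses are hypothetical and are not the study's actual data or tooling.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant.

    `responses` is a list of 10 integers (1-5), in questionnaire order.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher agreement is better.
            total += response - 1
        else:
            # Even items are negatively worded: lower agreement is better.
            total += 5 - response
    # Each item contributes 0-4, so the raw total is 0-40; scale to 0-100.
    return total * 2.5


def mean_sus(all_responses):
    """Average the per-participant SUS scores (one response list each)."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical responses for two participants, for illustration only.
    sample = [
        [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
        [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    ]
    print(mean_sus(sample))
```

A study-level score such as the one reported here is typically the mean of the per-participant scores, which is what `mean_sus` computes in this sketch.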


Summary of Design Recommendations 

 

Semantic Scholar Implementations

We received word back from Semantic Scholar that several of our team’s recommendations were taken into account, with improvements to relevance of search results in progress.

Other changes noted after our recommendations:

  • UI elements:

    • Greater emphasis on the search bar, with increased size and prominence on the screen.

    • A brighter, contrasting blue that improves the visibility of filters.

Reflection

  • Recruitment

    • Allow more time to recruit experienced participants, given their relatively low response rate

  • Study Structure

    • Replicate tasks in each participant’s search engine of choice to gather comparison data

    • With more time, we might have tried other methods, such as a diary study in which new users use Semantic Scholar over the course of a week or two

    • Potentially create more complex tasks to reflect real-life scenarios of finding and consuming scholarly research

    • Ensure search queries are spelled correctly, since misspellings may have inadvertently affected the task experience