Semantic Scholar uses AI to transform scientific search

April 1, 2021

Example of the top return in a Semantic Scholar search for “quantum computer silicon” constrained to overviews (52 out of 1,397 selected papers since 1989) (credit: AI2)

The Allen Institute for Artificial Intelligence (AI2) offers Semantic Scholar, a free service intended to help scientific researchers quickly sift through the millions of scientific papers published each year and find those most relevant to their work.

Semantic Scholar draws on the institute’s AI expertise in data mining, natural-language processing, and computer vision, according to Oren Etzioni, PhD, CEO at the institute. The system searches more than 3 million computer science papers, and additional scientific categories will be added on an ongoing basis.

With Semantic Scholar, computer scientists can:

  • Quickly find what they are looking for, with advanced selection filtering tools. Researchers can filter search results by author, publication, topic, and date published. This narrows the results to the most relevant papers quickly and reduces information overload.
  • Instantly access a paper’s figures and findings. Unique among scholarly search engines, this feature pulls out the graphic results, which are often what a researcher is really looking for.
  • Jump to cited papers and references and see how many researchers have cited each paper, a good way to determine citation influence and usefulness.
  • Be prompted with key phrases within each paper to winnow the search further.
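
The filtering described in the first item above can be illustrated with a minimal Python sketch. It is not AI2's implementation: the `Paper` record, its field names, and the `filter_papers` function are hypothetical, chosen only to show how filters on author, publication, topic, and date might be combined.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Paper:
    """Hypothetical record for one indexed paper (not AI2's schema)."""
    title: str
    authors: tuple
    venue: str
    topics: frozenset
    published: date


def filter_papers(
    papers: Iterable[Paper],
    author: Optional[str] = None,
    venue: Optional[str] = None,
    topic: Optional[str] = None,
    published_since: Optional[date] = None,
) -> List[Paper]:
    """Keep papers that match every supplied filter; None means no constraint."""
    kept = []
    for paper in papers:
        if author is not None and author not in paper.authors:
            continue
        if venue is not None and paper.venue != venue:
            continue
        if topic is not None and topic not in paper.topics:
            continue
        if published_since is not None and paper.published < published_since:
            continue
        kept.append(paper)
    return kept
```

In this sketch each filter is optional and the filters are combined with a logical AND, so adding a filter can only shrink the result list.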

Example of figures and tables extracted from the first document discovered (“Quantum computation and quantum information”) in the search above (credit: AI2)

How Semantic Scholar works

Using machine reading and vision methods, Semantic Scholar crawls the web, finding all PDFs of publicly available scientific papers on computer science topics, extracting both text and diagrams/captions, and indexing it all for future contextual retrieval.
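
As a rough illustration of the indexing step, the sketch below builds a simple inverted index over extracted body text and figure captions. It assumes the extraction has already happened; the document layout, the field names `text` and `captions`, and the functions `tokenize`, `build_index`, and `search` are all hypothetical, and a plain keyword index stands in for whatever retrieval method Semantic Scholar actually uses.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents: Dict[str, Dict[str, str]]) -> Dict[str, Set[str]]:
    """Map each token to the set of document ids whose fields contain it.

    `documents` maps a document id to its extracted fields, for example
    {"text": "...", "captions": "..."} (hypothetical layout).
    """
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, fields in documents.items():
        for field_text in fields.values():
            for token in tokenize(field_text):
                index[token].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return ids of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Copy the first posting set so intersections never mutate the index.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

Because captions are indexed alongside body text, a query term that appears only in a figure or table caption still retrieves the paper.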

Using natural language processing, the system identifies the top papers, extracts filtering information and topics, and sorts results by paper type and by how influential each paper's citations are. It provides the scientist with a simple user interface (optimized for mobile) that matches academic researchers’ expectations.

Filters such as topic, date of publication, author, and publication venue are built in. The system also offers contextual recommendations for further keyword filtering. Together, these search and discovery tools give researchers a quick way to separate wheat from chaff and to find relevant papers in areas and topics that previously might not have occurred to them.

Semantic Scholar builds on the foundation of other research-paper search applications such as Google Scholar, adding AI methods to overcome information overload.

“Semantic Scholar is a first step toward AI-based discovery engines that will be able to connect the dots between disparate studies to identify novel hypotheses and suggest experiments that would otherwise be missed,” said Etzioni. “Our goal is to enable researchers to find answers to some of science’s thorniest problems.”