Torngat1

From CISTI-ICIST LAB WIKI

Project Torngat

Principal Investigator: Glen Newton, glen.newton@gmail.com

Ungava2 subproject. The goal of this work is to create semantic journal maps that support the user search experience in a large-scale digital library of scientific, technical and medical (STM) journal articles. By projecting article search results onto a semantic map, we seek to visualize and contextualize the query results, and to offer interactive tools with which users can refine queries and discover related articles.

This initial work aims to find a technique that can scale to tens of millions of terms. The prototype (described in the paper below; it requires Java in the browser) shows how LuSql, Lucene, Semantic Vectors and R's MDS are used to create a 'Map of Science' from the full text only (no metadata is used) of 5,733,721 articles from 2,231 journals.

Note that the application has been improved since the paper was written, so there are some small but noticeable differences between the prototype and the paper.
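As an illustration of the last stage of that pipeline (turning per-journal vectors into 2-D map positions with MDS), here is a minimal sketch in Python. It is not the project's code: the project uses R's MDS, and this page does not say which MDS variant or distance measure was used. The sketch assumes classical (Torgerson) MDS over cosine distances, uses three made-up journal vectors, and all function names are hypothetical. It relies on plain power iteration, so it is only suitable for toy inputs.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(vectors):
    """Symmetric matrix of pairwise cosine distances."""
    n = len(vectors)
    return [[0.0 if i == j else cosine_distance(vectors[i], vectors[j])
             for j in range(n)] for i in range(n)]


def double_center(dist):
    """B = -1/2 * J * D^2 * J, the Gram matrix used by classical MDS."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]  # equals column means (symmetric)
    total_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
             for j in range(n)] for i in range(n)]


def top_eigenpair(matrix, iterations=500):
    """Dominant eigenvalue and unit eigenvector by power iteration."""
    n = len(matrix)
    vec = [1.0 + 0.1 * i for i in range(n)]  # fixed, deterministic start
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0, [0.0] * n
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue for the converged vector.
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n))
                for i in range(n))
    return value, vec


def classical_mds(dist, dims=2):
    """Embed items in `dims` dimensions from a distance matrix."""
    b = double_center(dist)
    n = len(b)
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        value, vec = top_eigenpair(b)
        if value <= 1e-12:  # no positive variance left to embed
            break
        scale = math.sqrt(value)
        for i in range(n):
            coords[i][d] = vec[i] * scale
        # Deflate: remove the component just extracted.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


# Toy stand-ins for journal vectors (hypothetical data, not project output).
journal_vectors = {
    "Journal A": [1.0, 0.0, 0.0],
    "Journal B": [0.9, 0.1, 0.0],
    "Journal C": [0.0, 0.0, 1.0],
}
names = list(journal_vectors)
coords = classical_mds(distance_matrix([journal_vectors[n] for n in names]))
for name, (x, y) in zip(names, coords):
    print(f"{name}: ({x:.3f}, {y:.3f})")
```

Journals with similar vectors (A and B here) land close together on the map, while the dissimilar one (C) lands far away. A real run over thousands of journals would use an optimized eigensolver rather than power iteration.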

Image:TorngatV0.png


Plan

  1. Find and validate a method that can scale to large numbers of terms, and build a prototype visualization (completed)
  2. Evaluate the above at the subject category level
  3. Validate usefulness in a search context by projecting article search results onto the semantic journal mapping space, and create tools to support discovery (such as finding articles that lie close to a given set of articles in semantic journal space)
  4. Evaluate additional use cases such as:
    1. Given an arbitrary uploaded manuscript, project its location onto the semantic journal space and show the surrounding journals and articles. This is useful for finding additional citations and possible journals for submission.
    2. Extend the semantic journal space to represent time, and visualize journals' relative movements through the space over time
    3. Visualize in three dimensions, to reveal more structure than is visible in two dimensions
    4. Given a series of terms, display their locations in the semantic journal space and show similar terms
  5. Find and validate a method that can scale to large numbers of terms AND large numbers of items, i.e. one able to create an article semantic space.
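
Plan items 3 and 4.1 both involve placing a new document (a search result or an uploaded manuscript) on an existing journal map. The sketch below shows one simple way this could be done. It is an assumption for illustration, not the project's method: it places the document at the similarity-weighted centroid of its k most similar journals. The journal vectors, map coordinates, and function names are all hypothetical, and the document vector is assumed to live in the same vector space as the journal vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_journals(doc_vector, journal_vectors, k=3):
    """Return the k journals most similar to the document, best first."""
    scored = [(cosine_similarity(doc_vector, vec), name)
              for name, vec in journal_vectors.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:k]]


def project_onto_map(doc_vector, journal_vectors, journal_coords, k=3):
    """Place a document at the similarity-weighted centroid of its
    k nearest journals (an assumed heuristic, see the text above)."""
    neighbours = nearest_journals(doc_vector, journal_vectors, k)
    # Negative similarities contribute no weight.
    weights = [(name, max(score, 0.0)) for name, score in neighbours]
    total = sum(w for _, w in weights)
    if total == 0.0:
        raise ValueError("document is not similar to any journal")
    x = sum(w * journal_coords[name][0] for name, w in weights) / total
    y = sum(w * journal_coords[name][1] for name, w in weights) / total
    return (x, y), neighbours


# Hypothetical toy data: three journals with known 2-D map positions.
journal_vectors = {
    "Journal A": [1.0, 0.0, 0.0],
    "Journal B": [0.0, 1.0, 0.0],
    "Journal C": [0.0, 0.0, 1.0],
}
journal_coords = {
    "Journal A": (0.0, 0.0),
    "Journal B": (1.0, 0.0),
    "Journal C": (0.0, 1.0),
}

manuscript = [1.0, 1.0, 0.0]  # equally similar to Journal A and Journal B
position, neighbours = project_onto_map(
    manuscript, journal_vectors, journal_coords, k=2)
print(position, neighbours)
```

The returned neighbour list doubles as the "surrounding journals" of item 4.1. Finding surrounding articles would need an article-level index, which is the scaling problem named in item 5.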

Partners

Publications