SciTrends – now with abstracts, search, RSS

[This post is an archive – SciTrends didn’t get a lot of traffic, so I let the domain expire]


I’ve added major new features to SciTrends, which should make it a more useful article discovery tool. It now indexes the abstracts of trending papers; you can use the full-text search to narrow the results to your field of interest, for instance “visual cortex” or “h1n1”. You can also narrow down results by journal.

The general feed tends to be dominated by stories about academic funding, fraud, editorial policy, politics, climate change, mind-reading, etc., but the narrowed-down results point to interesting articles. Try the journal Neuron with a timespan of 1 month, for instance.

You can generate an RSS feed for a specific journal/text search combo – so you can receive relevant articles right in your news reader. It’s available now at scitrends.com [dead link].
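
For the curious, here is a minimal sketch of how a feed for one journal/text search combo could be built with Python's standard library. It is not the SciTrends code: the function name, the article fields (title, url, abstract) and the example.org base URL are all assumptions made for illustration, and the articles are assumed to be filtered already.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_feed(articles, journal=None, query=None,
               base_url="https://example.org/feed"):  # placeholder URL
    """Return an RSS 2.0 document (as a string) for one filter combination.

    `articles` is assumed to be a list of dicts with hypothetical keys
    "title", "url" and (optionally) "abstract".
    """
    # Only include the filters that were actually set.
    params = {k: v for k, v in (("journal", journal), ("q", query)) if v}
    feed_url = base_url + ("?" + urlencode(params) if params else "")

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    label = " / ".join(params.values()) or "all"
    ET.SubElement(channel, "title").text = "Trending papers: " + label
    ET.SubElement(channel, "link").text = feed_url
    ET.SubElement(channel, "description").text = (
        "Trending articles matching the selected filters")

    for art in articles:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = art["title"]
        ET.SubElement(item, "link").text = art["url"]
        ET.SubElement(item, "description").text = art.get("abstract", "")

    return ET.tostring(rss, encoding="unicode")
```

Encoding the filters in the feed URL means each journal/search combination gets its own stable address that a news reader can poll.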

Behind the scenes

The AltMetric API doesn’t give abstracts of papers. So I decided to cache AltMetric results locally in MongoDB and add abstracts to them using public databases.
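
The caching step could look roughly like the sketch below. It assumes `collection` is a pymongo `Collection` and that each cached result is a dict with `doi`, `title` and `score` keys; those field names and the helper names are hypothetical, not the actual SciTrends schema.

```python
# collection = pymongo.MongoClient()["scitrends"]["articles"]  # hypothetical names


def cache_result(collection, result):
    """Insert or refresh one cached result, keyed by DOI.

    $set refreshes the fields that come from the API on every run, while
    $setOnInsert only creates the empty abstract slot for new documents,
    so an abstract that was already stored is never overwritten.
    """
    collection.update_one(
        {"doi": result["doi"]},
        {
            "$set": {"title": result["title"], "score": result["score"]},
            "$setOnInsert": {"abstract": None},
        },
        upsert=True,
    )


def add_abstract(collection, doi, abstract):
    """Attach an abstract to an already cached document."""
    collection.update_one({"doi": doi}, {"$set": {"abstract": abstract}})


def missing_abstracts(collection):
    """Return the DOIs of cached documents that still lack an abstract."""
    return [doc["doi"] for doc in collection.find({"abstract": None})]
```

Keeping the abstract in the same document as the cached result means the lookup work is only done once per paper, however often the trending list is refreshed.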

I was surprised to find that there is no single public database one can use to retrieve abstracts based on, say, a DOI. I made a script in Python that aggregates data from several APIs:

  • arXiv
  • PubMed
  • PLOS
  • Nature
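
Aggregating several sources like this boils down to a fallback chain: ask each one in turn and stop at the first hit. The sketch below shows only that control flow, under the assumption that each source is wrapped in a function taking a DOI and returning an abstract or None. The fetchers here are stand-ins; no real API calls are shown, and none of the names come from the original script.

```python
from typing import Callable, Iterable, Optional, Tuple

# A fetcher takes a DOI and returns the abstract text, or None on a miss.
Fetcher = Callable[[str], Optional[str]]


def find_abstract(
    doi: str, sources: Iterable[Tuple[str, Fetcher]]
) -> Tuple[Optional[str], Optional[str]]:
    """Try each source in order; return (abstract, source_name) for the first hit."""
    for name, fetch in sources:
        try:
            abstract = fetch(doi)
        except Exception:
            # A source that is down or rate-limited should not stop the chain.
            continue
        if abstract and abstract.strip():
            return abstract.strip(), name
    return None, None


# Stand-in fetchers; real ones would query each service's API.
sources = [
    ("arXiv", lambda doi: None),
    ("PubMed", lambda doi: "Example abstract text."),
]
print(find_abstract("10.1000/example", sources))
```

Returning the source name along with the abstract makes it easy to count how many abstracts each source contributes.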

This retrieves a bit over 60% of abstracts. To complete the set, I built a simple web scraper that uses a variety of heuristics to determine the location of an abstract within a web page. It’s not perfect, but it gets a bit over half of the remaining abstracts, so overall the hit rate is about 85%. Here’s the script [dead link].
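
Since the script link is dead, here is a small sketch of what heuristics of this kind can look like. These particular rules are assumptions for illustration, not the ones the original scraper used: first look for an element whose id or class mentions "abstract" and keep the longest such block, then fall back to a few common meta tags. Only the standard library is used, and all names are hypothetical.

```python
from html.parser import HTMLParser

# Meta tags checked as a fallback, in order of preference (an assumption).
META_NAMES = ("citation_abstract", "dc.description", "description",
              "og:description")
# Tags without a closing tag; they must not affect the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class AbstractFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.meta = {}        # candidate abstracts found in <meta> tags
        self.candidates = []  # text of elements marked as "abstract"
        self._chunks = []     # text of the element being collected
        self._depth = 0       # > 0 while inside an "abstract" element

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "meta":
            key = (a.get("name") or a.get("property") or "").lower()
            if key in META_NAMES and a.get("content"):
                self.meta.setdefault(key, a["content"].strip())
        elif tag in VOID_TAGS:
            pass
        elif self._depth:
            self._depth += 1
        elif "abstract" in (a.get("id", "") + " " + a.get("class", "")).lower():
            self._depth = 1

    def handle_endtag(self, tag):
        if self._depth and tag not in VOID_TAGS:
            self._depth -= 1
            if self._depth == 0:
                self.flush()

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)

    def flush(self):
        """Store the collected text as a candidate, with whitespace collapsed."""
        text = " ".join(" ".join(self._chunks).split())
        if text:
            self.candidates.append(text)
        self._chunks = []


def extract_abstract(html, min_length=100):
    """Return the most plausible abstract in `html`, or None if none is found."""
    finder = AbstractFinder()
    finder.feed(html)
    finder.close()
    finder.flush()  # keep text from an abstract element that was never closed
    body = [t for t in finder.candidates if len(t) >= min_length]
    if body:
        return max(body, key=len)
    for key in META_NAMES:
        text = finder.meta.get(key, "")
        if len(text) >= min_length:
            return text
    return None
```

The `min_length` cutoff is there to skip things like a bare "Abstract" heading or a one-line site description; the value of 100 characters is an arbitrary choice. Pages with unclosed tags inside the abstract element can confuse the depth counter, which is one reason a scraper like this will never be perfect.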

