
Last week (Oct 7 – 9) the altmetrics community made its way to Amsterdam for 2:AM (the second altmetrics conference) and altmetrics15 (the fourth altmetrics workshop). The conference is aimed more at practitioners, while the workshop has a bit more of a research focus. I enjoyed the events from both a content perspective (I’m biased as a co-organizer) and a logistics one (I could bike from home). This was the five-year anniversary of the altmetrics manifesto, so it was a great opportunity to reflect on the state of the community. Plus, the conference organizers brought cake!

This was the first time that all of the manifesto’s authors were in the same room together, and we got a chance to share some of our thoughts. The video is here if you want to hear us pontificate:

From my perspective, I think you can summarize the past years in two bullet points:

  • It’s amazing what the community has done: multiple startups on altmetrics, big companies offering altmetrics products, many articles and other research objects carrying altmetric scores, and a small but vibrant research community.
  • It would be great to focus more on using altmetrics to improve the research process rather than just on their potential use in research evaluation.

Beyond the reflection on the community itself, I took three themes from the conference:

More & different data please

An interesting aspect is that most studies and implementations rely on social media data (Twitter, Mendeley, Facebook, blogs, etc.). As an aside, it’s worth noting you can do amazing things with this data in a very short amount of time…

However, there is increasing interest in having data from other sources or having more contextualized data.

There were several good examples. One speaker gave a good talk about trying to get at the data behind who tweets about scientific articles; I’m excited to see what better population data can give us. Another group is starting to provide data on how articles are being used in public policy documents. Finally, moving beyond articles, Peter van den Besselaar is looking at data derived from grant review processes to study, for example, gender bias.

It’s also good to see developments such as the DOI Event Tracker that makes the aggregation of altmetrics data easier. This is hopefully just the start and we will see a continued expansion of the variety of data available for studies.
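To make the idea concrete, here is a minimal sketch of the kind of aggregation a service like the DOI Event Tracker performs: collapsing a stream of heterogeneous events into per-DOI, per-source counts. The event fields and example DOIs below are my own assumptions for illustration, not the tracker's actual schema.

```python
from collections import defaultdict

def aggregate_events(events):
    """Collapse a stream of raw events into per-DOI, per-source counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for event in events:
        counts[event["doi"]][event["source"]] += 1
    # Convert nested defaultdicts to plain dicts for downstream use
    return {doi: dict(by_source) for doi, by_source in counts.items()}

# Hypothetical events as an aggregator might receive them
events = [
    {"doi": "10.1000/example.1", "source": "twitter"},
    {"doi": "10.1000/example.1", "source": "twitter"},
    {"doi": "10.1000/example.1", "source": "policy"},
    {"doi": "10.1000/example.2", "source": "mendeley"},
]

print(aggregate_events(events))
```

The value of a shared service is that every downstream study starts from the same normalized counts rather than each group scraping sources independently.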

The role of theory

There was quite a bit of discussion about the appropriateness of altmetrics for different tasks, ranging from the development of global evaluation measures to their role in understanding the science system. There was also a long discussion of the quality of altmetrics data, in particular the transparency of how aggregators integrate and provide it.

A number of presenters discussed the need for theory when interpreting altmetrics signals. Cameron Neylon gave an excellent talk about his view of the need for a different theoretical perspective. There was also a breakout session at the workshop discussing the role of theory, and I look forward to the Etherpad becoming something more well defined. Peter van den Besselaar and I also argued for a question-driven approach to using altmetrics.

Finally, I enjoyed the work of Stefanie Haustein, Timothy Bowman, and Rodrigo Costas on interpreting the meaning of altmetrics. It is definitely a must-read.

Going beyond research evaluation

I had a number of good conversations with people about the desire to move beyond the focus on research evaluation. In all honesty, being able to tell stories with a variety of metrics is probably why altmetrics has gained traction.

However, I think the exciting bit is a world in which the signals produced by the research system are used to improve research itself. There were some hints of this. In particular, I was struck by the work of Kristi Holmes on using measures to improve translational medicine at Northwestern.


Overall, it’s great to see all the activity around altmetrics. There are several good summaries of the event; check out the altmetrics conference blog and Julie Birkholz’s summary.

This past week I was at Academic Publishing in Europe 9 (APE 2014) for two days, where I was invited to talk about altmetrics. This was something of an update of the double act I did with Mike Taylor of Elsevier Labs last year at another publishing conference, UKSG. You can find the slides of my talk from APE below, along with a video of my presentation. Overall, the talk was well received:

I think for publishers the biggest thing is to recognize this as something they play a role in, as well as to emphasize that altmetrics broaden the measurement space. It’s also interesting that authors want support for telling the story of their research – and need help doing it.

Given that it was a publishing conference, it’s always interesting to see the themes being talked about. Here are some highlights from my perspective.

The Netherlands going gold

Open Access was, as usual, a discussion point. The Dutch State Secretary of Science, Sander Dekker, was there, giving a full-throated endorsement of gold open access. I thought Michael Jubb’s discussion of monitoring the progress of the UK’s open access push after the Finch Report was interesting. Seeing how the UK manages and measures this transition will be critical to understanding the ramifications of open access. However, I have a feeling they may not be looking enough at the impact on faculty, in particular how money is distributed to pay gold open access charges.

Big Data – It’s the variety!

There was a session on big data. I thought I wouldn’t get a lot out of it, because with my computer science hat on I’ve heard quite a few technical talks on the subject. However, the session really confirmed for me that we’re facing a problem not of data processing or storage but of data variety.

This was confirmed by a fantastic talk by Jason Swedlow on the Open Microscopy project. The project looks at how to manage massive amounts of image data and make those images interoperable. (You can find one of the images they published here – 281 gigapixels!) If you’re thinking about data integration or interoperability, you should check out this project and his talk. I also liked the notion of images as a measurement technique. He noted that their software handles data size and processing well, but the difficulties were around the variety and general dirtiness of all that data.

Simon Hodson from CODATA emphasized the same point in his talk, giving an overview of a number of e-science projects where data variety was the central issue.

Data / Other Stuff Citation

Data citation was another theme of the conference. As a community member, it was good to see frequent mention of, in particular, the work on data citation principles that’s being facilitated by the community. There is also the Resource Identification Initiative, another FORCE11 community group, through which researchers can identify specific resources (e.g. model organisms, software) in their publications in a machine-readable way. This has already been endorsed by a number of journals (~25) and publishers. This ability to “cite” seems to be central to how all these other scientific products are beginning to get woven into the scholarly literature.

A good example of this was Hans Pfeiffenberger’s talk on the Earth System Science Data journal, a journal created specifically for data coming from large-scale earth measurements. An interesting issue that came up was the need for bidirectional citation – that is, publishing the data and the associated commentary at the same time, each referencing the other via permanent identifiers, even across different publishers.

Digital Preservation

There was also some discussion of preserving content born online. Two things stood out for me:

  1. Peter Burnhill’s talk on two projects that detect which content is being preserved. I was shocked to hear that only 20% of online serials are stored in long-term archives.
  2. This report seems pretty comprehensive on this front. Note to self – it will be good input for thinking about preserving linked data in the PRELIDA project.

Science from the coffee shop

The conference had a session (dotcoms-to-watch) on startups in publishing. What caught my attention was that we are really moving toward the idea Ian Foster has been talking about, namely science as a service. With services like Scrawl and Science Exchange, we’re starting to be able to run even lab-based experiments entirely from a laptop. I think this is going to be huge. I already see it in computer science, where I and more and more of my colleagues turn to the Amazon cloud to boot up our test environments. Pretty soon you’ll be able to do your science just by calling an API.

Random Notes

My Slides & Talk

Update: A full video of my talk on altmetrics has been posted. 

Altmetrics has seen increasing interest as an alternative to traditional measures of academic performance. This past week I gave a talk in Amsterdam for Open Access Week about how altmetrics can be used by academics and their organizations to highlight their broader set of contributions. These can be used to tell a richer, fuller story about how what we do has impact. The talk had a nice turnout of librarians, faculty and administrators (friendly faces below).

Audience for Altmetrics talk Open Access Week 2013

In relation to the talk, I was interviewed by the Dutch national newspaper, de Volkskrant, about the same theme (Twitter neemt wetenschap steeds meer de maat).


You can find the slides of the talk below. I’m told there will be video as well. A big thanks to the altmetrics community – the recent PLOS ALM workshop was a great resource for material. Thanks also to Cameron Neylon for allowing me to reuse some of his slides. Overall, I hope I helped some more people understand how these new forms of metrics can help in showing the impact of what they do.


I was invited to do a webinar on altmetrics for Elsevier journal editors this past Tuesday. You can find the complete recording here. 260 people attended live. The best part was probably the Q&A session starting 33 minutes in. Broadly, I would characterize the questions as: “I really would like to use these, but can you give me some assurances that they are OK?” Anyway, have a listen for yourself. Hannah Foreman did a great job of directing (she also brought me a cake for my birthday!). I also thought it was great to see Mike Taylor doing a demo of ImpactStory. I felt this was an important webinar to do, as it reaches a traditional journal-editor audience that may not yet have fully gotten on board. A final note: thanks to Steve Pettifer for letting me use him as an example.




For the past couple of days (April 8 – 10, 2013), I attended the UKSG conference. UKSG is an organization for academic publishers and librarians. The conference itself has over 700 attendees and is focused on these two groups. I hadn’t heard of it until I was invited by Mike Taylor from Elsevier Labs to give a session with him on altmetrics.

The session was designed both to introduce altmetrics to publishers and librarians and to give a state-of-the-art update. You can see what I had to say in the clip above, but my main point was that altmetrics is at a stage where it can be advantageously used by scholars, projects and institutions not to rank but to tell a story about their research. This is particularly important as many scientific artifacts beyond the article (e.g. data, posters, blog posts, videos) are becoming increasingly trackable and can help scholars tell their story.

The conference itself was a bit weird for me, as it was a completely different crowd than I would normally connect with… I had to be one of the few “actual” academics there, which led to my first-day tweet:

It was fun to randomly go up to the ACM and IEEE stands and introduce myself not as a librarian or another publisher but as an actual member of their organizations. Overall, though, people were quite receptive to my comments and were keen to get my views on what publishers and librarians could be doing to help me out as a researcher. I do have to say that it was a fairly well-funded operation (there is money in academia somewhere)… I came away with a lot of free t-shirts and USB sticks, and I have never been to a conference that had bumper cars for the evening entertainment:

UKSG bumper cars

In addition to (hopefully) contributing to the conference, I learned some things myself. Here are some bullet points in no particular order:

  • Outrageous talk by @textfiles – the Archive Team is super important
  • I talked a lot to Geoffrey Bilder from CrossRef. Topics included but not limited to:
    • why and when indirection is important for permanence in url space
    • the need for a claims (i.e. nanopublications) database referencing ORCID
    • the need for consistent url policies on sites and a “living will” for sites of importance
    • when will scientists get back to being scientists and stop being marketers (is this statement true, false, in-between, or is it even a bad thing)
    • the coolness of
  • It’s clear that librarians are the publishers’ customers; academics come second. I think this particular indirection badly impacts the market.
  • Academic content output is situated in a network – why do we de-link it all the time?
  • The open access puppy

  • It was interesting to see the business of academic publishing going on. I witnessed lots of pretty intense-looking dealings going down in the cafe.
  • Bournemouth looks like it could have some nice surfing conditions.

Overall, UKSG was a good experience to see, from the inside, this completely other part of the academic complex.

One of the ideas in the altmetrics manifesto was that altmetrics allow a diversity of metrics. With colleagues in the VU University Amsterdam’s Network Institute, we’ve been investigating the use of online data (in this case Google Scholar) to help create new metrics that measure the independence of researchers. For this, we need fresh data to establish whether an emerging scholar is becoming independent from their supervisor. We just had the results of one of our approaches accepted to the Web Science 2013 conference. The abstract is below, and here’s a link to the preprint.

Identifying Research Talent Using Web-Centric Databases 

Anca Dumitrache, Paul Groth, and Peter van den Besselaar

Metrics play a key part in the assessment of scholars. These metrics are primarily computed using data collected in offline procedures. In this work, we compare the usage of a publication database based on a Web crawl and a traditional publication database for computing scholarly metrics. We focus on metrics that determine the independence of researchers from their supervisor, which are used to assess the growth of young researchers. We describe two types of graphs that can be constructed from online data: the co-author network of the young researcher, and the combined topic network of the young researcher and their supervisor, together with a series of network properties that describe these graphs. Finally, we show that, for the purpose of discovering emerging talent, dynamic online resources for publications provide better coverage than more traditional datasets.

This is fairly preliminary work; it mainly establishes that we want to use the freshest possible data. We are expanding the work to do a large-scale study of independence and to use different sources of data. But to me, this shows how the freshness of web data allows us to begin looking at and measuring research in new ways.
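As a toy illustration of the kind of co-author-network signal the paper works with (this is my own simplified measure, not the metric from the paper, and the author lists below are invented): the fraction of a young researcher's papers that do not include the supervisor as a co-author.

```python
def independence_ratio(papers, researcher, supervisor):
    """Fraction of the researcher's papers that do NOT list the supervisor.

    `papers` is a list of author-name lists; higher values suggest the
    researcher is publishing independently of the supervisor.
    """
    own = [authors for authors in papers if researcher in authors]
    if not own:
        return 0.0
    solo = sum(1 for authors in own if supervisor not in authors)
    return solo / len(own)

# Invented example publication list
papers = [
    ["A. Student", "B. Supervisor"],
    ["A. Student", "B. Supervisor", "C. Colleague"],
    ["A. Student", "D. Other"],
]
print(independence_ratio(papers, "A. Student", "B. Supervisor"))
```

The point of the paper is that computing even a simple measure like this requires up-to-date publication data, which is where web-crawled sources beat traditional databases.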

From November 1 – 3, 2012, I attended the PLOS Article Level Metrics Workshop in San Francisco.

PLOS is a major open-access online publisher and the publisher of the leading megajournal PLOS ONE. A megajournal is one that accepts any scientifically sound manuscript: there is no decision on novelty, just a decision on whether the work was done in a scientifically sound way. The consequence is that much more science gets published, with a corresponding need for even better filters and search systems for science.
As an online publisher, PLOS tracks what are termed article-level metrics – metrics that go beyond traditional scientific citations and include things like page views, PDF downloads, mentions on Twitter, etc. Article-level metrics are, to my mind, altmetrics aggregated at the article level.
PLOS provides a comprehensive API to obtain these metrics and wants to encourage their broader adoption and usage. Thus, they organized this workshop. There were a variety of people attending, from publishers (including open access ones and the traditional big ones) and funders to librarians and technologists. I was a bit disappointed not to see more social scientists there, but I think the push so far has come primarily from the communities represented in the room. The goal was to outline key challenges for altmetrics and then corresponding concrete actions that could take place in the next 6 months to help address those challenges. It was an unconference, so no presentations and lots of discussion. I found it quite intense, as we often broke up into small groups where one had to be fully engaged. The organizers are putting together a report that digests the work that was done. I’m excited to see the results.

Me actively contributing 🙂 Thanks Ian Mulvany!


  • Launch of the PLOS Altmetrics Collection. This was really exciting for me, as I was one of the organizers of getting this collection produced. Our editorial is here. The collection provides a nice home for future articles on altmetrics.
  • I was impressed by the availability of APIs. In just a short time, several aggregators and good sources of altmetrics have appeared: ImpactStory, the PLOS ALM APIs, Mendeley, Microsoft Academic Search, and others.
  • rOpenSci is a cool project that provides R APIs to many of these altmetrics and other sources for analyzing data.
  • There’s quite a bit of interest in services around these metrics. For example, Plum Analytics has a pilot being run at the University of Pittsburgh. I also talked to other people who were interested in using these alternative impact measures, and heard that a number of companies are now providing this sort of analytics service.
  • I talked a lot to Mark Hahnel about the Data2Semantics LinkItUp service. He is super excited about it and loved the demo. I’m really excited about this collaboration.
  • Microsoft Academic Search is getting better, they are really turning it into a production product with better and more comprehensive data. I’m expecting a really solid service in the next couple of months.
  • I learned from Ian Mulvany of eLife that Graph theory is mathematically “the same as” statistical mechanics in physics.
  • Context, Context, Context – there was a ton of discussion about the importance of context for the numbers one gets from altmetrics. For example, being able to quickly compare a number to some baseline, or knowing the population to which the number applies.

    Whiteboard thoughts on context! Thanks, Ian Mulvany

  • Related to context was the need for simple semantics – the notion that, for example, we need to know whether a retweet on Twitter was positive or negative and what kind of person retweeted the paper (a scientist, a member of the public, a journalist, etc.). Unlike with citations, the population altmetrics draws on is not clearly defined, since the signals live in communication media that don’t contain only scholarly communication.
  • I had a nice discussion with Elizabeth Iorns, whose company is doing cool stuff around building marketplaces for performing and replicating experiments.
  • Independent of the conference, I met up with some people I know from the natural language processing community; one of the things they are excited about is computational semantics using statistical approaches. It seems this is very hot in that community and something we in the knowledge representation & reasoning community should pay attention to.
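On the context point above: one concrete way to contextualize a raw altmetric count is to report it as a percentile against a baseline population, say tweet counts for articles from the same journal and year. A minimal sketch of that idea (the baseline numbers here are invented):

```python
def percentile_rank(value, population):
    """Percentage of the baseline population that `value` meets or exceeds."""
    if not population:
        raise ValueError("need a non-empty baseline population")
    at_or_below = sum(1 for v in population if v <= value)
    return 100.0 * at_or_below / len(population)

# Hypothetical baseline: tweet counts for peer articles (same journal/year)
baseline = [0, 0, 1, 2, 2, 3, 5, 8, 13, 40]
print(percentile_rank(5, baseline))  # this article ties or beats 70% of peers
```

"70th percentile among its peers" is a far more interpretable statement than "5 tweets", which is exactly the contextualization the discussion was calling for.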


Associated with the workshop was a hackathon held at the PLOS offices. I worked in a group that built a quick demo: a bookmarklet that highlights papers in PubMed search results based on their online impact according to ImpactStory, so you get results color-coded by altmetric score. This took only a day’s worth of work and really showed me how far these APIs have come in enabling applications to be built. It was a fun environment, and I was really impressed with the other work that came out.
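The core of such a bookmarklet is just a mapping from a metric score to a highlight color. A toy version of that scoring-to-color step (the thresholds and color names are my own invention, not what our demo used):

```python
def highlight_color(score):
    """Map an altmetric score to a highlight color for search results."""
    if score >= 100:
        return "red"      # high online attention
    if score >= 10:
        return "orange"   # moderate attention
    if score > 0:
        return "yellow"   # some attention
    return "none"         # no recorded attention

# Color-code a page of hypothetical search-result scores
print([highlight_color(s) for s in [0, 3, 42, 250]])
```

The bookmarklet itself would fetch each result's score from an aggregator API and apply the resulting color as a CSS background, but the bucketing above is the whole of the "altmetrics" logic.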

Random thought on San Francisco

  • Four Barrel coffee serves really really nice coffee – but get there early before the influx of ultra cool locals
  • The guys at Goody Cafe are really nice and also serve good coffee
  • If you’re in the touristy Fisherman’s Wharf area, walk to Fort Mason for fantastic views of the Golden Gate Bridge. The hostel there also looks cool.