
Monthly Archives: January 2015

NewsReader Amsterdam Hackathon

This past Wednesday (Jan. 21, 2015) I was at the NewsReader Hackathon. NewsReader is an EU project to extract events and build stories from the news. They use a sophisticated NLP pipeline combined with semantic background knowledge to perform this task. The hackathon was an opportunity to talk to members of one of the leading NLP groups in the Netherlands (CLTL) and find out more about their current pipeline. Additionally, one of the project partners is Lexis Nexis, a sister company of Elsevier, so it was nice to see how their content was being used as a basis for event extraction and also to meet some of my colleagues. The combination of news and research is of particular interest in light of Elsevier’s recent acquisition of NewsFlo.

Besides the chance to meet people, I also got to do some hacking myself to see how the NewsReader API worked. I used the API to plot the number and type of events featuring universities. (The resulting IPython Notebook)

A couple of pointers for future reference:


Last week, I was at FORCE 2015 – the future of research communications and e-scholarship conference held in Oxford. This is the third conference in a series that started with Beyond the PDF in 2011 and continued with Beyond the PDF 2, whose organization I led in Amsterdam in 2013 (my wrap-up is here). This conference is one of the few forums that bring together the variety of people in the vanguard of scholarly communication, from librarians and computer scientists to researchers, funders and publishers. Pretty much every role was represented among the ~250 attendees.

To give you an idea, I saw the developers of the Papers reference manager, the editorial director of PLOS One, a funder from the Wellcome Trust, a librarian from University of Iowa, and public policy junior researchers from Brazil/Germany.

The curators (i.e. conference chairs), Dave De Roure and Melissa Haendel, did a great job of pulling in a whole range of topics and styles in a great venue. We even had the opportunity to see copies of the Philosophical Transactions. Speaking from experience, this is a tough conference to organize because everything is pretty dynamic and there are lots of different styles (e.g. Dave and the last-minute beer run for the hackathon!).

So what was I doing there? I helped organize the hackathon, which gave people space to work on content extraction and reference manager support for data citation, and to talk over pizza. This led to proposals for two 1k challenges. (Remember to vote for which one you want to give 1000 pounds to.) I also helped organize the poster and demo / geek out sessions. A trailer for those sessions is below:

Themes

The conference was too packed to go through everything, but I wanted to go through the core themes that I got out of it:

1. Scholarly media is not just text

Data, images, slides, videos, software – scholarly media is not just text. It never was, but it’s clear that the primacy of text is slowly being reduced and that text will eventually be treated on par with these other forms of output. This is being made possible by the number of new platforms being introduced, whether it’s Figshare or Zenodo for data, GitHub for code or HUBzero for the entire analytics lifecycle. It’s about sharing the actual research object rather than the textual argument. I think what brought this home to me was the amount of time spent discussing and presenting how these content types can be shoehorned into traditional text environments (e.g. journal citations).

2. Not access, understanding

The assumption at FORCE 2015 is that scholarship will be open access. The question then arises: what do you do with the open access content? Phil Bourne, in his closing remarks, mentioned the lack of things being done with the current open access corpus. This need to do more came across clearly in the keynote by Chris Lintott, founder of Galaxy Zoo:

He discussed how the literature was a barrier to amateurs contributing more to science. Specifically, he mentioned accessible research summaries. But, in general, there is a need to consider a more diverse audience in our communication, not only amateurs but also scientists from other disciplines or policy makers, for example.

3. Quality under pressure

The amount of scholarship continues to grow and there are perverse incentives. Scott Edmunds from GigaScience brought this out in his vision ideas talk.

The current answer to this is peer review. But as most researchers will tell you, we are already overwhelmed. I get tons of requests to review and it’s hard to turn down my colleagues. Maybe a market for peer review will develop (see below) but what we need is more automated mechanisms of quality control or for publishers to do more quality control before things get sent to reviewers. Maybe we should see peer review as constructive feedback and not a filter. Likewise, by valuing other parts of the system maybe we can increase both the transparency and overall quality of the science.

4. Science as a service

The poster below from Bianca Kramer and Jeroen Bosman highlighted the explosion in services available for scholarly communication. This continues a theme that I emphasized last year and that Ian Foster has talked about – the ability to do more and more science by just calling an API. Why can’t I build my lab from a cafe?

Wrap-up & Random Notes

The FORCE community is a special one. I hope we can continue to work together to push scholarly communication forward. I’m already looking forward to FORCE 2016 in Portland. There’s lots to be excited about as the way we do research rapidly changes. Finally, here are some random notes from the conference:
