In preparation for Science Online 2011, I was asked by Mark Hahnel from over at Science 3.0 if I could do some analysis of the blogs that they’ve been aggregating since October (25 thousand posts from 1506 authors). Mark, along with Dave Munger, will be talking more about the role and importance of aggregators in a session on Saturday morning at 9am (Developing an aggregator for all science blogs). These analyses provide a high-level overview of the content of science blogs. Here are the results.
The first analysis tried to find the topics of blogs and their relationships. We used title words as a proxy for topics and co-occurrence of those words as representative of the relationships between those topics. Here’s the map (click the image to see a larger size):
The words cluster together according to their co-occurrence. The hotter the color, the more often those words occur. You’ll notice that, for example, Science and Blog are close to one another. Darwin and days, as well as fumbling and tenure, are close too. The visualization was done with the VOSviewer software.
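For readers who want to try something similar, here is a minimal sketch of how title words and their co-occurrences could be counted before handing them to a tool like VOSviewer. It is not the code used for the map above: the stop-word list, the tokenization, and the function names are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical stop-word list; a real analysis would use a much longer one.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "on", "to", "for", "is"}

def title_words(title):
    """Lowercase a title and return its distinct non-stop-word tokens, sorted."""
    tokens = re.findall(r"[a-z]+", title.lower())
    return sorted(set(tokens) - STOP_WORDS)

def cooccurrence(titles):
    """Count how many titles each word appears in, and how many each pair shares."""
    word_counts = Counter()
    pair_counts = Counter()
    for title in titles:
        words = title_words(title)
        word_counts.update(words)
        # Sorted words make each pair appear in one canonical order.
        pair_counts.update(combinations(words, 2))
    return word_counts, pair_counts

# Made-up titles, only to show the shape of the output.
titles = ["Science blog roundup", "Darwin days", "A science blog on tenure"]
word_counts, pair_counts = cooccurrence(titles)
print(word_counts.most_common(3))
print(pair_counts.most_common(3))
```

The word counts would correspond to the color in the map and the pair counts to how close two words are drawn, although the actual layout is computed by the visualization software.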
I also looked at how blogs are citing research papers. We looked for the occurrence of DOIs as well as Research Blogging-style citations within all the blog posts. We found that there were 964 posts with these sorts of citations. I expected there would be more, but the low count may be down to how I implemented the detection.
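As a rough illustration of the DOI part of this, here is a minimal sketch that counts posts containing at least one DOI-like string. The regular expression is a common heuristic rather than a full DOI grammar, the function names are hypothetical, and the sketch does not attempt to detect Research Blogging-style citations, so it should not be read as the implementation behind the 964 figure.

```python
import re

# Heuristic pattern for modern DOIs: "10.", a 4-9 digit prefix, "/", then a suffix.
# This is an assumption for illustration; unusual DOIs may be missed.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")

def find_dois(text):
    """Return the distinct DOI-like strings in text, minus trailing punctuation."""
    return sorted({match.rstrip(".,;") for match in DOI_PATTERN.findall(text)})

def count_citing_posts(posts):
    """Count the posts that contain at least one DOI-like string."""
    return sum(1 for post in posts if find_dois(post))

# Made-up post bodies, only to show usage.
posts = [
    "See doi:10.1371/journal.pone.0012345.",
    "No citation here.",
    '<a href="http://dx.doi.org/10.1038/nature09534">paper</a>',
]
print(count_citing_posts(posts))
```

A stricter or looser pattern would change the count, which is one way an implementation detail like this could lead to fewer matches than expected.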
Finally, I looked at what URLs were most commonly used in all the blog posts. Here are the top 20:
I was quite happy with this list because pretty much all of the links are science links. I thought there would be a lot more links to non-science sites.
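A list like this can be produced with a simple counter. The sketch below is one way to do it, under the assumptions that links are plain http or https URLs in the post text and that a URL is counted once per post; the pattern and function name are hypothetical, not the ones used here.

```python
from collections import Counter
import re

# Simple heuristic URL pattern (an assumption); it stops at whitespace,
# quotes, angle brackets, and closing parentheses.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")

def top_urls(posts, n=20):
    """Return the n URLs that appear in the most posts, with their post counts."""
    counts = Counter()
    for post in posts:
        # A set counts each URL at most once per post.
        counts.update(set(URL_PATTERN.findall(post)))
    return counts.most_common(n)

# Made-up posts with placeholder URLs, only to show usage.
posts = [
    "Read http://example.org/a and http://example.org/a again",
    "Also http://example.org/a",
    "See http://example.com/b",
]
print(top_urls(posts, 2))
```

Counting by domain instead of by full URL would only require reducing each match to its host name before updating the counter.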
I hope the results can provide a useful discussion piece. Obviously, this is just the start and we can do a lot more interesting analyses. In particular, I think such statistics can be the basis for alt-metrics style measures. If you’re interested in talking to me about these analyses, come find me at Science Online.