Last week, I was in Japan for the 15th International Semantic Web Conference.
For me, this was a big event as I was research track program co-chair together with the amazing Elena Simperl. Being a program chair is a funny thing: you’re not directly responsible for any individual paper, presentation, or review, but you feel responsible for the entirety. And obviously, organizing 664 reviews for 212 submissions isn’t something to be taken lightly. Beyond my service as research track chair, I think my main contribution was finding good coffee near the event:
With all that said, I think the entire program was really solid. All the preprints are on the website and the proceedings are available from Springer. I’ll try to summarize my main takeaways below. But first some quick stats:
- 430 participants
- 212 (research track) + 43 (application track) + 71 (resources track) = 326 submissions
- that’s up by 61 submissions from last year!
- Acceptance rates:
- 39/212 = 18% (research track)
- 12/43 = 28% (application track)
- 24/71 = 34% (resources track)
- I think these reflect the aims of the individual tracks
- We also had 102 posters and demos and 12 journal track papers
- 35 student travel winners
My three main takeaways:
- Frames are back!
- semantics on the web (notice the case)
- Science as the next challenge
- SPARQL as a driver for other CS communities
(Oh and apologies for the gratuitous use of images and twitter embeds)
Frames are back!
For the past couple of years, a chunk of the community has been focused on the problem of entity resolution/disambiguation, whether that’s from text to a KB or across multiple KBs. Indeed, one of the best paper winners (yes, we gave out two – both nominees had great papers) by ISI’s Information Integration Group was an excellent approach to multi-type entity resolution. Likewise, Axel and crew gave a pretty heavy-duty tutorial on link discovery. On the NLP front, Stefano Faralli presented a nice resource that disambiguates text to lexical resources with a focus on providing both symbolic and distributional representations.
What struck me at the conference was the number of papers beginning to think not just about entities and their relations but also about the context they are in. This need for context was well motivated by the folks at IBM Research working on medical question answering.
Essentially, this is thinking about classic AI frames, but asking how to obtain them automatically. A clear example of this is the (ongoing) work on FRED:
Similarly, the NewsReader system for extracting information into situated events is another example, as is the work on extracting process graphs from medical texts. Finally, in the NLP community there’s an increasing focus on developing resources in order to build automated parsers for frame-style semantic representations (e.g. Abstract Meaning Representation). Such representations can be enhanced by connections to semantic web resources, as discussed by Burns et al. (I knew this was a great idea in 2015!)
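For readers who haven’t run into AMR, the canonical example from the AMR literature gives a flavor of what these frame-style representations look like: the sentence “The boy wants to go” becomes a small graph of frames (want-01, go-01) whose argument slots are filled, with the boy re-entering as the one doing the going – roughly this:

```
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-01
            :ARG0 b))
```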
In summary, I think we’re beginning to see how the background knowledge available on the Semantic Web combined with better parsers can help us start to deal better with context in an automated fashion.
semantics on the web
Chris Bizer gave an insightful keynote reflecting on what the community’s expectations were for the semantic web and where we currently are.
He presented stats on the growth of Linked Data (e.g. stuff in the LOD cloud) as well as web data (e.g. schema.org-marked-up pages), but really the main takeaway is the boom in the latter. About 30% of the Web has HTML-embedded data – something like 12 million websites – and there’s an 86% adoption rate among top travel websites. I think the choice quote was:
“Probably, every hotel on earth is represented as web data.”
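To make that concrete, here’s a toy sketch (my own made-up hotel entry, not data from the talk) of the kind of schema.org JSON-LD a hotel page might embed, parsed into RDF triples with Python’s rdflib. This assumes rdflib 6+, which bundles a JSON-LD parser, and uses a local @vocab so nothing needs to be fetched from schema.org:

```python
from rdflib import Graph

# A made-up schema.org Hotel description of the sort embedded in hotel web pages.
# The local @vocab keeps the example self-contained (no remote context fetch).
hotel_jsonld = """
{
  "@context": {"@vocab": "http://schema.org/"},
  "@type": "Hotel",
  "name": "Example Hotel Kobe",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Kobe",
    "addressCountry": "JP"
  }
}
"""

g = Graph()
g.parse(data=hotel_jsonld, format="json-ld")  # requires rdflib >= 6.0

# Print the triples the embedded markup boils down to.
for s, p, o in g:
    print(s, p, o)
```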
The problem is that this sort of data is not clean, it’s messy – it’s webby data – which brings us to Chris’s important point for the community:
While standards have brought us a lot, I think we are starting as a research community to think increasingly about different kinds of semantics and different kinds of structured data. Some examples from the conference:
- Can we combine social and formal semantics for web data?
- What about something other than OWL (e.g. prototypes – or frames…see above)?
- What about focusing on common sense knowledge based on distributional semantics?
- What about rules as the primitive for reasoning about knowledge graphs?
- What about entities that are defined in the wild (e.g. the web) vs. from Wikipedia?
An embrace of the whole spectrum of semantics on the web is really a valuable move for the research community. Interestingly enough, I think we can truly experiment with web data through things like Common Crawl and the Web Data Commons. As knowledge graphs, triple stores, and ontologies become increasingly commonplace, especially in enterprise deployments, I’m heartened by these new areas of investigation.
The next challenge: Science
For me personally, the third keynote of ISWC, given by Professor Hiroaki Kitano – the CEO of Sony CSL, creator of (among other things) the AIBO, and a founder of RoboCup – was an inspirational speech laying out what he sees as the next AI grand challenge:
It will be hard for me to do justice to the keynote as the material-per-second ratio was pretty much off the chart, but he has an AI Magazine article laying out the vision.
Broadly, he used RoboCup as a framework for discussing how to organize a challenge and pointed to its effectiveness (e.g. Kiva Systems, a RoboCup spin-out, was acquired by Amazon for $770 million). He then focused on the issue of inefficiency in scientific discovery and in particular on how assembling knowledge is just too difficult.
He then went on to reframe the scientific question as one of massive search and verification over a hypothesis space.
I walked out of that keynote pretty charged up.
I think the semantic web community can be a big part of tackling this grand challenge. Science and medicine have always been important domains for applying these technologies and that showed up at this conference as well:
SPARQL as a driver for other CS communities
The 10-year award was given to Jorge Perez, Marcelo Arenas and Claudio Gutierrez for their paper Semantics and Complexity of SPARQL. Jorge gave just a beautiful 10-minute reflection on the paper and the relationship between theory and practice. I think his slide below really sums up the impact that SPARQL has had not just on the semantic web community but on CS as a whole:
As further evidence, I thought one of the best technical talks of the conference (even through an earthquake) was by Peter Boncz on emergent schemas for RDF querying.
It was a clear example of how the DB and semweb communities are learning from one another, and of how the semantic web’s different requirements (e.g. around schemas) drive new research.
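As a small illustration of the kind of pattern whose semantics and complexity the awarded paper pins down, here’s a sketch of a query mixing a basic graph pattern with OPTIONAL – the operator at the heart of that complexity analysis. This is my own example, assuming the public DBpedia endpoint is reachable and the SPARQLWrapper Python package is installed:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbo:  <http://dbpedia.org/ontology/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

    SELECT ?city ?label ?homepage WHERE {
        ?city a dbo:City ;
              rdfs:label ?label .
        FILTER (lang(?label) = "en")
        # OPTIONAL tolerates the webby-ness: homepages are often missing
        OPTIONAL { ?city foaf:homepage ?homepage }
    }
    LIMIT 5
""")
sparql.setReturnFormat(JSON)

# Print each city label with its homepage if one was bound.
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["label"]["value"], row.get("homepage", {}).get("value", "(no homepage)"))
```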
As a whole, it’s hard to beat a conference where you learn a ton and that has the following:
- What I said about ISWC 2015
- ISWC 2016 was covered by NHK TV
- I wonder what will replace COLD – Hot?
- Speaking of COLD, I attempted to channel Brad Allen at the Linked Data Debate about how to connect neural computing with the semantic web.
- Check out his keynote from Dublin Core
- But really Avi won
- I enjoyed the biomedical data integration and discovery workshop. Papers here. Two to call out:
- In Japan with a large number of people from the west who are on your Facebook feed => toilet posts
- Help make wikidata complete
- Did you know that GraphDB has an Elasticsearch backend?
- IKEA meatballs in Japan are like IKEA meatballs everywhere else
- Good to see what the folks at Springer Nature are doing
- Go Raphael & team for tackling reproducibility
- Scientometrics on the conference
- Highly recommended – Japanese tea ceremony with English explanation
- This week in triple stores: RDFox and SPARQL on Apache Spark
- Graph store, triple store, or relational – you decide
- See you in Vienna!