
Tag Archives: ESWC

From June 2 – 6, I had the pleasure of attending the Extended Semantic Web Conference 2019 held in Portorož, Slovenia. After ESWC, I had another semantic web visit with Axel Polleres, Sabrina Kirrane and team in Vienna. We had a great time avoiding the heat and talking about data search and other fun projects. I then paid the requisite price for all this travel and am just now getting down to emptying my notebook. Note to future self: do your trip reports at the end of the conference.

It’s been a while since I’ve been at ESWC so it was nice to be back. The conference was, I think, down a bit in terms of the number of attendees, but the same community spirit and interesting content (check out the award winners) were there. Shout out to Miriam Fernandez and the team for making it an invigorating event.

So what was I doing there? I was presenting work at the Deep Learning for Knowledge Graph workshop on trying to see if we could answer structured (e.g. SPARQL) queries over text (paper).

The workshop itself was packed. I think there were about 30-40 people in the room. In addition to presenting the workshop paper, I was also one of the mentors for the doctoral consortium. It was really nice to see the up-and-coming students, who put a lot of work into the session: a paper, a revised paper, a presentation and a poster. Victor and Maria-Esther did a fantastic job organizing this.

So what were my take-aways from the conference? I had many of the same thoughts coming out of this conference that I had at the recent AKBC 2019, especially around the ideas of polyglot representation and scientific literature understanding as an important domain driver (e.g. Predicting Entity Mentions in Scientific Literature and Mining Scholarly Data for Fine-Grained Knowledge Graph Construction). But there were some additional things as well.

Target Schemas

The first was a notion that I’ll term “target schemas”. Diana Maynard in her keynote talked about this. These are little conceptually focused ontologies designed specifically for the application domain. She talked about how working with domain experts to put together these little ontologies, which could then be the target for NLP tools, was really a key part of building these domain-specific analytical applications. I think this notion of simple schemas is also readily apparent in many commercial knowledge graphs.

The notion of target schemas popped up again in an excellent talk by Katherine Thornton on the use of ShEx. In particular, I would call out the introduction of an EntitySchema part of Wikidata (e.g. Schema for Human Gene or Software Title). These provide little target schemas that say something to the effect of “Hey, if your data matches this kind of schema, I can use it in my application”. I think this is a really powerful development.
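To make the “if you match this schema, I can use you” idea a bit more concrete, here is a minimal sketch in plain Python. It is not ShEx and not how Wikidata's EntitySchemas are evaluated; the `TargetSchema` class, the `conforms` and `missing` helpers, and the property names are all made up for illustration.

```python
from dataclasses import dataclass

# Values treated as "not really there" when checking a property.
_EMPTY = (None, "", [])


@dataclass(frozen=True)
class TargetSchema:
    """A tiny, application-specific schema: just a set of required properties."""
    name: str
    required: frozenset


def missing(entity: dict, schema: TargetSchema) -> list:
    """Return the required properties the entity lacks (sorted for stable output)."""
    return sorted(p for p in schema.required if entity.get(p) in _EMPTY)


def conforms(entity: dict, schema: TargetSchema) -> bool:
    """An entity is usable by the application if nothing required is missing."""
    return not missing(entity, schema)


# Hypothetical schema and entities, loosely inspired by the "human gene" example.
GENE_SCHEMA = TargetSchema("gene", frozenset({"label", "found_in_taxon", "chromosome"}))

complete = {"label": "gene-a", "found_in_taxon": "taxon-x", "chromosome": "17"}
partial = {"label": "gene-b", "found_in_taxon": "taxon-x"}

print(conforms(complete, GENE_SCHEMA))
print(missing(partial, GENE_SCHEMA))
```

The point of keeping the schema this small is that the application, not a general-purpose ontology, decides what counts as “good enough” data.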

The third keynote by Daniel Quercia was impressive. The Good City Life project about applying data to understand cities just makes you think. You really must check it out. More to this point of target schemas, however, was the use of these little conceptual descriptions in the various maps and analytics he did. By, for example, thinking about how to define urban sounds or feelings on a walking route, his team was able to develop these fantastic and useful views of the city.

I think the next step will be to automatically generate these target schemas. There was already some work headed in that direction. One was Generating Semantic Aspects for Queries, which was about how to use document mining to select which attributes one should show for an entity. Think of it as selecting what should show up in a knowledge graph entity panel. Likewise, in the talk on Latent Relational Model for Relation Extraction, Gaetano Rossiello talked about how to think about using analogies between example entities to help extract these kinds of schemas for small domains.


I think this notion is worth exploring more.

Feral Spreadsheets

What more can I say:

We need more here. Things like MantisTable. Data wrangling is the problem. Talking to Daniel about the data behind his maps just confirmed this problem as well.

Knowledge Graph Engineering

This was a theme that was also at AKBC – the challenge of engineering knowledge graphs. As an example, the Knowledge Graph Building workshop was packed. I really enjoyed the discussion, led by Ben de Meester, around how to evaluate the effectiveness of data mapping languages, especially the emphasis on developer usability. The experiences shared by the industrial automation team from Festo were really insightful. It’s amazing to see how knowledge graphs have been used to accelerate their product development process, but also the engineering effort and challenges it took to get there.


Likewise, Peter Haase in his audacious keynote (no slides – only a demo) showed how far we’ve come in the underlying platforms and technology to be able to create commercially useful knowledge graphs. This is really thanks to him and the other people who straddle the commercial/research line. It was neat to see the Open PHACTS style biomedical knowledge graph being built using SPARQL and API service wrappers.


However, these kinds of wrappers still need to be built, the links need to be created and, more importantly, the data needs to be made available. A summary of challenges:

Overall, I really enjoyed the conference. I got a chance to spend some time with a bunch of members of the community, and it’s great to see the continued excitement and the number of new research questions.

Random Notes

 

I seem to be a regular attendee of the Extended Semantic Web Conference series (2013 trip report). This year ESWC was back in Crete, which means that you can get photos like the one below taken to make your colleagues jealous:

2014-05-26 18.11.15

 

As I write this, the conference is still going on but I had to leave early to head to Iceland, where I will briefly gate crash the natural language processing crowd at LREC 2014. Let’s begin with the stats of ESWC:

  • 204 submissions
  • 25% acceptance rate
  • ~ 4.5 reviews per submission

The number of submissions was up from last year. I don’t have the numbers on attendance but it seemed in line with last year as well. So, what was I doing at the conference?

This year ESWC introduced a semantic web evaluation track. We participated in two of these new evaluation tracks. I showed off our linkitup tool for the Semantic Web Publishing Challenge [paper]. The tool lets you enrich research data uploaded to Figshare with links to external sites. Valentina Maccatrozzo presented her contribution to the Linked Open Data Recommender Systems challenge. She’s exploring using richer semantics to do recommendation, which, from the comments on her poster, was seen as a novel approach by the attendees. Overall, I think all our work went over well. However, it would be good to see more of the VU Semweb group content in the main track. The Netherlands only had 14 paper submissions. It was also nice to see PROV mentioned in several places. Finally, conferences are great places to do face-2-face work. I had nice chats with quite a few people, in particular with Tobias Kuhn on the development of the nanopublications spec and with Avi Bernstein on our collaboration leveraging his group’s Signal & Collect framework.

So what were the big themes of this year’s conference? I pulled out three:

  1. Easing development with Linked Data
  2. Entities everywhere
  3. Methodological maturity

Easing development

As a community, we’ve built interesting infrastructure for machine readable data sharing, querying, vocabulary publication and the like. Now that we have all this data, the community is turning towards making it easier to develop applications with it. This is not necessarily a new problem and people have tackled it before (e.g. ActiveRDF). But the availability of data seems to be renewing attention to this problem. This was reflected by Stefan Staab’s keynote on Programming the Semantic Web. I think the central issue he identified was how to program against the flexible data models that are the hallmark of semantic web data. Stefan argued strongly for static typing and programmer support but, as an audience member noted, there is a general trend in development circles towards document-style databases with weaker type systems. It will be interesting to see how this plays out.

Aside: A thought I had was whether we could easily publish the type systems that developers create when programming back out onto the web and merge them with existing vocabularies….

This notion of easing development was also present in the SALAD workshop (a workshop on APIs). This is dear to my heart. I’ve seen in my own projects how APIs really help developers make use of semantic data when building applications. There was quite a lot of discussion around the role of SPARQL with respect to APIs, as well as whether to supply data dumps or an API and what type of API that should be. I think it’s fair enough to say that Web APIs are winning (see the paper RESTful or RESTless – Current State of Today’s Top Web APIs), and we need to devise systems that deal with that while still leveraging all our semantic goodness. That being said, it’s nice to see mature tooling appearing for Linked Data/Semantic Web data (e.g. RedLink tools, Marin Dimitrov’s talk on selling semweb solutions commercially).

Entities everywhere

There were a bunch of papers on entity resolution, disambiguation, etc. Indeed, Linked Data provides a really fresh arena to do this kind of work, as both the data and schemas are structured and yet at the same time messy. I had quite a few nice discussions with Pedro Szekely on the topic and am keen to work on getting some of our ideas on linking into the Karma system he is developing with others. From my perspective, two papers caught my eye. One was on using coreference to actually improve SPARQL query performance. Oftentimes we think of all these equality links as a performance penalty, so it’s interesting to think about whether they can actually help us improve performance on different tasks. The other paper was “A Probabilistic Approach for Integrating Heterogeneous Knowledge Sources”, which uses Markov Logic Networks to align web information extraction data (e.g. NELL) to DBpedia. This is interesting as it allows us to enrich clean background knowledge with data gathered from the web. It’s also neat in that it’s another example of the combination of statistical inference and (soft) rules.
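On the coreference point, here is a small sketch of why equality links need not be a penalty. To be clear, this is not the method from the paper, just one way to think about it: collapse each cluster of equivalent identifiers into a single canonical one, so that a query touches fewer distinct terms and duplicate statements disappear. The identifiers, the `canonicalize` and `rewrite` helpers and the example triples are all hypothetical.

```python
def canonicalize(same_as):
    """Group identifiers linked by equality pairs (union-find) and map each
    identifier to the lexicographically smallest member of its group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in same_as:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Always keep the smaller identifier as root so output is deterministic.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    return {x: find(x) for x in parent}


def rewrite(triples, canon):
    """Replace subjects and objects by their canonical identifier; the set
    removes statements that become identical after rewriting."""
    return {(canon.get(s, s), p, canon.get(o, o)) for s, p, o in triples}


# Hypothetical equality links and data from three made-up sources.
SAME_AS = [("ex1:Vienna", "ex2:Wien"), ("ex2:Wien", "ex3:Vienna")]
TRIPLES = [
    ("ex1:Vienna", "ex:label", "Vienna"),
    ("ex2:Wien", "ex:country", "ex1:Austria"),
    ("ex3:Vienna", "ex:country", "ex1:Austria"),
]

CANON = canonicalize(SAME_AS)
MERGED = rewrite(TRIPLES, CANON)
print(sorted(MERGED))
```

In this toy case three statements about three identifiers become two statements about one, which is the kind of shrinkage that could plausibly help a query engine rather than hurt it.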

This emphasis on entities is in contrast with the thought-provoking keynote by Oxford philosopher Luciano Floridi, who discussed various notions of complexity and argued that we need to think not in terms of entities but in terms of interactions. This was motivated by the following statistic – that by 2020 there will be 7.5 billion people vs. 50 billion devices, and all of these things will be interconnected and talking.

Indeed, while dealing with entities, especially in messy data, is far from being a solved problem, we are starting to see dynamics emerging as a clear area of interest. This is reflected by the best student paper, Hybrid Acquisition of Temporal Scopes for RDF Data.

Methodological maturity

The final theme I wanted to touch on was methodological maturity. The semantic web project is 15 years old (young in scientific terms) and the community has now become focused on having rigorous evaluation criteria. I think every paper I saw at ESWC had a strong evaluation section (or at least a strongly defensible one). This is a good thing! However, this focus pushes people towards safety in their methodology (witness the plethora of papers that use LUBM), which can in turn lead towards safety in research. We had an excellent discussion about this trend in the EMPIRICAL workshop – check out a brief write up here. Indeed, it makes one wonder whether

  1. these simpler methodologies (my system is faster than yours on benchmark x) exacerbate a tendency to do engineering and not answer scientific questions; and
  2. the amalgamation of ideas that characterizes semantic web research is toned down, leading to less exciting research.

One answer to this trend is to encourage a more widespread acceptance and knowledge of different scientific methodologies (e.g. ethnography), which would allow us to explore other areas.

Finally,  I would recommend Abraham Bernstein & Natasha Noy – “Is This Really Science? The Semantic Webber’s Guide to Evaluating Research Contributions“, which I found out about at the EMPIRICAL workshop.

Final Notes

Here are some other pointers that didn’t fit into my themes.

 
