This page contains draft notes towards an essay on how data merging is carried out with RDF and Semantic Web technologies.

Everyone (myself included) says that data merging is "easier" with RDF and Semantic Web technologies, but there's very little documentation that explicitly backs up that claim. Most of the discussion involves hand waving, such as "it all magically happens when we use the same identifiers".

Yet RDF, along with a smattering of OWL and a rules language, supports a range of declarative approaches to data integration.

Laying out the options should show where the "easier" data merging actually happens, where there's still room for improvement, and perhaps where people need to focus on just the useful bits of OWL.

Where We Share the Same Identifiers
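As a rough illustration of the "same identifiers" case, here is a minimal sketch that models an RDF graph as a plain Python set of (subject, predicate, object) tuples. It is not a real RDF library: the example.org URIs and the merge and describe helpers are made up for illustration, and blank nodes (which a true RDF merge has to keep apart) are ignored entirely.

```python
# Triples are modelled as plain (subject, predicate, object) tuples and a
# graph is just a set of them. The example.org URIs are hypothetical.
EX = "http://example.org/"
FOAF = "http://xmlns.com/foaf/0.1/"

graph_a = {
    (EX + "alice", FOAF + "name", "Alice"),
    (EX + "alice", FOAF + "knows", EX + "bob"),
}
graph_b = {
    (EX + "alice", FOAF + "name", "Alice"),  # same statement as in graph_a
    (EX + "alice", FOAF + "homepage", EX + "alice/home"),
}


def merge(*graphs):
    """Merge graphs by set union; duplicate triples collapse automatically."""
    merged = set()
    for g in graphs:
        merged |= g
    return merged


def describe(graph, subject):
    """Return every (predicate, object) pair known about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


merged = merge(graph_a, graph_b)
for predicate, obj in describe(merged, EX + "alice"):
    print(predicate, obj)
```

Because both sources use the same URI for the same resource, there is no join logic or schema mapping here: the union is the merge, and everything known about that resource ends up under the one identifier.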

Where We Have Different Identifiers

...but have a common "primary key"
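One reading of the common "primary key" case is a property whose value uniquely identifies a resource, in the way FOAF declares foaf:mbox to be an owl:InverseFunctionalProperty. The sketch below assumes that reading and again models triples as plain tuples. The a.example and b.example URIs and the smush helper are hypothetical, and a real reasoner would derive owl:sameAs statements rather than rewrite identifiers.

```python
# Two sources describe the same person under different, made-up identifiers,
# but both record the same mailbox.
EX_A = "http://a.example/"
EX_B = "http://b.example/"
FOAF = "http://xmlns.com/foaf/0.1/"
MBOX = FOAF + "mbox"

graph = {
    (EX_A + "people/1", MBOX, "mailto:alice@example.org"),
    (EX_A + "people/1", FOAF + "name", "Alice"),
    (EX_B + "staff/alice", MBOX, "mailto:alice@example.org"),
    (EX_B + "staff/alice", FOAF + "homepage", "http://example.org/alice"),
}


def smush(graph, key_properties):
    """Rewrite resources that share a key value to one canonical identifier.

    Single pass only: it does not chain equivalences across several keys.
    """
    # Group subjects by the (key property, value) pair they share.
    groups = {}
    for s, p, o in graph:
        if p in key_properties:
            groups.setdefault((p, o), set()).add(s)

    # Pick the lexicographically smallest identifier in each group.
    canonical = {}
    for members in groups.values():
        target = min(members)
        for member in members:
            canonical[member] = target

    # Rewrite both subject and object positions.
    return {(canonical.get(s, s), p, canonical.get(o, o)) for s, p, o in graph}


smushed = smush(graph, {MBOX})
for triple in sorted(smushed):
    print(triple)
```

The only thing that has to be declared is which properties act as keys. The matching itself is generic, which is arguably where the "easier" part comes from in this case.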

...but our resources are a "primary key" themselves

...where we can infer a relationship

