How to edit DBpedia Mappings
Revision as of 11:57, 7 July 2011
In the spirit of open source projects, the idea of this wiki is to enable the interested public to contribute to the definition of DBpedia mappings by updating existing mappings and by adding new mappings to this wiki.
The types of Wikipedia content most valuable for DBpedia extraction are infoboxes and tables. Infoboxes display an article's most relevant facts as a table of attribute-value pairs on the top right-hand side of the Wikipedia page.
Because Wikipedia's infobox template system has evolved in a decentralized way, different communities of Wikipedia editors use different templates to describe the same type of thing (e.g. infobox_city_japan, infobox_swiss_town and infobox_town_de). Different templates also use different names for the same attribute (e.g. birthplace and placeofbirth). And because many Wikipedia editors do not strictly follow the recommendations on the page that describes a template, attribute values are expressed in a wide range of formats and units of measurement.
In order to overcome the problems of synonymous attribute names and multiple templates being used for the same type of things, the DBpedia project maps Wikipedia templates as well as tables within an article to the DBpedia ontology. These mappings are specified using the DBpedia Mapping Language. The mapping language makes use of MediaWiki templates that define DBpedia ontology classes and properties as well as template/table to ontology mappings.
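The following mappings map English Wikipedia infoboxes and tables to this ontology. The existing ones give a good idea of how mappings work:
- Infobox Mappings: http://mappings.dbpedia.org/index.php?title=Special%3AAllPages&from=&to=&namespace=204
- Table Mappings: http://mappings.dbpedia.org/index.php?title=Special%3APrefixIndex&prefix=Table&namespace=204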
Tools
- Best practice mapping guide. This guide collects helpful hints for writing mappings.
- Mapping Validator. When you are editing a mapping, there is a validate button at the bottom of the page. Pressing the button validates your changes for syntactic correctness and highlights inconsistencies such as missing property definitions.
- Extraction Tester. The extraction tester tests a mapping against a set of example Wikipedia pages. This gives you direct feedback about whether a mapping works and what the resulting data will look like.
- MappingTool. The DBpedia MappingTool is a graphical user interface that helps users create and edit mappings.
- DBpedia Mapping Language Specification (detailed)
Create new mappings
To create a new mapping, enter the following URL in your web browser:
http://mappings.dbpedia.org/index.php/Mapping_LANGUAGE:INFOBOXNAME
- Replace LANGUAGE with the code of the language you are working on (for example, mt for Maltese).
- Replace INFOBOXNAME with the name of the infobox you want to create a mapping for (replace spaces with underscores).
For example, for the Album infobox on the Maltese Wikipedia:
http://mappings.dbpedia.org/index.php/Mapping_mt:Infobox_album
If there is no mapping for this infobox yet, you will see a page saying "There is currently no text in this page. You can search for this page title in other pages, search the related logs, or edit this page." Click "create" at the top of the page and start writing the mapping.
How to map a Wikipedia Template
- Get the encoded template page name from Wikipedia. Make sure that the template is not a redirect page.
- Example: For the Wikipedia template Infobox musical artist use Infobox_musical_artist.
- Create a wiki page in this wiki in the Mapping namespace, using the encoded Wikipedia template page name.
- Example: For the Wikipedia template Infobox musical artist create the wiki page Mapping:Infobox_musical_artist.
- Decide on the ontology class you would like to map the template to.
- Example: Ontology classes belong to the Class namespace. A list of existing ontology classes can be found via the sidebar (Ontology Classes).
- Write a Template:TemplateMapping or Template:ConditionalMapping to map the Wikipedia template to an ontology class and save it to the created wiki page in the Mapping namespace.
The full documentation on writing mappings can be found via the DBpedia Repository.
Some helpful hints can be found in the best practice mapping guide.
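As a minimal sketch of the steps above, the page Mapping:Infobox_musical_artist might contain a mapping like the following. The class name MusicalArtist and all template and ontology property names are illustrative assumptions; check the Ontology Classes list and the actual infobox before using them.

```wikitext
<!-- Hypothetical content of the wiki page Mapping:Infobox_musical_artist -->
{{TemplateMapping
| mapToClass = MusicalArtist
| mappings =
    <!-- each PropertyMapping maps one infobox attribute to one ontology property -->
    {{PropertyMapping | templateProperty = name | ontologyProperty = foaf:name }}
    {{PropertyMapping | templateProperty = genre | ontologyProperty = genre }}
}}
```

After saving, the Mapping Validator and the Extraction Tester described under Tools can be used to check the result.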
Template to Ontology Mapping Language
When mapping a Wikipedia template to an ontology class and mapping template properties to ontology properties for this template, users will have to edit the corresponding template documentation page in MediaWiki.
The following templates cover the template to ontology schema mapping:
- TemplateMapping Mapping from Wikipedia templates to ontology classes.
- PropertyMapping Mapping from Wikipedia template properties to ontology properties.
- IntermediateNodeMapping Extracting multiple values from a single property requires an intermediate node. The IntermediateNodeMapping lets you map Wikipedia template properties to ontology properties on an additional node and connect that node to the mapped instance.
- ConditionalMapping Maps templates to ontology classes. In comparison to a TemplateMapping the mapping can be defined depending on template properties and their values.
- Custom mappings
- To cover specific, more complex mapping cases, the DBpedia extraction framework can be extended with custom parsers which have to implement a specific PHP interface. These parsers are invoked using custom mappings.
Template Mapping
The TemplateMapping template offers the following template parameters:
- mapToClass
- Templates are mapped to ontology classes. The template parameter mapToClass allows one DBpedia ontology class as a value.
- correspondingClass, correspondingProperty
- If different templates are used on the same page (for instance Automobile and Automobile Generation), the instance resulting from the secondary template (Automobile Generation) can be connected to the instance of the primary template (Automobile) using a corresponding property. Thus, if an instance of type correspondingClass is found on the same page, it will be connected to the instances of the mapped template by correspondingProperty.
- mappings
- Mappings map template properties to ontology properties. They have to be defined using PropertyMapping or IntermediateNodeMapping. Custom, user-defined mappings such as the GeocoordinatesMapping can also be used.
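The Automobile example could be sketched roughly as follows. This is only an illustration of how the three parameters fit together: the class name AutomobileGeneration, the property name automobile, and the template property production are hypothetical and may not exist in the ontology or the infobox.

```wikitext
<!-- Hypothetical mapping for the secondary template (Automobile Generation) -->
{{TemplateMapping
| mapToClass = AutomobileGeneration
<!-- if an Automobile instance is found on the same page, link to it -->
| correspondingClass = Automobile
| correspondingProperty = automobile
| mappings =
    {{PropertyMapping | templateProperty = production | ontologyProperty = productionYears }}
}}
```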
Property Mapping
The PropertyMapping template offers the following template parameters:
- ontologyProperty
- A template property to ontology property mapping should list one ontology property.
- templateProperty
- A template property to ontology property mapping should list one template property which is to be mapped.
- unit
- If the mapped template property contains a numerical value and a unit, the unit has to be defined. If the template property has no default unit, e.g. because its values can use different units of the same dimension, the dimension has to be defined instead, for usability as well as validation reasons. Possible dimensions include Length and Mass.
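A short sketch of both uses of the unit parameter is shown below. The property names are hypothetical, and the unit name metre is an assumption about how units are spelled in the ontology; the dimension Length is taken from the description above.

```wikitext
<!-- values are always given in one unit: name the unit (unit name assumed) -->
{{PropertyMapping | templateProperty = elevation_m | ontologyProperty = elevation | unit = metre }}

<!-- values may carry different units of the same dimension: name the dimension -->
{{PropertyMapping | templateProperty = height | ontologyProperty = height | unit = Length }}
```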
Intermediate Node Mapping
The IntermediateNodeMapping template offers the following template parameters:
- nodeClass, correspondingProperty
- Creates an additional node of the type nodeClass, which will be connected to the instance extracted from template by the property provided by correspondingProperty.
- mappings
- Mappings map template properties to ontology properties. They have to be defined using PropertyMapping, IntermediateNodeMapping, or a CustomMapping.
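An intermediate node mapping is placed inside the mappings parameter of a TemplateMapping. The sketch below assumes a hypothetical node class PersonFunction and hypothetical property names; only the parameter names nodeClass, correspondingProperty and mappings come from the description above.

```wikitext
{{TemplateMapping
| mapToClass = Person
| mappings =
    {{IntermediateNodeMapping
    <!-- type of the additional node -->
    | nodeClass = PersonFunction
    <!-- property that links the extracted person to the additional node -->
    | correspondingProperty = occupation
    | mappings =
        {{PropertyMapping | templateProperty = occupation | ontologyProperty = title }}
    }}
}}
```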
Conditional Mapping
The ConditionalMapping template maps templates to ontology classes. In contrast to a TemplateMapping, the mapping can depend on template properties and their values.
- cases: Cases define conditions on template properties and their values and can change the default mapping, such as the ontology class the template is mapped to and the ontology properties the template properties are mapped to. The cases template property should contain a list of Condition templates.
- defaultMappings: The default mapping defines the default template property mappings using PropertyMapping etc. The default ontology class the template is mapped to has to be defined by an otherwise condition.
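The following sketch shows the general shape of a conditional mapping. The parameter names of the Condition template (templateProperty, operator, value, mapping) and the operator names are assumptions not documented on this page, and all class and property names are hypothetical; consult the mapping language specification for the exact form.

```wikitext
{{ConditionalMapping
| cases =
    <!-- assumed Condition syntax: map to Band if the infobox says so -->
    {{Condition
    | templateProperty = background
    | operator = equals
    | value = group_or_band
    | mapping = {{TemplateMapping | mapToClass = Band }}
    }}
    <!-- the otherwise condition defines the default ontology class -->
    {{Condition
    | operator = otherwise
    | mapping = {{TemplateMapping | mapToClass = MusicalArtist }}
    }}
| defaultMappings =
    {{PropertyMapping | templateProperty = name | ontologyProperty = foaf:name }}
}}
```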
Custom Mappings
For specific tasks, such as extracting durations or calculating a geo-location-ID based on multiple properties, we allow the DBpedia extraction framework to be extended with custom value parsers and allow the definition of DBpedia custom mapping templates. The name of a custom mapping template has to be equal to the name of the corresponding DBpedia parser class. As examples of custom mapping, we define the DateIntervalMapping and the GeocoordinatesMapping.
The DateIntervalMapping template provides an exact mapping from start and end dates of a template property value to ontology properties. It offers the following template parameters:
- templateProperty
- startDateOntologyProperty
- endDateOntologyProperty
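For illustration, a DateIntervalMapping for a template property holding a value such as "1990–2005" might look like this. The template property and both ontology property names are hypothetical.

```wikitext
<!-- splits one interval value into a start date and an end date -->
{{DateIntervalMapping
| templateProperty = years_active
| startDateOntologyProperty = activeYearsStartYear
| endDateOntologyProperty = activeYearsEndYear
}}
```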
The GeocoordinatesMapping template offers the following template parameters:
- coordinates
- Use the coordinates parameter if the geo coordinates are covered by one template property.
- latitude
- longitude
- latitudeDirection
- latitudeDegrees
- latitudeMinutes
- latitudeSeconds
- longitudeDirection
- longitudeDegrees
- longitudeMinutes
- longitudeSeconds
- ontologyProperty
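Which parameters are used depends on how the infobox stores its coordinates. The two sketches below use only parameter names from the list above; the infobox attribute names on the right-hand side (latitude, longitude, lat_d and so on) are hypothetical.

```wikitext
<!-- latitude and longitude each stored in a single infobox attribute -->
{{GeocoordinatesMapping | latitude = latitude | longitude = longitude }}

<!-- coordinates split into degrees, minutes, seconds and direction -->
{{GeocoordinatesMapping
| latitudeDegrees = lat_d | latitudeMinutes = lat_m | latitudeSeconds = lat_s | latitudeDirection = lat_NS
| longitudeDegrees = long_d | longitudeMinutes = long_m | longitudeSeconds = long_s | longitudeDirection = long_EW
}}
```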
The CombineDateMapping template offers the following template parameters:
- templateProperty1
- unit1
- templateProperty2
- unit2
- templateProperty3
- unit3
- ontologyProperty
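As a rough sketch, a CombineDateMapping that assembles one date from two infobox attributes could look as follows. The attribute names, the ontology property, and the unit values xsd:gYear and xsd:gMonthDay are assumptions, since this page does not state which unit values are accepted.

```wikitext
<!-- combines a year attribute and a month/day attribute into one date -->
{{CombineDateMapping
| templateProperty1 = birth_year
| unit1 = xsd:gYear
| templateProperty2 = birth_month_day
| unit2 = xsd:gMonthDay
| ontologyProperty = birthDate
}}
```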
The CalculateMapping template offers the following template parameters:
- operation
- templateProperty1
- unit1
- templateProperty2
- unit2
- ontologyProperty
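A CalculateMapping might be written along these lines. The operation value add, the unit name minute, and all property names are assumptions for illustration; this page does not list the supported operations.

```wikitext
<!-- computes one ontology value from two infobox attributes -->
{{CalculateMapping
| operation = add
| templateProperty1 = runtime_part1
| unit1 = minute
| templateProperty2 = runtime_part2
| unit2 = minute
| ontologyProperty = runtime
}}
```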
Table mappings apply to tables containing a set of keywords in the table header.
If a table mapping is defined, all rows of the table are mapped to instances of an ontology class, and all of its columns are mapped to ontology properties.
How to map a Wikipedia Table
- Find important keywords in the table header that identify a table unambiguously.
- Create a wiki page in this wiki in the Mapping namespace, using the Table prefix, or use an existing table mappings wiki page. You can define more than one table mapping on one wiki page. The wiki page name doesn't have to refer to any of the table keywords. Bundling table mappings depending on the table topic could be of use.
- A list of existing table mappings can be found via the sidebar (Table Mappings).
- Decide on the ontology class you would like to map the table to.
- A list of existing ontology classes can be found via the sidebar (Ontology Classes).
- Write a Template:TableMapping to map the Wikipedia table rows to an ontology class and save it to the created wiki page in the Mapping namespace.
The full documentation on writing mappings can be found via the DBpedia Repository.
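A table mapping could be sketched as below. This page does not list the parameters of TableMapping, so everything here is an assumption made by analogy with TemplateMapping: in particular the keywords parameter and its comma-separated value format, the use of column headers as templateProperty values, and all class and property names. Check the mapping language specification before relying on it.

```wikitext
<!-- Hypothetical table mapping; parameter names are assumed, not confirmed -->
{{TableMapping
| mapToClass = GrandPrix
<!-- assumed: header keywords that identify the table unambiguously -->
| keywords = race, driver, constructor
| mappings =
    <!-- assumed: each column header is addressed like a template property -->
    {{PropertyMapping | templateProperty = driver | ontologyProperty = firstDriver }}
}}
```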