Mapping Guide
Dear fellow mappers, one benefit of the DBpedia ontology is that it standardises the properties used by entities and reduces redundancy among them. At the moment, however, the ontology is starting to inflate with near-duplicate properties. It is becoming unclear, and the benefits of standardisation are getting lost. For instance, there are the ontology properties dateClosed, closingDate, closed, dateOfAbandonment and dissolved. All of these describe the same, or at least nearly the same, thing: for example the closure of a firm, the closing of a road, the decommissioning of a facility, or the abandonment of a project. There is clearly a need for a short guide on how to write mappings while preserving the usefulness of the ontology. The basic introduction to writing mappings can be found here.
General instructions
Create a user page for your account and insert some information about yourself. Please add your email address. Thank you!
Please try to minimize the number of edits. Write the whole mapping first before committing it. That helps other people keep track of the edits.
Generally, if you find unclear or duplicate ontology properties, do not hesitate to create a discussion page for the property and note your questions or objections there. Help us keep the ontology clean and useful.
Check the mapping statistics
If you ask yourself "Where do I start mapping?", please check the Mapping Statistics. They give you a good idea of where new mappings would make the biggest impact.
Check redirects for your infobox
If you have found an infobox that isn't mapped yet, check whether it redirects to another infobox. If so, check whether that target infobox is already mapped. If it is not, create the mapping for the target of the redirect, not for the infobox that redirects.
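The redirect check can be sketched in a few lines of Python. This is only an illustration: `resolve_infobox`, `infobox_to_map`, and the two sample collections are hypothetical names and data, and you would have to fill them from Wikipedia and the mappings wiki yourself. The single redirect shown is the one mentioned elsewhere in this guide.

```python
def resolve_infobox(name, redirects):
    """Follow redirects until reaching an infobox that does not redirect further."""
    seen = set()
    while name in redirects:
        if name in seen:
            raise ValueError(f"redirect loop at {name!r}")
        seen.add(name)
        name = redirects[name]
    return name


def infobox_to_map(name, redirects, mapped):
    """Return the infobox that needs a mapping, or None if the target is already mapped."""
    target = resolve_infobox(name, redirects)
    return None if target in mapped else target


# Hypothetical sample data for illustration only.
redirects = {"Infobox china station": "Infobox China station"}
mapped = {"Infobox station"}

# The mapping should be written for the redirect target, not for the redirecting name.
print(infobox_to_map("Infobox china station", redirects, mapped))
```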
Read infobox template documentation
Use the template documentation of the infobox you want to map as your source for property definitions. It can be found on the template's Wikipedia page; see the template documentation of Infobox China station, for instance. Of course, not all templates have adequate documentation. If your infobox has none, the following points become even more important.
Check for similar mappings
A helpful hint: check for already mapped infoboxes that describe similar things. For example, if you want to map "Infobox China station", the mappings for "Infobox station" or "Infobox japan station" are really helpful. You can find similar infoboxes via the Wikipedia categories. Most template documentation pages link to those categories at the bottom.
DO NOT copy blindly.
- Do not just copy and paste; look carefully at properties that are equal or similar to the properties of the infobox you are mapping.
- Your infobox may be different from the infobox of the mapping you're reusing. Read the documentation of your infobox. In particular, check whether to map to object properties or datatype properties (see Object/DataProp Dichotomy).
- Many mappings have various errors. Do not propagate the errors by copying uncritically. If you find (or even suspect) an error in the mapping you're reusing, raise an issue.
Map the properties
Please put some research effort into this step.
Get an overview of the property values
Get an overview of the values of the infobox property that you want to map. Issue #327 should make this very easy, but for now you need to mess with some (light) SPARQL.
Go to http://dbpedia.org/sparql and enter the following query:
SELECT DISTINCT * WHERE {
  ?s <http://dbpedia.org/property/platform> ?o .
  ?s <http://dbpedia.org/property/wikiPageUsesTemplate> <http://dbpedia.org/resource/Template:Infobox_china_station> .
}
Replace platform with the name of your infobox property. Note that spaces and underscores are removed and compound names are written in camelCase. Replace Infobox_china_station with the infobox you are writing the mapping for.
- The current DBpedia release may already be outdated, so you have to take recent redirects into account. For example, "Infobox china station" now redirects to "Infobox China station".
- If your query returns no results, try a simple property that is used in most instances of the infobox, such as "name". That way you can check whether your query is correct.
- Otherwise, check the infobox history for redirects and try other variations of the infobox name. The results show you what kind of values the property holds.
- Better yet, query http://live.dbpedia.org/sparql, which should be pretty up to date.
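If you run this query for several properties, a small helper can build it for you. The following Python sketch only approximates the naming rule described above (spaces and underscores removed, camelCase); the actual extraction may treat some names differently. The function names and the sample keys in the comments are made up for illustration. The `query` URL parameter is the standard one from the SPARQL protocol.

```python
import re
from urllib.parse import urlencode


def to_property_name(infobox_key):
    """Approximate the rule above: drop spaces/underscores, camelCase the rest.

    The first part is kept as written; only later parts get a capital letter.
    """
    parts = [p for p in re.split(r"[ _]+", infobox_key.strip()) if p]
    if not parts:
        raise ValueError("empty infobox property name")
    return parts[0] + "".join(p[0].upper() + p[1:] for p in parts[1:])


def build_values_query(infobox_key, infobox_name):
    """Build the overview query from this guide for one infobox property."""
    prop = to_property_name(infobox_key)
    template = infobox_name.strip().replace(" ", "_")
    return (
        "SELECT DISTINCT * WHERE {\n"
        f"  ?s <http://dbpedia.org/property/{prop}> ?o .\n"
        "  ?s <http://dbpedia.org/property/wikiPageUsesTemplate> "
        f"<http://dbpedia.org/resource/Template:{template}> .\n"
        "}"
    )


def endpoint_url(query, endpoint="http://dbpedia.org/sparql"):
    """Return a URL that submits the query to the given SPARQL endpoint."""
    return endpoint + "?" + urlencode({"query": query})


query = build_values_query("platform", "Infobox china station")
print(query)
print(endpoint_url(query, endpoint="http://live.dbpedia.org/sparql"))
```

If the query returns nothing, call `build_values_query` again with "name" as the property, or with another spelling of the infobox name, as described in the list above.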
Search for ontology properties
Search for ontology properties not only via the left-hand search box in the wiki menu, but also via the Ontology Properties link in the menu. Note that searching for "date" does not return every property that includes "date" in its name or label. You only get the properties that start with "date", so closingDate is not in the results. The wiki's search function is therefore not sufficient, so do not rely on its results for the moment (by the way, do you know a good MediaWiki search extension?).
If you have found a candidate ontology property for the infobox property, follow the wiki's "What links here" link and compare the infobox properties already mapped to that ontology property with the one you want to map. Do they describe the same thing? Note that some existing mappings may be inaccurate. If you find an inconsistency, add your concerns to the discussion page of the inaccurate mapping, or fix it if it is an unambiguous error.
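One way around the prefix-only search is to copy the property names from the Ontology Properties listing into a list and filter them yourself. The Python sketch below assumes you already have such a list; the function names are hypothetical, and the sample names are the five properties from the introduction of this guide.

```python
def prefix_search(names, term):
    """Mimic the wiki search: only names that start with the term."""
    needle = term.lower()
    return [n for n in names if n.lower().startswith(needle)]


def substring_search(names, term):
    """Case-insensitive match anywhere in the name."""
    needle = term.lower()
    return [n for n in names if needle in n.lower()]


names = ["dateClosed", "closingDate", "closed", "dateOfAbandonment", "dissolved"]

print(prefix_search(names, "date"))     # misses closingDate
print(substring_search(names, "date"))  # finds closingDate as well
```

Neither search finds closed or dissolved, which is exactly why checking "What links here" and the existing mappings remains necessary.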
Create new ontology properties
If you have an infobox property that definitely cannot be mapped to an existing ontology property, you can create a new one. But please stick to some simple rules:
Naming conventions
The name of the new property should not just be copied from the infobox. Instead, look at the template documentation and the property definition, if there is one. If not, look at a few Wikipedia articles that use the infobox you want to map, or go back to the SPARQL query above, and check how the property is used. If the property holds numbers, reflect that in the name of the new ontology property with a prefix like "numberOf"; if it holds dates, the term "date" should be part of the name. Generally, the property name should be built from more than one word.
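The naming rules can be turned into a quick self-check. The following Python sketch is not part of any DBpedia tooling; `check_property_name` and its `value_kind` argument are hypothetical, and the checks are only the three rules stated above.

```python
import re


def check_property_name(name, value_kind):
    """Return a list of warnings; value_kind is 'number', 'date' or 'other'."""
    warnings = []
    # Split camelCase into words: a leading lowercase run, then each capitalised run.
    # Acronyms are split letter by letter, so treat the word count as a rough hint.
    words = re.findall(r"[a-z0-9]+|[A-Z][a-z0-9]*", name)
    if len(words) < 2:
        warnings.append("name should be built from more than one word")
    if value_kind == "number" and not name.startswith("numberOf"):
        warnings.append("numeric property: consider a prefix such as 'numberOf'")
    if value_kind == "date" and "date" not in name.lower():
        warnings.append("date property: include 'date' in the name")
    return warnings


for name, kind in [("closingDate", "date"), ("closed", "date"), ("platforms", "number")]:
    print(name, check_property_name(name, kind))
```

A name that passes these checks can still be a poor choice, so the template documentation and the actual values remain the main source.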
Domain
Take care when defining the domain and range of a property. Do not define them as owl:Thing just because it is simple. If your property is specific to an ontology class, do not hesitate to define that class as the domain. That prevents people from reusing the property for other classes by mistake, especially if the property name is ambiguous.
Range
The range of the property should be defined by considering both the property values and the infobox definition. Some infobox properties hold different data types or value patterns because the property is not clearly defined in the template documentation, so Wikipedia authors use it as they see fit. That makes it difficult for us to define the property's range. If a range is defined in the infobox definition, generally stick to it. I have found infobox properties with a defined range whose actual values mostly disagreed with it. In such a case, choose a range that covers the property values and leave a note in the property comment. If the infobox property has no defined range, you always have to look at the values. For example, you may have to weigh a strict object property with an ontology class as range against a datatype property with xsd:string as range. A string captures more information, but an object property is the clearer definition. You can explain your decision in the property comment.
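To see how mixed the values really are, you can tally them by rough kind. The Python sketch below works on values copied from the SPARQL results; the categories, the regular expressions, and the sample values are assumptions made for illustration, not an official classification.

```python
import re
from collections import Counter


def classify(value):
    """Put one property value into a rough category."""
    v = value.strip()
    if v.startswith("http://dbpedia.org/resource/"):
        return "resource"
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", v):
        return "date"
    if re.fullmatch(r"[+-]?\d+", v):
        return "integer"
    if re.fullmatch(r"[+-]?\d*\.\d+", v):
        return "decimal"
    return "string"


def summarize(values):
    """Count how many values fall into each category."""
    return Counter(classify(v) for v in values)


def dominant(summary):
    """Return the most frequent category and its share of all values."""
    kind, count = summary.most_common(1)[0]
    return kind, count / sum(summary.values())


# Hypothetical sample values for a property such as "platform".
sample = ["2", "4", "2 side platforms", "http://dbpedia.org/resource/Island_platform"]
summary = summarize(sample)
print(summary)
print(dominant(summary))
```

If "resource" dominates, an object property is probably the clearer choice. A mixed result like the one above points towards a datatype property, plus a note in the property comment explaining the decision.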
Comments
YOU MUST add an English comment to the ontology property. Do not just repeat the prop name, but give some useful info on usage and contrast it with other similar props. If the template documentation has a definition of the property, use it as the comment. A short description of the property or a definition of its values is really helpful for other people who have to decide whether the property can be used for their mapping.
The biggest complaint of your fellow ontology editors is that props are not documented. In the near future, new props and classes without a comment will be deleted.
Some examples for good comments:
- OntologyProperty:sportDiscipline: the sport discipline the athlete practices, e.g. Diving, or that a board member of a sporting club is focusing on
- OntologyProperty:zodiacSign: Applies to persons, planets, etc.
- OntologyProperty:bustWaistHipSize: Use this property if all 3 sizes are given together (DBpedia cannot currently extract 3 Lengths out of a field). Otherwise use separate fields bustSize, waistSize, hipSize
- OntologyProperty:isHandicappedAccessible: True if the station is handicapped accessible.
- OntologyProperty:effectiveRadiatedPower: In radio telecommunications, effective radiated power or equivalent radiated power (ERP) is a standardized theoretical measurement of radio frequency (RF) energy using the SI unit watts (http://en.wikipedia.org/wiki/Effective_radiated_power)
Bad examples of missing comments:
- What's OntologyProperty:member vs OntologyProperty:membership?
- When to use OntologyProperty:teamMember vs OntologyProperty:currentTeamMember vs OntologyProperty:sportsTeamMember?
- What's OntologyProperty:event? We need to investigate to find out that it should be replaced by OntologyProperty:sportDiscipline (#12). Not good!
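A comment that is missing or that only repeats the property name can be caught mechanically. The Python sketch below is a hypothetical helper, not an existing wiki feature. It flags only those two cases and cannot judge whether a comment is actually useful.

```python
import re


def split_camel_case(name):
    """Split a camelCase property name into lowercase words."""
    return [w.lower() for w in re.findall(r"[a-z0-9]+|[A-Z][a-z0-9]*", name)]


def comment_problem(prop_name, comment):
    """Return a reason string if the comment looks unhelpful, else None."""
    text = (comment or "").strip()
    if not text:
        return "missing comment"
    name_words = set(split_camel_case(prop_name))
    comment_words = set(re.findall(r"[a-z0-9]+", text.lower()))
    # Every word of the comment already appears in the property name.
    if comment_words <= name_words:
        return "comment only repeats the property name"
    return None


print(comment_problem("member", ""))
print(comment_problem("zodiacSign", "Zodiac sign"))
print(comment_problem("isHandicappedAccessible",
                      "True if the station is handicapped accessible."))
```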
Validate the infobox mapping
Validate your mapping. Use the "Test this mapping" link on the mapping page. In particular, check the properties that you have created yourself.
Add a few example Wikipedia articles that use the infobox you just mapped as test cases:
- Go to http://mappings.dbpedia.org/server/extraction/en/ (adjust the language tag at the end of the URL)
- Put the final test link(s) into the Discussion tab under heading "Testing"
- Read more at Main_Page#Testing_Best_Practices