Mapping statistics: Difference between revisions
(→DBpedia Mapping Statistics: added ar, bn, eu) |
(bg, bugfixes) |
Revision as of 10:25, 25 April 2012
DBpedia Mapping Statistics
The statistics give you an overview of the infoboxes that are already mapped and of their properties. To help you spend your "mapping time" efficiently, they show which infoboxes deserve most of your attention. The statistics are live, so you can see the effect of your changes immediately.
Statistics are available for the following languages:
- Arabic (ar)
- Bulgarian (bg)
- Bengali (bn)
- Catalan (ca)
- Czech (cs)
- German (de)
- Greek (el)
- English (en)
- Spanish (es)
- Basque (eu)
- French (fr)
- Irish (ga)
- Hindi (hi)
- Croatian (hr)
- Hungarian (hu)
- Italian (it)
- Korean (ko)
- Dutch (nl)
- Polish (pl)
- Portuguese (pt)
- Russian (ru)
- Slovene (sl)
- Turkish (tr)
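The statistics pages linked from this list appear to share one URL pattern, `http://mappings.dbpedia.org/server/statistics/<code>/`. The following is a minimal sketch that builds these URLs from the language codes above. The helper name `statistics_url` is hypothetical, and the assumption that every listed language follows the same pattern is not confirmed for all of them.

```python
# Base URL as it appears in the statistics links on the mappings wiki.
BASE_URL = "http://mappings.dbpedia.org/server/statistics/"

# Language codes and names exactly as listed on this page.
LANGUAGES = {
    "ar": "Arabic", "bg": "Bulgarian", "bn": "Bengali", "ca": "Catalan",
    "cs": "Czech", "de": "German", "el": "Greek", "en": "English",
    "es": "Spanish", "eu": "Basque", "fr": "French", "ga": "Irish",
    "hi": "Hindi", "hr": "Croatian", "hu": "Hungarian", "it": "Italian",
    "ko": "Korean", "nl": "Dutch", "pl": "Polish", "pt": "Portuguese",
    "ru": "Russian", "sl": "Slovene", "tr": "Turkish",
}


def statistics_url(code: str) -> str:
    """Return the assumed statistics URL for a listed language code."""
    if code not in LANGUAGES:
        raise KeyError(f"no statistics listed for language code {code!r}")
    return f"{BASE_URL}{code}/"


if __name__ == "__main__":
    for code, name in sorted(LANGUAGES.items()):
        print(f"{name} ({code}): {statistics_url(code)}")
```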
For each language you'll find three percentages at the top of the page. To explain them, we look at the English mapping statistics:
3.94 % templates are mapped ( 285 of 7225 ).
80.73 % of all template occurrences in Wikipedia ( en ) are mapped ( 1695763 of 2100472 ).
49.23 % of all property occurrences in Wikipedia ( en ) are mapped ( 16090379 of 32686631 ).
The first line shows that 3.94 % of the templates in the English Wikipedia are mapped. Treat this percentage with care: the English Wikipedia of course contains more than 7225 templates, but only these 7225 have multiple properties and therefore meet our criterion for a potential infobox. Because this criterion is so loose, the statistics also contain irrelevant templates such as Unreferenced or Rail line. These templates are not classical infoboxes and should not affect the statistics, so they can be ignored. A template on the ignore list does not count towards the number of potential infoboxes. If you want templates to be ignored, send me a mail with the template names. If you are very active in the mappings wiki, we will tell you how to add templates to the ignore list yourself.
The second line shows the mapped template occurrences: 80.73 % of all template occurrences are already mapped. In other words, the 3.94 % of templates that are mapped cover 80.73 % of all template uses in the English Wikipedia. To understand this relation, take the Infobox settlement. It occurs 226467 times in the English Wikipedia, which corresponds to about 11 % of all template occurrences. Writing the mapping for this one infobox was therefore very effective.
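The third line shows that 49.23 % of all property occurrences in the English Wikipedia are mapped. This is the most interesting percentage, because it reflects how complete the property mappings are. Imagine a template mapping for the Infobox person in which only the name property is mapped to the ontology. The template and all of its occurrences would count as mapped in the first two percentages, but most of its property occurrences would still be unmapped, and only the third percentage reveals this.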
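The three percentages are plain ratios of the counts shown in parentheses. The sketch below recomputes them from the English figures quoted above, together with the share of the Infobox settlement among all template occurrences. The function and variable names are illustrative and are not taken from the statistics server.

```python
def percentage(mapped: int, total: int) -> float:
    """Return mapped / total as a percentage, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * mapped / total, 2)


# (mapped, total) pairs quoted from the English statistics page.
ENGLISH_COUNTS = {
    "templates": (285, 7225),
    "template occurrences": (1695763, 2100472),
    "property occurrences": (16090379, 32686631),
}

# Occurrences of the Infobox settlement in the English Wikipedia.
SETTLEMENT_OCCURRENCES = 226467


if __name__ == "__main__":
    for label, (mapped, total) in ENGLISH_COUNTS.items():
        print(f"{percentage(mapped, total)} % of {label} are mapped")

    total_occurrences = ENGLISH_COUNTS["template occurrences"][1]
    share = percentage(SETTLEMENT_OCCURRENCES, total_occurrences)
    print(f"Infobox settlement: {share} % of all template occurrences")
```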
Below the statistics at the top, you see a table with all templates ordered by their occurrences.
Here is the explanation for the columns:
- Template occurrence
- The name of the template with a link to detailed property statistics.
- Via the "Edit" link you can directly go to the mapping of an infobox.
- The fourth column (num properties) holds the number of properties of the template. Click on a template name to inspect these properties, which are also ordered by their occurrence. Properties at the top are therefore the best ones to map.
- In the fifth column (mapped properties (%)) you see the percentage of properties mapped.
- In the sixth column (num property occurrences) you see the number of all template properties that occur in Wikipedia. For a fictitious template that has 5 properties and occurs 10 times in Wikipedia, the number would be 50. The properties must have values to count here. If the fictitious template occurs 10 times, but in one case only 3 of its properties have a value, the number would be 48.
- The seventh column (mapped property occurrences (%)) contains the percentage of mapped property occurrences for this template. This percentage represents the completeness of the mapping and therefore determines the colour of the row.
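The counting rule behind the sixth and seventh columns can be sketched as follows: a property occurrence counts only if it carries a value, and it counts as mapped only if the property is mapped. This is a minimal illustration under assumptions, not the statistics server's implementation. The property names and the helper `count_property_occurrences` are hypothetical. The data models a template with 5 properties that occurs 10 times, where one occurrence has only 3 properties filled, and where only the name property is mapped.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical property names for a fictitious template with 5 properties.
PROPERTIES = ["name", "birth_date", "birth_place", "occupation", "nationality"]


def count_property_occurrences(
    occurrences: Iterable[Dict[str, str]], mapped: Set[str]
) -> Tuple[int, int]:
    """Return (all property occurrences, mapped property occurrences).

    Properties without a value (empty or whitespace only) are not counted.
    """
    total = 0
    mapped_total = 0
    for properties in occurrences:
        for name, value in properties.items():
            if value is None or not value.strip():
                continue  # a property must have a value to count
            total += 1
            if name in mapped:
                mapped_total += 1
    return total, mapped_total


# Nine occurrences with all five properties filled ...
full = {name: "some value" for name in PROPERTIES}
# ... and one occurrence in which only three properties have a value.
partial = dict(full, occupation="", nationality="   ")
example_occurrences = [full] * 9 + [partial]

if __name__ == "__main__":
    total, mapped_total = count_property_occurrences(example_occurrences, {"name"})
    print(f"num property occurrences: {total}")
    print(f"mapped property occurrences: {100.0 * mapped_total / total:.2f} %")
```

With only the name property mapped, the template itself counts as mapped, yet the share of mapped property occurrences stays low, which is what the row colour is meant to signal.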
DBpedia Mapping Creation Sprint/Race
Check this page for where your language stands in the race for excellence: http://mappings.dbpedia.org/sprint/