How to add a mapping namespace
As an example, we use a fictitious language with code "xx" and Wikipedia rank 44.
CAUTION: some subtle code changes will be needed for the first language code that contains a dash "-". In this case, please update the code and this guide.
Get language code and rank
Get the wiki language code and rank from the list of Wikipedias.
Namespace number: multiply the rank by 2 and add 200
Example: language code "xx", rank 44, namespace number 288.
CAUTION: If the calculated namespace number already exists for another language (because the ranking has changed) do not change the existing namespace number. Please find a neighboring or close enough number that works.
If 288 is in use, we choose some other number that is not used, let's say 298.
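The calculation above can be sketched as a small shell script. The function names are hypothetical and not part of the extraction framework. The fallback (step upward in even numbers until a free one is found) is only one possible reading of "neighboring or close enough"; it keeps to even numbers on the assumption that odd numbers stay reserved for the matching talk namespaces, as is usual in MediaWiki.

```shell
#!/bin/sh
# Sketch: derive a mapping namespace number from a Wikipedia rank.
# namespace_number and next_free_namespace are hypothetical helper names.

# Rule from this guide: rank * 2 + 200.
namespace_number() {
    echo $(( $1 * 2 + 200 ))
}

# Assumption: if the candidate is taken, try the next even numbers upward.
# $1 = candidate number, remaining arguments = numbers already in use.
next_free_namespace() {
    candidate=$1
    shift
    while :; do
        taken=no
        for used in "$@"; do
            if [ "$used" -eq "$candidate" ]; then
                taken=yes
            fi
        done
        if [ "$taken" = no ]; then
            echo "$candidate"
            return 0
        fi
        candidate=$(( candidate + 2 ))
    done
}

namespace_number 44               # prints 288
next_free_namespace 288 288 290   # prints 292
```

The list of numbers already in use has to be read from the existing entries in Namespace.scala by hand; the script does not look them up.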
Update the extraction framework
Edit core/org.dbpedia.extraction.wikiparser.Namespace.scala
Edit core/org.dbpedia.extraction.wikiparser.Namespace.scala. Add something like this at the appropriate position:
"xx"->288,
Edit dump/extract.default.properties
Edit dump/extract.default.properties. Add something like this at the appropriate position:
extractors.xx=MappingExtractor
Commit changes
Commit and push the changes to the default branch.
Update and restart the mapping server
Log on to the machine that is running the mapping server, i.e. serving http://mappings.dbpedia.org/server/ URLs.
Stop the server:
ps axfu | grep java
Look for class ...server.Server, and then:
kill <process id>
Then update, compile and start the server:
cd extraction_framework
hg pull
hg update
mvn clean install --projects core,server
cd server
../run server &>server-<YYYY>-<MM>-<DD>.01.log &
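The dated log file name does not have to be typed by hand. A minimal sketch follows, assuming the ".01" part counts server starts on the same day; server_log_name is a hypothetical helper, not something shipped with the framework.

```shell
#!/bin/sh
# Sketch: build the dated log file name used when starting the server.
# server_log_name is a hypothetical helper.
# $1 = run counter for the day (assumption: ".01" means first start that day).
server_log_name() {
    printf 'server-%s.%02d.log\n' "$(date +%Y-%m-%d)" "$1"
}

server_log_name 1   # prints a name of the form server-<YYYY>-<MM>-<DD>.01.log
```

The start command would then be something like `../run server >"$(server_log_name 1)" 2>&1 &`, run from the server directory.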
Update mappings wiki
Update MediaWiki settings
Log on to the machine that is running this mappings wiki, i.e. serving http://mappings.dbpedia.org/index.php URLs.
Open htdocs/mappings/LocalSettings.php. Add the following snippet at the correct position in the code:
"xx"=>288,
Restart the Apache server.
Update mappings wiki sidebar
Edit MediaWiki:Sidebar. Add a link for the new language:
** {{fullurl:Special:AllPages|namespace=288}}|Mappings (xx)
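The language code and namespace number have to match in every place they are entered. As a cross-check, the following sketch prints the four lines used in this guide from a single pair of values; print_snippets is a hypothetical helper and edits no files.

```shell
#!/bin/sh
# Sketch: print the per-language lines this guide asks you to add,
# so the same code and number end up in every file.
# print_snippets is a hypothetical helper; it only prints text.
# $1 = language code, $2 = namespace number
print_snippets() {
    code=$1
    number=$2
    # Namespace.scala entry
    printf '"%s"->%s,\n' "$code" "$number"
    # dump/extract.default.properties entry
    printf 'extractors.%s=MappingExtractor\n' "$code"
    # LocalSettings.php entry
    printf '"%s"=>%s,\n' "$code" "$number"
    # MediaWiki:Sidebar entry
    printf '** {{fullurl:Special:AllPages|namespace=%s}}|Mappings (%s)\n' "$number" "$code"
}

print_snippets xx 288
```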
Update datasets overview
Edit datasets. Add a column for the new language and update all rows according to the settings in . Ouch...
Generate and deploy statistics
Extract data from Wikipedia dump file
Download the latest dump for language xx.
Run RedirectExtractor, InfoboxExtractor and TemplateParameterExtractor. The file dump/extract.stats.properties should already contain the correct settings. Change into the dump/ directory, copy extract.stats.properties to extract.properties, adjust it if necessary, and run:
dump> ../run extract
Extract statistics from triples files
Change into the server/ directory and run:
server> ../run stats
Copy src/main/statistics/mappingstatistics_xx.txt to the same folder on the mappings server.
Update and deploy sprint stuff
Ask Pablo how to do that...