Troubleshooting - dbpedia-spotlight/dbpedia-spotlight GitHub Wiki

Some frequently observed errors are collected below.

Whatever build error you get, check your Maven version

If you experience problems with missing dependencies while running mvn install on the project, check your installed version of Maven. In the past the build only worked with Maven 2.2.1. We upgraded to Maven 3 on 25.09.2012. If your copy of the code predates that, use Maven 2.
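As a quick way to see which Maven you are running, the sketch below extracts the major version from the output of mvn -version. The function name maven_major is hypothetical, and it assumes the first line of that output has the usual form "Apache Maven X.Y.Z ...".

```shell
# Hypothetical helper: read `mvn -version` output on stdin and print the
# major version number (e.g. 2 or 3). Assumes the first matching line
# looks like "Apache Maven 3.0.4 (...)".
maven_major() {
  sed -n 's/^Apache Maven \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Usage (requires Maven on the PATH):
#   mvn -version | maven_major
```

If nothing is printed, the output format differs from the assumption above and you should read the mvn -version output directly.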

Cannot find (Maven) model file

Error: org.apache.maven.reactor.MavenExecutionException: Could not find the model file '/usr/local/spotlight/trunk/jung'. for project unknown

Solution: The only modules required for running the web service are core and rest, plus demo if you want the HTML interface as well. If you do not need to index, you can remove every other module from the parent pom.xml. The only modules required for running indexing are core and index; in that case you can remove the other modules from the parent pom.xml.
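To find out which modules the parent pom.xml declares and which of them are missing on disk, something like the following may help. This is a minimal sketch: the function name check_modules is hypothetical, and it assumes each module is declared on its own line as <module>name</module>.

```shell
# Hypothetical helper: for every <module> listed in the given parent
# pom.xml, report whether that module's directory contains a pom.xml.
# Assumes one <module>...</module> entry per line.
check_modules() {
  parent_pom="$1"
  base_dir=$(dirname "$parent_pom")
  sed -n 's:.*<module>\(.*\)</module>.*:\1:p' "$parent_pom" |
  while read -r module; do
    if [ -f "$base_dir/$module/pom.xml" ]; then
      echo "ok $module"
    else
      echo "missing $module"
    fi
  done
}

# Usage, from the directory that holds the parent pom.xml:
#   check_modules pom.xml
```

Any module reported as missing is a candidate for removal from the parent pom.xml, provided it is not one of the required modules listed above.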

Memory error

Error: Memory error, heap space

You may need to update your pom.xml with adequate heap space for the dictionary file you are using.

    <properties>
      <heapspace.Xmx.server>-Xmx16g</heapspace.Xmx.server>
    </properties>

How much memory?

The memory requirements are directly tied to your target lexicon, as our most rudimentary implementation loads the entire lexicon into memory in order to speed up spotting.

You can build a dictionary of People, Locations and Organizations with about 200M of RAM. See the one that I included in the distribution, for example.

You can also download the dictionary built from URIs that occurred more than 75 times in Wikipedia: http://spotlight.dbpedia.org/download/release-0.4/surface_forms-Wikipedia-TitRedDis.uriThresh75.tsv.spotterDictionary.gz

This should load with a lot less RAM (maybe 5x less) than the one we use in production, and it will still spot the most important things.

See: http://sourceforge.net/mailarchive/message.php?msg_id=28255247

Could not resolve dependencies

For some dependencies that either did not have a Maven repo or that we had to patch, we distribute the jars alongside our code and install them via install-file in the parent pom.xml. Make sure you run mvn install from the parent directory (e.g. /home/user/workspace/dbpedia-spotlight-0.5/).

Error: (Failed to execute goal on project core: Could not resolve dependencies for project org.dbpedia.spotlight:core:jar:0.5) dependencies are missing for:

    org.semanticweb.yars:nx-parser:jar:1.1
    com.aliasi:lingpipe:jar:4.0.0
    edu.umd:cloud9:jar:SNAPSHOT
    weka:weka:jar:3.7.3

Solution:

    cd /home/user/workspace/dbpedia-spotlight-0.5/
    mvn install

MalformedInputException when reading files

If you get a stack trace similar to the one shown below, you might need to convert your input file to UTF-8.

Error:

 INFO 2012-09-25 10:06:06,215 main [IndexingConfiguration] - Loading configuration file conf\indexing.ca.properties
 INFO 2012-09-25 10:06:06,235 main [IndexLingPipeSpotter$] - Reading surface forms from data\accentsSurfaceForms-small.tsv...
Exception in thread "main" java.nio.charset.MalformedInputException: Input length = 1
	at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
	at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:338)
	at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
	at java.io.InputStreamReader.read(InputStreamReader.java:184)
	at java.io.BufferedReader.fill(BufferedReader.java:154)
	at java.io.BufferedReader.readLine(BufferedReader.java:317)
	at java.io.BufferedReader.readLine(BufferedReader.java:382)
	at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:67)
	at scala.collection.Iterator$class.foreach(Iterator.scala:660)
	at scala.io.BufferedSource$BufferedLineIterator.foreach(BufferedSource.scala:43)
	at org.dbpedia.spotlight.spot.lingpipe.IndexLingPipeSpotter$.getDictionaryFromTSV(IndexLingPipeSpotter.scala:105)
	at org.dbpedia.spotlight.spot.lingpipe.IndexLingPipeSpotter$.getDictionary(IndexLingPipeSpotter.scala:63)
	at org.dbpedia.spotlight.spot.lingpipe.IndexLingPipeSpotter$.main(IndexLingPipeSpotter.scala:141)
	at org.dbpedia.spotlight.spot.lingpipe.IndexLingPipeSpotter.main(IndexLingPipeSpotter.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)

Solution:

    cp surfaceForms.tsv surfaceForms.iso.tsv
    iconv -f ISO-8859-1 -t UTF-8 surfaceForms.iso.tsv > surfaceForms.tsv

If the above does not work, try the following:

    java -Dfile.encoding=utf-8 -jar dbpedia-spotlight-0.7.jar de http://localhost:2222/rest/
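To check whether a given input file is actually valid UTF-8, one common trick is to ask iconv to convert it from UTF-8 to UTF-8 and look at the exit status. The sketch below wraps that in a hypothetical function is_utf8; it assumes an iconv that exits with a non-zero status on invalid input, as the GNU implementation does.

```shell
# Hypothetical helper: succeed (exit status 0) if the file decodes
# cleanly as UTF-8, fail otherwise. The converted output is discarded;
# only iconv's exit status matters.
is_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Usage:
#   is_utf8 surfaceForms.tsv && echo "valid UTF-8" || echo "not UTF-8"
```

A file that fails this check is a likely cause of the MalformedInputException. Note that the check cannot tell you what the original encoding is; ISO-8859-1 in the conversion command above is only one possibility.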