IR based implementation (Lucene) - dbpedia-spotlight/dbpedia-spotlight GitHub Wiki

The original DBpedia Spotlight implementation uses Apache Lucene for disambiguation and LingPipe for spotting. Pre-built indexes and spotter models are available for English.

Downloads

DBpedia Spotlight looks for ~3.5M things of ~320 types in text and tries to disambiguate them to their globally unique identifiers in DBpedia. It uses the whole of Wikipedia to learn how to annotate DBpedia resources. The full dataset cannot be distributed alongside the code; instead, it can be downloaded in various sizes from the download page. A tiny dataset is included in the distribution for demonstration purposes only. After you have downloaded the files, edit server.properties so that it points to the correct paths for those files. More info here.
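Since a wrong path in server.properties is an easy mistake to make after downloading, a small check before starting the server can help. The following is a minimal sketch, not part of DBpedia Spotlight: it assumes server.properties uses plain `key = value` lines, and the key names in `EXAMPLE_KEYS` are hypothetical placeholders that should be replaced with the keys found in your own file.

```python
from pathlib import Path


def parse_properties(text):
    """Parse simple Java-style `key = value` lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_paths(props, keys):
    """Return the keys whose value is absent, empty, or not an existing path."""
    return [k for k in keys if not props.get(k) or not Path(props[k]).exists()]


def check_config(config_file, keys):
    """Read a properties file and list the keys that do not point at existing paths."""
    text = Path(config_file).read_text(encoding="utf-8")
    return missing_paths(parse_properties(text), keys)


# Hypothetical key names, for illustration only; use the keys from your own server.properties.
EXAMPLE_KEYS = ["example.index.dir", "example.spotter.file"]
```

Calling `check_config("server.properties", EXAMPLE_KEYS)` returns the keys that still need fixing; an empty list means every listed path exists. The parser deliberately ignores line continuations and escape sequences, so it only suits simple configuration files.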