DataSpider 104: The DataSpider Crawl - modakanalytics/dataspider GitHub Wiki

The DataSpider Crawl (not a country dance step)

A typical scenario for source crawling is to set up a periodic crawl. The DataSpider is optimized to crawl thousands of relational data sources in a short time, so a daily crawl satisfies most enterprise needs. In practice, few enterprises have a full metadata profile of their relational datastores that is up to date on a daily basis. At a high level, the operation of the DataSpider in relation to Kosh and the sources it crawls is shown in the diagram below.
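As a sketch of the periodic setup described above, a daily crawl could be scheduled with cron. The command name, flags, and paths below are placeholders, not the actual DataSpider invocation, which is not shown in this section:

```shell
# Hypothetical crontab entry: run a full crawl every night at 02:00.
# `/opt/dataspider/bin/dataspider`, `--config`, and the paths are
# illustrative only; substitute the real launch command and locations
# used by your DataSpider deployment.
0 2 * * * /opt/dataspider/bin/dataspider crawl --config /etc/dataspider/sources.yml >> /var/log/dataspider/crawl.log 2>&1
```

Any scheduler (cron, systemd timers, an enterprise job scheduler) works equally well; the key point is that the crawl is cheap enough to rerun daily.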