Data Export - flagbit/Magento-FACTFinder GitHub Wiki

Common information

In general, FACT-Finder can work with any other data export as long as it meets the FACT-Finder requirements and exports the SKU or the entity IDs. However, the module can also create a data export itself. That export can be downloaded via a URL that is shown in the FACT-Finder section of the Magento configuration; it is different for each store. The URL is built as follows:
{shop-domain}/factfinder/export/{type}?key={password-md5-hash}[&store={storeid}]

The placeholders have these meanings:

  • {shop-domain}: the address where the Magento start page is reachable. Depending on the setup, it can include a path (which is used to specify the store) and may have to end with "index.php" (if URL rewriting is not enabled). Example: http://foo.com/en/index.php.
  • {type}: get for the pre-generated product export created by the Magento or system cronjob, or product for a real-time product export.
  • {password-md5-hash}: to restrict the download, the MD5 hash of the password of the configured FACT-Finder authentication user must be passed in the "key" parameter. This is the same password the module uses to authenticate against the FACT-Finder service.
  • {storeid}: if the export should be done for a store other than the default store, the store id has to be specified (the store id is not the store code). If the store is already determined by the shop-domain (for example because each store has its own domain), this parameter is not necessary.
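
As an illustration of the URL template above, here is a minimal Python sketch that assembles an export URL. The function name build_export_url and the sample password are made up for this example, and hashing the password as UTF-8 bytes is an assumption; the authoritative URL is always the one shown in the Magento configuration.

```python
import hashlib
from typing import Optional
from urllib.parse import urlencode


def build_export_url(shop_domain: str, export_type: str, password: str,
                     store_id: Optional[int] = None) -> str:
    """Assemble {shop-domain}/factfinder/export/{type}?key=...[&store=...]."""
    # The key is the MD5 hex digest of the FACT-Finder authentication password.
    # Assumption: the password is hashed as UTF-8 bytes.
    key = hashlib.md5(password.encode("utf-8")).hexdigest()
    params = {"key": key}
    if store_id is not None:
        # This is the numeric store id, not the store code.
        params["store"] = str(store_id)
    base = shop_domain.rstrip("/")
    return f"{base}/factfinder/export/{export_type}?{urlencode(params)}"


# Hypothetical values, for illustration only.
print(build_export_url("http://foo.com/en/index.php", "get", "secret", store_id=2))
```

Leaving out store_id produces the URL for the default store, which matches the optional [&store={storeid}] part of the template.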

Since version 4.0 we recommend setting up the product export as a cronjob. There are two ways to do that:

Magento Cronjob

This method is only suggested for a product catalog with <= 10000 products. The module comes with a standard Magento cronjob, which can be enabled in the backend under "System -> Configuration -> FACT-Finder -> FACT-Finder Cron Export Settings".

System Cronjob

Our recommendation is always (if your server provider allows it) to set up the product export through a system cronjob. With that method you can export large numbers of products without affecting Magento performance or delaying other cronjobs. For that purpose we provide a CLI script in Magento's shell folder.

To see which options are available in the current version of the module, execute the script without parameters; the usage help message will be shown:

php factfinder.php

Then you can choose the exact parameters you need and set up your cronjob. Example:

50 1 * * * /usr/bin/php -d memory_limit=2G /var/www/shell/factfinder.php --exportAllTypesForAllStores > /dev/null

For every product export, a lock file named "ffexport_{storeid}.lock" is created for the store and deleted after the export has finished. If the export crashes, the lock file remains and no further export is possible for the next two hours (this is the default value, which can be changed). Once the lock file is older than two hours, the export starts and the lock file is refreshed.
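
The lock-file behaviour described above can be sketched in Python roughly as follows. This is not the module's actual implementation: the function names are invented for illustration, the lock age is assumed to be read from the file's modification time, and the check is not race-free.

```python
import os
import time
from typing import Optional

LOCK_MAX_AGE_SECONDS = 2 * 60 * 60  # the default of two hours mentioned above


def may_start_export(lock_path: str, now: Optional[float] = None,
                     max_age: int = LOCK_MAX_AGE_SECONDS) -> bool:
    """Return True if an export may start, creating or refreshing the lock."""
    now = time.time() if now is None else now
    if os.path.exists(lock_path):
        # Assumption: the age of the lock is taken from its modification time.
        age = now - os.path.getmtime(lock_path)
        if age <= max_age:
            return False  # a running (or recently crashed) export holds the lock
    # No lock, or a stale one: (re)create it and stamp it with the current time.
    with open(lock_path, "w"):
        pass
    os.utime(lock_path, (now, now))
    return True


def finish_export(lock_path: str) -> None:
    """Delete the lock file once the export has finished."""
    if os.path.exists(lock_path):
        os.remove(lock_path)
```

A caller would run the export only when may_start_export returns True and call finish_export afterwards; a crash skips finish_export, which is what leaves the stale lock behind.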

File location on server

Physically, the export files are located in the /{magento/path}/var/factfinder directory. The file names follow the template store_{store-id}_{export-type}.csv, e.g. store_1_product.csv or store_8_price.csv.
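
For scripts that pick up the export files, the naming template can be expressed as a small helper. This is only a sketch; the function name export_file_path is made up, and /var/www as the Magento root is taken from the cronjob example above.

```python
from pathlib import PurePosixPath


def export_file_path(magento_root: str, store_id: int, export_type: str) -> PurePosixPath:
    """Build {magento_root}/var/factfinder/store_{store-id}_{export-type}.csv."""
    filename = f"store_{store_id}_{export_type}.csv"
    return PurePosixPath(magento_root) / "var" / "factfinder" / filename


print(export_file_path("/var/www", 1, "product"))
# prints /var/www/var/factfinder/store_1_product.csv
```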

Which values are exported?

These fields are always exported:

  • id: The entity id of the product record in the database.
  • sku: Usually configured for the article number search, it has the field role "Product number for display + campaigns" on FACT-Finder side.
  • parent_id: Depending on which identifier (id or sku) is configured for the module, this is the identifier of the parent product (for example that's a configurable product for a simple product). If no parent product exists, the parent_id is the id of the product itself. Usually it has the field role "Data record id", "Product number for tracking" and "Master article number" on FACT-Finder side.
  • category: The category path of the product in the format FACT-Finder expects it. It must be configured on FACT-Finder using the field type "categoryPath".
  • filterable_attributes: A collection with all attributes that are defined as "filterable" via the Magento attribute property "Use In Layered Navigation". The attributes are formatted in a certain key-value format as FACT-Finder expects it. It must be configured on FACT-Finder side using the field type "multi-attribute".
  • searchable_attributes: A collection with all attribute values that are defined as "searchable" via the Magento attribute property "Use in Quick Search". The values are simply separated by comma.
  • numerical_attributes: A collection with all attributes that are defined as "filterable" and have the input type "price". It must be configured on FACT-Finder side using the field type "multi-attribute".
  • deeplink: The deeplink will be used for the product suggestions and refers to the configurable/parent product.

Additionally, each system attribute is exported as a separate CSV field. Normally these are attributes like "name", "price", "description" etc. All attributes that are defined as "sortable" via the Magento attribute property "Used for Sorting in Product Listing" are also added as separate fields. Last but not least, it is possible to explicitly add attributes to the export (*). In all these cases the attributes are no longer included in the "filterable", "numerical" and "searchable" attribute fields.
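
The rules above for where an attribute ends up can be summarised in a short Python sketch. The Attribute class and the export_fields function are purely illustrative and not part of the module; in particular, sending a filterable attribute with input type "price" only to the numerical field (and not also to the filterable field) is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Attribute:
    code: str
    is_system: bool = False
    is_sortable: bool = False    # "Used for Sorting in Product Listing"
    is_explicit: bool = False    # explicitly added in the export configuration
    is_filterable: bool = False  # "Use In Layered Navigation"
    is_searchable: bool = False  # "Use in Quick Search"
    input_type: str = "text"


def export_fields(attr: Attribute) -> List[str]:
    """Return the CSV field(s) an attribute is exported to."""
    # System, sortable and explicitly added attributes get their own column
    # and are left out of the collection fields.
    if attr.is_system or attr.is_sortable or attr.is_explicit:
        return [attr.code]
    fields = []
    if attr.is_filterable:
        # Assumption: price-type filterable attributes go to the numerical
        # collection instead of the filterable one.
        if attr.input_type == "price":
            fields.append("numerical_attributes")
        else:
            fields.append("filterable_attributes")
    if attr.is_searchable:
        fields.append("searchable_attributes")
    return fields
```

For example, a sortable attribute that is also searchable yields only its own column, while an attribute that is both filterable and searchable appears in both collection fields.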

It is also possible to set up units for numerical attributes. These are taken into account when the setting "Use explicit attribute types" is enabled.

(*) If you want to export additional attributes to their own columns, you have to do the following two steps:

  • set "Use explicit attribute types" to "no" and
  • add the attribute to the list "FACT-Finder Export Configuration" -> "Attributes"

Delta-Updates

Classic delta updates are not possible with FACT-Finder. Instead, it is possible to make a full export of just the data that is updated most often: price and stock information. On the FACT-Finder end these files can be merged together. With the stock information it is also possible to remove records that are not in stock. The price values from the product export are stored in a different column than the price values from the price export, so it is up to the FACT-Finder configuration which prices are used for filtering and sorting.

These three exports allow update strategies like the following:

  • Make a daily full product-export (e.g. 8:00)
  • Every 4 hours make a price export, a few minutes after the full export is finished (e.g. [*/4]:25 => 0:25, 4:25, 8:25 ...)
  • Every 2 hours make a stock export, a few minutes after the price export. This one also triggers the index process on the FACT-Finder end. (e.g. [*/2]:30 => 0:30, 2:30, 4:30 ...)

Whether such a strategy makes sense depends, of course, on the shop and how often the data really changes. Maybe the product base does not change very often and a full export is only necessary once a week. If exporting prices and stock and the indexing process are fast enough, an hourly update could also be considered, so that changes go live very soon.

The URLs of these two other exports are similar to that of the product export; only the type keyword in the URL differs:

  • {shop-domain}/factfinder/export/price?key={password-md5-hash}[&store={storeid}]
  • {shop-domain}/factfinder/export/stock?key={password-md5-hash}[&store={storeid}]
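
To make the example schedule above concrete, the following Python sketch prints matching crontab lines that fetch the three exports over HTTP. It is only one possible setup: the function name, the output directory and the use of curl are assumptions, and for large catalogs the product export is better run through the CLI script described earlier.

```python
from typing import List


def crontab_lines(shop_domain: str, key: str, out_dir: str) -> List[str]:
    """Build crontab lines for the daily / 4-hourly / 2-hourly example strategy."""
    base = f"{shop_domain.rstrip('/')}/factfinder/export"
    schedule = [
        ("0 8 * * *", "product"),   # full product export, daily at 8:00
        ("25 */4 * * *", "price"),  # price export at 0:25, 4:25, 8:25, ...
        ("30 */2 * * *", "stock"),  # stock export at 0:30, 2:30, 4:30, ...
    ]
    # Assumption: curl is installed; -s hides progress output and -o stores
    # the response in out_dir (a hypothetical download directory).
    return [
        f"{when} curl -s -o {out_dir}/{export_type}.csv '{base}/{export_type}?key={key}'"
        for when, export_type in schedule
    ]


# Hypothetical values; the key is the MD5 hash of the password.
for line in crontab_lines("http://foo.com", "{password-md5-hash}", "/tmp/factfinder"):
    print(line)
```

Adjust the minute offsets so that each export starts only after the previous one has reliably finished on your system.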

FTP Upload

Version > 4.1.8

The module also supports uploading export files to a remote FTP directory, which can be configured in the backend. If enabled, the FTP upload is triggered regardless of how the export was executed.

The export files get added to a store specific ZIP archive which then gets uploaded to FTP. Please note, FTP upload is triggered only after a successful product export. If you want your stock and price files also to be uploaded in the same archive, make sure they were created before the product export.
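
The following Python sketch illustrates the idea of bundling a store's export files into one archive and uploading it. It is not the module's code: the function names, the archive name and the FTP parameters are placeholders. It also shows why the order matters: only files that already exist when the archive is built are included.

```python
import zipfile
from ftplib import FTP
from pathlib import Path
from typing import List


def build_store_archive(export_dir: str, store_id: int, archive_path: str) -> List[str]:
    """Zip every existing store_{store_id}_*.csv file into one archive."""
    files = sorted(Path(export_dir).glob(f"store_{store_id}_*.csv"))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            archive.write(path, arcname=path.name)
    return [path.name for path in files]


def upload_archive(archive_path: str, host: str, user: str,
                   password: str, remote_dir: str) -> None:
    """Upload the archive to the given directory on the FTP server."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(remote_dir)
        with open(archive_path, "rb") as handle:
            ftp.storbinary(f"STOR {Path(archive_path).name}", handle)
```

If store_1_price.csv is written after build_store_archive has run, it is simply missing from that upload, which mirrors the note above about creating the stock and price files before the product export.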

Export CMS Pages

Version >= 4.1.14

In addition to the product data, CMS pages can also be exported as of version 4.1.14. For configuration details, see Export CMS Pages.

With the parameter --exportCmsForStore {storeId}, the factfinder.php script exports CMS pages to the export file location described above, /{magento/path}/var/factfinder. The files are named store_{storeId}_cms.csv.

The cronjobs have to be configured separately for each storeId:

55 1 * * * /usr/bin/php -d memory_limit=2G /{magento/path}/shell/factfinder.php --exportCmsForStore 1 > /dev/null

Similar to the product exports, the CMS export file can be triggered and retrieved with an HTTP request as follows:

{shop-domain}/factfinder/export/export?key={password-md5-hash}[&store={storeid}]&resource=cms

You will find this link in the configuration section Export Configuration -> "Trigger" -> "CMS".