Performance of CLISC - Asbjoedt/CLISC GitHub Wiki

This article describes the performance issues observed with CLISC and the limits imposed as a result.

Test scenario

CLISC was tested by processing 16,115 spreadsheets collected from an organization's shared drive. Comparison and validation of the .ods file format were turned off because of observed performance bottlenecks. 17 spreadsheets were readable but caused exceptions during the Archive method and had to be excluded manually. The total processing time was 02:12:37:11 (days:hrs:min:sec).
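To put the total processing time in perspective, it can be converted to an average per spreadsheet. The following is a minimal sketch in Python, assuming the reported time covers all 16115 input spreadsheets (the source does not say whether failed or excluded files are counted differently). The variable names are illustrative only.

```python
from datetime import timedelta

# Figures taken from the test scenario above.
TOTAL_SPREADSHEETS = 16115
total_time = timedelta(days=2, hours=12, minutes=37, seconds=11)

# Convert days:hrs:min:sec to plain seconds.
total_seconds = int(total_time.total_seconds())

# Assumption: the reported time covers all 16115 input spreadsheets.
avg_seconds = total_seconds / TOTAL_SPREADSHEETS

print(f"Total: {total_seconds} s")
print(f"Average: {avg_seconds:.1f} s per spreadsheet")
```

Under that assumption the run averages roughly 13.5 seconds per spreadsheet, with .ods comparison and validation turned off.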

The results were:

  • CONVERT: 658 of 16115 spreadsheets failed conversion (commonly because of password protection)
  • ARCHIVE: 8 of 15457 converted spreadsheets had invalid file formats
  • ARCHIVE: 15 of 15457 converted spreadsheets had no cell values
  • ARCHIVE: 403 of 15457 converted spreadsheets had data connections
  • ARCHIVE: 88 of 15457 converted spreadsheets had external cell references
  • ARCHIVE: 0 of 15457 converted spreadsheets had RTD functions
  • ARCHIVE: 0 of 15457 converted spreadsheets had external object references
  • ARCHIVE: 11881 of 15457 converted spreadsheets had printer settings
  • ARCHIVE: 2527 of 15457 converted spreadsheets did not have the first sheet active
  • ARCHIVE: 606 of 15457 converted spreadsheets had embedded objects
  • ARCHIVE: 1041 of 15457 converted spreadsheets had hyperlinks
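The raw counts are easier to compare as percentages. The sketch below, in Python, recomputes each share from the figures in the list above. The dictionary layout and the `share` helper are illustrative and not part of CLISC.

```python
# Counts copied from the results list above.
CONVERT_TOTAL = 16115
CONVERT_FAILED = 658
ARCHIVE_TOTAL = 15457

archive_findings = {
    "invalid file formats": 8,
    "no cell values": 15,
    "data connections": 403,
    "external cell references": 88,
    "RTD functions": 0,
    "external object references": 0,
    "printer settings": 11881,
    "first sheet not active": 2527,
    "embedded objects": 606,
    "hyperlinks": 1041,
}


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Sanity check: the archived set is the input set minus failed conversions.
assert CONVERT_TOTAL - CONVERT_FAILED == ARCHIVE_TOTAL

print(f"CONVERT: failed conversion: {share(CONVERT_FAILED, CONVERT_TOTAL):.1f}%")
for label, count in archive_findings.items():
    print(f"ARCHIVE: {label}: {share(count, ARCHIVE_TOTAL):.1f}%")
```

The sanity check confirms that the 15457 converted spreadsheets are exactly the 16115 inputs minus the 658 failed conversions.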

Performance bottlenecks

The following performance bottlenecks have been observed.

  • Validating large .ods spreadsheets with ODF Validator takes a long time.
  • Comparing very large spreadsheets with Beyond Compare 4 takes a long time.
  • Using Excel Interop increases processing time significantly when thousands of spreadsheets are processed, because Excel is opened and closed for each spreadsheet. The overhead accumulates.
  • Using LibreOffice to create a copy of each spreadsheet increases processing time significantly when thousands of spreadsheets are processed, because LibreOffice is opened and closed for each spreadsheet. The overhead accumulates.
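A simple cost model shows why the open-and-close overhead of Excel and LibreOffice accumulates. The sketch below is in Python. The startup and per-file work times are hypothetical placeholders, not measurements from CLISC, and the single-launch function is only a point of comparison: the source does not say whether CLISC could reuse one application instance.

```python
def total_time_per_file_launch(n_files, startup_s, work_s):
    """Total seconds when the application is opened and closed for every file."""
    return n_files * (startup_s + work_s)


def total_time_single_launch(n_files, startup_s, work_s):
    """Total seconds if the application were opened only once (comparison only)."""
    return startup_s + n_files * work_s


# Hypothetical values for illustration; not measured in the CLISC test.
n_files = 16115      # number of spreadsheets in the test scenario
startup_s = 3.0      # assumed seconds to open and close the application
work_s = 1.0         # assumed seconds of actual work per spreadsheet

per_file = total_time_per_file_launch(n_files, startup_s, work_s)
single = total_time_single_launch(n_files, startup_s, work_s)
overhead_s = per_file - single

print(f"Launch per file: {per_file / 3600:.1f} h")
print(f"Single launch:   {single / 3600:.1f} h")
print(f"Accumulated launch overhead: {overhead_s / 3600:.1f} h")
```

In this model the overhead grows linearly with the number of files, so even a few seconds per launch add up to hours across thousands of spreadsheets.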

In general, using the Open XML SDK for OOXML spreadsheet manipulation and validation has been observed to be very fast.

File size limit

CLISC currently limits conversion to files of at most 150MB to prevent excessive performance bottlenecks. Larger spreadsheets should be converted manually.
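A pre-check of this kind can be sketched in a few lines of Python. This is not CLISC's implementation. The function names are hypothetical, and treating 1 MB as 1024 * 1024 bytes and the limit as inclusive are assumptions, since the source does not specify either.

```python
import os

# Assumption: 1 MB = 1024 * 1024 bytes; the source does not define the unit.
SIZE_LIMIT_BYTES = 150 * 1024 * 1024


def exceeds_size_limit(path, limit_bytes=SIZE_LIMIT_BYTES):
    """Return True if the file at path is larger than the limit."""
    return os.path.getsize(path) > limit_bytes


def split_by_size(paths, limit_bytes=SIZE_LIMIT_BYTES):
    """Split paths into (convertible, manual) lists based on file size."""
    convertible, manual = [], []
    for path in paths:
        if exceeds_size_limit(path, limit_bytes):
            manual.append(path)       # too large: convert by hand
        else:
            convertible.append(path)  # within the limit
    return convertible, manual
```

Calling `split_by_size` on a list of spreadsheet paths before a run would give one list for automatic conversion and one for manual handling.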