Troubleshooting EDF XML Errors - nsrr/SpectralTrainFig GitHub Wiki

Various errors can impede spectral analysis, but many can be avoided with thorough data preparation. The first step is making sure all files are properly formatted before beginning work in MATLAB. Tutorials for formatting file names and extensions, as well as for generating dataset summaries, are available here. Any errors not caught during data preparation will be reported in MATLAB when BlockSpectralTrainFig fails spectral analysis.

Note: This is not an exhaustive list of possible errors and represents only the most common problems encountered during the development of SpectralTrainFig.

File Naming and Organization

  1. File extensions should all be EDF and EDF.XML. A mixture of EDF.XML and XML extensions can cause the following MATLAB errors:

     - Warning: Matched file pairs not found. Check file name.

     - Cell contents reference from a non-cell array object.

  2. Check that each EDF and XML pair has the same filename. BlockEdfSummarizeFig and SpectralTrainFig both expect EDF and XML file pairs, alternating between EDFs and XMLs when listed alphabetically. Typos that leave EDF and XML pairs mismatched can cause numerous problems, including the two errors above as well as:

     - [Fatal Error] subject123.EDF:1:1: Content is not allowed in prolog.

  3. EDF/XML files should be stored locally on the computer used for spectral analysis. Attempting to run SpectralTrainFig over a network can result in the following error:

     - Warning: Signal Check error: Check signal labels.

  4. No files other than EDFs and their accompanying XMLs can be present in any directory or subdirectory being analyzed. SpectralTrainFig will process studies within any nested folders provided they are properly formatted EDFs and XMLs.

  5. Confirm that all EDFs and XMLs are an appropriate file size. XMLs will range between 50-200 KB. Their paired EDFs should be between 75-450 MB. The most frequent file size for an erroneous file is 0 KB. If either file is bad, both must be removed from the dataset and re-exported from the original scored file.

  6. The spectral results folder should not be nested inside the raw data folder used for spectral analysis. Otherwise, files created by the first spectral run will cause subsequent runs to fail with the following errors:

     - Could not complete EDF processing (file ID, file name).

     - Warning: Signal Check error: Check signal labels.

  7. Any files in the spectral results folder need to be deleted or moved between spectral analysis runs. SpectralTrainFig will attempt to reuse the _FileList.xlsx file generated at the beginning of spectral analysis on subsequent runs, which can cause various errors, the most common being the two listed above.
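The pairing and stray-file rules above can be checked before opening MATLAB. The following is a minimal Python sketch, not part of SpectralTrainFig: the function name `check_pairs` is hypothetical, and it assumes the uppercase EDF and EDF.XML extensions described above. The list of names could come from `os.listdir` or `os.walk` on the raw data folder.

```python
from collections import defaultdict

# Assumed extensions, taken from the naming rule in this guide.
EDF_EXT = ".EDF"
XML_EXT = ".EDF.XML"


def check_pairs(filenames):
    """Return (unpaired study names, stray files) for a list of file names."""
    found = defaultdict(set)
    stray = []
    for name in filenames:
        # Test the longer extension first so "a.EDF.XML" is not missed.
        if name.endswith(XML_EXT):
            found[name[:-len(XML_EXT)]].add("xml")
        elif name.endswith(EDF_EXT):
            found[name[:-len(EDF_EXT)]].add("edf")
        else:
            # Anything else (including a bare ".XML") does not belong here.
            stray.append(name)
    unpaired = sorted(s for s, kinds in found.items() if kinds != {"edf", "xml"})
    return unpaired, sorted(stray)


if __name__ == "__main__":
    names = ["s1.EDF", "s1.EDF.XML", "s2.EDF", "s2.XML", "notes.txt"]
    unpaired, stray = check_pairs(names)
    print("Unpaired studies:", unpaired)
    print("Stray files:", stray)
```

A study listed as unpaired, or any stray file, should be fixed or moved before running BlockEdfSummarizeFig or SpectralTrainFig.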

Troubleshooting using HeaderSignalSummary

The HeaderSignalSummary generated by BlockEdfSummarizeFig helps with three key data preparation checks: EDF channel labels, inconsistent sampling rates, and low data records.

Inconsistent Channel Labels

The most common problem with the contents of an EDF file that causes spectral analysis to fail is incorrect channel names in the EDF’s signal header. Check and correct these labels as follows:

  1. Columns M and onward in HeaderSignalSummary list all signal labels present in each EDF in the directory. Signal order does not matter, only the labels used. Every EDF needs to have a signal label corresponding to each signal to be analyzed in SpectralTrainFig.

  2. Correct this error during analysis by changing the default Analysis Parameters in SpectralTrainFig. Default settings for spectral analysis use EEG derivation C3-A2. If the cohort’s signal labels are consistent, the easiest solution is to change the signals used by SpectralTrainFig. A common example would be changing C3-A2 to C3-M2.

  3. Edit channel labels in the EDF Editor and Translator, instructions available here.
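Once the signal labels from HeaderSignalSummary have been copied into a script, the check in step 1 can be automated. This is a minimal Python sketch under assumptions: `missing_labels` is a hypothetical helper, and the dictionary stands in for the labels listed in columns M and onward for each EDF.

```python
def missing_labels(labels_by_file, required):
    """Map each EDF name to the required labels its signal header lacks.

    labels_by_file: dict of EDF name -> list of signal labels in that EDF.
    required: labels to be analyzed (for example, the analysis and
    reference signals entered into SpectralTrainFig).
    """
    report = {}
    for name, labels in labels_by_file.items():
        present = set(labels)  # signal order does not matter
        lacking = [label for label in required if label not in present]
        if lacking:
            report[name] = lacking
    return report


if __name__ == "__main__":
    labels = {
        "s1.EDF": ["C3", "A2", "EMG"],
        "s2.EDF": ["C3", "M2", "EMG"],
    }
    print(missing_labels(labels, ["C3", "A2"]))
```

Any EDF that appears in the report needs either its labels edited or a different signal selection in SpectralTrainFig.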

Inconsistent Sampling Rates

SpectralTrainFig is designed to run groups of EEG signals sampled at the same rate. Frequency resolution will differ for each subject using a different sampling rate. For this reason it is important to know the rates of all channels before running analysis. Sampling rates can be checked before spectral analysis in the HeaderSignalSummary file generated after running Signal Plus in BlockEdfSummarizeFig:

  1. Launch BlockEdfSummarizeFig as normal, but also include signals whose sampling rates are to be reported under Signal Summary with Sampling Rate.

  2. Format signal labels as follows: label names between apostrophes, separated by commas, all enclosed in curly brackets, no spaces (ex. {'C3','C4','F3','F4'}). Improperly formatted signal labels will return one of the following MATLAB errors while running BlockEdfSummarizeFig: “Error using eval” or “Undefined function or variable (SignalName)”.

  3. Use Create File List and Signal Plus to generate a new HeaderSignalSummary.

  4. Reported sampling rates will be added to the HeaderSignalSummary in the farthest right columns listed in Hz.
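Typing the label string by hand is an easy place to introduce a stray space or a missing apostrophe. The format from step 2 can be generated instead, as in this small Python sketch (`format_signal_labels` is a hypothetical helper, not part of BlockEdfSummarizeFig):

```python
def format_signal_labels(labels):
    """Build a label string such as {'C3','C4'}: apostrophes around each
    label, commas between labels, curly brackets around the whole list,
    and no spaces added between entries."""
    for label in labels:
        if "'" in label:
            # An apostrophe inside a label would break the quoting.
            raise ValueError("label contains an apostrophe: %r" % label)
    return "{" + ",".join("'%s'" % label for label in labels) + "}"


if __name__ == "__main__":
    print(format_signal_labels(["C3", "C4", "F3", "F4"]))
```

The printed string can then be pasted into the Signal Summary with Sampling Rate field.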

Low Data Records

HeaderSignalSummary and HeaderCheckSummary can be used to troubleshoot studies with low data records.

  1. Open HeaderSignalSummary or HeaderCheckSummary. Column J contains the num_data_records count for each study. Note the average number of data records across the dataset and look for suspiciously low values (generally at least 50% below average).

  2. Low data records are typically indicative of short studies. These can be PSGs that were legitimately short recordings as well as raw data truncated during transfers or editing.

  3. A study with no data records comes from an EDF/XML file pair that contains all the relevant EDF header settings but no other data. This can be caused by an XML with no scored events as well as an EDF with no raw data. Both are common side effects of studies that are successfully set up but not properly recorded.

  4. The only fix for these issues is to locate an earlier copy of the raw data and/or regenerate new EDF and XML files.
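The "at least 50% below average" rule of thumb from step 1 can be applied to the values in column J with a few lines of Python. This is a sketch under assumptions: `flag_low_records` is a hypothetical name, and the counts are assumed to have been copied out of the summary file into a dictionary.

```python
from statistics import mean


def flag_low_records(records_by_file, fraction=0.5):
    """Return study names whose num_data_records is at or below
    `fraction` of the dataset average (0.5 = 50% below average)."""
    if not records_by_file:
        return []
    threshold = mean(list(records_by_file.values())) * fraction
    return sorted(name for name, count in records_by_file.items()
                  if count <= threshold)


if __name__ == "__main__":
    counts = {"s1": 1000, "s2": 1100, "s3": 900, "s4": 200}
    print(flag_low_records(counts))
```

Flagged studies still need manual review, since a legitimately short recording and a truncated file look the same in this count.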

Troubleshooting using XML_Summary

XML is a common file format used by many applications, and the same file can be displayed very differently depending on the software used, so what is shown does not always reflect the raw data. Notepad can be used to review raw XML files, while more sophisticated editors such as Sublime Text and XML Notepad 2007 display tags (such as events and staging) in a structure that is easier to navigate. The XML_Summary generated by BlockEdfSummarizeFig flags studies that would have crashed spectral analysis with a “1” under the Check Flag column and lists an error message in the next column to assist in troubleshooting.

Failed to read XML file.

This error can occur due to bad characters or formatting in an XML file, most frequently extra white space or an invalid Unicode character. These errors commonly occur when opening or transferring an XML file causes software to reformat its contents. The changes will not always be obvious because XML display varies by text editor. A properly formatted Compumedics XML file appears as only two lines, with one line break (carriage return) between them, when viewed in Notepad. There should be no white space between tags (that is, between a closing > and the next opening <), but white space is permitted in the event and channel names inside these tags.

Sometimes white space or bad formatting can make it through XML Check without being flagged and cause the following MATLAB error during spectral analysis: “Input must be a string.” There are several ways to clean up these bad characters:

  1. Regenerate/Re-export the EDF and XML files. If this can be done it will typically fix many of the bad characters and formatting because a newly generated XML should be free from these errors.

  2. If the MATLAB error “an invalid XML character (Unicode: 0x4) was found in the element content of the document” occurs while running spectral analysis, an invalid character in the XML is to blame. It can be found by opening the XML file in XML Notepad 2007, which raises an error listing the position of the bad character. The file can then be edited (Sublime Text recommended) to remove the invalid character.

  3. If the XML has been reformatted and contains any line breaks or white spaces it can be cleaned up using Notepad++. Open the XML file in Notepad++, select all lines except line 1, and run Edit > Blank Operations > Trim Leading and Trailing Space, followed by Edit > Line Operations > Join Lines. This is not a guaranteed fix for formatting problems but will address many common ones.
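For a large number of files, the Notepad++ steps in item 3 can be approximated in a script. The Python sketch below is an approximation and not the Notepad++ implementation: `collapse_to_two_lines` is a hypothetical name, the line ending written between the two lines is an assumption, and lines are joined with no separator, so a label that was split across lines would lose the space at the break.

```python
def collapse_to_two_lines(xml_text, newline="\n"):
    """Keep line 1 as-is (trimmed), then trim every remaining line and
    join them into a single second line with no trailing line break."""
    lines = xml_text.splitlines()
    if not lines:
        return xml_text
    declaration = lines[0].strip()
    # Trim leading/trailing white space on each line, then join the lines.
    body = "".join(line.strip() for line in lines[1:])
    return declaration + newline + body


if __name__ == "__main__":
    messy = "<?xml?>\n<Events>\n   <Name>Central Apnea</Name>\n</Events>\n"
    print(collapse_to_two_lines(messy))
```

Spaces inside labels such as Central Apnea are preserved because only the ends of each line are trimmed. As with the manual steps, this is not a guaranteed fix, and it is safest to run it on a copy of the XML.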

Could not create hypnogram or unique staging list.

There are several XML problems that can create this error, but the most common are a study with no scored sleep or an unrecognized stage label used in an XML. Studies with no scored sleep cannot have spectral analysis run without rescoring and/or re-exporting the PSG, but XMLs with incorrect stage names can be fixed as follows:

  1. Open the XML for editing using your preferred text editor.

  2. Scroll down to the second to last section of the XML containing the SleepStages. Sleep stage tags are located between the ScoredEvents and Montage sections and should appear as 0 when viewed in Notepad.

  3. Search for and replace any stage numbers that SpectralTrainFig will not recognize. SpectralTrainFig currently supports stages 0, 1, 2, 3, 4, and 5.
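Searching by eye for unsupported stage numbers is tedious in a long XML, so the search in step 3 can be scripted. This Python sketch rests on an assumption that should be confirmed against a real file: each stage is taken to be stored as a `<SleepStage>` element inside the SleepStages section. The function name `unsupported_stages` is hypothetical; the supported stage list comes from step 3.

```python
import re

# Stages SpectralTrainFig currently supports, per this guide.
SUPPORTED_STAGES = {"0", "1", "2", "3", "4", "5"}

# ASSUMPTION: one <SleepStage>value</SleepStage> element per epoch.
STAGE_PATTERN = re.compile(r"<SleepStage>\s*([^<]*?)\s*</SleepStage>")


def unsupported_stages(xml_text):
    """Count each stage value that is not in SUPPORTED_STAGES."""
    counts = {}
    for value in STAGE_PATTERN.findall(xml_text):
        if value not in SUPPORTED_STAGES:
            counts[value] = counts.get(value, 0) + 1
    return counts


if __name__ == "__main__":
    sample = ("<SleepStages><SleepStage>0</SleepStage>"
              "<SleepStage>9</SleepStage><SleepStage>2</SleepStage>"
              "</SleepStages>")
    print(unsupported_stages(sample))
```

An empty result means every stage value found is one that SpectralTrainFig recognizes; it does not confirm that the study contains scored sleep.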

Could not open (xml file path)

This error message is typically caused by changing an EDF or XML file’s name or extension after running Create File List in BlockEdfSummarizeFig. When XML Check is run, it uses the file list created earlier to call files for checking. Create a new file list every time EDF Check, XML Check or Signal is run, and move any old file lists out of the results folder.

Troubleshooting using MATLAB Errors

MATLAB will stop running analysis immediately if any error is encountered and display an error message. Many failed studies will generate an error message that begins with a short description followed by lines detailing where in the MATLAB code the issue occurred. For the purpose of practical troubleshooting by staff without coding experience, only the initial, short error messages are used in this guide.

Note: If MATLAB crashes during spectral analysis be certain to move/delete any files generated using the EDF/XML pair that caused the crash. If these files are left in and the subject is re-analyzed they can interfere with successfully running spectral analysis and reporting results.

Could not complete Spectral Processing.

This error can be caused by numerous issues and has limited value in troubleshooting. Standard EDF/XML troubleshooting should be followed until a more specific error can be found.

Could not complete EDF Processing.

This error can occur while running both BlockEdfSummarizeFig and SpectralTrainFig and is typically caused by an EDF file that is inaccessible due to its filename. Check the following possible causes before performing full troubleshooting:

  1. SpectralTrainFig needs matching, case-sensitive EDF extensions. A mix of “edf” and “EDF” can cause this MATLAB error.

  2. Any typos in EDF extensions. A common error is “..” (two periods) between the filename and extension.

  3. Any typos in any EDF file names. If the file named in the MATLAB error is spelled correctly another file can still affect its loading. If all EDFs and XMLs in the dataset don’t present as alternating pairs when sorted by filename then a bad filename may be hidden among them, causing EDFs and XMLs to pair incorrectly.
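The first two causes above are easy to miss in a long directory listing. A minimal Python sketch of both checks follows; `extension_problems` is a hypothetical helper and it only inspects file names, not file contents.

```python
def extension_problems(filenames):
    """Report double periods and mixed-case EDF extensions."""
    problems = []
    edf_spellings = set()
    for name in filenames:
        if ".." in name:
            problems.append((name, "double period"))
        ext = name.rpartition(".")[2]  # text after the last period
        if ext.lower() == "edf":
            edf_spellings.add(ext)
    if len(edf_spellings) > 1:
        # e.g. a mix of "EDF" and "edf" across the dataset
        problems.append(("<dataset>", "mixed EDF extension case: "
                         + ", ".join(sorted(edf_spellings))))
    return problems


if __name__ == "__main__":
    for item in extension_problems(["a.EDF", "b.edf", "c..EDF"]):
        print(item)
```

A typo inside a file name itself (cause 3) will not show up here; sorting the folder by file name and looking for a break in the alternating EDF/XML pattern remains the way to find it.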

Warning: Signal Error: Check signal labels.

This error may initially appear to be caused by incorrect signal labels, but there are several other problems that can cause this error to be returned. The most obvious first step is to open up the HeaderSignalSummary generated by BlockEdfSummarizeFig and verify that all signal labels to be analyzed are consistent across the dataset. Every EDF study in the dataset must contain a channel label matching each Analysis Signal and Reference Signal entered into SpectralTrainFig. If the error is still unclear the following issues may cause this error:

  1. Signal labels are case sensitive. A mixture of “C3” and “c3” leads can interfere with processing.

  2. Analysis Signal and Reference Signal settings in SpectralTrainFig need to match the raw EDF and are also case sensitive.

  3. SpectralTrainFig will allow multiple instances of the same channel name provided the raw data’s signal dimensions are the same. Mislabeled EEGs will run (if both have a physical dimension range of 250), but not an EMG channel mislabeled as C3 (dimensions of 250 vs 62.5). These values can be viewed and edited using EDF Header Editor under the Signal Header tab. Instructions on using EDF Header Editor are available here.

  4. Check to make sure there are no files other than EDFs and XMLs in the directory specified as the raw data folder, and that the spectral results folder is not a subdirectory of this raw data folder.

  5. Check that files being processed are saved in a local drive. Attempting to run SpectralTrainFig over a network can result in various errors, the most common being Warning: Signal Error: Check signal labels.
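Because labels are case sensitive (items 1 and 2), it helps to list labels that differ only by case across the dataset. The Python sketch below assumes the labels have been copied from HeaderSignalSummary into a dictionary; `case_variants` is a hypothetical name.

```python
from collections import defaultdict


def case_variants(labels_by_file):
    """Return labels that appear with more than one capitalization,
    keyed by their lowercase form."""
    seen = defaultdict(set)
    for labels in labels_by_file.values():
        for label in labels:
            seen[label.lower()].add(label)
    return {key: sorted(forms) for key, forms in seen.items()
            if len(forms) > 1}


if __name__ == "__main__":
    labels = {"s1.EDF": ["C3", "A2"], "s2.EDF": ["c3", "A2"]}
    print(case_variants(labels))
```

Each entry in the result is a label that should be made consistent across the dataset before running SpectralTrainFig.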

Could not create character hypnogram or unique stage list.

This error indicates an issue with an XML’s sleep staging. It is typically caught while cleaning data, but has been known to surface during failed spectral analysis runs. The two known causes are:

  1. Studies with no scored sleep. These studies cannot have spectral analysis run on them without rescoring and re-exporting a PSG that includes epochs marked as sleep. Automatic artifact detection flags epochs scored wake as artifact and in this case would flag the entire night’s EEG as artifact for exclusion.

  2. XMLs with incorrect stage names. BlockEdfSummarizeFig and SpectralTrainFig both use default stage names from Compumedics’ ProFusion PSG 3, and will crash when other labels are used. Refer to “Could not create hypnogram or unique staging list.” under Troubleshooting using XML_Summary to troubleshoot bad staging in an XML.

Subscripted assignment dimension mismatch.

This somewhat generic MATLAB error can arise from many issues while attempting to run SpectralTrainFig, but the two most common causes are:

  1. Extra lines in the XML file. Check to make sure there is not a 3rd line at the end of the XML file. These extra lines can contain “blank” characters (space, tab, etc) or no characters at all, rendering them subtle or outright unviewable in some text editors. Viewing in a text editor like Sublime Text which displays line and column numbers can aid in troubleshooting. Deleting this extra line and resaving the XML can resolve this error.

  2. XMLs with incorrect stage names. Refer to “Could not create hypnogram or unique staging list.” under Troubleshooting using XML_Summary to troubleshoot bad staging in an XML. If this error continues after the above corrections, then full troubleshooting will have to be done.

Improper assignment with rectangular empty matrix.

This uncommon error has only been noted in PSGs that have proper file/channel names and raw data, but no scored sleep. As a result, all epochs are excluded and spectral analysis fails. If an earlier version of the study with scored sleep is available it can be used, but this error typically marks an unusable study.

Warning: Matched file pairs not found. Check file names.

This error is most commonly caused by a typo in a file name or extension. BlockEdfSummarizeFig and SpectralTrainFig are both set up to run using Compumedics' default extensions of EDF and EDF.XML. Mixing file extensions will cause this error (ex. a combination of EDF.XML and XML extensions will crash spectral analysis). If all studies in the dataset are properly named, the data folder should contain only alternating EDFs and EDF.XMLs when sorted by file name. See Preparing Files for SpectralTrainFig for instructions on file structure and naming. These less common errors can also be caused by a file name/extension mismatch:

  1. [Fatal Error] subject123.EDF:1:1: Content is not allowed in prolog.

  2. Cell contents reference from a non-cell array object.

An invalid character was found in the element content of the document.

This error is due to an EOT (end of transmission) or STX (start of text) control character accidentally being inserted into an XML event tag’s name in place of a standard Unicode character. The most common place for these bad characters is in custom event names, such as ones used to distinguish different arousal types. These special characters cause numerous issues but can easily be corrected as follows:

  1. The bad characters can be clearly seen in some text editors (Sublime Text recommended) and then replaced with an appropriate character or deleted.

  2. If the erroneous character cannot be easily identified, another option is to open the file in an XML-specific editor, which can present a more specific error message. For example, opening an XML file in XML Notepad 2007 can produce an error message specifying that a non-standard character was located at line 2, column 7056.
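If neither editor is at hand, the position of a bad character can also be found with a short script. This Python sketch is an illustration only: `find_invalid_xml_chars` is a hypothetical helper, and it checks only for control characters below 0x20 other than tab, line feed, and carriage return, which covers EOT (0x4) and STX (0x2) but not every character that XML 1.0 forbids.

```python
def find_invalid_xml_chars(text):
    """Return (line, column, hex code) for each disallowed control
    character. Lines and columns are counted from 1."""
    hits = []
    line, col = 1, 1
    for ch in text:
        code = ord(ch)
        if code < 0x20 and ch not in "\t\n\r":
            hits.append((line, col, hex(code)))
        if ch == "\n":
            line += 1
            col = 1
        else:
            col += 1
    return hits


if __name__ == "__main__":
    sample = "<?xml?>\n<Name>Arousal\x04</Name>"
    print(find_invalid_xml_chars(sample))
```

The reported line and column can then be used to jump to the character in a text editor and delete or replace it.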

Input must be a string.

An input string error is a good sign that an XML’s format is incorrect. This can be caused by an invalid character (see above), but also by extra white space characters. Both SpectralTrainFig and BlockEdfSummarizeFig are set up to run XMLs formatted as only 2 lines of text when the raw data is viewed in a simple text editor. More sophisticated text editors may apply their own formatting and hide the raw data’s format. If opened in a simple text editor like Notepad, the XML should appear as follows:

  1. Line one will be the XML declaration, containing the XML version and encoding declaration.

  2. Line two will contain the entirety of the XML's PSG scoring data. It will appear as one long string with elements separated by >< brackets. Spaces are allowed inside labels (ex. Central Apnea) but no spaces or line breaks are allowed anywhere else in line two of the XML.
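The two-line layout described above can be verified without relying on how an editor displays the file. The following Python sketch is a rough check under assumptions: `format_problems` is a hypothetical name, the file is assumed to have already been read into a string, and a trailing line break after line two is counted as an extra (empty) third line.

```python
import re


def format_problems(xml_text):
    """List ways in which xml_text departs from the two-line layout."""
    lines = xml_text.replace("\r\n", "\n").split("\n")
    problems = []
    if len(lines) != 2:
        problems.append("expected 2 lines, found %d" % len(lines))
    if not lines[0].startswith("<?xml"):
        problems.append("line 1 is not an XML declaration")
    # White space between a closing > and the next opening < is not
    # allowed; spaces inside labels (e.g. Central Apnea) are fine.
    if re.search(r">\s+<", "\n".join(lines[1:])):
        problems.append("white space between tags")
    return problems


if __name__ == "__main__":
    sample = "<?xml?>\n<Events> <Name>Central Apnea</Name></Events>"
    print(format_problems(sample))
```

An empty list means the basic layout looks right; it does not prove the XML is otherwise valid.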

Due to the design of BlockEdfSummarizeFig some bad XML formats can still pass through earlier error checking. It is also very common for the format of XMLs to change after error checking if an XML is opened in a text editor which applied its own XML formatting. If improper XML formatting and whitespace characters need repairing two options are:

  1. Regenerate/Re-export the EDF and XML files. If possible this will typically fix the majority of bad characters and formatting errors.

  2. If the XML has been reformatted and contains extra line breaks or white space, it can be cleaned up using Notepad++ (other text editors offer similar tools). Open the XML file in Notepad++, select all lines except line 1, and run Edit > Blank Operations > Trim Leading and Trailing Space, followed by Edit > Line Operations > Join Lines. This is not a guaranteed fix for white space problems but will address many common ones.

Physical dimension is not constant across channels.

This uncommon error has been seen in studies whose EDF and XML data has become corrupted due to lost values. Studies presenting this error will almost always need to be reverted to a copy made before the corruption took place, which most likely happened during a file transfer. This error has been caused by two known issues:

  1. An XML file’s event entries having their start/stop times and label names replaced with zeros.

  2. An EDF file’s signal header having its physical dimensions and physical minimums/maximums replaced with zeros.
