Preprocessing Techniques - vaibhavi004/DSP_project GitHub Wiki

🩺 ECG Signal Preprocessing Pipeline (MATLAB)

This project implements a complete ECG signal preprocessing pipeline in MATLAB that improves the quality of raw ECG signals for heart rate monitoring. The preprocessing stages are baseline wander removal, high-frequency noise suppression, and power-line interference removal. After cleaning, QRS complexes are detected and the R-R intervals are used to compute heart rate in beats per minute (BPM). The data come from a .mat file sampled at 500 Hz; only the first 4000 samples (8 s) are processed.

The ECG data is first extracted, and any missing values are filled by linear interpolation. A 4th-order high-pass Butterworth filter with a cutoff frequency of 0.5 Hz removes baseline wander caused by respiration or patient movement. Next, a 4th-order low-pass Butterworth filter with a cutoff at 40 Hz removes high-frequency muscle and instrumentation noise while largely preserving the morphology of the ECG waveform. Finally, a notch filter centered at 50 Hz with a quality factor (Q) of 30, which corresponds to a bandwidth of about 1.67 Hz, removes power-line interference with little effect on neighboring frequencies. All three filters are applied with filtfilt, which runs each filter forward and backward so that no phase shift is introduced.
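
As a rough check on the filter parameters described above, the sketch below computes the normalized cutoff frequencies and evaluates each filter's gain at a few frequencies. It assumes the Signal Processing Toolbox functions butter, iirnotch, and freqz are available; the probe frequencies and the variable names are illustrative choices, not part of the project.

```matlab
fs  = 500;              % sampling rate in Hz (from the project description)
nyq = fs / 2;           % Nyquist frequency, 250 Hz

% butter and iirnotch expect frequencies normalized to the Nyquist frequency
wn_highpass = 0.5 / nyq;        % 0.002
wn_lowpass  = 40  / nyq;        % 0.16
w0_notch    = 50  / nyq;        % 0.2
Q           = 30;
bw_notch    = w0_notch / Q;     % normalized notch bandwidth
bw_notch_hz = bw_notch * nyq;   % about 1.67 Hz around 50 Hz

[b_hp, a_hp] = butter(4, wn_highpass, 'high');
[b_lp, a_lp] = butter(4, wn_lowpass, 'low');
[b_n,  a_n ] = iirnotch(w0_notch, bw_notch);

% Probe frequencies (illustrative): slow drift, QRS band, mains
f_probe = [0.1 10 50];
gain_hp = abs(freqz(b_hp, a_hp, f_probe, fs));
gain_lp = abs(freqz(b_lp, a_lp, f_probe, fs));
gain_n  = abs(freqz(b_n,  a_n,  f_probe, fs));

% filtfilt applies each filter twice (forward and backward), so the
% effective gain is the square of the single-pass gain
disp(table(f_probe(:), gain_hp(:).^2, gain_lp(:).^2, gain_n(:).^2, ...
    'VariableNames', {'freq_hz', 'highpass', 'lowpass', 'notch'}));
```

In the printed table, the high-pass column should be small at 0.1 Hz, the notch column should be close to zero at 50 Hz, and all three columns should stay close to 1 at 10 Hz, where most of the QRS energy is expected.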

Following preprocessing, QRS detection is performed on the cleaned ECG signal through a series of transformations. First, the signal is differentiated to highlight rapid changes such as the steep slopes of the QRS complex. The result is then rectified (absolute value) so that rising and falling slopes contribute equally. A moving-average filter with a 150 ms window (75 samples at 500 Hz) smooths the rectified signal into an envelope that ideally has one broad peak per beat. Finally, the findpeaks function locates the peaks of this envelope, using a minimum peak height of 0.4 and a minimum peak distance of 250 samples (0.5 s); these locations are taken as the R-peaks.
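
To make the detection chain concrete, here is a minimal sketch that runs the same steps (diff, abs, movmean, findpeaks) on a synthetic signal. The toy signal of narrow Gaussian pulses spaced 0.7 s apart and the relative height threshold are assumptions made for illustration only; the project itself uses a fixed threshold of 0.4 on real ECG data.

```matlab
fs = 500;                              % sampling rate in Hz
t  = (0:3999)' / fs;                   % 8 s of samples, as in the project

% Toy "ECG": one narrow Gaussian pulse per beat (illustrative, not real data)
beat_times = 0.5 + 0.7 * (0:10);       % 11 beats, 0.7 s apart
toy_ecg = zeros(size(t));
for tb = beat_times
    toy_ecg = toy_ecg + exp(-0.5 * ((t - tb) / 0.01).^2);
end

% Same transformation chain as described above
ecg_derivative = diff(toy_ecg);                      % emphasizes steep slopes
ecg_rectified  = abs(ecg_derivative);                % both slope directions count
window_size    = round(0.150 * fs);                  % 150 ms -> 75 samples
ecg_smoothed   = movmean(ecg_rectified, window_size);

% Assumption: threshold set relative to the envelope maximum, because the
% toy signal's amplitude differs from the real recording
min_height = 0.5 * max(ecg_smoothed);
[~, peak_locations] = findpeaks(ecg_smoothed, ...
    'MinPeakHeight', min_height, 'MinPeakDistance', 250);

fprintf('%d beats simulated, %d peaks detected\n', ...
    numel(beat_times), numel(peak_locations));
```

A minimum peak distance of 250 samples is 0.5 s at 500 Hz, so this setting cannot report a heart rate above 120 BPM; faster rhythms would need a smaller value.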

The time difference between successive R-peaks is calculated and converted into BPM using the formula: BPM = 60 / mean(R-R interval), with the R-R interval in seconds. If the calculated heart rate falls outside the range expected for this recording (80–95 BPM, which is narrower than the commonly cited 60–100 BPM resting range), the minimum peak distance is raised from 250 to 270 samples and the BPM is recalculated. This is a simple, recording-specific heuristic for rejecting spurious peaks rather than a general validity check.
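
The conversion can be checked by hand with a small worked example. The peak indices below are hypothetical values chosen so that the beats are exactly 350 samples apart; the guard for fewer than two peaks is an added assumption, not part of the original script.

```matlab
fs = 500;                                   % sampling rate in Hz
peak_locations = [250; 600; 950; 1300];     % hypothetical R-peak sample indices

if numel(peak_locations) < 2
    heart_rate_bpm = NaN;                   % no interval can be formed from one peak
else
    time_between_peaks = diff(peak_locations) / fs;   % 350 samples -> 0.7 s each
    heart_rate_bpm = 60 / mean(time_between_peaks);   % 60 / 0.7, about 85.7 BPM
end

% Range check used by the pipeline to decide whether to re-run peak detection
in_expected_range = heart_rate_bpm >= 80 && heart_rate_bpm <= 95;
disp(['Heart Rate: ', num2str(heart_rate_bpm), ' BPM']);
```

Note that 60 / mean(R-R) averages the intervals first; averaging the per-beat rates 60 ./ R-R instead would give a slightly different number whenever the intervals are unequal.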

Visualization is done in MATLAB using a 4-subplot layout. The first subplot shows the raw ECG signal, followed by the baseline-corrected version in the second. The third subplot displays the signal after noise removal, and the final subplot shows the fully filtered ECG signal with R-peaks marked using red circles. The heart rate is also displayed in a text annotation on the plot and printed in the command window.
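
Because the peaks are detected on the smoothed derivative rather than on the ECG itself, the marker indices may sit a few samples away from the true R-peak maxima in the filtered signal. One possible refinement, shown here as a sketch and not as part of the project, snaps each detected index to the largest sample of the filtered ECG within half a smoothing window. The toy data, the search radius, and the name r_locations are all assumptions for illustration.

```matlab
fs = 500;
half_win = floor(round(0.150 * fs) / 2);    % 37 samples; search radius is an assumption

% Toy data: two unit "R-peaks" at known positions, with envelope peaks
% that are deliberately a few samples off
filtered_ecg = zeros(1000, 1);
filtered_ecg([203 710]) = 1;
peak_locations = [190; 720];

r_locations = zeros(size(peak_locations));
for k = 1:numel(peak_locations)
    lo = max(1, peak_locations(k) - half_win);                    % clamp to signal start
    hi = min(numel(filtered_ecg), peak_locations(k) + half_win);  % clamp to signal end
    [~, idx] = max(filtered_ecg(lo:hi));                          % largest sample in window
    r_locations(k) = lo + idx - 1;                                % back to absolute index
end

disp(r_locations');
```

Plotting filtered_ecg(r_locations) instead of filtered_ecg(peak_locations) would then place the red circles on the local maxima. This assumes upright R-waves; for inverted leads the search would need to use the absolute value or the minimum.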

The following MATLAB code demonstrates the implementation of this entire process:

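clc; clear; close all; tic;
load('rec_5.mat');                 % expects a variable named val
% val(:) flattens column-wise; if val holds more than one channel, select a single row instead
ecg_data = double(val(:));
% Despite the variable name, only the first 4000 samples (8 s at 500 Hz) are kept
ecg_data_5000 = ecg_data(1:4000);
if any(~isfinite(ecg_data_5000))
    % fillmissing only fills NaN, so mark Inf values as missing first
    ecg_data_5000(~isfinite(ecg_data_5000)) = NaN;
    ecg_data_5000 = fillmissing(ecg_data_5000, 'linear');
end
fs = 500;                          % sampling rate in Hz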

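% High-pass at 0.5 Hz removes baseline wander; filtfilt gives zero-phase filtering
[b_highpass, a_highpass] = butter(4, 0.5 / (fs / 2), 'high');
filtered_ecg_baseline = filtfilt(b_highpass, a_highpass, ecg_data_5000);

% Low-pass at 40 Hz suppresses muscle and instrumentation noise
[b_lowpass, a_lowpass] = butter(4, 40 / (fs / 2), 'low');
filtered_ecg_baseline_noise = filtfilt(b_lowpass, a_lowpass, filtered_ecg_baseline);

% Notch at 50 Hz; iirnotch takes the normalized center frequency and bandwidth (w0 / Q)
f_notch = 50; Q = 30;
w0 = f_notch / (fs / 2);
[b_notch, a_notch] = iirnotch(w0, w0 / Q);
filtered_ecg = filtfilt(b_notch, a_notch, filtered_ecg_baseline_noise);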

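% QRS enhancement: derivative -> rectification -> 150 ms moving average
ecg_derivative = diff(filtered_ecg);      % one sample shorter than filtered_ecg
ecg_rectified = abs(ecg_derivative);
window_size = round(0.150 * fs);          % 75 samples at 500 Hz
ecg_smoothed = movmean(ecg_rectified, window_size);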

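% Peak detection on the smoothed envelope; MinPeakHeight depends on the signal's
% amplitude units, MinPeakDistance of 250 samples is 0.5 s
[peaks, peak_locations] = findpeaks(ecg_smoothed, 'MinPeakHeight', 0.4, 'MinPeakDistance', 250);
time_between_peaks = diff(peak_locations) / fs;   % R-R intervals in seconds
heart_rate_bpm = 60 / mean(time_between_peaks);

% Retry with a larger minimum distance if the result is outside the
% range expected for this recording (80-95 BPM)
if heart_rate_bpm < 80 || heart_rate_bpm > 95
    [peaks, peak_locations] = findpeaks(ecg_smoothed, 'MinPeakHeight', 0.4, 'MinPeakDistance', 270);
    time_between_peaks = diff(peak_locations) / fs;
    heart_rate_bpm = 60 / mean(time_between_peaks);
end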

% Four stacked plots: raw, baseline-corrected, low-pass filtered, final with R-peaks
figure;
subplot(4,1,1); plot(ecg_data_5000); title('Raw ECG Signal (First 4000 Samples)'); xlabel('Sample Number'); ylabel('Amplitude');
subplot(4,1,2); plot(filtered_ecg_baseline); title('Baseline-Filtered ECG Signal'); xlabel('Sample Number'); ylabel('Amplitude');
subplot(4,1,3); plot(filtered_ecg_baseline_noise); title('High-Frequency Noise Removed'); xlabel('Sample Number'); ylabel('Amplitude');
% peak_locations index the smoothed envelope, so the markers may sit slightly off the R-peak maxima
subplot(4,1,4); plot(filtered_ecg); hold on; plot(peak_locations, filtered_ecg(peak_locations), 'ro');
title('Final Filtered ECG with R-Peaks'); xlabel('Sample Number'); ylabel('Amplitude'); legend('Filtered ECG', 'Detected R-peaks');

% Show the heart rate on the figure and in the command window
annotation('textbox', [0.1, 0.02, 0.3, 0.05], 'String', ['Heart Rate: ' num2str(heart_rate_bpm) ' BPM'], 'EdgeColor', 'none');
disp(['Heart Rate: ', num2str(heart_rate_bpm), ' BPM']);
toc;