EDD2020 - sporedata/researchdesigneR GitHub Wiki
General description
The EDD2020 (Endoscopy Disease Detection and Segmentation) dataset is an expert-annotated collection of endoscopic images and videos intended to support the development of AI-based tools for disease detection and segmentation in gastrointestinal endoscopy. It lets researchers build, benchmark, and improve computer-aided detection (CAD) systems that help endoscopists detect and outline lesions. The dataset also supports work on real-time assistance systems, with the longer-term aim of improving patient outcomes through more accurate and efficient endoscopic procedures.
EDD2020 contains endoscopic images and videos that capture a wide variety of gastrointestinal abnormalities found during procedures, such as polyps, cancerous lesions, and bleeding. This diversity helps researchers develop algorithms that are robust and generalize across different conditions.
The images in the dataset are annotated by expert gastroenterologists. These annotations serve as "ground truth" for developing and validating computer vision algorithms. The annotations support two tasks:
- Detection: Marking the presence and location of disease.
- Segmentation: Providing pixel-wise annotations to outline the boundaries of lesions, which helps train models for accurate localization of abnormalities.
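The two annotation types above are closely related: a detection box can be derived from a pixel-wise mask, and predicted boxes are commonly compared with reference boxes using intersection over union (IoU). The following is a minimal sketch of that relationship, not the dataset's own tooling. The in-memory mask format (rows of 0/1 values), the inclusive box convention, and the function names `mask_to_box` and `box_iou` are assumptions made for illustration; this page does not describe the on-disk annotation format.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]           # rows of 0/1 pixels (hypothetical in-memory format)
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive pixel indices


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Return the tightest box around all foreground pixels, or None for an empty mask."""
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two inclusive pixel boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = inter_w * inter_h
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


# Toy 4x4 mask with a small lesion in the middle (made-up data).
toy_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(mask_to_box(toy_mask))  # (1, 1, 2, 2)
```

An IoU threshold (0.5 is a common choice in detection benchmarks) is typically used to decide whether a predicted box counts as a correct detection; the threshold used for any EDD2020 evaluation should be taken from the official challenge materials rather than from this sketch.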
Data Categories
- Image and Video Data: The EDD2020 dataset contains still images and video sequences from endoscopic procedures. This variety allows researchers to develop models capable of static image analysis and real-time video processing.
- Multiple Disease Types: The dataset includes a variety of gastrointestinal conditions, such as polyps, esophagitis, ulcers, and early-stage cancers. This makes it suitable for developing generalized AI tools applicable to different types of gastrointestinal diseases.
- Annotations: Expert-annotated masks for each lesion indicate the boundaries of abnormal tissue. These detailed annotations are critical for training deep learning models in segmentation tasks.
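To show how expert masks are typically used as ground truth when validating a segmentation model, the sketch below computes the Dice coefficient between a predicted mask and a reference mask. It is an illustrative example under stated assumptions: masks are held as rows of 0/1 values, the function name `dice_score` is hypothetical, and returning 1.0 when both masks are empty is one common convention rather than a rule taken from EDD2020.

```python
from typing import List

Mask = List[List[int]]  # rows of 0/1 pixels (hypothetical in-memory format)


def dice_score(pred: Mask, truth: Mask) -> float:
    """Dice coefficient: 2 * |pred AND truth| / (|pred| + |truth|)."""
    p = [bool(value) for row in pred for value in row]
    t = [bool(value) for row in truth for value in row]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")
    overlap = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(p) + sum(t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * overlap / total


# Made-up example: the prediction misses one of four lesion pixels.
truth_mask = [[0, 1, 1],
              [0, 1, 1]]
pred_mask = [[0, 0, 1],
             [0, 1, 1]]
print(round(dice_score(pred_mask, truth_mask), 3))  # 0.857
```

Dice weights the overlap twice relative to the combined size of both masks, so it is more forgiving of small boundary errors than IoU on the same pair of masks.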
Limitations
- Data Complexity: Endoscopic images are often degraded by motion blur, varied lighting conditions, and occluding elements such as bubbles, mucus, or residual food particles. The EDD2020 dataset includes these complexities, which makes training robust AI models harder but yields models better suited to real clinical conditions.
- Generalization: One of the primary goals for models trained on EDD2020 is to generalize across different patient populations, endoscope types, and procedural conditions. This requires handling substantial variability in image quality, anatomical differences, and disease presentations.
- Annotation Quality: The accuracy of segmentation and detection models relies heavily on the quality of the annotations. EDD2020 provides expert annotations, which sets a high standard for training, but model performance remains bounded by annotation quality and still needs to be validated before integration into clinical settings.
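One practical consequence of the generalization point above is that evaluation splits should keep all images from the same source (for example, the same patient or video) on one side of the split, so that test results are not inflated by near-duplicate frames. The sketch below shows one way to do this with a deterministic hash of a group identifier. The `(image_name, group_id)` record format, the identifiers in the example, and the function name `split_by_group` are assumptions for illustration; this page does not state whether EDD2020 ships patient or video identifiers.

```python
import hashlib
from typing import Dict, List, Tuple


def split_by_group(
    records: List[Tuple[str, str]], test_fraction: float = 0.2
) -> Dict[str, List[str]]:
    """Assign each whole group (e.g. one patient or video) to train or test.

    records holds (image_name, group_id) pairs. Hashing the group id makes the
    assignment deterministic, so the split is reproducible across runs.
    """
    split: Dict[str, List[str]] = {"train": [], "test": []}
    for image_name, group_id in records:
        digest = hashlib.sha256(group_id.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
        side = "test" if bucket < test_fraction else "train"
        split[side].append(image_name)
    return split


# Hypothetical records: image file names paired with made-up group ids.
example_records = [
    ("frame_001.jpg", "video_A"),
    ("frame_002.jpg", "video_A"),
    ("frame_003.jpg", "video_B"),
    ("frame_004.jpg", "video_C"),
    ("frame_005.jpg", "video_C"),
]
example_split = split_by_group(example_records)
```

Because whole groups are assigned together, the realized test share only approximates `test_fraction`, especially when there are few groups.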
Related publications
Data access
- For more information on the EDD2020 dataset, visit https://ieee-dataport.org/competitions/endoscopy-disease-detection-and-segmentation-edd2020
- To find the EDD2020 data files, visit https://ieee-dataport.org/datasets