PTB-XL+ Dataset Analysis Report
PTB-XL+: A Comprehensive Electrocardiographic Feature Dataset (physionet.org)
Dataset Summary
High-level counts across all data sources in PTB-XL+ v1.0.1.
Record Coverage by Source
Number of ECG records available in each component versus the canonical RECORDS file.
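The coverage comparison can be sketched with plain set operations. This is a minimal illustration, assuming record IDs are already loaded as integers. The source names `features_a` and `features_b` and the toy IDs are hypothetical, not taken from the dataset.

```python
def record_coverage(canonical_ids, source_ids):
    """Compare each source's record IDs against the canonical list.

    canonical_ids: iterable of record IDs (e.g. parsed from a RECORDS file).
    source_ids: dict mapping a source name to an iterable of record IDs.
    """
    canonical = set(canonical_ids)
    report = {}
    for name, ids in source_ids.items():
        ids = set(ids)
        report[name] = {
            "present": len(ids & canonical),     # records found in both
            "missing": len(canonical - ids),     # canonical records the source lacks
            "unexpected": len(ids - canonical),  # source records not in the canonical list
        }
    return report


# Toy data; real IDs would come from the dataset files.
canonical = [1, 2, 3, 4, 5]
sources = {"features_a": [1, 2, 3, 4, 5], "features_b": [1, 2, 4, 9]}
coverage = record_coverage(canonical, sources)
for name, counts in coverage.items():
    print(name, counts)
```

Reporting "unexpected" IDs separately helps catch ID-format mismatches that would otherwise look like missing records.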
Feature Missing Values
Percentage of missing values per feature, by algorithm source. Features with more than 5% missing values are flagged.
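One way the missing-value percentages and the 5% flag might be computed is sketched below. It assumes each record is a dict of feature values in which a missing value is `None` or NaN. The feature names `qrs_dur` and `pr_int` are hypothetical placeholders.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_percentages(rows):
    """Return {feature: percent of rows in which the feature is missing}."""
    features = sorted({key for row in rows for key in row})
    total = len(rows)
    result = {}
    for feature in features:
        n_missing = sum(1 for row in rows if is_missing(row.get(feature)))
        result[feature] = 100.0 * n_missing / total if total else 0.0
    return result


def flag_features(percentages, threshold=5.0):
    """List features whose missing percentage exceeds the threshold."""
    return [name for name, pct in percentages.items() if pct > threshold]


# Toy rows with hypothetical feature names.
rows = [
    {"qrs_dur": 92.0, "pr_int": 160.0},
    {"qrs_dur": 88.0, "pr_int": None},
    {"qrs_dur": 101.0, "pr_int": float("nan")},
    {"qrs_dur": 95.0, "pr_int": 148.0},
]
percentages = missing_percentages(rows)
flagged = flag_features(percentages)
print(percentages, flagged)
```

A feature absent from a row counts as missing here, because `row.get(feature)` returns `None` for absent keys.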
Key Interval & Amplitude Statistics
Summary statistics for clinically important ECG features (P/QRS/T durations, PR/QT intervals, amplitudes) across all records.
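As a rough sketch, per-feature summary statistics can be computed with the standard library while skipping missing values. The choice of statistics (count, mean, sample standard deviation, min, median, max) and the toy values are assumptions for illustration, not the report's actual output.

```python
import math
import statistics


def summarize(values):
    """Summary statistics over the non-missing values of one feature."""
    clean = [v for v in values if v is not None and not math.isnan(v)]
    if not clean:
        return {"count": 0}
    return {
        "count": len(clean),
        "mean": statistics.fmean(clean),
        # Sample standard deviation needs at least two values.
        "std": statistics.stdev(clean) if len(clean) > 1 else 0.0,
        "min": min(clean),
        "median": statistics.median(clean),
        "max": max(clean),
    }


# Toy QT intervals in milliseconds; None marks a missing measurement.
qt_interval_ms = [380.0, 400.0, None, 420.0]
summary = summarize(qt_interval_ms)
print(summary)
```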
PTB-XL Label Frequency (SCP Codes)
Most common SCP diagnostic codes from cardiologist annotations in PTB-XL.
12SL Algorithm Statement Frequency
Most common automated diagnostic statements from the 12SL algorithm.
Label Co-occurrence
Number of records in which each pair of labels appears together (only the most frequent labels are shown).
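Pairwise co-occurrence can be counted as in the sketch below, assuming each record carries a list of label codes. The records are toy data. Each unordered pair is counted once per record.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(label_sets):
    """Count, for each unordered label pair, the records containing both."""
    counts = Counter()
    for labels in label_sets:
        # Sort and deduplicate so (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(labels)), 2):
            counts[pair] += 1
    return counts


# Toy records; each inner list holds the labels of one record.
records = [["NORM", "SR"], ["SR", "IMI"], ["NORM", "SR"], ["AFIB"]]
pair_counts = cooccurrence_counts(records)
print(pair_counts.most_common(3))
```

Restricting the input to the most frequent labels before counting keeps the resulting matrix small enough to display.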
SNOMED CT Coverage
How many SNOMED CT concepts from the dataset description file are covered by each label source.
Fiducial Point Coverage (ECGDeli)
Number of records with ECGDeli delineation annotations per lead.
Median Beat Coverage
Number of records with median beat waveform files (.dat/.hea) for each algorithm.
Cross-Source Label Agreement
Agreement between PTB-XL cardiologist annotations and 12SL algorithm statements, compared via shared SNOMED CT concepts. "Both" = concept flagged by both sources in the same record.
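The per-concept agreement counts could be derived as sketched here. It assumes both label sources have already been mapped to SNOMED CT concepts per record. The concept identifiers `A`, `B`, `C` and the category names `ptbxl_only` and `12sl_only` are hypothetical placeholders.

```python
from collections import defaultdict


def agreement_counts(ptbxl, twelve_sl):
    """Count records per concept flagged by both sources or by only one.

    ptbxl, twelve_sl: dicts mapping record ID to a set of concept IDs.
    """
    counts = defaultdict(lambda: {"both": 0, "ptbxl_only": 0, "12sl_only": 0})
    for record_id in set(ptbxl) | set(twelve_sl):
        a = set(ptbxl.get(record_id, ()))
        b = set(twelve_sl.get(record_id, ()))
        for concept in a & b:
            counts[concept]["both"] += 1
        for concept in a - b:
            counts[concept]["ptbxl_only"] += 1
        for concept in b - a:
            counts[concept]["12sl_only"] += 1
    return dict(counts)


# Toy mappings with placeholder concept IDs.
ptbxl_concepts = {1: {"A", "B"}, 2: {"A"}, 3: set()}
twelve_sl_concepts = {1: {"A"}, 2: {"A", "C"}, 3: {"B"}}
agreement = agreement_counts(ptbxl_concepts, twelve_sl_concepts)
for concept in sorted(agreement):
    print(concept, agreement[concept])
```

Only concepts shared by both mappings can ever reach "both", so concepts covered by a single source should be read as a coverage gap rather than as disagreement.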