GEMLeR - Gene Expression Machine Learning Repository

Download

All GEMLeR datasets are available as compressed .zip archives.

GEMLeR v1.0 (Released 10th October 2008):

GEMLeR_ARFF_short.zip (9 OVA + 36 AP datasets, ARFF format, 706 MB)
 
GEMLeR_CSV_short.zip (9 OVA + 36 AP datasets, CSV format, 688 MB)


Both archives contain the following:

AP ("all-paired") Datasets

Filename                      Num. Samples  Num. Genes  Num. Class1 / Num. Class2
AP_Breast_Colon.arff          630           10937       344 / 286
AP_Breast_Kidney.arff         604           10937       344 / 260
AP_Breast_Lung.arff           470           10937       344 / 126
AP_Breast_Omentum.arff        421           10937       344 / 77
AP_Breast_Ovary.arff          542           10937       344 / 198
AP_Breast_Prostate.arff       413           10937       344 / 69
AP_Breast_Uterus.arff         468           10937       344 / 124
AP_Colon_Kidney.arff          546           10937       286 / 260
AP_Colon_Lung.arff            412           10937       286 / 126
AP_Colon_Omentum.arff         363           10937       286 / 77
AP_Colon_Ovary.arff           484           10937       286 / 198
AP_Colon_Prostate.arff        355           10937       286 / 69
AP_Colon_Uterus.arff          410           10937       286 / 124
AP_Endometrium_Breast.arff    405           10937       61 / 344
AP_Endometrium_Colon.arff     347           10937       61 / 286
AP_Endometrium_Kidney.arff    321           10937       61 / 260
AP_Endometrium_Lung.arff      187           10937       61 / 126
AP_Endometrium_Omentum.arff   138           10937       61 / 77
AP_Endometrium_Ovary.arff     259           10937       61 / 198
AP_Endometrium_Prostate.arff  130           10937       61 / 69
AP_Endometrium_Uterus.arff    185           10937       61 / 124
AP_Lung_Kidney.arff           386           10937       126 / 260
AP_Lung_Uterus.arff           250           10937       126 / 124
AP_Omentum_Kidney.arff        337           10937       77 / 260
AP_Omentum_Lung.arff          203           10937       77 / 126
AP_Omentum_Ovary.arff         275           10937       77 / 198
AP_Omentum_Prostate.arff      146           10937       77 / 69
AP_Omentum_Uterus.arff        201           10937       77 / 124
AP_Ovary_Kidney.arff          458           10937       198 / 260
AP_Ovary_Lung.arff            324           10937       198 / 126
AP_Ovary_Uterus.arff          322           10937       198 / 124
AP_Prostate_Kidney.arff       329           10937       69 / 260
AP_Prostate_Lung.arff         195           10937       69 / 126
AP_Prostate_Ovary.arff        267           10937       69 / 198
AP_Prostate_Uterus.arff       193           10937       69 / 124
AP_Uterus_Kidney.arff         384           10937       124 / 260
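
Each AP dataset holds the samples of exactly two tissue types, so its sample count is the sum of the two class sizes, and the 9 tissues give 36 unordered pairs. The following is a minimal Python sketch of that arithmetic, not part of the GEMLeR distribution. The per-tissue counts are copied from the class columns of the tables on this page; the names `TISSUE_COUNTS` and `ap_sample_count` are invented here for illustration.

```python
from itertools import combinations

# Per-tissue sample counts, copied from the class columns on this page.
TISSUE_COUNTS = {
    "Breast": 344, "Colon": 286, "Endometrium": 61, "Kidney": 260,
    "Lung": 126, "Omentum": 77, "Ovary": 198, "Prostate": 69, "Uterus": 124,
}

def ap_sample_count(tissue_a, tissue_b):
    """Number of samples in the AP dataset that pairs two tissues."""
    return TISSUE_COUNTS[tissue_a] + TISSUE_COUNTS[tissue_b]

# Every unordered pair of the 9 tissues corresponds to one AP dataset.
pairs = list(combinations(sorted(TISSUE_COUNTS), 2))
print(len(pairs))                           # 36
print(ap_sample_count("Breast", "Colon"))   # 630
```

The order of the two tissues inside a generated pair need not match the order used in the archive filenames (for example, AP_Endometrium_Breast.arff), so the sketch computes counts only and does not build filenames.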

OVA ("one-versus-all") Datasets

Filename                      Num. Samples  Num. Genes  Num. Class1 / Num. Class2
OVA_Breast.arff               1545          10937       344 / 1201
OVA_Colon.arff                1545          10937       286 / 1259
OVA_Endometrium.arff          1545          10937       61 / 1484
OVA_Kidney.arff               1545          10937       260 / 1285
OVA_Lung.arff                 1545          10937       126 / 1419
OVA_Omentum.arff              1545          10937       77 / 1468
OVA_Ovary.arff                1545          10937       198 / 1347
OVA_Prostate.arff             1545          10937       69 / 1476
OVA_Uterus.arff               1545          10937       124 / 1421
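
The OVA datasets are strongly imbalanced: every one contains all 1545 samples, with a single tissue as Class1 and all other tissues as Class2. As a rough illustration of what that means for evaluation, the Python sketch below computes the accuracy a classifier would reach by always predicting the larger class. The counts are copied from the OVA table; the names `OVA_COUNTS` and `majority_baseline` are hypothetical and used only for this example.

```python
# (Class1, Class2) counts copied from the OVA table on this page.
OVA_COUNTS = {
    "OVA_Breast": (344, 1201),
    "OVA_Colon": (286, 1259),
    "OVA_Endometrium": (61, 1484),
    "OVA_Kidney": (260, 1285),
    "OVA_Lung": (126, 1419),
    "OVA_Omentum": (77, 1468),
    "OVA_Ovary": (198, 1347),
    "OVA_Prostate": (69, 1476),
    "OVA_Uterus": (124, 1421),
}

def majority_baseline(class1, class2):
    """Accuracy obtained by always predicting the larger class."""
    return max(class1, class2) / (class1 + class2)

for name, (c1, c2) in OVA_COUNTS.items():
    # Every OVA dataset holds all 1545 samples.
    assert c1 + c2 == 1545
    print(f"{name}: {majority_baseline(c1, c2):.3f}")
```

For OVA_Prostate, for instance, the baseline is 1476 / 1545, about 0.955, so plain accuracy on that dataset should be read against this figure.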



GEMLeR Full Datasets


GEMLeR FULL v1.0 (Released 30th October 2008):

GEMLeR_AP_CSV_full.zip (36 AP datasets, CSV format, 1.30 GB)
GEMLeR_OVA_CSV_full.zip (9 OVA datasets, CSV format, 1.46 GB)

GEMLeR FULL contains all datasets in their original dimension (54681 probes).
You might encounter problems working with the full versions of the datasets on some non-64-bit operating systems with less than 3 GB of RAM, because the OVA datasets have high memory demands when building complex classifiers.
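
To give a rough sense of scale for the memory warning, the Python sketch below estimates the size of one full OVA dataset held as a dense numeric matrix. It assumes 8 bytes per value (double precision) and ignores any per-object overhead of a particular toolkit; the function name `dense_matrix_bytes` is hypothetical.

```python
def dense_matrix_bytes(n_samples, n_features, bytes_per_value=8):
    """Bytes needed for a dense n_samples x n_features numeric matrix.

    Assumption: 8 bytes per value (double precision), no container overhead.
    """
    return n_samples * n_features * bytes_per_value

# Full OVA dataset: 1545 samples x 54681 probes.
full_ova = dense_matrix_bytes(1545, 54681)
print(f"{full_ova / 2**20:.0f} MiB")   # 645 MiB
```

Under these assumptions a single in-memory copy takes about 645 MiB. A learning algorithm that keeps several copies or large intermediate structures could plausibly approach the address space available to a 32-bit process, which is consistent with the warning.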