Genes
KDD Cup 2001 prediction of gene/protein function and localization.
Original source: pages.cs.wisc.edu (BibTeX)
Versions
Genes (by Jan Motl)
Dataset details
- Associated task: Classification
- Domain: Medicine
- Data types:
- Size: 1.8 MB
- Count of tables: 3
- Count of rows: 6,063
- Count of columns: 15
- Missing values: No
- Compound keys: No
- Loops: Yes
- Type: Real
- Instance count: 862
- Target table: Classification
- Target column: Localization
- Target ID: GeneID
- Target timestamp: ?
Algorithms
Dataset version | Target | Algorithm | Author text | Measure | Value |
---|---|---|---|---|---|
genes | localization | MRDTL | MRDTL: A multi-relational decision tree learning algorithm | Accuracy | 0.85 |
genes | function | MRDTL-2 | A Multi-Relational Decision Tree Learning Algorithm – Implementation and Experiments | Accuracy | 0.9144 |
genes | localization | MRDTL-2 | A Multi-Relational Decision Tree Learning Algorithm – Implementation and Experiments | Accuracy | 0.7611 |
genes | localization | Predictor Factory | Predictor Factory | Accuracy | 0.6161 |
genes | growth | RELAGGS | Comparative Evaluation of Approaches to Propositionalization | Accuracy | 0.78 |
genes | nucleus | RELAGGS | Comparative Evaluation of Approaches to Propositionalization | Accuracy | 0.8 |
genes | nucleus | RPT | Learning Relational Probability Trees | Accuracy | 0.89 |
genes | growth | RSD | Comparative Evaluation of Approaches to Propositionalization | Accuracy | 0.825 |
genes | nucleus | RSD | Comparative Evaluation of Approaches to Propositionalization | Accuracy | 0.84 |
genes | growth | SINUS | Comparative Evaluation of Approaches to Propositionalization | Accuracy | 0.86 |
How to download the dataset
The dataset is publicly available directly from a MariaDB database.
- Open your favourite MariaDB client (MySQL Workbench works, but see FAQ)
- Use the following credentials:
- hostname: db.relational-data.org
- port: 3306
- username: guest
- password: relational
- Export the "genes" database (or another version of the dataset, if available) in your favourite format (e.g. CSV or SQL dump).
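
The steps above can also be scripted instead of using a graphical client. The following is a minimal sketch in Python, assuming the third-party `pymysql` driver is installed and the server is reachable with the credentials listed above. The function names `export_table` and `export_database` and the output directory `genes_csv` are illustrative choices, not part of the dataset or the site.

```python
import csv
import re
from pathlib import Path

# Connection details copied from the credentials listed above.
CONFIG = {
    "host": "db.relational-data.org",
    "port": 3306,
    "user": "guest",
    "password": "relational",
    "database": "genes",
}


def export_table(conn, table, out_dir):
    """Write every row of `table` to <out_dir>/<table>.csv and return the row count.

    `conn` is any DB-API connection (hypothetical helper, not a site API).
    """
    # Table names cannot be passed as query parameters, so validate before quoting.
    if not re.fullmatch(r"\w+", table):
        raise ValueError(f"unexpected table name: {table!r}")
    cur = conn.cursor()
    try:
        cur.execute(f"SELECT * FROM `{table}`")
        header = [col[0] for col in cur.description]
        rows = cur.fetchall()
    finally:
        cur.close()
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(out_dir / f"{table}.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)


def export_database(out_dir="genes_csv"):
    """Connect to the public server and export each table to its own CSV file."""
    import pymysql  # third-party driver (assumption): pip install pymysql

    conn = pymysql.connect(**CONFIG)
    try:
        cur = conn.cursor()
        cur.execute("SHOW TABLES")
        tables = [row[0] for row in cur.fetchall()]
        cur.close()
        for table in tables:
            print(table, export_table(conn, table, out_dir))
    finally:
        conn.close()
```

Calling `export_database()` would write one CSV file per table into `genes_csv`. The table names are read from the server with `SHOW TABLES` rather than hard-coded, so the same sketch should work for another version of the dataset by changing the `database` entry in `CONFIG`.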