FTP
PAKDD'15 Data Mining Competition: The task is to predict a user's gender from product viewing logs. The data were obtained by simulating the product viewing activity of users with known gender. The data closely follow the real-life distribution in that regard.
Original source: knowledgepit.fedcsis.org
Versions
Ftp (by Jan Motl)
Dataset details
- Associated task: Classification
- Domain: Retail
- Data types:
- Size: 7.5 MB
- Count of tables: 2
- Count of rows: 96,118
- Count of columns: 10
- Missing values: Yes
- Compound keys: No
- Loops: No
- Type: Synthetic
- Instance count: 29,555
- Target table: session
- Target column: gender
- Target ID: session_id
- Target timestamp: ?
Algorithms
Dataset version | Target | Algorithm | Author text | Measure | Value |
---|---|---|---|---|---|
ftp | gender | Predictor Factory | Predictor Factory | Accuracy | 0.7773 |
How to download the dataset
The datasets are publicly available directly from a MariaDB database.
- Open your favourite MariaDB client (MySQL Workbench works, but see the FAQ)
- Use the following credentials:
- hostname: db.relational-data.org
- port: 3306
- username: guest
- password: relational
- Export the "ftp" database (or another version of the dataset, if available) in your favourite format (e.g. CSV or SQL dump).
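
The export step can also be scripted instead of done through a graphical client. The following is a minimal sketch in Python, not an official download tool: it assumes the third-party `pymysql` driver is installed, and the helper names `export_table` and `export_database` are hypothetical. Only the credentials and the database name come from the list above.

```python
import csv
import os

# Connection settings copied from the credentials listed above.
CONFIG = {
    "host": "db.relational-data.org",
    "port": 3306,
    "user": "guest",
    "password": "relational",
    "database": "ftp",
}


def export_table(conn, table, path):
    """Dump every row of `table` to a CSV file at `path`; return the row count.

    `conn` can be any DB-API 2.0 connection. The table name is interpolated
    into the query, so pass only names you trust.
    """
    cur = conn.cursor()
    try:
        cur.execute(f"SELECT * FROM `{table}`")
        header = [col[0] for col in cur.description]
        rows = cur.fetchall()
    finally:
        cur.close()
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)


def export_database(out_dir="."):
    """Export every table of the configured database to `<out_dir>/<table>.csv`.

    Assumption: the pymysql package is available (pip install pymysql).
    Returns a dict mapping table name to the number of exported rows.
    """
    import pymysql  # imported lazily so export_table works without it

    conn = pymysql.connect(**CONFIG)
    try:
        cur = conn.cursor()
        cur.execute("SHOW TABLES")  # MariaDB/MySQL-specific statement
        tables = [row[0] for row in cur.fetchall()]
        cur.close()
        return {
            t: export_table(conn, t, os.path.join(out_dir, f"{t}.csv"))
            for t in tables
        }
    finally:
        conn.close()
```

Calling `export_database("ftp_csv")` (with an existing `ftp_csv` directory) should write one CSV file per table. The whole table is fetched into memory at once, which is acceptable for a dataset of this size (7.5 MB) but would need batching for larger ones.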