FTP

PAKDD'15 Data Mining Competition: The task is to infer each user's gender from product viewing logs. The data were obtained by simulating the product viewing activities of users with known gender, and they closely follow the real-life distribution in that regard.

Original source: knowledgepit.fedcsis.org

Versions

  • Ftp (by Jan Motl)

Dataset details

Associated task: Classification
Domain: Retail
Data types:
Size: 7.5 MB
Count of tables: 2
Count of rows: 96,118
Count of columns: 10
Missing values: Yes
Compound keys: No
Loops: No
Type: Synthetic
Instance count: 29,555
Target table: session
Target column: gender
Target ID: session_id
Target timestamp: ?

Algorithms

Dataset version | Target | Algorithm         | Author text       | Measure  | Value
ftp             | gender | Predictor Factory | Predictor Factory | Accuracy | 0.7773

How to download the dataset

The datasets are publicly available directly from a MariaDB database.

  1. Open your favourite MariaDB client (MySQL Workbench works, but see FAQ)
  2. Use the following credentials:
    • hostname: db.relational-data.org
    • port: 3306
    • username: guest
    • password: relational
  3. Export the "ftp" database (or another version of the dataset, if available) in your favourite format (e.g. CSV or SQL dump).
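
The steps above can also be scripted. The following is a minimal sketch in Python, not an official export tool: it assumes the third-party PyMySQL driver is installed (`pip install pymysql`) and that the server accepts the guest credentials listed above. The helper names (`quote_identifier`, `write_csv`, `export_database`) and the output directory `ftp_csv` are illustrative choices, not part of the repository.

```python
import csv
from pathlib import Path

# Connection details copied from the credentials listed above.
DB_CONFIG = {
    "host": "db.relational-data.org",
    "port": 3306,
    "user": "guest",
    "password": "relational",
    "database": "ftp",
}


def quote_identifier(name):
    """Backtick-quote a MariaDB identifier, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def write_csv(path, header, rows):
    """Write a header plus data rows to a CSV file and return the row count.

    NULL values arrive as None and are written as empty fields.
    """
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for row in rows:
            writer.writerow(row)
            count += 1
    return count


def export_database(out_dir="ftp_csv"):
    """Dump every table of the database into out_dir as <table>.csv."""
    # Imported lazily so the helpers above work without the driver installed.
    import pymysql  # assumption: third-party driver, not part of the stdlib

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    connection = pymysql.connect(**DB_CONFIG)
    try:
        with connection.cursor() as cursor:
            cursor.execute("SHOW TABLES")
            tables = [row[0] for row in cursor.fetchall()]
            for table in tables:
                cursor.execute("SELECT * FROM " + quote_identifier(table))
                header = [column[0] for column in cursor.description]
                written = write_csv(
                    Path(out_dir) / (table + ".csv"), header, cursor.fetchall()
                )
                print(table, written, "rows")
    finally:
        connection.close()
```

Calling `export_database()` should then write one CSV file per table into `ftp_csv`. Table names are discovered with `SHOW TABLES` rather than hard-coded, so the same sketch should work for another dataset version after changing `DB_CONFIG["database"]`.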