Shakespeare
Alternative names: OSS
The Open Source Shakespeare is a collection of Shakespeare's complete works. This is a much more interesting dataset than some boring imaginary online retailer. In this dataset, people die! The task is to predict which character speaks the lines.
Original source: www.opensourceshakespeare.org
Versions
Shakespeare (by Jan Motl)
- We use a normalized database schema from https://github.com/mozz100/bardofavon.
Dataset details
- Associated task:
- Classification
- Domain:
- Entertainment
- Data types:
- Size:
- 8.8 MB
- Count of tables:
- 4
- Count of rows:
- 35,234
- Count of columns:
- 19
- Missing values:
- No
- Compound keys:
- No
- Loops:
- No
- Type:
- Real
- Instance count:
- 32,980
- Target table:
- paragraphs
- Target column:
- character_id
- Target ID:
- id
- Target timestamp:
- ?
How to download the dataset
The dataset is publicly available directly from a MariaDB database.
- Open your favourite MariaDB client (MySQL Workbench works, but see FAQ)
- Use the following credentials:
- hostname: db.relational-data.org
- port: 3306
- username: guest
- password: relational
- Export the "Shakespeare" database (or another version of the dataset, if available) in your favourite format (e.g. CSV or SQL dump).
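For those who prefer a script to a graphical client, the steps above can be sketched in Python. This is a minimal sketch, not an official download tool: it assumes the third-party PyMySQL driver is installed and that the guest account is allowed to run SHOW TABLES. The function names export_table and export_database are hypothetical, chosen only for this illustration. The host, port, username, password and database name are the ones listed above.

```python
import csv

# Connection settings copied from the credentials listed above.
CONNECTION = {
    "host": "db.relational-data.org",
    "port": 3306,
    "user": "guest",
    "password": "relational",
    "database": "Shakespeare",
}


def export_table(cursor, table, path):
    """Write every row of `table` to a CSV file at `path`; return the row count.

    Works with any DB-API 2.0 cursor (hypothetical helper for this sketch).
    """
    # Table names cannot be bound as query parameters, so quote the
    # identifier with backticks and escape any embedded backtick.
    quoted = "`" + table.replace("`", "``") + "`"
    cursor.execute(f"SELECT * FROM {quoted}")
    header = [column[0] for column in cursor.description]
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for row in cursor:
            writer.writerow(row)
            count += 1
    return count


def export_database(out_dir="."):
    """Export each table of the database to <out_dir>/<table>.csv."""
    import os

    import pymysql  # third-party driver, assumed installed: pip install pymysql

    connection = pymysql.connect(**CONNECTION)
    try:
        with connection.cursor() as cursor:
            # Discover the tables instead of hard-coding their names.
            cursor.execute("SHOW TABLES")
            tables = [row[0] for row in cursor.fetchall()]
            for table in tables:
                target = os.path.join(out_dir, f"{table}.csv")
                rows = export_table(cursor, table, target)
                print(f"{table}: {rows} rows")
    finally:
        connection.close()
```

Calling export_database() from a Python session should write one CSV file per table into the current directory, provided the server is reachable. The export logic lives in export_table, separate from the connection code, so it can be tried against a local database first.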