IMDb provides a subset of their data in tab-separated format for personal and non-commercial use. You can find more information, including legal, at IMDb Non-Commercial Datasets. These are some notes on the schema.
There are 7 files provided, at the time of writing this article:
| # | Name | Compressed size (MB) | Uncompressed size (MB) | Number of rows |
|---|---|---|---|---|
| 1 | name.basics.tsv.gz |
245 | 753 | 12,981,035 |
| 2 | title.akas.tsv.gz |
305 | 1783 | 37,728,267 |
| 3 | title.basics.tsv.gz |
172 | 841 | 10,285,368 |
| 4 | title.crew.tsv.gz |
66 | 325 | 10,285,368 |
| 5 | title.episode.tsv.gz |
41 | 196 | 7,844,603 |
| 6 | title.principals.tsv.gz |
436 | 2475 | 58,914,239 |
| 7 | title.ratings.tsv.gz |
7 | 23 | 1,366,240 |
There are 2 unique alphanumeric identifiers in those files:
tconstis an ID for a title, andnconstis an ID for a name.
This diagram shows the relationships between the 7 data exports. This isn't exactly an entity relationship diagram, but it's not too far either.

This diagram was created using DBML and can be imported into dbdiagram.io. Here's the code:
Table name_basics {
nconst string [primary key]
primaryName string
birthYear number
deathYear number
primaryProfession string_array
knownForTitles nconst_array [ref: < title_basics.tconst]
}
Table title_basics {
tconst string [primary key]
titleType string
primaryTitle string
originalTitle string
isAdult boolean
startYear number
endYear number
runtimeMinutes number
genres string_array
}
Table title_akas {
titleId string [ref: > title_basics.tconst]
ordering integer
title string
region string
language string
types string_array
attributes string_array
isOriginalTitle boolean
}
Table title_crew {
tconst string [ref: - title_basics.tconst]
directors nconst_array [ref: > name_basics.nconst]
writers nconst_array [ref: > name_basics.nconst]
}
Table title_episode {
tconst string [primary key]
parentTconst string [ref: > title_basics.tconst]
seasonNumber number
episodeNumber number
}
Table title_principals {
tconst string [ref: - title_basics.tconst]
ordering number
nconst string [ref: - name_basics.nconst]
category string
job string
characters string
}
Table title_ratings {
tconst string [ref: - title_basics.tconst]
averageRating number
numVotes number
}
5 First Rows
Here's a sample of the data, these are the first 5 rows from each export.
name.basics.tsv
| nconst | primaryName | birthYear | deathYear | primaryProfession | knownForTitles |
|---|---|---|---|---|---|
| nm0000001 | Fred Astaire | 1899 | 1987 | soundtrack,actor,miscellaneous | tt0050419,tt0053137,tt0072308,tt0031983 |
| nm0000002 | Lauren Bacall | 1924 | 2014 | actress,soundtrack | tt0075213,tt0037382,tt0038355,tt0117057 |
| nm0000003 | Brigitte Bardot | 1934 | \N | actress,soundtrack,music_department | tt0054452,tt0056404,tt0057345,tt0049189 |
| nm0000004 | John Belushi | 1949 | 1982 | actor,soundtrack,writer | tt0080455,tt0072562,tt0077975,tt0078723 |
| nm0000005 | Ingmar Bergman | 1918 | 2007 | writer,director,actor | tt0083922,tt0069467,tt0050986,tt0050976 |
title.akas.tsv
| titleId | ordering | title | region | language | types | attributes | isOriginalTitle |
|---|---|---|---|---|---|---|---|
| tt0000001 | 1 | Карменсіта | UA | \N | imdbDisplay | \N | 0 |
| tt0000001 | 2 | Carmencita | DE | \N | \N | literal title | 0 |
| tt0000001 | 3 | Carmencita - spanyol tánc | HU | \N | imdbDisplay | \N | 0 |
| tt0000001 | 4 | Καρμενσίτα | GR | \N | imdbDisplay | \N | 0 |
| tt0000001 | 5 | Карменсита | RU | \N | imdbDisplay | \N | 0 |
title.basics.tsv
| tconst | titleType | primaryTitle | originalTitle | isAdult | startYear | endYear | runtimeMinutes | genres |
|---|---|---|---|---|---|---|---|---|
| tt0000001 | short | Carmencita | Carmencita | 0 | 1894 | \N | 1 | Documentary,Short |
| tt0000002 | short | Le clown et ses chiens | Le clown et ses chiens | 0 | 1892 | \N | 5 | Animation,Short |
| tt0000003 | short | Pauvre Pierrot | Pauvre Pierrot | 0 | 1892 | \N | 4 | Animation,Comedy,Romance |
| tt0000004 | short | Un bon bock | Un bon bock | 0 | 1892 | \N | 12 | Animation,Short |
| tt0000005 | short | Blacksmith Scene | Blacksmith Scene | 0 | 1893 | \N | 1 | Comedy,Short |
title.crew.tsv
| tconst | directors | writers |
|---|---|---|
| tt0000001 | nm0005690 | \N |
| tt0000002 | nm0721526 | \N |
| tt0000003 | nm0721526 | \N |
| tt0000004 | nm0721526 | \N |
| tt0000005 | nm0005690 | \N |
title.episode.tsv
| tconst | parentTconst | seasonNumber | episodeNumber |
|---|---|---|---|
| tt0041951 | tt0041038 | 1 | 9 |
| tt0042816 | tt0989125 | 1 | 17 |
| tt0042889 | tt0989125 | \N | \N |
| tt0043426 | tt0040051 | 3 | 42 |
| tt0043631 | tt0989125 | 2 | 16 |
title.principals.tsv
| tconst | ordering | nconst | category | job | characters |
|---|---|---|---|---|---|
| tt0000001 | 1 | nm1588970 | self | \N | ["Self"] |
| tt0000001 | 2 | nm0005690 | director | \N | \N |
| tt0000001 | 3 | nm0374658 | cinematographer | director of photography | \N |
| tt0000002 | 1 | nm0721526 | director | \N | \N |
| tt0000002 | 2 | nm1335271 | composer | \N | \N |
title.ratings.tsv
| tconst | averageRating | numVotes |
|---|---|---|
| tt0000001 | 5.7 | 2004 |
| tt0000002 | 5.8 | 269 |
| tt0000003 | 6.5 | 1903 |
| tt0000004 | 5.5 | 178 |
| tt0000005 | 6.2 | 2685 |