I found these transcripts for Marvel movies online and figured I'd use them to practice some of my NLP skills. The main thing I wanted to do was see which movies are most similar to each other based on their transcripts, using latent Dirichlet allocation (LDA). In addition to the MCU (Marvel Cinematic Universe) movies, this dataset also includes Spider-Man movies, LEGO Marvel movies, X-Men movies, Blade movies, Deadpool movies, Marvel video games, and the Twilight movies.
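As a rough illustration of the comparison step: LDA assigns each transcript a mixture of topics, and two movies can then be compared by how close their mixtures are. The sketch below assumes cosine similarity as the measure, which is one common choice and not necessarily the one used in this repo. The titles `movie_a`, `movie_b`, `movie_c`, the helper names, and the topic weights are all hypothetical placeholders, not output from the dataset.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic-weight vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


def most_similar(title, topic_mix):
    """Rank every other title by similarity to `title`, most similar first."""
    target = topic_mix[title]
    scores = [
        (other, cosine_similarity(target, mix))
        for other, mix in topic_mix.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical per-movie topic mixtures (made-up numbers for illustration only;
# in practice these would come from a fitted LDA model).
topic_mix = {
    "movie_a": [0.7, 0.2, 0.1],
    "movie_b": [0.6, 0.3, 0.1],
    "movie_c": [0.1, 0.1, 0.8],
}

ranking = most_similar("movie_a", topic_mix)
for other, score in ranking:
    print(f"{other}: {score:.3f}")
```

With real data, the vectors would be the per-document topic distributions from whichever LDA implementation is used, one vector per transcript.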
brooksjaredc/marvel_transcripts