In this lesson, we're going to be looking at movie budget and revenue data. This dataset is perfect for trying out some new tools like scikit-learn to run a linear regression and seaborn, a popular data visualisation library built on top of Matplotlib. 

The question we want to answer today is: Do higher film budgets lead to more revenue in the box office? In other words, should a movie studio spend more on a film to make more? 


Today you'll learn:


Download and add the Notebook to Google Drive

As usual, download the .zip file from this lesson and extract it. Add the .ipynb file into your Google Drive and open it as a Google Colaboratory notebook.


Add the Data to the Notebook

The .zip file also includes a .csv file called cost_revenue_dirty. This is the data for the project. Add this file to your notebook.