How to work with Kaggle data in your local Jupyter Notebook

This post briefly describes how to use Kaggle data in your local Jupyter Notebook.

Env details:

  • Ubuntu
  • Python 3.6.3

Steps

We need the following steps for our task:

  1. Download the file from Kaggle to your local machine.
  2. Unzip the Zip file.
  3. Read the file from your Jupyter Notebook.

 

Download dataset from Kaggle

I am downloading the PUBG Finish Placement Prediction dataset from Kaggle. Refer to this post to download a Kaggle dataset.

 

Unzip the Zip file

The downloaded Kaggle dataset is a Zip file, so we have to unzip it before we can read the data.

$ unzip <file name>
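
If you would rather stay inside Python (for example, in a notebook cell), the same step can be sketched with the standard-library zipfile module. This is only a minimal sketch: the function name extract_zip and the example paths are hypothetical and not part of the original workflow.

```python
import zipfile
from pathlib import Path


def extract_zip(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    # Create the destination folder if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

For example, extract_zip('dataset.zip', 'data') would unpack a hypothetical dataset.zip into a data folder and return the list of extracted file names.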

 

Read Data from local Jupyter Notebook

After unzipping the file, we are ready to use our data in Jupyter Notebook. Open Jupyter Notebook on your system.

Note: If the data is too large to work with on the local system, we can create a smaller file from its first lines.

$ head -n <number_of_lines> ~/old_file_name > ~/new_file_name
$ head -n 20000 ~/train_V2.csv > ~/train_V4.csv
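
The head command above can also be approximated in Python, which may be handy on systems without it. The following is a minimal sketch using only the standard library; the function name copy_first_lines is hypothetical. Like head, it works line by line, so if the first line of the CSV is a header, 20000 lines means the header plus 19,999 data rows.

```python
from itertools import islice


def copy_first_lines(src_path, dst_path, n_lines):
    """Copy the first n_lines lines of src_path to dst_path, similar to `head -n`."""
    # newline="" keeps the original line endings untouched in both files.
    with open(src_path, "r", newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        # islice stops after n_lines lines, so the rest of the file is never read.
        dst.writelines(islice(src, n_lines))
```

A call such as copy_first_lines('train_V2.csv', 'train_V4.csv', 20000) would mirror the second command above, assuming both paths point to your home directory.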

 

Now, we are ready to load the data in Jupyter:

import pandas as pd

# General form: pd.read_csv('/file_path/file_name', engine='python')
data = pd.read_csv('/home/bond/train_V4.csv', engine='python')
data.head(10)

When we run this, we might get the error: Permission denied

 

 

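To get permission to read the file, run the following command in a terminal:

$ sudo chmod 600 <file path>

Mode 600 gives read and write access to the file's owner only, so this works when your user owns the file. If the file belongs to another user (for example, root), mode 644 makes it readable by all users instead.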
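
The same check and fix can be sketched from Python before calling read_csv. This is a minimal sketch using the standard-library os and stat modules; the helper names is_readable and grant_owner_read_write are hypothetical, and the chmod call only succeeds when the current user owns the file (it is not a replacement for sudo).

```python
import os
import stat


def is_readable(path):
    """Return True if the current user is allowed to read the file at path."""
    return os.access(path, os.R_OK)


def grant_owner_read_write(path):
    """Set mode 600 (owner read/write only), the Python equivalent of `chmod 600`."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

A typical use would be to call is_readable on the CSV path first and, if it returns False for a file you own, call grant_owner_read_write before reading the file.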

Now, run the code again.

Now, we are ready to play with our data.

That’s all for this post. I hope it was helpful. Cheers!

 
