This post briefly describes how to use Kaggle data in your local Jupyter Notebook.
Env details:
- Ubuntu
- Python 3.6.3
Steps
We need three steps for our task:
- Download file from Kaggle to your local box.
- Unzip the Zip file.
- Read the file from your Jupyter Notebook.
Download dataset from Kaggle
I am downloading the PUBG Finish Placement Prediction dataset from Kaggle. Refer to this post to download a Kaggle dataset.
Unzip the Zip file
The downloaded Kaggle dataset is a Zip archive, so we have to unzip it before we can read the data.
$ unzip <file name>
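If you would rather stay inside the notebook, the same step can be done from Python. Here is a minimal sketch using the standard-library zipfile module. The function name extract_archive and the placeholder paths in the usage comment are my own and only for illustration.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return their names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Hypothetical usage, with your own paths filled in:
# extract_archive("<file name>", "<target folder>")
```

The returned list of names is a quick way to see which CSV files the archive contained.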
Read Data from local Jupyter Notebook
After unzipping the file, we are ready to use our data in Jupyter Notebook. Open Jupyter Notebook on your system.
Note: If the data is too large to work with on the local system, we can create a smaller file from its first lines.
$ head -<number of lines> ~/old_file_name > ~/new_file_name
$ head -20000 ~/train_V2.csv > ~/train_V4.csv
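On a system without the head command, roughly the same truncation can be sketched in plain Python. The helper name write_head is hypothetical. Note that the header row counts as one of the lines, so 20000 lines means the header plus 19999 data rows.

```python
from itertools import islice


def write_head(src_path, dst_path, n_lines):
    """Copy the first n_lines lines of src_path to dst_path; return lines written."""
    written = 0
    # newline="" keeps the original line endings untouched
    with open(src_path, "r", encoding="utf-8", newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in islice(src, n_lines):
            dst.write(line)
            written += 1
    return written


# Hypothetical usage mirroring the shell command:
# write_head("/home/bond/train_V2.csv", "/home/bond/train_V4.csv", 20000)
```

The file is streamed line by line, so the full dataset never has to fit in memory.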
Now we are ready to load the data in Jupyter:
import pandas as pd
# General form: pd.read_csv('/file_path/file_name', engine='python')
data = pd.read_csv('/home/bond/train_V4.csv', engine='python')
data.head(10)
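As an alternative to cutting the file down first, pandas can limit the number of rows itself through the nrows parameter of read_csv. A small sketch follows. The helper name load_sample and its default of 20000 rows are my own assumptions, chosen to match the sample size used above.

```python
import pandas as pd


def load_sample(csv_path, n_rows=20000):
    """Read only the first n_rows data rows of a CSV (the header is not counted)."""
    return pd.read_csv(csv_path, nrows=n_rows)


# Hypothetical usage on the full file, assuming it sits in /home/bond:
# data = load_sample('/home/bond/train_V2.csv')
# data.head(10)
```

Unlike head, nrows counts data rows only, so the result has exactly n_rows rows when the file is long enough.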
When we run the read_csv call, we might get this error: Permission denied
To get read permission on the file, we have to run the following command in a terminal:
$ sudo chmod 600 <file path>
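To confirm that the change took effect before going back to the notebook, we can inspect the file from Python. This is only a sketch with the standard os and stat modules. The helper names file_mode and is_readable are hypothetical.

```python
import os
import stat


def file_mode(path):
    """Return the permission bits of path as an octal string such as '0o600'."""
    return oct(stat.S_IMODE(os.stat(path).st_mode))


def is_readable(path):
    """True if the current user is allowed to open path for reading."""
    return os.access(path, os.R_OK)


# Hypothetical usage:
# print(file_mode('/home/bond/train_V4.csv'), is_readable('/home/bond/train_V4.csv'))
```

Mode 600 gives read and write access to the file's owner only. If is_readable still returns False, the file probably belongs to a different user, and changing the owner may be needed as well.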
Now run the code again.
Now, we are ready to play with our data.
That’s all for this post, hope it was helpful. Cheers!