In multi-label classification, each item in the dataset can have multiple target values. Instead of predicting a single class, we predict a probability for each class.
In this post, I will explain the multi-label text classification problem with fastai. I use the Toxic Comment Classification Challenge to show how fastai handles a multi-label problem.
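To make the difference from single-class prediction concrete, here is a small sketch in plain Python. The raw scores are made up for illustration. In the multi-label case each label gets its own sigmoid, so the probabilities are independent and do not have to sum to 1. In the single-label case a softmax makes the classes compete for the same probability mass.

```python
import math

def sigmoid(z):
    """Map one raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Map raw scores to probabilities that sum to 1 (single-label case)."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three labels of one comment.
logits = [2.0, 1.5, -3.0]

# Multi-label: every label is judged on its own.
multi_label_probs = [sigmoid(z) for z in logits]

# Single-label: the labels share one probability budget.
single_label_probs = softmax(logits)

print([round(p, 3) for p in multi_label_probs])
print(round(sum(multi_label_probs), 3), round(sum(single_label_probs), 3))
```

With these scores, two labels can both be likely at the same time in the multi-label output, which is exactly what we need when one comment is, say, both toxic and insulting.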
Let’s look at the data
Let’s start with an overview of the data and the data type of each feature, so we can see which features matter.
For this problem we have six label classes, one for each type of toxicity: toxic, severe_toxic, obscene, threat, insult and identity_hate.
We have to create a model that predicts the probability of each type of toxicity for each comment.
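To show what the targets look like, here is a minimal sketch that reads rows in the shape of the competition's train.csv and turns the six label columns into a multi-hot list per comment. The column names follow the competition data. The ids and comments in the sample are placeholders I made up, and `read_rows` is just a helper name for this sketch.

```python
import csv
import io

LABEL_COLS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Made-up rows in the shape of train.csv; ids and comments are placeholders.
SAMPLE_CSV = """id,comment_text,toxic,severe_toxic,obscene,threat,insult,identity_hate
a1,Thanks for the helpful edit.,0,0,0,0,0,0
b2,Placeholder for a rude comment.,1,0,1,0,1,0
c3,Placeholder for a threatening comment.,1,0,0,1,0,0
"""

def read_rows(handle):
    """Return (comment_text, multi-hot label list) pairs from a CSV handle."""
    rows = []
    for record in csv.DictReader(handle):
        labels = [int(record[col]) for col in LABEL_COLS]
        rows.append((record["comment_text"], labels))
    return rows

rows = read_rows(io.StringIO(SAMPLE_CSV))

# Count how often each label is set; one comment can count towards several labels.
label_counts = {
    col: sum(labels[i] for _, labels in rows)
    for i, col in enumerate(LABEL_COLS)
}

for text, labels in rows:
    print(labels, text)
print(label_counts)
```

To run this on the real file, pass `open("train.csv", newline="", encoding="utf-8")` to `read_rows` instead of the in-memory sample.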
Load and analyse data
fastai expects the data to be loaded as a DataBunch, which a fastai Learner can then use to train the model. Here, we first create a DataBunch from our training dataset.
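As a rough sketch of this step, assuming fastai v1 (the `fastai.text` module) and pandas DataFrames already loaded from the competition's train.csv and test.csv, the DataBunches could be built like this. The function name `build_data_bunches`, the `path` default and the train/validation split are my own assumptions for illustration, not necessarily what the original kernel does.

```python
LABEL_COLS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def build_data_bunches(train_df, valid_df, test_df, path="."):
    """Build language-model and classifier DataBunches (fastai v1 assumed)."""
    # Imported lazily so this sketch can be loaded without fastai installed.
    from fastai.text import TextLMDataBunch, TextClasDataBunch

    # Language-model bunch: uses only the comment text, the targets are ignored.
    data_lm = TextLMDataBunch.from_df(
        path, train_df, valid_df, text_cols="comment_text"
    )

    # Classifier bunch: reuses the same vocabulary and attaches all six label columns.
    data_clas = TextClasDataBunch.from_df(
        path,
        train_df,
        valid_df,
        test_df=test_df,
        vocab=data_lm.train_ds.vocab,
        text_cols="comment_text",
        label_cols=LABEL_COLS,
    )
    return data_lm, data_clas
```

Passing a list of label columns is what makes this a multi-label DataBunch. Sharing the vocabulary between the two bunches matters later, because the classifier reuses the encoder trained by the language model.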
Fit the deep learning model with domain-specific data
First we fit the model on the training text without the target values, so that it gets to know the language of our data better.
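A minimal sketch of this language-model step, again assuming fastai v1 with its pretrained `AWD_LSTM` architecture. The function name, the encoder name `ft_enc`, the number of epochs and the learning rates are illustrative guesses, not tuned values.

```python
def fine_tune_language_model(data_lm, encoder_name="ft_enc"):
    """Fine-tune a pretrained language model on the comment text (fastai v1 assumed)."""
    # Imported lazily so this sketch can be loaded without fastai installed.
    from fastai.text import language_model_learner, AWD_LSTM

    learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)

    # Train the last layer group while the pretrained layers stay frozen.
    learn.fit_one_cycle(1, 1e-2)

    # Then unfreeze and fine-tune every layer with a smaller learning rate.
    learn.unfreeze()
    learn.fit_one_cycle(1, 1e-3)

    # Save the encoder so the classifier can load it in the next step.
    learn.save_encoder(encoder_name)
    return learn
```

No target values are involved here: the language model only learns to predict the next word in the comments.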
Re-fit the model with classification labels
Here we re-fit the model with our target values and tune it for better accuracy.
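This step could look roughly like the sketch below, assuming fastai v1 and an encoder saved earlier under the name `ft_enc` in the same `path` as the classifier DataBunch. The gradual unfreezing schedule and the learning rates are only an example of a common pattern, and the function name is mine.

```python
def train_classifier(data_clas, encoder_name="ft_enc"):
    """Train the multi-label classifier on top of the fine-tuned encoder (fastai v1 assumed)."""
    # Imported lazily so this sketch can be loaded without fastai installed.
    from fastai.text import text_classifier_learner, AWD_LSTM

    learn = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)

    # Reuse what the language model learned about our comments.
    learn.load_encoder(encoder_name)

    # Train the classifier head first.
    learn.fit_one_cycle(1, 1e-2)

    # Unfreeze one more layer group and train with lower learning rates.
    learn.freeze_to(-2)
    learn.fit_one_cycle(1, slice(5e-3 / 2.0, 5e-3))

    # Finally fine-tune the whole model.
    learn.unfreeze()
    learn.fit_one_cycle(1, slice(2e-3 / 100, 2e-3))
    return learn
```

Unfreezing gradually keeps the pretrained layers from being overwritten too early while the new classifier head is still random.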
Let’s predict the target values and compare them with the original target values.
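One simple way to do this comparison is to threshold each predicted probability and check it against the true label, one label column at a time. The sketch below uses made-up probabilities and targets, and the 0.5 threshold is an assumption. The competition itself scores submissions with mean column-wise ROC AUC, so this is only a quick sanity check.

```python
LABEL_COLS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def per_label_accuracy(probs, targets, threshold=0.5):
    """Fraction of comments whose thresholded prediction matches the target, per label."""
    n = len(targets)
    accuracy = {}
    for j, col in enumerate(LABEL_COLS):
        hits = sum(int(p[j] >= threshold) == t[j] for p, t in zip(probs, targets))
        accuracy[col] = hits / n
    return accuracy

# Hypothetical predicted probabilities for three comments.
probs = [
    [0.91, 0.10, 0.75, 0.02, 0.66, 0.05],
    [0.08, 0.01, 0.03, 0.01, 0.04, 0.01],
    [0.40, 0.02, 0.10, 0.70, 0.20, 0.03],
]
# Hypothetical true labels for the same three comments.
targets = [
    [1, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
]

accuracy = per_label_accuracy(probs, targets)
for col, value in accuracy.items():
    print(f"{col}: {value:.2f}")
```

Here the third comment is toxic but only gets a probability of 0.40 for that label, so the toxic column is the one that shows a miss.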
Let’s get the predictions and create the submission file to submit to Kaggle.
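The submission file is a CSV with an `id` column followed by one probability column per label. Here is a minimal sketch of writing it with the standard library. The ids and probabilities are placeholders, and `write_submission` is a helper name I chose for this sketch. In the real workflow the probabilities would come from the trained model's predictions on the test set.

```python
import csv
import io

LABEL_COLS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def write_submission(handle, ids, probs):
    """Write one row per test comment: its id followed by six probabilities."""
    writer = csv.writer(handle)
    writer.writerow(["id"] + LABEL_COLS)
    for comment_id, row in zip(ids, probs):
        writer.writerow([comment_id] + [f"{p:.6f}" for p in row])

# Placeholder ids and predicted probabilities for two test comments.
ids = ["a1", "b2"]
probs = [
    [0.91, 0.10, 0.75, 0.02, 0.66, 0.05],
    [0.08, 0.01, 0.03, 0.01, 0.04, 0.01],
]

# Write to an in-memory buffer here; use a real file for the actual submission.
buffer = io.StringIO()
write_submission(buffer, ids, probs)
submission_text = buffer.getvalue()
print(submission_text)
```

For the actual file, replace the buffer with `open("submission.csv", "w", newline="")`.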
All the code
All the code for this task can be found here on Kaggle kernels:
Check out my portfolio here: https://confusedcoders.com/nikita-sharma-greenhorn-data-science-student
I am a greenhorn Data Science student with interest in finding patterns in data. My language of choice is Python and I am starting to get my hands dirty with R.
I blog on Medium.com and ConfusedCoders.com. I share my code on Github.com.