I was recently researching various text mining and language processing techniques to extract job skills from job postings and resume data. The input data is a free-text corpus, and the expected output is the set of desired skills for a given job profile.
I decided to document all my research as a paper with all the technical details that might be useful for someone researching a similar problem. So here it is –
Direct link to the paper: https://confusedcoders.com/wp-content/uploads/2019/09/Job-Skills-extraction-with-LSTM-and-Word-Embeddings-Nikita-Sharma.pdf
The results of the exercise were very promising, and I was able to extend the model to various job categories. The technique is also able to identify new and emerging skill sets rather than being limited to a known set of skills.
Sample of skills extracted from a Software Engineering Job post:
Same model extended to a Civil Engineering job post:
Hello,
Interesting article. I have a question about Section 5.3. You mention that “All the phrases classified as skill are then selected for noun keyword extraction.” Can you detail how you performed noun keyword extraction?
Thanks,
Mpalo
Since the classifier gets rid of the non-skill sentences, we are left with only the sentences that contain skills. I then extracted the nouns from each of those sentences and took them as the skills.
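The noun extraction step can be sketched roughly as below. This is only an illustration under a few assumptions, not the exact code from the paper: it assumes each skill sentence has already been part-of-speech tagged with Penn Treebank tags (for example by a tagger such as NLTK's `nltk.pos_tag`), and it groups consecutive nouns into one keyword, which is a choice made for this example. The function name `extract_noun_keywords` and the hand-tagged sample sentence are hypothetical.

```python
from typing import Iterable, List, Tuple

# Penn Treebank noun tags: singular, plural, proper singular, proper plural.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def extract_noun_keywords(tagged_tokens: Iterable[Tuple[str, str]]) -> List[str]:
    """Collect nouns from a tagged sentence, joining consecutive nouns.

    Joining adjacent nouns (e.g. "machine learning pipelines") is an
    assumption of this sketch, not something stated in the paper.
    """
    keywords: List[str] = []
    current: List[str] = []
    for token, tag in tagged_tokens:
        if tag in NOUN_TAGS:
            current.append(token)
        elif current:
            # A non-noun token ends the current run of nouns.
            keywords.append(" ".join(current))
            current = []
    if current:
        keywords.append(" ".join(current))
    return keywords


# Hypothetical sentence that the classifier labelled as "skill",
# tagged by hand here so the example has no external dependencies.
tagged_sentence = [
    ("Experience", "NN"), ("with", "IN"), ("Python", "NNP"), (",", ","),
    ("Spark", "NNP"), ("and", "CC"), ("machine", "NN"), ("learning", "NN"),
    ("pipelines", "NNS"), (".", "."),
]

print(extract_noun_keywords(tagged_sentence))
# ['Experience', 'Python', 'Spark', 'machine learning pipelines']
```

Generic nouns such as "Experience" also come through, so in practice a small stop list of non-skill nouns would likely be needed on top of this.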