Wrapping it all up – Deep Learning & Job Descriptions (Pt 5 of 5)

We achieved some interesting results and obtained a model with a versatile set of applications that are relevant to our mission, both within our platform and potentially as stand-alone services. We used a large amount of training data and, apart from tokenization and lowercasing, did not perform any preprocessing of the job descriptions.

Where to From Here?

It’s quite impressive to see what results we obtained from such raw text. However, as is usually the case, there are lots of things to experiment with and improve in future versions.

The current CNN is rather shallow and operates on the word level. Although we noticed that it is able to deal with common spelling mistakes, better results might be obtained by directly working with characters (Zhang et al.) or techniques similar to word hashing using 3-grams (Huang et al.). We used Word2Vec to initialize the word lookup table in the CNN. This approach does not deal with polysemy, i.e. the same word having different meanings. In addition, our approach did not use the actual words or characters in the job title labels of our data set.
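To make the word hashing idea a bit more concrete, here is a minimal sketch of letter 3-gram hashing in the spirit of Huang et al. It is not part of our model: the function names and the '#' boundary marker are assumptions chosen for illustration, and the overlap score is only a quick way to compare two words, not something taken from the paper.

```python
from collections import Counter


def letter_trigrams(word):
    """Return the letter 3-grams of a word, using '#' as a word boundary marker."""
    marked = "#" + word.lower() + "#"
    return [marked[i:i + 3] for i in range(len(marked) - 2)]


def trigram_bag(text):
    """Represent a whitespace-tokenized text as a bag (multiset) of letter 3-grams."""
    bag = Counter()
    for word in text.split():
        bag.update(letter_trigrams(word))
    return bag


def overlap(a, b):
    """Jaccard-style overlap between two trigram bags (0 = disjoint, 1 = identical)."""
    shared = sum((a & b).values())
    total = sum((a | b).values())
    return shared / total if total else 0.0


if __name__ == "__main__":
    print(letter_trigrams("good"))
    # A misspelled word still shares several 3-grams with the correct spelling.
    print(overlap(trigram_bag("manager"), trigram_bag("manger")))
    print(overlap(trigram_bag("manager"), trigram_bag("nurse")))
```

With a representation like this, a misspelling such as "manger" still shares several 3-grams with "manager", whereas a word-level lookup table would treat it as a different or unknown token.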

We mentioned that the use of CNNs for NLP applications is open for discussion. It is of course worth investigating other approaches that specifically deal with sequences, such as LSTMs. On the other hand, the latter are usually relatively shallow, and recent research has shown that deeper networks of other types might be advantageous.

The data set of 10 million vacancies that we used is fairly large, but we now have data sets at our disposal that are several times larger. Applying some of the techniques from the previous paragraphs to these data sets sounds very exciting!

References

Collobert Ronan, Weston Jason, Bottou Léon, Karlen Michael, Kavukcuoglu Koray, Kuksa Pavel. Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12, 2493–2537, 2011.

Denil Misha, Demiraj Alban, Kalchbrenner Nal, Blunsom Phil, de Freitas Nando. Modelling, visualising and summarising documents with a single convolutional neural network. University of Oxford, arXiv:1406.3830, 2014.

Huang Po-Sen, He Xiaodong, Gao Jianfeng, Deng Li, Acero Alex, Heck Larry. Learning deep structured semantic models for web search using clickthrough data. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management (CIKM ’13). San Francisco, California, USA, 2333–2338, 2013.

Kalchbrenner Nal, Grefenstette Edward, Blunsom Phil. A convolutional neural network for modelling sentences. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL ’14). Baltimore, Maryland, USA, 655–665, 2014.

Lebret Remi, Collobert Ronan. Word embeddings through Hellinger PCA. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Gothenburg, Sweden, 482–490, 2014.

Mikolov Tomas, Sutskever Ilya, Chen Kai, Corrado Greg, Dean Jeffrey. Distributed representations of words and phrases and their compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems (NIPS ’13). Lake Tahoe, Nevada, USA, 3111–3119, 2013.

Mikolov Tomas, Yih Scott Wen-tau, Zweig Geoffrey. Linguistic regularities in continuous space word representations. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT ’13). Atlanta, Georgia, 746–751, 2013.

Weston Jason, Chopra Sumit, Adams Keith. #TagSpace: Semantic Embeddings from Hashtags. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP ’14). Doha, Qatar, 1822–1827, 2014.

Zhang Xiang, Zhao Junbo, LeCun Yann. Character-level convolutional networks for text classification. In Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS ’15). Montreal, Canada, 649–657, 2015.

 

This article is part of a series on Using Deep Learning to Extract Knowledge from Job Descriptions. For more information, head to Using Deep Learning to Extract Knowledge from Job Descriptions.


Jan Luts is a senior data scientist at Search Party. He received a Master of Information Sciences from Universiteit Hasselt, Belgium, in 2003. He also received Master degrees in Bioinformatics and Statistics from Katholieke Universiteit Leuven, Belgium, in 2004 and 2005, respectively. After obtaining his PhD at the Department of Electrical Engineering (ESAT) of Katholieke Universiteit Leuven in 2010, he worked in postdoctoral research for a further two years at the institution. In 2012 Jan moved to Australia where he worked as a postdoctoral researcher in Statistics at the School of Mathematical Sciences in the University of Technology, Sydney. In 2013 he moved into the private sector as Data Scientist at Search Party.
