
Natural Language Processing concepts and methods revisited

Editor IJSMI


The paper starts with the history of Natural Language Processing (NLP) and revisits the concepts and methods involved in NLP. It provides an overview of different classifiers and language modelling techniques. The paper also lists the fields where NLP is used and the software available to carry out NLP.


Natural Language Processing; Machine Learning; Text Classification; Language Modelling





DOI: http://dx.doi.org/10.3000/ijsmi.v4i1.8

DOI (PDF): http://dx.doi.org/10.3000/ijsmi.v4i1.8.g23

This work is licensed under a Creative Commons Attribution 4.0 International License.