A Survey of Part of Speech Tagging of Latin and non-Latin Script Languages: A more vivid view on Persian

  • Meisam Moghadam, Fasa University
  • Niloufar Jafarpour, Technische Universität Darmstadt, Germany
Keywords: Part of Speech Tagging, Latin Script Language, Non-Latin Script Language, RTL System, Persian Language

Abstract

This survey gives a general overview of part-of-speech (POS) tagging for Latin script languages and focuses specifically on non-Latin script languages, especially Persian, among the 23 languages with the most native speakers in the world. Some of these languages, such as Arabic, Urdu and Persian, use a right-to-left (RTL) writing system, which raises its own specific issues in POS tagging. The paper also reviews the issues and challenges that occur during tokenization and POS tagging of these languages. These challenges may be shared across languages or specific to one. Persian is the main interest of this paper: it critically reviews the studies on Persian POS tagging to date and enumerates the specific challenges that arise in them.
Published
2021-02-28
How to Cite
Moghadam, M., & Jafarpour, N. (2021). A Survey of Part of Speech Tagging of Latin and non-Latin Script Languages: A more vivid view on Persian. LANGUAGE ART, 6(1). Retrieved from https://www.languageart.ir/index.php/LA/article/view/180
Section
Article