
Part of Speech Tagging for Pashto: POS Tagging for Pashto

Price: 70,53 (in stock)
Free delivery to parcel lockers within 16-20 business days for orders from 19,00

Book description

This book presents the first rule-based part-of-speech tagger for the Pashto language. Part-of-speech tagging plays a vital role in natural language processing and is a significant prerequisite for putting a human language on the engineering track. Developing a part-of-speech tagger first requires a tagset for the language, so a tagset of 54 tags is created for Pashto according to its syntactic properties. A simple architecture is then proposed for the Pashto part-of-speech tagger: a tokenizer, a lexicon, and rules for disambiguation and for handling new words. The lexicon stores words with their tags and grows with each new word as more text is tagged. The architecture is implemented and tested on real-world data. Accuracy was low at first because the lexicon and the rule set were very limited. Text is tagged with the tagger and the tags of new words are corrected manually, which grows both the lexicon and the rules. When the lexicon reached 100,000 words and the rules grew to 120, accuracy reached 88%. Accuracy is expected to increase further as more words and rules are added.
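The architecture described above (tokenizer, lexicon, disambiguation rules, new-word rules, and a lexicon that grows through manual correction) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tag names, the toy English lexicon entries, the two rules, and the function names are hypothetical and are not taken from the book, whose tagset has 54 Pashto-specific tags and whose rule set grew to 120 rules.

```python
import re

# Hypothetical toy lexicon: word -> set of candidate tags.
# Tag names are illustrative, not the book's 54-tag Pashto tagset.
LEXICON = {
    "the": {"DET"},
    "can": {"NOUN", "VERB"},
    "rusts": {"VERB"},
}


def tokenize(text):
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def disambiguate(candidates, prev_tag):
    """Pick one tag for a word that has several lexicon entries."""
    # Illustrative rule: after a determiner, prefer the noun reading.
    if prev_tag == "DET" and "NOUN" in candidates:
        return "NOUN"
    # Deterministic fallback when no rule applies.
    return sorted(candidates)[0]


def guess_unknown(token):
    """Assign a tag to a token that is missing from the lexicon."""
    if re.fullmatch(r"[^\w\s]", token):
        return "PUNCT"
    if token.isdigit():
        return "NUM"
    # Assumed default for unseen words; a real rule set would be richer.
    return "NOUN"


def tag(text):
    """Tag each token using the lexicon first, then the rules."""
    tagged = []
    prev_tag = None
    for token in tokenize(text):
        candidates = LEXICON.get(token.lower())
        if candidates is None:
            chosen = guess_unknown(token)
        elif len(candidates) == 1:
            chosen = next(iter(candidates))
        else:
            chosen = disambiguate(candidates, prev_tag)
        tagged.append((token, chosen))
        prev_tag = chosen
    return tagged


def add_to_lexicon(word, correct_tag):
    """Record a manually corrected tag so the lexicon grows over time."""
    LEXICON.setdefault(word.lower(), set()).add(correct_tag)
```

Calling `tag("The can rusts.")` exercises all three paths: a direct lexicon hit, a rule-based choice between two candidate tags, and a fallback for a token the lexicon does not contain. Calling `add_to_lexicon` after a manual correction mirrors the feedback loop in the description, where each corrected new word enlarges the lexicon.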

Information

Author: Ihsan Rabbi
Publisher: LAP LAMBERT Academic Publishing
Year of publication: 2012
Number of pages: 104
ISBN-10: 3847324977
ISBN-13: 9783847324973
Format: Paperback
Language: English
Genre: Email: consumer / user guides
