SinaTools: Open Source Toolkit for Arabic Natural Language Processing

Jan 1, 2024 · Tymaa Hammouda, Mustafa Jarrar, Mohammed A. Khalilia
Abstract
We introduce SinaTools, an open-source Python package for Arabic natural language processing and understanding. SinaTools is a unified package that can be integrated into existing system workflows. It offers solutions for tasks such as flat and nested Named Entity Recognition (NER), fully-fledged Word Sense Disambiguation (WSD), Semantic Relatedness, Synonymy Extraction and Evaluation, Lemmatization, Part-of-Speech (POS) Tagging, and Root Tagging, along with helper utilities for corpus processing, text stripping, and diacritic-aware word matching. This paper presents SinaTools and its benchmarking results, which show that SinaTools outperforms all similar tools on these tasks, including Flat NER (87.33%), Nested NER (89.42%), WSD (82.63%), Semantic Relatedness (0.49 Spearman rank correlation), Lemmatization (90.5%), and POS tagging (93.8%), among others. SinaTools can be downloaded from https://sina.birzeit.edu/sinatools.
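To make the helper utilities mentioned in the abstract more concrete, the following is a minimal sketch of what text stripping and diacritic-aware word matching can look like. It uses only the Python standard library and is not the SinaTools API: the names `strip_diacritics`, `split_letters`, and `compatible` are hypothetical, and the matching rule (an absent diacritic counts as unspecified) is a simplifying assumption made for illustration.

```python
import re

# Arabic diacritics: fathatan through sukun (U+064B-U+0652) plus superscript alef (U+0670).
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
TATWEEL = "\u0640"  # elongation character, carries no meaning


def strip_diacritics(text: str) -> str:
    """Remove diacritics and tatweel (hypothetical helper, not the SinaTools API)."""
    return DIACRITICS.sub("", text).replace(TATWEEL, "")


def split_letters(word: str) -> list[tuple[str, str]]:
    """Split a word into (base letter, attached diacritics) pairs."""
    pairs: list[tuple[str, str]] = []
    for ch in word.replace(TATWEEL, ""):
        if DIACRITICS.fullmatch(ch) and pairs:
            base, marks = pairs[-1]
            pairs[-1] = (base, marks + ch)
        else:
            pairs.append((ch, ""))
    return pairs


def compatible(word_a: str, word_b: str) -> bool:
    """True if base letters agree and no letter carries conflicting diacritics.

    Assumption: a letter without diacritics is unspecified and matches anything.
    """
    a, b = split_letters(word_a), split_letters(word_b)
    if len(a) != len(b):
        return False
    for (base_a, marks_a), (base_b, marks_b) in zip(a, b):
        if base_a != base_b:
            return False
        if marks_a and marks_b and set(marks_a) != set(marks_b):
            return False
    return True
```

Under this rule, an undiacritized form is compatible with any diacritization of the same letters, while two fully diacritized forms match only if their diacritics agree on every letter.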
Publication: Procedia Computer Science