Arabic NLP
Arabic Fine-Grained Entity Recognition
Traditional NER systems are typically trained to recognize coarse-grained entities, and less attention is given to classifying entities …
Haneen Liqreina, Mustafa Jarrar, Mohammed Khalilia, Ahmed El-Shangiti, Muhammad Abdul-Mageed
ArBanking77: Intent Detection Neural Model and a New Dataset in Modern and Dialectical Arabic
This paper presents ArBanking77, a large Arabic dataset for intent detection in the banking domain. Our dataset was arabized and …
Mustafa Jarrar, Ahmet Birim, Mohammed Khalilia, Mustafa Erden, Sana Ghanem
SALMA: Arabic Sense-Annotated Corpus and WSD Benchmarks
SALMA, the first Arabic sense-annotated corpus, consists of 34K tokens, which are all sense-annotated. The corpus is annotated using …
Mustafa Jarrar, Sanad Malaysha, Tymaa Hammouda, Mohammed Khalilia
WojoodNER 2023: The First Arabic Named Entity Recognition Shared Task
We present WojoodNER-2023, the first Arabic Named Entity Recognition (NER) Shared Task. The primary focus of WojoodNER 2023 is on …
Mustafa Jarrar, Muhammad Abdul-Mageed, Mohammed Khalilia, Bashar Talafha, AbdelRahim Elmadany, Nagham Hamad, Alaa' Omar
Offensive Hebrew Corpus and Detection using BERT
Offensive language detection has been well studied in many languages, but it is lagging behind in low-resource languages, such as …
Nagham Hamad, Mustafa Jarrar, Mohammed Khalilia, Nadim Nashif
Context-Gloss Augmentation for Improving Arabic Target Sense Verification
The Arabic language lacks semantic datasets and sense inventories. The most common semantically-labeled dataset for Arabic is the …
Sanad Malaysha, Mustafa Jarrar, Mohammed Khalilia
Wojood - Arabic NER
Wojood consists of about 550K tokens (MSA and dialect) manually annotated with 21 entity types (e.g., person, organization, location, event, and date). It covers multiple domains and is annotated with nested entities. The corpus contains about 75K entities, 22.5% of which are nested. A BERT-based nested named entity recognition (NER) model was trained on the corpus (F1-score 88.4%).
Wojood: Nested Arabic Named Entity Corpus and Recognition using BERT
This paper presents Wojood, a corpus for Arabic nested Named Entity Recognition (NER). Nested entities occur when one entity mention is …
Mustafa Jarrar, Mohammed Khalilia, Sana Ghanem