Paper-Conference

WojoodOntology: Ontology-Driven LLM Prompting for Unified Information Extraction Tasks

Information Extraction tasks such as Named Entity Recognition and Relation Extraction are often developed using diverse tagsets and annotation guidelines. This presents major …

alaa-aljabari

Active Learning for Multidialectal Arabic POS Tagging

Multidialectal Arabic POS tagging is challenging due to the morphological richness and high variability among dialects. While POS tagging for MSA has advanced thanks to the …

diyam-akra

$\mathrm{Wojood}^{Relations}$: Arabic Relation Extraction Corpus and Modeling

Relation extraction (RE) is a core task in natural language processing, crucial for semantic understanding, knowledge graph construction, and enhancing downstream applications. …

alaa-aljabari

Konooz: Multi-domain Multi-dialect Corpus for Named Entity Recognition

We introduce Konooz, a novel multi-dimensional corpus covering 16 Arabic dialects across 10 domains, resulting in 160 distinct corpora. The corpus comprises about 777k tokens, carefully …

nagham-hamad

WojoodNER 2024: The Second Arabic Named Entity Recognition Shared Task

We present WojoodNER-2024, the second Arabic Named Entity Recognition (NER) Shared Task. In WojoodNER-2024, we focus on fine-grained Arabic NER. We provided participants with a new …

mustafa-jarrar

AraFinNLP 2024: The First Arabic Financial NLP Shared Task

The expanding financial markets of the Arab world require sophisticated Arabic NLP tools. To address this need within the banking domain, the Arabic Financial NLP (AraFinNLP) …

sanad-malaysha

ArabicNLU 2024: The First Arabic Natural Language Understanding Shared Task

This paper presents an overview of the Arabic Natural Language Understanding (ArabicNLU 2024) shared task, focusing on two subtasks: Word Sense Disambiguation (WSD) and Location …

mohammed-khalilia

SinaTools: Open Source Toolkit for Arabic Natural Language Processing

We introduce SinaTools, an open-source Python package for Arabic natural language processing and understanding. SinaTools is a unified package allowing people to integrate it into …

tymaa-hammouda

NLU-STR at SemEval-2024 Task 1: Generative-based Augmentation and Encoder-based Scoring for Semantic Textual Relatedness

Semantic textual relatedness is a broader concept of semantic similarity. It measures the extent to which two chunks of text convey similar meaning or topics, or share related …

sanad-malaysha

Event-Arguments Extraction Corpus and Modeling using BERT for Arabic

Event-argument extraction is a challenging task, particularly in Arabic due to sparse linguistic resources. To fill this gap, we introduce the corpus (550k tokens) as an extension …

alaa-aljabari

WojoodNER 2023: The First Arabic Named Entity Recognition Shared Task

We present WojoodNER-2023, the first Arabic Named Entity Recognition (NER) Shared Task. The primary focus of WojoodNER 2023 is on Arabic NER, offering novel NER datasets (i.e., …

mustafa-jarrar

SALMA: Arabic Sense-Annotated Corpus and WSD Benchmarks

SALMA, the first Arabic sense-annotated corpus, consists of 34K tokens, which are all sense-annotated. The corpus is annotated using two different sense inventories simultaneously …

mustafa-jarrar

ArBanking77: Intent Detection Neural Model and a New Dataset in Modern and Dialectical Arabic

This paper presents the ArBanking77, a large Arabic dataset for intent detection in the banking domain. Our dataset was arabized and localized from the original English Banking77 …

mustafa-jarrar

Arabic Fine-Grained Entity Recognition

Traditional NER systems are typically trained to recognize coarse-grained entities, and less attention is given to classifying entities into a hierarchy of fine-grained lower-level …

haneen-liqreina

Offensive Hebrew Corpus and Detection using BERT

Offensive language detection has been well studied in many languages, but it is lagging behind in low-resource languages, such as Hebrew. In this paper, we present a new offensive …

nagham-hamad

Context-Gloss Augmentation for Improving Arabic Target Sense Verification

The Arabic language lacks semantic datasets and sense inventories. The most common semantically-labeled dataset for Arabic is ArabGlossBERT, a relatively small dataset that …

sanad-malaysha

Wojood: Nested Arabic Named Entity Corpus and Recognition using BERT

This paper presents Wojood, a corpus for Arabic nested Named Entity Recognition (NER). Nested entities occur when one entity mention is embedded inside another entity mention. …

mustafa-jarrar

Joint Entity Extraction and Assertion Detection for Clinical Text

Negative medical findings are prevalent in clinical reports, yet discriminating them from positive findings remains a challenging task for information extraction. Most of the …

parminder-bhatia

Comprehend Medical: A Named Entity Recognition and Relationship Extraction Web Service

Comprehend Medical is a stateless and Health Insurance Portability and Accountability Act (HIPAA) eligible Named Entity Recognition (NER) and Relationship Extraction (RE) service …

parminder-bhatia

Improving Hospital Mortality Prediction with Medical Named Entities and Multimodal Learning

Clinical text provides essential information to estimate the acuity of a patient during hospital stays in addition to structured clinical data. In this study, we explore how …

mengqi-jin

Identifying Patients at Risk of High Healthcare Utilization

Clinical predictive modeling involves two challenging tasks, model development and model deployment. In this paper we demonstrate a software architecture for developing and …

lincoln-sheets

Cloud-based Predictive Modeling System and its Application to Asthma Readmission Prediction

The predictive modeling process is time consuming and requires clinical researchers to handle complex electronic health record (EHR) data in restricted computational environments. …

robert-chen

Clinical predictive modeling development and deployment through FHIR web services

Clinical predictive modeling involves two challenging tasks, model development and model deployment. In this paper we demonstrate a software architecture for developing and …

mohammed-khalilia

Topology Preservation in Fuzzy Self-Organizing Maps

One of the important properties of SOM is its topology preservation of the input data. The topographic error is one of the techniques proposed to measure how well the continuity of …

mohammed-khalilia

Improvements to the relational fuzzy c-means clustering algorithm

Relational fuzzy c-means (RFCM) is an algorithm for clustering objects represented by pairwise dissimilarity values in a dissimilarity data matrix D. RFCM is dual to the fuzzy …

mohammed-khalilia

Fuzzy relational self-organizing maps

In this paper we propose a novel fuzzy relational self-organizing map algorithm (FRSOM) that can be used to map a set of n objects described by pairwise dissimilarity values to a …

mohammed-khalilia

Improving disease prediction using ICD-9 ontological features

Disease prediction has become important in a variety of applications such as health insurance, tailored health communication and public health. Disease prediction is usually …

mihail-popescu