Islam and Data Science Roundup

  • In “Semantically Enhanced Concept Search of the Holy Quran: Qur’anic English WordNet” (Arabian Journal for Science and Engineering 44, no. 4 (2019)), Hammad Afzal and Mukhtar Tayyeba present a framework for concept- and keyword-based English search of the Qur’an. To handle abstract queries that do not occur verbatim in the text, the authors state that they build a Qur’anic English WordNet, a lexical database derived from English translations of the Qur’an that serves as a semantically rich resource for a Qur’an search system. They explain that the standard WordNet structure of semantically linked information is used as a baseline and augmented with a few newly created relations that capture Qur’an-specific information. Afzal and Tayyeba extract vocabulary from the English translations, arrange it in a conceptual hierarchy, and develop an enhanced search tool on top of it. The authors conclude that their semiautomatic approach to building the knowledge repository is scalable and easily extended to other religious texts, such as works of Islamic jurisprudence.
  • In “Text Categorisation in Quran and Hadith: Overcoming the Interrelation Challenges Using Machine Learning and Term Weighting” (Journal of King Saud University - Computer and Information Sciences (2019)), Nur Aqilah Pashkal Rostam and Nural Hashimah Ahamaed Hassain Malim propose a text categorization method that classifies texts into selected categories while accounting for the interrelation between the Qur’an and hadith. The authors simulate several interrelated cases by combining different Islamic resource datasets. On these datasets they compare three classification methods, Naive Bayes (NB), Support Vector Machine (SVM), and K-Nearest Neighbor (KNN), each used alone and with Term Frequency-Inverse Document Frequency (TF-IDF) term weighting. The authors conclude that SVM, whether used alone or with term weighting, successfully addresses the interrelationship in both single- and multi-label classification. They add that it improves accuracy by 10-20%, whereas the other methods show only slight improvements.
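
For readers unfamiliar with term weighting, the following is a minimal sketch of how TF-IDF weights can feed a K-Nearest Neighbor classifier, one of the three methods compared in the second paper. It is not the authors' implementation: the function names, the smoothed IDF formula, the value of k, and the toy training sentences (invented placeholders, not quotations from the Qur’an or hadith) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def fit_idf(token_lists):
    """Compute a smoothed inverse document frequency for every seen term."""
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per document
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Build a unit-length sparse TF-IDF vector; unseen terms are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_predict(query, train_vectors, train_labels, idf, k=3):
    """Return the majority label among the k most similar training texts."""
    q = tfidf_vector(tokenize(query), idf)
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(q, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus: invented sentences with invented category labels.
TRAIN = [
    ("establish prayer at dawn and at night", "prayer"),
    ("prayer is performed facing the qibla", "prayer"),
    ("give charity to the poor and the needy", "charity"),
    ("charity purifies wealth and helps the needy", "charity"),
    ("fasting is observed from dawn until sunset", "fasting"),
    ("break the fasting at sunset with dates", "fasting"),
]

token_lists = [tokenize(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(token_lists)
vectors = [tfidf_vector(tokens, idf) for tokens in token_lists]

print(knn_predict("charity for the poor", vectors, labels, idf))
print(knn_predict("fasting until sunset", vectors, labels, idf))
```

The weighting step is independent of the classifier, which is why the paper can report each method both alone and with term weighting. Swapping the nearest-neighbor vote for an SVM or Naive Bayes model would leave the TF-IDF vectors unchanged.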
