By Hussain Awan*
Editing nearly 350 canons over the course of the semester, for a net edit count of 471, has deepened my insight into the tagging process, especially when it comes to efficient error detection. I have also come to better understand the nuanced ways in which primary source texts can be useful.
Below, I’ve highlighted two major takeaways from my experience:
- The Critical Role of Source Documents; Issues in the Absence of Secondary Sources
For some of the canon sets assigned to me, there were no supplementary secondary sources to aid interpretation, which made their meaning hard to parse. The problem was compounded by the fact that some of these canons were highly pithy, leaving me few words to work from when trying to understand a canon’s application. This experience underscored the importance of consulting primary sources, as the concise phrasing in the canons themselves often proved insufficient for the task of tagging.
- Improved Efficiency in Recognizing and Correcting AI-Suggested Mistakes
After going through several hundred more canons than I had covered at the time of my midterm report, I became more attuned to the types of mistakes ChatGPT tends to make. This familiarity enabled me to detect errors more swiftly and made it easier to override its initial categorizations.
Refinement of AI-Assisted Categorization Skills
With experience, I became faster at identifying ChatGPT’s common categorization mistakes. This familiarity allowed me to better anticipate where and why the AI might misinterpret canons.
The following table outlines common mistake types and my strategies for their resolution:
Common Mistakes Made by ChatGPT
Mistake Type | Description | Example Canon | Initial Categorization | Correct Categorization | Frequency |
Overuse of Broad Categories (like the “Rules of Interpretation” category) | Tendency to use broad categories, like Sales/Contracts, Rules of Interpretation, or Judicial Procedure, when they were inapt | Canon #6455: “Interpretation and Obligation”[1] | Judicial Procedure; Evidence | Laws of Interpretation & Obligation | Relatively high |
Misinterpretation of Phrasing | Misreading of terms in canon titles or phrases, leading to inaccurate categorization | Canon #3042: “Mention of Correlation”[2] | Family Law | Laws of Interpretation & Obligation → Rules of Interpretation | Very low |
Incorrect Subfield Selection | Assigning non-approved subfields, such as “Waqf,” or conflating similar but distinct subfields | Canon #2175: “Property Dedication for Allah”[3] | Waqf (not on the subfield list) | Property and Real Estate | Relatively high |
Overly Specific Example-Based Categorization | Latching onto the specific examples in the secondary source while ignoring the wider context of the canon | Canon #2194: “Reverting to Original Obligation upon Access” (“According to the Hanafis, if one gains access to the original…”)[4] | Ritual Law | Laws of Interpretation & Obligation → Rules of Interpretation | Medium |
Mistake Descriptions
- Overuse of Broad Categories (like the Rules of Interpretation category): ChatGPT frequently defaulted to broader categories when a nuanced categorization was more apt.
- For example, for canon #6455, “Interpretation and Obligation,” ChatGPT initially placed it under Judicial Procedure; Evidence due to its procedural tone. However, upon closer review, the canon dealt with Rules of Interpretation under Laws of Interpretation & Obligation, addressing interpretive principles rather than strictly judicial evidentiary procedures.
- Misinterpretation of Phrasing: Some mistranslations occurred where ChatGPT misread Arabic phrases, affecting categorization.
- In canon #3042, “Mention of Correlation,” ChatGPT initially interpreted the term “القِران” as referring to the Qur’ān when it actually referred to “correlation” or “conjunction”; this mistake led it to suggest Family Law as the appropriate category. However, this canon in fact focused on interpretative consistency, meaning Rules of Interpretation under Laws of Interpretation & Obligation was its correct categorization.
- Incorrect Subfield Selection: ChatGPT sometimes suggested non-approved subfields or conflated related ones. ChatGPT incorrectly tagged canon #2175, “Property Dedication for Allah,” as Waqf despite the absence of this subfield in the approved list. I ultimately classified the canon under Property and Real Estate more generally.
- Overly Specific Example-Based Categorization: On occasion, ChatGPT would latch onto the specific examples mentioned in the shamela.ws secondary source I would give it. It would thus ignore the wider context of the canon and categorize it according to a particularized application (often in Ritual Law).
Mistake Analysis and Resolution Strategies
Mistake Type | Frequency | Resolution Strategy |
Overuse of Broad Categories (like the Rules of Interpretation category) | High | Manual review of ChatGPT’s mistake |
Misinterpretation of Phrasing | Very low | Verifying translations with context, especially via the use of shamela.ws |
Incorrect Subfield Selection | High | Reminding the AI to strictly adhere to the approved list |
Overly Specific Example-Based Categorization | Medium | Manual review of ChatGPT’s mistake |
Through these experiences, I developed an approach that involved more methodical verification, especially when ChatGPT’s categorizations seemed overly broad or contextually misaligned. By observing patterns in these mistakes, I became better equipped to preemptively identify and address inaccuracies.
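One of the resolution strategies above, holding the AI to the approved list, could also be checked mechanically before any manual review. The following is only a minimal sketch under stated assumptions: the names `APPROVED_CATEGORIES` and `check_suggestion` are hypothetical, and the list is assembled from the category names that appear in this report's data table, not from the project's actual approved list.

```python
# Hypothetical approved list, assembled from the category names that appear
# in this report; the project's real approved list may differ.
APPROVED_CATEGORIES = {
    "Laws of Interpretation & Obligation",
    "Judicial Procedure",
    "Sales and Contracts",
    "Ritual Law",
    "Public & International Law",
    "Torts",
    "Family Law",
    "Property",
    "Criminal Law",
    "Endowments; Gifts",
    "Slave Law",
}


def check_suggestion(suggested: str) -> tuple[bool, str]:
    """Return (is_approved, message) for one AI-suggested category."""
    cleaned = suggested.strip()  # tolerate stray whitespace in the suggestion
    if cleaned in APPROVED_CATEGORIES:
        return True, f"'{cleaned}' is on the approved list."
    return False, f"'{cleaned}' is not on the approved list; flag for manual review."


if __name__ == "__main__":
    # "Waqf" is the non-approved suggestion discussed for canon #2175.
    for suggestion in ["Waqf", "Ritual Law"]:
        approved, message = check_suggestion(suggestion)
        print(approved, message)
```

A check like this only catches tags that fall outside the list. Tags that are on the list but inapt, such as an overly broad category, would still require the manual review described above.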
Data Analysis
Below is a breakdown of categories and subcategories tagged across these several hundred canons, as prepared by ChatGPT:
Category | Subcategory | Count |
Laws of Interpretation & Obligation | Rules of Interpretation | 120 |
Judicial Procedure | Evidence | 98 |
Sales and Contracts | Contracts | 85 |
Ritual Law | Ṣalāt – Prayer | 63 |
Public & International Law | – | 55 |
Torts | Damages; Remedies | 47 |
Family Law | Marriage | 38 |
Property | Real Estate; Real Property | 32 |
Criminal Law | Homicide; Retribution | 25 |
Endowments; Gifts | Charitable Giving | 18 |
Slave Law | – | 15 |
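To read the counts above as proportions, the table can be tallied in a few lines. This is a rough sketch, not the method used to prepare the table: the variable and function names (`tag_counts`, `category_shares`) are hypothetical, and the figures are simply copied from the Count column.

```python
# Counts copied from the table above, keyed by (category, subcategory).
# A subcategory of "–" marks rows where the table lists none.
tag_counts = {
    ("Laws of Interpretation & Obligation", "Rules of Interpretation"): 120,
    ("Judicial Procedure", "Evidence"): 98,
    ("Sales and Contracts", "Contracts"): 85,
    ("Ritual Law", "Ṣalāt – Prayer"): 63,
    ("Public & International Law", "–"): 55,
    ("Torts", "Damages; Remedies"): 47,
    ("Family Law", "Marriage"): 38,
    ("Property", "Real Estate; Real Property"): 32,
    ("Criminal Law", "Homicide; Retribution"): 25,
    ("Endowments; Gifts", "Charitable Giving"): 18,
    ("Slave Law", "–"): 15,
}


def category_shares(counts: dict[tuple[str, str], int]) -> dict[str, float]:
    """Return each category's share of all listed tags, as a percentage."""
    total = sum(counts.values())
    return {category: 100 * n / total for (category, _sub), n in counts.items()}


if __name__ == "__main__":
    total_tags = sum(tag_counts.values())
    print(f"Total tags listed: {total_tags}")
    for category, pct in category_shares(tag_counts).items():
        print(f"{category}: {pct:.1f}%")
```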
Conclusion
In summary, tagging these additional sets of canons has reinforced my belief in the need for rigorous source consultation and attentive error correction, especially in instances where the brevity of a canon’s phrasing could obscure its proper categorization. ChatGPT was invaluable in suggesting preliminary tags, but with careful review, I was able to refine many of its initial guesses. This combined approach—AI assistance, human verification, and comprehensive document review—has allowed me to develop a more consistent method for the precise categorization of canons.
Notes:
* Hussain Awan is a 3L and Teaching Fellow at Harvard Law School. He is a contributor to Harvard Law School’s Global Anticorruption Blog, a Harvard Law School Chayes Fellow, and a former clerk to Justice Syed Mansoor Ali Shah at the Supreme Court of Pakistan. Hussain attended McGill University in Montreal, where he studied International Development and French and graduated as class valedictorian.
[1] SHARIAsource CnC Database Canon No. 6455 (citing Muḥammad Ṣidqī Būrnū, Mawsūʿat al-qawāʿid al-fiqhiyya (3d ed., 2015), 8:176 [hereinafter Būrnū]): the preponderance of assumption is treated as certainty in legal rulings (غلبة الظن تنزل منزلة اليقين في الأحكام; ghalabat al-ẓann tunzal manzilat al-yaqīn fī al-aḥkām).
[2] Canon No. 3042: retaliation is a punishment that the prosecution does not execute (al-qaṣāṣ ʿuqūba lā tajrī al-niyāba fī īfāʾihā; القصاص عقوبة لا تجري النيابة في إيفائها).
[3] Canon No. 2175 (citing Būrnū, 1:417): the presumption is that every contract is valid at its occurrence, and if not then it is invalid (الأصل أن كل عقد له مجيز حال وقوعه توقف للإجازة وإلا فلا; al-aṣl anna kull ʿaqd lahu mujīz ḥāl wuqūʿih tawaqquf lil-ijāza wa-illā fa-lā).
[4] Canon No. 2194 (citing Būrnū, 1:158): the presumption is that intention, when isolated from action, has no effect in worldly affairs (الأصل أن النية إذا تجردت عن العمل لا تكون مؤثرة، – في الأمور الدنيوية; al-aṣl anna al-niyya idhā tajarradat ʿan al-ʿamal lā takūn muʾaththira, – fī al-umūr al-dunyawiya).
(Suggested Bluebook citation: Hussain Awan, Data Collection Report, Islamic Law Blog (Feb. 28, 2025), https://islamiclaw.blog/2025/02/28/data-collection-report-3/)
(Suggested Chicago citation: Hussain Awan, “Data Collection Report,” Islamic Law Blog, February 28, 2025, https://islamiclaw.blog/2025/02/28/data-collection-report-3/)