Eniiyanu committed on
Commit 23981bc · verified · 1 Parent(s): 2d58264

Upload 20 files

Files changed (3)
  1. QUICK_START.md +173 -0
  2. rag_pipeline.py +123 -20
  3. tax_calculator.py +357 -0
QUICK_START.md ADDED
@@ -0,0 +1,173 @@
+ # Quick Start: Enhanced RAG Responses
+
+ ## TL;DR
+
+ Your RAG responses are now **more explanatory and context-aware**. Test it:
+
+ ```bash
+ python test_enhanced_responses.py
+ ```
+
+ ---
+
+ ## What Changed?
+
+ ### Before
+ ```
+ ❌ Generic bullet points
+ ❌ Intimidating legal language
+ ❌ No practical guidance
+ ❌ Inline citations disrupt flow
+ ```
+
+ ### After
+ ```
+ ✅ Structured explanations (Simple → Detailed → Example → Takeaways)
+ ✅ Persona-aware (Student, Business, Employee contexts)
+ ✅ Conversational, supportive tone
+ ✅ Citations at end, not inline
+ ```
+
+ ---
+
+ ## Test It Now
+
+ ### Option 1: Quick Test
+ ```bash
+ python test_enhanced_responses.py
+ ```
+ Tests the student question you mentioned.
+
+ ### Option 2: Test All Personas
+ ```bash
+ python test_enhanced_responses.py --all
+ ```
+ Tests student, business, employee, and general questions.
+
+ ### Option 3: Custom Question
+ ```bash
+ python rag_pipeline.py --source data --question "Your question here"
+ ```
+
+ ---
+
+ ## Example Output
+
+ **Question:** "As a student, what do I need to know about the new tax law?"
+
+ **You'll now get:**
+
+ ```
+ **Simple Answer:**
+ The new tax law creates a Student Education Loan Fund starting in 2030.
+
+ **What It Means for You:**
+ [Student-focused explanation]
+
+ **How It Works:**
+ [Clear breakdown]
+
+ **Timeline:**
+ [When it matters]
+
+ **Practical Example:**
+ [Real scenario]
+
+ **Key Takeaways:**
+ ✅ [Actionable point 1]
+ ✅ [Actionable point 2]
+ ✅ [Actionable point 3]
+
+ (Source citations)
+ ```
+
+ ---
+
+ ## Files You Got
+
+ 1. **`persona_prompts.py`** - Persona detection logic
+ 2. **`test_enhanced_responses.py`** - Test script
+ 3. **`ENHANCEMENT_SUMMARY.md`** - Overview (read this)
+ 4. **`BEFORE_AFTER_COMPARISON.md`** - Visual examples
+ 5. **`RESPONSE_ENHANCEMENT_GUIDE.md`** - Full technical docs
+ 6. **`QUICK_START.md`** - This file
+
+ ---
+
+ ## How It Works
+
+ ```
+ User Question
+
+ Detect Persona (student/business/employee/general)
+
+ Adapt System Prompt
+
+ Retrieve Relevant Documents
+
+ Generate Structured Response
+
+ Enhanced Answer
+ ```
+
+ ---
+
+ ## Novel Approaches Used
+
+ 1. **Contextual Layering** - Progressive information disclosure
+ 2. **Persona Detection** - Auto-adapt to user context
+ 3. **Narrative Structure** - Story-based explanations
+ 4. **Citation Optimization** - End of sections, not inline
+
+ ---
+
+ ## Configuration
+
+ ### Disable if Needed
+ ```python
+ # In rag_pipeline.py, line 39
+ _HAS_PERSONA = False  # Reverts to generic responses
+ ```
+
+ ### Add New Persona
+ ```python
+ # In persona_prompts.py
+ PERSONA_PROMPTS["your_persona"] = {
+     "system_suffix": "Your custom instructions...",
+     "keywords": ["keyword1", "keyword2"]
+ }
+ ```
+
+ ---
+
+ ## Troubleshooting
+
+ **Q: Persona not detected?**
+ A: Add more keywords to `persona_prompts.py`
+
+ **Q: Responses too long?**
+ A: Reduce `max_tokens` in RAGPipeline init
+
+ **Q: Want old format back?**
+ A: Set `_HAS_PERSONA = False`
+
+ ---
+
+ ## Next Steps
+
+ 1. ✅ **Test it:** `python test_enhanced_responses.py`
+ 2. 📊 **Compare:** Check output vs your current response
+ 3. 🔧 **Customize:** Edit personas in `persona_prompts.py`
+ 4. 🚀 **Deploy:** Use in production when satisfied
+
+ ---
+
+ ## Support
+
+ - **Examples:** `BEFORE_AFTER_COMPARISON.md`
+ - **Technical:** `RESPONSE_ENHANCEMENT_GUIDE.md`
+ - **Overview:** `ENHANCEMENT_SUMMARY.md`
+
+ ---
+
+ **Start here:** `python test_enhanced_responses.py`
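
The "How It Works" flow and the "Add New Persona" snippet in QUICK_START.md suggest keyword-based persona detection, but `persona_prompts.py` is not part of this diff. The sketch below is therefore only an assumption about how such a lookup could work: the `PERSONA_PROMPTS` entries, their keywords, and this `detect_persona` body are illustrative and may differ from the committed implementation.

```python
# Hypothetical data, shaped like the "Add New Persona" snippet in QUICK_START.md.
PERSONA_PROMPTS = {
    "student": {
        "system_suffix": "Explain with student-focused examples.",
        "keywords": ["student", "tuition", "school"],
    },
    "business": {
        "system_suffix": "Focus on company obligations.",
        "keywords": ["business", "company", "turnover"],
    },
    "employee": {
        "system_suffix": "Focus on payroll deductions.",
        "keywords": ["salary", "paye", "employer"],
    },
}


def detect_persona(question: str) -> str:
    """Return the persona with the most keyword hits, or 'general' if none hit."""
    q = question.lower()
    scores = {
        name: sum(keyword in q for keyword in config["keywords"])
        for name, config in PERSONA_PROMPTS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


print(detect_persona("As a student, what do I need to know about the new tax law?"))
print(detect_persona("What changed this year?"))
```

With these illustrative keywords, the student question from the example output resolves to `student`, and a question with no keyword hits falls back to `general`.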
rag_pipeline.py CHANGED
@@ -40,6 +40,13 @@ try:
  except ImportError:
      _HAS_PERSONA = False

+ # Tax calculator for accurate arithmetic
+ try:
+     from tax_calculator import TaxCalculator
+     _HAS_TAX_CALC = True
+ except ImportError:
+     _HAS_TAX_CALC = False
+
  # Optional hybrid and rerankers
  from langchain_community.retrievers import BM25Retriever
  from langchain.retrievers import EnsembleRetriever
@@ -342,6 +349,12 @@ class RAGPipeline:
              print(f"Could not load cross-encoder reranker: {e}")
              self.reranker = None

+         # Initialize tax calculator for accurate arithmetic
+         self.tax_calculator = None
+         if _HAS_TAX_CALC:
+             self.tax_calculator = TaxCalculator()
+             print("Tax calculator loaded - calculations will use real arithmetic")
+
          self.chain = self._build_chain()
          print("RAG pipeline ready")

@@ -458,13 +471,6 @@ class RAGPipeline:
              "3) **Make it real**: Always include a relatable example or scenario that shows how this actually plays out.\n"
              "4) **Give them clear takeaways**: 2-3 specific points they can remember or act on.\n"
              "\n"
-             "Citation Guidelines:\n"
-             "- Group ALL citations at the very end under a \"Sources\" or \"Want to dive deeper?\" section\n"
-             "- Format: \"Nigeria Tax Act 2025, Section [X]\" or \"Tax Administration Guidelines, Chapter [Y]\"\n"
-             "- NEVER use document filenames (like \"Journal_Nigeria-Tax-Bill.pdf\") - reference the actual law/chapter instead\n"
-             "- Don't interrupt your explanation with citations - let it flow naturally\n"
-             "- Only use information from the provided context\n"
-             "\n"
              "Tone - This is Critical:\n"
              "- Write like you're explaining this to someone who's never dealt with this before\n"
              "- Use \"you\" and \"your\" to make it personal\n"
@@ -474,6 +480,7 @@ class RAGPipeline:
              "- Be genuine and understanding: \"I know tax stuff can be confusing, but here's what you need to know...\"\n"
              "- If something doesn't apply to them, be clear: \"This won't affect you, but...\"\n"
              "- Make it conversational and natural - like explaining to a friend, not reading from a legal document\n"
+             "- Always encourage questions and provide reassurance: \"Feel free to ask if anything is unclear!\"\n"
          )

          # Add persona-specific context if available
@@ -626,15 +633,61 @@ class RAGPipeline:
              })
          return final

+     # -------- Question validation --------
+
+     def _is_tax_related_question(self, question: str) -> bool:
+         """
+         Check if the question is related to Nigerian tax law.
+         Runs a fast keyword check first and falls back to an LLM
+         classification call only when no keyword matches.
+         """
+         # Fast keyword check first (avoid LLM call if obviously tax-related)
+         tax_keywords = [
+             'tax', 'paye', 'vat', 'cit', 'wht', 'income', 'revenue',
+             'levy', 'duty', 'assessment', 'filing', 'return', 'deduction',
+             'allowance', 'relief', 'exemption', 'taxable', 'naira', '₦',
+             'firs', 'lirs', 'pension', 'nhf', 'company', 'business',
+             'employer', 'employee', 'salary', 'profit', 'turnover'
+         ]
+
+         q_lower = question.lower()
+         if any(keyword in q_lower for keyword in tax_keywords):
+             return True
+
+         # If no keywords, use LLM to classify (fast check)
+         classifier_prompt = ChatPromptTemplate.from_template(
+             "You are a question classifier for a Nigerian tax law assistant.\n\n"
+             "Is the following question related to Nigerian tax law, taxation, tax administration, "
+             "tax calculations, or tax compliance?\n\n"
+             "Question: {question}\n\n"
+             "Answer ONLY with 'YES' or 'NO'. Nothing else."
+         )
+
+         try:
+             response = (classifier_prompt | self.llm | StrOutputParser()).invoke({"question": question})
+             return response.strip().upper().startswith("YES")
+         except Exception:
+             # If classification fails, err on the side of allowing the question
+             return True
+
      # -------- Task routing --------

-     @staticmethod
-     def _route(question: str) -> str:
+     def _route(self, question: str) -> str:
+         """Route question to appropriate handler."""
          q = question.lower()
+
+         # Check if this is a calculation question first (highest priority)
+         if self.tax_calculator and self.tax_calculator.is_calculation_question(question):
+             return "calculate"
+
+         # Then check for summarization
          if re.search(r"\bchapter\b|\bsection\b|\bpart\s+[ivxlcdm]+\b|^summari[sz]e\b", q):
              return "summarize"
+
+         # Then check for structured extraction
          if re.search(r"\bextract\b|\blist\b|\btable\b|\brate\b|\bband\b|\bthreshold\b|\ballowance\b|\brelief\b", q):
              return "extract"
+
+         # Default to QA
          return "qa"

      # Stub for a future extractor chain - currently route extractor requests to QA chain with strict rules
@@ -643,23 +696,67 @@ class RAGPipeline:

      def query(self, question: str, verbose: bool = False) -> str:
          """Route and answer the question with persona-aware responses."""
+         # First, check if question is tax-related
+         if not self._is_tax_related_question(question):
+             return (
+                 "**I'm Káàntà AI - Your Nigerian Tax Assistant**\n\n"
+                 "I specialize in answering questions about Nigerian tax law, including:\n"
+                 "• Personal Income Tax (PAYE)\n"
+                 "• Company Income Tax (CIT)\n"
+                 "• Value Added Tax (VAT)\n"
+                 "• Tax calculations and brackets\n"
+                 "• Tax filing and compliance\n"
+                 "• Tax reliefs and exemptions\n\n"
+                 "Your question doesn't seem to be related to Nigerian taxation. "
+                 "I can only help with tax-related questions based on the Nigeria Tax Act and Tax Administration documents.\n\n"
+                 "**Try asking me:**\n"
+                 "• \"How much tax will I pay on a monthly income of ₦X?\"\n"
+                 "• \"What are the personal income tax rates in Nigeria?\"\n"
+                 "• \"What is PAYE and how does it work?\"\n"
+                 "• \"What tax reliefs are available for individuals?\"\n\n"
+                 "Feel free to ask any tax-related question!"
+             )
+
+         # Route the question
+         task = self._route(question)
+
          if verbose:
-             print(f"\nRetrieving relevant documents...")
-             docs = self._retrieve(question)
-             print(f"Found {len(docs)} relevant chunks:")
-             for i, doc in enumerate(docs[:20], 1):
-                 source = doc.metadata.get("source", "Unknown")
-                 page = doc.metadata.get("page", "Unknown")
-                 preview = doc.page_content[:150].replace("\n", " ")
-                 print(f" [{i}] {source} (page {page}): {preview}...")
-             print()
-
+             print(f"\nTask type: {task}")
+
+             if task == "calculate":
+                 print("Using tax calculator for accurate arithmetic\n")
+             else:
+                 print(f"\nRetrieving relevant documents...")
+                 docs = self._retrieve(question)
+                 print(f"Found {len(docs)} relevant chunks:")
+                 for i, doc in enumerate(docs[:20], 1):
+                     source = doc.metadata.get("source", "Unknown")
+                     page = doc.metadata.get("page", "Unknown")
+                     preview = doc.page_content[:150].replace("\n", " ")
+                     print(f" [{i}] {source} (page {page}): {preview}...")
+                 print()
+
              # Show detected persona if available
              if _HAS_PERSONA:
                  persona = detect_persona(question)
                  print(f"Detected persona: {persona}\n")

-         task = self._route(question)
+         # Handle calculation questions with tax calculator
+         if task == "calculate":
+             if self.tax_calculator:
+                 try:
+                     answer = self.tax_calculator.answer_calculation_question(question)
+                     if answer:
+                         return answer
+                     else:
+                         # Fallback to regular QA if extraction failed
+                         task = "qa"
+                 except Exception as e:
+                     print(f"Warning: Tax calculation failed: {e}")
+                     print("Falling back to regular QA chain\n")
+                     task = "qa"
+
+         # Handle other task types
          if task == "summarize":
              return self._summarize_chapter(question)
          elif task == "extract":
@@ -671,6 +768,12 @@ class RAGPipeline:


  def main():
+     # Fix encoding for Windows console
+     if sys.platform == 'win32':
+         import io
+         sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', errors='replace')
+         sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8', errors='replace')
+
      parser = argparse.ArgumentParser(
          description="Enhanced RAG pipeline with hybrid retrieval, reranking, and chapter summarization",
          formatter_class=argparse.RawDescriptionHelpFormatter,
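
The keyword pre-check in `_is_tax_related_question` uses substring matching, so short keywords such as `cit`, `vat`, or `firs` also match inside unrelated words ("city", "private", "first"), and those questions skip the LLM classifier. A minimal sketch of a word-boundary variant follows, assuming the same keyword list. The helper names `substring_check` and `word_boundary_check` are hypothetical and not part of the commit, and the optional plural suffix is an added assumption.

```python
import re

# Keywords copied from the commit's pre-check. '₦' is checked separately
# because it is not a word character, so \b does not apply to it.
WORD_KEYWORDS = [
    "tax", "paye", "vat", "cit", "wht", "income", "revenue",
    "levy", "duty", "assessment", "filing", "return", "deduction",
    "allowance", "relief", "exemption", "taxable", "naira",
    "firs", "lirs", "pension", "nhf", "company", "business",
    "employer", "employee", "salary", "profit", "turnover",
]

# Whole-word match, tolerating a simple plural suffix ("taxes", "returns").
_KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in WORD_KEYWORDS) + r")(?:e?s)?\b"
)


def substring_check(question: str) -> bool:
    """Behaviour of the committed pre-check: any keyword anywhere in the text."""
    q = question.lower()
    return "₦" in q or any(k in q for k in WORD_KEYWORDS)


def word_boundary_check(question: str) -> bool:
    """Stricter variant: keywords must appear as whole words."""
    q = question.lower()
    return "₦" in q or _KEYWORD_RE.search(q) is not None


question = "Which city has the best private schools?"
print(substring_check(question))      # True: 'cit' in 'city', 'vat' in 'private'
print(word_boundary_check(question))  # False: no keyword appears as a whole word
```

The stricter check sends more borderline questions to the LLM classifier, which costs one extra call but avoids letting clearly off-topic questions through on an accidental substring hit.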
tax_calculator.py ADDED
@@ -0,0 +1,357 @@
+ """
+ Tax calculation engine with real Python arithmetic.
+ Handles Nigerian personal income tax calculations accurately.
+ """
+
+ import re
+ from typing import Dict, List, Tuple, Optional
+ from dataclasses import dataclass
+
+
+ @dataclass
+ class TaxBracket:
+     """Represents a single tax bracket."""
+     lower: float
+     upper: float
+     rate: float
+
+     def __repr__(self):
+         return f"₦{self.lower:,.0f} - ₦{self.upper:,.0f}: {self.rate*100:.0f}%"
+
+
+ @dataclass
+ class TaxCalculation:
+     """Results of a tax calculation."""
+     annual_income: float
+     annual_tax: float
+     monthly_income: float
+     monthly_tax: float
+     effective_rate: float
+     breakdown: List[Dict[str, float]]
+
+     def format_summary(self) -> str:
+         """Format a human-readable summary."""
+         lines = []
+         lines.append("**Tax Calculation Summary**\n")
+         lines.append(f"**Monthly Income:** ₦{self.monthly_income:,.2f}")
+         lines.append(f"**Monthly Tax:** ₦{self.monthly_tax:,.2f}")
+         lines.append(f"**Take-home pay:** ₦{self.monthly_income - self.monthly_tax:,.2f}\n")
+
+         lines.append(f"**Annual Income:** ₦{self.annual_income:,.2f}")
+         lines.append(f"**Annual Tax:** ₦{self.annual_tax:,.2f}")
+         lines.append(f"**Effective Tax Rate:** {self.effective_rate:.2f}%\n")
+
+         lines.append("**How we calculated this:**")
+         for item in self.breakdown:
+             if item['tax'] > 0:
+                 lines.append(
+                     f" • ₦{item['bracket_start']:,.0f} - ₦{item['bracket_end']:,.0f} "
+                     f"at {item['rate']*100:.0f}%: "
+                     f"₦{item['taxable']:,.2f} × {item['rate']*100:.0f}% = ₦{item['tax']:,.2f}"
+                 )
+             else:
+                 lines.append(
+                     f" • ₦{item['bracket_start']:,.0f} - ₦{item['bracket_end']:,.0f} "
+                     f"at {item['rate']*100:.0f}%: TAX-FREE"
+                 )
+
+         return "\n".join(lines)
+
+
+ # Nigerian Personal Income Tax Brackets (2025)
+ NIGERIA_TAX_BRACKETS = [
+     TaxBracket(0, 800_000, 0.00),  # 0% - TAX FREE
+     TaxBracket(800_000, 3_000_000, 0.15),  # 15%
+     TaxBracket(3_000_000, 12_000_000, 0.18),  # 18%
+     TaxBracket(12_000_000, 25_000_000, 0.21),  # 21%
+     TaxBracket(25_000_000, 50_000_000, 0.23),  # 23%
+     TaxBracket(50_000_000, float('inf'), 0.25),  # 25%
+ ]
+
+
+ class TaxCalculator:
+     """Calculates Nigerian personal income tax."""
+
+     def __init__(self, brackets: Optional[List[TaxBracket]] = None):
+         """Initialize calculator with tax brackets."""
+         self.brackets = brackets or NIGERIA_TAX_BRACKETS
+
+     def calculate_annual_tax(self, annual_income: float) -> TaxCalculation:
+         """
+         Calculate tax on annual income using real arithmetic.
+
+         Args:
+             annual_income: Annual income in Naira
+
+         Returns:
+             TaxCalculation with detailed breakdown
+         """
+         if annual_income < 0:
+             raise ValueError("Income cannot be negative")
+
+         total_tax = 0.0
+         breakdown = []
+         remaining_income = annual_income
+
+         for bracket in self.brackets:
+             if remaining_income <= 0:
+                 break
+
+             # Calculate taxable amount in this bracket
+             bracket_lower = bracket.lower
+             bracket_upper = min(bracket.upper, annual_income)
+
+             # How much income falls in this bracket?
+             taxable_in_bracket = min(
+                 remaining_income,
+                 bracket_upper - bracket_lower
+             )
+
+             # Calculate tax for this bracket
+             tax_in_bracket = taxable_in_bracket * bracket.rate
+             total_tax += tax_in_bracket
+
+             # Record breakdown
+             breakdown.append({
+                 'bracket_start': bracket_lower,
+                 'bracket_end': bracket_upper if bracket_upper != float('inf') else annual_income,
+                 'rate': bracket.rate,
+                 'taxable': taxable_in_bracket,
+                 'tax': tax_in_bracket
+             })
+
+             remaining_income -= taxable_in_bracket
+
+             # Stop if we've reached the income amount
+             if annual_income <= bracket_upper:
+                 break
+
+         effective_rate = (total_tax / annual_income * 100) if annual_income > 0 else 0
+
+         return TaxCalculation(
+             annual_income=annual_income,
+             annual_tax=total_tax,
+             monthly_income=annual_income / 12,
+             monthly_tax=total_tax / 12,
+             effective_rate=effective_rate,
+             breakdown=breakdown
+         )
+
+     def calculate_from_monthly(self, monthly_income: float) -> TaxCalculation:
+         """Calculate tax from monthly income."""
+         annual_income = monthly_income * 12
+         return self.calculate_annual_tax(annual_income)
+
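
The bracket loop in `calculate_annual_tax` can be checked by hand. The standalone sketch below is not the committed code: it copies the bracket values from `NIGERIA_TAX_BRACKETS` into plain tuples and uses a hypothetical helper, `progressive_tax`, to reproduce the arithmetic for the two incomes used in the file's own self-test.

```python
# Brackets copied from NIGERIA_TAX_BRACKETS: (lower, upper, rate).
BRACKETS = [
    (0, 800_000, 0.00),
    (800_000, 3_000_000, 0.15),
    (3_000_000, 12_000_000, 0.18),
    (12_000_000, 25_000_000, 0.21),
    (25_000_000, 50_000_000, 0.23),
    (50_000_000, float("inf"), 0.25),
]


def progressive_tax(annual_income: float) -> float:
    """Sum rate * (portion of income that falls inside each bracket)."""
    tax = 0.0
    for lower, upper, rate in BRACKETS:
        if annual_income <= lower:
            break
        tax += (min(annual_income, upper) - lower) * rate
    return tax


# 75,000 per month is 900,000 per year:
# 800,000 at 0% plus 100,000 at 15% gives 15,000 per year (1,250 per month).
annual = progressive_tax(75_000 * 12)
print(f"{annual:,.2f} per year, {annual / 12:,.2f} per month")

# 10,000,000 per year:
# 2,200,000 at 15% (330,000) plus 7,000,000 at 18% (1,260,000) gives 1,590,000.
print(f"{progressive_tax(10_000_000):,.2f} per year")
```

Only the slice of income inside each bracket is taxed at that bracket's rate, which is why an income just above ₦800,000 produces a small tax bill rather than 15% of the whole amount.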
+     def extract_income_from_question(self, question: str) -> Optional[Tuple[float, str]]:
+         """
+         Extract income amount and period from question.
+
+         Returns:
+             Tuple of (amount, period) where period is 'monthly' or 'annual'
+             or None if not found
+         """
+         q = question.lower()
+
+         # Remove common currency symbols and words
+         q = q.replace('₦', '').replace('naira', '').replace(',', '')
+
+         # Patterns for income extraction (order matters - check annual first!)
+         # Note: commas already removed from q before matching
+         patterns = [
+             # Annual patterns first - check before monthly defaults
+             # "5000000 per year" or "5000000 annually"
+             (r'(\d+(?:\.\d+)?)\s*(?:per|a|each|every)?\s*(?:year|annum)', 'annual'),
+             (r'(\d+(?:\.\d+)?)\s*(?:/year|yearly|annually)', 'annual'),
+             # "annual income of 5000000" or "income of 5000000 per year"
+             (r'annual\s+income\s+(?:of\s+)?(\d+(?:\.\d+)?)', 'annual'),
+             (r'income\s+(?:of\s+)?(\d+(?:\.\d+)?)\s+(?:per|a)\s+year', 'annual'),
+
+             # Monthly patterns
+             # "75000 per month" or "75000 a month"
+             (r'(\d+(?:\.\d+)?)\s*(?:per|a|each|every)?\s*month', 'monthly'),
+             # "75000 monthly" or "75000/month"
+             (r'(\d+(?:\.\d+)?)\s*(?:/month|monthly)', 'monthly'),
+             # "earning 75000" (assume monthly if not specified)
+             (r'earning\s+(\d+(?:\.\d+)?)', 'monthly'),
+             # "I earn 75000" (assume monthly)
+             (r'earn(?:ing)?\s+(\d+(?:\.\d+)?)', 'monthly'),
+             # "salary of 75000" (assume monthly)
+             (r'salary\s+(?:of\s+)?(\d+(?:\.\d+)?)', 'monthly'),
+             # "income of 75000" (default to monthly if no period specified)
+             (r'income\s+(?:of\s+)?(\d+(?:\.\d+)?)', 'monthly'),
+         ]
+
+         for pattern, period in patterns:
+             match = re.search(pattern, q)
+             if match:
+                 try:
+                     amount = float(match.group(1))
+                     return (amount, period)
+                 except (ValueError, IndexError):
+                     continue
+
+         return None
+
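
The "order matters" comment in `extract_income_from_question` is easy to demonstrate: the generic `income of N` pattern also matches a question that states an annual figure, so whichever pattern is listed first decides the period. The sketch below reuses two patterns from the commit in a hypothetical helper, `first_match`, that mirrors the first-match-wins loop. It is an illustration, not the committed method.

```python
import re
from typing import List, Optional, Tuple

# Two patterns taken from the commit's list.
ANNUAL = (r"income\s+(?:of\s+)?(\d+(?:\.\d+)?)\s+(?:per|a)\s+year", "annual")
DEFAULT_MONTHLY = (r"income\s+(?:of\s+)?(\d+(?:\.\d+)?)", "monthly")


def first_match(question: str, patterns: List[Tuple[str, str]]) -> Optional[Tuple[float, str]]:
    """Return (amount, period) for the first pattern that matches, else None."""
    q = question.lower().replace("₦", "").replace("naira", "").replace(",", "")
    for pattern, period in patterns:
        match = re.search(pattern, q)
        if match:
            return float(match.group(1)), period
    return None


question = "How much tax on an income of ₦5,000,000 per year?"
print(first_match(question, [ANNUAL, DEFAULT_MONTHLY]))  # (5000000.0, 'annual')
print(first_match(question, [DEFAULT_MONTHLY, ANNUAL]))  # (5000000.0, 'monthly')
```

Listing the monthly default first would treat ₦5,000,000 as a monthly figure and multiply it by 12 downstream, so the specific annual patterns need to stay ahead of the fallbacks.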
+     def is_calculation_question(self, question: str) -> bool:
+         """
+         Detect if question is asking for a tax calculation.
+
+         Returns:
+             True if question requires calculation
+         """
+         q = question.lower()
+
+         # Calculation indicators (more comprehensive patterns)
+         calc_patterns = [
+             r'how much.*tax',  # "how much tax will I pay" or "how much tax"
+             r'calculate.*tax',  # "calculate my tax"
+             r'what.*tax.*pay',  # "what tax will I pay"
+             r'what.*tax.*owe',  # "what tax do I owe"
+             r'tax.*calculation',  # "tax calculation"
+             r'compute.*tax',  # "compute tax"
+             r'my tax',  # "what's my tax"
+             r'tax liability',  # "tax liability"
+             r'tax.*amount',  # "tax amount"
+             r'pay.*tax',  # "will I pay tax"
+             r'tax.*deduct',  # "tax deduction"
+             r'how much.*pay',  # "how much will I pay" (in context of income)
+             r'what.*pay.*month',  # "what will I pay per month"
+             r'how much.*month.*tax',  # "how much per month as tax"
+         ]
+
+         # Must also contain income information
+         has_income = self.extract_income_from_question(question) is not None
+
+         # Check if any calculation pattern matches
+         has_calc_intent = any(re.search(pattern, q) for pattern in calc_patterns)
+
+         return has_calc_intent and has_income
+
+     def answer_calculation_question(self, question: str) -> Optional[str]:
+         """
+         Answer a tax calculation question with accurate arithmetic.
+
+         Args:
+             question: User's question
+
+         Returns:
+             Formatted answer with calculation, or None if no income
+             amount could be extracted from the question
+         """
+         # Extract income
+         extraction = self.extract_income_from_question(question)
+
+         if not extraction:
+             return None
+
+         amount, period = extraction
+
+         # Calculate tax
+         if period == 'monthly':
+             result = self.calculate_from_monthly(amount)
+         else:
+             result = self.calculate_annual_tax(amount)
+
+         # Validate result
+         if result.annual_tax > result.annual_income:
+             raise ValueError(
+                 f"VALIDATION ERROR: Calculated tax (₦{result.annual_tax:,.2f}) "
+                 f"exceeds income (₦{result.annual_income:,.2f})!"
+             )
+
+         if result.effective_rate > 100:
+             raise ValueError(
+                 f"VALIDATION ERROR: Effective tax rate ({result.effective_rate:.2f}%) exceeds 100%!"
+             )
+
+         # Format answer
+         answer_parts = []
+
+         answer_parts.append("**Bottom Line:**")
+         answer_parts.append(
+             f"On an income of ₦{result.monthly_income:,.2f} per month "
+             f"(₦{result.annual_income:,.2f} per year), you'll pay **₦{result.monthly_tax:,.2f} per month** "
+             f"in personal income tax. That's ₦{result.annual_tax:,.2f} per year.\n"
+         )
+
+         answer_parts.append(result.format_summary())
+
+         answer_parts.append("\n**Important Notes:**")
+         answer_parts.append("• The first ₦800,000 of your annual income is completely tax-free")
+         answer_parts.append("• Nigeria uses a progressive tax system - you only pay higher rates on income above each threshold")
+         answer_parts.append("• This calculation is for personal income tax only (PAYE)")
+         answer_parts.append("• Additional deductions may apply: pension (8%), NHF (2.5% if applicable), NHIS, etc.")
+
+         answer_parts.append("\n**Next Steps:**")
+         answer_parts.append("• Check your payslip to verify your employer is deducting the correct amount")
+         answer_parts.append("• Keep records of your tax payments for annual filing")
+         answer_parts.append("• Consult a tax professional for personalized advice on deductions and reliefs")
+
+         return "\n".join(answer_parts)
+
+
+ # Convenience functions
+ def calculate_tax(income: float, period: str = 'annual') -> TaxCalculation:
+     """
+     Quick tax calculation.
+
+     Args:
+         income: Income amount in Naira
+         period: 'monthly' or 'annual'
+
+     Returns:
+         TaxCalculation object
+     """
+     calc = TaxCalculator()
+     if period == 'monthly':
+         return calc.calculate_from_monthly(income)
+     else:
+         return calc.calculate_annual_tax(income)
+
+
+ def format_tax_brackets() -> str:
+     """Format tax brackets for display."""
+     lines = ["**Nigerian Personal Income Tax Brackets (2025)**\n"]
+     for bracket in NIGERIA_TAX_BRACKETS:
+         if bracket.upper == float('inf'):
+             lines.append(f"Above ₦{bracket.lower:,.0f}: {bracket.rate*100:.0f}%")
+         else:
+             lines.append(f"₦{bracket.lower:,.0f} - ₦{bracket.upper:,.0f}: {bracket.rate*100:.0f}%")
+     return "\n".join(lines)
+
+
+ if __name__ == "__main__":
+     # Test with the example from the user
+     import sys
+     if sys.platform == 'win32':
+         # Fix encoding for Windows console
+         import io
+         sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', errors='replace')
+
+     print("Testing Tax Calculator\n")
+     print("=" * 80)
+
+     calc = TaxCalculator()
+
+     # Test case 1: N75,000/month
+     print("\nTest 1: N75,000 per month")
+     print("-" * 80)
+     result = calc.calculate_from_monthly(75_000)
+     print(result.format_summary())
+
+     # Test case 2: Extract from question
+     print("\n\nTest 2: Extract and calculate from question")
+     print("-" * 80)
+     question = "I am earning 75,000 per month, how much will I pay per month as tax"
+     answer = calc.answer_calculation_question(question)
+     print(answer)
+
+     # Test case 3: High income
+     print("\n\nTest 3: N10,000,000 per year")
+     print("-" * 80)
+     result = calc.calculate_annual_tax(10_000_000)
+     print(result.format_summary())
+
+     # Test case 4: Tax bracket display
+     print("\n\nTest 4: Tax bracket display")
+     print("-" * 80)
+     print(format_tax_brackets())
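
Both `rag_pipeline.py` and `tax_calculator.py` fix Windows console output by wrapping `sys.stdout.buffer` in a new `io.TextIOWrapper`. On Python 3.7 and later, text streams also offer `reconfigure()`, which changes the encoding in place without replacing the stream object. The sketch below shows that alternative. The helper name `force_utf8` is hypothetical, and the `hasattr` guard is an assumption to cover streams that have been swapped for objects without `reconfigure()`.

```python
import io
import sys


def force_utf8(stream):
    """Switch a text stream to UTF-8 in place when it supports reconfigure()."""
    if hasattr(stream, "reconfigure"):
        stream.reconfigure(encoding="utf-8", errors="replace")
    return stream


# Same platform check as the commit; on other platforms this is a no-op.
if sys.platform == "win32":
    force_utf8(sys.stdout)
    force_utf8(sys.stderr)

# Demonstration on an in-memory stream that starts out as ASCII.
demo = force_utf8(io.TextIOWrapper(io.BytesIO(), encoding="ascii"))
demo.write("₦75,000")
demo.flush()
print(demo.buffer.getvalue())
```

Because the original stream object is kept, other code that already holds a reference to `sys.stdout` sees the new encoding as well.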