Complaint Analysis System with IPC Section Matching

Overview

This program takes a user-submitted complaint and analyzes it by matching relevant Indian Penal Code (IPC) sections using Natural Language Processing (NLP) techniques. The system uses a combination of TF-IDF vectorization, cosine similarity, and fuzzy logic to determine the most applicable IPC sections based on the content of the complaint.

Features

Text Preprocessing: The complaint text is preprocessed by removing stopwords and applying lemmatization to ensure meaningful keyword extraction.
Keyword Matching: Each IPC section is associated with predefined keywords. The keywords and the complaint text are vectorized with TF-IDF and compared using cosine similarity.
Fuzzy Logic: Fuzzy logic is used to translate the cosine similarity scores into a membership value, which helps determine whether a particular IPC section is applicable to the complaint.
Threshold-based Filtering: Only IPC sections with a fuzzy membership value above a threshold are considered relevant and returned as applicable.

Dependencies

This project requires the following Python libraries:

pandas: For handling tabular data.
numpy: For numerical operations and arrays.
sklearn (installed as scikit-learn): For machine learning utilities, including text vectorization and cosine similarity.
skfuzzy (installed as scikit-fuzzy): For fuzzy logic implementation and control systems.
nltk: For natural language processing tasks, including tokenization, stopword removal, and lemmatization. The stopwords and wordnet corpora must be downloaded once with nltk.download().

Install dependencies using:

pip install pandas numpy scikit-learn scikit-fuzzy nltk

Code Walkthrough

1. Knowledge Base (IPC Sections with Keywords)

The system defines a list of dictionaries, one per IPC section. Each entry holds a unique section code (e.g., "320B", "506") and a space-separated string of keywords that represent the types of offenses related to the section.

ipc_sections = [
    {"section": "320B", "keywords": "grievous hurt injuries serious harm damage injury"},
    {"section": "506", "keywords": "threaten kill harm death violence abuse"},
    ...
]

2. Text Preprocessing

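The preprocess_text function processes the input complaint text and the keywords associated with each IPC section. It relies on two module-level objects, stop_words and lemmatizer, which are presumably NLTK's English stopword list and a WordNetLemmatizer. The function: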

Converts the text to lowercase.
Splits the text into individual words.
Removes stopwords (common words like "the", "and", etc.).
Lemmatizes the remaining words to their root form.

def preprocess_text(text):
    text = text.lower()  # Lowercase text
    words = text.split()  # Split text into words
    words = [lemmatizer.lemmatize(word) for word in words if word not in stop_words]  # Lemmatize and remove stopwords
    return " ".join(words)
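One detail worth noting: text.split() splits on whitespace only, so punctuation stays attached to tokens ("violence," will not match the keyword "violence"). The sketch below shows one possible way to tokenize so that punctuation is dropped. It uses only the standard library; the tokenize helper and the tiny STOP_WORDS set are hypothetical stand-ins, not part of ipc.py, and lemmatization is left out.

```python
import re

# Tiny illustrative stopword set (an assumption for this sketch);
# the real program presumably uses NLTK's English stopword list.
STOP_WORDS = {"i", "was", "have", "been", "with", "and", "am", "in", "of", "the", "a"}

def tokenize(text):
    """Lowercase the text, keep alphabetic runs only, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

sentence = "I was threatened with violence, and harm."
naive = sentence.lower().split()   # whitespace split keeps "violence," and "harm."
clean = tokenize(sentence)         # punctuation is stripped before filtering
print(naive)
print(clean)
```

Whether stripping punctuation changes the final result depends on the vectorizer's own tokenization of the joined string, so this is a possible refinement rather than a required fix.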

3. Vectorization of IPC Keywords

The preprocessed IPC section keywords (processed_ipc_keywords, one string per section) are passed to the TfidfVectorizer from scikit-learn, which converts them into numerical vectors. The vectors are calculated using TF-IDF (Term Frequency-Inverse Document Frequency) over unigrams and bigrams, so that two-word phrases contribute to the match as well as single words.

vectorizer = TfidfVectorizer(ngram_range=(1, 2))  # Use unigrams and bigrams for better context understanding
ipc_vectors = vectorizer.fit_transform(processed_ipc_keywords)
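To make the vectorization step more concrete, here is a dependency-free sketch of TF-IDF over unigrams and bigrams followed by cosine similarity. It follows the smoothed IDF formula that scikit-learn documents, ln((1 + N) / (1 + df)) + 1, with L2 normalization, but it is not a drop-in replacement for TfidfVectorizer (tokenization differs). All helper names are hypothetical.

```python
import math
from collections import Counter

def ngrams(text, n_max=2):
    """Return all unigrams up to n_max-grams of a whitespace-split string."""
    words = text.split()
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return grams

def fit_idf(docs):
    """Smoothed inverse document frequency per term."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(ngrams(doc)))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}

def tfidf_vector(doc, idf):
    """L2-normalized TF-IDF weights; terms unseen during fitting are dropped."""
    counts = Counter(t for t in ngrams(doc) if t in idf)
    weights = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}

def cosine(u, v):
    """Dot product of two L2-normalized sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

# Keyword strings taken from the knowledge base excerpt above.
keyword_docs = [
    "grievous hurt injuries serious harm damage injury",
    "threaten kill harm death violence abuse",
]
idf = fit_idf(keyword_docs)
section_vectors = [tfidf_vector(doc, idf) for doc in keyword_docs]

# A hypothetical, already preprocessed complaint.
complaint_vector = tfidf_vector("threatened violence danger harm", idf)
scores = [cosine(complaint_vector, vec) for vec in section_vectors]
print(scores)
```

In this sketch the second keyword string scores higher because it shares both "violence" and "harm" with the complaint, while "threatened" matches nothing since only "threaten" is in the vocabulary. That is why lemmatizing both sides consistently matters.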

4. Fuzzy Logic Setup

Fuzzy logic is used to map similarity scores (from cosine similarity) to fuzzy membership values. Membership functions are defined for two variables:

score: Represents the similarity score, divided into "low", "medium", and "high".
membership: Represents the applicability of the IPC section, divided into "not applicable" and "applicable".

score = ctrl.Antecedent(np.arange(0, 1.1, 0.1), "score")
membership = ctrl.Consequent(np.arange(0, 1.1, 0.1), "membership")

score["low"] = fuzz.trimf(score.universe, [0, 0, 0.3])
score["medium"] = fuzz.trimf(score.universe, [0.2, 0.5, 0.8])
score["high"] = fuzz.trimf(score.universe, [0.7, 1, 1])

membership["not_applicable"] = fuzz.trimf(membership.universe, [0, 0, 0.4])
membership["applicable"] = fuzz.trimf(membership.universe, [0.3, 0.7, 1])
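As an illustration of what the triangular membership functions compute, the following sketch evaluates them in plain Python. The trimf here is a hand-written stand-in for fuzz.trimf, evaluated at a single point rather than over a universe array, and fuzzify is a hypothetical helper.

```python
def trimf(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a <= b <= c)."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Same parameters as the score membership functions above.
SCORE_SETS = {"low": (0, 0, 0.3), "medium": (0.2, 0.5, 0.8), "high": (0.7, 1, 1)}

def fuzzify(score_value):
    """Degree to which a similarity score belongs to each fuzzy set."""
    return {name: trimf(score_value, *abc) for name, abc in SCORE_SETS.items()}

degrees = fuzzify(0.75)
print(degrees)
```

A score of 0.75 falls in the overlap between "medium" and "high", so it belongs partially to both sets and not at all to "low".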

5. Fuzzy Rules

The fuzzy rules map similarity scores to applicability. If the score is "low" or "medium", the membership value will indicate that the IPC section is not applicable. If the score is "high", the IPC section is considered applicable.

rule1 = ctrl.Rule(score["low"], membership["not_applicable"])
rule2 = ctrl.Rule(score["medium"], membership["not_applicable"])
rule3 = ctrl.Rule(score["high"], membership["applicable"])
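The sketch below approximates how these three rules could be evaluated with Mamdani-style inference: each rule clips its output set at the rule's firing strength, the clipped sets are combined with max, and the result is defuzzified by centroid. It is a plain-Python approximation on a fine grid under those assumptions, not scikit-fuzzy's implementation, and the infer helper is hypothetical.

```python
def trimf(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a <= b <= c)."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def infer(score_value, steps=1000):
    """Approximate centroid output of the three rules for one similarity score."""
    low = trimf(score_value, 0, 0, 0.3)
    medium = trimf(score_value, 0.2, 0.5, 0.8)
    high = trimf(score_value, 0.7, 1, 1)

    fire_not_applicable = max(low, medium)  # rule1 and rule2 share a consequent
    fire_applicable = high                  # rule3

    num = den = 0.0
    for k in range(steps + 1):
        x = k / steps
        # Clip each output set at its firing strength, then take the union.
        mu = max(
            min(fire_not_applicable, trimf(x, 0, 0, 0.4)),
            min(fire_applicable, trimf(x, 0.3, 0.7, 1)),
        )
        num += x * mu
        den += mu
    return num / den if den else 0.0

for s in (0.0, 0.65, 1.0):
    print(f"score={s:.2f} -> membership={infer(s):.3f}")
```

In this sketch the output stays between roughly 0.13 (score 0) and roughly 0.67 (score 1), because a centroid can never reach the edges of the output universe. If scikit-fuzzy's centroid behaves similarly, any threshold below about 0.13 would let every section through, so the cut-off may need to sit between those two values to filter anything out.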

6. Complaint Analysis Function

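The main function, analyze_complaint, takes a complaint as input, preprocesses it, calculates its cosine similarity with each IPC section's keywords, and applies fuzzy logic to determine the applicability of each IPC section. It uses a membership_simulation object that this walkthrough does not show; it is presumably a ctrl.ControlSystemSimulation built from a ctrl.ControlSystem containing the three rules above.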

def analyze_complaint(complaint_text):
    processed_complaint = preprocess_text(complaint_text)
    complaint_vector = vectorizer.transform([processed_complaint])

    similarity_scores = cosine_similarity(complaint_vector, ipc_vectors)[0]

    applicable_sections = []
    similarity_list = []
    for section_idx, score_value in enumerate(similarity_scores):
        membership_simulation.input["score"] = score_value
        membership_simulation.compute()
        fuzzy_value = membership_simulation.output["membership"]

        if fuzzy_value > 0.1:  # Threshold to consider as applicable
            applicable_sections.append((ipc_sections[section_idx]["section"], fuzzy_value))
            similarity_list.append(
                f"IPC {ipc_sections[section_idx]['section']} - Score: {score_value:.4f}, Fuzzy: {fuzzy_value:.4f}"
            )

    return {
        "complaint": complaint_text,
        "applicable_sections": applicable_sections,
        "details": similarity_list,
    }
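The filtering step can also be pulled out into a small helper so that the threshold becomes a parameter and the results come back ranked. The following is a minimal sketch; select_applicable, its default threshold of 0.5, and the example values are all hypothetical and chosen only for illustration.

```python
def select_applicable(scored_sections, threshold=0.5):
    """Keep (section, fuzzy_value) pairs above the threshold, strongest first."""
    kept = [(section, value) for section, value in scored_sections if value > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Made-up fuzzy values, not output from the real system.
example = [("320B", 0.14), ("506", 0.62), ("307", 0.55)]
selected = select_applicable(example)
print(selected)
```

Keeping the threshold as an argument makes it easier to tune against real complaints instead of editing the comparison inside analyze_complaint.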

7. Main Program Execution

The program prompts the user to input a complaint, then analyzes it and prints the results, including the applicable IPC sections and similarity details.

if __name__ == "__main__":
    complaint_input = input("Enter the complaint: ")
    results = analyze_complaint(complaint_input)
    
    print("\nComplaint Analysis Results:")
    print(f"Complaint: {results['complaint']}")
    print("\nApplicable IPC Sections:")
    for section, fuzzy_value in results["applicable_sections"]:
        print(f"  - IPC Section {section}: Fuzzy Score = {fuzzy_value:.4f}")
    print("\nDetails:")
    for detail in results["details"]:
        print(f"  {detail}")

Sample Output

The output below illustrates the format only; the sections and values actually returned depend on the full knowledge base in ipc.py and on the membership functions defined above.

Enter the complaint: I have been threatened with violence, and I am in danger of harm.

Complaint Analysis Results:
Complaint: I have been threatened with violence, and I am in danger of harm.

Applicable IPC Sections:
  - IPC Section 506: Fuzzy Score = 0.9500
  - IPC Section 307: Fuzzy Score = 0.5000

Details:
  IPC 506 - Score: 0.6543, Fuzzy: 0.9500
  IPC 307 - Score: 0.5123, Fuzzy: 0.5000

Conclusion

This system provides an automated way to analyze complaints and match them with relevant IPC sections based on their content. It combines TF-IDF similarity with fuzzy logic to handle the ambiguity in textual data and to grade how applicable each section is.

Future Improvements

Extend IPC Knowledge Base: Add more IPC sections and their associated keywords to expand the coverage.
Contextual Understanding: Integrate more advanced NLP models for better understanding of the complaint context.
GUI Integration: Build a user-friendly graphical interface for easier interaction with the system.
