Customer-Complaint-Analyses

Big Data Analytics - EECS E6893 Final Project (Fall 2014)


Customer Complaint Analyses: Insights into issues plaguing the banking sector

Authors

@abhaar @avinsrid @nachirau @ss91

Motivation

In this project, we propose a scalable design to counter the above problems!

Our Work

What have we achieved?

With our new metric system, banks can rank complaints relative to one another and prioritize which ones to resolve first.

DEMO of our Work

Demo Video

Compilation Instructions

$ git clone https://github.com/Sapphirine/Customer-Complaint-Analyses.git
$ cd Customer-Complaint-Analyses/PROJECT_CODE/
$ mvn clean install
$ hadoop jar target/Classification-Files-Big-Data-Project-1.0.jar com.bigdata.complaintanalysis.ClassificationAutomator data/Consumer_Complaints.csv

The sequence files will be stored in HDFS under the classification directory, one subdirectory per state:

$ hdfs dfs -ls data/classification/$state_name
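To process every state rather than a single one, the state names can be read back from the directory listing. The sketch below is one possible way to do that. `list_states` is a hypothetical helper name, not part of the project, and it assumes the usual `hdfs dfs -ls` layout in which directory entries begin with `d` and the path is the last field.

```shell
# Hypothetical helper (not part of the project): print the state names found
# in an `hdfs dfs -ls` listing read from standard input.
# Assumes directory lines start with "d" and end with the full path,
# and that state names contain no spaces.
list_states() {
  awk '$1 ~ /^d/ { n = split($NF, parts, "/"); print parts[n] }'
}

# Example use (requires a running HDFS):
#   hdfs dfs -ls data/classification | list_states
```

The "Found N items" header line is skipped automatically because its first field does not start with `d`.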

Execute Mahout Naive Bayes Classification

$ $MAHOUT_PATH/bin/mahout seq2sparse -i data/classification/$state_name -o $state_name-vectors
$ $MAHOUT_PATH/bin/mahout split -i $state_name-vectors/tfidf-vectors --trainingOutput train-vectors --testOutput test-vectors --randomSelectionPct 40 --overwrite --sequenceFiles -xm sequential
$ $MAHOUT_PATH/bin/mahout trainnb -i train-vectors -el -li labelindex -o model -ow -c
$ $MAHOUT_PATH/bin/mahout testnb -i train-vectors -m model -l labelindex -ow -o $state_name-testing -c
$ $MAHOUT_PATH/bin/mahout testnb -i test-vectors -m model -l labelindex -ow -o $state_name-testing -c
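The five commands above can be wrapped in a small function so that they run in order for a given state and stop at the first failure. This is only a sketch: the function names `run_step` and `run_nb_pipeline` and the `DRY_RUN` switch are assumptions introduced for illustration, and `MAHOUT_PATH` is assumed to be an environment variable pointing at the Mahout installation.

```shell
# Sketch only: wraps the five Mahout commands for one state.
# Assumptions (not from the project): the function names, the DRY_RUN switch,
# and MAHOUT_PATH being set as an environment variable.

# Run a command, or only print it when DRY_RUN=1.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run_nb_pipeline() {
  state="$1"
  if [ -z "$state" ] || [ -z "${MAHOUT_PATH:-}" ]; then
    echo "usage: MAHOUT_PATH=<dir> run_nb_pipeline <state_name>" >&2
    return 1
  fi
  mahout="$MAHOUT_PATH/bin/mahout"

  # 1. Convert the state's sequence files into TF-IDF vectors.
  run_step "$mahout" seq2sparse -i "data/classification/${state}" \
    -o "${state}-vectors" || return 1
  # 2. Split the vectors into training and test sets.
  run_step "$mahout" split -i "${state}-vectors/tfidf-vectors" \
    --trainingOutput train-vectors --testOutput test-vectors \
    --randomSelectionPct 40 --overwrite --sequenceFiles -xm sequential || return 1
  # 3. Train the Naive Bayes model.
  run_step "$mahout" trainnb -i train-vectors -el -li labelindex \
    -o model -ow -c || return 1
  # 4. Evaluate on the training set.
  run_step "$mahout" testnb -i train-vectors -m model -l labelindex \
    -ow -o "${state}-testing" -c || return 1
  # 5. Evaluate on the held-out test set.
  run_step "$mahout" testnb -i test-vectors -m model -l labelindex \
    -ow -o "${state}-testing" -c || return 1
}
```

Calling `DRY_RUN=1 run_nb_pipeline NY` prints the commands without executing them, which is a convenient way to check the paths before launching the real jobs.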

View the Confusion Matrix

$ cd Customer-Complaint-Analyses/
$ javac -cp /path/to/jar ProblemClustering.java
$ java -cp /path/to/jar ProblemClustering

Future Work

Contact Us

Feel free to email the authors at the following addresses: