BlakePan/spam-classfication

SMS Spam Classification

This project uses BERT-based models for SMS spam classification.

Download Dataset

The SMS Spam Collection dataset is already downloaded and saved in the data folder.
The original source of the dataset is:
https://archive.ics.uci.edu/dataset/228/sms+spam+collection
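
The UCI distribution of this dataset stores one message per line as a label (ham or spam), a tab character, and the message text. A minimal sketch of parsing that format is shown below. The function name parse_sms_lines and the 0/1 label mapping are assumptions made for illustration and are not taken from this repository; the file name and layout inside the data folder may also differ.

```python
from typing import Iterable, List, Tuple


def parse_sms_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Parse 'label<TAB>text' lines into (label_id, text) pairs.

    Assumed mapping (hypothetical, not from the repository): ham -> 0, spam -> 1.
    """
    label_ids = {"ham": 0, "spam": 1}
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split only on the first tab so tabs inside the message are preserved.
        label, _, text = line.partition("\t")
        if label not in label_ids:
            raise ValueError(f"unexpected label: {label!r}")
        samples.append((label_ids[label], text))
    return samples
```

To use it on a file, pass an open file handle, since iterating a text file yields its lines.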

Install Environment

It is recommended to create a Python virtual environment (for example, with Anaconda) and then install the dependencies with the command below:

$ pip install -r requirements.txt

Training

To train bert-base-uncased for spam classification, run:

$ python train.py -c config-bert.yaml

To train distilbert-base-uncased for spam classification, run:

$ python train.py -c config-distilbert.yaml

You can modify the hyperparameters in these config files.

To enable the TensorBoard writer, add the -tb argument to the training command.
For example:

$ python train.py -c config-bert.yaml -tb
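
For readers curious how a command line like this can be handled, here is a minimal sketch using argparse from the Python standard library. It is not the repository's train.py, which is not shown here; the long option names --config and --tensorboard and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser accepting the -c and -tb flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Train a spam classifier (sketch).")
    # -c takes the path to a YAML config file; --config is a hypothetical alias.
    parser.add_argument("-c", "--config", required=True, help="path to a config file")
    # -tb is a boolean switch; --tensorboard is a hypothetical alias.
    parser.add_argument("-tb", "--tensorboard", action="store_true",
                        help="enable the TensorBoard writer")
    return parser


# Parse the same arguments as the example command.
args = build_parser().parse_args(["-c", "config-bert.yaml", "-tb"])
print(args.config, args.tensorboard)
```

Without -tb, the tensorboard attribute defaults to False, so logging stays off unless requested.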

When training starts, the progress and metrics are printed to the screen.

The experiment was run on Colab. You can check the notebook at the link below: https://colab.research.google.com/drive/1QmWNf6Fo46Qbw0beCvUUJmyQZNZD2bs0?usp=sharing

TensorBoard

The training and validation logs are saved in the runs folder by default.
To compare the results of different experiments, start a TensorBoard server with the command below:

$ tensorboard --logdir runs/

Then open the URL shown on the screen.

TensorBoard should then be displayed in your browser.

Download Fine-tuned Model

This project also provides fine-tuned model weights.
You can download them with the commands below and try the demo directly, without training:

$ wget --no-check-certificate 'https://drive.usercontent.google.com/download?id=1-xCqfqJqOouxPQmeyvtowa9Cg64lAZdm&export=download&authuser=0&confirm=t&uuid=3b24a51f-e24d-4c46-b7fa-52a660109a4d&at=APZUnTV-bGoABfutQpN4NcWsFmjG:1692866582619' -O models.zip
$ unzip models.zip
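
If the unzip command is not available (for example, on Windows), the archive can also be extracted with the zipfile module from the Python standard library. The sketch below is an assumption-labeled alternative, not part of this repository; the helper name extract_archive is hypothetical, and it assumes models.zip has already been downloaded.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_archive(archive_path: str, target_dir: str = ".") -> List[str]:
    """Extract a zip archive into target_dir and return the member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()
```

Calling extract_archive("models.zip") from the project root would mirror the unzip step above.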

Demo

When the models are ready, you can run the demo with the command below.
⚠️ If you fine-tuned your own model, remember to update the model path in config-demo.yaml.

$ python demo.py

Then open the URL shown on the screen.

The demo web UI should then be displayed in your browser.

Here are some samples from the validation set that you can try in the demo.

Hams

0: how come?
1: loosu go to hospital. de dont let it careless.
2: hi my email address has changed now it is
3: k will do, addie & amp ; i are doing some art so i'll be here when you get home
4: aiyo please u got time meh.

Spams

0: recpt 1 / 3. you have ordered a ringtone. your order is being processed...
1: december only! had your mobile 11mths +? you are entitled to update to the latest colour camera mobile for free! call the mobile update co
2: ree entry in 2 a weekly comp for a chance to win an ipod. txt pod to 80182 to get entry ( std
3: sms services for your inclusive text credits pls gotto www. comuk. net login 3qxj9 unsubscribe with
4: free for 1st week! no1 nokia tone 4 ur mob every week just txt nokia to 8007 get txting and tell ur mates www.
