practice/python/sentiment_analysis_api/model.py at master · RelCode/practice

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
# load dataset
def load_data():
    try:
        df = pd.read_csv("data/sentiment_data.csv")
    except Exception as exc:
        # stop with a readable message and a non-zero exit status
        raise SystemExit(f"Error loading dataset: {exc}")
    return df["text"], df["label"]
# train model
def train_model():
    x, y = load_data()
    model = Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),  # converts text into TF-IDF features over unigrams and bigrams
        ("classifier", MultinomialNB()),  # multinomial naive Bayes, a simple algorithm for text classification
    ])
    # train model
    model.fit(x, y)
    return model
# train the model once at import time so other modules can reuse it
sentiment_model = train_model()
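
The module stops once `sentiment_model` is built, so the following is a minimal sketch of how such a pipeline might be queried. It is an illustration, not code from the repository: the toy DataFrame stands in for `data/sentiment_data.csv` (whose contents are not shown here), and `demo_model` and `predict_sentiment` are hypothetical names.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for data/sentiment_data.csv (assumption)
toy = pd.DataFrame({
    "text": [
        "I love this product",
        "great service and friendly staff",
        "terrible experience",
        "I hate the slow delivery",
    ],
    "label": ["positive", "positive", "negative", "negative"],
})

# Same two-step pipeline as train_model(): TF-IDF features, then naive Bayes
demo_model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("classifier", MultinomialNB()),
])
demo_model.fit(toy["text"], toy["label"])


def predict_sentiment(text):
    """Return the most probable label and its probability for one string."""
    probs = demo_model.predict_proba([text])[0]
    best = probs.argmax()
    return demo_model.classes_[best], float(probs[best])


label, confidence = predict_sentiment("love this great service")
print(label, round(confidence, 2))
```

Because the pipeline bundles the vectorizer with the classifier, raw strings can be passed straight to `predict` or `predict_proba`; no separate transform step is needed.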
