Optimizing Naive Bayes-Driven Bukalapak Sentiment Analysis on Twitter Using Data Analysis Refinement Techniques

Nadya Alifa Irsan, Mhd. Reza M.I. Pulungan, M.Sc, Dr-Ing, Prof.

2023 | Undergraduate Thesis | Computer Science

Sentiment analysis for the Indonesian stock market using Twitter data presents challenges related to data availability, language barriers, and cultural differences. This research aims to analyze sentiment toward Bukalapak comprehensively, with a focus on optimizing the sentiment analysis results. The study uses Tweepy to retrieve data through the Twitter API. Two base machine learning models, Multinomial Naive Bayes (MNB) and Bernoulli Naive Bayes (BNB), are employed to create an optimized, hyperparameter-tuned model.

Based on a dataset gathered with the keyword "Bukalapak", the study found that Bukalapak elicits predominantly positive sentiment, followed by neutral and then negative sentiment. When tested on the same dataset, the Bernoulli Naive Bayes model achieved higher accuracy (81.8%) than the Multinomial Naive Bayes model (71%). After hyperparameter tuning, the optimized Bernoulli Naive Bayes model achieved the best performance, with an accuracy of 95.3%.

This study contributes to understanding sentiment on Twitter within the context of Bukalapak. It provides insights into the distribution of sentiment toward Bukalapak and highlights the superior performance of the hyperparameter-tuned Bernoulli Naive Bayes model for sentiment analysis. These findings offer guidance for data-driven decision-making and for sentiment analysis in similar contexts.
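
To make the comparison between the two base models more concrete, the following is a minimal sketch, not the thesis's implementation. The abstract does not state which library or which hyperparameters were tuned, so this sketch assumes that the tuned hyperparameter is the additive smoothing value `alpha`. The class `TinyNB`, the helper `tune_alpha`, and the toy tweets are all hypothetical, and the printed accuracies are unrelated to the figures reported above.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; real tweets would need proper cleaning.
    return text.lower().split()


class TinyNB:
    """Toy Naive Bayes classifier; kind is 'bernoulli' or 'multinomial'."""

    def __init__(self, kind="bernoulli", alpha=1.0):
        self.kind = kind
        self.alpha = alpha  # additive smoothing, must be > 0

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.vocab = {w for t in texts for w in tokenize(t)}
        self.n_docs = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            words = tokenize(text)
            # Bernoulli counts a word once per document (presence);
            # multinomial counts every occurrence.
            self.counts[label].update(set(words) if self.kind == "bernoulli" else words)
        return self

    def _log_score(self, words, c):
        score = math.log(self.n_docs[c] / sum(self.n_docs.values()))
        if self.kind == "bernoulli":
            present = set(words)
            # Every vocabulary word contributes, whether present or absent.
            for w in self.vocab:
                p = (self.counts[c][w] + self.alpha) / (self.n_docs[c] + 2 * self.alpha)
                score += math.log(p if w in present else 1.0 - p)
        else:
            total = sum(self.counts[c].values()) + self.alpha * len(self.vocab)
            # Only words that occur in the document contribute.
            for w in words:
                if w in self.vocab:
                    score += math.log((self.counts[c][w] + self.alpha) / total)
        return score

    def predict(self, texts):
        return [max(self.classes, key=lambda c: self._log_score(tokenize(t), c))
                for t in texts]


def accuracy(model, texts, labels):
    preds = model.predict(texts)
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def tune_alpha(kind, train, val, grid=(0.01, 0.1, 0.5, 1.0)):
    # Plain grid search: keep the alpha with the best validation accuracy.
    best = None
    for alpha in grid:
        acc = accuracy(TinyNB(kind, alpha).fit(*train), *val)
        if best is None or acc > best[1]:
            best = (alpha, acc)
    return best


# Hypothetical toy data, for illustration only.
train = (
    ["love bukalapak fast delivery", "great promo great price",
     "bukalapak app keeps crashing", "refund is slow and bad",
     "bukalapak opens new office", "bukalapak shares listed today"],
    ["positive", "positive", "negative", "negative", "neutral", "neutral"],
)
val = (
    ["great price fast delivery", "app is slow and keeps crashing"],
    ["positive", "negative"],
)

if __name__ == "__main__":
    for kind in ("multinomial", "bernoulli"):
        alpha, acc = tune_alpha(kind, train, val)
        print(f"{kind}: best alpha={alpha}, validation accuracy={acc:.2f}")
```

The two variants differ only in how they treat features: the Bernoulli model scores the presence or absence of every vocabulary word, while the multinomial model scores word counts. On short texts such as tweets, where words rarely repeat, that difference is one plausible reason the two models can perform differently on the same dataset.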

Keywords: Sentiment Analysis, Naive Bayes Classifiers, Twitter

  1. S1-2023-440456-abstract.pdf  
  2. S1-2023-440456-bibliography.pdf  
  3. S1-2023-440456-tableofcontent.pdf  
  4. S1-2023-440456-title.pdf