E-COMMERCE CUSTOMER SATISFACTION ANALYSIS ON MICROBLOGS

Technology influences many parts of people's lives, from socialization to shopping behavior, and the online marketplace has become an established part of e-commerce. Opinions about shopping activities are therefore worth studying. This paper investigates appropriate methods for analyzing e-commerce customer satisfaction. The research was conducted by compiling aspects of customer satisfaction and then determining an appropriate lexicon-based method for sentiment analysis. A dataset of 88,816 tweets was collected from the Twitter microblog using a predefined set of keywords related to five e-commerce organizers. To analyze the data in a fine-grained manner, we propose six aspects of customer satisfaction derived from a literature study. We then use the Word Cloud visualization technique and topic modeling to obtain 73 keywords for categorizing tweets into aspects. Our experiments cover two scenarios: Scenario 1 compares several opinionated dictionaries, while Scenario 2 compares different approaches for computing sentiment scores. Scenario 1 yields an accuracy of 0.54, while the highest accuracy in Scenario 2 is 0.46, so the method from Scenario 1 is selected. Applying the lexicon-based (or dictionary-based) method shows that positive sentiment is dominant in every aspect, both across all e-commerce organizers and within each one, and that the most dominant aspect, overall and for each organizer, is the "product" aspect. The implication of this research is to increase knowledge about customer satisfaction and about the variety of dictionaries and lexicon-based methods that can serve as references for further research. In addition, for e-commerce organizers, this research can support analysis for improvement and policy making.


INTRODUCTION
Digital growth in Indonesia is rapid, as seen in the number of internet users, which increases every year (Tayibnapis, Wuryaningsih, & Gora, 2018). In 2021, 202.6 million people out of a total population of 274.9 million used the internet, an increase of 15.5% compared to 2020. One hundred seventy million people are active users of microblogging social media, which means that more than half of Indonesians, namely 61.8%, use social media (Social, 2021). The increase in internet users and in interactions on microblogging social media allows people who are concerned with a particular product to express their opinions directly (Stieglitz & Dang-Xuan, 2013). It has also created new habits, such as making appointments, studying, and even shopping online. Online shopping has increased very sharply, especially in Indonesia. According to records published by The UK Institute of Merchant Machines, Indonesia ranks first among the ten countries with the fastest e-commerce growth in the world (Widowati, 2019). E-commerce uses various media to expand its network, including websites and social media such as Instagram, Facebook, and Twitter (Singh & Singh, 2018). iPrice released a map of e-commerce in Indonesia whose top five players are Tokopedia, Shopee, Bukalapak, Lazada, and Blibli (Iprice, 2020).
JURNAL SYNTAX IDEA p-ISSN: 2723-4339 e-ISSN: 2548-1398 Vol. 5, No. 1, Januari 2023
E-commerce is closely related to various social media platforms, including Twitter, which ranks fifth among the most frequently used microblogging sites in Indonesia and is used by 63.6% of social media users (Social, 2021). On Twitter, e-commerce organizers can share various things with their consumers, such as ongoing programs, promos, and gifts that can increase consumer loyalty, and can also provide guidance when problems occur in e-commerce. Consumers can likewise interact and express their opinions about e-commerce through social media such as Twitter. Besides connecting users and carrying their opinions, Twitter offers other facilities. One of them is the ability to explore data from tweets, that is, opinions from various users. This programmatic access uses the Twitter API, so that data can be retrieved and extracted into useful information. The information has various uses, one of which is to serve as a basis for decision making in an organization or company.
Improvements in technology, developments in e-commerce, and the opinions conveyed through Twitter are worth investigating. One possible direction is sentiment analysis. Sentiment analysis, also called opinion mining, extracts and analyzes people's opinions, attitudes, perceptions, and appraisals toward different entities such as topics, individuals, products, and services. Three analytical techniques can be used for sentiment analysis: data-driven methods, lexicon-based models, and the hybrid approach. Data-driven analysis uses machine learning algorithms and feature engineering to classify sentiment orientation automatically. The lexicon-based approach uses an opinionated lexicon containing a list of opinion words and phrases that express positive or negative sentiment (Yusof, Mohamed, & Abdul-Rahman, 2015). The hybrid approach combines machine learning and lexicon-based approaches to perform sentiment analysis tasks (Birjali, Kasri, & Beni-Hssane, 2021). Sentiment analysis has various benefits, such as revealing the sentiment toward a product or service so that a company can make a decision (Jain & Kumar, 2017). It can also reveal public opinion about a given subject. In e-commerce, consumers are an essential factor and a key consideration when a company decides what to reduce or increase. Under these conditions, there is a gap between the large volume of available information and the techniques or methods for managing it so that it becomes valuable, especially in e-commerce. Therefore, in this work, we discuss aspects of customer satisfaction and the distribution of sentiment orientation for five major e-commerce providers.

RESEARCH METHODS
The method used in this research consists of the following steps: literature review, data collection, data arrangement, preparation of customer satisfaction aspects, aspect categorization, lexicon-based sentiment analysis scenarios, method selection, distribution of sentiment orientation in each e-commerce, and finally conclusions and suggestions (Liu, Zhou, Jiang, & Zhang, 2020).
This study proceeds in several stages, from data collection to conclusion. The details are shown in Figure 1. The first stage collects tweets from the Twitter microblog. Tweets were retrieved through the Twitter API using the names of five e-commerce organizers as keywords, namely Blibli, Bukalapak, Lazada, Shopee, and Tokopedia. Data were collected in CSV format from June 11 to June 30, 2021. During data management, the data were aggregated and cleaned of duplicates and of useless or meaningless tweets using Orange. Data management also includes categorizing the data into aspects of customer satisfaction, which yields the aspect data. In addition, test data were set aside to be labeled manually. This arrangement creates several different datasets according to their respective uses.
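The duplicate-removal step can be illustrated with a minimal sketch. The study performed this cleaning in Orange; the function name and the normalization rule used here (lowercasing and collapsing whitespace) are assumptions made only for illustration, and the sample tweets are invented.

```python
def remove_duplicates(tweets):
    """Drop empty tweets and repeated tweets, keeping the first occurrence."""
    seen = set()
    cleaned = []
    for text in tweets:
        # Assumed normalization: lowercase and collapse runs of whitespace.
        key = " ".join(text.lower().split())
        if key and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned


# Invented sample tweets; the second is a near-verbatim repeat of the first.
sample = ["Promo Shopee hari ini", "promo  shopee hari ini", "", "Tokopedia cepat"]
print(remove_duplicates(sample))
```

A stricter or looser notion of "duplicate" (for example, ignoring URLs or mentions) would change how many tweets survive this step.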
Preprocessing is the initial stage before the data are processed further. Two preprocessing pipelines are used, because the Word Cloud visualization technique and topic modeling process data differently. The Word Cloud process uses widgets in Orange, while the topic modeling process uses data science libraries in the Python programming language. For Word Cloud, preprocessing includes transformation, tokenization, and filtering. For topic modeling, preprocessing includes case folding, tokenization, normalization, and stop word removal.
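The four topic modeling preprocessing steps named above can be sketched as follows. This is an illustration using only the Python standard library, not the libraries used in the study; the stop word set and the slang normalization map are tiny hypothetical examples, and a real run would need full Indonesian resources.

```python
import re

# Hypothetical resources, far smaller than any real stop word or slang list.
STOP_WORDS = {"yang", "di", "dan", "ini"}
SLANG_MAP = {"gak": "tidak", "bgt": "banget"}


def preprocess(text):
    """Apply case folding, tokenization, normalization, stop word removal."""
    folded = text.lower()                                   # case folding
    tokens = re.findall(r"[a-z]+", folded)                  # tokenization
    normalized = [SLANG_MAP.get(t, t) for t in tokens]      # normalization
    return [t for t in normalized if t not in STOP_WORDS]   # stop word removal


print(preprocess("Pengiriman di Lazada GAK cepat bgt!"))
```

The letters-only tokenizer is a simplifying assumption; it discards digits, hashtags, and emoji, which may or may not match the behavior of the study's pipeline.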
The Word Cloud stage finds the 100 most frequent words in the collection of tweets for each e-commerce, which gives 500 words in total for the five e-commerce organizers. These words serve as input that is then processed into aspect keywords. The topic modeling stage finds topics and the words that are related to one another. Words from the topic modeling process are also used as input for the aspect keywords. The topic modeling method used in this research is Latent Dirichlet Allocation (LDA). Aspect keywords are thus the words obtained from Word Cloud extraction and topic modeling, and they are used to categorize tweets into aspects of customer satisfaction. Each aspect of customer satisfaction has a different number of keywords. The extraction is carried out manually using Excel.
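The frequency count behind the Word Cloud stage can be sketched as below. The study used the Word Cloud widget in Orange; this standard-library version, the function name top_words, and the sample tweets are assumptions for illustration only.

```python
from collections import Counter


def top_words(tweets, n=100):
    """Return the n most frequent whitespace-separated tokens with counts."""
    counts = Counter()
    for text in tweets:
        counts.update(text.lower().split())
    return counts.most_common(n)


# Invented tweets standing in for one e-commerce collection.
tweets = ["barang bagus", "barang cepat sampai", "harga barang murah"]
print(top_words(tweets, n=1))
```

Running such a count once per e-commerce collection with n=100 gives the 5 x 100 = 500 candidate words from which the aspect keywords are then selected by hand.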
The aspects of customer satisfaction are prepared from a literature study covering both general literature and literature specific to e-commerce. Each e-commerce is analyzed against these aspects to find the dominant aspects across all e-commerce and within each one. The data used for this analysis are the aspect data, that is, the result of categorizing tweets into aspects. Sentiment analysis is done by compiling scenarios and choosing the most suitable method for e-commerce consumer sentiment analysis. At this stage, three annotators manually label the test data, and their agreement is assessed using Fleiss' Kappa. This study runs two scenarios: the first compares dictionaries and the second compares formulas. The dictionaries are compared by source, and scoring is carried out on scales of 1 and 5. We also compare several ways of calculating sentiment scores. The scenarios are tested on the labeled data, and the selected method is then applied to the aspect data. The final step draws conclusions from the experiments and provides suggestions that build on the research.
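Testing a scenario on labeled data amounts to comparing predicted labels against the annotators' labels. The sketch below computes accuracy together with precision, recall, and F1. The paper does not state how the per-class values are averaged, so macro-averaging is an assumption here, and the function name and sample labels are hypothetical.

```python
def evaluate(gold, predicted, labels=("positive", "neutral", "negative")):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    pairs = list(zip(gold, predicted))
    accuracy = sum(g == p for g, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    # Macro-averaging (an assumption): every class counts equally.
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Invented gold and predicted labels for four tweets.
gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "positive", "neutral", "negative"]
print(evaluate(gold, pred))
```

Two methods can share the same accuracy while differing in precision, recall, and F1, because those values depend on which classes the errors fall into.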

RESULT AND DISCUSSION
This section describes the results of the study, starting with a description of the data from collection to processing, followed by the results related to aspects of customer satisfaction. Finally, the testing of relevant lexicon-based methods for e-commerce is presented and discussed.
The data in the study are divided into several datasets according to their uses. Details of the datasets are given in Table 1. Data were crawled through the Twitter API for each e-commerce, using the five e-commerce names as keywords. The data were then cleaned so that duplicate instances were removed. The cleaned data are used in the Word Cloud and topic modeling processes. The data are then cleaned a second time, because some tweets are useless or carry no meaning. For example, some e-commerce events require participants to tweet specific hashtags to take part in a giveaway, and these tweets are of no use in the subsequent process.
The data cleaned for the second time are then categorized into aspects of customer satisfaction using keywords compiled from Word Cloud extraction and topic modeling. Each aspect of customer satisfaction has a different number of keywords. The result of this categorization is called the aspect data, which is used in further analysis. From the aspect data, 300 tweets were randomly selected to be labeled manually and used as experimental data for the lexicon methods. The labeled data consist of 10 tweets from each aspect of each e-commerce. Each tweet is labeled by three annotators and receives a final label. Of the 300 tweets, 80 are labeled negative, 61 neutral, and 159 positive.
To analyze customer satisfaction, we first need to define its aspects. The aspects were prepared through a review of five selected studies. The first is the research by Prasetiyo and Fazarriyawan, which compiled several aspects consisting of product and service quality, price aspects, service quality, emotional factors, and cost aspects of convenience (Prasetiyo & Fazarriyawan, 2020). The second study, by Moon, Talha, and Salehin (2021), arranged customer satisfaction into three aspects: delivery time, product quality, and the return system.
The third study, by Nguyen (2020), arranges customer satisfaction into five aspects: online shopping experience, external incentives, service, security and privacy, and personal characteristics. The fourth study (Calandra Alencia Haryani, Hamim Tohari, Marhamah, 2018) describes customer satisfaction in five aspects: system quality, information quality, service quality, features, and usability value. The fifth study (Liu et al., 2020) divides customer satisfaction into six aspects: product, staff, logistics, price, information, and systems. Based on the analysis of these five studies, six aspects are compiled in Table 2. The five studies share some aspect names and meanings. Some aspects are broader in one study and narrower in another, and several have different names but the same meaning. For these reasons, this study arranges customer satisfaction into six aspects, each with a description and purpose defined for this research.
The next stage presents the results of Word Cloud extraction and topic modeling. Word Cloud generates 500 words from the five e-commerce organizers, 100 words each. Topic modeling also generates words that become input for the aspect keywords. The extraction and exploration of words from Word Cloud and topic modeling are done manually using Excel. The result is 73 keywords for the six aspects, such as "transaction", "money", and "account". Each aspect has a different number of keywords: the product aspect has 14, the seller aspect 10, the logistics aspect 14, the price aspect 11, the system aspect 13, and the service aspect 11. The keywords are used to categorize the clean dataset into the aspect dataset, which is processed later.
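Keyword-based categorization of tweets into aspects can be sketched as follows. The keyword sets shown are hypothetical stand-ins, since the study's 73 keywords are not listed in the text, and the assumption that one tweet may belong to several aspects is ours.

```python
# Hypothetical keyword sets for three of the six aspects.
ASPECT_KEYWORDS = {
    "product": {"barang", "produk"},
    "logistics": {"kurir", "pengiriman"},
    "price": {"harga", "diskon"},
}


def categorize(tweet):
    """Return every aspect whose keyword set overlaps the tweet's tokens."""
    tokens = set(tweet.lower().split())
    return [aspect for aspect, words in ASPECT_KEYWORDS.items()
            if tokens & words]


print(categorize("Harga barang naik"))    # matches two aspects
print(categorize("Giveaway hari ini"))    # matches none
```

Tweets that match no keyword fall outside the aspect data, which is one reason the aspect dataset is smaller than the cleaned dataset.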
Categorization into aspect data reduces the dataset further, to 12,995 tweets spread across all e-commerce. Figure 2 presents the aspects for all e-commerce.

Figure 2. Aspects of All E-commerce
The dominant aspect across all e-commerce is the product aspect, which accounts for 39% of the aspect data. This indicates that products are the topic most often discussed or mentioned by e-commerce users.
Classification begins by preparing data to test the methods under study. The test data are the 300 labeled tweets taken randomly from the aspect data and tagged manually by three annotators. The agreement value computed with Fleiss' Kappa is 0.50, which indicates moderate agreement among the annotators. Two scenarios were tested. The first scenario tests three different dictionaries. The second scenario tests the formulation of the score calculation on the lexicon. The experimental scenario scheme is shown in Figure 3.
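For reference, Fleiss' Kappa can be computed from a table that records, for each tweet, how many annotators chose each label. The sketch below follows the standard definition; the function name and the small rating table are hypothetical and are not the study's annotations.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table of per-item category counts.

    ratings[i][j] is the number of annotators who assigned item i to
    category j; every row must sum to the same number of annotators.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    total = n_items * n_raters
    # Share of all assignments that fall into each category.
    p_cat = [sum(row[j] for row in ratings) / total
             for j in range(len(ratings[0]))]
    # Observed agreement for each item, averaged over items.
    p_item = [(sum(c * c for c in row) - n_raters)
              / (n_raters * (n_raters - 1)) for row in ratings]
    p_bar = sum(p_item) / n_items
    # Agreement expected by chance.
    p_exp = sum(p * p for p in p_cat)
    return (p_bar - p_exp) / (1 - p_exp)


# Invented table: 4 tweets, 3 annotators, columns = negative, neutral, positive.
table = [[3, 0, 0], [0, 3, 0], [0, 0, 3], [1, 1, 1]]
print(round(fleiss_kappa(table), 3))
```

The value is undefined when every annotator uses a single category for all items, because the chance agreement is then 1.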

Figure 3. Experimental Scenario Schematic
In Scenario 1, the formula adds up the positive and negative scores, with a special clause under which a word with a negative connotation can make the sentiment negative. Scenario 1 compares three dictionaries. The first is InSet, compiled by Koto and Rahmaningtyas (2018). The second is a modified version of InSet used in the research by Mahendrajaya, Buntoro, and Setyawan (2019). The third is a scale-one dictionary, modified in this study so that its values are 1 and -1, in contrast to InSet, which has a scale of five, from -5 to 5. This modification tests whether different dictionary values lead to a significant difference in results. The Scenario 1 experiments are shown in Table 3. The results show the same accuracy for every dictionary. The D1 and D2 dictionaries, namely InSet and its modified version, have exactly the same accuracy, precision, recall, and F1 scores, so it can be concluded that adding slang and other sentiment words to InSet did not change the results. The InSet dictionary whose values were modified to a scale of 1 also has the same accuracy as the original InSet dictionary, although its precision, recall, and F1 score differ.
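The summation formula and the scale-one modification can be illustrated with a small sketch. The lexicon entries and weights below are hypothetical examples in the style of a -5 to 5 dictionary and are not taken from InSet; the special clause for words with a negative connotation is not reproduced, because the text does not specify it.

```python
# Hypothetical scale-5 entries (weights between -5 and 5).
LEXICON_5 = {"bagus": 4, "cepat": 3, "kecewa": -4, "lambat": -3}


def to_scale_one(lexicon):
    """Collapse each weight to its sign, giving a -1/+1 dictionary."""
    return {word: (1 if weight > 0 else -1)
            for word, weight in lexicon.items() if weight != 0}


def sum_score(tokens, lexicon):
    """Add up the weights of all tokens found in the lexicon."""
    return sum(lexicon.get(token, 0) for token in tokens)


def label(score):
    """Map a numeric score to a sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


tokens = "barang bagus tapi pengiriman lambat".split()
print(label(sum_score(tokens, LEXICON_5)))                # 4 - 3 = 1
print(label(sum_score(tokens, to_scale_one(LEXICON_5))))  # 1 - 1 = 0
```

In this invented example the same tweet is positive on the scale-5 dictionary and neutral on the scale-one dictionary, which shows how per-class results can shift between scales even when overall accuracy stays the same.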
Scenario 2 (Table 4) compares scoring formulas. The first formula adds up the scores, and the second takes the maximum value. The accuracy results in Scenario 2 differ between methods. With the first formula (F1 and F3), the dictionary with a scale of 1 achieves higher accuracy than the dictionary with a scale of 5. With the second formula (F2 and F4), by contrast, the accuracy of scale 1 is lower than that of scale 5. Of the four methods compared, those using the first formula (F1 and F3) have higher accuracy. Summation is therefore still the better choice, and the use of maximum and minimum values with a scale of 100 and -100 still needs to be developed. Some opinions from e-commerce customers contain more than one statement in a single sentence. In addition, customers often express their opinions by comparing e-commerce providers.
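The difference between the two formulas can be sketched as below. The text does not define the second formula precisely, so reading "maximum value" as the single matched score with the largest magnitude is an assumption, as are the function names and the sample scores.

```python
def sum_formula(scores):
    """Formula 1: add every matched word score."""
    return sum(scores)


def max_formula(scores):
    """Formula 2 (assumed reading): keep the score of largest magnitude."""
    return max(scores, key=abs, default=0)


# Invented matched word scores for one tweet with mixed sentiment.
scores = [2, 2, -3]
print(sum_formula(scores))  # 2 + 2 - 3 = 1, positive overall
print(max_formula(scores))  # -3, negative overall
```

On a scale-one dictionary every matched score has magnitude 1, so this reading of the second formula cannot rank words by strength and simply returns the first match, which is one plausible reason for it to behave differently across scales.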
Therefore, the second formula needs further development, which can be done in future studies. The experiments produced a comparison of the accuracy of each scenario. Table 5 shows the comparison of the various methods. Based on Table 5, the method chosen for sentiment analysis of the five e-commerce providers is Scenario 1. Because all dictionaries in Scenario 1 have the same accuracy, Scenario 1 is applied with the InSet dictionary. This method is then applied to each e-commerce. The results of the sentiment analysis for Blibli are shown in Figure 4. For Blibli, all aspects have a positive sentiment, and the dominant aspect is the product aspect. Even though its sentiment is positive, Blibli should pay attention to the price aspect, where the difference between positive and negative sentiment is relatively small. Figure 5 illustrates the distribution of positive and negative sentiment for each aspect of Bukalapak's customer satisfaction.

Figure 5. Bukalapak Sentiment
For Bukalapak, all aspects have a positive sentiment. However, the seller aspect has only a small difference between positive and negative sentiment, so Bukalapak can make further improvements, such as in seller services to customers and other matters related to sellers. Figure 6 illustrates the distribution of positive and negative sentiment for each aspect of Lazada's customer satisfaction. For Lazada, the overall sentiment is positive. The smallest difference between positive and negative sentiment is in the system aspect. The system aspect, which covers the website, application use, payment mechanisms, and security, needs further improvement even though its sentiment is already positive. Improving the system aspect can push positive sentiment higher and reduce negative sentiment.
Figure 7 gives an overview of the distribution of positive and negative sentiment for each aspect of Shopee's customer satisfaction. Shopee has a positive sentiment on every aspect of customer satisfaction. The aspect with the smallest difference between positive and negative sentiment is the system aspect. Shopee can further improve its performance on the system aspect, especially in terms of application use, payment mechanisms, and security. The last figure, Figure 8, shows the distribution of Tokopedia's sentiment.

Figure 8. Tokopedia sentiment
For Tokopedia, the sentiment in every aspect is positive. Even so, Tokopedia can improve the system aspect, where the difference between positive and negative sentiment is small. Improvements to payment systems and security can reduce negative sentiment.
Overall, the sentiment distribution for the five e-commerce organizers is positive. This result is in line with the research by Hakkinen, Wong, Anggreainy, and Hidayat (2021), whose sentiment results are also positive and whose values are consistent. Even though the sentiment is positive, each e-commerce organizer can still improve in different aspects, because each has a different sentiment distribution and focus.
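The per-provider reading above, in which the aspect with the smallest difference between positive and negative sentiment is singled out for improvement, can be sketched as a simple count. The function name and the sample data are hypothetical and do not reproduce the study's figures.

```python
from collections import Counter


def narrowest_gap(labeled):
    """Find the aspect whose positive-minus-negative count is smallest.

    labeled is a list of (aspect, sentiment) pairs for one provider.
    Returns a tuple (aspect, gap).
    """
    counts = Counter(labeled)
    aspects = {aspect for aspect, _ in counts}
    gaps = {a: counts[(a, "positive")] - counts[(a, "negative")]
            for a in aspects}
    aspect = min(gaps, key=gaps.get)
    return aspect, gaps[aspect]


# Invented labels: product is clearly positive, system only narrowly so.
data = ([("product", "positive")] * 5 + [("product", "negative")] * 1
        + [("system", "positive")] * 3 + [("system", "negative")] * 2)
print(narrowest_gap(data))
```

Using the raw difference favors aspects with few tweets; a ratio of positive to negative counts would be an alternative design choice when aspect sizes differ widely.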

CONCLUSION
In this study, customer satisfaction was composed of six aspects: product, seller, price, logistics, service, and system. From Word Cloud and topic modeling, 73 aspect keywords were obtained for categorizing the data into aspect data. The dominant aspect of the six is the product aspect, which is dominant not only across all e-commerce but also within each of them. The methods tested in this research cover two scenarios, namely dictionary comparison and formula comparison. The scenarios were tested on labeled data consisting of 300 tweets. Scenario 1 produces an accuracy of 0.54 for each of the three dictionaries. The four tests in Scenario 2 produce accuracy values of 0.46, 0.30, 0.45, and 0.33. Hence, the chosen method is Scenario 1.
Further research can examine data from other e-commerce providers beyond the five studied here. It can also explore aspects of customer satisfaction that differ from those used in this research. Finally, this research can be developed further by experimenting with more diverse methods and with larger datasets.