Sentiment Analysis with Supervised Learning Techniques: A Survey

Objectives: This study has two main goals. The first is to present the core concepts of sentiment analysis (SA): its mechanisms, its categorization, and its techniques. The second is to provide a comprehensive study of the supervised learning techniques used in SA classification, summarizing the work conducted in this area and tracking recent developments. Methods: To achieve the first goal, several important surveys, including recent and relevant works, are analyzed to cover the concepts surrounding SA. For the second goal, the most important reports on the use of supervised learning techniques in SA, from earlier work up to 2019, are investigated, analyzed, and compared. Findings: This study presents a comprehensive review of the supervised machine learning classifiers used in SA, their recent techniques and enhancement methods, and suggestions for future work. Some challenges in this area remain open, such as mining complex reviews and identifying implicit aspects. The language of the sentiment is also a challenge, since handling each language according to its own attributes is a difficult task, as is the sentiment domain issue. Application/improvements: The information provided can be used by researchers and institutions in assessing opinions and analyzing sentiment, in identifying different trends, and in recommending future research directions.


Introduction
Generally, textual information falls into two categories: objective information, which contains only facts (objective statements about entities or events), and subjective information, which expresses opinions. Sentiment analysis focuses on three fundamental components: the opinion holder or source of the opinion, the object about which the opinion is expressed, and the opinion expression [11]. The term object O, sometimes called entity, represents the intended target of the opinion expression. It is associated with a pair, O(T, A), where T is a hierarchy of components (or parts), sub-components, and so on, and A is a set of attributes of O. Every component or sub-component likewise has its own set of attributes. Liu defines an opinion as a quintuple $(e_i, a_{ij}, o_{ijkl}, h_k, t_l)$, where $e_i$ is the name of an entity, $a_{ij}$ is an aspect of $e_i$, $o_{ijkl}$ is the sentiment on aspect $a_{ij}$ of entity $e_i$, $h_k$ is the opinion holder, and $t_l$ is the time when the opinion is expressed. An entity is the target object of an opinion; it is a product, service, topic, person, or event. The aspects represent parts or attributes of an entity (part-of relation). The sentiment is positive, negative, neutral, or expressed with intensity levels. The indices i, j, k, l indicate that the items in the definition must correspond to one another. Sentiment analysis is considered a classification process, as illustrated in Figure 1. It has been studied mainly at three levels. The first, the document level, aims to find the author's general sentiment (positive or negative) in an opinionated document; it assumes that each document is associated with a single object and contains opinions from a single holder [12]. The second is the sentence level, since a single document sometimes contains several opinions, even about the same entities.
The task at this level goes to the sentences and determines whether each sentence expresses a positive, negative, or neutral sentiment. Before analyzing the polarity of the sentences, however, we must determine whether the sentences are subjective or objective and keep only the subjective ones. The two previous levels are appropriate when either the whole document or each individual sentence refers to a single entity. In many cases, however, people talk about entities that have many aspects (attributes), and they hold a different opinion about each aspect. This often happens in reviews of products or in discussion forums dedicated to specific product categories (for example, cars, cameras, smartphones, and even pharmaceutical drugs). The third level, the aspect level, therefore focuses on recognizing all sentiment expressions within a given document and the aspects to which they refer [13]. The aspect level was earlier called the feature level [14]. Its goal is to find opinions on entities as well as on their aspects. For instance, in the sentence "this camera can take high-quality photos, but its battery life is short," the sentiment evaluates two aspects of the target entity, the camera: the picture quality, which is positive, and the battery life, which is negative. In aspect-level sentiment classification, the opinion is determined from the already extracted aspects [15]. In addition, the sentiment of a text can be explicit or implicit. If explicit, the text directly expresses an opinion, for example "It is a great car"; if implicit, the text implies an opinion, as in "the charger works for multiple weeks".
Data classification is accomplished in two stages. The first is the learning process, in which the training data are analyzed by a classification technique (a classification model is learned). The second is the classification process, in which the test data are used to predict class labels for the given data (Figure 2). Because the class label of each training tuple is given, this step is also known as supervised learning (i.e., the learning of the classifier is "supervised" in that it is told to which class each training tuple belongs). It contrasts with unsupervised learning (or clustering), in which the class label of each training tuple is not known and the number or set of classes to be learned may not be known in advance. Sentiment classification is applied in various domains. The most common domains are movie reviews and customer reviews in the market sector, and much research has been done in these areas [16]. News is another domain studied by researchers [17][18]. The kind of data used in sentiment classification differs from one domain to another, as well as from language to language. In other words, a sentiment analysis system that works well for movie reviews may not work as well for customer reviews [19]. This issue stems from the diversity of sentiment from one domain to another. Consequently, sentiment classification is a highly domain-specific problem [18]. In recent years [21,22], a large number of surveys have been conducted on SA and its related tasks. Most of these studies covered topics such as web data extraction and analysis, polarity and subjectivity, feature selection [23][24], sentiment analysis of comparative sentences, opinion search and retrieval techniques, different classification methods and machine learning techniques for various emotion analysis tasks, and finally fake review or spam detection [25][26][27][28].
Schouten and Frasincar conducted a comprehensive survey of aspect-level sentiment analysis [29]. Another survey covered methods for feature selection and sentiment classification; it described the feature selection techniques and gave an in-depth discussion of classification strategies and related articles [30]. A further survey summarized fifty-four papers by the task performed, the domain, the algorithm used, the polarity, the data scope, the data source, and the type of language [31]; the researchers' major concern was to analyze the methods used in the surveyed articles. A survey of multiple sentiment analysis aspects for the period 2002-2014 noted some intelligent techniques such as random forests, evolutionary computation, association mining, fuzzy rule-based systems, rule miners, and Conditional Random Field (CRF) theory. Formal concept analysis, radial basis function neural networks (RBFNN), and online learning algorithms have not yet been fully exploited in SA. Furthermore, logic, online learning algorithms, and ontologies can all be very useful, especially in large data cases [32]. Ayyoub et al. provide an overview of the research on Arabic SA (ASA) so far. They discuss the papers that research groups have published on SA-related issues and attempt to identify the gaps for future studies in this area [33].
Research on sentiment analysis revolves around six main topics. (1) The problem of sentiment analysis, formalized by introducing the basic definitions, concepts, and issues. (2) Sentiment and subjectivity classification, which treats sentiment analysis as a text classification problem; two sub-topics have been widely studied, subjectivity and polarity. (3) Aspect-based sentiment analysis, which looks for opinion targets (objects) and their components (attributes and features). (4) Sentiment analysis of comparative sentences. Here the evaluation of an object can be done in two main ways: a direct opinion, which gives an opinion about the object without mentioning any other similar objects (e.g., "the resolution of this screen is good"), and a comparison, which compares the object with other similar objects (e.g., "the resolution of this screen is better than that of screen-x"). (5) Opinion search and retrieval; opinion search is a combination of information retrieval and sentiment analysis. (6) Opinion spam, which refers to fake opinions that try to mislead readers or automated systems by giving undeserved opinions to some target objects in order to promote the objects or damage their reputation.
The contribution of this survey is a detailed categorization of a significant number of recent works according to the methods they use. This categorization allows researchers who are familiar with specific techniques to apply them in the field of SA and to choose the appropriate technique for a specific application. In addition, the surveyed studies provide an overview of the importance of feature selection in improving the performance of sentiment classification algorithms, particularly in machine learning techniques.
This review is structured as follows: Section 2 covers sentiment analysis approaches and defines the techniques of supervised machine learning. Section 3 describes current research gaps and key challenges of sentiment classification. Finally, the conclusion and future research trends are addressed in Section 4.

Sentiment Analysis Approaches
The approaches to SA can be grouped into three categories: the machine learning approach, which includes supervised and unsupervised learning; the lexicon-based approach, which depends on finding the opinion lexicon and comprises two techniques [34], dictionary-based and corpus-based; and, in certain conditions, the hybrid approach, which combines machine learning with the lexicon-based approach and achieves relatively better performance [35]. Figure 3 illustrates the classification approaches. The dictionary-based approach begins with a small set of seed opinion words suitable for the domain at hand. This set of words is then expanded by searching the dictionary for their synonyms and antonyms [36], whereas the corpus-based approach starts with a seed list of opinion words and then finds other opinion words in a large corpus, helping to discover opinion words with context-specific orientations. This can be done using statistical or semantic methods.

Lexicon-Based Approach
A sentiment lexicon contains lists of words and expressions used to express people's subjective feelings and opinions. Lexicon-based sentiment analysis techniques are unsupervised because the classification of data does not require prior training. Lexicon-based approaches use a lexicon to characterize opinions by counting and weighting sentiment-related words. Two main approaches are used. One is the dictionary-based approach, which collects an initial set of opinion words and then searches a dictionary for their synonyms and antonyms to expand this set. The other is the corpus-based approach, which uses a set of opinion words with known polarity and then identifies other opinion words in a large corpus together with their orientations.
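The counting idea behind lexicon-based classification can be sketched in a few lines. This is a minimal illustration, not a method taken from any of the surveyed works: the lexicon entries, their polarity values, and the function names `lexicon_score` and `lexicon_label` are all assumptions chosen for the example.

```python
# Toy sentiment lexicon: the words and their polarity values are
# illustrative assumptions, not entries from a published lexicon.
LEXICON = {"good": 1, "great": 1, "excellent": 1,
           "bad": -1, "poor": -1, "short": -1}


def lexicon_score(text, lexicon=LEXICON):
    """Sum the polarity of every lexicon word that occurs in the text."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    return sum(lexicon.get(token, 0) for token in tokens)


def lexicon_label(text, lexicon=LEXICON):
    """Map the summed score to a positive, negative, or neutral label."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


example_label = lexicon_label("It is a great car.")
```

No training step is involved, which is why such methods count as unsupervised; the quality of the result depends entirely on the coverage of the lexicon.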

Dictionary-Based Approach
This technique starts from a small set of seed opinion words and an online dictionary such as WordNet. It first collects a small set of opinion words manually with known orientations and then grows the set by searching WordNet or a thesaurus for their synonyms and antonyms. The newly found words are added to the seed list, and the process repeats until no more new words are found. Manual inspection can be performed after the process has completed to remove or correct errors. A significant shortcoming of the dictionary-based method is that it cannot find opinion words with domain- and context-specific orientations, which are quite common. For example, "quiet" is typically negative for a speakerphone but positive for a car. This problem can be addressed by the corpus-based approach.
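The seed-expansion loop described above can be sketched as follows. The sketch assumes a tiny hand-written thesaurus (`SYNONYMS`, `ANTONYMS`) standing in for WordNet; these tables and the function name `expand_lexicon` are hypothetical and serve only to show the iteration until no new words appear.

```python
# Hypothetical mini-thesaurus standing in for WordNet (illustrative only).
SYNONYMS = {"good": ["fine", "nice"], "fine": ["good"], "nice": ["pleasant"],
            "bad": ["poor"], "poor": ["bad", "awful"]}
ANTONYMS = {"good": ["bad"], "bad": ["good"], "nice": ["awful"]}


def expand_lexicon(seeds, synonyms, antonyms):
    """Grow a seed lexicon until no new words are found.

    Synonyms inherit the polarity of the word they come from;
    antonyms receive the opposite polarity.
    """
    lexicon = dict(seeds)
    frontier = list(seeds)
    while frontier:
        word = frontier.pop()
        polarity = lexicon[word]
        candidates = [(s, polarity) for s in synonyms.get(word, [])]
        candidates += [(a, -polarity) for a in antonyms.get(word, [])]
        for candidate, candidate_polarity in candidates:
            if candidate not in lexicon:  # only genuinely new words are added
                lexicon[candidate] = candidate_polarity
                frontier.append(candidate)
    return lexicon


expanded = expand_lexicon({"good": 1}, SYNONYMS, ANTONYMS)
```

Because polarity is propagated mechanically, an error in the thesaurus spreads to every word reached through it, which is one reason a manual inspection pass is usually applied afterwards.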

Corpus-Based Approach
The corpus-based approach rests on the probability that an opinion word occurs together with a set of positive or negative words, which is estimated by searching large labeled training data. Unlike the dictionary-based approach, the corpus-based approach can help find domain-specific opinion words and their orientations.

Hybrid Approach
The hybrid method combines machine learning and lexicon-based approaches to improve performance in sentiment classification. Its main advantage is that it obtains the best of both approaches: the high precision of an efficient supervised learning algorithm and the stability of the lexicon-based method.

Machine Learning-Based Approach
The machine learning approach to sentiment classification relies on applying well-known machine learning techniques to text data, using experience to build an algorithm that improves the performance of the system. It is practical because it is fully automatic and can handle large collections of data. Machine learning-based sentiment classification can be divided into three main categories: supervised, unsupervised, and semi-supervised learning methods (Figure 2).
Indian Journal of Science and Technology Vol 13(03), DOI: 10.17485/ijst/2020/v13i03/148900, January 2020

Unsupervised Learning
Unsupervised learning is close to learning by observation; it performs clustering. Unsupervised learning is applied when only input data are available and there are no corresponding output variables. Its aim is to model the underlying structure or distribution of the data in order to learn more about the data. Unsupervised learning problems can be further grouped into clustering and association problems. Several unsupervised learning algorithms are commonly used in sentiment analysis (see Figure 2), such as k-means, mixture models, and hierarchical clustering.

Semi-Supervised Learning
Semi-supervised learning addresses the classification problem in which labels are available for only a small subset of the observations. Such problems are of significant practical interest in a wide range of applications where unlabeled data are readily available but obtaining class labels for the entire data set is expensive or impossible. The key idea behind the semi-supervised approach is that, although unlabeled data carry no class information, they do contain information about the joint distribution over the classification features.

Supervised Learning
Supervised learning is based on a labeled dataset, so the labels are provided to the model during training. The model is trained on these labeled data so that it can produce appropriate outputs when it encounters new data during decision-making. Supervised classification can be grouped into four categories: linear classification, rule-based classification, probabilistic classification, and decision tree classification. To solve the classification problem, ML involves two stages: first, pre-labeled training corpora are used to learn a "classifier" model with an established supervised learning method; then, once the classifier has been built, it can be applied to classify unseen data [37]. One of the main issues in ML is therefore to fit a model with good generalization capability to a set of training data. Traditionally, over-fitting refers to a model that fits the training data too well but generalizes poorly to test data, while under-fitting refers to a model that neither fits the training data nor generalizes to test data. Several machine learning methods have been adopted to classify reviews in sentiment analysis, including Support Vector Machine (SVM), Naïve Bayes (NB), Maximum Entropy (ME), Artificial Neural Network (ANN), and Decision Tree (DT) classifiers. Other less commonly used algorithms are Logistic Regression (LR), K Nearest Neighbor (KNN), Random Forest (RF), and Bayesian Network (BN). This article contributes to a deeper understanding of machine learning in sentiment analysis, especially supervised machine learning. To this end, the rest of the article gives more details about some important supervised learning algorithms used in sentiment analysis.

Naïve Bayes (NB)
The Naïve Bayes model is the simplest and easiest classifier to build for a text classification system; it is based on Bayes' theorem with an assumption of independence among the predictors [38]. In simple terms, a Naïve Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. The simplicity of this assumption makes the computation of the Naïve Bayes classifier far more efficient. Bayesian classifiers have also shown high accuracy and speed when applied to large databases. To select the most probable class $c^*$ for a new instance $x$, it computes:
$$c^* = \arg\max_c P(c \mid x)$$
According to Bayes' theorem, the probability $P(c \mid x)$ can be expressed in terms of the probabilities $P(c)$, $P(x \mid c)$, and $P(x)$:
$$P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)}$$
Under the independence assumption, with features $x_1, x_2, \ldots, x_n$, this becomes:
$$P(c \mid x) \propto P(x_1 \mid c) \times P(x_2 \mid c) \times \cdots \times P(x_n \mid c) \times P(c)$$
Above,
• P(c|x) is the posterior probability of the class (c, target) given the predictor (x, features).
• P(c) is the prior probability of the class.
• P(x|c) is the likelihood, which is the probability of the predictor given the class.
In a comparison of the NB, DT, and KNN methods, the results show that the NB classifier performs best in terms of accuracy, precision, and F-measure.
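The decision rule of Naïve Bayes, choosing the class that maximizes the prior times the product of per-feature likelihoods, can be made concrete with a small sketch. It is a bag-of-words variant working in log space; the add-one (Laplace) smoothing, the toy training set `TRAIN`, and the function names are assumptions added for illustration and are not taken from the surveyed studies.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Estimate priors P(c) and per-class word counts from (tokens, label) pairs."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    priors = {c: n / len(docs) for c, n in class_counts.items()}
    return priors, word_counts, vocab


def predict_nb(tokens, priors, word_counts, vocab):
    """Return argmax_c of log P(c) + sum_i log P(x_i | c)."""
    best_class, best_logp = None, -math.inf
    for c, prior in priors.items():
        total = sum(word_counts[c].values())
        logp = math.log(prior)
        for token in tokens:
            if token not in vocab:
                continue  # words never seen in training are ignored
            # Add-one smoothing (an assumption) avoids zero probabilities.
            logp += math.log((word_counts[c][token] + 1) / (total + len(vocab)))
        if logp > best_logp:
            best_class, best_logp = c, logp
    return best_class


# Toy training data, invented for the example.
TRAIN = [("great car".split(), "pos"), ("great photos".split(), "pos"),
         ("short battery".split(), "neg"), ("poor battery life".split(), "neg")]
model = train_nb(TRAIN)
prediction = predict_nb("battery life".split(), *model)
```

Working with sums of logarithms rather than products of probabilities is a common practical choice, since the product of many small likelihoods underflows quickly on longer documents.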

Support Vector Machine (SVM)
The support vector machine (SVM) is a statistical classification method. It is based on the structural risk minimization principle from computational learning theory. SVM is a discriminative classifier; it can build a linear or non-linear decision surface to separate the training data points into two classes [44][45].
The key in such classifiers is to determine the optimal boundaries between the different classes and use them for classification. A separating hyperplane is written as:
$$W \cdot X + b = 0$$
where $W = \{w_1, w_2, w_3, \ldots, w_n\}$ is the weight vector over the n attributes and b is the bias. The distance from the separating hyperplane to any point on the margin boundary $H_1$ is $1/\lVert W \rVert$, and the same holds for any point on $H_2$. The maximal margin is therefore $2/\lVert W \rVert$. If the hyperplane value is > 0, the point belongs to the positive class; if it is < 0, the point belongs to the negative class; if it is = 0, the point lies on the hyperplane itself. If the value of the margin is large, a large penalty is assigned to errors/margin errors. If the value of the margin is small, some points become margin errors and the orientation of the hyperplane changes:
$$W = \sum_j \alpha_j c_j d_j, \quad \alpha_j \ge 0$$
where $c_j \in \{1, -1\}$ is the class (positive, negative) of document $d_j$. In Ref. [46], Alotaibi addressed the political tendency classification of Twitter users using WEKA, specifically an SVM-based approach. In Ref. [47], Tan and Zhang created an Arabic corpus with three polarities (positive, negative, and neutral) consisting of 6267 documents and 33,870 sentences for use in classification problems. Different ML classifiers were investigated for this task, including Multinomial Naïve Bayes (MNB), Support Vector Machine (SVM) with linear kernel, and Neural Networks (NNs). SVM achieved the best results for all classification types. In Ref. [48], Manek et al. used MI, IG, CHI, and DF feature selection with a set of classification methods (SVM, NB, K-nearest neighbor, and the winnow classifier) to carry out a study of sentiment classification on Chinese documents. The Chinese sentiment dataset used in this study consists of 1021 documents.
The result is that SVM performs best among the classification methods compared. In Ref. [49], Hutto and Gilbert conducted a study that used Gini Index feature selection to enhance the SVM classifier; the experiments indicate that the method increases SVM accuracy. The study in [50] applied TF-IDF, TF-CHI, TF-RF, and TF-OR feature selection with n-gram tokenization to improve the SVM classifier; among all the feature selection methods used, the results show that TF-IDF has the highest performance. In study [34], the authors analyzed the sentiment of movie reviews. They integrated various preprocessing strategies such as stop-word removal, negation handling, and stemming; the feature selection methods (FF, TF-IDF, and FP) were used to compute three different feature matrices, and the chi-squared technique was then used to filter out unimportant features. Finally, they applied SVM for classification, and the experimental results show that good preprocessing leads to improvements in classification.
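The linear SVM quantities discussed in this section (the sign of the hyperplane value, the margin width 2/|W|, and the weight vector written as a weighted sum of training documents) can be illustrated numerically. The sketch below only evaluates these formulas for given parameters; it does not solve the SVM optimization problem, and the function names and example numbers are assumptions made for the illustration.

```python
import math


def decision_value(w, x, b):
    """Value of the hyperplane expression W . X + b for one point."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def svm_classify(w, x, b):
    """Return +1 or -1 by the sign of the hyperplane value, 0 on the hyperplane."""
    value = decision_value(w, x, b)
    if value > 0:
        return 1
    if value < 0:
        return -1
    return 0


def margin_width(w):
    """Width of the maximal margin, 2 / ||W||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


def weights_from_support_vectors(alphas, labels, docs):
    """W = sum_j alpha_j * c_j * d_j, with alpha_j >= 0 and c_j in {1, -1}."""
    dim = len(docs[0])
    return [sum(a * c * d[k] for a, c, d in zip(alphas, labels, docs))
            for k in range(dim)]


# Example numbers are invented for the illustration.
w = [3.0, 4.0]
width = margin_width(w)  # ||W|| = 5, so the margin is 2 / 5
w_from_sv = weights_from_support_vectors([0.5, 0.5], [1, -1],
                                         [[2.0, 0.0], [0.0, 2.0]])
```

In practice the coefficients and the bias come out of a quadratic programming solver, and only the documents with non-zero coefficients (the support vectors) contribute to W.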

Rule-Based Classifier
In a rule-based classifier, a set of conditional "if ... then ..." rules is typically constructed to determine a particular combination of patterns that are most likely related to the different classes. Each rule consists of two parts: the antecedent and the consequent. The antecedent corresponds to a word pattern and the consequent to a class label. Rule-based classifiers offer an advantage in that, as with decision tree classifiers, they are easily understandable by non-experts and explanations can be generated easily. Classification rules also represent each class in disjunctive normal form (DNF). A k-DNF expression has the form:
$$(X'_1 \wedge X'_2 \wedge \cdots \wedge X'_n) \vee (X'_{n+1} \wedge X'_{n+2} \wedge \cdots \wedge X'_{2n}) \vee \cdots \vee (X'_{(m-1)n+1} \wedge X'_{(m-1)n+2} \wedge \cdots \wedge X'_{mn})$$
where m is the number of disjunctions, n is the number of conjunctions in each disjunction, and each $X'_i$ is defined over the alphabet $X_1, X_2, \ldots, X_j \cup \sim X_1, \sim X_2, \ldots, \sim X_j$. The rule-based classifier aims to build the smallest rule set that is consistent with the training data [51][52]. Adding extra mechanisms to avoid over-fitting of the training data improves rule-based classification. The study [53] presents the VADER model, a simple rule-based model for general sentiment analysis, and compares its effectiveness with eleven benchmarks, including LIWC, ANEW, General Inquirer, SentiWordNet, Naïve Bayes, Maximum Entropy, and Support Vector Machine (SVM) algorithms. The results show that VADER outperforms individual human raters (F1 classification accuracy = 0.96 and 0.84, respectively). In [54], the authors combine the rule-based classifier with the SVM classifier to enhance the performance of the SVM classifier. They used the rule-based classifier to verify the "neutral" SVM predictions, applying it to each "neutral" output of the SVM classifier. Although they do not obtain the best accuracy, the results show that rule-based classification can indeed correct the SVM's predictions.
A study [55] applied rule-based machine learning to sentiment analysis on an online books dataset and political reviews. The authors used SentiWordNet to create seven classes, from strong positive to strong negative; the results show that this model achieved 97.4% accuracy and minimized the average error.
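A rule set of the "if antecedent then class" kind can be sketched as an ordered list of word patterns. The rules below are invented for the example and are not taken from VADER or from any of the cited systems; `RULES` and `rule_classify` are hypothetical names. Each antecedent is a conjunction of words, and the rules that share a class together form a disjunction, mirroring the DNF view of a class.

```python
# Each rule: (antecedent word pattern, consequent class label).
# The rules are illustrative assumptions; order matters, first match wins.
RULES = [
    ({"not", "good"}, "negative"),
    ({"great"}, "positive"),
    ({"short", "battery"}, "negative"),
]


def rule_classify(text, rules=RULES, default="neutral"):
    """Return the label of the first rule whose antecedent words all occur."""
    tokens = set(text.lower().split())
    for antecedent, label in rules:
        if antecedent <= tokens:  # every word of the pattern is present
            return label
    return default


label = rule_classify("the battery life is short")
```

Placing the negation rule first is a design choice: with first-match semantics, more specific patterns need to precede more general ones so that "not good" is not captured by a later positive rule.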

Decision Tree (DT)
The decision tree classifies the training data by sorting the samples in the data set according to their features. The DT classifier decomposes the training data hierarchically using features. The idea is to follow the edges of a tree, starting from the root, where each non-leaf node represents a test, for example, on the value of a feature. Depending on the result of the test, one of the child nodes is chosen as the next node. Nodes are visited in this way until a leaf node is reached, which tells what class the instance should be classified as. The decision tree is useful because of its simplicity, but it is also quite powerful, since the different sub-trees spanned by the children can differ depending on which tests are considered best given the knowledge acquired on the way to the node. Despite these features, decision trees tend to have difficulty handling linear relationships between variables [56]. Logistic regression, in turn, has problems with interaction effects between variables. For example, the FICUS construction algorithm presented in [57] takes as input a set of classified objects, a set of attributes, and a specification for a set of constructor functions, and produces a set of generated features that standard concept learners can use to create improved classifiers. In ID3, one approach to the task is to generate all possible decision trees that correctly classify the training set and to select the simplest of them. The procedure is iterative: it randomly chooses a subset of the training data, called the window, and forms a decision tree that correctly classifies all the objects in the window. If the tree gives the correct answer for all the other objects in the training set, then it is correct for the entire training set and the iteration ends. If not, a selection of the incorrectly classified objects is added to the window and the process continues [58]. The DT technique was used in [59] to detect the scope of negation through dynamic determinants, which use contextual information, and static determinants, which are unambiguous words. These determinants are useful in complex sentences, such that only the sentences that contain negation are considered. Experimental rules concentrate on cases where polar expressions precede grammatical forms that directly precede negation words, leading to the polar expression being unique. Another study [54] applied the DT technique together with other machine learning techniques to classify the opinions of patients in one of the English hospitals using a point-based rating scale. The purpose of this analysis is to predict patient satisfaction with the hospital in terms of cleanliness and respect.
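The root-to-leaf traversal that a decision tree performs at classification time can be sketched with a nested dictionary. The tree below is written by hand for the example rather than learned by ID3 or any other induction algorithm, and the feature names `has_negation` and `has_positive_word` are hypothetical.

```python
# A hand-written toy tree (an assumption, not a learned model).
# Internal nodes hold a feature test; leaves hold a class label.
TREE = {
    "test": "has_negation",
    "children": {
        True: {
            "test": "has_positive_word",
            "children": {True: {"label": "negative"},
                         False: {"label": "neutral"}},
        },
        False: {
            "test": "has_positive_word",
            "children": {True: {"label": "positive"},
                         False: {"label": "neutral"}},
        },
    },
}


def tree_classify(tree, instance):
    """Follow the edges from the root until a leaf node is reached."""
    node = tree
    while "label" not in node:
        outcome = instance[node["test"]]  # result of the test at this node
        node = node["children"][outcome]  # pick the matching child
    return node["label"]


result = tree_classify(TREE, {"has_negation": True, "has_positive_word": True})
```

Each path from the root to a leaf reads as one "if ... then ..." rule, which is why decision trees and rule-based classifiers are often described as similarly interpretable.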

A Neural Network (NN)
An NN text classifier is a network of units, where the input units represent terms, the output unit(s) represent the category or categories of interest, and the weights on the edges connecting units represent dependence relations. To classify a test document $d_j$, its term weights $w_{kj}$ are loaded into the input units; the activation of these units is propagated forward through the network, and the value of the output unit(s) determines the classification decision(s). A study [60] used a deep convolutional NN to extract information from the character level up to the sentence level in order to classify the sentiment of short sentences. This approach was applied to two datasets and demonstrated increased classification accuracy on each one. Another study based on neural networks analyzed reviews of Arabic hotels using long short-term memory (LSTM) with aspect opinion target expressions (OTEs). The results show that both applications achieved improvements: 39 percent for the extraction of aspect OTEs and 6 percent for the aspect sentiment polarity classification task.

The Maximum Entropy Classifier
The maximum entropy classifier applies the well-known MaxEnt principle to parameter estimation. The basic idea is that the classifier converts labeled feature sets to vectors using an encoding. This encoded vector is then used to compute weights for each feature, which can then be combined to determine the most likely label for a feature set. The maximum entropy classifier can be used to solve a large variety of text classification problems, such as language detection, topic classification, sentiment analysis, and more [61]. Malouf described experiments comparing the performance of several algorithms for estimating the parameters of a conditional ME model [62]. The results show that the commonly used iterative scaling algorithms perform quite poorly in comparison with the others; moreover, the limited memory variable metric algorithm [63] outperforms the other algorithms by a significant margin. The ME classifier's estimate of $P(c \mid d)$ takes the exponential form:
$$P_{ME}(c \mid d) = \frac{1}{Z(d)} \exp\left(\sum_i \lambda_i f_i(c, d)\right)$$
where $P_{ME}(c \mid d)$ is the probability that instance d belongs to class c, $f_i(c, d)$ are feature functions with weights $\lambda_i$, and $Z(d)$ is a normalization factor. Unlike Naïve Bayes, the ME classifier makes no assumptions about the relationships between the features, which is why it can give better performance. A study [46] used the maximum entropy classifier for political tendency classification based on a Spanish Twitter data set; this model achieved good results in the experimental test. In another study [64], the authors use a set of maximum entropy classifiers to classify the polarity of Spanish Twitter data. If all the classifiers agreed on a category, the total value was allocated to the corresponding score; otherwise, the positive, negative, and objective score values correspond to the number of classifiers assigning the word to each category.
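The exponential form of the maximum entropy estimate can be evaluated directly once feature weights are given. The sketch below assumes the standard log-linear form, in which the exponentiated weighted feature sum is divided by a normalizer summed over all classes; the weights are invented for the example and no parameter estimation (iterative scaling or otherwise) is performed. The function name `maxent_probabilities` is hypothetical.

```python
import math


def maxent_probabilities(features, weights, classes):
    """Compute P(c | d) = exp(sum_i lambda_i * f_i(c, d)) / Z(d) for each class.

    features: dict mapping feature name -> value for document d
    weights:  dict mapping (class, feature name) -> lambda
    """
    scores = {
        c: math.exp(sum(weights.get((c, name), 0.0) * value
                        for name, value in features.items()))
        for c in classes
    }
    z = sum(scores.values())  # normalization factor Z(d)
    return {c: score / z for c, score in scores.items()}


# Invented weights: "great" pushes towards pos and away from neg.
WEIGHTS = {("pos", "great"): 1.0, ("neg", "great"): -1.0}
probs = maxent_probabilities({"great": 1}, WEIGHTS, ["pos", "neg"])
```

With no active features every class receives the same score, so the distribution falls back to uniform, which is the maximum-entropy choice when nothing is known about the document.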

Bayesian Network (BN)
The Bayesian Network (BN) assumes, unlike Naïve Bayes, that all features are completely dependent. A BN-coordinated non-cyclic chart whose hubs speak to irregular, every hub in the diagram speaks to an arbitrary variable, while the edges between the hubs speak to probabilistic conditions among the comparing irregular factors. In another depiction, BN speaks to a joint multivariate likelihood conveyance for a lot of irregular factors. Give us a chance to have a progression of sentences s(1),s(2),...,s(T); each speaks to a progression Indian Journal of Science and Technology Vol 13(03), DOI: 10.17485/ijst/2020/v13i03/148900, January 2020 of words so that s(t) = (x1(t), x2(t),..., xL(t)), where L is the length of sentence s(t). Along these lines, the likelihood of a word p(xi(t)) pursues the dissemination : p(xi(t)) = P(x i (t)|(x 1 (t), x 2 (t), (1) ..., x i-1 (t)), (s(1),s(2),...,s(t −1)) The BN dissects the likelihood of hub articulations into a result of restrictive probabilities by accepting the freedom of the non-relative hubs, given their folks. where p(xi |ai ,θi,ai) means the restrictive likelihood of hub articulation xi given its parent hub articulations ai, and θi, ai signifies the most extreme likelihood (ML) gauge of the contingent probabilities. Figure 4 delineates the state space of a Gaussian Bayesian system (GBN) at time moment t where every hub xi (t) is a word in the sentence s (t). For more information review. BN occasionally used in text mining because of its computational complexity and high cost. To research a genuine issue wherein the creator's mentality portrayed by three extraordinary (yet related) target factors. This instrument can aggregate the diverse objective factors in a similar grouping errand to profit by the conceivable measurable relations between them. 
Experimental results show that this approach outperforms the most common sentiment analysis approaches and is useful for improving the recognition rates for this problem; in addition, the authors claim that this methodology could be considered for solving future sentiment analysis problems. BNs were also used in [65] to propose a Bayesian deep convolutional belief network (BCDBN) for subjectivity detection, using Bayesian networks to extract high-ML concepts and word topics from the data and use them to pre-train the classification model. This new approach achieved almost 5-10% improvement in prediction accuracy compared with previous approaches, and it was multiple times faster [66,67]. Bayesian networks and fuzzy recurrent neural networks have also been used to enhance the extreme learning machine (ELM) for subjectivity detection, combining the advantages of both networks to improve on the traditional ELM. The results demonstrated the ability of the proposed system to detect subjectivity.
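The way a BN factorizes a joint distribution into a product of conditional probabilities, each conditioned on a node's parents, can be illustrated with a small discrete example. The sketch below is an assumption-laden toy and is not the Gaussian or deep model of the cited works: the three binary nodes (`pos_word`, `negation`, `positive`), the graph structure, and every number in the conditional probability tables are invented for illustration. Inference is done by brute-force enumeration, which also hints at why exact BN inference becomes costly as the number of nodes grows.

```python
from itertools import product

# Hypothetical DAG: pos_word -> positive <- negation (all variables binary).
PARENTS = {
    "pos_word": (),
    "negation": (),
    "positive": ("pos_word", "negation"),
}

# Made-up conditional probability tables: P(node = True | parent values).
CPT = {
    "pos_word": {(): 0.6},
    "negation": {(): 0.2},
    "positive": {
        (True, True): 0.2,
        (True, False): 0.9,
        (False, True): 0.6,
        (False, False): 0.1,
    },
}

def joint(assignment):
    # p(x_1, ..., x_n) = product over nodes of p(x_i | parents of x_i)
    p = 1.0
    for node, parents in PARENTS.items():
        p_true = CPT[node][tuple(assignment[q] for q in parents)]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p

def posterior(query, evidence):
    # P(query = True | evidence) by enumerating all 2^n assignments.
    nodes = list(PARENTS)
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(nodes)):
        assignment = dict(zip(nodes, values))
        if any(assignment[k] != v for k, v in evidence.items()):
            continue
        p = joint(assignment)
        denominator += p
        if assignment[query]:
            numerator += p
    return numerator / denominator

total = sum(joint(dict(zip(PARENTS, v)))
            for v in product([True, False], repeat=len(PARENTS)))
p_positive = posterior("positive", {"pos_word": True, "negation": False})
print(total, p_positive)
```

The enumeration visits every joint assignment, so its cost doubles with each added binary node; practical systems rely on structured or approximate inference instead.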

Challenges and Issues in SA
Some of the challenges in the area of sentiment analysis include negation handling, domain generalization, pronoun resolution, language generalization, and world knowledge. Opinion text may be written in different languages; each language must therefore be handled according to its own characteristics, which is a formidable task. The orientation of opinion words can differ, from positive to negative, according to the situation, so an opinion word that is positive in one situation may be considered negative in another. Because reviewers comment in free format, opinions may include abbreviations, symbols, and short words, and mining opinions from such text requires a lot of work. Most reviews have a mixed composition, with positive and negative opinions in the same sentence, which is easy for humans to understand but harder for a computer [68]. The orientation of opinion words can also vary according to their position in the sentence; for example, the adjective "little" can be used in a positive or a negative sense, so identifying the polarity of the same adjective in different situations is also a difficult task [69]. Opinion spamming can be damaging, since it can distort opinions and affect users' experience. It is safe to say that as opinions are increasingly used in practice, opinion spamming will become more rampant and sophisticated, which presents a significant challenge for its detection [47]. Since web users make decisions according to web reviews, it is vital that the reviews be of high quality and reliable; opinion mining (OM) therefore faces the issue of review quality, yet only limited work has been conducted on opinion quality assurance. Another major challenge facing OM is the availability and accessibility of a standard dataset.
Little data is currently available to support the development, benchmarking, and analysis of the derived content. Finally, some other writing styles, such as irony, sarcasm, or negated sentences, can bring further challenges to sentiment analysis [70].
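As a small illustration of why negation handling matters, the sketch below shows one simple heuristic: flipping the polarity of lexicon words that follow a negator within a fixed window. This is a swapped-in toy technique for illustration, not a method taken from the works surveyed here; the lexicon, the negator list, the window size, and the function name `polarity` are all assumptions.

```python
# Hypothetical toy resources (assumptions, not from any cited work).
NEGATORS = {"not", "no", "never"}
LEXICON = {"good": 1, "great": 1, "bad": -1, "boring": -1}

def polarity(tokens, scope=3):
    # Sum lexicon scores, flipping the sign of words that fall
    # within `scope` tokens after a negator.
    score = 0
    negated_left = 0  # how many upcoming tokens are still under negation
    for token in tokens:
        token = token.lower()
        if token in NEGATORS:
            negated_left = scope
            continue
        value = LEXICON.get(token, 0)
        if negated_left > 0:
            value = -value
            negated_left -= 1
        score += value
    return score

plain = polarity("the film was good".split())
negated = polarity("the film was not good".split())
print(plain, negated)
```

A fixed window is a crude proxy for the true scope of negation, and it does nothing for irony or sarcasm, which is part of why these phenomena remain open challenges.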

Summary and Conclusion
This study reviewed classification techniques based on supervised machine learning and the tools available for sentiment analysis. More specifically, we considered the trend of improving classification algorithms by using appropriate feature selection methods. There are still some open challenges in this area, such as mining complex reviews and identifying implicit aspects. The language of the sentiment is also a challenge: addressing each language according to its attributes is a difficult task, and so is the sentiment domain issue. Combining conceptual approaches with the power of machine learning, improving feature selection methods, and applying intelligent techniques to sentiment analysis may lead to good and efficient solutions in future work, enhancing the performance of sentiment analysis on the challenges mentioned above.