Abstract:
Recently, the world has witnessed an exponential growth in the number of social networks subscribers. Social networks have opened a venue for online users to interactively share their opinions in different life aspects. Despite that Users interests classification and sentiment analysis have become more critical for services providers, online advertising, and E-commerce, minimal research efforts have been conducted on these areas. However, the majority of existing approaches are unsupervised or partially supervised they primarily ig- nores the semantic information within the sentences, the complicated structure of the natural languages,inadditiontolackoftoolsandavailablesufficientcorporaarethemainchallenges in this area. In this study, we propose a deep learning model for users interests discovery and sentiment classification. The proposed model takes advantage of the FastText model and WordNet lexical database for words representation. Convolutional Neural Network ar- chitecture is used to capture local features. To overcome the limitation of the Convolutional Neural Network in capturing the temporal information over long-distances, in this study a recurrent neural network is designed to capture the contextual information and long-term dependencies. Furthermore, this model takes advantage of two attention units to emphasize the important contextual features, thus, increasing the classification accuracy. The proposed model is implemented on Tensorflow under Python environment. The model is evaluated us- ingourconstructedmulti-domainsArabicsentimentcorpuswhichcontains32,950instances, Amazon corpus which contains 60,000 instances, in addition to many other corpora. Experi- mental results have demonstrated the outstanding performance of the proposed architecture. Also, intensive experiments are conducted to validate the impact of each deep learning com- ponent in the classification performance. The proposed model has achieved a classification accuracy of up to 88.58% for Arabic text and up to 90.20% for English text. Furthermore, theproposedmodelhasremarkablyoutperformedmanyexistingapproachesforbothtasks.