A Graph-Based Method for Predicting the Helpfulness of Product Opinions



Natural Language Processing, Helpfulness Prediction, Opinion Mining.


This manuscript presents a new approach to predicting the helpfulness of opinions. Researchers in this area usually represent the evaluated texts with attribute-value tables that aggregate their features. Here, the task is instead modeled as a network, which makes it possible to exploit the relations among the objects in the network (comments, stars, and words). A graph regularization technique is used to extract the relevant features from the graph structure, after which the comments are classified as helpful or unhelpful. We compared our network model with two baseline methods, one based on fuzzy logic and the other on neural networks. Our model outperformed the fuzzy logic method by 0.17 in F1 and the neural network method by 0.19 in F1.
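To make the network formulation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a label-spreading regularizer of the "local and global consistency" kind as a stand-in for the graph regularization technique, and it uses invented toy comments. The names `build_graph`, `propagate_labels`, and `alpha` are hypothetical and chosen only for illustration.

```python
import numpy as np

HELPFUL, UNHELPFUL = 0, 1

# Toy data (invented for illustration): each comment has words and a star rating.
comments = [
    {"words": ["detailed", "battery"], "stars": 5},  # labeled helpful
    {"words": ["bad"], "stars": 1},                  # labeled unhelpful
    {"words": ["detailed", "battery"], "stars": 5},  # unlabeled
    {"words": ["bad"], "stars": 1},                  # unlabeled
]
seed_labels = {0: HELPFUL, 1: UNHELPFUL}


def build_graph(comments):
    """Link every comment node to its star node and to each of its word nodes."""
    index = {}

    def node_id(key):
        return index.setdefault(key, len(index))

    # Register comment nodes first so that comment i gets node id i.
    for c in range(len(comments)):
        node_id(("comment", c))
    edges = []
    for c, comment in enumerate(comments):
        edges.append((c, node_id(("stars", comment["stars"]))))
        for word in set(comment["words"]):
            edges.append((c, node_id(("word", word))))
    return edges, len(index)


def propagate_labels(edges, num_nodes, seed_labels, num_classes=2, alpha=0.9):
    """Closed-form label spreading: F = (I - alpha * S)^-1 Y."""
    W = np.zeros((num_nodes, num_nodes))
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0
    # Symmetric normalization S = D^-1/2 W D^-1/2 (assumes no isolated nodes).
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    Y = np.zeros((num_nodes, num_classes))
    for node, label in seed_labels.items():
        Y[node, label] = 1.0
    F = np.linalg.solve(np.eye(num_nodes) - alpha * S, Y)
    return F.argmax(axis=1)


edges, num_nodes = build_graph(comments)
predicted = propagate_labels(edges, num_nodes, seed_labels)
print(predicted[: len(comments)])
```

In this toy graph the two unlabeled comments share words and star ratings only with one labeled comment each, so they take on that neighbor's label. A real corpus would produce a connected graph in which `alpha` controls how far label information spreads.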



Author Biographies

Rogério Figueredo de Sousa, Universidade de São Paulo

PhD student at the Instituto de Computação e Matemática Computacional (ICMC), Universidade de São Paulo, São Carlos.

Rafael Tôrres Anchiêta, Universidade de São Paulo

Holds a bachelor's and a master's degree in Computer Science from the Universidade Federal do Piauí. PhD student in Computer Science in the Programa de Pós Graduação e Matemática Computacional, ICMC - USP. Has experience in Computer Science, with an emphasis on Natural Language Processing and Requirements Engineering.

Maria das Graças Volpe Nunes, Universidade de São Paulo

Holds a bachelor's degree in Computer Science from the Universidade Federal de São Carlos (1980), a master's degree in Computer Science from the Universidade de São Paulo (1985), and a doctorate in Informatics from the Pontifícia Universidade Católica do Rio de Janeiro (1991). From 1981 to 2013 she was a faculty member and researcher at the Instituto de Ciências Matemáticas e da Computação of the Universidade de São Paulo (USP) in São Carlos, where she now serves as a senior professor. She has experience in Natural Language Processing, working mainly on the following topics: machine translation, spelling and grammar correction, text normalization, automatic summarization, and sentiment analysis.






How to Cite

Sousa, R. F. de, Anchiêta, R. T., & Nunes, M. das G. V. (2020). A Graph-Based Method for Predicting the Helpfulness of Product Opinions. ISys - Brazilian Journal of Information Systems, 13(4), 06–21. Retrieved from http://www.seer.unirio.br/isys/article/view/9393