Title variants
Publication languages
Abstracts
Many novel feature extraction approaches have been proposed in the field of text mining. This paper presents a feature extraction algorithm based on weighted graph mining. To keep feature extraction both effective and computationally efficient, only the most effective graphs, those representing the maximum number of triangles under a predefined relational criterion, are considered. The proposed technique combines the relation between words surrounding an aspect of the product with the lexicon-based connection among those words, which together form a relational triangle. An element covered by the maximum number of triangles is taken as a prime feature. In an analysis of a limited set of domain-specific data, the proposed algorithm performs more than three times better than TF-IDF. (original abstract)
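The triangle idea described in the abstract can be illustrated with a minimal sketch. It assumes an unweighted, undirected word graph built from hypothetical word pairs; the paper's actual edge weights, relational criterion, and lexicon are not given in this record, so `build_graph`, `triangle_counts`, and the toy edges below are illustrative names and data only, not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (word, word) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def triangle_counts(adj):
    """Count, for every node, the triangles that include it."""
    counts = {node: 0 for node in adj}
    for node, neighbours in adj.items():
        # A triangle exists when two neighbours of a node are also linked.
        for a, b in combinations(sorted(neighbours), 2):
            if b in adj[a]:
                counts[node] += 1
    return counts


# Hypothetical toy data: word pairs assumed to satisfy some relational criterion.
edges = [
    ("battery", "life"), ("battery", "long"), ("life", "long"),
    ("battery", "charge"), ("battery", "fast"), ("charge", "fast"),
    ("screen", "bright"),
]

adj = build_graph(edges)
counts = triangle_counts(adj)

# Rank words by the number of triangles covering them (ties broken alphabetically).
ranked = sorted(counts, key=lambda w: (-counts[w], w))
```

In this toy graph the word covered by the most triangles ranks first, which mirrors the abstract's notion of a prime feature. A weighted variant would need the edge weights and threshold that the full paper defines.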
Year
Volume
Pages
1-22
Physical description
Authors
author
- Samsung Research Institute, Noida, India
author
- Samsung Research Institute, Noida, India
Bibliography
- Aggarwal C.C. (2018), Machine Learning for Text, Springer, Cham.
- Biswas S.K., Bordoloi M., Shreya J. (2018), A Graph-based Keyword Extraction Model Using Collective Node Weight, Expert Systems with Applications, 97, 51-59, https://doi.org/10.1016/j.eswa.2017.12.025.
- Bonatti P., Decker S., Polleres A., Presutti V. (2018), Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web (Dagstuhl Seminar 18371), Dagstuhl Reports, 8, 29-111.
- Campolo A., Sanfilippo M., Whittaker M., Crawford K. (2018), AI Now 2017 Report, Symposium and Workshop, January, AI Now Institute at New York University.
- Campos R., Mangaravite V., Pasquali A., Jorge A., Nunes C., Jatowt A. (2020), YAKE! Keyword Extraction from Single Documents using Multiple Local Features, Information Sciences, 509, 257-289, DOI: 10.1016/j.ins.2019.09.013.
- Dave K., Lawrence S., Pennock D.M. (2003), Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews, Proceedings of the 12th International Conference on World Wide Web, 519-528.
- Devika R., Subramaniyaswamy V. (2019), A Semantic Graph-based Keyword Extraction Model Using a Ranking Method on Big Social Data, Wireless Networks, https://doi.org/10.1007/s11276-019-02128-x.
- Feldman R., Dagan I. (1995), Knowledge Discovery in Textual Databases (KDT), Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD-95), Montreal, Canada, August 20-21, AAAI Press, 112-117.
- Giarelis N., Kanakaris N., Karacapilidis N. (2020), An Innovative Graph-Based Approach to Advance Feature Selection from Multiple Textual Documents, Artificial Intelligence Applications and Innovations, 583, May 6, 96-106, DOI: 10.1007/978-3-030-49161-1_9.
- Houari M., Rhanoui M., Asri B. (2015), From Big Data to Big Knowledge: The Art of Making Big Data Alive, 1-6, DOI: 10.1109/CloudTech.2015.7337001.
- Htay S.S., Lynn K.T. (2013), Extracting Product Features and Opinion Words Using Pattern Knowledge in Customer Reviews, The Scientific World Journal, Vol. 2013, Article ID 394758, 5 pages, https://doi.org/10.1155/2013/394758.
- Hulth A. (2003a), Improved Automatic Keyword Extraction Given More Linguistic Knowledge, EMNLP, 216-223.
- Hulth A. (2003b), Reducing False Positives by Expert Combination in Automatic Keyword Indexing, RANLP, 367-376.
- Jaideepsinh K., Saini J. (2016), Stop-Word Removal Algorithm and Its Implementation for the Sanskrit Language, International Journal of Computer Applications, 150, 15-17, DOI: 10.5120/ijca2016911462.
- Jia Y., Qi Y., Shang H., Jiang R., Li A. (2018), A Practical Approach to Constructing a Knowledge Graph for Cybersecurity, Engineering, 4(1), 53-60, https://doi.org/10.1016/j.eng.2018.01.004.
- Jiang X., Hu Y., Li H. (2009), A Ranking Approach to Keyphrase Extraction, SIGIR, 756-757.
- K-CAP '19 (2019), Proceedings of the 10th International Conference on Knowledge Capture, September, 131-138, https://doi.org/10.1145/3360901.3364441.
- Kim K., Hur Y., Kim G., Lim H. (2020), GREG: A Global Level Relation Extraction with Knowledge Graph Embedding, Applied Sciences, 10, 1181.
- LeCun Y., Bengio Y., Hinton G. (2015), Deep Learning, Nature, 521, 436-444, https://doi.org/10.1038/nature14539.
- Liu B. (2009), Handbook Chapter: Sentiment Analysis and Subjectivity. Handbook of Natural Language Processing, Marcel Dekker, Inc., New York, NY, USA.
- Manrique R., Pereira B., Mariño O. (2019), Exploring Knowledge Graphs for the Identification of Concept Prerequisites, Smart Learning Environments, 6, 21, https://doi.org/10.1186/s40561-019-0104-3.
- Markov A., Last M., Kandel A. (2007), Fast Categorization of Web Documents Represented by Graphs, Advances in Web Mining and Web Usage Analysis, 4811, 56-71.
- Park D.-H., Kim S. (2008), The Effects of Consumer Knowledge on Message Processing of Electronic Word-of-mouth via Online Consumer Reviews, Electronic Commerce Research and Applications, 7, 399-410.
- Ramos J. (2003), Using TF-IDF to Determine Word Relevance in Document Queries, Computer Science, Proceedings of the First Instructional Conference on Machine Learning, 1-4.
- Rose S., Engel D., Cramer N., Cowley W. (2010), Automatic Keyword Extraction from Individual Documents, DOI: 10.1002/9780470689646.ch1.
- Russell S.J., Norvig P. (2003), Artificial Intelligence - A Modern Approach: The Intelligent Agent Book, Prentice-Hall.
- SAC '07 (2007), Proceedings of the 2007 ACM Symposium on Applied Computing, March, 807-811, https://doi.org/10.1145/1244002.1244182.
- Safrin R., Sharmila K.R., Shri Subangi T.S., Vimal E.A. (2017), Sentiment Analysis on Online Product Review, International Research Journal of Engineering and Technology (IRJET), 4, April, 2381-2388.
- Sammons M., Christodoulopoulos C., Kordjamshidi P., Khashabi D., Srikumar V., Vijayakumar P., Bokhari M., Wu X., Roth D. (2016), Edison: Feature Extraction for NLP, Simplified [in:] N. Calzolari, K. Choukri, H. Mazo, A. Moreno, T. Declerck, S. Goggi, M. Grobelnik, J. Odijk, S. Piperidis, B. Maegaard, J. Mariani (eds.), Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, European Language Resources Association (ELRA), 4085-4092.
- Shi W., Zheng W., Yu J.X., Cheng H., Zou L. (2017), Keyphrase Extraction Using Knowledge Graphs, Data Science and Engineering, 2, 275-288, https://doi.org/10.1007/s41019-017-0055-z.
- Sidorov G., Velasquez F., Stamatatos E., Gelbukh A., Chanona-Hernández L. (2013), Syntactic Dependency-Based N-grams as Classification Features [in:] I. Batyrshin, M.G. Mendoza (eds.), Advances in Computational Intelligence, MICAI 2012, Lecture Notes in Computer Science, 7630, Springer, Berlin, Heidelberg, https://doi.org/10.1007/978-3-642-37798-3_1.
- Turney P.D. (2002), Learning to Extract Keyphrases from the Text, CoRR, cs.LG/0212013.
- Vazirgiannis M., Malliaros F., Nikolentzos G. (2018), GraphRep: Boosting Text Mining, NLP, and Information Retrieval with Graphs, Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2295-2296.
- Wang Ch., Ma X., Chen J., Chen J. (2018), Information Extraction and Knowledge Graph Construction from Geoscience Literature, Computers & Geosciences, 112, 112-120, https://doi.org/10.1016/j.cageo.2017.12.007.
- Wang Q., Mao Z., Wang B., Guo L. (2017), Knowledge Graph Embedding: A Survey of Approaches and Applications, IEEE Transactions on Knowledge and Data Engineering, 29(12), December 1, 2724-2743, DOI: 10.1109/TKDE.2017.2754499.
- Wang W., Do D.B., Lin X. (2005), Term Graph Model for Text Classification, Advanced Data Mining and Applications, 19-30.
- Willemsen L.M., Neijens P.C., Bronner F., de Ridder J.A. (2011), "Highly Recommended!" The Content Characteristics and Perceived Usefulness of Online Consumer Reviews, Journal of Computer-Mediated Communication, 17(1), October 1, 19-38, https://doi.org/10.1111/j.1083-6101.2011.01551.x.
- Witten I.H., Paynter G.W., Frank E., Gutwin C., Nevill-Manning C.G. (1999), KEA: Practical Automatic Keyphrase Extraction, Proceedings of the Fourth ACM Conference on Digital Libraries, 254-255.
- Xu J., Kim S., Song M., Jeong M., Kim D., Kang J., Rousseau J.F., Li X., Xu W., Torvik V.I., Bu Y., Chen Ch., Ebeid I.A., Li D., Ding Y. (2020), Building a PubMed Knowledge Graph, Scientific Data, 7, 205, https://doi.org/10.1038/s41597-020-0543-2.
- Zhao H., Pan Y., Yang F. (2020), Research on Information Extraction of Technical Documents and Construction of Domain Knowledge Graph, IEEE Access, 8, 168087-168098, DOI: 10.1109/ACCESS.2020.3024070.
- Zhao J., Wang T., Yatskar M., Ordonez V., Chang K.W. (2017), Men also Like Shopping: Reducing Gender Bias Amplification Using Corpus-level Constraints, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2979-2989.
- (www 1) https://www.merriam-webster.com/dictionary/adjective (accessed: 1.11.2020).
- (www 2) Ji S., Pan S., Cambria E., Marttinen P., Yu P.S. (2021), A Survey on Knowledge Graphs: Representation, Acquisition and Applications, IEEE Transactions on Neural Networks and Learning Systems, arXiv:2002.00388 (accessed: 8.11.2020).
- (www 3) Mäntylä M.V., Graziotin D., Kuutila M. (2018), The Evolution of Sentiment Analysis - A Review of Research Topics, Venues, and Top Cited Papers, Computer Science Review, 27, February, 16-32, arXiv:1612.01556 [cs.CL] (accessed: 10.11.2020).
- (www 4) http://web.onda.com.br/abveiga/capitulo4-ingles.pdf (accessed: 11.11.2020).
- (www 5) Mutlu E.C., Oghaz T.A., Rajabi A., Garibay I., Review on Learning and Extracting Graph Features for Link Prediction, arXiv:1901.03425 (accessed: 11.11.2020).
- (www 6) https://www.sketchengine.eu/penn-treebank-tagset/
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.ekon-element-000171635062