2020 | 15 | 1-22
Article title

Domain Specific Key Feature Extraction Using Knowledge Graph Mining

Title variants
Publication languages
EN
Abstracts
EN
Many novel feature extraction approaches have been proposed in the field of text mining. This paper presents a novel feature extraction algorithm based on weighted graph mining. To keep feature extraction both effective and computationally efficient, only the most effective graphs are considered, namely those containing the maximum number of triangles under a predefined relational criterion. The technique combines the relations between the words surrounding an aspect of a product with the lexicon-based connections among those words; together these form a relational triangle. The element covered by the maximum number of triangles is taken as a prime feature. In an analysis of a limited set of domain-specific data, the proposed algorithm performs more than three times better than TF-IDF. (original abstract)
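The triangle-counting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `triangles_per_node`, the `min_weight` threshold (a stand-in for the "predefined relational criterion"), and the toy word graph are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def triangles_per_node(edges, min_weight=0.0):
    """Count, for every node, how many triangles it belongs to.

    edges: iterable of (u, v, weight) tuples for an undirected graph.
    min_weight: hypothetical stand-in for the relational criterion;
    edges with a lower weight are ignored.
    """
    adj = defaultdict(set)
    for u, v, w in edges:
        if u != v and w >= min_weight:
            adj[u].add(v)
            adj[v].add(u)
    adj = dict(adj)  # freeze: no new keys are created below

    counts = {node: 0 for node in adj}
    for node, neighbours in adj.items():
        # Each pair of neighbours that is itself connected closes a triangle.
        for a, b in combinations(sorted(neighbours), 2):
            if b in adj[a]:
                counts[node] += 1
    return counts


# Toy word graph (invented data): words co-occurring around product aspects.
toy_edges = [
    ("battery", "life", 1.0),
    ("battery", "long", 0.8),
    ("life", "long", 0.9),
    ("battery", "charge", 0.7),
    ("charge", "fast", 0.6),
    ("battery", "fast", 0.5),
    ("screen", "bright", 0.4),
]

counts = triangles_per_node(toy_edges)
top_feature = max(counts, key=counts.get)
print(top_feature, counts[top_feature])
```

In this toy graph the word covered by the most triangles is selected as the candidate feature; raising `min_weight` prunes weak edges first, which is one simple way to limit the graphs that are examined.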
Year
Volume
15
Pages
1-22
Physical description
Authors
  • Samsung Research Institute, Noida, India
  • Samsung Research Institute, Noida, India
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.ekon-element-000171635062
