The "big data" era is characterized by an explosion of information in the form of digital data collections, ranging from scientific knowledge, to social media, news, and everyone's daily life. Examples of such collections include scientific publications, enterprise logs, news articles, social media, and general web pages. Valuable knowledge about multi-typed entities is often hidden in unstructured or loosely structured, interconnected data. Mining latent structures around entities uncovers hidden knowledge such as implicit topics, phrases, entity roles, and relationships. In this monograph, we investigate the principles and methodologies of mining latent entity structures from massive unstructured and interconnected data. We propose a text-rich information network model for modeling data in many different domains. This leads to a series of new principles and powerful methodologies for mining latent structures, including (1) latent topical hierarchy, (2) quality topical phrases, (3) entity roles in hierarchical topical communities, and (4) entity relations. This book also introduces applications enabled by the mined structures and points out some promising research directions.
Acknowledgments.- Introduction.- Hierarchical Topic and Community Discovery.- Topical Phrase Mining.- Entity Topical Role Analysis.- Mining Entity Relations.- Scalable and Robust Topic Discovery.- Application and Research Frontier.- Bibliography.- Authors' Biographies.

Product details

ISBN: 9783031007798
Published: 2015-04-01
Publisher: Springer International Publishing AG
Height: 235 mm
Width: 191 mm
Audience level: Professional/practitioner, P, 06
Language: English
Format: Paperback

Author

Biographical note

Chi Wang is a researcher at Microsoft Research, Redmond, Washington. He received his Ph.D. degree in computer science from the University of Illinois at Urbana-Champaign in 2014. He graduated from Tsinghua University, China, in 2009. His research focuses on data mining, information network analysis, and text mining. He is the first winner of the prestigious Microsoft Research Graduate Research Fellowship in the history of the Department of Computer Science at the University of Illinois at Urbana-Champaign.

Jiawei Han is the Abel Bliss Professor in the Department of Computer Science at the University of Illinois. His research interests include data mining, information network analysis, and database systems, and he has over 600 publications. He served as the founding Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data (TKDD). He has received the ACM SIGKDD Innovation Award (2004), the IEEE Computer Society Technical Achievement Award (2005), the IEEE Computer Society W. Wallace McDowell Award (2009), and the Daniel C. Drucker Eminent Faculty Award at UIUC (2011). He is a Fellow of ACM and a Fellow of IEEE. He is currently the Director of the Information Network Academic Research Center (INARC), supported by the Network Science-Collaborative Technology Alliance (NS-CTA) program of the U.S. Army Research Lab. His co-authored textbook Data Mining: Concepts and Techniques (Morgan Kaufmann) has been adopted worldwide.