This book examines the recent trend of extending data dependencies to rich data types in order to address the variety and veracity issues in big data. Readers are guided through the full range of rich data types to which data dependencies have been successfully applied, including categorical data with equality relationships, heterogeneous data with similarity relationships, numerical data with order relationships, sequential data with timestamps, and graph data with complicated structures. Beyond the equality relationships considered in, e.g., functional dependencies (FDs), the text also discusses the constraints on ordering and similarity relationships captured by novel classes of data dependencies. In addition to exploring the concepts behind these data dependency notations, the book investigates the extension relationships between data dependencies, such as conditional functional dependencies (CFDs) extending conventional FDs. These relationships form a family tree of extensions, mostly rooted in FDs, that helps illuminate the expressive power of the various data dependencies. Moreover, the book points to work on discovering dependencies from data, since the huge volume and high variety of big data make it unlikely that dependencies can be specified manually in the traditional way. It further outlines the applications of the extended data dependencies, in particular in data quality practice. Altogether, the book provides a comprehensive guide for readers selecting data dependencies that offer sufficient expressive power at a reasonable discovery cost for their applications. Finally, the book concludes with several directions for future studies on emerging data.
Introduction.- Categorical Data.- Heterogeneous Data.- Ordered Data.- Temporal Data.- Graph Data.- Conclusions and Directions.- Index of Data Dependencies.- References.
Explores the application of data dependencies and their extensions to big data analysis.
Guides readers on the selection of appropriate data dependencies based on expressive power and discovery cost.
Identifies typical data dependency categories, their relationships, and various specific application scenarios.

Product details

ISBN
9783031271793
Published
2024-04-01
Publisher
Springer International Publishing AG
Height
240 mm
Width
168 mm
Audience level
Professional/practitioner, P, 06
Language
English
Format
Paperback

Biographical note

Shaoxu Song is an Associate Professor in the School of Software at Tsinghua University in Beijing, China. His research interests include data quality and data integration. He has published more than 50 papers in top conferences and journals, including SIGMOD, VLDB, ICDE, ACM TODS, VLDBJ, and IEEE TKDE. He served as a Vice Program Chair for the 2022 IEEE International Conference on Big Data (IEEE BigData 2022) and received a Distinguished Reviewer award from VLDB 2019 and an Outstanding Reviewer award from CIKM 2017.
Lei Chen is a Chaired Professor in the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology and the Director of the HKUST Big Data Institute. He received the SIGMOD Test-of-Time Award in 2015 and served as the Program Committee Co-Chair of VLDB 2019 and ICDE 2023. He is currently the Editor-in-Chief of both the VLDB Journal and IEEE Transactions on Knowledge and Data Engineering (TKDE). He is an IEEE Fellow and an ACM Distinguished Scientist.