Celebrity artist Shen Jiawei's history paintings are held in national museums and in public or private collections all around the world, including the Vatican. In this book, he chronicles the contexts in which his paintings were done, giving us rare insights of the national histories behind the canvas of his works.
This book provides a principled data-driven framework that progressively constructs, enriches, and applies taxonomies without leveraging massive human annotated data. Traditionally, people construct domain-specific taxonomies by extensive manual curations, which is time-consuming and costly. In today’s information era, people are inundated with the vast amounts of text data. Despite their usefulness, people haven’t yet exploited the full power of taxonomies due to the heavy curation needed for creating and maintaining them. To bridge this gap, the authors discuss automated taxonomy discovery and exploration, with an emphasis on label-efficient machine learning methods and their real-world usages. Taxonomy organizes entities and concepts in a hierarchy way. It is ubiquitous in our daily life, ranging from product taxonomies used by online retailers, topic taxonomies deployed by news outlets and social media, as well as scientific taxonomies deployed by digital libraries across various domains. When properly analyzed, these taxonomies can play a vital role for science, engineering, business intelligence, policy design, e-commerce, and more. Intuitive examples are used throughout enabling readers to grasp concepts more easily.
This book provides a principled data-driven framework that progressively constructs, enriches, and applies taxonomies without leveraging massive human annotated data. Traditionally, people construct domain-specific taxonomies by extensive manual curations, which is time-consuming and costly. In today’s information era, people are inundated with the vast amounts of text data. Despite their usefulness, people haven’t yet exploited the full power of taxonomies due to the heavy curation needed for creating and maintaining them. To bridge this gap, the authors discuss automated taxonomy discovery and exploration, with an emphasis on label-efficient machine learning methods and their real-world usages. Taxonomy organizes entities and concepts in a hierarchy way. It is ubiquitous in our daily life, ranging from product taxonomies used by online retailers, topic taxonomies deployed by news outlets and social media, as well as scientific taxonomies deployed by digital libraries across various domains. When properly analyzed, these taxonomies can play a vital role for science, engineering, business intelligence, policy design, e-commerce, and more. Intuitive examples are used throughout enabling readers to grasp concepts more easily.
Unstructured text, as one of the most important data forms, plays a crucial role in data-driven decision making in domains ranging from social networking and information retrieval to scientific research and healthcare informatics. In many emerging applications, people's information need from text data is becoming multidimensional—they demand useful insights along multiple aspects from a text corpus. However, acquiring such multidimensional knowledge from massive text data remains a challenging task. This book presents data mining techniques that turn unstructured text data into multidimensional knowledge. We investigate two core questions. (1) How does one identify task-relevant text data with declarative queries in multiple dimensions? (2) How does one distill knowledge from text data in a multidimensional space? To address the above questions, we develop a text cube framework. First, we develop a cube construction module that organizes unstructured data into a cube structure, by discovering latent multidimensional and multi-granular structure from the unstructured text corpus and allocating documents into the structure. Second, we develop a cube exploitation module that models multiple dimensions in the cube space, thereby distilling from user-selected data multidimensional knowledge. Together, these two modules constitute an integrated pipeline: leveraging the cube structure, users can perform multidimensional, multigranular data selection with declarative queries; and with cube exploitation algorithms, users can extract multidimensional patterns from the selected data for decision making. The proposed framework has two distinctive advantages when turning text data into multidimensional knowledge: flexibility and label-efficiency. First, it enables acquiring multidimensional knowledge flexibly, as the cube structure allows users to easily identify task-relevant data along multiple dimensions at varied granularities and further distill multidimensional knowledge. Second, the algorithms for cube construction and exploitation require little supervision; this makes the framework appealing for many applications where labeled data are expensive to obtain.
Data Mining: Concepts and Techniques, Fourth Edition introduces concepts, principles, and methods for mining patterns, knowledge, and models from various kinds of data for diverse applications. Specifically, it delves into the processes for uncovering patterns and knowledge from massive collections of data, known as knowledge discovery from data, or KDD. It focuses on the feasibility, usefulness, effectiveness, and scalability of data mining techniques for large data sets. After an introduction to the concept of data mining, the authors explain the methods for preprocessing, characterizing, and warehousing data. They then partition the data mining methods into several major tasks, introducing concepts and methods for mining frequent patterns, associations, and correlations for large data sets; data classificcation and model construction; cluster analysis; and outlier detection. Concepts and methods for deep learning are systematically introduced as one chapter. Finally, the book covers the trends, applications, and research frontiers in data mining. Presents a comprehensive new chapter on deep learning, including improving training of deep learning models, convolutional neural networks, recurrent neural networks, and graph neural networks Addresses advanced topics in one dedicated chapter: data mining trends and research frontiers, including mining rich data types (text, spatiotemporal data, and graph/networks), data mining applications (such as sentiment analysis, truth discovery, and information propagattion), data mining methodologie and systems, and data mining and society Provides a comprehensive, practical look at the concepts and techniques needed to get the most out of your data
The real-world data, though massive, is largely unstructured, in the form of natural-language text. It is challenging but highly desirable to mine structures from massive text data, without extensive human annotation and labeling. In this book, we investigate the principles and methodologies of mining structures of factual knowledge (e.g., entities and their relationships) from massive, unstructured text corpora. Departing from many existing structure extraction methods that have heavy reliance on human annotated data for model training, our effort-light approach leverages human-curated facts stored in external knowledge bases as distant supervision and exploits rich data redundancy in large text corpora for context understanding. This effort-light mining approach leads to a series of new principles and powerful methodologies for structuring text corpora, including (1) entity recognition, typing and synonym discovery, (2) entity relation extraction, and (3) open-domain attribute-value mining and information extraction. This book introduces this new research frontier and points out some promising research directions.
Encapsulation Technologies for Electronic Applications, Second Edition, offers an updated, comprehensive discussion of encapsulants in electronic applications, with a primary emphasis on the encapsulation of microelectronic devices and connectors and transformers. It includes sections on 2-D and 3-D packaging and encapsulation, encapsulation materials, including environmentally friendly 'green' encapsulants, and the properties and characterization of encapsulants. Furthermore, this book provides an extensive discussion on the defects and failures related to encapsulation, how to analyze such defects and failures, and how to apply quality assurance and qualification processes for encapsulated packages. In addition, users will find information on the trends and challenges of encapsulation and microelectronic packages, including the application of nanotechnology. Increasing functionality of semiconductor devices and higher end used expectations in the last 5 to 10 years has driven development in packaging and interconnected technologies. The demands for higher miniaturization, higher integration of functions, higher clock rates and data, and higher reliability influence almost all materials used for advanced electronics packaging, hence this book provides a timely release on the topic. Provides guidance on the selection and use of encapsulants in the electronics industry, with a particular focus on microelectronics Includes coverage of environmentally friendly 'green encapsulants' Presents coverage of faults and defects, and how to analyze and avoid them
Our ability to generate and collect data has been increasing rapidly. Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generate data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge. Like the first edition, voted the most popular data mining book by KD Nuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and applications. This new edition substantially enhances the first edition, and new chapters have been added to address recent developments on mining complex types of data— including stream data, sequence data, graph structured data, social network data, and multi-relational data. A comprehensive, practical look at the concepts and techniques you need to know to get the most out of real business data Updates that incorporate input from readers, changes in the field, and more material on statistics and machine learning Dozens of algorithms and implementation examples, all in easily understood pseudo-code and suitable for use in real-world, large-scale data mining projects Complete classroom support for instructors at www.mkp.com/datamining2e companion site
A Guide to Chinese Medicine on the Internet frees readers from having to sift through countless websites to find up-to-date, high quality, reliable information on all types of Chinese medicine. This handy resource provides an introduction to the terms and philosophies of Chinese medicine in addition to an extensive categorized listing of online sites related to Chinese culture and medicine, complete with a brief description of each site's content. Guidelines are provided for searching, cataloging, and evaluating websites concerned with Chinese medicine, based on the author's research and personal experience as a practitioner and user of Chinese medicines.
This book presents the state of the art in distributed machine learning algorithms that are based on gradient optimization methods. In the big data era, large-scale datasets pose enormous challenges for the existing machine learning systems. As such, implementing machine learning algorithms in a distributed environment has become a key technology, and recent research has shown gradient-based iterative optimization to be an effective solution. Focusing on methods that can speed up large-scale gradient optimization through both algorithm optimizations and careful system implementations, the book introduces three essential techniques in designing a gradient optimization algorithm to train a distributed machine learning model: parallel strategy, data compression and synchronization protocol. Written in a tutorial style, it covers a range of topics, from fundamental knowledge to a number of carefully designed algorithms and systems of distributed machine learning. It will appeal to a broad audience in the field of machine learning, artificial intelligence, big data and database management.
Data Mining: Concepts and Techniques provides the concepts and techniques in processing gathered data or information, which will be used in various applications. Specifically, it explains data mining and the tools used in discovering knowledge from the collected data. This book is referred as the knowledge discovery from data (KDD). It focuses on the feasibility, usefulness, effectiveness, and scalability of techniques of large data sets. After describing data mining, this edition explains the methods of knowing, preprocessing, processing, and warehousing data. It then presents information about data warehouses, online analytical processing (OLAP), and data cube technology. Then, the methods involved in mining frequent patterns, associations, and correlations for large data sets are described. The book details the methods for data classification and introduces the concepts and methods for data clustering. The remaining chapters discuss the outlier detection and the trends, applications, and research frontiers in data mining. This book is intended for Computer Science students, application developers, business professionals, and researchers who seek information on data mining. Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of your data
Thank you for visiting our website. Would you like to provide feedback on how we could improve your experience?
This site does not use any third party cookies with one exception — it uses cookies from Google to deliver its services and to analyze traffic.Learn More.