In both the database and machine learning communities, data quality has become a serious issue which cannot be ignored. In this context, we refer to data with quality problems as “dirty data.” Clearly, for a given data mining or machine learning task, dirty data in both training and test datasets can affect the accuracy of results. Accordingly, this book analyzes the impacts of dirty data and explores effective methods for dirty data processing. Although existing data cleaning methods improve data quality dramatically, the cleaning costs are still high. If we knew how dirty data affected the accuracy of machine learning models, we could clean data selectively according to the accuracy requirements instead of cleaning all dirty data, which entails substantial costs. However, no book to date has studied the impacts of dirty data on machine learning models in terms of data quality. Filling precisely this gap, the book is intended for a broad audience ranging from researchers in the database and machine learning communities to industry practitioners. Readers will find valuable takeaway suggestions on: model selection and data cleaning; incomplete data classification with view-based decision trees; density-based clustering for incomplete data; the feature selection method, which reduces the time costs and guarantees the accuracy of machine learning models; and cost-sensitive decision tree induction approaches under different scenarios. Further, the book opens many promising avenues for the further study of dirty data processing, such as data cleaning on demand, constructing a model to predict dirty-data impacts, and integrating data quality issues into other machine learning models. Readers will be introduced to state-of-the-art dirty data processing techniques, and the latest research advances, while also finding new inspirations in this field.
This book examines teacher agency in implementing English as a Foreign Language (EFL) curriculum reform in the Chinese university context. It theorizes the concept of teacher agency from a sociocultural theory perspective and draws on a study conducted in a conservative and less developed area in China. The book uses Engeström's activity theory and Vygotsky's concept of the Zone of Proximal Development (ZPD) to understand the nature and extent of teacher agency in adapting one’s teaching with respect to beliefs, knowledge and instructional practices. The study concludes that curriculum reform in China needs to shift from reliance on 'top-down' policies to 'bottom-up' implementation that mobilizes local understandings and practices. One of the implications of this study is that transformative teacher education programs aimed at developing teacher pedagogical agency require that teachers have ongoing opportunities to design, develop and evaluate curriculum-based mediational means.
This book presents a theoretical study on aspect in Chinese, including both situation and viewpoint aspects. Unlike previous studies, which have largely classified linguistic units into different situation types, this study defines a set of ontological event types that are conceptually universal and on the basis of which different languages employ various linguistic devices to describe such events. To do so, it focuses on a particular component of events, namely the viewpoint aspect. It includes and discusses a wealth of examples to show how such ontological events are realized in Chinese. In addition, the study discusses how Chinese modal verbs and adverbs affect the distribution of viewpoint aspects associated with certain situation types. In turn, the book demonstrates how the proposed linguistic theory can be used in a computational context. Simply identifying events in terms of the verbs and their arguments is insufficient for real situations such as understanding the factivity and the logical/temporal relations between events. The proposed framework offers the possibility of analyzing events in Chinese text, yielding deep semantic information.
This book covers the major fundamentals of and the latest research on next-generation spatio-temporal recommendation systems in social media. It begins by describing the emerging characteristics of social media in the era of mobile internet, and explores the limitations of current recommender techniques. The book subsequently presents a series of latent-class user models to simulate users’ behaviors in decision-making processes, which effectively overcome the challenges arising from temporal dynamics of users’ behaviors, user interest drift over geographical regions, data sparsity and cold start. Based on these well-designed user models, the book develops effective multi-dimensional index structures such as Metric-Tree, and proposes efficient top-k retrieval algorithms to accelerate the process of online recommendation and support real-time recommendation. In addition, it offers methodologies and techniques for evaluating both the effectiveness and efficiency of spatio-temporal recommendation systems in social media. The book will appeal to a broad readership, from researchers and developers to undergraduate and graduate students.
This is the first book offering a systematic description of tongue image analysis and processing technologies and their typical applications in computerized tongue diagnostic (CTD) systems. It features the most current research findings in all aspects of tongue image acquisition, preprocessing, classification, and diagnostic support methodologies, from theoretical and algorithmic problems to prototype design and development of CTD systems. The book begins with an in-depth description of CTD on a need-to-know basis, including an overview of CTD systems and traditional Chinese medicine (TCM), to provide the context and background of tongue image analysis. The core part then introduces algorithms and their implementation methods at a know-how level, including image segmentation methods, chromatic correction, and classification of tongue images. Some clinical applications based on these methods are presented for show-how purposes in the CTD research field. Case studies highlight different techniques that have been adopted to assist the visual inspection of appendicitis, diabetes, and other common diseases. Experimental results under different challenging clinical circumstances have demonstrated the superior performance of these techniques. The principles of tongue image analysis are illustrated with plentiful graphs, tables, and practical experiments to provide insights into some of the problems. In this way, readers can find a quick and systematic way through the complicated theories and can later extend their studies to special topics of interest. This book will be of benefit to researchers, professionals, and graduate students working in the fields of computer vision, pattern recognition, clinical practice, and TCM, as well as those involved in interdisciplinary research.
This book addresses several knowledge discovery problems on multi-sourced data, where theories, techniques, and methods from data cleaning, data mining, and natural language processing are used in combination. It focuses mainly on three data models: multi-sourced isomorphic data, multi-sourced heterogeneous data, and text data. On the basis of these three data models, the book studies knowledge discovery problems, including truth discovery and fact discovery on multi-sourced data, in terms of four important properties: relevance, inconsistency, sparseness, and heterogeneity. It is useful for specialists as well as graduate students. Data, even when describing the same object or event, can come from a variety of sources such as crowd workers and social media users. However, noisy pieces of data or information are unavoidable. Given the daunting scale of data, it is unrealistic to expect humans to “label” or tell which data source is more reliable. Hence, it is crucial to identify trustworthy information from multiple noisy information sources, which is the task of knowledge discovery. At present, knowledge discovery research for multi-sourced data mainly faces two challenges. On the structural level, it is essential to consider the different characteristics of data composition and application scenarios and to define the knowledge discovery problem for each setting. On the algorithm level, the knowledge discovery task needs to consider different levels of information conflicts and design efficient algorithms to mine more valuable information using multiple clues. Existing knowledge discovery methods have defects on both the structural level and the algorithm level, leaving the knowledge discovery problem far from solved.
5th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2019, Guilin, China, September 20-23, 2019, Proceedings, Part II.
This two-volume set (CCIS 1058 and 1059) constitutes the refereed proceedings of the 5th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2019, held in Guilin, China, in September 2019. The 104 revised full papers presented in these two volumes were carefully reviewed and selected from 395 submissions. The papers cover a wide range of topics related to basic theory and techniques for data science, including data mining; databases; networks; security; machine learning; bioinformatics; natural language processing; software engineering; graphic images; systems; education; and applications.
Investigating the highly influential enrolment expansion policy in Chinese higher education, this book outlines how educational equity issues were understood and addressed in the formulation and implementation of the policy, and its impacts on the socio-economic fabric of China in the past decade. Drawing on Chinese policy documents and interviews with government and university representatives, Zhang examines the education system under the Mao era and the post-Mao era and outlines the different approaches to equity that have characterized education in China in the 20th and 21st centuries. Stephen Ball's 'policy cycle' is used as a framework to analyse the various contexts (text, discourse, and social practice) in which policy is formed. Zhang argues that education policy was not simply driven by concerns of equity, but also by economic interests and political discourse. Zhang goes on to analyse how education policy was implemented by provincial governments and highlights the tension between central policy and on-the-ground implementation. Bringing analysis of Chinese policy and research to a wider audience, this text will interest education policy makers and academics in the field of educational equity and higher education research.