Ulf Leser's Books

Covering Or Complete?

Discovering Conditional Inclusion Dependencies

By: Jana Bauckmann, Ziawasch Abedjan, Ulf Leser, Heiko Müller, Felix Naumann

Data dependencies, or integrity constraints, are used to improve the quality of a database schema, to optimize queries, and to ensure consistency in a database. In recent years, conditional dependencies have been introduced to analyze and improve data quality. In short, a conditional dependency is a dependency with a limited scope defined by conditions over one or more attributes. Only the matching part of the instance must adhere to the dependency. In this paper we focus on conditional inclusion dependencies (CINDs). We generalize the definition of CINDs, distinguishing covering and completeness conditions. We present a new use case for such CINDs, showing their value for solving complex data quality tasks. Further, we define quality measures for conditions, inspired by precision and recall. We propose efficient algorithms that identify covering and completeness conditions conforming to given quality thresholds. Our algorithms automatically choose not only the condition values but also the condition attributes. Finally, we show that our approach efficiently provides meaningful and helpful results for our use case.
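To make the idea of a conditional inclusion dependency concrete, here is a minimal Python sketch. It only illustrates the general notion from the abstract (an inclusion dependency that must hold just for the rows matching a condition). It does not reproduce the paper's algorithms, its covering and completeness conditions, or its quality measures. The function name `cind_holds`, the row layout, and the sample data are hypothetical.

```python
def cind_holds(dep_rows, dep_attr, ref_values, condition):
    """Check a conditional inclusion dependency on in-memory rows.

    dep_rows   -- list of dicts, the dependent relation (hypothetical layout)
    dep_attr   -- attribute whose values must appear in ref_values
    ref_values -- iterable of values of the referenced attribute
    condition  -- dict attribute -> required value; limits the scope
    """
    ref = set(ref_values)
    # Only rows that match every condition attribute are in scope.
    matching = [
        row for row in dep_rows
        if all(row.get(attr) == value for attr, value in condition.items())
    ]
    # Holds trivially if no row matches the condition.
    return all(row[dep_attr] in ref for row in matching)


# Hypothetical sample data: cross-references to several databases.
links = [
    {"db": "uniprot", "target_id": "P1"},
    {"db": "uniprot", "target_id": "P2"},
    {"db": "pdb", "target_id": "X9"},
]
protein_ids = ["P1", "P2", "P3"]

print(cind_holds(links, "target_id", protein_ids, {"db": "uniprot"}))
print(cind_holds(links, "target_id", protein_ids, {}))
```

With an empty condition the check degenerates to an ordinary inclusion dependency over all rows, which fails here because of the `pdb` row. Restricting the scope to `db = "uniprot"` makes it hold.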

Publisher: Universitätsverlag Potsdam
ISBN 10: 3869562129
ISBN 13: 9783869562124

Efficient and Exact Computation of Inclusion Dependencies for Data Integration

By: Jana Bauckmann, Ulf Leser, Felix Naumann

Data obtained from foreign data sources often come with only superficial structural information, such as relation names and attribute names. Other types of metadata that are important for effective integration and meaningful querying of such data sets are missing. In particular, relationships among attributes, such as foreign keys, are crucial metadata for understanding the structure of an unknown database. The discovery of such relationships is difficult because, in principle, each pair of data values must be compared for each pair of attributes in the database. A precondition for a foreign key is an inclusion dependency (IND) between the key and the foreign key attributes. We present Spider, an algorithm that efficiently finds all INDs in a given relational database. It leverages the sorting facilities of the DBMS but performs the actual comparisons outside of the database to save computation. Spider analyzes very large databases up to an order of magnitude faster than previous approaches. We also evaluate in detail the effectiveness of several heuristics to reduce the number of necessary comparisons. Furthermore, we generalize Spider to find composite INDs, which cover multiple attributes, and partial INDs, which are true INDs for all but a certain number of values. This last type is particularly relevant when integrating dirty data, as is often the case in the life sciences domain, our driving motivation.
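As a small illustration of what an inclusion dependency is, the following Python sketch tests every ordered pair of columns with a naive set comparison. This is not Spider: it performs exactly the brute-force pairwise comparison that the abstract calls expensive, and it does no sorting in the DBMS. The function names, the choice to ignore `None` values, and the sample columns are assumptions made for this example only.

```python
from itertools import permutations


def unary_inds(columns):
    """Return all pairs (dep, ref) where every value of column `dep`
    also occurs in column `ref`. Naive check, not the Spider algorithm."""
    # Assumption: None stands for NULL and is ignored.
    value_sets = {
        name: {v for v in values if v is not None}
        for name, values in columns.items()
    }
    return sorted(
        (dep, ref)
        for dep, ref in permutations(value_sets, 2)
        if value_sets[dep] <= value_sets[ref]
    )


def partial_ind_holds(dep_values, ref_values, max_violations):
    """A partial IND tolerates a limited number of distinct dependent
    values that are missing from the referenced column."""
    missing = set(dep_values) - set(ref_values)
    return len(missing) <= max_violations


# Hypothetical columns from three relations.
columns = {
    "gene.id": [1, 2, 3],
    "protein.gene_id": [1, 2, 2],
    "xref.gene_id": [2, 3, 9],
}

print(unary_inds(columns))
print(partial_ind_holds(columns["xref.gene_id"], columns["gene.id"], 1))
```

In this sample only `protein.gene_id` is fully included in `gene.id`. The column `xref.gene_id` has one value that is missing from `gene.id`, so it qualifies only as a partial IND with one tolerated violation.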

Publisher: Universitätsverlag Potsdam
ISBN 10: 3869560487
ISBN 13: 9783869560489

A History of the International Movement of Journalists

Professionalism Versus Politics

By: Kaarle Nordenstreng, Ulf Jonas Björk, Frank Beyersdorf, Svennik Høyer, Epp Lauk

This study presents a general history of how journalism as an emerging profession became internationally organized over the past one hundred and twenty years, seen mainly through the associations founded to promote the interests of journalists around the world.

Page Count: 285
Publisher: Springer
ISBN 10: 1137530553
ISBN 13: 9781137530554

Data Integration in the Life Sciences

Third International Workshop, DILS 2006, Hinxton, UK, July 20-22, 2006, Proceedings

By: Ulf Leser, Felix Naumann, Barbara Eckman

This book constitutes the refereed proceedings of the Third International Workshop on Data Integration in the Life Sciences, DILS 2006, held in Hinxton, UK, in July 2006. It presents 19 revised full papers and 4 revised short papers together with 2 keynote talks, addressing current issues in data integration from the life science point of view. The papers are organized in topical sections on data integration, text mining, systems, and workflow.

Page Count: 308
Publisher: Springer Science & Business Media
ISBN 10: 3540365931
ISBN 13: 9783540365938
