Books by Themis Palpanas
-
564,95 kr. Data usually comes in a plethora of formats and dimensions, which makes exploration and information extraction challenging. Thus, being able to perform exploratory analyses on the data, with the intent of getting an immediate glimpse of some of its properties, is becoming crucial. Exploratory analyses should be simple enough to avoid complicated declarative languages (such as SQL) and mechanisms, and at the same time retain the flexibility and expressiveness of such languages. Recently, we have witnessed a rediscovery of the so-called example-based methods, in which the user, or the analyst, circumvents query languages by using examples as input. An example is a representative of the intended results, or in other words, an item from the result set. Example-based methods exploit inherent characteristics of the data to infer the results that the user has in mind but may not be able to (easily) express. They can be useful when a user is looking for information in an unfamiliar dataset, when the task is particularly challenging, like finding duplicate items, or simply when the user is exploring the data. In this book, we present an excursus on the main methods for exploratory analysis, with a particular focus on example-based methods. We show that different data types require different techniques, and present algorithms that are specifically designed for relational, textual, and graph data. The book also presents the challenges and the new frontiers of machine learning in online settings, which have recently attracted the attention of the database community. The lecture concludes with a vision for further research and applications in this area.
- Book
- 564,95 kr.
-
454,95 kr. By its nature, the term "data quality", with its generic meaning of "fitness for use", has both subjective and objective aspects. To demonstrate how one can benefit from measuring and controlling the quality of one's data, in this book we present three real-world use cases that demonstrate a top-down research approach to the data quality scope in three different real-world applications. In particular, we study the following problems: 1) how quality of data can be defined and propagated to customers in a business intelligence application for quality-aware decision making; 2) how data quality can be defined, measured and used in a web-based system operating with semi-structured data from and designated to both humans and machines; 3) how a data-driven (vs. system-driven) time-related data quality notion of staleness can be defined, efficiently measured and monitored in a generic information system. The work should help researchers and professionals working both on generic data quality problems, such as its understanding in a given context, and on specific applications of data quality, such as the measurement of its dimensions.
- Book
- 454,95 kr.