Guide to Intelligent Data Science: How to Intelligently Make Use of Real Data

Michael R. Berthold, Christian Borgelt, Frank Höppner, Frank Klawonn, Rosaria Silipo

Teaching material available via open access license

The new book is available now on Springer

  • Supplies a broad range of perspectives on data science, providing readers with a comprehensive account of the field

  • Presents a focus on practical aspects, in addition to a detailed description of the theory

  • Emphasizes the common pitfalls that often lead to incorrect or insufficient analyses, to help readers avoid such errors

  • Includes extensive hands-on examples, enabling readers to gain further insight into the topic

Making use of data is no longer a niche activity but central to almost every project.

With access to massive compute resources and vast amounts of data, it seems possible, at least in principle, to solve any problem. However, successful data science projects result from the intelligent application of: human intuition in combination with computational power; sound background knowledge with computer-aided modelling; and critical reflection on the obtained insights and results.

Substantially updating the previous edition, then entitled Guide to Intelligent Data Analysis, this core textbook continues to provide a hands-on instructional approach to many data science techniques, and explains how these are used to solve real-world problems. The work balances the practical aspects of applying and using data science techniques with the theoretical and algorithmic underpinnings from mathematics and statistics. Major updates on techniques and subject coverage (including deep learning) are included.

Topics and features:

  • Guides the reader through the process of data science, following the interdependent steps of project understanding, data understanding, data blending and transformation, modeling, as well as deployment and monitoring

  • Includes numerous examples using the open source KNIME Analytics Platform, together with an introductory appendix

  • Provides a review of the basics of classical statistics that support and justify many data analysis methods, and a glossary of statistical terms

  • Integrates illustrations and case-study-style examples to support pedagogical exposition

  • Supplies further tools and information at an associated website

This practical and systematic textbook/reference is a “need-to-have” tool for graduate and advanced undergraduate students and essential reading for all professionals who face data science problems. Moreover, it is a “need to use, need to keep” resource following one's exploration of the subject.

The Authors

Michael R. Berthold
Professor of Bioinformatics and Information Mining at the University of Konstanz, Germany and Co-Founder of KNIME AG, Zurich, Switzerland

Christian Borgelt
Professor of Data Science at the University of Salzburg, Austria

Frank Höppner
Professor of Information Systems at Ostfalia University of Applied Sciences, Germany

Frank Klawonn
Professor in the Department of Computer Science and Head of the Data Analysis and Pattern Recognition Laboratory at Ostfalia University of Applied Sciences, Germany and Head of the Bioinformatics and Statistics group at the Helmholtz Centre for Infection Research, Braunschweig, Germany

Rosaria Silipo
Principal Data Scientist and Head of Evangelism at KNIME AG, Zurich, Switzerland

Times Higher Education, 26 May 2011

“The authors, leading scholars in the field based in Germany and Spain, seek to offer a hands-on instructional approach to basic data analysis techniques and consider their use in solving problems. The reader is taken through the process, following the interlinked steps of project understanding, data understanding, data preparation, modelling, and deployment and monitoring. The text reviews the basics of classical statistics that support and justify many data analysis methods, and includes a glossary of statistical terms.”