Tutorial Sessions

All tutorials are free to registered attendees of all conferences held at WORLDCOMP'10. Those interested in attending one or more tutorials should sign up on site at the conference registration desk in Las Vegas. A complete and current list of WORLDCOMP tutorials can be found here.

In addition to the tutorials at other conferences, DMIN'10 aims to provide a set of tutorials dedicated to data mining topics. The 2007 key tutorial was given by Prof. Eamonn Keogh on Time Series Clustering. The 2008 key tutorial was presented by Mikhail Golovnya (Senior Scientist, Salford Systems, USA) on Advanced Data Mining Methodologies.

DMIN'09 offered four tutorials: Prof. Nitesh V. Chawla on Data Mining with Sensitivity to Rare Events and Class Imbalance, Prof. Asim Roy on Autonomous Machine Learning, Dan Steinberg (CEO of Salford Systems) on Advanced Data Mining Methodologies, and Peter Geczy on Emerging Human-Web Interaction Research.

DMIN'10 will host the following tutorials:

Tutorial A
Speaker: Prof. Vladimir Cherkassky
Fellow of IEEE;
ECE Department, University of Minnesota, Minneapolis, MN, USA;
Former Director, NATO Advanced Study Institute (ASI);
www.ece.umn.edu/users/cherkass/predictive_learning
Served on the editorial boards of IEEE Transactions on Neural Networks,
the Neural Networks Journal, the Natural Computing Journal and the Neural Processing Letters

Topic: Advanced Methodologies for Learning with Sparse Data
Webpage: http://www.ece.umn.edu/users/cherkass/predictive_learning/
Date & Time: July 13, 2010 (6:00pm - 9:30pm)
Location: Ballroom 5
Description

OVERVIEW: The field of predictive learning is concerned with estimating 'good' predictive models from available data. Such problems can usually be stated in the framework of inductive learning, where the goal is to derive a good predictive model from known observations (training data samples). In recent years, there has been growing interest in applying learning methods to sparse high-dimensional data (e.g., in genomics, medical imaging, and object recognition). In such applications, many successful approaches are minor modifications of existing inductive learning methods (such as neural networks, support vector machines, and discriminant analysis) combined with clever preprocessing and feature extraction. At the same time, the statistical learning community is moving towards the development and better understanding of new, non-standard and non-inductive learning settings. Examples include (a) several powerful learning formulations developed in VC-theory, such as transduction, learning through contradictions, and SVM+ (Vapnik, 1998, 2006); and (b) non-standard settings proposed in the machine learning community, such as multi-task learning (Ben-David et al., 2002) and semi-supervised learning (Chapelle et al., 2006). These new formulations are motivated by the practical need to improve generalization when learning with sparse high-dimensional data. This tutorial will present an overview of these recent learning formulations, investigate possible connections between them, and discuss application examples illustrating the advantages of these approaches for sparse high-dimensional data. The presentation will be based, to a large extent, on the conceptual framework developed by Vapnik (1998, 2006).
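
To make the contrast concrete, here is a minimal sketch (not part of the tutorial material; scikit-learn and every parameter choice below are our own assumptions) that trains an ordinary inductive SVM on a small labeled sample and compares it with a semi-supervised learner that also sees the unlabeled test inputs, in the spirit of transduction:

    # Illustrative sketch only: standard inductive learning vs. a
    # semi-supervised setting on sparse, high-dimensional data.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.semi_supervised import LabelSpreading
    from sklearn.svm import SVC

    # Sparse setting: few labeled samples relative to the number of features.
    X, y = make_classification(n_samples=200, n_features=500,
                               n_informative=20, random_state=0)
    X_lab, X_unl, y_lab, y_unl = train_test_split(X, y, test_size=0.5,
                                                  random_state=0)

    # Standard inductive learning: fit on the labeled data alone, then predict.
    svm = SVC(kernel='linear').fit(X_lab, y_lab)
    print('inductive SVM accuracy:', svm.score(X_unl, y_unl))

    # Semi-supervised setting: the learner also sees the unlabeled inputs
    # (marked with label -1) and labels them directly, as in transduction.
    y_partial = np.concatenate([y_lab, -np.ones(len(y_unl), dtype=int)])
    ssl = LabelSpreading(kernel='knn', n_neighbors=7)
    ssl.fit(np.vstack([X_lab, X_unl]), y_partial)
    print('semi-supervised accuracy:',
          np.mean(ssl.transduction_[len(y_lab):] == y_unl))

Whether the second learner actually helps depends on how well its neighborhood assumptions match the data; this sensitivity is exactly the kind of practical issue the tutorial examines.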

CONTENT: This tutorial covers three major parts. The first part presents the VC-theoretical framework for predictive learning and discusses the standard inductive learning setting, in order to motivate alternative approaches. The second part presents several non-standard learning formulations, such as transduction, learning through contradictions, learning with hidden information, and multi-task learning. The third part discusses practical issues and difficulties arising in the application of these advanced learning techniques. Throughout the tutorial, important points are illustrated by empirical comparisons and related to practical (mainly biomedical) applications.
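
As one concrete instance of the multi-task idea, the sketch below (our own illustration, not tutorial code) implements mean-regularized multi-task ridge regression: each task's weight vector is pulled towards the across-task mean, so tasks with few samples borrow statistical strength from related tasks.

    # Illustrative sketch only: mean-regularized multi-task ridge regression.
    import numpy as np

    def multitask_ridge(tasks, lam=1.0, mu=1.0, iters=50):
        """tasks: list of (X, y) pairs sharing one feature space.
        Minimizes, per task t: ||X_t w_t - y_t||^2 + lam*||w_t||^2
                               + mu*||w_t - w_bar||^2   (w_bar = mean of all w_t)
        by alternating closed-form ridge updates with refreshes of w_bar."""
        d = tasks[0][0].shape[1]
        W = np.zeros((len(tasks), d))
        for _ in range(iters):
            w_bar = W.mean(axis=0)
            for t, (X, y) in enumerate(tasks):
                A = X.T @ X + (lam + mu) * np.eye(d)
                W[t] = np.linalg.solve(A, X.T @ y + mu * w_bar)
        return W

    # Five related tasks, each with only 10 samples in 20 dimensions.
    rng = np.random.default_rng(0)
    w_true = rng.normal(size=20)
    tasks = []
    for _ in range(5):
        X = rng.normal(size=(10, 20))
        tasks.append((X, X @ (w_true + 0.1 * rng.normal(size=20))))
    W = multitask_ridge(tasks)
    print('average per-dimension spread across tasks:', np.std(W, axis=0).mean())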

TUTORIAL DURATION: 2.5 hours

INTENDED AUDIENCE: Researchers and practitioners interested in understanding advanced learning methodologies and their applications. The tutorial will also help develop a better understanding of the methodological issues in learning with high-dimensional data.

References

S. Ben-David, J. Gehrke, and R. Schuller, A theoretical framework for learning from a pool of disparate data sources, Proc. ACM KDD, 2002.

O. Chapelle, B. Schölkopf, and A. Zien, Eds., Semi-Supervised Learning, MIT Press, 2006.

V. Cherkassky and Y. Ma, Data complexity, margin-based learning and Popper's philosophy of inductive learning, in Data Complexity in Pattern Recognition, M. Basu and T. Ho, Eds., Springer, 2006.

V. Cherkassky and F. Mulier, Learning from Data, second edition, Wiley, 2007.

V. Cherkassky, F. Cai, and L. Liang, Predictive learning with sparse heterogeneous data, Proc. IJCNN, 2009.

V. Vapnik, Statistical Learning Theory, Wiley, 1998.

V. Vapnik, Empirical Inference Science: Afterword of 2006, Springer, 2006.


Short Bio

Vladimir Cherkassky is Professor of Electrical and Computer Engineering at the University of Minnesota. He received his Ph.D. in Electrical Engineering from the University of Texas at Austin in 1985. His current research is on methods for predictive learning from data, and he has co-authored the monograph Learning from Data, published by Wiley in 1998. Prof. Cherkassky has served on the Governing Board of INNS and on the editorial boards of IEEE Transactions on Neural Networks, the Neural Networks Journal, the Natural Computing Journal, and Neural Processing Letters. He has served on the program committees of major international conferences on artificial neural networks. He was Director of the NATO Advanced Study Institute (ASI) From Statistics to Neural Networks: Theory and Pattern Recognition Applications, held in France in 1993. He has presented numerous tutorials on neural network and statistical methods for learning from data. In 2007 he became a Fellow of the IEEE for 'contributions and leadership in statistical learning and neural network research'.


Tutorial B
Speaker: Dr. Peter Geczy

Topic: Web Mining: Opportunities and Challenges
Date & Time: July 12, 2010 (6:50pm - 8:50pm)
Location: Ballroom 6
Description

ABSTRACT: The development of the World Wide Web has been influencing various domains of commerce, government, and academia. Its fast-paced growth and widespread adoption inherently present numerous opportunities and challenges. The web incorporates a broad range of data available for exploration. The data is highly diverse and voluminous, and exhibits dynamics reflecting the web's evolution. Researchers and practitioners have been mining web data for over a decade, yet plenty remains to be done. We will briefly survey the status quo, highlight selected approaches, and expose promising directions in web mining.
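
As a small taste of what mining web data can mean in practice, the following sketch (illustrative only and not part of the tutorial; the URL is a placeholder) fetches one page with Python's standard library and extracts two elementary web-mining signals: the page's outgoing links and its most frequent content terms.

    # Illustrative sketch only: elementary web content and structure mining.
    import re
    from collections import Counter
    from html.parser import HTMLParser
    from urllib.request import urlopen

    class LinkAndTextParser(HTMLParser):
        """Collects outgoing links (structure) and raw text (content)."""
        def __init__(self):
            super().__init__()
            self.links, self.chunks = [], []
        def handle_starttag(self, tag, attrs):
            if tag == 'a':
                self.links += [v for k, v in attrs if k == 'href' and v]
        def handle_data(self, data):
            self.chunks.append(data)

    # Placeholder URL; any crawlable page works.
    html = urlopen('http://example.com/').read().decode('utf-8', 'replace')
    parser = LinkAndTextParser()
    parser.feed(html)

    terms = re.findall(r'[a-z]{3,}', ' '.join(parser.chunks).lower())
    print('outgoing links:', len(parser.links))
    print('top terms:', Counter(terms).most_common(5))

Real web mining scales this up across entire crawls, user sessions, and time, which is where the diversity, volume, and dynamics noted above become the central challenges.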

OBJECTIVE: The objective of this tutorial is to provide a concise overview of the present state of the field: its issues, inherent difficulties, contemporary approaches, and potential future opportunities. An exposé of the state of the art in web mining should prove beneficial to a wide spectrum of individuals researching, studying, and/or utilizing web mining techniques for both academic and commercial purposes.

TUTORIAL DURATION: approx. 2 hours

INTENDED AUDIENCE: The tutorial aims to approach a broad audience including, but not limited to:

- Students and Educators
- Academics and Researchers
- Practitioners and Managers

The presentation will be accessible and intuitive, without extensive technical details.

Short Bio

Dr. Peter Geczy is a senior scientist at the National Institute of Advanced Industrial Science and Technology (AIST). He has also held positions at the Institute of Physical and Chemical Research (RIKEN) and the Research Center for Future Technology. His interdisciplinary scientific interests encompass data and web mining, human interactions and behavior in digital environments, information systems, knowledge management and engineering, artificial intelligence, and machine learning. His recent research also extends to the spheres of service science, engineering, management, and computing. He has received several awards in recognition of his accomplishments. Dr. Geczy has served on various professional committees and editorial boards, and has been a distinguished speaker in academia and industry.

Keynotes

Keynote
Speaker: Prof. Vladimir Cherkassky
Fellow of IEEE;
ECE Department, University of Minnesota, Minneapolis, MN, USA;
Former Director, NATO Advanced Study Institute (ASI);
www.ece.umn.edu/users/cherkass/predictive_learning
Served on the editorial boards of IEEE Transactions on Neural Networks,
the Neural Networks Journal, the Natural Computing Journal and the Neural Processing Letters

Topic: Predictive Data Modeling and the Nature of Scientific Discovery
Webpage: www.ece.umn.edu/users/cherkass/predictive_learning
Date & Time: July 12, 2010 (6:00pm - 6:50pm)
Location: Ballroom 6
Description

Scientific discovery involves interaction between two major components:

  • facts, or observations of the Real World (or Nature);

  • scientific theories (models), i.e., mental constructs explaining the observed data.

In classical science, the primary role belongs to a well-defined scientific hypothesis, which drives data collection and generation; experimental data is used simply to confirm or refute a scientific theory. In the late 20th century, the balance between facts and models in scientific research shifted completely, due to the growing use of digital technology for data collection and recording. Nowadays there is an abundance of available data describing physical, biological, and social systems. New technologies such as machine learning and data mining hold the promise of 'discovering' new knowledge hidden in this sea of data. Much recent research in the life sciences is data-driven, i.e., researchers try to establish 'associations' between certain genetic variables and a disease. This is completely different from the classical approach to scientific discovery. Whereas many machine learning and statistical methods can easily detect correlations present in empirical data, it is not clear whether such dependencies constitute new biological knowledge. This is known as the problem of demarcation in the philosophy of science: differentiating between true scientific theories and metaphysical theories (beliefs).

Knowledge that can be extracted from empirical data is statistical in nature, as opposed to the deterministic first-principle knowledge of classical science. Modern science is mainly about such empirical knowledge, yet there seems to be no clear demarcation between true empirical knowledge and beliefs (supported by empirical data). My talk will discuss methodological issues important for predictive data modeling:

  • first-principle knowledge, empirical knowledge, and beliefs;

  • understanding of uncertainty and risk;

  • predictive data modeling;

  • interpretation of predictive models.

These methodological issues are closely related to philosophical ideas dating back to Plato and Aristotle. The main points will be illustrated with specific examples from an ongoing project on predicting transplant-related mortality for bone marrow transplant patients, in collaboration with the University of Minnesota Medical School and the Mayo Clinic.


Short Bio

See the speaker's short bio under Tutorial A above.



Contact

Robert Stahlbock
General Conference Chair

E-mail: conference-chair@dmin-2010.com


Sven F. Crone

Programme Chair

E-mail: programme-chair@dmin-2010.com


Philippe Lenca

Tutorial Chair

E-mail: tutorial-chair@dmin-2010.com


This website is hosted by the Lancaster Centre for Forecasting at the Department of Management Science at Lancaster University Management School.

© 2002-2010 BI3S-lab - All rights reserved - The Lancaster Centre for Forecasting @ www.forecasting-centre.com - last update: 30.06.2010 - Questions, comments, and enquiries via e-mail