Talk title: Efficient and Exact Methods for Big Data Reduction
Speaker: 王杰 (Wang Jie)
Time: next Wednesday (March 30), 10:00 a.m.
Venue: Room 507, Computer Science Building, Jiangjun Road Campus
Abstract:
Big data concerns growing data sets with a huge number of samples, very high-dimensional feature vectors, and complex and diverse structures. Many traditional techniques are inadequate for extracting knowledge and insights from these data sets because of their ever-greater volume and complex structures. An encouraging discovery of the early empirical studies on big data was the recognition that many massive real-world data sets can be well interpreted by a small number of features and/or samples. For example, given a certain visual stimulus, the fraction of neurons active at that instant is small. This observation has inspired an emerging research area known as sparse learning, which has achieved great success in learning from large and complex data by uncovering a small set of the most explanatory features and/or samples. Typical examples include selecting the features that are most indicative of users’ preferences for recommender systems, identifying brain regions that are predictive of brain disorders based on fMRI data, and extracting semantic information from raw images for object recognition. Despite this success, training sparse learning methods on large and complex data can be very time-consuming because of their non-smooth and highly complex regularization terms. To address this, we propose a suite of novel techniques, called screening, to quickly identify redundant features and/or samples, which can then be removed from the training phase without losing useful information of interest. Success in these screening techniques is expected to scale up sparse learning methods for large and complex data dramatically, improving efficiency and memory usage by several orders of magnitude. This will significantly expand the use of sparse learning methods to data sets much larger than was previously feasible, with direct impact on many fields where sparse learning is critical, e.g., social media mining, brain data analytics, and imaging genetics.
Speaker biography:
Dr. Wang received the B.Sc. degree in electronic information science and technology from the University of Science and Technology of China in 2005, and the Ph.D. degree in computational science from Florida State University in 2011. He was a postdoctoral researcher at Arizona State University and the University of Michigan from 2012 to 2015. He was then promoted to research assistant professor at the University of Michigan in 2015. He has broad interests in large-scale optimization, machine learning, data mining, etc., and their applications to biomedical informatics. He has published many papers in top machine learning and data mining journals and conferences such as TPAMI, JMLR, TIP, NIPS, ICML and KDD.
College of Computer Science and Technology (and Office of International Cooperation and Exchange)