The workshop will be held at B2 府城廳, Shangri-La's Far Eastern Plaza Hotel. The proceedings can be downloaded here.
More information about the venue:
- Address: 89 Section West, University Road, Tainan City 70146, Taiwan
- Phone: +886-6-702-8888
- Website: http://www.shangri-la.com/tainan/fareasternplazashangrila/
May 13th, 2014, Tuesday.
- 13:30~14:30 Invited talk: Distributed data classification
  Prof. Chih-Jen Lin (National Taiwan University)
  Large-scale classification in distributed environments has emerged as an important research topic because data larger than a machine's capacity are now very common. Unfortunately, most traditional classification algorithms were designed to run on a single computer, so creating new distributed algorithms is essential. In the first part of this talk, we discuss issues in parallelizing some existing linear classification methods. In particular, we propose a distributed Newton method for large-scale logistic regression and linear SVM. Interesting techniques are applied to reduce the communication cost. We present comparisons with state-of-the-art methods. In the second part of this talk, through a real-world application on CTR prediction, we show that the successful deployment of a distributed training method involves many other issues. For example, the storage/movement of raw data and the feature generation may be even more time consuming. The selection of parallel programming frameworks is also an important issue. We investigate these issues from the viewpoint of the whole workflow of big-data applications and present future challenges.
- 14:30~15:00 Break
- 15:00~15:15 7W3: Using Knowledge Graph to Handle Label Imperfection
  Yi Liu (Shanghai Jiao Tong University) and Huakang Li (Nanjing University of Posts and Telecommunications)
- 15:15~15:30 7W4: Learning to Display in Sponsored Search
  Xin Xin (Beijing Institute of Technology) and Heyan Huang (Beijing Institute of Technology)
- 15:30~15:45 7W9: A Study of Rumor Spreading with Epidemic Model Based on Network Topology
  Dawei Meng (Graduate School at Shenzhen, Tsinghua University), Lizhi Wan (Tsinghua University) and Lei Zhang (Graduate School at Shenzhen, Tsinghua University)
- 15:45~16:00 7W11: A scalable data analytics algorithm for mining frequent patterns from uncertain data
  Carson K. Leung (University of Manitoba)
- 16:00~16:15 7W12: Parallel Time Series Modeling - A Case Study of In-Database Big Data Analytics
  Hai Qian (Pivotal Inc.), Rahul Iyer (Pivotal Inc.), Xixuan Feng (Pivotal Inc.), Caleb Welton (Pivotal Inc.) and Shengwen Yang (Pivotal Inc.)
- 16:15~16:30 7W15: A Biclustering-based Classification Framework for Microarray Analysis
  Baljeet Malhotra (SAP Research & Innovation), Daniel Dahlmeier (SAP Research & Innovation) and Naveen Nandan (SAP Research & Innovation)
- 16:30~16:45 7W16: Parallel Implementation of a Density-based Stream Clustering Algorithm over a GPU Scheduling System
  Marwan Hassani (RWTH Aachen University), Ayman Tarakji (RWTH Aachen University), Lyubomir Georgiev (RWTH Aachen University) and Thomas Seidl (RWTH Aachen University)
- 16:45~17:00 7W5: Models for Distributed, Large Scale Data Cleaning [Video]
  Vincent Maccio (McMaster University), Fei Chiang (McMaster University) and Douglas Down (McMaster University)
- 17:00~17:15 7W10: GS4: Generating Synthetic Samples for Improving Semi-Supervised Nearest Neighbor Classification Accuracy [Video]
  Panagiotis Moutafis (University of Houston) and Ioannis Kakadiaris (University of Houston)