Given a large graph, which is the most important node? Can we plot and visualize the nodes in a low-dimensional space? A group of Twitter users forms a dense network: is it suspicious?
Such questions have attracted high interest in industry and academia alike. Time series have also attracted high interest: given a large collection of time series, such as web-click logs, electronic medical records, and motion-capture sensor data, how can we efficiently and effectively find typical patterns? What are the major tools for forecasting and outlier detection?
This course answers such questions and covers the fundamentals and the most successful tools: for graphs, milestone algorithms such as PageRank and HITS, recommendation systems, Belief Propagation, and tensors; for time series, ARIMA, Fourier/wavelets, and non-linear forecasting.
The objective of this tutorial is to provide a concise and intuitive overview of the most important tools that can help us find patterns in large-scale graphs, time-series, and time-evolving graphs. The emphasis of the tutorial is to provide the intuition behind these powerful tools, which is often lost in the technical literature, as well as to introduce case studies that illustrate their practical use.
3h, 4h, 6h (preferred), and 8h.
Yes, with respect to the case studies part.
In-Person or Remote
Researchers and practitioners working on anomaly detection, forecasting, social networks, and pattern discovery.
- Intuition behind the major tools in graph mining (PageRank, Belief Propagation, etc.)
- Intuition behind time series analysis tools (ARIMA forecasting, Fourier/wavelets/SVD, non-linear forecasting, chaotic time series)
- ‘Recipes’ on when (and when NOT) to use what tool
- Patterns and ‘laws’
- Node importance – PageRank, HITS
- Anomaly detection in graphs
- Community detection – graph partitioning
- Belief propagation
- Similarity functions (Euclidean, time-warping)
- Feature extraction – signal processing (Fourier, wavelets, SVD)
- Linear forecasting – ARIMA
- Non-linear forecasting – chaotic time series
- Tensors (Parafac decomposition)
- Case studies (phonecall network, social network)
Anomaly/Fraud case studies
- Financial/accounting fraud detection
- Computer network intrusion detection
- Bot detection in social networks
- Intuition behind SVD and ICA
- Visualization tools and explanations.
- Influence/virus propagation; viral marketing
BS in CS or Stat; familiarity with linear algebra (SVD, matrix inversion)
- Copies of the presentation slides will be provided.
- Optional: Graph Mining: Laws, Tools, and Case Studies, Chakrabarti and Faloutsos, Morgan & Claypool, 2012
To learn about our custom programs and any upcoming open enrollments, reach out to Michael Lisanti.