Overview
Given a large graph, which is the most important node? Can we plot and visualize the nodes in a low-dimensional space? If a group of Twitter users forms a dense network, is it suspicious?
Such questions have attracted high interest in industry and academia alike. Time series have also attracted high interest: Given a large collection of time series, such as web-click logs, electronic medical records, and motion-capture sensor data, how can we efficiently and effectively find typical patterns? What are the major tools for forecasting and outlier detection?
This course answers such questions and covers the fundamentals and the most successful tools. For graphs, these include milestone algorithms such as PageRank and HITS, recommendation systems, Belief Propagation, and tensors; for time series, they include ARIMA, Fourier/wavelets, and non-linear forecasting.
The objective of this tutorial is to provide a concise and intuitive overview of the most important tools that can help us find patterns in large-scale graphs, time series, and time-evolving graphs. The emphasis is on the intuition behind these powerful tools, which is often lost in the technical literature, and on case studies that illustrate their practical use.
Sample instructor(s)
Duration
3h, 4h, 6h (preferred), or 8h.
Customizable?
Yes, with respect to the case studies part.
In-person or remote
Remote
Intended audience
Researchers and practitioners working on anomaly detection, forecasting, social networks, and pattern discovery.
Takeaways
- Intuition behind the major tools in graph mining (PageRank, Belief Propagation, etc.)
- Intuition behind time series analysis tools (ARIMA forecasting, Fourier/wavelets/SVD, non-linear forecasting, chaotic time series)
- “Recipes” on when (and when NOT) to use what tool
Course topics
Graph Mining:
- Patterns and “laws”
- Node importance – PageRank, HITS
- Anomaly detection in graphs
- Community detection – graph partitioning
- Belief propagation
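To give a flavor of the node-importance topic, here is a minimal sketch of PageRank computed by power iteration. It is an illustration only, not the course's own material: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy graph are all assumptions chosen for the example.

```python
def pagerank(edges, num_nodes, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph given as (source, target) pairs."""
    # Build the out-link list of every node.
    out_links = [[] for _ in range(num_nodes)]
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from the uniform distribution.
    rank = [1.0 / num_nodes] * num_nodes
    for _ in range(iterations):
        # Every node first receives the uniform "teleport" share.
        new_rank = [(1.0 - damping) / num_nodes] * num_nodes
        for node, targets in enumerate(out_links):
            if targets:
                # Split this node's damped rank evenly among its out-links.
                share = damping * rank[node] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:
                # Dangling node (no out-links): spread its rank over all nodes.
                share = damping * rank[node] / num_nodes
                for i in range(num_nodes):
                    new_rank[i] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: nodes 0, 1, and 2 link to node 3; node 3 links back to node 0.
toy_edges = [(0, 3), (1, 3), (2, 3), (3, 0)]
scores = pagerank(toy_edges, num_nodes=4)
most_important = max(range(4), key=lambda n: scores[n])
```

In this toy graph, node 3 collects links from every other node and ends up with the highest score, which matches the intuition that a node is important when important nodes point to it.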
Time series:
- Similarity functions (Euclidean, time-warping)
- Feature extraction – signal processing (Fourier, wavelets, SVD)
- Linear forecasting – ARIMA
- Non-linear forecasting – chaotic time series
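As a small illustration of the similarity-functions topic, the sketch below contrasts Euclidean distance with a basic dynamic time-warping (DTW) distance. It is an example under stated assumptions rather than the course's own code: the function names, the absolute-difference local cost, and the two toy sequences are hypothetical choices, and the DTW variant shown is the plain unconstrained one.

```python
import math


def euclidean_distance(a, b):
    """Point-by-point distance; requires sequences of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Unconstrained dynamic time warping with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i points of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Two toy sequences with the same bump, one delayed by a single time step.
x = [0, 1, 2, 1, 0, 0]
y = [0, 0, 1, 2, 1, 0]
euclid = euclidean_distance(x, y)
warped = dtw_distance(x, y)
```

The Euclidean distance penalizes the one-step delay at every shifted point, while time warping can stretch the time axis to line the two bumps up, which is why the choice of similarity function depends on whether timing shifts matter for the application.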
Time-evolving graphs:
- Tensors (Parafac decomposition)
- Case studies (phone-call network, social network)
Anomaly/fraud case studies:
- Financial/accounting fraud detection
- Computer network intrusion detection
- Bot detection in social networks
Extras:
- Intuition behind SVD and ICA
- Visualization tools and explanations
- Influence/virus propagation; viral marketing
Prerequisites
BS in CS or Stat; familiarity with linear algebra (SVD, matrix inversion)
Materials
- Copies of the presentation slides will be provided.
- Optional: Graph Mining: Laws, Tools, and Case Studies, Chakrabarti and Faloutsos, Morgan & Claypool, 2012
Contact us
To learn about our custom programs and any upcoming open enrollments, reach out to Michael Lisanti.