Overview

Given a large graph, which is the most important node? Can we plot and visualize the nodes in a low-dimensional space? If a group of Twitter users forms a dense network, is it suspicious?

Such questions have attracted strong interest in industry and academia alike. So have time series: given a large collection of time series, such as web-click logs, electronic medical records, and motion-capture sensor data, how can we efficiently and effectively find typical patterns? What are the major tools for forecasting and outlier detection?

This course answers such questions and covers the fundamentals and the most successful tools, including milestone algorithms such as PageRank, HITS, recommendation systems, Belief Propagation, and tensors for graphs, and ARIMA, Fourier/wavelets, and non-linear forecasting for time series.

The objective of this tutorial is to provide a concise and intuitive overview of the most important tools that can help us find patterns in large-scale graphs, time series, and time-evolving graphs. The emphasis is on the intuition behind these powerful tools, which is often lost in the technical literature, and on case studies that illustrate their practical use.

Sample instructor(s)

Duration

3h, 4h, 6h (preferred), or 8h.

Customizable?

Yes, with respect to the case studies part.

In-person or remote

Remote

Intended audience

Researchers and practitioners working on anomaly detection, forecasting, social networks, and pattern discovery.

Takeaways

  • Intuition behind the major tools in graph mining (PageRank, Belief Propagation, etc.)
  • Intuition behind time series analysis tools (ARIMA forecasting, Fourier/wavelets/SVD, non-linear forecasting, chaotic time series)
  • “Recipes” on when (and when NOT) to use what tool

Course topics

Graph Mining:

  • Patterns and “laws”
  • Node importance – PageRank, HITS
  • Anomaly detection in graphs
  • Community detection – graph partitioning
  • Belief propagation

Time series:

  • Similarity functions (Euclidean, time-warping)
  • Feature extraction – signal processing (Fourier, wavelets, SVD)
  • Linear forecasting – ARIMA
  • Non-linear forecasting – chaotic time series

Time-evolving graphs:

  • Tensors (PARAFAC decomposition)
  • Case studies (phone-call network, social network)

Anomaly/fraud case studies:

  • Financial/accounting fraud detection
  • Computer network intrusion detection
  • Bot detection in social networks

Extras:

  • Intuition behind SVD and ICA
  • Visualization tools and explanations
  • Influence/virus propagation; viral marketing

Prerequisites

BS in computer science or statistics; familiarity with linear algebra (SVD, matrix inversion).

Materials

Contact us

To learn about our custom programs and any upcoming open enrollments, reach out to Michael Lisanti.