Data Analysis with TDA Mapper
Spring 2017 Section 0001: 11:00A - 12:15P TTh 113 MLH
Instructor: Dr. Isabel K. Darcy,
Department of Mathematics and AMCS,
University of Iowa
Office: B1H MLH
Phone: 335-0778
Email: isabeldarcy AT uiowa.edu
Office hours: Tuesdays/Thursdays 8:50 - 9:15am, 12:30 - 1:35pm and by appointment.
TA: Maria Gommel

TDA Mapper was developed by Gurjeet Singh, Facundo Memoli and Gunnar Carlsson. The company Ayasdi is based on the Mapper algorithm. Both Python and R versions are freely available. The algorithm is very simple: bin the data into overlapping bins, cluster each bin, then create a graph whose vertices are the clusters, with an edge between two clusters whenever they have points in common.
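The three steps of Mapper (overlapping bins, clustering within each bin, graph on the clusters) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the Python Mapper or R implementation: the function names `overlapping_bins`, `cluster` and `mapper`, the parameters `n_bins`, `overlap` and `eps`, and the choice of single-linkage clustering with a distance threshold are all assumptions made for this example.

```python
from itertools import combinations
import math


def overlapping_bins(lo, hi, n_bins, overlap):
    """Cover [lo, hi] with n_bins equal intervals; neighbours share a
    fraction `overlap` (0 <= overlap < 1) of their length."""
    length = (hi - lo) / (n_bins - (n_bins - 1) * overlap)
    step = length * (1 - overlap)
    bins = [(lo + i * step, lo + i * step + length) for i in range(n_bins)]
    bins[-1] = (bins[-1][0], hi)  # guard against floating-point round-off
    return bins


def cluster(indices, points, eps):
    """Single-linkage clustering (an assumed stand-in for any clusterer):
    connected components of the graph joining points at distance <= eps."""
    clusters, unseen = [], set(indices)
    while unseen:
        stack = [unseen.pop()]
        component = set(stack)
        while stack:
            p = stack.pop()
            near = {q for q in unseen
                    if math.dist(points[p], points[q]) <= eps}
            unseen -= near
            component |= near
            stack.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper(points, filter_values, n_bins=4, overlap=0.5, eps=1.0):
    """Return (vertices, edges): each vertex is a frozenset of point indices,
    each edge is a pair of vertex positions whose clusters share a point."""
    bins = overlapping_bins(min(filter_values), max(filter_values),
                            n_bins, overlap)
    vertices = []
    for lo, hi in bins:
        # Step 1: points whose filter value falls in this bin.
        members = [i for i, f in enumerate(filter_values) if lo <= f <= hi]
        # Step 2: cluster the points of this bin.
        vertices.extend(cluster(members, points, eps))
    # Step 3: connect clusters that have points in common.
    edges = [(a, b) for a, b in combinations(range(len(vertices)), 2)
             if vertices[a] & vertices[b]]
    return vertices, edges


# Toy data: five points on a line plus one point far above the middle,
# using the x-coordinate as the filter function.
demo_points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (2, 5)]
demo_filter = [x for x, _ in demo_points]
demo_vertices, demo_edges = mapper(demo_points, demo_filter,
                                   n_bins=2, overlap=0.5, eps=1.5)
print(demo_vertices, demo_edges)
```

Because the bins overlap, a point can land in two bins and therefore in two clusters; those shared points are what produce the edges. In practice the bin count, the overlap and the clustering method are all choices that change the resulting graph.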
TENTATIVE CLASS SCHEDULE. ALL DATES SUBJECT TO CHANGE (click on date/section for pdf file of corresponding class material):
Tentative Schedule  HW/Announcements  
Week 1  

1/17  Professor Gunnar Carlsson Introduces Topological Data Analysis, Mapper slides, worksheet 
Icon Quiz 1 (Due 1/19 at 7:00 AM) over Voronoi (6:06 min) and k-means (9:10 min)
HW 1 (Due 1/19) 10 points  
1/19  Meet in B5 MLH: Lab 1 files , slides  
FYI: scikit-learn clustering  
Week 2  
1/24  More TDA mapper 
HW 2 (Due 1/24)  5 points: Start writing code to create your own TDA mapper. Note: you only need to outline the algorithm using comments.
Project HW 1 (Due 1/26)  5 points
HW 3 (due Thursday 1/26, 10 points):
Project HW 2 (Due 1/26)  10 points
 
1/26  Meet in B5 MLH: Lab 2  
FYI:  
Week 3  
1/31  Mapper Examples 
Icon Quiz 2 (Due 2/2 at 7am) over TDA Mapper videos: Intro, slides; Examples, slides; Summary, slides
Project HW 3 (Due 2/2)  10 points  
2/2  Mapper applied to cancer data  
Week 4  
2/7 
Project HW 4 (Due 2/7)  10 points Slides describing a clustering method or poster introducing the TDA mapper algorithm.  
2/9  Minipresentations on clustering or TDA mapper (50 points)  
Week 5  
2/14  Lab files 
Project HW (Due 2/17) Intro draft including 2, 8, 10  
2/16  Github (optional)  
Week 6  
2/21  slides, Ayasdi resources, Patient Stratification, Iris data set, databasics.r, flaresTransformed.r  Icon Quiz 3 (10 points; Due 2/23 at 7:00 AM) over first 20 minutes of Applications of TDA to the Understanding of Disease and Drug Discovery, Pek Lum, Ayasdi  
2/23  slides, Identification of type 2 diabetes subgroups through topological analysis of patient similarity, Science Translational Medicine, 2015, Precision Medicine Using Topological Data Analysis,  
Week 7  
2/28  slides, 
Project HW 5 (Due 2/28)  10 points: Analyze dataset as described here
Icon Quiz 4 (10 points; Due 2/28 at 7:00 AM) over first 40 minutes of Applications of TDA to the Understanding of Disease and Drug Discovery, Pek Lum, Ayasdi
Icon Quiz 5 (10 points; Due 3/2 at 7:00 AM) over Applications of TDA to the Understanding of Disease and Drug Discovery, Pek Lum, Ayasdi
Icon Quiz 6 Review (50 points; Due 3/2 at 7:00 AM)
Project HW 6 (Due 3/4)  20 points:
 
3/2  slides  
Week 8  
3/7  Midterm (100 points)  Project HW (Due Sunday 3/5) Intro draft including 2, 3 8, 10. Focus on description of TDA mapper algorithm.  
3/9  slides  
Week 9  
3/21  Exploring data with topological tools, Marinka Zitnik, XRDS: Crossroads, The ACM Magazine for Students, 2014, slides 
Icon Quiz 7 Reading (10 points; Due 3/21 at 7:00 AM) Icon Quiz 8 Reading (10 points; Due 3/23 at 7:00 AM)  
3/23  A Topological Data Analysis Approach to Visualizing Ebola Tweets, Herchel Thaddeus Machacon, 2016, slides  
Week 10  
3/28  KS statistics, ks.r  Project HW (Due Monday, 3/27) Polished draft including 2, 3 4, 7  10.
 
3/30  LAB: meet in B5 MLH (basement computer lab)  
Week 11  
4/4  slides Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition. G. Singh, F. Memoli and G. Carlsson. Symposium on Point Based Graphics 2007, Prague, September 2007 
Project HW (due 4/5) Draft of your project which should be at least 50% done. Icon Quiz 9 Reading (10 points; Due 4/6 at 7:00 AM)
 
4/6  Simplicial complex and Topology  
Week 12  
4/11  Multiresolution/multiscale 
Icon Quiz 10 Reading (10 points; Due 4/11 at 7:00 AM) Icon Quiz 11 Reading (10 points; Due 4/13 at 7:00 AM)
 
4/13  Minipresentations  
Week 13  
4/18  Clustering and Multidimensional Mapper  Project HW (due 4/17) Draft of your project which should be at least 80% done. Icon Quiz 12 Reading (10 points; Due 4/20 at 7:00 AM)
 
4/20  Mapper on 3D Shape Database  
Week 14  
4/25  meet in computer lab B5 MLH 
Finished Project due 4/26 Project slides due 4/30  
4/27  Guest Lecturer Wako Bungulo: Grad School/Mapper PCA , Euler characteristic  
Week 15  
5/2  Student presentations (200 pts) Big Data 
HW 9 (due 5/2)
Summarize May 2nd presentations (0 points)
HW 10 (due 5/4) Summarize May 4th presentations (0 points)
Project HW (due 5/4) Outside presentations, etc (300 pts) 

5/4 
Student presentations (200 pts) Cleaning Data 
Topology and data, G. Carlsson (2009) (Mapper: p. 281-289)
Topological methods for exploring low-density states in biomolecular folding pathways (2009)
Nov 22 video, pptx, pdf
An eQTL biological data visualization challenge and approaches from the visualization community (2011)
Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival M. Nicolau, A. J. Levine, G. Carlsson (2011) video, pptx, pdf
Extracting insights from the shape of complex data using topology P. Y. Lum, G. Singh, A. Lehman, T. Ishkanov, M. Vejdemo-Johansson, M. Alagappan, J. Carlsson, G. Carlsson (2013) video, pptx, pdf
Download Mapper for Matlab
Python Mapper
Graphviz
Web tool: Progression Analysis of Disease - PAD (includes Mapper)
Ayasdi Iris, academic trial
DNA MICROARRAY VIRTUAL LAB, youtube video
How to Analyze DNA Microarray Data, Howard Hughes Medical Institute
Pearson Product-Moment Correlation
Nov 20 video, pptx, pdf
Intro to RNA & Topological Landscapes for Visualization of Scalar-Valued Functions.
Generating and exploring a collection of topological landscapes for visualization of scalar-valued functions, by W. Harvey and Y. Wang, Comput. Graphics Forum (Special issue from EuroVis) 2010
Topological data analysis of Escherichia coli O157:H7 and non-O157 survival in soils (Sept 2014)
Additional readings
Mathworks Matlab Tutorials
Kaggle data analysis competitions
Data for MATLAB hackers (pre-2010)
http://yann.lecun.com/exdb/mnist/
Using the MNIST Dataset
##############################################################
YOU CAN IGNORE EVERYTHING BELOW THIS LINE.
Persistent Topology and Metastable State in Conformational Dynamics
Week 11  

4/4  Discuss Preparatory Lecture 1:
The Euler characteristic (20:32 min) Optional FYI: Mobius band, Klein bottle Jeff Weeks: "Shape of Space" book, 2013 video (60 min), software, games The Geometry Center Shape of Space Video (11 min) 
Project HW (due 4/5) Draft of your project which should be at least 50% done. Icon Quiz 9 Reading (10 points; Due 4/4 at 7:00 AM) Icon Quiz 10 Reading (10 points; Due 4/6 at 7:00 AM)
Icon AT Quiz 1 (Due 4/4 at 7:00 AM) over Preparatory Lecture 1
Icon AT Quiz 2 (Due 4/6 at 7:00 AM) over Preparatory Lecture 2

4/6  Discuss Preparatory Lecture 2:
Addition and Free Abelian Groups (17:53 min),
Worksheet 1,
answers Intro to Data Analysis slides  
Week 12  
4/11  Discuss Preparatory Lecture 3:
Modular Arithmetic (9:40 min)
Worksheet 2,
Installing R and Rstudio, tips, pptx Start discussing On the Local Behavior of Spaces of Natural Images, Gunnar Carlsson, Tigran Ishkhanov, Vin de Silva, Afra Zomorodian, International Journal of Computer Vision January 2008, Volume 76, Issue 1, pp 1-12. slides 
Icon Quiz 14 Reading (10 points; Due 4/11 at 7:00 AM) Icon Quiz 15 Reading (10 points; Due 4/13 at 7:00 AM)
Icon AT Quiz 3 (Due 4/11 at 7:00 AM) over Preparatory Lecture 3
Icon AT Quiz 4 (Due 4/13 at 7:00 AM) over Preparatory Lecture 4

4/13  Discuss Preparatory Lecture 4:
Addition and Free Vector Spaces (21:26 min) Worksheet 3, Continue image analysis discussion  
Week 13  
4/18 
Discuss Preparatory Lecture 5:
Triangulations and Simplicial Complexes (28:19 min) Worksheet 4, Continue image analysis discussion 
Project HW (due 4/17) Draft of your project which should be at least 80% done. Icon Quiz 16 Reading (10 points; Due 4/20 at 7:00 AM)
Icon AT Quiz 5 (Due 4/18 at 7:00 AM) over Preparatory Lecture 5
Icon AT Quiz 6 (Due 4/20 at 7:00 AM) over Preparatory Lecture 6

4/20  Discuss Preparatory Lecture 6:
Creating a Simplicial Complex from Data (28:15 min), Equivalence relations and partitions, worksheet 5  
Week 14  
4/25  Homology example, Create Your Own Homology 
Finished Project due 4/26 Project slides due 4/30 
4/27  Barcodes (pptx), Persistence Diagrams  
Week 15  
5/2  Student presentations (200 pts) 
HW 9 (due 5/2)
Summarize May 2nd presentations HW 10 (due 5/4) Summarize May 4th presentations Project HW (due 5/4) Outside presentations, etc (300 pts) 
5/4  Student presentations (200 pts) 