Course Notes:

  1.   STOR881-08-22-2017:   Organizational Matters, OODA Book, What is OODA?,  Taste of OODA Examples, Visualization, Scatterplot Matrix Views, Principal Component Analysis (PCA)
  2.   STOR881-08-24-2017:   OODA Basics, Data Object Determination & Representation, Object & Descriptor Spaces, 2-d toy Example, Curves as Data 10-d & 50-d, RNAseq Data, Revisit Mortality Data
  3.   STOR881-08-29-2017:   Correlation PCA, Limitations of PCA, NCI60 data, Marginal Distribution Plots, Start Drug Discovery Data
  4.   STOR881-08-31-2017:   Continue Drug Discovery Data, Marron’s Matlab Software, DiProPerm Hypothesis Test
  5.   STOR881-09-05-2017:   Melanoma Data, Transformations,  Revisit Drug Discovery Data,  Yeast Cell Cycle Data & Fourier Subspace
  6.   STOR881-09-07-2017:   Review of Linear Algebra & Multivariate Probability, PCA as Optimization, Redistribution of Energy
  7.   STOR881-09-12-2017:   Different Views of PCA, Data Representation –  Simulation – Visualization, Dual PCA & Mortality Data, Cornea Data & Robustness
  8.   STOR881-09-14-2017:   Cornea Data, Robustness: Center & PCA, Spherical PCA, Elliptical PCA
  9.   STOR881-09-19-2017:   GWAS Analysis, Classification: Fisher Linear Discrimination, Gaussian Likelihood Ratio, Mean Difference
  10.   STOR881-09-21-2017-part1, STOR881-09-21-2017-part2, STOR881-09-21-2017-part3:   HDLSS Discrimination, Maximal Data Piling
  11.   STOR881-09-26-2017:    Kernel Embedding, Support Vector Machine, Distance Weighted Discrimination, Faces Data
  12.   STOR881-09-28-2017:    DWD Simulations, Batch Adjustment, HDLSS Asymptotics – Jonathan Williams {Bayesian HMM}, Ruibin Ma {Generalized Cylindrical Surface Deformation}
  13.   STOR881-10-03-2017:    Why DWD for Batch Adjustment, HDLSS Asymptotics – Yunxiao Liu {Integrated Volatility Functionals}
  14.   STOR881-10-05-2017:     HDLSS Asymptotics – Jack Prothero {Image Textures}
  15.   STOR881-10-10-2017:    Meilei Jiang {Angle Based Joint & Individual Variation Explained}
  16.   STOR881-10-12-2017:    University Day – No Class
  17.   STOR881-10-17-2017:    Radial DWD, Melanoma Data & ROC curves, Introduction to Clustering – Zhenlin Xu {Introduction to 3D deep learning}, Dylan Glotzer {Extreme Ship Motions}
  18.   STOR881-10-19-2017:    Fall Break
  19.   STOR893-10-24-2017:    Statistical Smoothing – Brendan Brown, Chen Shen, Wesley Hamilton {Topological Data Analysis}
  20.   STOR881-10-26-2017:    SiZer for Inference and Analysis of Mass Flux & Cell Cycle Data, Clustering, K-means, SWISS – Duyeol Lee {PCA in Credit Risk Modelling}
  21.   STOR881-10-31-2017:    Hierarchical Clustering, SigClust, QQ Plots, QQ Envelope – Kevin Donovan {Non-parametric inference for immune response thresholds of risk in vaccine studies}, Matt Jansen {Text Mining}
  22.   STOR881-11-02-2017:    SigClust, Shapes as Data Objects – Aniish Sridhar {Analytics Competition}, Aditya Balaram {Single Pass PCA}
  23.   STOR881-11-07-2017:    Landmark Based Shape, Equivalence Relations, Quotient Spaces, Shape Representations, Male Pelvis Data & S-Reps – Gang Li {Boosting Methods}, Peiyao Wang {Sparse gradient learning}, Michael Conroy {Regularized PCA}
  24.   STOR881-11-09-2017:     Manifold Data Analysis, Principal Nested Spheres, Backwards PCA – Mark He {Commuting networks amongst US counties}, Adam Waterbury {Reproducing Kernels for FDA}
  25.   STOR881-11-14-2017:    Backwards PCA, Nonnegative Matrix Factorization – Aman Barot {Introduction to Deep learning}, Pooja Saha {LASSO regression}, Yue Jiang {CART}
  26.   STOR881-11-16-2017:    Nested Constraints, Principal Nested Submanifolds – Shengjie Chai {Cancer Metastasis}, Di Qin {Kernel PCA}, Yaoyu Chen {Introduction to Generative Adversarial Networks}
  27.   STOR881-11-21-2017:    Curve Registration, Fisher Rao Approach – Xi Yang {Multi-View Weighted Network}, Hang Yu {Introduction to multiple kernel learning}, Zhipeng Ding {Fast Predictive Simple Geodesic Regression}
  28.   STOR881-11-23-2017:    Thanksgiving
  29.   STOR881-11-28-2017:    Curve Registration, TIC Data, PNS Approach, Juggling Data – Yumeng Wang {Efficacy Analysis}, Jiawei Xu {Childbirth and breast cancer risk}
  30.   STOR881-11-30-2017:    Probability Distributions as Data Objects, Random Matrix Theory – Zhengling Qi {Classification in personalized medicine}, Zhiyuan Liu {CPNS Visualization in Pablo}, Fuhui Fang {DiProPerm Analysis of OsteoArthritis Data}
  31.   STOR881-12-05-2017-part1, STOR881-12-05-2017-part2:    Tree Structured Data Objects

Software:

Link to Marron’s Matlab Software (.zip file, expand to 4 directories, and put those in Matlab Path)

LungCancer2011.m for Analysis of 2011 RNAseq Lung Cancer Data (you need to remove suffix “.txt” from file name)

counts, for 2011 RNAseq Lung Cancer Data

exonsMarron, for 2011 RNAseq Lung Cancer Data

Single .zip file with above 3, plus generated graphics

References:

Ahn, J. (2006) High dimension, low sample size data analysis. PhD Dissertation, University of North Carolina, Chapel Hill (cited 9/26/17)

Ahn, J., Marron, J. S., Muller, K. M., & Chi, Y. Y. (2007) The high-dimension, low-sample-size geometric representation holds under mild conditions. Biometrika, 94(3), 760-766 (cited 10/5/17)

Ahn, J., & Marron, J. S. (2010) The maximal data piling direction for discrimination. Biometrika, 97(1), 254-259 (cited 9/21/17)

Aizerman, A., Braverman, E. M., & Rozoner, L. I. (1964) Theoretical foundations of the potential function method in pattern recognition learning. Automation and remote control, 25, 821-837 (cited 9/21/17)

Alter, O., Brown, P. O., & Botstein, D. (2000) Singular value decomposition for genome-wide expression data processing and modeling. Proceedings of the National Academy of Sciences, 97, 10101-10106 (cited 9/28/17)

Anderson, T. W., & Darling, D. A. (1952) Asymptotic theory of certain “goodness of fit” criteria based on stochastic processes. The Annals of Mathematical Statistics, 193-212  (cited 9/05/17)

Aydin, B., Pataki, G., Wang, H., Bullitt, E., & Marron, J. S. (2009). A principal component analysis for trees. The Annals of Applied Statistics, 1597-1615  (cited 12/05/17)

DeLong, E. R., DeLong, D. M., & Clarke-Pearson, D. L. (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics, 44, 837-845 (cited 10/5/17)

Bai, Z. D., & Saranadasa, H. (1996) Effect of high dimension: by an example of a two sample problem. Statistica Sinica, 6(2), 311-329 (cited 8/31/17)

Bendich, P., Marron, J. S., Miller, E., Pieloch, A., & Skwerer, S. (2016). Persistent homology analysis of brain artery trees. The Annals of Applied Statistics, 10(1), 198  (cited 12/05/17)

Benito, M., Parker, J., Du, Q., Wu, J., Xiang, D., Perou, C. M., & Marron, J. S. (2004) Adjustment of systematic microarray data biases. Bioinformatics, 20(1), 105-114 (cited 9/28/17)

Benito, M., García‐Portugués, E., Marron, J. S., & Peña, D. (2017). Distance‐weighted discrimination of face images for gender classification. Stat, 6(1), 231-240 (cited 9/26/17)

Bickel, P. J. and Levina, E. (2004) Some theory for Fisher’s Linear Discriminant function, “naive Bayes”, and some alternatives when there are many more variables than observations, Bernoulli, 10, 989-1010 (cited 9/21/17)

Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.  (cited 9/26/19)

Bloomfield, P. (2004) Fourier analysis of time series: an introduction. John Wiley & Sons (cited 9/05/17)

Bookstein, F. L. (1991). Morphometric Tools for Landmark Data, Cambridge: Cambridge University Press (cited 11/2/17)

Born, M. and Wolf, E. (1980) Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light, Pergamon Press, New York (cited 9/12/17)

Borysov, P., Hannig, J., Marron, J. S., Muratov, E., Fourches, D., & Tropsha, A. (2016). Activity prediction and identification of mis‐annotated chemical compounds using extreme descriptors. Journal of Chemometrics, 30(3), 99-108 (cited 8/31/17)

Boser, B. E., Guyon, I. and Vapnik, V. (1992) A Training Algorithm for Optimal Margin Classifiers, in Fifth Annual Workshop on Computational Learning Theory, ACM (cited 9/26/17)

Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 211-252 (cited 9/5/17)

Bradley, R. C. (2005). Basic properties of strong mixing conditions. A survey and some open questions. Probab. Surv. 2 107–144 (electronic). (Update of, and a supplement to, the 1986 original.)  (cited 10/5/17)

Brillinger, D. R. (2001). Time series: data analysis and theory (Vol. 36). SIAM (cited 9/05/17)

Brooks, J. P., Dulá, J. H., & Boone, E. L. (2013). A pure L1-norm principal component analysis. Computational statistics & data analysis, 61, 83-98 (cited 9/14/17)

Bubenik, P. (2015). Statistical topological data analysis using persistence landscapes. The Journal of Machine Learning Research, 16(1), 77-102  (cited 12/05/17)

Aylward, S. R., & Bullitt, E. (2002). Initialization, noise, singularities, and scale in height ridge traversal for tubular object centerline extraction. IEEE transactions on medical imaging, 21(2), 61-75  (cited 12/05/17)

Burges, C. J. C. (1998) A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 2, 121-167 (cited 9/26/17)

Cabanski, C. R., Qi, Y., Yin, X., Bair, E., Hayward, M. C., Fan, C., Li, J., Wilkerson, M. D., Marron, J. S., Perou, C. M. and Hayes, D. N. (2010) SWISS MADE: Standardized WithIn Class Sum of Squares to Evaluate Methodologies and Dataset Elements, PLoS ONE, 5(3): e9905. doi:10.1371/journal.pone.0009905, PMCID: PMC2845619.   (cited 10/26/17, 10/31/17)

Cai, T., Liu, W., & Xia, Y. (2014). Two‐sample test of high dimensional means under dependence. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(2), 349-372 (cited 8/31/17)

Carmichael, I., & Marron, J. S. (2017). Geometric Insights into Support Vector Machine Behavior using the KKT Conditions. arXiv preprint arXiv:1704.00767 (cited 9/26/17, 10/3/17)

Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1(2), 245-276 (cited 9/7/17)

Chaudhuri, P. and Marron, J. S. (1999) SiZer for exploration of structure in curves, Journal of the American Statistical Association, 94, 807-823 (cited 10/26/17)

Chaudhuri, P., & Marron, J. S. (2000). Scale space view of curve estimation. Annals of Statistics, 408-428 (cited 10/26/17)

Chen, S. X., & Qin, Y. L. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. The Annals of Statistics, 808-835  (cited 8/31/17)

Cootes, T. F., Hill, A., Taylor, C. J. and Haslam, J. (1993) The use of active shape models for locating structures in medical images, Information in Medical Imaging, H. H. Barret and A. F. Gmitro, eds. Lecture Notes in Computer Science 687, 33-47, Springer Verlag, Berlin (cited 11/7/17)

CRAN-DWD (2014). https://cran.r-project.org/package=DWD (cited 9/26/17)

Cristianini, N. and Shawe-Taylor, J. (2000) An Introduction to Support Vector Machines, Cambridge University Press (cited 9/26/17)

Dai, W., & Genton, M. G. (2016). Directional outlyingness for multivariate functional data. arXiv preprint arXiv:1612.04615 (cited 9/14/17)

Dai, W., & Genton, M. G. (2017). Multivariate Functional Data Visualization and Outlier Detection. arXiv preprint arXiv:1703.06419. (cited 9/14/17)

Damon, J., & Marron, J. S. (2014). Backwards principal component analysis and principal nested relations. Journal of Mathematical Imaging and Vision, 50(1-2), 107-114 (cited 11/16/17)

DeLong, E. R., DeLong, D. M., & Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics, 44, 837-845 (cited 10/17/17)

Dobriban, E. (2015). Efficient computation of limit spectra of sample covariance matrices. Random Matrices: Theory and Applications, 4(04), 1550019 (cited 11/30/17)

Domingos, P. & Pazzani, M. (1997) On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29, 103–137 (cited 9/19/17)

Dryden, I.L., Mardia, K.V. (1998) Statistical Shape Analysis, Wiley, Chichester (cited 11/2/17)

Duda, R. O. and Hart P. E. (1973) Pattern Classification and Scene Analysis, Wiley, New York (cited 9/19/17)

Duda, R. O., Hart P. E. and Stork, D. G. (2001) Pattern Classification, Wiley, New York (cited 9/19/17)

El Karoui, N. (2010). The spectrum of kernel random matrices. The Annals of Statistics, 38(1), 1-50 (cited 10/5/17)

Eltzner, B., Jung, S., & Huckemann, S. (2015). Dimension reduction on polyspheres with application to skeletal representations. In International Conference on Networked Geometric Science of Information (pp. 22-29). Springer, Cham. (cited 11/14/17)

Fan, J., & Gijbels, I. (1996). Local Polynomial Modelling and Its Applications, Chapman and Hall, London (cited 10/17/17)

Fasy, B. T., Lecci, F., Rinaldo, A., Wasserman, L., Balakrishnan, S., & Singh, A. (2014). Confidence sets for persistence diagrams. The Annals of Statistics, 42(6), 2301-2339  (cited 12/05/17)

Feng, Q., Hannig, J., & Marron, J. S. (2016). A note on automatic data transformation. Stat, 5(1), 82-87  (cited 8/31/2017)

Fisher, R.A. (1936) The Use of Multiple Measurements in Taxonomic Problems, Annals of Eugenics, 7, 179-188  (cited 9/19/17)

Fisher, N. I. (1983) Graphical Methods in Nonparametric Statistics: A Review and Annotated Bibliography, International Statistical Review, 51, 25-58  (cited 10/31/17)

Fletcher, P. T. (2004) Statistical variability in nonlinear spaces: Application to shape analysis and DT-MRI, University of North Carolina at Chapel Hill  (cited 11/9/17)

Fréchet, M. (1948) Les éléments aléatoires de nature quelconque dans un espace distancié, Annales de l’institut Henri Poincaré, 10, 215-310 (cited 11/9/17)

Gabriel, K. R. (1971) The biplot display of matrices with application to principal component analysis, Biometrika, 58, 467  (cited 9/12/17)

Gersho, A. and Gray, R. M. (1991) Vector Quantization and Signal Compression, Springer, New York  (cited 10/26/17)

Godtliebsen, F., Marron, J. S., & Chaudhuri, P. (2002). Significance in scale space for bivariate density estimation. Journal of Computational and Graphical Statistics, 11(1), 1-21 (cited 10/26/17)

Godtliebsen, F., Marron, J. S., & Chaudhuri, P. (2004). Statistical significance of features in digital images. Image and Vision Computing, 22(13), 1093-1104 (cited 10/26/17)

Godtliebsen, F., Marron, J. S., & Pizer, S. M. (2002). Significance in scale-space for clustering. Spatial clustering modeling. Chapman and Hall/CRC, 24-36 (cited 10/26/17)

Good, I. J., & Gaskins, R. A. (1980). Density estimation and bump-hunting by the penalized likelihood method exemplified by scattering and meteorite data. Journal of the American Statistical Association, 75(369), 42-56 (cited 10/17/17)

Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics, Wiley (cited 10/17/17)

Haldane, J. B. S. (1948) Note on the median of a multivariate distribution, Biometrika, 35, 414-415 (cited 9/14/17)

Hall, P., Marron, J. S., & Neeman, A. (2005). Geometric representation of high dimension, low sample size data. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(3), 427-444. (cited 10/3/17, 10/5/17)

Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J. and Stahel, W. A. (2011) Robust Statistics: the Approach Based on Influence Functions, Wiley, New York (cited 9/12/17)

Hannig, J., & Marron, J. S. (2006). Advanced distribution theory for SiZer. Journal of the American Statistical Association, 101(474), 484-499 (cited 10/26/17)

Hannig, J., Marron, J. S., & Riedi, R. (2001). Zooming statistics: Inference across scales. Journal of the Korean Statistical Society, 30(2), 327-345 (cited 10/26/17)

Hartigan, J. A. (1975) Clustering Algorithms, Wiley, New York  (cited 10/26/17)

Hastie, T., & Stuetzle, W. (1989). Principal curves. Journal of the American Statistical Association, 84(406), 502-516 (cited 11/14/17)

Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning. New York, NY: Springer, 115-163 (cited 9/26/17)

Hotelling, H. (1933) Analysis of a Complex of Statistical Variables Into Principal Components. Journal of Educational Psychology, 24, 417-441 (cited 8/22/17)

Hsu, C.-W. and Lin, C.-J. (2002) A comparison of methods for multiclass support vector machines, IEEE Transactions on Neural Networks, 13, 415-425 (cited 9/26/17)

Huang, H., Liu, Y., Yuan, M. and Marron J.S. (2014) Statistical Significance of Clustering Using Soft Thresholding, Journal of Computational and Graphical Statistics, DOI:10.1080/10618600.2014.948179 (cited 10/31/2017)

Huber, P. (2011) Robust Statistics. Wiley, New York (cited 9/12/17)

Huckemann, S., Hotz, T., & Munk, A. (2010). Intrinsic shape analysis: Geodesic PCA for Riemannian manifolds modulo isometric lie group actions. Statistica Sinica, 1-58 (cited 11/9/17)

Inselberg, A. (1985) The Plane with Parallel Coordinates, Visual Computer 1: 69–91 (cited 8/24/17, 9/14/17)

Inselberg, A. (2009) Parallel Coordinates: VISUAL Multidimensional Geometry and its Applications. Springer, New York (cited 8/24/17, 9/14/17)

Izenman, A. J., & Sommer, C. J. (1988). Philatelic mixtures and multimodal densities. Journal of the American Statistical Association, 83(404), 941-953 (cited 10/17/17)

Jeong, J.-Y. (2009) Estimation of Probability Distributions on Multiple Anatomical Objects and Evaluation of Statistical Shape Models, Ph.D. Thesis, Department of Computer Science, University of North Carolina (cited 11/7/17)

John, S. (1972) The distribution of a statistic used for testing sphericity of normal distributions. Biometrika, 59(1), 169-173 (cited 10/3/17)

Jolliffe, I. T. (2002) Principal Component Analysis, Springer, New York, 2nd Edition, ISBN 978-0-387-95442-4   (cited 9/05/17)

Johnstone, I. M. (2008). Multivariate analysis and Jacobi ensembles: Largest eigenvalue, Tracy–Widom limits and rates of convergence. Annals of statistics, 36(6), 2638 (cited 11/30/17)

Johnstone, I. M., & Lu, A. Y. (2009). On consistency and sparsity for principal components analysis in high dimensions. Journal of the American Statistical Association, 104(486), 682-693 (cited 10/5/17)

Jones, M. C., Marron, J. S., & Sheather, S. J. (1996). A brief survey of bandwidth selection for density estimation. Journal of the American Statistical Association, 91(433), 401-407 (cited 10/24/17)

Jung, S., & Marron, J. S. (2009). PCA consistency in high dimension, low sample size context. The Annals of Statistics, 37(6B), 4104-4130 (cited 10/5/17)

Jung, S., Liu, X., Marron, J. S., & Pizer, S. M. (2010). Generalized PCA via the backward stepwise approach in image analysis. In Brain, Body and Machine (pp. 111-123). Springer Berlin Heidelberg (cited 11/14/17)

Jung, S., Foskey, M., & Marron, J. S. (2011). Principal arc analysis on direct product manifolds. The Annals of Applied Statistics, 578-603 (cited 11/9/17)

Jung, S., Dryden I. L., & Marron, J. S., (2012) Analysis of Principal Nested Spheres, Biometrika, doi: 10.1093/biomet/ass022 (cited 11/9/17)

Jung, S., Sen, A. and Marron, J. S. (2012), Boundary behavior in high dimension, low sample size asymptotics of PCA, The Journal of Multivariate Analysis, 109, 190–203  (cited 10/5/17)

Kaufman, L. and Rousseeuw, P. J. (2005) Finding Groups in Data: An Introduction to Cluster Analysis, Wiley, New York  (cited 10/26/2017)

Kelemen, A., Székely, G. and Gerig, G. (1997 & 1999) Three dimensional model-based segmentation, TR-178 Technical Report Image Science Lab, ETH Zurich & Elastic model-based segmentation of 3-D neuroradiological data sets, IEEE Transactions on Medical Imaging, 18, 828-839 (cited 11/7/17)

Kendall, D.G., Barden, D., Carne, T.K. and Le, H. (1999) Shape and Shape Theory, Wiley, Chichester (cited 11/2/17)

Kimes, P. K., Cabanski, C. R., Wilkerson, M. D., Zhao, N., Johnson, A. R., Perou, C. M., Makowski, L., Marron, J. S., Hayes, D. N. (2014) SigFuge: single gene clustering of RNA-seq reveals differential isoform usage among cancer samples, Nucleic Acids Research (2014): gku521 (cited 8/24/17)

Klein, R. J., Zeiss, C., Chew, E. Y., Tsai, J. Y., Sackler, R. S., Haynes, C., … & Bracken, M. B. (2005). Complement factor H polymorphism in age-related macular degeneration. Science, 308, 385-389 (cited 10/5/17)

Koch, I., Hoffmann, P., & Marron, J. S. (2014). Proteomics profiles from mass spectrometry. Electronic Journal of Statistics, 8(2), 1703-1713 (cited 11/28/17)

Kruskal, J. B. (1964). Nonmetric multidimensional scaling: a numerical method. Psychometrika, 29(2), 115-129 (cited 9/7/17, 9/12/17)

LeBlanc, M., & Tibshirani, R. (1996). Combining estimates in regression and classification. Journal of the American Statistical Association, 91(436), 1641-1650 (cited 11/14/17)

Lee, D. D., & Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755), 788-791 (cited 11/14/17)

Lee, Y., Lin, Y. and Wahba, G. (2004) Multicategory Support Vector Machines, Theory, and Application to the Classification of Microarray Data and Satellite Radiance Data, Journal of the American Statistical Association, 99, 67-81 (cited 9/26/17)

Li, G. and Chen, Z. (1985) Projection pursuit approach to robust dispersion matrices and principal components: primary theory and Monte Carlo, Journal of the American Statistical Association, 80, 759-776 (cited 9/14/17)

Lindeberg, T. (1994) Scale Space Theory in Computer Vision, Kluwer (cited 10/26/17)

Liu, X. (2007). New statistical tools for microarray data and comparison with existing tools. The University of North Carolina at Chapel Hill (cited 10/5/17)

Liu, Y., Hayes, D. N., Nobel, A. and Marron, J. S. (2008) Statistical Significance of Clustering for High Dimension Low Sample Size Data, Journal of the American Statistical Association, 103, 1281-1293  (cited 10/31/17)

Liu, X., Parker, J., Fan, C., Perou, C. M., & Marron, J. S. (2009). Visualization of cross-platform microarray normalization. Batch Effects and Noise in Microarray Experiments: Sources and Solutions. Wiley, New York, 167-181 (cited 8/29/17, 10/3/17)

Locantore, N., Marron, J. S., Simpson, D. G., Tripoli, N., Zhang, J. T., Cohen, K. L., … & Fan, J. (1999). Robust principal component analysis for functional data. Test, 8(1), 1-73 (cited 9/12/17, 9/14/17)

Lu, X., & Marron, J. S. (2014). Analysis of juggling data: Object oriented data analysis of clustering in acceleration functions. Electronic Journal of Statistics, 8(2), 1842-1847 (cited 11/28/17)

MacQueen, J. B. (1967) Some Methods for Classification and Analysis of Multivariate Observations, Proceedings of 5-th Berkeley Symposium on Mathematical Statistics and Probability, 281-297, University of California Press, Berkeley  (cited 10/26/17)

Maggiora, G. M. (2006). On outliers and activity cliffs why QSAR often disappoints  (cited 8/19/17)

Marčenko, V. A., & Pastur, L. A. (1967). Distribution of eigenvalues for some sets of random matrices. Mathematics of the USSR-Sbornik, 1(4), 457 (cited 11/30/17)

Maronna, R., Martin, D., and Yohai, V. (2006) Robust Statistics: Theory and Methods, Wiley, New York (cited 2/4/16)

Marron, J. S. & Alonso, A. M. (2014) Overview of object oriented data analysis, Biometrical Journal, 56, 732-753 (cited 8/22/17, 8/24/17)

Marron, J. S., Ramsay, J. O., Sangalli, L. M., & Srivastava, A. (2014). Statistics of time warpings and phase variations. Electronic Journal of Statistics, 8(2), 1697-1702 (cited 11/28/17)

Marron, J. S., Ramsay, J. O., Sangalli, L. M., & Srivastava, A. (2015). Functional data analysis of amplitude and phase variation. Statistical Science, 30(4), 468-484 (cited 11/28/17)

Marron, J. S., Todd, M. J., & Ahn, J. (2007). Distance-weighted discrimination. Journal of the American Statistical Association, 102(480), 1267-1271 (cited 9/26/17)

Marron, J. S., & Wand, M. P. (1992). Exact mean integrated squared error. The Annals of Statistics, 712-736 (cited 10/26/17)

McLachlan, G. J. (2004) Discriminant Analysis and Statistical Pattern Recognition, Wiley-Interscience (cited 9/19/17)

Miao, D. (2015) Class-Sensitive Principal Components Analysis, UNC PhD Dissertation, https://cdr.lib.unc.edu/record/uuid:853d8c52-5b4a-4607-afff-9554b68bb6f5 (cited 9/21/17)

Miedema, J., Marron, J. S., Niethammer, M., Borland, D., Woosley, J., Coposky, J. & Thomas, N. E. (2012). Image and statistical analysis of melanocytic histology. Histopathology, 61(3), 436-444 (cited 9/5/17, 10/17/17)

Milasevic, P. and Ducharme, J. R. (1987) Uniqueness of the spatial median, Annals of Statistics, 15, 1332-1333 (cited 9/14/17)

Owen, S. J. (1998) A survey of Mesh Generation Technology, http://www.imr.sandia.gov/papers/imr7/owen_meshtech98.ps.gz (cited 11/7/17)

Paul, D. (2007). Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statistica Sinica, 17(4), 1617 (cited 10/5/17)

Pearson, K. (1901) On Lines and Planes of Closest Fit to Systems of Points in Space, Philosophical Magazine, 2, 559-572 (cited 8/22/17)

Pennec, X. (2015). Barycentric subspaces and affine spans in manifolds. In International Conference on Networked Geometric Science of Information (pp. 12-21). Springer, Cham. (cited 11/16/17)

Pennec, X. (2016). Barycentric Subspace Analysis on Manifolds. arXiv preprint arXiv:1607.02833. (cited 11/16/17)

Perou, C. M., Sorlie, T., Eisen, M. B., & Van De Rijn, M. (2000). Molecular portraits of human breast tumours. Nature, 406(6797), 747 (cited 9/28/17, 10/31/17)

Pizer, S. M., Jung, S., Goswami, D., Vicory, J., Zhao, X., Chaudhuri, R., … & Marron, J. S. (2013). Nested sphere statistics of skeletal models. In Innovations for Shape Analysis (pp. 93-115). Springer Berlin Heidelberg (cited 11/14/17)

Qiao, X., Zhang, H. H., Liu, Y., Todd, M. J., & Marron, J. S. (2010). Weighted distance weighted discrimination and its asymptotic properties. Journal of the American Statistical Association, 105(489), 401-414 (cited 10/5/17)

Ramsay, J. O. & Silverman, B. W. (2002) Applied Functional Data Analysis, Springer, N.Y. ISBN 0-387-95414-7 (cited 8/22/17)

Ramsay, J. O. & Silverman, B. W. (2005) Functional Data Analysis, 2nd Edition, Springer, N.Y. ISBN 0-387-40080-X (cited 8/22/17)

Ramsay, J. O. (2005) Functional Data Analysis Web Site, http://psych.mcgill.ca/misc/fda/  (cited 8/22/17)

Ramsay, J. O., Gribble, P., & Kurtek, S. (2014). Description and processing of functional data arising from juggling trajectories. Electronic Journal of Statistics, 8(2), 1811-1816 (cited 11/28/17)

Rao, C. R. (1958). Some statistical methods for comparison of growth curves. Biometrics, 14(1), 1-17  (cited 9/05/17)

Rondonotti, V., Marron, J. S., & Park, C. (2007). SiZer for time series: a new approach to the analysis of trends. Electronic Journal of Statistics, 1, 268-289 (cited 10/26/17)

Rousseeuw, P. J., & Leroy, A. M. (2005). Robust regression and outlier detection (Vol. 589). John Wiley & Sons (cited 9/14/17)

Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500), 2323-2326 (cited 11/14/17)

Royer, J.-Y. and Chang, T. (1991) Evidence for relative motions between the Indian and Australian Plates during the last 20 m.y. from plate tectonic reconstructions: Implications for the deformation of the Indo-Australian Plate, Journal of Geophysical Research, 96(B7), 11,779–11,802, doi:10.1029/91JB00897 (cited 11/9/17)

Sarle, W. S., and Kuo, A. H. (1993), The MODECLUS Procedure, Technical Report P-256, SAS Institute Inc., Cary  (cited 10/31/2017)

Schmitz, H. P. and Marron, J. S. (1992) Simultaneous estimation of several size distributions of  income, Econometric Theory, 8, 476-488  (cited 10/26/17)

Schölkopf, B., & Smola, A. J. (2002) Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT press (cited 9/21/17, 9/26/17)

Schwiegerling, J., Greivenkamp, J. E., & Miller, J. M. (1995) Representation of videokeratoscopic height data with Zernike polynomials. JOSA A, 12(10), 2105-2113 (cited 9/12/17)

Sen, S. K., Foskey, M., Marron, J. S., & Styner, M. A. (2008) Support vector machine for data on manifolds: An application to image analysis. In Biomedical Imaging: From Nano to Macro, 2008. ISBI 2008. 5th IEEE International Symposium on (pp. 1195-1198). IEEE (cited 11/9/17)

Shabalin, A. A., Tjelmeland, H., Fan, C., Perou, C. M., & Nobel, A. B. (2008) Merging two gene-expression studies via cross-platform normalization. Bioinformatics, 24(9), 1154-1160 (cited 9/28/17)

Shen, D., Shen, H., & Marron, J. S. (2013) Consistency of sparse PCA in high dimension, low sample size contexts. Journal of Multivariate Analysis, 115, 317-333 (cited 10/5/17)

Shen, D., Shen, H., Bhamidi, S., Muñoz Maldonado, Y., Kim, Y., & Marron, J. S. (2014). Functional data analysis of tree data objects. Journal of Computational and Graphical Statistics, 23(2), 418-438  (cited 12/05/17)

Shen, D., Shen, H., Zhu, H., & Marron, J. S. (2016) The statistics and mathematics of high dimension low sample size asymptotics. Statistica Sinica, 26(4), 1747 (cited 10/5/17)

Shen, H., & Huang, J. Z. (2008) Sparse principal component analysis via regularized low rank matrix approximation. Journal of multivariate analysis, 99(6), 1015-1034 (cited 10/5/17)

Siddiqi, K. and Pizer, S. M. (2007) Medial Representations Mathematics Algorithms and Applications, Springer, New York (cited 11/7/17)

Sommerfeld, M., Heo, G., Kim, P., Rush, S. T., & Marron, J. S. (2017). Bump hunting by topological data analysis. Stat  (cited 12/05/17)

Spellman, P. T., Sherlock, G., Zhang, M.Q., Iyer, V.R., Anders, K., Eisen, M.B., Brown, P.O., Botstein, D. and Futcher, B. (1998) Comprehensive Identification of Cell Cycle-regulated Genes of the Yeast Saccharomyces cerevisiae by Microarray Hybridization, Molecular Biology of the Cell, 9, 3273-3297 (cited 9/05/17)

Srivastava, A., Wu, W., Kurtek, S., Klassen, E., & Marron, J. S. (2011). Registration of functional data using Fisher-Rao metric. arXiv preprint arXiv:1103.3817 (cited 11/21/17)

Staudte, R. G. and Sheather, S. J. (2011) Robust Estimation and Testing, Wiley, New York (cited 9/12/17)

Stephens, M. A. (1974). EDF statistics for goodness of fit and some comparisons. Journal of the American Statistical Association, 69(347), 730-737  (cited 9/05/17)

Srivastava, M. S., Katayama, S., & Kano, Y. (2013). A two sample test in high dimensional data. Journal of Multivariate Analysis, 114, 349-358 (cited 8/31/17)

Tenenbaum, J. B., De Silva, V., & Langford, J. C. (2000). A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500), 2319-2323 (cited 11/14/17)

Toh, K. C., Todd, M. J. & Tutuncu, R. H. (1999) www.math.nus.edu.sg/~mattohkc/sdpt3.html  (cited 9/26/17)

Tracy, C. A., & Widom, H. (1994). Level-spacing distributions and the Airy kernel. Communications in Mathematical Physics, 159(1), 151-174 (cited 11/30/17)

Tukey, J. W. (1977). Exploratory data analysis, Pearson, N.Y. ISBN 978-0201076165  (cited 8/24/2017)

Vapnik, V. N. (1982) Estimation of dependences based on empirical data, Springer (Russian version, 1979) (cited 9/26/17)

Vapnik, V. N. (1995) The nature of statistical learning theory, Springer (cited 9/26/17)

Vidal, R., Ma, Y., & Sastry, S. (2016). Generalized principal component analysis, Springer (cited 9/26/17)

Wand, M. P., & Jones, M. C. (1994). Kernel smoothing. CRC Press (cited 10/17/17)

Wang, H. and Marron, J. S. (2007) Object oriented data analysis: sets of trees, Annals of Statistics, 35, 1849-1873  (cited 8/22/16)

Wang, B., & Zou, H. (2016). Sparse Distance Weighted Discrimination. Journal of Computational and Graphical Statistics, 25(3), 826-838 (cited 9/26/17)

Wei, S., Lee, C., Wichers, L., & Marron, J. S. (2015). Direction-projection-permutation for high dimensional hypothesis tests. Journal of Computational and Graphical Statistics (cited 8/31/17, 10/5/17)

Wilkinson, L. (2017). Visualizing Big Data Outliers through Distributed Aggregation. IEEE Transactions on Visualization and Computer Graphics (cited 9/12/17)

Wright, F. A., Strug, L. J., Doshi, V. K., Commander, C. W., Blackman, S. M., Sun, L., … & Corey, M. (2011). Genome-wide association and linkage identify modifier loci of lung disease severity in cystic fibrosis at 11p13 and 20q13.2. Nature genetics, 43(6), 539-546 (cited 9/19/17)

Xiong, J., Dittmer, D. P., & Marron, J. S. (2015). “Virus hunting” using radial distance weighted discrimination. The Annals of Applied Statistics, 9(4), 2090-2109 (cited 10/17/17)

Yao, J., Zheng, S., & Bai, Z. D. (2015). Sample covariance matrices and high-dimensional data analysis. Cambridge University Press (cited 11/30/17)

Yata, K., & Aoshima, M. (2009) PCA consistency for non-Gaussian data in high dimension, low sample size context. Communications in Statistics—Theory and Methods, 38(16-17), 2634-2652 (cited 10/5/17)

Yata, K., & Aoshima, M. (2010a) Effective PCA for high-dimension, low-sample-size data with singular value decomposition of cross data matrix. Journal of multivariate analysis, 101(9), 2060-2077 (cited 10/5/17)

Yata, K., & Aoshima, M. (2010b) Intrinsic dimensionality estimation of high-dimension, low sample size data with d-asymptotics. Communications in Statistics—Theory and Methods, 39(8-9), 1511-1521 (cited 10/5/17)

Yata, K., & Aoshima, M. (2012) Effective PCA for high-dimension, low-sample-size data with noise reduction via geometric representations. Journal of multivariate analysis, 105(1), 193-215 (cited 10/5/17)

Yata, K., & Aoshima, M. (2013) PCA consistency for the power spiked model in high-dimensional settings. Journal of multivariate analysis, 122, 334-354 (cited 10/5/17)

Yu, Q., Lu, X., & Marron, J. S. (2017). Principal Nested Spheres for Time-Warped Functional Data Analysis. Journal of Computational and Graphical Statistics, 26(1), 144-151 (cited 11/28/17)

Yushkevich, P., Pizer, S. M., Joshi, S., and Marron, J. S. (2001) Intuitive, localized analysis of shape variability, Information Processing in Medical Imaging (IPMI), eds. Insana, M. F. and Leahy, R. M. 402-408 (cited 11/7/17)

Zhang, L., Lu, S., & Marron, J. S. (2015). Nested nonnegative cone analysis. Computational Statistics & Data Analysis, 88, 100-110 (cited 11/14/17)

Zhao, X., Marron, J.S. and Wells, M.T. (2004) The Functional Data View of Longitudinal Data, Statistica Sinica, 14, 789-808 (cited 9/05/17, 10/26/17)

Zhou, Y. H., & Marron, J. S. (2015). High dimension low sample size asymptotics of robust PCA. Electronic Journal of Statistics, 9(1), 204-218 (cited 10/5/17)

Zhou, Y. H., & Marron, J. S. (2016). Visualization of robust L1PCA. Stat, 5(1), 173-184 (cited 9/19/17)