I am a Ph.D. student in the Department of Statistics at Iowa State University, actively looking for full-time opportunities in data science, applied statistics, and machine learning.
I love working in cross-functional teams, collaborating with people from multiple fields to solve interesting interdisciplinary problems in the real world.
Imputation for satellite image data and change-point detection for urbanization processes, in the remote sensing/machine learning group.
Error analysis for tree-based models, plus a monitoring, alerting, and root-cause analysis system in Python for model observability, in the marketplace forecasting team.
Develop client reports with R Shiny and other tools, create reporting templates, research new Shiny features and document best practices.
Lab TA for STAT 404 Regression Social & Behavioral Research, STAT 407 Methods of Multivariate Analysis and STAT 201 Principles of Statistics Honors.
Discussion TA for MATH 1160 Finite Mathematics and an Introduction to Calculus and MATH 3280 Differential Equations with Linear Algebra.
Motivated by the Study of Women's Health Across the Nation dataset, we analyze how hundreds of variables from FFQ and physical measurements influence performance on cognitive function tests. However, the exposure variables, including nutrient intake, and the cognitive test outcomes were measured asynchronously. We propose a sparse FPCA-based calibration of asynchronous longitudinal data for regression and variable selection. Theoretical work is in progress.
View Project
We improve the STFIT imputation algorithm in several respects to provide more accurate and complete satellite image data for water classification. This enables a random forest to classify water pixels and to outperform JRC in capturing temporal changes in water area. Work on improving computational efficiency for imputing global data is in progress.
View Project
Detecting an urbanization process and estimating when it occurred is critical to the study of urbanization, and it will facilitate the development of urban growth models and the investigation of further environmental impacts. We develop and implement a class of sparse functional change-point detection methods based on FPCA, CUSUM, and ensembling to test for and estimate the change year in the urbanization process using the Landsat dataset.
View Manuscript
NASA's CO2-monitoring satellite, the Orbiting Carbon Observatory-2 (OCO-2), aims to provide a comprehensive measurement network for CO2 concentration, which is retrieved from high-resolution spectra of reflected sunlight. However, due to a large amount of missing radiance data and unstable land fraction estimates, the spatial coverage of the retrieval algorithm is limited. We propose an approach to modeling spatially dependent hyperspectral data so that radiance imputation and land fraction estimation can both be handled well.
View Manuscript
We developed a Bayesian logistic regression model for binary classification using 300 continuous features, with only 250 training samples against 19,750 cases in the test set. It ranked 300th out of 2,330, in the top 13%.
View Report
We reviewed three papers on space-time covariance functions on the sphere and successfully implemented global space-time models for climate ensembles. The work was awarded Meritorious Research in the Advanced Spatial Statistics class.
View Project