
St. Martin’s Senior Bo Fan Hua Finishes in Top 10 of the After-The-AP National Data Science Championship

August 21st, 2023


Among 2,500 students from over 35 states and a range of academic backgrounds, including students who had completed AP Statistics, AP Computer Science, and AP Calculus, St. Martin’s Episcopal School senior Bo Fan Hua finished 10th in the recently announced results of the National Data Science Championship. Hua was the only student in Louisiana to finish in the Top 10.


“All of us at St. Martin’s are exceedingly proud of Bo Fan Hua,” said Head of School Whitney Samuel Drennan. “It is an incredible achievement, and one that we believe is reflective of our curriculum. At St. Martin’s, our students are encouraged to think critically and collaborate with each other to solve real-world challenges.”

Skew the Script, a nonprofit education initiative led by a collaborative of AP Statistics teachers, organized the national data science challenge for high school students to take on after their AP examinations. The challenge was run in collaboration with Data Science 4 Everyone at The University of Chicago, the North Carolina State University Data Science Academy, and the CourseKata platform.


This past school year, competitors took on a question that matters to many graduates: Which colleges or universities give the greatest return on financial investment, or “pay off” the most? The prompt challenged participants to build the best possible model for predicting student loan default rates at colleges across the country.


Leveraging data from the U.S. Department of Education’s College Scorecard and IPEDS portals, students filtered, analyzed, modeled, visualized, and communicated complex data on U.S. colleges and universities. The mission was to conquer the college debt problem by maximizing financial return and minimizing the likelihood of defaulting on student loans.


The College Scorecard data set is large and complex. Students worked with a portion of it, covering 26 variables across 4,435 colleges, and were encouraged to choose carefully which information to include in, or exclude from, their final submissions. Most had minimal programming experience, having only recently completed an AP Statistics course, and they had at most two weeks to find a good answer.


Despite these challenges, over 80% of participants completed all components of the project. That level of commitment and engagement demonstrated that even highly technical data science skills are accessible to high school students. Furthermore, when those skills are applied to problems that are genuinely relevant to them, students engage deeply with the investigation process.

Using Jupyter Notebooks and R, an open-source statistical programming language, competitors applied digital technology, advanced algebra, and complex statistical techniques to their analyses. To succeed in the challenge, students needed to master linear regression, polynomial regression, and the basics of machine learning (a branch of artificial intelligence), along with an intuition for calculus.
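For readers curious about what comparing a linear and a polynomial regression involves, here is a minimal sketch. It is not the competition's method or any student's submission. Competitors worked in R, while this illustration uses plain Python, and the data are synthetic. The predictor (a rescaled "earnings index"), the curve used to generate default rates, and all function names are assumptions made up for the example.

```python
import random

def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients ordered from the constant term upward.
    """
    n = degree + 1
    # Normal equations: (X^T X) coeffs = X^T y, where X has columns 1, x, x^2, ...
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs

def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def rmse(coeffs, xs, ys):
    """Root-mean-square prediction error on the given points."""
    squared = [(predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(squared)) ** 0.5

# Synthetic data (NOT College Scorecard data): a hypothetical earnings index
# in [0, 1] and a default rate that follows an assumed curve plus noise.
rng = random.Random(0)
xs = [rng.random() for _ in range(200)]
ys = [0.25 - 0.40 * x + 0.25 * x ** 2 + rng.gauss(0, 0.01) for x in xs]

# Hold out the last 60 points to judge each model on data it has not seen.
train_x, test_x = xs[:140], xs[140:]
train_y, test_y = ys[:140], ys[140:]

linear = fit_polynomial(train_x, train_y, degree=1)
quadratic = fit_polynomial(train_x, train_y, degree=2)

linear_rmse = rmse(linear, test_x, test_y)
quadratic_rmse = rmse(quadratic, test_x, test_y)
print("linear test RMSE:   ", round(linear_rmse, 4))
print("quadratic test RMSE:", round(quadratic_rmse, 4))
```

Because the synthetic default rates were generated from a curved relationship, the quadratic model predicts the held-out points more accurately than the straight line. Judging models on held-out data rather than on the data used to fit them is one common way to decide which model is "best" for prediction.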
