Data Science Essentials: Statistics, Tools, and Techniques, July 11 to August 11, 2025
The Vancouver Summer Program (VSP) in Statistics and Data Science is a four-week, intensive, hands-on learning experience for international students. Students will learn best practices in the field of statistics and data science in Canada while immersing themselves in Canadian society and culture through interactive classes and social activities. They will have a chance to explore what it is like to study data science in Canada, with new friends from around the world. The University of British Columbia data science program consists of two intensive courses, each covering 39 hours of class time.
In this course, students will develop their statistical literacy and reasoning. They will also build a deeper understanding of data in order to make data-driven decisions. Through carefully designed lectures, practical exercises, assignments, and a group project, participants will acquire a fundamental understanding of study design and data collection. This will enable them to organize their own analysis projects systematically and to collect representative data. The course learning objectives are: i) build basic data literacy and reasoning, ii) improve students' ability to formulate data-driven problems, iii) design and correctly apply statistical methods such as data visualization, t-tests, ANOVA, correlation, and linear regression, iv) understand data trends, anomalies, and patterns, v) interpret results, and vi) develop written and oral communication skills to present data-driven insights. This course is ideal for students seeking a robust methodological foundation in data analysis and statistical reasoning, with applications in diverse fields to support evidence-based decision-making. By the end of the course, students will have completed a full data science project: conceiving the project, understanding the study design, conducting the analysis, and presenting their findings in oral and written form.
In this course, students will gain practical skills for implementing data science best practices. Students will learn to code effectively in the R programming language throughout the course (no previous R experience is expected). They will master data manipulation, figure creation, and the application of advanced statistical models to datasets. Students will also develop strong collaboration and organization skills by immersing themselves in the world of reproducible research. By applying their command of statistical reasoning, visualization, and analysis, students will be able to tell compelling data-driven stories in the course's final presentation. The main learning objectives of this course are: i) understand the essentials of coding in R, ii) master data manipulation and visualization, iii) develop reproducible data workflows, iv) understand fundamental machine learning concepts and implement machine learning methods, and v) accurately assess the predictive performance of models. This course is ideal for students seeking hands-on technical data science skills that will help them implement appropriate data analyses and conduct collaborative research in the future.
While the program is intensive, students consistently leave with a sense of accomplishment and inspiration for the next step in their academic journeys.