STAT 440 Spring 2026 Syllabus

STAT 440 Statistical Data Management

Spring 2026

3/4 Credit Hours - Major Elective

Sections 1UG/1GR and 2UG/2GR


About the Instructor

Christopher Kinson is the Instructor. His email address is kinson2@illinois.edu. He is a Teaching Associate Professor in the Department of Statistics. His bio and more information about him may be found at chriskinson.com. His faculty profile image is below.

Christopher Kinson


Course Description

Statistical Data Management (STAT 440) is a focused data wrangling course that aims to cover various types of data storage, extraction, manipulation, cleaning, and visualization and to apply these methodologies in R in a reproducible fashion. By reproducible, we mean verifiable by any computer running the exact same code and receiving the exact same result as the original source. Accordingly, students must have a laptop that they bring to class each day. This course includes lectures, archived lecture videos, notes, readings, and assessments in both digital and paper formats. Notes are reproducible documents that include R code examples. Videos and notes should be viewed outside of class, ideally before each class meeting. The classroom space and time will be for practice, application, and assessment of course content.

The expectation is that students will gain competency in exploring, organizing, designing, creating, storing, cleaning, wrangling, sharing, visualizing, and using data, all of which are commonly done prior to data analysis. Creative thinking, critical thinking, and efficient coding will be encouraged. Concepts covered in this course will build upon each other. Thus, students can expect all assessments to be cumulative in nature. An auto-grader is used to grade most assignments. Students should be comfortable using R prior to beginning this course.

R will be the only programming language covered in this course. R will primarily be used inside of RStudio. RStudio offers reproducible documentation with Markdown syntax, which will support long-term learning opportunities. With the advancement of large language models, this course will allow students to use Copilot within RStudio. Copilot is an AI tool from GitHub (a Microsoft company) that suggests code and can be used while coding in RStudio, assuming the student has the proper settings and access.


Learning Objectives

These learning objectives are important because they connect the physical know-how with the technical knowledge of the course.

  • Students will assess effectiveness, organization, and intent from a published data set
  • Students will explore data sets of various types
  • Students must design well-organized, clean data sets for the purpose of data analysis
  • Students will present and submit data management work that is reproducible
  • Students must demonstrate critical thinking and creativity through asking questions about a given data set
  • Students must be able to explain and summarize data wrangling code
  • Students will share and discuss data management ideas, coding snippets, and other thoughts to aid in meaningful dialogue
  • Students must recall important data management concepts
  • Students will reflect on their own learning of data management principles
  • Students will build data wrangling tools, apps, and dashboards
  • Students will collaborate on lab assignments
  • Students will reproduce and replicate data visualizations

Prerequisites

The prerequisites for this course are the following:

  • STAT 400 or STAT 409

  • A laptop with the most up-to-date versions of R and RStudio installed. If using a netbook or Chromebook, please set up a Posit Cloud account.

  • Operating knowledge of computers such as locating a file, creating a directory, saving a file, compressing a file, extracting a compressed file, keyboarding, and fundamental troubleshooting.

  • Operating knowledge of R such as understanding various object types, mathematical and logical operators, and value types and their coercion, as well as creating user-defined functions and fundamental R troubleshooting.


Meeting Schedule

  • For section 1UG/1GR, the class meets from 4:00 pm to 4:50 pm in Room 32 of the Psychology Building on Mondays, Wednesdays, and Fridays.
  • For section 2UG/2GR, the class meets from 3:00 pm to 3:50 pm in Room 32 of the Psychology Building on Mondays, Wednesdays, and Fridays.

The Psychology Building is located at 603 E Daniel St, Champaign, IL 61820.


Expectations

All students are expected to fully participate in class regularly and to do the following before coming to class each week:

  • read the course notes
  • practice coding in R
  • review lecture notes
  • review archived lecture videos
  • create and answer relevant questions with the assistance of Copilot

Students can expect that on Mondays, there will be an initial assessment in the form of a training lab, where students can collaborate to complete assigned problems. These initial assessments are meant to show where students are in their learning and what adjustments or corrections can be made to improve their learning of the course material. On Wednesdays, the Instructor will give a lecture and review of the material for that week. On Fridays, there will be a more intense assessment in the form of a test lab or, occasionally, a paper exam.


Office Hours

Office hours will be held remotely on Zoom on Thursdays, 6:30 pm - 7:30 pm.

If a student has a specific question, but cannot attend the office hours, then that student should post their question in the discussion board. If a student wants one-on-one assistance from the Instructor at an alternative time, then that student should email the Instructor in order to schedule a Zoom meeting. 


Textbooks

There is no required textbook, but students may find the texts below to be helpful. These are all free and accessible to students for further reading. The Instructor may refer to certain sections of these texts in the course content. The asterisk * means these are accessible from the University Library as E-books. 


Software

The course requires students to already have a fundamental and operational understanding of R. Students with no familiarity with R should understand that this course will not cover the fundamental and operational usage of R. The following are download and install links to the software regularly used in this course. As with any software, it is imperative to keep it up to date.


Calendar

Below is a calendar of topics, tentative assignments, and due dates. Any due dates apply to all sections of STAT 440 unless otherwise noted.

Weekly course topics and assignments with due dates

| Week | 2026 Date | Topic | Assignment (Due Date) |
|---|---|---|---|
| 1 | 01/19 - 01/23 | Introduce the course and software, Copilot tutorial, Auto-grader tutorial | |
| 2 | 01/26 - 01/30 | Structures of data, Delimiters and file extensions, Using the pipe operator, Handling dates and times, Exporting data | |
| 3 | 02/02 - 02/06 | Accessing and importing data | training_lab01 (Monday 02/02), test_lab01 (Friday 02/06) |
| 4 | 02/09 - 02/13 | Shiny apps and dashboards | training_lab02 (Monday 02/09), test_lab02 (Friday 02/13), homework01 (Friday 02/13) |
| 5 | 02/16 - 02/20 | Data visualization I | training_lab03 (Monday 02/16), test_lab03 (Friday 02/20) |
| 6 | 02/23 - 02/27 | Data visualization II | training_lab04 (Monday 02/23), test_lab04 (Friday 02/27), homework02 (Friday 02/27) |
| 7 | 03/02 - 03/06 | Arranging data, Data reduction methods | training_lab05 (Monday 03/02), paper_exam01 (Friday 03/06) |
| 8 | 03/09 - 03/13 | Reshaping data, Data expansion methods | training_lab06 (Monday 03/09), test_lab05 (Friday 03/13), homework03 (Friday 03/13) |
| 9 | 03/16 - 03/20 | Spring Break, No class, No office hour | |
| 10 | 03/23 - 03/27 | String manipulation, Regular expression | training_lab07 (Monday 03/23), test_lab06 (Friday 03/27) |
| 11 | 03/30 - 04/03 | Summarizing data | training_lab08 (Monday 03/30), test_lab07 (Friday 04/03), homework04 (Friday 04/03) |
| 12 | 04/06 - 04/10 | Combining data | training_lab09 (Monday 04/06), paper_exam02 (Friday 04/10) |
| 13 | 04/13 - 04/17 | SQL databases and queries | training_lab10 (Monday 04/13), test_lab08 (Friday 04/17) |
| 14 | 04/20 - 04/24 | SQL queries and sub-queries | training_lab11 (Monday 04/20), test_lab09 (Friday 04/24), homework05 (Friday 04/24) |
| 15 | 04/27 - 05/01 | Creative and critical thinking, final project preparation | reflective_survey (Friday 05/01) |
| 16 | 05/04 - 05/08 | Data science careers | paper_exam03 (Wednesday 05/06) |
| 17 | 05/11 - 05/15 | Finals week | final_project (Wednesday 05/13) |

Grading Breakdown

  • 1 Reflective Survey: 2 points total
  • 1 Final Project: 10 points total
  • 3 Paper Exams: 36 points total (12 points each)
  • 5 Homework Assignments: 20 points total (4 points each)
  • 9 Test Labs: 36 points total (4 points each)
  • 11 Training Labs: 22 points total (2 points each)

Course Total Points: 126 points 
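
As a quick check of the arithmetic, the breakdown above can be tabulated in R. This is only an illustrative sketch: the object names `grading` and `course_total` are made up for this example and are not part of any course tooling.

```r
# Counts and per-assignment points copied from the grading breakdown above.
grading <- data.frame(
  category = c("Reflective Survey", "Final Project", "Paper Exams",
               "Homework Assignments", "Test Labs", "Training Labs"),
  count = c(1, 1, 3, 5, 9, 11),
  points_each = c(2, 10, 12, 4, 4, 2)
)

# Points per category are the number of assignments times the points for each.
grading$points_total <- grading$count * grading$points_each

# The course total is the sum over all categories.
course_total <- sum(grading$points_total)

print(grading)
print(course_total)
```

The category totals reproduce the figures listed above, and their sum matches the stated course total of 126 points.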


Final Letter Grades

When computing final grades, students can add up their scores on the assignments. The resulting sum will determine which letter grade they earn when the course is completed. There is only one + letter grade in this course. All other letter grades are without +/-. Points are not rounded.

Total course points (to three decimal places) to determine final letter grade

| Lower Bound | Upper Bound | Letter Grade |
|---|---|---|
| 122.850 | 126.000 | A+ |
| 110.250 | 122.849 | A |
| 97.650 | 110.249 | B |
| 85.050 | 97.649 | C |
| 72.450 | 85.049 | D |
| 0 | 72.449 | F |
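
Students who want to look up a letter grade from their point total could use something like the following sketch. The function name `letter_grade` is hypothetical, and the code assumes that each lower bound in the table above acts as the cutoff for its letter grade; it is not the official grade calculation.

```r
# Hypothetical helper: map total course points to a letter grade using the
# lower bounds from the table above. Points are used as given (no rounding).
letter_grade <- function(points) {
  stopifnot(is.numeric(points), all(points >= 0), all(points <= 126))
  lower_bounds <- c(0, 72.450, 85.050, 97.650, 110.250, 122.850)
  grades <- c("F", "D", "C", "B", "A", "A+")
  # findInterval() returns, for each value, the index of the last lower
  # bound that is less than or equal to that value.
  grades[findInterval(points, lower_bounds)]
}

# Example usage with a few point totals.
letter_grade(c(126, 110.25, 97.649, 72.449))
```

Because `findInterval()` is vectorized, the same function works for a single total or for a vector of totals.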

Instructional Activities

Students should read the course notes and practice the code therein, annotate the lectures, watch the archived lecture videos as supplemental materials, and attempt the assignments. If or when students get stuck, they should ask questions i) in Discussions, ii) in Office Hours, or iii) via email (preference in this order). The following activities and tools will be useful for students.

Course Note

The course notes are reproducible documents that contain text, images, and code chunks. The notes are written in RMarkdown syntax and saved as .Rmd files. The notes may be rendered as .html files for easy reading and navigation. Students should read the notes before coming to class each week. Students should also practice coding the examples (code chunks) within the course notes. The course notes perform the duty of a textbook for this course. Yes, there is a lot of information in the notes, but it is useful to read it for the important parts and return to it for details after attempting the assignments.

In-person Lecture

In-person lecture will be used to review and clarify concepts from the course notes and archived lecture videos. In-person lecture will also be used to introduce new concepts that are not always covered in the course notes or archived lecture videos, answer questions from students, and provide guidance on how to approach and complete certain assignments.

Archived Lecture Video

The archived lecture videos are supplemental materials that cover similar content to the course notes, but may go into more detail with worked-out examples. Students should watch the archived lecture videos before coming to class each week. Students should also practice coding the examples (code chunks) within the course notes while watching the videos.

Discussion

Canvas's Discussions is one of the best ways to communicate with classmates and the Instructor. Questions can be seen quickly and receive a rapid response. Students are encouraged to use this board, but there is no requirement to participate in the discussion board. Do use the board to openly discuss ideas about the course such as questions about content, deadlines, notes, data, etc. The things discussed here should be non-personal and non-private. If a student has a personal or private matter to discuss, then that student should email the Instructor. Additionally, the conversation in the discussion board should be respectful of people's differences.


Assignments

This course is open to undergraduate and graduate students. Graduate students will be expected to complete additional work in the course to justify the 4 credits. For graduate students to earn 4 credits, they must complete additional problems within the homework and test labs. These additional problems will be written and marked as for graduate students only. 

Reflective Survey

The reflective survey is a short digital survey that students complete near the end of the semester. The survey will ask students to reflect on their learning in the course, what they liked, what they did not like, specific aspects of technology usage, and how the course might be improved. The survey is worth 2 points and is due on Friday of Week 15 (May 1, 2026). 

Final Project

The final project is a Shiny app that students create in R and demonstrate in a video presentation. The project is worth 10 points total. The project is due on Wednesday of Week 17 (May 13, 2026). Students must submit the .R script file in their repo.

Paper Exam

The paper exam is an in-person exam that will be given on select dates, mostly Fridays. The exam will be closed-resource, i.e., no notes, no book, and no assistance of any kind. Paper exams are graded for correctness, and attendance is checked. See the calendar for due dates and the Attendance, Absence, and Missing Assignments Policy.

Homework

These are assignments that students complete individually. The submitted assignment must follow reproducibility guidelines and must not contain executable errors or warnings. For undergraduate students, each assignment will have 4 problems summing to 4 points, such that each problem is worth 1 point. For graduate students, each assignment will have 5 problems summing to 4 points, such that each problem is worth 0.8 points. See the calendar for due dates.

Test Lab

These are in-person lab sessions that are due at the end of the class period on most Fridays. Test labs are intended to challenge students to apply, in a short period of time, the concepts covered that week along with earlier material. These are graded for correctness, and attendance is checked. For undergraduate students, each test lab will have 4 problems summing to 4 points, such that each problem is worth 1 point. For graduate students, each test lab will have 5 problems summing to 4 points, such that each problem is worth 0.8 points. See the calendar for due dates.

Training Lab

These are in-person lab sessions that are due at the end of the class period on Mondays. Training labs are intended to give students an opportunity to fail productively such that accurate knowledge of content and skills may be corrected and reframed in later assignments. These are graded for completion and attendance is checked. See the calendar for due dates.


Auto-grader

The code we write in this course must be reproducible: verifiable by any computer running the exact same code and receiving the exact same result as the original source. It is important that code does not contain executable errors and warnings. Submitting code with executable errors and warnings shows that a learner is not following one of the course learning objectives. Submitting error-producing code also shows that there is no regard for what reproducibility means. There is an auto-grader used in this course to grade most assignments. The auto-grader is not forgiving. It scans the entire file and checks for base R executable errors and warnings, and it grades the assignment for correctness and completion. Objects created at the top of the file which are overwritten at the bottom of the file are considered incorrect by the auto-grader. When the auto-grader detects a base R executable error or warning, it stops grading the learner's submission and assigns a grade of 0 for the assignment.

To follow reproducible coding guidelines and prevent executable errors and warnings, be sure to do the following (in no particular order):

  • After beginning a new clean session, run all your code to ensure there are no executable errors or warnings.
    • Some warnings are specific to a package and may not count as base R executable errors or warnings.
  • Comment out any erroneous code using the hash symbol #.
    • Doing so prevents the auto-grader from executing it. This is useful if you don't know how to correct your errors or warnings before the deadline.
  • Comment out or remove any install.packages() in your code chunks.
  • Always use URLs for accessing and importing data.
    • Local file locations are not reproducible.
  • If timing permits, knit the file to HTML to see if any error occurs.
    • Remember that local data files will knit as if the work is reproducible, but local files are not URLs. Thus, another computer will not be able to access your computer's local files. Hence, that kind of work is actually irreproducible.
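
Two of the guidelines above (no active install.packages() calls, no local file paths) can be spot-checked before submitting. The sketch below is only an illustration: `flag_irreproducible_lines` is a hypothetical helper, not the course auto-grader, and the local-path check is a rough heuristic that only looks for quoted strings beginning with a drive letter, `/`, or `~`. The URL and file path in the example are placeholders.

```r
# Hypothetical helper (NOT the course auto-grader): flag lines of a script
# that call install.packages() or appear to read from a local file path.
flag_irreproducible_lines <- function(lines) {
  # Ignore everything after the first '#', so commented-out code is skipped.
  # Simplification: a '#' inside a quoted string is also treated as a comment.
  code <- sub("#.*$", "", lines)
  has_install <- grepl("install\\.packages[[:space:]]*\\(", code)
  # Heuristic: a quoted string that starts with a drive letter, '/', or '~'.
  has_local_path <- grepl("[\"']([A-Za-z]:|/|~)", code)
  flagged <- data.frame(
    line = seq_along(lines),
    install_packages = has_install,
    local_path = has_local_path
  )
  flagged[has_install | has_local_path, ]
}

# Example script lines; the URL and the local path are placeholders.
script <- c(
  "# install.packages(\"readr\")",
  "install.packages(\"dplyr\")",
  "dat <- read.csv(\"C:/Users/me/data.csv\")",
  "dat <- read.csv(\"https://example.com/data.csv\")"
)
flag_irreproducible_lines(script)
```

In practice the input could come from `readLines()` on the submission file. A check like this does not replace running all code in a new clean session, since it cannot detect executable errors or warnings.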

Grade Disputes

A grade dispute is when a grade has been incorrectly applied to an assignment and the learner has evidence supporting the fact that the grade is incorrectly applied. A grade dispute is not a plea or request to change a grade simply because a learner does not like their original grade. Please email the Instructor with your disputes within 7 days (i.e. 1 week) of your grade being returned.


Late, Improper, or Irreproducible Submissions Policy

An assignment is considered a late submission when it is submitted by a learner in the proper location after the assignment deadline. An assignment is considered an improper submission when it is submitted by a learner outside of the appropriate location or with the wrong file name. An assignment is considered an irreproducible submission when it is submitted by a learner and the code within the file produces an executable error, so that there is no way to reproduce the coding result that the original submission presumes. A submission may be any combination of late, improper, or irreproducible. Any late, improper, or irreproducible submission will be neither accepted nor graded. This policy applies to all assignments, including paper exams and the final project, with exceptions defined in the Attendance, Absence, and Missing Assignments Policy.


Attendance, Absence, and Missing Assignments Policy

Students are expected and required to attend class each day it meets. If a student misses an assignment for any reason other than official University business with proper documentation, then that assignment is considered late and thus the Late, Improper, or Irreproducible Submissions Policy is applicable. The exceptions to this policy apply to documented illnesses and family emergencies when a learner misses any paper exam. A make-up paper exam will be offered if and only if there is documentation for the absence. A paper exam may be made up within 7 days of the original due date of that missed paper exam. Keep in mind that the final letter grade cutoffs are set to allow for some missed assignments.


Disability Accommodations

To obtain disability-related academic adjustments and/or auxiliary aids, learners with disabilities must contact the course Instructor and the Disability Resources and Educational Services (DRES) as soon as possible. To contact DRES, learners may visit 1207 S. Oak St., Champaign, call 333-4603, e-mail disability@illinois.edu, or go to the DRES website.


Academic Integrity and Generative Artificial Intelligence Tools

It is expected that all learners abide by the campus regulations on academic integrity. Intentional violations of academic integrity include, but are not limited to, copying any part of another learner's assignment and allowing another learner to copy any part of the learner's own assignment.

Generative artificial intelligence tools can be useful in learning and studying. If learners use generative AI tools in this course, we suggest doing so outside of class as a means of studying and learning accurate information relevant to this course's content. Learners are permitted to use generative artificial intelligence tools on graded assignments in this course. Beware that multiple learners with the same exact code solution may be in violation of academic integrity. 

It is important to understand the course content and code for yourself and adapt code to be in alignment with the course content and trajectory. Using complex code simply because it is suggested by generative AI demonstrates a lack of understanding of the actual course material and calls into question one's own ability to be curious, critical, and skeptical. Furthermore, reliance on generative AI tools may lead to dependence on their use and a lack of individuality.

This course is concerned with the way learners think and create and their ability to adapt that creativity in various conceptual settings and environments. This course aims to challenge all learners to retain and exercise their own individual knowledge and power.


Safety Protocol 

We have been asked by Public Safety to share the following information in case of weather or security emergencies. See the links:


Sexual Misconduct Policy and Reporting

The University of Illinois is committed to combating sexual misconduct. Faculty and staff members are required to report any instances of sexual misconduct to the University's Title IX and Disability Office. In turn, an individual with the Title IX and Disability Office provides information about rights and options, including accommodations, support services, the campus disciplinary process, and law enforcement options.

This is a list of the designated University employees who, as counselors, confidential advisors, and medical professionals, do not have this reporting responsibility and can maintain confidentiality. To report an incident or find support, check these resources.


The Last Word

The Instructor reserves the right to make any changes considered to be academically advisable. Any changes will be announced in class and on Canvas. It is the student's responsibility to attend the class and keep track of the changes.