STAT 385 Spring 2024 Unofficial Syllabus

STAT 385 Statistical Data Management

3 Credit Hours - Major Elective

Section S1

Spring 2024 - Unofficial Syllabus

Table of Contents


Course Description

Statistics Programming Methods (STAT 385) is a programming course designed to establish the foundations of statistical computing. Potential topics include the following: command line operations, version control, objects, control structures, user-defined functions, mathematics (arithmetic, algebra, calculus), statistics (data and randomization), arrays, linear algebra and matrix manipulation, algorithms, string manipulation, regular expression, and debugging code. Critical and creative thinking and efficient coding will be encouraged. Additionally, students will learn basic knowledge of computers, such as locating a file, creating a directory, saving a file, compressing a file, extracting a compressed file, keyboard shortcuts, and fundamental troubleshooting. This course aims to present these programming methods concepts and skills via R and version control via Git and GitHub. R with RMarkdown offers reproducible documentation with Markdown syntax, which will support long-term learning opportunities, while Git propels students’ capacity for collaboration. Using Git in this course encourages students collaborating alone as individuals (pulling and pushing their own work) and with classmates (pulling and pushing everyone’s joint work). This means that students must have a laptop that they bring to class each day.

This course is not a traditional daily lecture format. Mondays will serve as lecture days, while Wednesdays and Fridays will mostly serve as assessment days. Some videos may be shared as supplemental to the readings and lecture. Readings are the primary source of learning material and will contain examples and code. Readings should be read and understood before (and outside of) class. Code within the readings should be attempted prior to being in class. The classroom space and time will be for application, practice, and assessment of course content in the readings, lecture, and supplemental materials such as videos and notes. The expectation is that students will gain strong fundamental mastery of coding, troubleshooting, and building programs in R. Concepts covered in this course will build upon each other. Thus, students can expect all assessments and assignments to be cumulative. Assignments will be graded with an autograder.


Learning Objectives

These learning objectives are important because they connect the physical know-how with the technical knowledge of the course.

  • Students must recall important coding concepts and workflows.

  • Students must construct reproducible code in digital notebook files and script files.

  • Students will present programming in a reproducible document file using Markdown syntax and code chunks. No local data files will be utilized.

  • Students must demonstrate critical thinking and creativity through asking questions.

  • Students must be able to explain and summarize R code.

  • Students will share and discuss programming ideas, code chunks, and other thoughts to aid in meaningful dialogue.

  • Students will reflect on their own learning of programming principles.

  • Students will build programs, tools, and games and store all work using git, a version control software, and GitHub.


Course Staff

  • Instructor - Christopher Kinson (kinson2@illinois.edu)


Course Specifics

Course Website

The course website is https://github.com/illinois-stat385. This course is operating as an organization named illinois-stat385 within GitHub. Students should bookmark or save the link below in their browser for future use, because it contains access points to all repositories and course materials including readings, assignment solutions, notes, and videos.

Prerequisites

The prerequisites for this course are the following:

  • A laptop (not a netbook) with most up-to-date versions of R and RStudio installed. If using a netbook or Chromebook, please setup a Posit Cloud (formerly RStudio Cloud) account.

  • STAT 107 or STAT 200 or STAT 212

Meeting Schedule and Expectations

  • For section S1, the class meets at 11:00 am - 11:50 am in Room 4029 of the Campus Instructional Facility (CIF) on Mondays, Wednesdays, and Fridays.

  • There will be in-person lectures. There may be videos from external sources posted on the Course Website (in videos directory of the course_content repository). Any videos posted can be considered supplemental learning opportunities.

  • All students are expected to fully participate in class regularly.

  • All students are expected to do the following before coming to class each week: read the readings, attend and annotate lectures, review supplemental videos and notes, and complete assignments.

  • Course content - syllabus, readings, assignment solutions, notes, videos, discussion board (as Discussions tab) - will be found on the Course Website via the course_content repository. Do check the course_content repo often for updates and announcements. Students are encouraged to clone and pull the course_content repository daily if accessing it remotely via git.

Office Hours

Any and all times listed in this document are in current US Central Time. Take care to adjust clocks when daylight savings time occurs.

Office hours will be in-person. If a student has a specific question, but cannot attend the office hours, then that student should post their question in the Discussions board. If a student wants one-on-one assistance from the course staff (kinson2@illinois.edu) at an alternative time, then that student should email the course staff in order to schedule a Zoom meeting.

  • Instructor in-person office hours:

    • Thursdays 05:15 pm - 07:15 pm in Room 166 Computer Applications Building (CAB)

Textbooks

There is no required textbook, but readings are required and will come from more than one book. Links will be provided to all readings. Below one may find textbooks that are useful in learning R by oneself.

R

  • R Ice Breaker. Sanchez. https://www.gastonsanchez.com/R-ice-breaker/
  • R Coding Basics. Sanchez. https://www.gastonsanchez.com/R-coding-basics/
  • An Introduction to R. Venables, Smith and the R Core Team. https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf
  • Deep R Programming. Gagolewski. https://deepr.gagolewski.com/
  • R for Data Science. Wickham and Grolemund. https://r4ds.had.co.nz/
  • Hands-On Programming with R. Grolemund. https://rstudio-education.github.io/hopr/
  • R Inferno. Burns. https://www.burns-stat.com/pages/Tutor/R_inferno.pdf

Software

The course assumes students are new to R and the version control system, Git.

Calendar

Below is a calendar of topics and tentative assignment deadlines. Readings are written in bold and required. Ideally, the readings are read in the order that they are listed below.

Week 2024 Dates Topics & Readings Covered
1 01/15 - 01/19 Bryan’s Why Git? Why GitHub?, Sanchez’s First Contact with R via RStudio, Sanchez’s First Contact with RStudio
2 01/22 - 01/26 Bryan’s Git Commands, Bryan’s Git Branches, Gagolewski’s Preface
3 01/29 - 02/02 Sanchez’s First Contact with Vectors, Sanchez’s Properties of Vectors, Sanchez’s Creating Vectors, Sanchez’s More About Vectors
4 02/05 - 02/09 Venables et al’s Simple manipulations; numbers and vectors, Venables et al’s Objects, their modes and attributes, Sanchez’s Matrices and Arrays, Gagolewski’s Lists and attributes
5 02/12 - 02/16 Sanchez’s More About Matrices, Sanchez’s Factors, Sanchez’s Lists, Sanchez’s Data frames
6 02/19 - 02/23 Sanchez’s Conditionals: If-Else, Sanchez’s Iterations: For Loop, Sanchez’s Iterations: While Loop, Burns’s Growing Objects
7 02/26 - 03/01 Gagolewski’s Data frames, Burns’s Failing to Vectorize, Burns’s Over-Vectorizing
8 03/04 - 03/08 Midterm Exam 01 is Wednesday March 06, Review for Midterm Exam 01
9 03/11 - 03/15 SPRING BREAK (no classes and no office hours)
10 03/18 - 03/22 Gagolewski’s Character vectors, Sanchez’s Intro to Functions, Sanchez’s Expressions, Sanchez’s More About Functions
11 03/25 - 03/29 Grolemund’s Environments, Grolemund’s Programs, Grolemund’s S3
12 04/01 - 04/05 Gagolewski’s S3 Classes, Gagolewski’s Functions
13 04/08 - 04/12 Gagolewski’s Designing functions
14 04/15 - 04/19 Midterm Exam 02 is Wednesday April 17, Review for Midterm Exam 02
15 04/22 - 04/26 Kinson’s Slides on Weston’s Creativity for Critical Thinkers, Kanna’s ‘How to Solve Coding Problems with a Simple Four-Step Method’, Build your own games
16 04/29 - 05/03 Reading Day is May 02 (no more class and no more office hours), Build and play your own games, Final Exam Review
17 05/06 - 05/10 Final Exam at 8:00 am - 11:00 am on Thursday May 09

Grading Breakdown

14 Reading Review Assignments: 14 points total (1 point each)

  • rr01-rr16 are weekly comprehensive assignments due on Mondays (before the start of class). No student receives credit for rr01 and rr02. Credit is earned for rr03-rr16.

8 Solo Lab Assignments: 24 points total (3 points each)

  • Each student will work solo (individually) on this assignment each week beginning in week 04. Solo labs are completed in-class in-person. The shorthand name for the solo lab assignments is sl.

8 Pair Lab Assignments: 32 points total (4 points each)

  • Pairs of students will work on this assignment each week beginning in week 04. Pairs of students are randomly formed by the Instructor to complete in-class in-person lab assignments. The shorthand name for pair lab assignments is pl. The pairs are changed every two weeks to increase variation and student connections. For example, on pl01 and pl02, student X will work with student Y, while on pl03 and pl04, student X will work with student Z. On pl05 and pl06, student X will work with student A, while on pl07 and pl08, student X will work with student B.

2 Midterm Exams: 20 points total (10 points each)

  • Midterm Exam 01 is a 50-minute (class start and end time) in-person exam (March 06). It contains conceptual and applied problems.
  • Midterm Exam 02 is a 50-minute (class start and end time) in-person exam (April 17). It contains conceptual and applied problems.

1 Final Exam: 20 points total

  • Final Exam is a 3-hour in-person exam containing conceptual and applied problems. This exam schedule is in alignment with the University Final Exam Schedule (see https://registrar.illinois.edu/courses-grades/final-exam-schedule-public/ for more information). Adjust your schedules and travel plans accordingly. Any undergraduate requests for a conflict exam will be denied.

Course Total Points: 110 points

Final Letter Grades

When computing final grades, students can add up their scores on the assignments. The resulting sum will determine which letter grade they earn when the course is completed. There is only one $+$ letter grade in this course. All other letter grades are without $+/-$. Points are not rounded.

Lower bound Upper bound Letter grade
110.000 points 120.000 A+
99.000 points 109.999 points A
88.000 points 98.999 points B
77.000 points 87.999 points C
66.00 points 76.999 points D
0.000 points 65.999 points F

Instructional Activities

Students should read the readings, attend lectures, annotate any notes, watch any videos, and complete all assignments. If or when students get stuck, then they should ask questions in the i) Discussions Board, ii) Office Hours, or iii) via email (preference in this order). The following activities and tools will be useful for students.

Readings

The readings are the most important pathways to learning this course. The readings typically have code within the text. It’s a great idea to attempt the code on your own and alter it in some ways to see how those changes affect the result. Yes, there is a lot of information in the readings, but it is useful to read them for the important parts and return to it for details after attempting the Reading Review (rr) assignments.

Lectures

The lectures are going to follow and review the readings for aspects that the Instructor deems important to say out loud and demonstrate. The lectures cannot cover all important topics from the readings. It is still the student’s responsibility to complete the readings as assignments may ask questions beyond lecture coverage.

Supplemental Notes and Videos

At times, the Instructor may provide notes and links to videos to supplement student learning. These materials could be useful for learning and reinforcement of ideas. Any supplemental notes and videos will be posted on the course website in the course_content repository.

Discussions

This discussion board, which exists as a tab on the course_content repository, is one of the best ways to communicate with classmates and course staff. Questions can be seen quickly and receive a rapid response. Students are encouraged to use this board, but there is no requirement to participate in the discussion board.

Do use the board to openly discuss ideas about the course such as questions about content, deadlines, notes, data, etc. If a student specifically wants the course staff to respond, then student should use the mention @kinson2 when posting in the board. The things discussed here should be of a non-personal and non-private matter. If student has a personal or private matter to discuss with the Instructor, such as grades, please send an email to kinson2@illinois.edu. Additionally, the conversation in the discussion board should be respectful of people’s differences and cannot be used to speak negatively about anyone or harm anyone.

Grade Disputes

Please email the Instructor with your requests and disputes within 7 days (i.e. 1 week) of your grade being returned.

Assignments

Reading Review Assignments

These are guided comprehension assignments to be completed by each student as an individual as late as the start of class on every Monday. There are 14 Reading Review assignments for the semester. The shortname for these assignments is rr. They are numbered by their corresponding week number. For example, rr04 corresponds to week 04 Reading Review and is due on Monday of Week 04.

Each rr file must be submitted in the main branch of each student’s repo in the reading-reviews directory. See Course Website for web links. These rr’s are graded for completion, not correctness, however, file submissions must be reproducible. See Autograder section below.

Solo Lab Assignments

Solo Lab assignments are intended to push students to apply concepts covered in the course readings and lectures as well as supplemental materials in a limited amount of time. These are lab sessions that contain 3 problems and are due at the end of the class period on Fridays for each student. Each week (beginning in Week 04), students will complete the solo lab assignment. Talking and sharing ideas and advice is permitted. Sharing code and files is not permitted. Each sl file must be submitted in the main branch of each student’s repo in the solo-labs directory. All students must bring their laptops to class each day. Late submissions of lab assignments will not be accepted. There will be no make-ups for any missed labs. This policy applies to any students who add the course late to their registration.

Pair Lab Assignments

Pair Lab assignments are intended to push students to apply concepts covered in the course readings and lectures as well as supplemental materials and encourage students to work together as a team in a limited amount of time. These are lab sessions that contain 4 problems and are due at the end of the class period on Wednesdays by the “driver”. Each week (beginning in Week 04), one member of the pair of students will be deemed the “driver” and will complete and submit the lab assignment. Each pl file must be submitted in the main branch of the driver’s repo in the pair-labs directory. All students must bring their laptops to class each day. A schedule will be give to students displaying exactly when each student is to be driver and which pair (or group) they are in. Do not deviate from this schedule. Late submissions of Pair Lab assignments will not be accepted. There will be no make-ups for any missed labs. This policy applies to any students who add the course late to their registration.

Both students receive the same grade for the Pair Lab assignment, unless either member is absent from class. Absent members of a pair forfeit their score and thus receive a 0 for the Pair Lab assignment. If a member is absent due to sickness, then they must communicate with the Instructor and their other member prior to the assignment’s submission deadline. The Instructor will then decide to excuse the absence or not. An excused absence for a Pair Lab assignment means the absent member receives the same grade as the present member.

Midterm Exam

Each midterm exam contains 10 problems and is in-class and in-person. The exam will be closed notes, closed book, and closed resources. Each midterm file must be submitted in the main branch of each student’s repo in the exams directory. The midterm exam will be graded for correctness and completeness. Students can expect the midterm exam to include a structure similar to lab assignments with a possible mixture of completion, correct/incorrect, and open-ended questions. Late submissions on any Midterm Exam submission will not be accepted or graded. There will be no make-ups for any missed Midterm Exam submission.

There is one Reassessment Midterm Exam for each original Midterm Exam to give students who struggle with the original Midterm Exam submission a second opportunity to review course content and improve their score: Reassessment Midterm Exam 1 and Reassessment Midterm Exam 2. The Reassessment Midterm Exams take place on the Friday following the original Midterm Exam. The Reassessment Midterm Exam grade may be used to replace the original Midterm Exam grade, for the respective exam number. For example, a student’s Reassessment Midterm Exam 1 grade of 10 points may only be used to replace the grade of the original Midterm Exam 1 grade of 0 points. All exams (original or reassessment) are in-person exams. For students who do not wish to take the reassessment, they may use the reassessment exam date as a study at home day.

Final Exam

The final exam is one in-person exam with 20 problems. The exam will be open notes, open resources, and take up to 3 hours to complete. The final exam will be graded for both correctness and completeness. Students can expect the final exam to include a structure similar to labs and the midterm exams with a possible mixture of completion, correct/incorrect, and open-ended questions. Students should not expect solutions to be provided for the final exam. Late submissions on any Final Exam will not be accepted or graded. There will be no make-ups for any missed Final Exam.


Autograder

The code we write in this course must be reproducible. It is important that any submitted code does not contain executable errors and warnings. Submitting code with executable errors and warnings shows that a student is not following one of the course learning objectives. Submitting error-producing code also shows that there is no regard for what reproducibility means. Reproducible here means that the code in the file can be run by any computer and produces exactly the same results as the original user claims to have produced without any executable errors and warnings. There is an autograder used in this course to grade assignments. The autograder is not forgiving. It will scan the entire file and check for base R executable errors and warnings as well as grade the assignment for correctness and completion. Objects created at the top of the file which are overwritten at the bottom of the file will be considered incorrect by the autograder. When the autograder detects a base R executable error or warning, it will stop grading the student’s submission and assign a grade of 0 for the assignment.

To follow reproducible coding guidelines and prevent executable errors and warnings, be sure to do the following (in no particular order):

  1. Save the file with the correct name. Your netid should replace anything saying ‘netid’.

  2. Save the file in the correct location. rr** assignments belong directly in the reading-reviews directory of your repo on the main branch. All sl** assignments belong directly in the solo-labs directory of your repo on the main branch. All pl** assignments belong directly in the pair-labs directory of your repo on the main branch. All exam** belongs in the exams directory of your repo on the main branch. Any sub-directories within these directories is inappropriate.

  3. Within code chunks, explicitly write code that attaches or loads a package using either library() or environment call package_name:: if you use a package to produce your result.

  4. Change your RStudio Global Options’s General Tab such that:
    • Restore most recently opened project at startup is not checked.
    • Restore previously open source documents at startup is not checked.
    • Restore .RData into workspace at startup is not checked.
    • Save workspace to .RData on exit is Never.
    • Always save history (even when not saving .RData) is not checked.
  5. Change your RStudio Global Options’s Code Tab such that Under the Saving section:
    • Always save R scripts before sourcing is not checked.
    • Automatically save when editor loses focus is not checked.
    • When editor is idle is Do nothing.
  6. Restart your R session. This can be done in RStudio using the Session > Restart R. After clicking this, if your session still shows objects in the Environment, then click Session > Terminate R > Yes. Terminating the R session effectively does the same thing that restarting the R session should do: detach any packages and remove all objects in the global environment giving you a new session.

  7. After beginning a new session, execute and run all your code to ensure there are no executable errors or warnings. Some warnings are specific to a package which may not cause R executable errors or warnings.

  8. Comment out any erratic code using the hashtag symbol #. Doing so will prevent the autograder from executing it. This is useful if you don’t know how to correct your errors or warnings before the deadline.

  9. Comment out or remove any install.packages() in your code chunks.

  10. Do not work with local files or any files that exist only on your computer. Data objects and data files must be accessed with URLs (provided by the Instructor).

University Specifics

Disability Accommodations

To obtain disability-related academic adjustments and/or auxiliary aids, students with disabilities must contact the course instructor and the Disability Resources and Educational Services (DRES) as soon as possible. To contact DRES, student may visit 1207 S. Oak St., Champaign, call 333-4603, e-mail disability@illinois.edu or go to the DRES website at http://disability.illinois.edu/

Academic Integrity and Generative Artificial Intelligence Tools

It is expected that all students abide by the campus regulations on academic integrity http://studentcode.illinois.edu/article1_part4_1-401.html. Intentional violations of academic integrity can be found at http://studentcode.illinois.edu/article1_part4_1-402.html and include, but are not limited to, copying any part of another student’s assignment and allowing another student to copy any part of student’s own assignment.

Generative artificial intelligence tools are relatively new to the general public and can be useful in learning and studying. If students use generative AI tools in this course, do so outside of class as a means of studying and learning accurate information relevant to this course’s content. Students should not use generative artificial intelligence tools as a means to perform (on graded assignments and submissions) in this course. Doing so is an intentional violation of academic integrity.

Safety Protocol

We have been asked by Public Safety https://police.illinois.edu/emergency-preparedness/run-hide-fight/ to share the following information in case of weather or security emergencies. See the links:

Sexual Misconduct Policy and Reporting

The University of Illinois is committed to combating sexual misconduct. Faculty and staff members are required to report any instances of sexual misconduct to the University’s Title IX and Disability Office. In turn, an individual with the Title IX and Disability Office will provide information about rights and options, including accommodations, support services, the campus disciplinary process, and law enforcement options.

A list of the designated University employees who, as counselors, confidential advisors, and medical professionals, do not have this reporting responsibility and can maintain confidentiality, can be found at https://wecare.illinois.edu/resources/students/#confidential. Other information about resources and reporting is available at https://wecare.illinois.edu.


The Last Word

The Instructor reserves the right to make any changes considered to be academically advisable. Any changes will be announced in class and on the Course Website. It is the student’s responsibility to attend the class and keep track of the changes.