Dataset: Predicting Computer Systems and Architecture Learning Outcomes

The UNZA CSA dataset is a carefully labeled dataset of 273 student assessment records for a Computer Systems and Architecture course offered at The University of Zambia. The dataset was created by Lighton Phiri and is based, in part, on work done for an undergraduate capstone research project [1].
Dataset [CSV]
Jupyter Notebook [IPYNB]
—Ethical Consideration—
• The original UNZA-centric student identifiers were replaced by MD5 hashes.
• The student names were replaced with randomly assigned Zambian names, extracted from commenters on The ZambianWatchDog Facebook page. As such, the randomly assigned names might not correspond to original student genders.
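
The identifier masking described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function name `pseudonymize` and the sample identifier are hypothetical, and the source does not say whether a salt was applied, so the sketch hashes the raw identifier.

```python
import hashlib

def pseudonymize(student_id: str) -> str:
    """Return the hexadecimal MD5 digest of a student identifier."""
    return hashlib.md5(student_id.encode("utf-8")).hexdigest()

# Hypothetical identifier, for illustration only.
masked = pseudonymize("2018000001")
print(len(masked))  # an MD5 hex digest is always 32 characters long
```

Because the same input always produces the same digest, records belonging to one student remain linkable across sources after masking.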
—Dataset Fields—
The dataset comprises 69 fields, grouped into the following aspects: Demographics, Background Information, Course Workload, Moodle Interaction Logs and Assessment Scores.

Demographics
The student demographics were extracted from the Student Information System.

• StudentID---MD5 hash representing the unique identifier for each observation
• StudentName---Masked student full names
• AcademicYear---The cohort specific to each observation. There are a total of four cohorts, associated with the enrollment years 2018 (201701), 2019 (201801), 2020 (201901) and 2021 (202001)
• DateOfBirth---Student date of birth
• Gender---Student gender
• YearOfStudy---Year of study. Typically 1st year; however, some observations are associated with 2nd year
• School---The school/faculty within which the student is registered
• Program---The programme the student is pursuing
• MajorDescription---The description of the major programme the student is pursuing
• MinorDescription---The description of the minor programme the student is pursuing
• Status---The registration status of the student
• Sponsor---The entity funding the student's education
• Nationality---The nationality of the student
• Comment---The status of the student at the end of the year
• CampusAccommodation---Flag indicating if the student has campus accommodation
• Category---Category of the student
• Mode---Mode of study of the student
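
As a rough sketch of how the demographic fields might be read, the example below parses a CSV excerpt with the standard library and maps each AcademicYear code to the enrollment year listed above. The sample rows, the `F`/`M` gender encoding, the `load_rows` helper and the derived `EnrollmentYear` key are all assumptions made for illustration; they are not part of the published dataset.

```python
import csv
import io

# Cohort codes and enrollment years as listed under AcademicYear.
ENROLLMENT_YEAR = {"201701": 2018, "201801": 2019, "201901": 2020, "202001": 2021}

# Hypothetical two-row excerpt standing in for the real CSV file.
SAMPLE_CSV = """StudentID,AcademicYear,Gender,YearOfStudy
0cc175b9c0f1b6a831c399e269772661,201701,F,1
92eb5ffee6ae2fec3ad71c777531578f,202001,M,2
"""

def load_rows(handle):
    """Read rows as dictionaries and attach a derived EnrollmentYear value."""
    rows = []
    for row in csv.DictReader(handle):
        # Unknown cohort codes map to None rather than raising an error.
        row["EnrollmentYear"] = ENROLLMENT_YEAR.get(row["AcademicYear"])
        rows.append(row)
    return rows

rows = load_rows(io.StringIO(SAMPLE_CSV))
print([row["EnrollmentYear"] for row in rows])
```

To read the real file, pass an open file handle to `load_rows` in place of the `io.StringIO` wrapper.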

Background Information
The background information was elicited from a preliminary survey administered to students at the beginning of the course.

• SurveyHomeTownSuburb---Student hometown
• SurveyProgramMinor---Student minor programme
• SurveyMinorMotivation---Motivation for choosing the minor programme
• SurveyMajorMotivation---Motivation for choosing the major programme
• SurveyStudyComputersHighschool---Status indicating if the student formally took a computing subject in high school
• SurveyPriorComputerTraining---Status indicating if the student has undergone any formal computing training
• SurveyPriorComputerTrainingDetails---The details of the formal computing training undertaken by the student
• SurveyExperienceUsingComputers---Length of time the student has been working with or using computers
• SurveyOwnComputer---Status indicating if the student owns a computer
• SurveyAboutYou---Free-form comment made by the student

Course Workload
The course workloads were extracted from the Student Information System, with the MinorProgram, MinorClassification and CourseWorkload fields derived.

• Courses---Total number of courses the student is enrolled in
• MinorProgram---Student minor programme
• MinorClassification---Classification of the student minor programme
• CourseWorkload---Computed course workload score

Moodle Interaction Logs
The Moodle interaction logs were extracted from Moodle logs.

• MoodleHits---Total number of daily unique Moodle hits during the academic year
• MoodleHitsWeight---Computed daily unique Moodle hits weight
• MoodleLogComponentAssignment---Total number of hits to the Moodle Assignment component
• MoodleLogComponentChoice---Total number of hits to the Moodle Choice component
• MoodleLogComponentFile---Total number of hits to the Moodle File component
• MoodleLogComponentFolder---Total number of hits to the Moodle Folder component
• MoodleLogComponentForum---Total number of hits to the Moodle Forum component
• MoodleLogComponentOverviewReport---Total number of hits to the Moodle Overview Report component
• MoodleLogComponentSystem---Total number of hits to the Moodle System component
• MoodleLogComponentURL---Total number of hits to the Moodle URL component
• MoodleLogComponentUserReport---Total number of hits to the Moodle User Report component
• MoodleLogComponentUserTours---Total number of hits to the Moodle User Tours component
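
The counts above could be derived from raw log records along the following lines. This is a sketch under stated assumptions: the record layout (student, date, component) is hypothetical, and "daily unique hits" is interpreted here as the number of distinct days on which a student generated at least one hit, which the source does not confirm. How MoodleHitsWeight was computed is not described, so it is left out.

```python
from collections import Counter

# Hypothetical log records: (student_id, ISO date, Moodle component).
LOG = [
    ("s1", "2021-03-01", "Forum"),
    ("s1", "2021-03-01", "File"),
    ("s1", "2021-03-02", "Forum"),
    ("s2", "2021-03-01", "Assignment"),
]

def daily_unique_hits(log):
    """Count, per student, the distinct days with at least one hit."""
    days = {(student, day) for student, day, _ in log}
    return Counter(student for student, _ in days)

def component_hits(log):
    """Count hits per (student, component) pair."""
    return Counter((student, component) for student, _, component in log)

hits = daily_unique_hits(LOG)
per_component = component_hits(LOG)
print(hits["s1"], per_component[("s1", "Forum")])
```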

Assessment Scores
Assessment scores were extracted from spreadsheets compiled by the course instructors.

ICT 1110 assessments are clustered into three main components: quizzes, tests and the final examination. The assessment scores have all been scaled such that scores are between 0 and 100.
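
The source does not state how the scaling was done. One plausible approach, assuming each assessment has a known maximum mark, is a linear rescale; the function `scale_score` and its parameters are hypothetical.

```python
def scale_score(raw: float, max_marks: float) -> float:
    """Linearly rescale a raw mark to the 0-100 range."""
    if max_marks <= 0:
        raise ValueError("max_marks must be positive")
    return 100.0 * raw / max_marks

# A hypothetical quiz marked out of 20.
print(scale_score(12, 20))
```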

• Quiz1---Quiz on History of Computing
• Quiz2---Quiz on Classification of Computers
• Quiz3---Quiz on Abstraction in Computing
• Quiz4---Quiz on History of Computing
• Quiz5---Quiz on Computer Software
• Quiz6---Quiz on Von Neumann Model
• Quiz7---Quiz on Central Processing Unit
• Quiz8---Quiz on Peripherals
• Quiz9---Quiz on I/O Subsystem
• Quiz10---Quiz on Computer Primary Memory
• Quiz11---Quiz on File Organisation and Filesystems
• Quiz12---Quiz on Computer Secondary Storage
• Quiz13---Quiz on Number Systems and Representation
• Quiz14---Quiz on Number Systems and Representation
• Quiz15---Quiz on MIPS Instruction Set Architecture
• Quiz16---Quiz on MIPS Instruction Set Architecture
• Quiz17---Quiz on MIPS Instruction Set Architecture
• Quiz18---Quiz on MIPS Datapath and Control
• Quiz19---Quiz on MIPS Datapath and Control
• Quiz20---Quiz on Digital Logic Structures
• Test1---First test in term 1
• Test2---Second test in term 1
• Test3---First test in term 2
• Test4---Second test in term 2
• MakeUpTest---Make-up assessment administered for various reasons
• FinalExamination---Final examination
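
Since the assessment fields follow a regular naming pattern, the three components can be grouped programmatically. The sketch below averages the quiz and test scores of a single record; the sample record, the `component_means` helper, and the choice to skip empty values are assumptions, as the source does not say how missing scores are encoded.

```python
from statistics import mean

QUIZ_COLUMNS = [f"Quiz{i}" for i in range(1, 21)]  # Quiz1 .. Quiz20
TEST_COLUMNS = [f"Test{i}" for i in range(1, 5)]   # Test1 .. Test4

def component_means(row):
    """Average the available quiz and test scores for one student record."""
    def avg(columns):
        # Skip absent or empty values (assumed encoding of missing scores).
        values = [float(row[c]) for c in columns if row.get(c) not in (None, "")]
        return mean(values) if values else None

    return {
        "quizzes": avg(QUIZ_COLUMNS),
        "tests": avg(TEST_COLUMNS),
        "final": avg(["FinalExamination"]),
    }

# Hypothetical record with only a few scores filled in.
record = {"Quiz1": "80", "Quiz2": "60", "Test1": "70", "FinalExamination": "55"}
summary = component_means(record)
print(summary)
```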
—Exploratory Data Analysis—

EDA | Assessment Scores


[1] Chaibela, M., Chisha, I., Pungwa, D., Siabbaba, D., & Simukoko, B. (2021). Performance Predictor: A Data Mining and Machine Learning Software for Student Performance Outcomes. The University of Zambia. URL: