
Texas A&M Medical School student exam reports

Photo credit to Gerd Altmann

Problem statement and output from ExamSoft

This project was completed in collaboration with Kenny Livingston for Angela Hairrell at the beginning of 2019, with the goal of extracting useful information from the ExamSoft test-taker reports. ExamSoft is the exam-administration software used by the Texas A&M College of Medicine, and it is clear that most of their engineering effort went into security, not user-interface design and analytics. Dr. Hairrell wanted to be able to show students clearly whether they were spending too much time on some questions, one question, or all questions; whether they mostly missed questions the class found easy or difficult; how their performance changed as they progressed through the exam; and so on.

We began with two Excel files, generated by ExamSoft for every student and easy to download. The first is specific to each student and contains their answers and timestamped activity (a log of every click). The second is specific to each exam and contains the answer key and the class-wide statistics (percentage of correct responses, average time on each question, etc.).


File 1 looks something like this:

Item #  Snapshot #  Item Type  Time Stamp  Trigger   Response
1       1           Choice     18:25:09    Answered  Choice(s): E
6       1           Choice     18:03:57    Answered  Choice(s): A
7       1           Choice     18:13:11    Answered  Choice(s): E

File 2 has much more information than is shown here but it looks something like this:


Exam Name  # of Testers  KR20  Stdev  Mean   Mean %  Median Pts  ...
Exam 2     129           0.69  4.53   56.11  86.33   56          ...

Item #  Diff(p)  Upper    Lower   Disc. Index  Point Biserial  Correct Answer  ...
1       0.93     94.87%   91.89%  0.03         0.10            B               ...
2       0.57     64.10%   48.65%  0.15         0.21            C               ...
3       0.94     100.00%  89.19%  0.11         0.20            E               ...
4       0.85     100.00%  67.57%  0.32         0.37            C               ...
5       0.91     97.44%   89.19%  0.08         0.12            A               ...
...     ...      ...      ...     ...          ...             ...             ...
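Combining the two exports answers the first of Dr. Hairrell's questions: is the student missing questions the class found easy or hard? A minimal sketch in plain Python, with toy values standing in for the parsed spreadsheets (the actual script reads the Excel files directly; these dicts are illustrative):

```python
# Toy stand-ins for the parsed spreadsheets (illustrative values).
correct_answer = {1: "B", 2: "C", 3: "E", 4: "C", 5: "A"}          # key, File 2
class_difficulty = {1: 0.93, 2: 0.57, 3: 0.94, 4: 0.85, 5: 0.91}   # Diff(p), File 2
student_response = {1: "E", 2: "C", 3: "E", 4: "C", 5: "A"}        # final answers, File 1

# Pair each missed question with how the class did on it. A high Diff(p)
# means most of the class got it right -- the student missed an easy one.
missed = {q: class_difficulty[q]
          for q, answer in student_response.items()
          if answer != correct_answer[q]}
print(missed)
```

Here the student missed only question 1, which 93% of the class answered correctly, flagging it for discussion.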

Solution (code output)


From this information I developed a script which efficiently analyzes these report documents and generates graphical comparisons of the student's performance against the class, along with a print-out of relevant statistics and an enumeration of missed questions for reference.

This program is now used daily when students meet with the academic support staff to discuss test performance. Of particular importance are the question-change statistics, which were previously computed manually; that process took 30-45 minutes and was therefore rarely done. Time-usage statistics are similarly difficult to obtain by other means. The script has reduced the time required for exam analysis from over an hour to approximately 2 minutes (most of that spent downloading files from ExamSoft), allowing every exam to be completely analyzed during exam-review meetings.

>>> exec(open('examsoft_report.py','r').read())

Test took 102.55 minutes
Average question time 94.7 seconds
Class average Q time 83.9 seconds
Final score 81.5% (53 of 65)

Question change stats
  right to wrong    0
  wrong to right    3
  wrong to wrong    2
  only one answer  60
  no answer         0
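The change statistics come from the click log in File 1. A hedged sketch of the logic, comparing each question's first and last recorded responses against the key (the function and sample data are illustrative, not the production script):

```python
def change_stats(clicks, key):
    """clicks: {question: [responses in time order]}; key: {question: answer}."""
    stats = {"right to wrong": 0, "wrong to right": 0,
             "wrong to wrong": 0, "only one answer": 0, "no answer": 0}
    for q, correct in key.items():
        answers = clicks.get(q, [])
        if not answers:
            stats["no answer"] += 1
        elif len(answers) == 1 or answers[0] == answers[-1]:
            stats["only one answer"] += 1       # never changed (or changed back)
        elif answers[0] == correct:
            stats["right to wrong"] += 1        # changed away from the right answer
        elif answers[-1] == correct:
            stats["wrong to right"] += 1        # changed onto the right answer
        else:
            stats["wrong to wrong"] += 1        # changed, still wrong
    return stats

key = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E"}
clicks = {1: ["A"], 2: ["A", "B"], 3: ["C", "B"], 4: ["A", "C"]}
print(change_stats(clicks, key))
```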

Quarter  Correct  Attempted  Accuracy
1st      19       22         0.864
2nd      11       14         0.786
3rd      10       13         0.769
4th      13       16         0.812
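The quarters are defined by elapsed exam time rather than by question count, so the "Attempted" column varies by quarter. One way the bucketing might look (hypothetical helper; the real script's interface may differ):

```python
def quarter_accuracy(events, exam_minutes):
    """events: list of (minutes_elapsed_when_answered, answered_correctly)."""
    quarters = [[0, 0] for _ in range(4)]        # [correct, attempted] per quarter
    for minutes, correct in events:
        i = min(3, int(4 * minutes / exam_minutes))
        quarters[i][1] += 1
        quarters[i][0] += correct
    return [(c, a, round(c / a, 3) if a else None) for c, a in quarters]

# Toy data: five answers scattered across a 100-minute exam.
events = [(10, True), (12, False), (40, True), (60, True), (90, False)]
table = quarter_accuracy(events, exam_minutes=100)
print(table)
```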


Missed questions
Q#  Class resp (%)  Time (s)
54  81.0            381
24  83.0            323
21  71.0            298
36  91.0            157
58  95.0            107
28  85.0             84
27  81.0             69
33  91.0             56
15  77.0             51
60  88.0             37
52  68.0             29
34  97.0             19


Q#  Class resp (%)  Time (s)
Wrong to Right
 5  91.0             61
 8  95.0             59
51  53.0             78
Wrong to Wrong
21  71.0            298
28  85.0             84
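The missed-question listing appears to be sorted by time spent, in descending order, so the most time-consuming misses are discussed first. Producing that ordering is a one-liner (sample data only):

```python
# Q#: (class response %, seconds spent) -- a few sample rows.
missed = {54: (81.0, 381), 24: (83.0, 323), 34: (97.0, 19)}

# Sort question numbers by time spent, longest first.
ordering = sorted(missed, key=lambda q: missed[q][1], reverse=True)
for q in ordering:
    pct, secs = missed[q]
    print(f"{q:3d} {pct:6.1f} {secs:5d}")
```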

[Figure] examsoft results

[Caption] Comparison of student performance and timing against class performance. Question numbers do not reflect the order in which questions are displayed to students. Unanswered questions are shown in yellow, but in this example the student answered everything. [Upper left] Questions the student missed (red vs. blue bars) compared with overall class performance. Is the student missing things the rest of the class found difficult, or easy? [Upper right] The time the student spent on each question compared to the rest of the class. The black line represents per-question time exactly matching the class average. Is the student spending more time on questions the class spent a lot of time on, or do they zone out at random? [Lower left] Time per question vs. question number. Is the student missing the questions they spent the most time on? [Lower right] Histogram of the lower-left graph; same questions and information, different representation.

[Figure] examsoft results

[Caption] Student performance as a function of time elapsed through the exam. This illuminates issues with exhaustion or loss of focus near the end of the exam. It can also indicate test anxiety (a reduced percentage at the beginning suggests trouble settling into exam mode; a discrepant drop in the middle suggests panic over perceived performance). [Left] Each point on the graph contains data from the previous 15% of the exam (percentage as a function of time spent on the exam, not percentage of questions). Sudden dips indicate a slowing of pace (black line) or a decrease in accuracy (gold line) during the exam. We often expect these lines to track one another, since difficult parts of the exam will likely require more time per question and yield lower accuracy than easier parts. [Right] Performance by quarter of the exam. Same information as the graph on the left, but shown at 4 discrete time points with a traveling window of 25% of the exam.
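The left panel's curves can be sketched as a trailing-window computation over the click log; the function below illustrates the idea rather than reproducing the plotting code (the window fraction and sample count are assumptions):

```python
def trailing_window(events, exam_minutes, frac=0.15, samples=20):
    """events: (minutes_elapsed, answered_correctly); returns (t, accuracy, pace)."""
    out = []
    width = frac * exam_minutes                  # trailing window, in minutes
    for i in range(1, samples + 1):
        t = exam_minutes * i / samples
        window = [ok for m, ok in events if t - width <= m <= t]
        if window:
            accuracy = sum(window) / len(window)
            pace = len(window) / width           # questions answered per minute
            out.append((t, accuracy, pace))
    return out

# Toy data: three answers over a 20-minute exam, 50% window, 2 samples.
events = [(5, True), (10, False), (20, True)]
curve = trailing_window(events, exam_minutes=20, frac=0.5, samples=2)
print(curve)
```

A dip in the accuracy term with a flat pace term would show as the gold line falling while the black line holds, matching the caption's reading of the plot.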


© MC Byington