Timetable Image AI Recognition and Prompt Improvement

Problem

For the Kakao Tech Campus final project, we built a student schedule management service. Manually entering timetables is tedious, so I wanted to implement automatic recognition when users upload timetable images.


Approach: GPT Vision + Structured Output

I chose GPT-4.1’s vision capability over traditional OCR for two reasons:

  • Timetable image formats vary (Everytime app, school website, handwritten, etc.)
  • The output has to be structured data, not just extracted text

I used GPT’s Structured Output feature to constrain the response to a fixed format defined by a JSON Schema.
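To make the idea concrete, here is a minimal sketch of what such a schema might look like. The field names dayOfWeek, startTime, endTime, times, and credit come from the prompt rules in this post; subjects, name, professor, and room are names I assume for illustration, and the wrapper assumes the OpenAI Chat Completions `response_format` shape with `strict` mode (every property listed as required, `additionalProperties` set to false). The project's actual schema may differ.

```python
# One time slot of a subject. Field names follow the prompt rules.
TIME_ENTRY = {
    "type": "object",
    "properties": {
        "dayOfWeek": {"type": "integer", "description": "1 = Monday ... 5 = Friday"},
        "startTime": {"type": "string", "description": "HH:MM:SS, 24-hour clock"},
        "endTime": {"type": "string", "description": "HH:MM:SS, 24-hour clock"},
    },
    "required": ["dayOfWeek", "startTime", "endTime"],
    "additionalProperties": False,
}

# One subject with all of its time slots.
# "name", "professor" and "room" are assumed field names.
SUBJECT = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "professor": {"type": "string"},
        "room": {"type": "string"},
        "credit": {"type": "integer"},
        "times": {"type": "array", "items": TIME_ENTRY},
    },
    "required": ["name", "professor", "room", "credit", "times"],
    "additionalProperties": False,
}

# Top-level object; "subjects" is an assumed field name.
TIMETABLE_SCHEMA = {
    "type": "object",
    "properties": {"subjects": {"type": "array", "items": SUBJECT}},
    "required": ["subjects"],
    "additionalProperties": False,
}


def build_response_format() -> dict:
    """Wrap the schema in the shape assumed for the `response_format` parameter."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": "timetable",  # hypothetical schema name
            "strict": True,
            "schema": TIMETABLE_SCHEMA,
        },
    }
```

The returned dict would be passed as `response_format` alongside the image and the prompt, so the model's reply can be parsed with a plain JSON parser.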


Prompt Design

Initially I tried simple prompts such as “Convert this timetable image to JSON”, but the results were inconsistent: across 10 test runs, only 2-3 were accurate.

After improving the prompt, 8-9 runs out of 10 were accurate.
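A comparison like this can be scripted. The sketch below is only one way to do it: it assumes "accurate" means an exact match against a hand-labeled expected result, and `run` is a hypothetical callable that sends the image and prompt to the model and returns the parsed JSON.

```python
from typing import Callable


def accuracy(run: Callable[[], dict], expected: dict, trials: int = 10) -> float:
    """Call `run` `trials` times and return the fraction of exact matches.

    `run` is a hypothetical wrapper around the model call; exact-match
    scoring is an assumption, not necessarily how the project measured it.
    """
    hits = sum(1 for _ in range(trials) if run() == expected)
    return hits / trials
```

Running it once per prompt version on the same image gives two numbers that can be compared directly.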

Final Prompt Rules:

1. dayOfWeek starts from 1 for Monday. (Mon=1, Tue=2, Wed=3, Thu=4, Fri=5)

2. startTime, endTime use "HH:MM:SS" format, 24-hour clock.

3. Same subject + professor + room should be grouped as one subject,
   with multiple time entries in the times list.

4. If credit is not specified:
   - Regular lecture courses: 3
   - Lab/practice courses ('lab', 'practice', 'project', 'capstone', etc.): 2

5. If time is not directly shown in image, estimate based on grid spacing.

6. Infer class duration from the table. Don't assume 1 hour without evidence.

7. Start/end times may not be on the hour. Could be 5-minute intervals.

8. Enter all strings exactly as shown in the image.
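
Rules 1 and 2 are easy to verify in code after the response comes back. The following is a minimal sketch of such a check, not the project's actual code; the function name is hypothetical, and the "endTime later than startTime" check is my own assumption rather than one of the prompt rules.

```python
import re

# Strict HH:MM:SS on a 24-hour clock (rule 2).
TIME_RE = re.compile(r"([01]\d|2[0-3]):[0-5]\d:[0-5]\d")


def validate_time_entry(entry: dict) -> list[str]:
    """Return the rule violations for one time entry (empty list if valid)."""
    errors = []

    # Rule 1: Monday = 1 ... Friday = 5.
    day = entry.get("dayOfWeek")
    if not isinstance(day, int) or not 1 <= day <= 5:
        errors.append("dayOfWeek must be an integer from 1 (Mon) to 5 (Fri)")

    # Rule 2: both times must be zero-padded HH:MM:SS strings.
    start, end = entry.get("startTime"), entry.get("endTime")
    times_ok = True
    for key, value in (("startTime", start), ("endTime", end)):
        if not isinstance(value, str) or not TIME_RE.fullmatch(value):
            errors.append(f"{key} must be HH:MM:SS on a 24-hour clock")
            times_ok = False

    # Assumption (not a prompt rule): a class ends after it starts.
    # Zero-padded time strings compare correctly as plain strings.
    if times_ok and end <= start:
        errors.append("endTime must be later than startTime")

    return errors
```

An entry that fails the check could be logged or sent back for a retry instead of being saved.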

Improvement Points

| Problem | Solution |
| --- | --- |
| Always assumed 1-hour classes | Added “Don’t assume 1 hour without evidence” |
| Only guessed times on the hour | Added “Could be 5-minute intervals” |
| Same course split into multiple entries | Added “Group same courses” rule |
| Returned 0 credits | Added default credit rules by course type |
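
The last two fixes can also be backed by a safety net in code, in case the model still ignores the rule. This is a sketch under assumptions: the field names name, professor, and room and both function names are hypothetical, a credit of 0 is treated as "not specified", and the keyword list is taken from rule 4 of the prompt.

```python
# Keywords from prompt rule 4 that mark lab/practice courses.
LAB_KEYWORDS = ("lab", "practice", "project", "capstone")


def default_credit(name: str) -> int:
    """Rule 4: 2 credits for lab/practice courses, 3 for regular lectures.

    Plain substring matching is crude and may misfire on unrelated words.
    """
    lowered = name.lower()
    return 2 if any(keyword in lowered for keyword in LAB_KEYWORDS) else 3


def normalize(subjects: list[dict]) -> list[dict]:
    """Merge duplicate subjects (rule 3) and fill in missing credits (rule 4)."""
    merged: dict[tuple, dict] = {}
    for subject in subjects:
        # Same subject + professor + room counts as one subject.
        key = (subject["name"], subject["professor"], subject["room"])
        if key not in merged:
            merged[key] = {**subject, "times": list(subject["times"])}
        else:
            merged[key]["times"].extend(subject["times"])

    for subject in merged.values():
        # Treat a missing or zero credit as "not specified" (assumption).
        if not subject.get("credit"):
            subject["credit"] = default_credit(subject["name"])

    return list(merged.values())
```

The prompt rules remain the main fix; this only catches responses that slip through.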

Lessons Learned

  • The quality of an AI feature varies greatly with the prompt
  • Structured Output makes parsing stable
  • Edge cases need to be anticipated and written into the prompt

From Kakao Tech Campus 3rd cohort final project (student schedule management service).

This post is licensed under CC BY 4.0 by the author.