#+TITLE: org-drill Test Strategy
#+AUTHOR: Test Implementation Plan
#+DATE: 2025-11-13

* Overview

This document outlines the testing strategy for org-drill, an Emacs package implementing flashcard and spaced repetition functionality. The strategy follows best practices from quality-engineer.org, emphasizing:

- Test isolation and independence
- Clear naming conventions
- Comprehensive coverage (normal, boundary, error cases)
- Balance between unit and integration tests
- Maintainable, readable test code

* Current Status

** Test Infrastructure
- [X] Makefile with test targets configured
- [X] Cask dependency management working
- [X] Tests directory structure established (tests/)
- [X] Existing test file moved to tests/ directory
- [X] All compilation warnings fixed (0 warnings)

** Existing Tests
- tests/org-drill-test.el (3 tests, all passing)
  - test-org-drill-entry-p functionality
  - test-org-drill-map-entries with tags

** Test Coverage Status
- Unit tests: 3 tests covering basic entry detection and tag-based entry mapping
- Integration tests: 0 tests
- Coverage: Minimal (~1% of codebase)

** Next Steps
1. [ ] Create test files for critical path functions (see Implementation Plan)
2. [ ] Write integration tests for drill session workflow
3. [ ] Add comprehensive card type tests
4. [ ] Implement spaced repetition algorithm tests
5. [ ] Add boundary and error case coverage

* Architecture Overview

** Core Components

org-drill has several interconnected systems:

*** Entry Management
- Entry identification (drill tags, inheritance)
- Entry filtering (due dates, overdue, new vs mature)
- Entry state tracking (last reviewed, repetitions, etc.)

*** Scheduling Algorithms
- SM2 (SuperMemo 2)
- SM5 (SuperMemo 5)
- Simple8 (modified SM8)
- Interval calculation based on quality ratings

*** Session Management
- Session state (entries pending, done, failed)
- Progress tracking (counts, percentages)
- Scope management (file, tree, directory, agenda)
- Cram mode vs normal mode

*** Card Type System
- Simple cards (question/answer)
- Two-sided and multi-sided cards
- Cloze deletion variants (hide1, show1, hide/show with weights)
- Language learning cards (conjugation, declension)
- Spanish verb conjugation

*** User Interface
- Card presentation (hiding/showing text)
- Response collection (quality ratings 0-5)
- Answer display
- Progress reporting

* Test Categories

** Unit Tests
Test individual functions in isolation with no external dependencies.

*** Naming Convention
Pattern: =test-org-drill-<function>-<category>-<scenario>-<expected>.el=

Examples:
- =test-org-drill-entry-p-normal-valid-tag-returns-true.el=
- =test-org-drill-entry-p-boundary-inherited-tag-returns-true.el=
- =test-org-drill-entry-overdue-p-error-nil-interval-returns-nil.el=

*** Test Structure
Each test file should contain:
- Setup/teardown using testutil functions
- Normal cases (expected usage)
- Boundary cases (edge values, empty/nil, single elements)
- Error cases (invalid inputs, missing data)
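
A minimal sketch of this structure, assuming ERT and a temporary Org buffer. Both =my-test--drill-entry-p= and =my-test--with-org-text= are hypothetical stand-ins introduced only for illustration (the former is not org-drill's =org-drill-entry-p=, and the latter plays the role a testutil helper might play):

#+begin_src elisp
(require 'ert)
(require 'org)

(defun my-test--drill-entry-p ()
  "Hypothetical stand-in: non-nil if the heading at point has a local \"drill\" tag."
  (and (not (org-before-first-heading-p))
       (member "drill" (org-get-tags nil t))))

(defmacro my-test--with-org-text (text &rest body)
  "Run BODY in a temporary Org buffer holding TEXT, with point at the start."
  (declare (indent 1))
  `(with-temp-buffer
     (insert ,text)
     (org-mode)
     (goto-char (point-min))
     ,@body))

;;; Normal Cases
(ert-deftest my-test-entry-p-normal-valid-tag-returns-true ()
  (my-test--with-org-text "* Capital of France? :drill:\nParis\n"
    (should (my-test--drill-entry-p))))

;;; Boundary Cases
(ert-deftest my-test-entry-p-boundary-untagged-heading-returns-nil ()
  (my-test--with-org-text "* Just a note\nNo tags here\n"
    (should-not (my-test--drill-entry-p))))

;;; Error Cases
(ert-deftest my-test-entry-p-error-before-first-heading-returns-nil ()
  (my-test--with-org-text "Preamble text\n* Card :drill:\n"
    (should-not (my-test--drill-entry-p))))
#+end_src

Because each test builds its own buffer, the tests stay independent and can run in any order.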

** Integration Tests
Test multiple components working together in realistic workflows.

*** Naming Convention
Pattern: =test-integration-<area>-<scenario>-<outcome>.el=

Examples:
- =test-integration-drill-session-complete-workflow-reschedules-entries.el=
- =test-integration-spaced-repetition-quality-ratings-affect-intervals.el=
- =test-integration-card-types-cloze-hides-and-reveals-text.el=

*** Integration Test Characteristics
- Test workflows spanning multiple functions
- Use real org-mode buffers and data structures
- May involve file I/O, property manipulation, state changes
- More setup required, slower than unit tests
- Higher value for catching real-world bugs

* Critical Functions by Priority

Functions prioritized by criticality to org-drill operation:

** Priority 1: Core Drill Loop (Cannot function without these)

*** org-drill-entry-p
*Criticality:* CRITICAL - Entry point for identifying drill items
- Tests if a heading is a drill entry
- Used by all drill operations
- *Test file:* =test-org-drill-entry-p.el=

*** org-drill-entries
*Criticality:* CRITICAL - Main drill session loop
- Orchestrates the entire drill session
- Manages entry queue and state transitions
- *Test file:* =test-org-drill-entries.el=
- *Integration test:* =test-integration-drill-session-complete-workflow.el=

*** org-drill-entry
*Criticality:* CRITICAL - Presents individual drill items
- Shows question, collects response, handles answer
- Core user interaction point
- *Test file:* =test-org-drill-entry.el=

** Priority 2: Scheduling & Intervals (Core algorithm correctness)

*** org-drill-determine-next-interval-sm2
*Criticality:* HIGH - Primary scheduling algorithm
- Calculates next review interval based on SM2
- Quality ratings → interval calculations
- *Test file:* =test-org-drill-determine-next-interval-sm2.el=
- Must cover all quality values (0-5), boundary intervals
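
As an illustration of covering every quality value, the sketch below tests a textbook SM-2 step. =my-sm2-next= is a hypothetical, self-contained function written for this example; it is not org-drill's =org-drill-determine-next-interval-sm2=, whose arguments and return value differ, so the real tests would need to be adapted to that function:

#+begin_src elisp
(require 'ert)

(defun my-sm2-next (interval reps ef quality)
  "Textbook SM-2 step (hypothetical, NOT org-drill's implementation).
Return a list (NEXT-INTERVAL NEXT-REPS NEXT-EF)."
  (unless (and (integerp quality) (<= 0 quality 5))
    (error "Quality must be an integer from 0 to 5: %S" quality))
  (if (< quality 3)
      ;; Failed recall: restart the repetition sequence, keep EF unchanged.
      (list 1 0 ef)
    (let* ((d (- 5 quality))
           (next-ef (max 1.3 (+ ef (- 0.1 (* d (+ 0.08 (* d 0.02)))))))
           ;; This sketch multiplies by the updated EF; other readings use the old one.
           (next-interval (cond ((= reps 0) 1)
                                ((= reps 1) 6)
                                (t (round (* interval next-ef))))))
      (list next-interval (1+ reps) next-ef))))

;;; Normal Cases
(ert-deftest my-sm2-normal-passing-qualities-grow-interval ()
  (dolist (q '(3 4 5))
    (let ((result (my-sm2-next 6 2 2.5 q)))
      (should (> (nth 0 result) 6))
      (should (= 3 (nth 1 result))))))

;;; Boundary Cases
(ert-deftest my-sm2-boundary-first-two-repetitions-use-fixed-intervals ()
  (should (= 1 (nth 0 (my-sm2-next 0 0 2.5 4))))
  (should (= 6 (nth 0 (my-sm2-next 1 1 2.5 4)))))

(ert-deftest my-sm2-boundary-ef-never-drops-below-floor ()
  (should (= 1.3 (nth 2 (my-sm2-next 6 2 1.3 3)))))

;;; Error Cases
(ert-deftest my-sm2-error-failed-qualities-reset-repetitions ()
  (dolist (q '(0 1 2))
    (should (equal (list 1 0 2.5) (my-sm2-next 30 5 2.5 q)))))

(ert-deftest my-sm2-error-invalid-quality-signals ()
  (should-error (my-sm2-next 6 2 2.5 6))
  (should-error (my-sm2-next 6 2 2.5 nil)))
#+end_src

Looping over the quality values with =dolist= keeps the six cases visible in one place without writing six near-identical tests.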

*** org-drill-determine-next-interval-sm5
*Criticality:* HIGH - Advanced scheduling option
- More sophisticated than SM2
- Uses optimal factor matrix
- *Test file:* =test-org-drill-determine-next-interval-sm5.el=

*** org-drill-reschedule
*Criticality:* HIGH - Applies scheduling decisions
- Updates entry properties with new intervals
- Persists scheduling state
- *Test file:* =test-org-drill-reschedule.el=
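
A persistence test could take roughly the following shape. This is a sketch under assumptions: =my-test--store-schedule= is a hypothetical helper standing in for the rescheduling code, and the property names are illustrative =DRILL_*= keys that should be checked against what org-drill actually writes:

#+begin_src elisp
(require 'ert)
(require 'org)

(defun my-test--store-schedule (interval repeats)
  "Hypothetical helper: store INTERVAL and REPEATS as properties of the entry at point."
  (org-entry-put (point) "DRILL_LAST_INTERVAL" (number-to-string interval))
  (org-entry-put (point) "DRILL_REPEATS_SINCE_FAIL" (number-to-string repeats)))

;;; Normal Cases
(ert-deftest my-test-reschedule-normal-properties-round-trip ()
  (with-temp-buffer
    (insert "* Card :drill:\nBody text\n")
    (org-mode)
    (goto-char (point-min))
    (my-test--store-schedule 4.5 2)
    (should (equal "4.5" (org-entry-get (point) "DRILL_LAST_INTERVAL")))
    (should (equal "2" (org-entry-get (point) "DRILL_REPEATS_SINCE_FAIL")))
    ;; Writing properties must not disturb the entry body.
    (should (string-match-p "Body text" (buffer-string)))))

;;; Boundary Cases
(ert-deftest my-test-reschedule-boundary-unset-property-reads-as-nil ()
  (with-temp-buffer
    (insert "* Fresh card :drill:\n")
    (org-mode)
    (goto-char (point-min))
    (should-not (org-entry-get (point) "DRILL_LAST_INTERVAL"))))
#+end_src

Reading the values back with =org-entry-get= checks what a later session would see, rather than what the helper intended to write.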

*** org-drill-entry-days-overdue
*Criticality:* HIGH - Determines entry priority
- Calculates how many days an entry is overdue
- Affects entry selection order
- *Test file:* =test-org-drill-entry-days-overdue.el=
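
Date-dependent logic is easier to test when "today" is passed in rather than read from the clock. The sketch below assumes that design; =my-test--day-number= and =my-test--days-overdue= are hypothetical functions for illustration and do not reproduce org-drill's own calculation:

#+begin_src elisp
(require 'ert)
(require 'time-date)

(defun my-test--day-number (year month day)
  "Return the absolute day number of YEAR-MONTH-DAY (noon avoids DST edges)."
  (time-to-days (encode-time 0 0 12 day month year)))

(defun my-test--days-overdue (last-reviewed interval today)
  "Hypothetical: days past due, from day numbers LAST-REVIEWED and TODAY.
INTERVAL is in days.  Negative means not yet due; nil INTERVAL yields nil."
  (when interval
    (- today (+ last-reviewed interval))))

;;; Normal Cases
(ert-deftest my-test-days-overdue-normal-past-due-returns-positive ()
  (should (= 8 (my-test--days-overdue (my-test--day-number 2025 11 1)
                                      4
                                      (my-test--day-number 2025 11 13)))))

;;; Boundary Cases
(ert-deftest my-test-days-overdue-boundary-due-today-returns-zero ()
  ;; Reviewed 30 October with a 3-day interval: due on 2 November.
  (should (= 0 (my-test--days-overdue (my-test--day-number 2025 10 30)
                                      3
                                      (my-test--day-number 2025 11 2)))))

(ert-deftest my-test-days-overdue-boundary-not-yet-due-returns-negative ()
  (should (= -4 (my-test--days-overdue (my-test--day-number 2025 11 10)
                                       7
                                       (my-test--day-number 2025 11 13)))))

;;; Error Cases
(ert-deftest my-test-days-overdue-error-nil-interval-returns-nil ()
  (should-not (my-test--days-overdue (my-test--day-number 2025 11 1)
                                     nil
                                     (my-test--day-number 2025 11 13))))
#+end_src

Fixing the dates in the test keeps the results the same no matter when the suite runs.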

** Priority 3: Entry Selection & Filtering (Correct entry set)

*** org-drill-entry-overdue-p
*Criticality:* MEDIUM - Filters entries for review
- Determines if entry is due for review
- *Test file:* =test-org-drill-entry-overdue-p.el=

*** org-drill-entry-due-p
*Criticality:* MEDIUM - Core filtering logic
- Checks if entry meets review criteria
- Different behavior for cram vs normal mode
- *Test file:* =test-org-drill-entry-due-p.el=

*** org-drill-entry-leech-p
*Criticality:* MEDIUM - Special case handling
- Identifies problematic items
- Affects leech handling behavior
- *Test file:* =test-org-drill-entry-leech-p.el=

*** org-drill-map-entries
*Criticality:* MEDIUM - Entry collection
- Finds and filters drill entries in scope
- Handles file/tree/agenda scopes
- *Test file:* =test-org-drill-map-entries.el=
- *Integration test:* =test-integration-entry-collection-scope-filters.el=

** Priority 4: Card Presentation (User experience)

*** Card Type Tests
Card type functions call for HYBRID tests (with both unit and integration aspects):
- *Unit aspect:* Individual presentation logic (hiding text, formatting)
- *Integration aspect:* Interaction with answer handling and user response

*Naming convention for card types:*
Pattern: =test-card-type-<card-type>-<category>-<scenario>.el=

Examples:
- =test-card-type-simple-normal-shows-question-hides-answer.el=
- =test-card-type-twosided-normal-alternates-sides.el=
- =test-card-type-hide1cloze-boundary-single-cloze-hides-correctly.el=
- =test-card-type-multicloze-error-no-cloze-markup-fails-gracefully.el=

*Card types to test:*
1. =org-drill-present-simple-card= - Basic Q&A (PRIORITY: HIGH)
2. =org-drill-present-two-sided-card= - Bidirectional cards (PRIORITY: MEDIUM)
3. =org-drill-present-multi-sided-card= - Multiple sides (PRIORITY: MEDIUM)
4. =org-drill-present-multicloze-hide1= - Hide one cloze (PRIORITY: HIGH)
5. =org-drill-present-multicloze-show1= - Show one cloze (PRIORITY: MEDIUM)
6. =org-drill-present-multicloze-hide1-firstmore= - Weighted hiding (PRIORITY: LOW)
7. =org-drill-present-verb-conjugation= - Language learning (PRIORITY: LOW)
8. =org-drill-present-noun-declension= - Language learning (PRIORITY: LOW)

** Priority 5: Session State & Reporting (Polish)

*** org-drill-session class methods
*Criticality:* MEDIUM - Session state management
- Track progress, counts, statistics
- *Integration test:* =test-integration-session-state-tracking.el=

*** org-drill-report
*Criticality:* LOW - User feedback
- Display session results
- Less critical to core functionality

* Integration Test Scenarios

** High Priority Integration Tests

*** Complete Drill Session Workflow
*File:* =test-integration-drill-session-complete-workflow.el=

*Scenario:* User runs org-drill, reviews items, session completes successfully

*Components integrated:*
- org-drill (entry point)
- org-drill-entries (session loop)
- org-drill-entry (individual drill)
- Card presentation functions
- org-drill-reschedule (update intervals)
- Property persistence (DRILL_LAST_REVIEWED, etc.)

*Validates:*
- Entries are selected correctly based on due dates
- Cards present appropriately for their type
- User responses trigger correct rescheduling
- Entry properties are updated and persisted
- Session statistics are accurate

*** Spaced Repetition Algorithm Integration
*File:* =test-integration-spaced-repetition-quality-affects-intervals.el=

*Scenario:* Different quality ratings produce expected interval changes

*Components integrated:*
- org-drill-entry (collects quality rating)
- org-drill-determine-next-interval-* (calculates interval)
- org-drill-reschedule (applies new interval)
- Property reading/writing

*Validates:*
- Quality 5 → longer intervals (easy items)
- Quality 0-2 → reset or short intervals (failed items)
- Intervals increase appropriately with successful repetitions
- Algorithm choice (SM2/SM5/Simple8) affects results correctly
- Lapsed items handled appropriately

*** Leech Detection and Handling
*File:* =test-integration-leech-detection-and-handling.el=

*Scenario:* Items that fail repeatedly are tagged and handled as leeches

*Components integrated:*
- org-drill-entry (tracks failures)
- Failure count increment
- Leech tagging (add "leech" tag)
- org-drill-entry-leech-p (detection)
- Leech method handling (skip/warn/nil)

*Validates:*
- Failure threshold triggers leech tagging
- Leech items are skipped when leech-method is 'skip
- Warning displayed when leech-method is 'warn
- Leech tag persists across sessions
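
The threshold check could be exercised along these lines. =my-test--record-failure= is a hypothetical helper written for this sketch, and both the =DRILL_FAILURE_COUNT= property name and the "strictly greater than the threshold" rule are assumptions to verify against org-drill itself:

#+begin_src elisp
(require 'ert)
(require 'org)

(defun my-test--record-failure (threshold)
  "Hypothetical: increment the failure count of the entry at point.
Add the \"leech\" tag once the count exceeds THRESHOLD.  Return the new count."
  (let ((failures (1+ (string-to-number
                       (or (org-entry-get (point) "DRILL_FAILURE_COUNT") "0")))))
    (org-entry-put (point) "DRILL_FAILURE_COUNT" (number-to-string failures))
    (when (> failures threshold)
      (org-toggle-tag "leech" 'on))
    failures))

(ert-deftest my-test-leech-boundary-tag-added-only-past-threshold ()
  (with-temp-buffer
    (insert "* Hard card :drill:\nAnswer\n")
    (org-mode)
    (goto-char (point-min))
    ;; Two failures with threshold 2: still not a leech.
    (my-test--record-failure 2)
    (my-test--record-failure 2)
    (should-not (member "leech" (org-get-tags nil t)))
    ;; The third failure crosses the threshold.
    (should (= 3 (my-test--record-failure 2)))
    (should (member "leech" (org-get-tags nil t)))
    ;; The original drill tag must survive the tagging.
    (should (member "drill" (org-get-tags nil t)))))
#+end_src

Testing on both sides of the threshold guards against an off-by-one error in the comparison.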

** Medium Priority Integration Tests

*** Card Type Presentation Chain
*File:* =test-integration-card-types-presentation-and-answer.el=

*Scenario:* Different card types present correctly and collect answers

*Components integrated:*
- org-drill-entry-f (card orchestration)
- Card type presentation functions
- org-drill-present-default-answer (answer display)
- Overlay management (hiding/showing text)

*Validates:*
- Each card type hides appropriate content
- Answer reveal shows correct information
- User can navigate through answer display
- Overlays are cleaned up properly
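
Overlay cleanup can be checked directly by counting overlays before and after. In this sketch the helpers and the =my-test-cloze= overlay property are hypothetical, and the bracket regexp is a simplification rather than org-drill's cloze syntax:

#+begin_src elisp
(require 'ert)
(require 'seq)

(defun my-test--hide-clozes ()
  "Hypothetical: cover each [bracketed] span with an overlay shown as \"[...]\".
Return the overlays in buffer order."
  (let (overlays)
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "\\[[^]\n]+\\]" nil t)
        (let ((ov (make-overlay (match-beginning 0) (match-end 0))))
          (overlay-put ov 'my-test-cloze t)
          (overlay-put ov 'display "[...]")
          (push ov overlays))))
    (nreverse overlays)))

(defun my-test--cloze-overlays ()
  "Return the overlays in the buffer created by `my-test--hide-clozes'."
  (seq-filter (lambda (ov) (overlay-get ov 'my-test-cloze))
              (overlays-in (point-min) (point-max))))

(defun my-test--remove-clozes ()
  "Delete every overlay created by `my-test--hide-clozes'."
  (remove-overlays (point-min) (point-max) 'my-test-cloze t))

(ert-deftest my-test-cloze-normal-overlays-hide-then-clean-up ()
  (with-temp-buffer
    (insert "The capital of [France] is [Paris].\n")
    (unwind-protect
        (let ((overlays (my-test--hide-clozes)))
          (should (= 2 (length overlays)))
          (should (equal "[...]" (overlay-get (car overlays) 'display))))
      ;; Cleanup runs even if an assertion above fails.
      (my-test--remove-clozes))
    (should-not (my-test--cloze-overlays))
    ;; Overlays change only the display, never the underlying text.
    (should (equal "The capital of [France] is [Paris].\n" (buffer-string)))))
#+end_src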

*** Multi-File and Scope Handling
*File:* =test-integration-scope-handling-files-trees-agenda.el=

*Scenario:* Drill sessions work across different scopes

*Components integrated:*
- org-drill (scope parameter handling)
- org-drill-map-entries (scope-aware filtering)
- org-agenda integration (agenda scope)
- File finding and buffer management

*Validates:*
- File scope drills only current file
- Tree scope drills only current subtree
- Directory scope finds all .org files
- Agenda scope uses org-agenda-files
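
Scope behaviour can be sketched with Org's own =org-map-entries=. The helper names below are hypothetical, and the example uses the =nil= scope (current buffer) rather than =file= because a temporary buffer has no file name; directory and agenda scopes would need real files:

#+begin_src elisp
(require 'ert)
(require 'org)

(defun my-test--drill-headings (scope)
  "Hypothetical: titles of entries matching the \"drill\" tag within SCOPE."
  (org-map-entries (lambda () (org-get-heading t t t t)) "drill" scope))

(defun my-test--insert-two-decks ()
  "Insert two top-level trees holding three drill entries in total."
  (insert "* Geography\n"
          "** Capital of France? :drill:\n"
          "** Capital of Peru? :drill:\n"
          "* History\n"
          "** Year of Hastings? :drill:\n")
  (org-mode)
  (goto-char (point-min)))

(ert-deftest my-test-scope-normal-buffer-scope-finds-all-entries ()
  (with-temp-buffer
    (my-test--insert-two-decks)
    (should (= 3 (length (my-test--drill-headings nil))))))

(ert-deftest my-test-scope-normal-tree-scope-finds-only-subtree ()
  (with-temp-buffer
    (my-test--insert-two-decks)
    (re-search-forward "^\\* History")
    (should (equal '("Year of Hastings?")
                   (my-test--drill-headings 'tree)))))
#+end_src

Putting two sibling trees in the fixture is what makes the tree-scope test meaningful: with a single tree, both scopes would return the same entries.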

*** Cram Mode vs Normal Mode
*File:* =test-integration-cram-mode-behavior.el=

*Scenario:* Cram mode treats entries differently from normal mode

*Components integrated:*
- org-drill-cram (cram mode entry)
- Entry filtering (all items vs due items)
- Scheduling (cram doesn't update long-term schedule)
- org-drill-cram-hours (recent review filtering)

*Validates:*
- Cram mode includes all entries regardless of due date
- Recent items (within cram-hours) are excluded
- Cram mode doesn't update normal scheduling data
- Normal mode only includes due entries

** Lower Priority Integration Tests

*** Session Interruption and Resume
*File:* =test-integration-session-resume-after-interruption.el=

*Scenario:* User can pause and resume drill sessions

*Validates:*
- Session state is preserved
- Failed items are remembered
- Resume continues from correct point

*** Session Time and Count Limits
*File:* =test-integration-session-limits-time-and-count.el=

*Scenario:* Sessions respect maximum duration and item counts

*Validates:*
- Session stops at maximum items
- Session stops at maximum duration
- Limits are configurable

* Implementation Plan

** Phase 1: Foundation (Week 1)
Goal: Test critical path functions to ensure basic operation

*** Step 1.1: Entry Detection Tests
- [ ] Create =test-org-drill-entry-p.el=
  - Normal: Valid drill tag
  - Boundary: Inherited tag, nested entries
  - Error: No heading, no tags

- [ ] Create =test-org-drill-part-of-drill-entry-p.el=
  - Normal: Main heading and subheading
  - Boundary: Deeply nested
  - Error: Outside drill entry

*** Step 1.2: Basic Scheduling Tests
- [ ] Create =test-org-drill-determine-next-interval-sm2.el=
  - Normal: Quality 3-5 (successful recall)
  - Boundary: First repetition, very high repetition count
  - Error: Quality 0-2 (failed items), invalid quality

- [ ] Create =test-org-drill-reschedule.el=
  - Normal: Update with new interval
  - Boundary: Nil intervals, zero intervals
  - Error: Invalid entry, missing properties

*** Step 1.3: First Integration Test
- [ ] Create =test-integration-drill-session-simple-workflow.el=
  - Single entry, simple card type
  - User rates quality 4
  - Verify interval updated correctly

** Phase 2: Card Types (Week 2)
Goal: Ensure all card types work correctly

*** Step 2.1: Simple Card Types
- [ ] Create =test-card-type-simple-normal-presentation.el=
- [ ] Create =test-card-type-twosided-normal-alternates.el=

*** Step 2.2: Cloze Card Types
- [ ] Create =test-card-type-hide1cloze-normal-single-hidden.el=
- [ ] Create =test-card-type-show1cloze-normal-single-shown.el=
- [ ] Create =test-card-type-multicloze-boundary-multiple-clozes.el=

*** Step 2.3: Card Type Integration
- [ ] Create =test-integration-card-types-all-types-work.el=
  - Test each card type in actual drill session
  - Verify presentation and answer handling

** Phase 3: Advanced Scheduling (Week 3)
Goal: Test all scheduling algorithms thoroughly

*** Step 3.1: SM5 Algorithm
- [ ] Create =test-org-drill-determine-next-interval-sm5.el=
- [ ] Test optimal factor matrix behavior

*** Step 3.2: Simple8 Algorithm
- [ ] Create =test-org-drill-determine-next-interval-simple8.el=
- [ ] Test early/late review adjustments

*** Step 3.3: Overdue and Due Logic
- [ ] Create =test-org-drill-entry-days-overdue.el=
- [ ] Create =test-org-drill-entry-overdue-p.el=
- [ ] Create =test-org-drill-entry-due-p.el=

*** Step 3.4: Algorithm Integration
- [ ] Create =test-integration-spaced-repetition-algorithms.el=
  - Compare SM2, SM5, Simple8 behaviors
  - Verify algorithm selection works

** Phase 4: Session Management (Week 4)
Goal: Test session orchestration and state

*** Step 4.1: Session State Tests
- [ ] Create =test-org-drill-session-class.el=
  - Test session initialization
  - Test state tracking (done, failed, pending)

*** Step 4.2: Entry Collection
- [ ] Create =test-org-drill-map-entries.el=
  - Test file scope
  - Test tree scope
  - Test tag filtering

*** Step 4.3: Session Integration
- [ ] Create =test-integration-drill-session-complete-workflow.el=
- [ ] Create =test-integration-session-state-tracking.el=

** Phase 5: Special Cases (Week 5)
Goal: Test edge cases and special behaviors

*** Step 5.1: Leech Handling
- [ ] Create =test-org-drill-entry-leech-p.el=
- [ ] Create =test-integration-leech-detection-and-handling.el=

*** Step 5.2: Cram Mode
- [ ] Create =test-org-drill-cram.el=
- [ ] Create =test-integration-cram-mode-behavior.el=

*** Step 5.3: Session Limits
- [ ] Create =test-integration-session-limits-time-and-count.el=

** Phase 6: Polish (Week 6)
Goal: Add remaining coverage and documentation

*** Step 6.1: Boundary Cases
- [ ] Review all test files for boundary case coverage
- [ ] Add missing boundary tests

*** Step 6.2: Error Cases
- [ ] Review all test files for error case coverage
- [ ] Add missing error tests

*** Step 6.3: Documentation
- [ ] Update this document with final coverage statistics
- [ ] Document any untested areas and rationale
- [ ] Add test maintenance guide

* Test Naming Quick Reference

** Unit Test Naming
Pattern: =test-org-drill-<function>-<category>-<scenario>-<expected>.el=

Categories:
- =normal= - Expected usage patterns
- =boundary= - Edge values, empty/nil, limits
- =error= - Invalid inputs, failures

Example structure within file:
#+begin_src elisp
;;; Normal Cases
(ert-deftest test-org-drill-entry-p-normal-valid-tag-returns-true () ...)
(ert-deftest test-org-drill-entry-p-normal-no-tag-returns-nil () ...)

;;; Boundary Cases
(ert-deftest test-org-drill-entry-p-boundary-inherited-tag-returns-true () ...)
(ert-deftest test-org-drill-entry-p-boundary-deeply-nested-returns-true () ...)

;;; Error Cases
(ert-deftest test-org-drill-entry-p-error-not-at-heading-returns-nil () ...)
#+end_src

** Integration Test Naming
Pattern: =test-integration-<area>-<scenario>-<outcome>.el=

Areas:
- =drill-session= - Complete drill workflows
- =spaced-repetition= - Algorithm behavior
- =card-types= - Card presentation
- =leech= - Leech detection and handling
- =cram= - Cram mode behavior
- =scope= - File/tree/agenda scope
- =session-state= - State tracking

Example structure within file:
#+begin_src elisp
;;; Setup
(defun test-integration-setup-drill-buffer () ...)

;;; Normal Workflow Tests
(ert-deftest test-integration-drill-session-single-entry-completes () ...)
(ert-deftest test-integration-drill-session-multiple-entries-tracked () ...)

;;; Edge Case Tests
(ert-deftest test-integration-drill-session-all-failed-tracks-correctly () ...)
#+end_src

** Card Type Test Naming
Pattern: =test-card-type-<card-type>-<category>-<scenario>.el=

Card types:
- =simple= - Basic Q&A
- =twosided= - Bidirectional
- =multisided= - Multiple faces
- =hide1cloze= - Hide one cloze
- =show1cloze= - Show one cloze
- =multicloze= - Multiple cloze handling
- =conjugate= - Verb conjugation
- =declension= - Noun declension

* Coverage Goals

** Target Coverage by Component

*** Entry Management: 90%
- Entry detection functions are critical
- Must handle all tag inheritance cases
- Edge cases around heading detection

*** Scheduling Algorithms: 95%
- Mathematical correctness is essential
- All quality ratings must be tested
- Boundary intervals (0, 1, max) critical

*** Card Types: 80%
- Basic types (simple, cloze) need high coverage
- Specialized types (conjugation) less critical
- Focus on correct text hiding/showing

*** Session Management: 85%
- Core loop must be robust
- State tracking is important
- Scope handling needs coverage

*** UI/Presentation: 60%
- Interactive functions harder to test
- Focus on testable helper functions
- Integration tests for user workflows

** Overall Target: 80% Coverage
- Focus on critical path first
- Add coverage incrementally
- Balance effort vs value

* Maintenance Guidelines

** Updating This Document
- Update "Current Status" section as tests are implemented
- Check off items in Implementation Plan as completed
- Document any deviations from the plan with rationale
- Add new test ideas to the appropriate section

** Test Maintenance
- Run full test suite before committing: =make test=
- Update tests when functionality changes
- Remove obsolete tests
- Refactor tests alongside production code

** Adding New Tests
1. Determine if unit or integration test
2. Follow naming convention for category
3. Place in appropriate file (create if needed)
4. Use existing test utilities where possible
5. Add to this document's tracking sections

* References

- quality-engineer.org: Comprehensive testing guidelines
- Makefile: Test runner configuration
- tests/org-drill-test.el: Existing test examples
- testutil-*.el files: Test utility functions (if created)

* Notes

** Test Philosophy for org-drill
- Spaced repetition correctness is paramount (test algorithms thoroughly)
- User data integrity matters (test property updates carefully)
- Card presentation affects learning (test hiding/showing accurately)
- Session state must be reliable (test state transitions)

** Card Types: Unit or Integration?
Card type tests are HYBRID:
- *Unit aspects:* Text hiding, formatting, overlay management
- *Integration aspects:* Answer handling, user response, state transitions

*Recommendation:* Write as unit tests first (fast, focused), then add integration tests for workflows that span card presentation + answer + rescheduling.

** Testing Interactive Functions
Many org-drill functions are interactive commands (a =defun= whose body begins with =(interactive)=):
- Extract testable logic into internal functions (named with a double-dash prefix, e.g. =org-drill--internal=)
- Test internal functions with explicit parameters
- Keep interactive wrappers thin (just user input handling)
- Integration tests can exercise full interactive workflows
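
The split between a thin wrapper and a testable internal function might look like this. Both functions are hypothetical examples, not part of org-drill, and stubbing =read-number= with =cl-letf= is one possible way to drive the wrapper without a user:

#+begin_src elisp
(require 'ert)
(require 'cl-lib)

(defun my-drill--classify-quality (quality)
  "Hypothetical internal function: map QUALITY (0-5) to `failed' or `passed'."
  (unless (and (integerp quality) (<= 0 quality 5))
    (error "Quality must be an integer from 0 to 5: %S" quality))
  (if (<= quality 2) 'failed 'passed))

(defun my-drill-rate ()
  "Hypothetical thin wrapper: prompt for a rating, then delegate."
  (interactive)
  (my-drill--classify-quality (read-number "Quality (0-5): ")))

;;; Unit tests call the internal function with explicit parameters.
(ert-deftest my-drill-classify-normal-quality-4-returns-passed ()
  (should (eq 'passed (my-drill--classify-quality 4))))

(ert-deftest my-drill-classify-boundary-quality-2-returns-failed ()
  (should (eq 'failed (my-drill--classify-quality 2))))

(ert-deftest my-drill-classify-error-out-of-range-signals ()
  (should-error (my-drill--classify-quality 9)))

;;; The wrapper is exercised by replacing the prompt with a stub.
(ert-deftest my-drill-rate-normal-stubbed-prompt-returns-passed ()
  (cl-letf (((symbol-function 'read-number) (lambda (&rest _) 4)))
    (should (eq 'passed (my-drill-rate)))))
#+end_src

With this split, most tests never touch the minibuffer, and only one test per command needs a stub.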

** Testing with Real Org Buffers
Some tests need real org-mode buffers:
- Use =with-temp-buffer= and =(org-mode)= for temporary buffers
- Create test data as org-mode text, not mocked functions
- Test with realistic org structure (headings, properties, tags)
- Clean up in teardown any buffers not created with =with-temp-buffer= (which kills its own buffer on exit)
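
A fixture written as Org text might look like the sketch below. The entry content and the =DRILL_LAST_INTERVAL= value are made-up test data, and the final assertion only demonstrates that =with-temp-buffer= leaves nothing behind:

#+begin_src elisp
(require 'ert)
(require 'org)

(ert-deftest my-test-fixture-normal-reads-property-from-org-text ()
  (let (buffer)
    (with-temp-buffer
      (setq buffer (current-buffer))
      ;; Realistic structure: parent heading, tagged entry, property drawer, body.
      (insert "* Deck\n"
              "** Capital of France? :drill:\n"
              ":PROPERTIES:\n"
              ":DRILL_LAST_INTERVAL: 4.0\n"
              ":END:\n"
              "Paris\n")
      (org-mode)
      (goto-char (point-min))
      (re-search-forward "^\\*\\* Capital")
      (should (equal "4.0" (org-entry-get (point) "DRILL_LAST_INTERVAL")))
      (should (equal '("drill") (org-get-tags nil t))))
    ;; The temporary buffer is already gone, so no explicit teardown is needed.
    (should-not (buffer-live-p buffer))))
#+end_src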