test(file-tool): cover parse_pages_arg edge cases

`parse_pages_arg` validates the user-supplied `pages` argument that
`ReadFileTool` forwards to `pdftotext -f START -l END`. The function
has zero tests today even though it's the only gatekeeper between
user input and a pdftotext spawn — silent acceptance of a malformed
range yields a confusing empty extraction with no actionable error
message.
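
To make the hand-off concrete, here is a minimal sketch of how a validated pair might be turned into the `pdftotext -f START -l END` invocation. The helper name `pdftotext_command` and its signature are hypothetical and are not taken from `ReadFileTool`; only the `-f`/`-l` flags and the `-` stdout convention come from pdftotext itself.

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical helper (not from the repository): builds, but does not
/// spawn, a pdftotext invocation for an optional one-indexed page range.
fn pdftotext_command(pdf: &Path, pages: Option<(u32, u32)>) -> Command {
    let mut cmd = Command::new("pdftotext");
    if let Some((start, end)) = pages {
        // -f is the first page to convert, -l the last page.
        cmd.arg("-f").arg(start.to_string());
        cmd.arg("-l").arg(end.to_string());
    }
    // A text-file argument of "-" asks pdftotext to write to stdout.
    cmd.arg(pdf).arg("-");
    cmd
}
```

Because the range reaches the command line unchanged, any validation has to happen before this point, which is the role `parse_pages_arg` plays.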

Adds five tests:

* `parse_pages_arg_accepts_single_page` — `"3"` and `" 7 "` both
  return `Some((n, n))`.
* `parse_pages_arg_accepts_range` — `"1-5"`, `"10-20"`, and
  whitespace-tolerant `" 1 - 5 "` all parse correctly.
* `parse_pages_arg_rejects_invalid_ranges` — `5-1` (end < start),
  `0` and `0-3` (one-indexed contract), empty / whitespace-only
  inputs, `abc` (non-numeric), and `3.5` (floats) all return `None`.
* `parse_pages_arg_rejects_half_open_ranges` — `1-`, `-5`, and `-`
  are rejected rather than silently extended to `u32::MAX` or `0`.
* `parse_pages_arg_rejects_negative_numbers` — `-3-5` doesn't wrap
  into a giant positive number via u32 parsing.

Zero behaviour change; locks the contract so a future innocuous edit
can't silently shift validation.
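
For reference, the contract these tests pin can be sketched as below. This is a hypothetical reimplementation, not the function in the repository: the signature `fn parse_pages_arg(&str) -> Option<(u32, u32)>` is inferred from the assertions, and `parse_page` is an illustrative helper.

```rust
/// Hypothetical sketch of the contract: one-indexed pages, either a single
/// page `N` or a closed range `START-END` with START <= END.
fn parse_pages_arg(raw: &str) -> Option<(u32, u32)> {
    let trimmed = raw.trim();
    // Split on the first dash; input without a dash is a one-page range.
    let (start, end) = match trimmed.split_once('-') {
        Some((a, b)) => (parse_page(a)?, parse_page(b)?),
        None => {
            let n = parse_page(trimmed)?;
            (n, n)
        }
    };
    // Reject reversed ranges such as `5-1`.
    (start <= end).then_some((start, end))
}

/// Illustrative helper: parses one page number, rejecting empty strings,
/// signs, non-digits, zero, and values that overflow `u32`.
fn parse_page(s: &str) -> Option<u32> {
    let s = s.trim();
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    match s.parse::<u32>() {
        Ok(n) if n >= 1 => Some(n),
        _ => None,
    }
}
```

In this sketch `-3-5` is rejected because splitting at the first dash leaves an empty start, and the explicit digit check also rejects `+3`, which `str::parse::<u32>` alone would accept. The real function may reach the same results by a different route.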

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LinQ
2026-05-11 00:50:48 +01:00
committed by Hunter Bown
parent 68b25584cc
commit 6440052089
+52
@@ -462,6 +462,58 @@ mod tests {
        assert_eq!(result.content, "hello world");
    }

    #[test]
    fn parse_pages_arg_accepts_single_page() {
        assert_eq!(parse_pages_arg("3"), Some((3, 3)));
        assert_eq!(parse_pages_arg(" 7 "), Some((7, 7)));
    }

    #[test]
    fn parse_pages_arg_accepts_range() {
        assert_eq!(parse_pages_arg("1-5"), Some((1, 5)));
        assert_eq!(parse_pages_arg("10-20"), Some((10, 20)));
        // Whitespace around either side of the dash is tolerated so
        // hand-typed `pages: "1 - 5"` still works.
        assert_eq!(parse_pages_arg(" 1 - 5 "), Some((1, 5)));
    }

    #[test]
    fn parse_pages_arg_rejects_invalid_ranges() {
        // Caller would otherwise feed `pdftotext -f 5 -l 1`, which
        // prints nothing; fail loudly so the model can re-issue.
        assert!(parse_pages_arg("5-1").is_none(), "end < start must reject");
        // 0-indexed pages aren't a thing in pdftotext; reject so the
        // caller doesn't get a confusing "no output" silent fail.
        assert!(
            parse_pages_arg("0").is_none(),
            "zero single-page must reject"
        );
        assert!(parse_pages_arg("0-3").is_none(), "zero start must reject");
        // Empty / whitespace-only / non-numeric inputs must reject.
        assert!(parse_pages_arg("").is_none());
        assert!(parse_pages_arg(" ").is_none());
        assert!(parse_pages_arg("abc").is_none());
        assert!(parse_pages_arg("3.5").is_none(), "floats must reject");
    }

    #[test]
    fn parse_pages_arg_rejects_half_open_ranges() {
        // Half-open ranges like `1-` or `-5` are almost certainly a
        // typo for `1-N`/`N` rather than intentional input. Reject
        // them rather than silently extending to u32::MAX or 0.
        assert!(parse_pages_arg("1-").is_none());
        assert!(parse_pages_arg("-5").is_none());
        assert!(parse_pages_arg("-").is_none());
    }

    #[test]
    fn parse_pages_arg_rejects_negative_numbers() {
        // Parsing a negative literal as `u32` (`str::parse::<u32>`)
        // returns Err, so the function reports `None` rather than
        // wrapping into a giant positive number. Defensive, but
        // worth pinning.
        assert!(parse_pages_arg("-3-5").is_none());
    }

    #[tokio::test]
    async fn test_read_file_not_found() {
        let tmp = tempdir().expect("tempdir");