Keyword check: matching expected text inside the HTML response

5 min read
Updated May 12, 2026

Keyword check is an extra check on top of an HTTP monitor: in addition to the standard status code probe, Tracker.ru searches the response body for a given substring and considers the URL healthy only if the substring is found. If the page returns 200 OK but the expected word is gone, the URL transitions to down and notifications are dispatched.

This covers a whole class of incidents where the server is technically alive but the content is broken: an empty template, a blank page after a deploy, a stub page from a reverse proxy, a defacement, or a malfunctioning feature-flag panel. Without keyword check, such cases pass HTTP monitoring silently.

When you need keyword check

  • Anti-defacement. The substring is a fragment of unique text that must always be on the site (for example, a copyright line in the footer). If an attacker swaps the home page, the keyword is no longer found and an alert fires.
  • Feature-flag SLO. The page contains a marker like Feature: enabled. When the feature flag is broken and the page returns 200 without the marker, monitoring catches the regression.
  • Template integrity on 200. The cache returned stale HTML, the frontend is broken, but the status is still 200 — a keyword from the expected block (a username, navigation menu, a specific heading) immediately tells you something is off.
  • Status page. A public status page usually contains a phrase like "All systems operational". A keyword check on this phrase turns the status page into an "is everything fine" monitor.
  • Deploy version marker. A footer or meta tag with the version number (v1.42.0). Set the keyword to the version you expect after the release; if the deploy fails and the page still shows the old version, the check alerts.

How to configure

In the URL edit form, fill in the "Expected keyword" field (expected_keyword). You can enter any substring up to 255 characters: a phrase, a fragment of HTML markup, a piece of footer text, a CSS class name, or anything else that is guaranteed to appear on the expected page.

Via API:

curl -X POST https://tracker.ru/api/v1/urls \
  -H "Authorization: Bearer <api_token>" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/status",
    "monitor_type": "http",
    "period": 60,
    "expected_keyword": "All systems operational"
  }'

The length limit is 255 characters (max:255 in StoreUrlRequest). That is enough for a unique fragment; do not try to fit the entire HTML into the field — keep the minimal sufficient substring that your frontend is guaranteed to render on a working version.
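The same request body can be assembled in code, with the length checked before it is sent. The following is a minimal sketch, not part of Tracker.ru: buildPayload, urlPayload, and maxKeywordChars are hypothetical names, and the check counts characters (runes) rather than bytes to mirror the 255-character limit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"unicode/utf8"
)

// maxKeywordChars mirrors the 255-character limit described above.
const maxKeywordChars = 255

// urlPayload holds the same fields as the curl example.
type urlPayload struct {
	URL             string `json:"url"`
	MonitorType     string `json:"monitor_type"`
	Period          int    `json:"period"`
	ExpectedKeyword string `json:"expected_keyword"`
}

// buildPayload is a hypothetical client-side helper. It rejects keywords
// longer than 255 characters (runes, not bytes) and returns the JSON body.
func buildPayload(url, keyword string, period int) ([]byte, error) {
	if n := utf8.RuneCountInString(keyword); n > maxKeywordChars {
		return nil, fmt.Errorf("keyword is %d characters, limit is %d", n, maxKeywordChars)
	}
	return json.Marshal(urlPayload{
		URL:             url,
		MonitorType:     "http",
		Period:          period,
		ExpectedKeyword: keyword,
	})
}

func main() {
	body, err := buildPayload("https://example.com/status", "All systems operational", 60)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(body))
}
```

The server still applies its own validation; a client-side check like this only surfaces the problem earlier.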

Trigger logic

Each HTTP probe with expected_keyword set runs through this pipeline:

  1. The checker performs an HTTP request.
  2. It receives the response and reads the body (subject to the limits below).
  3. It searches for the substring inside the body — case-insensitive substring match.
  4. If the substring is not found, it sets keyword_not_found = true and Ok = false and records the error Expected keyword not found: <keyword>. The URL transitions to down.
  5. When the substring reappears, keyword_not_found is cleared, the URL recovers, and a recovery notification is sent.
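The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the actual checker code: checkResult, evaluate, and probe are hypothetical names, status code handling is left out, and a local test server stands in for the monitored site.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// checkResult is a hypothetical result type; its fields follow the flags named above.
type checkResult struct {
	Ok              bool
	KeywordNotFound bool
	Error           string
}

// evaluate applies steps 3-4 to a body that has already been read:
// a case-insensitive substring search.
func evaluate(body, keyword string) checkResult {
	if strings.Contains(strings.ToLower(body), strings.ToLower(keyword)) {
		return checkResult{Ok: true}
	}
	return checkResult{
		KeywordNotFound: true,
		Error:           "Expected keyword not found: " + keyword,
	}
}

// probe performs steps 1-2 (GET, read at most 1 MB) and passes the body to evaluate.
func probe(client *http.Client, url, keyword string) checkResult {
	resp, err := client.Get(url)
	if err != nil {
		return checkResult{Error: err.Error()}
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20))
	if err != nil {
		return checkResult{Error: err.Error()}
	}
	return evaluate(string(body), keyword)
}

func main() {
	// A local server that always answers 200 with a fixed page.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "<html><body>All Systems Operational</body></html>")
	}))
	defer srv.Close()

	fmt.Printf("%+v\n", probe(srv.Client(), srv.URL, "all systems operational"))
	fmt.Printf("%+v\n", probe(srv.Client(), srv.URL, "maintenance"))
}
```

Both probes receive 200 OK; only the keyword decides whether the result is healthy, which is the case plain status monitoring misses.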

The check_log table stores the keyword_not_found flag for every check, which lets the URL log distinguish "failed on status" from "failed on keyword".

Technical details

  • HEAD → GET auto-switch. By default Tracker.ru uses the HEAD HTTP method — it is faster and cheaper for the monitored site. But a HEAD response carries no body, and there is nothing to match a keyword against. Therefore, when expected_keyword is set, the checker automatically switches the method to GET (http_checker.go:108-110). No user-side configuration is needed. If HEAD is explicitly chosen in the form, it is overridden to GET for that URL.
  • Case-insensitive. The lookup ignores case. Both sides — the keyword and the response body — are lowercased via strings.ToLower (http_checker.go:168). So Operational, operational, OPERATIONAL are equivalent. Convenient: it does not break on a casual template edit that changes a header's case.
  • Plain substring, not regex. The field value is a plain string. No metacharacters, groups, or anchors. A literal substring is searched for. If you need regex, keyword check is not your tool — look at custom webhook checkers or a dedicated health endpoint.
  • 1 MB body limit. The checker reads the body via io.LimitReader(resp.Body, 1<<20) — only the first megabyte of the response (http_checker.go:164). The keyword search runs only against this megabyte. If your HTML is bigger than 1 MB, place the expected word closer to the beginning of the page (in <head> or right after <body>); otherwise it ends up beyond the limit and the URL constantly emits a false keyword_not_found alert. In practice 1 MB is a lot for a single page; the issue is rare.
  • UTF-8. Cyrillic, Chinese characters, and emoji are supported in the keyword — both compared byte streams are processed as UTF-8 strings. The Laravel validator max:255 uses mb_strlen — the limit is measured in characters (including multi-byte ones), not bytes. For most practical scenarios 255 characters is enough.
  • Compatibility with TCP monitoring. The expected_keyword field is only available for monitor_type=http. For TCP checks it is ignored — TCP has no body to search in. See /docs/features/tcp-monitoring?lang=en for details.
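Two of the points above, the HEAD → GET switch and the 1 MB limit, are easy to try out locally. The sketch below is an approximation built from the description, not the contents of http_checker.go; effectiveMethod, keywordInBody, and keywordInString are hypothetical helpers.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// bodyLimit is the 1 MB cap described above.
const bodyLimit = 1 << 20

// effectiveMethod is a hypothetical helper: HEAD has no body, so any
// non-empty keyword forces GET regardless of the configured method.
func effectiveMethod(configured, keyword string) string {
	if keyword != "" {
		return http.MethodGet
	}
	return configured
}

// keywordInBody reads at most bodyLimit bytes and reports whether the
// keyword occurs in that prefix, ignoring case.
func keywordInBody(body io.Reader, keyword string) (bool, error) {
	prefix, err := io.ReadAll(io.LimitReader(body, bodyLimit))
	if err != nil {
		return false, err
	}
	return strings.Contains(strings.ToLower(string(prefix)), strings.ToLower(keyword)), nil
}

// keywordInString is a convenience wrapper for in-memory bodies.
func keywordInString(body, keyword string) bool {
	found, _ := keywordInBody(strings.NewReader(body), keyword)
	return found
}

func main() {
	fmt.Println(effectiveMethod(http.MethodHead, "operational"))

	padding := strings.Repeat(" ", bodyLimit)
	marker := "<title>Статус: OK</title>"
	// Marker near the start of the page: inside the first megabyte.
	fmt.Println(keywordInString(marker+padding, "статус: ok"))
	// Marker after 1 MB of other content: never read, so never matched.
	fmt.Println(keywordInString(padding+marker, "статус: ok"))
}
```

The second lookup fails even though the marker is present in the full response, which is why the keyword should sit early in a large page.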

Notifications

When the keyword disappears, an alert with an explicit reason goes to the configured channels — it differs from a regular "status 5xx" alert:

Site is unavailable! https://example.com/status
Error: Expected keyword not found
Expected: "All systems operational"
Down since 2026-05-01 19:15:09

In the webhook payload, the error_status field takes the value keyword_not_found. This is useful for routing incidents: keyword failures often need a different reaction than 5xx (for example, ping the frontend team, not SRE).
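A webhook receiver could route on that field along these lines. This is a sketch under assumptions: the payload is assumed to be JSON, only error_status is taken from the description above, and the alert type, routeFor, handleWebhook, and the team names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// alert models only the field named above; any other payload fields
// are ignored by this sketch.
type alert struct {
	ErrorStatus string `json:"error_status"`
}

// routeFor picks a hypothetical team name for an alert.
func routeFor(a alert) string {
	if a.ErrorStatus == "keyword_not_found" {
		return "frontend"
	}
	return "sre"
}

// handleWebhook decodes the JSON payload and replies with the chosen route.
func handleWebhook(w http.ResponseWriter, r *http.Request) {
	var a alert
	if err := json.NewDecoder(r.Body).Decode(&a); err != nil {
		http.Error(w, "invalid JSON", http.StatusBadRequest)
		return
	}
	fmt.Fprint(w, routeFor(a))
}

func main() {
	// Simulate an incoming webhook call without opening a network port.
	payload := strings.NewReader(`{"error_status":"keyword_not_found"}`)
	req := httptest.NewRequest(http.MethodPost, "/hook", payload)
	rec := httptest.NewRecorder()
	handleWebhook(rec, req)
	fmt.Println(rec.Body.String())
}
```

In a real receiver the route would feed a pager or chat integration instead of being echoed back in the response.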

On recovery (the keyword is back in the response) a regular recovery message is sent with the downtime duration.
