Address prometheus scraping jobs that are failing #551

Open
@jfly

Description

We have 3 failing scrape jobs (at the time of writing). Query:

  • up{instance="r13y.com:443", job="r13y"}
  • up{instance="127.0.0.1:9190", job="rfc39"}
  • up{instance="hydra.nixos.org:9199", job="hydra_notify"}
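A list like this could be produced programmatically. The following is a minimal sketch, assuming the result of an instant query for `up` has already been fetched from the Prometheus HTTP API (`/api/v1/query`) and decoded from JSON. The function name `failing_targets`, the sample timestamps, and the healthy `example.org:9100` target are hypothetical and only there for illustration.

```python
import json

# Sample payload in the shape Prometheus returns for an instant query of `up`.
# The first three series mirror the targets listed above; the last one is a
# hypothetical healthy target, and all timestamps are made up.
SAMPLE_RESPONSE = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"__name__": "up", "instance": "r13y.com:443", "job": "r13y"},
       "value": [1730000000, "0"]},
      {"metric": {"__name__": "up", "instance": "127.0.0.1:9190", "job": "rfc39"},
       "value": [1730000000, "0"]},
      {"metric": {"__name__": "up", "instance": "hydra.nixos.org:9199", "job": "hydra_notify"},
       "value": [1730000000, "0"]},
      {"metric": {"__name__": "up", "instance": "example.org:9100", "job": "node"},
       "value": [1730000000, "1"]}
    ]
  }
}
""")


def failing_targets(response):
    """Return sorted (job, instance) pairs whose latest `up` sample is 0."""
    if response.get("status") != "success":
        raise ValueError("query did not succeed")
    failing = []
    for series in response["data"]["result"]:
        _timestamp, value = series["value"]
        # Prometheus encodes sample values as strings in JSON.
        if float(value) == 0:
            labels = series["metric"]
            failing.append((labels["job"], labels["instance"]))
    return sorted(failing)


if __name__ == "__main__":
    for job, instance in failing_targets(SAMPLE_RESPONSE):
        print(f"{job}\t{instance}")
```

In PromQL itself, the expression `up == 0` yields the same set of series directly.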

r13y

Last successful scrape: 2024-09-21
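One way a "last successful scrape" date could be derived is from the samples of a range query on `up`. This is only a sketch: the helper name `last_successful_scrape` and the sample values are hypothetical, and it assumes the `[timestamp, "value"]` pairs that `/api/v1/query_range` returns per series.

```python
from datetime import datetime, timezone


def last_successful_scrape(values):
    """Return the UTC datetime of the newest sample with up == 1, or None.

    `values` is a list of [unix_timestamp, "value"] pairs, as returned per
    series by a Prometheus range query.
    """
    successes = [ts for ts, value in values if float(value) == 1]
    if not successes:
        return None
    return datetime.fromtimestamp(max(successes), tz=timezone.utc)


# Hypothetical samples: two successful scrapes followed by failures.
samples = [
    [1726876800, "1"],  # 2024-09-21 00:00:00 UTC
    [1726920000, "1"],  # 2024-09-21 12:00:00 UTC
    [1726963200, "0"],  # 2024-09-22 00:00:00 UTC
    [1727006400, "0"],  # 2024-09-22 12:00:00 UTC
]

if __name__ == "__main__":
    print(last_successful_scrape(samples))
```

The result is only as precise as the query's step, and it cannot look back further than the retention period of the Prometheus server.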

Added in 3c4f476.

Looks like it's a reproducibility checker created by @grahamc: https://github.com/grahamc/r13y.com.

Perhaps this can just be removed?

rfc39

This is trickier. It seems to be up only periodically (link: Query).

Apparently this is a known "issue"; see the comment here: https://github.com/nixos/infra/blob/af0ed6d10dbb3a3ec919321314506b180d1f5faf/build/pluto/prometheus/exporters/rfc39.nix#L12.

AFAICT, we don't have any alerting rules configured that react to this (just this systemd unit state). Perhaps we could just stop scraping this? Is there useful historical data in here?

hydra_notify

Last successful scrape: 2024-08-02

Added in bf95096; also see 88abf45.

Looks like @mweinelt disabled hydra-notify here: 66da5cf, which lines up with the last successful scrape.

Seems like we should just disable this scrape job as well.
