
feat: add curated telegraf 1.38.2 build for Azure Linux 4.0 (#20399) #17893

Draft: WithEnoughCoffee wants to merge 2 commits into 4.0 from autumnnash/telegraf-20399



@WithEnoughCoffee

Contributor

Supersedes #17798. Same change, re-opened from a branch on microsoft/azurelinux (instead of a fork) so the ADO source-upload/package-build pipelines can run with secrets. Branch content is identical (commit dccc59f).

Summary

Telegraf shipped in Azure Linux 3.0 but is missing from 4.0. This restores it as a general-purpose, plugin-driven agent for collecting, processing, aggregating, and writing metrics. Resolves #20399.

Why a curated build

Built upstream-default, telegraf links ~400 plugins and the full transitive dependency tree — a large CVE surface, vendor footprint, and binary for a distro we maintain. Instead we compile a curated ("Balanced", 108 build tags) set: 63 inputs, 15 outputs, 7 processors, 4 aggregators, 12 parsers, 7 serializers. The rest are absent from the binary at build time.

  • General-purpose: common system, network, database, container, and cloud inputs/outputs included, useful out of the box.
  • Full first-party Azure + GitHub: azure_monitor (in/out), azure_storage_queue, eventhub_consumer, azure_data_explorer, and the github input are all included, since AzL is an Azure/Microsoft + GitHub product and these should work by default.
  • Smaller attack/maintenance surface: fewer compiled plugins → fewer linked deps, fewer CVEs, smaller binary. Concretely, go list -deps ./cmd/telegraf links 1,877 packages vs 3,386 for the full build (~45% dropped), across 344 distinct third-party modules vs 592 (248 fewer, ~42%). Note: this reduces the linked/runtime-reachable surface only; the full vendor tree is still shipped (Fedora requires it), so the source-level CVE-scan footprint is unchanged.
  • Reviewable & adjustable in one place: the entire plugin policy is a single spec macro (%global buildtags). Adding (or removing) a plugin is a one-line change: append its tag to the macro, e.g.:
         inputs.cpu inputs.disk inputs.diskio inputs.mem inputs.net inputs.netstat \
    +    inputs.redis \
    then re-render and rebuild. No %build/%install/%files changes are needed, so curation stays easy to audit and evolve as requirements change.
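
The counts quoted in this section can be cross-checked with a few lines of shell. This is only a sketch: the numbers are copied from the text above, and the variable names and the pct_dropped helper are illustrative, not part of the spec or the build.

```shell
#!/bin/sh
# Figures copied from the PR description (not measured by this script).
inputs=63 outputs=15 processors=7 aggregators=4 parsers=12 serializers=7
full_pkgs=3386 curated_pkgs=1877
full_mods=592 curated_mods=344

# Total plugin build tags across all six categories.
total_tags=$((inputs + outputs + processors + aggregators + parsers + serializers))

# pct_dropped <full> <curated>: percentage reduction, rounded to the nearest
# integer using integer arithmetic only (hypothetical helper).
pct_dropped() {
  echo $(( (($1 - $2) * 100 + $1 / 2) / $1 ))
}

echo "plugin tags:      $total_tags"
echo "packages dropped: $((full_pkgs - curated_pkgs)) ($(pct_dropped "$full_pkgs" "$curated_pkgs")%)"
echo "modules dropped:  $((full_mods - curated_mods)) ($(pct_dropped "$full_mods" "$curated_mods")%)"
```

Run as written, it reports 108 plugin tags, 1509 packages dropped (45%), and 248 modules dropped (42%), which agrees with the figures above.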

Note on upstream: telegraf is not packaged by Fedora; the reference is InfluxData's own RPM, which ships the full plugin set. Our curation is a deliberate deviation produced via upstream's supported custom build tag (the same mechanism InfluxData exposes to end users for slimmed builds). The full vendor tree is retained for reproducibility and easy plugin additions.

Packaging (Fedora Go guidelines)

Uses the go2rpm --profile vendor scaffold as the baseline (Go Vendor Tools, vendored deps, %gobuild with GO_BUILDTAGS/GO_LDFLAGS), so it can be upstreamed to Fedora and matches the vendored-Go pattern AzL already uses (rootlesskit, git-lfs). Divergences are marked # AzL:. The full vendor tree is retained (Fedora requires it); curation only affects what is compiled. The cumulative SPDX License tag is computed with go_vendor_license and enforced by %go_vendor_license_check; bundled(golang(...)) provides are auto-generated.

systemd unit

The upstream systemd unit is shipped unmodified (runs as User=telegraf). We intentionally add no sandboxing drop-in: telegraf is a whole-system monitoring agent, and the curated inputs include hardware collectors that shell out via sudo -n (smart, smartctl, ipmi_sensor) or need CAP_NET_RAW (ping) — NoNewPrivileges/Protect* would break them. This matches upstream InfluxData and AzL 3.0. (An earlier revision shipped an openSUSE-derived 50-hardening.conf; it was dropped after review because it diverged from upstream and conflicted with the curated hardware inputs. See the PR discussion for the full rationale.)

Contents

  • telegraf.spec: %gometa, curated %global buildtags, Go Vendor Tools license macros, sysusers (no userdel on uninstall), upstream systemd unit (unmodified), logrotate, generated default config, state dir, %check (license check + binary smoke test).
  • go-vendor-tools.toml: askalono detector + manual license entries.
  • telegraf.comp.toml: upstream source plus the full vendor tarball.
  • telegraf.sysusers, telegraf.default, generate_source_tarball.sh, locks/telegraf.lock.

Verification

Full mock build passes every phase including %check. Confirmed in mock:

  • Binary reports Telegraf 1.38.2 (branch stamped azurelinux); functional collection works (cpu input loads and emits).
  • Curated plugins present (azure_monitor in/out, azure_data_explorer, github, eventhub_consumer, docker, prometheus, snmp, …); non-curated absent (cloudwatch, sqlserver, nats, clickhouse).
  • File modes: telegraf.conf 0644 root:root (world-readable, as on Fedora); state dir /var/lib/telegraf 0770 root:telegraf (matching upstream InfluxData post-install.sh).
  • Install/erase lifecycle: sysusers creates the telegraf user with home /etc/telegraf (matching upstream InfluxData useradd -r -M -d /etc/telegraf; config is read via the unit's explicit -config flag, independent of $HOME); the unit installs; on erase the user is intentionally retained.
  • systemd-analyze verify accepts the unit; debuginfo is split into its own subpackage.
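
The present/absent plugin check above could be scripted roughly as follows. A minimal sketch, assuming the compiled-in plugin names were first dumped to a text file, one per line (for instance by redirecting the output of telegraf's plugin-listing command). check_plugins and the sample listing are hypothetical and are not part of the spec's %check.

```shell
#!/bin/sh
# check_plugins <listing-file> "<names expected present>" "<names expected absent>"
# Returns 0 only if every expected name appears as a whole word in the listing
# and none of the excluded names does. Uses grep -w, which GNU and BSD grep
# both support.
check_plugins() {
  listing=$1
  rc=0
  for name in $2; do
    grep -qw -- "$name" "$listing" || { echo "missing: $name" >&2; rc=1; }
  done
  for name in $3; do
    if grep -qw -- "$name" "$listing"; then
      echo "unexpected: $name" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Hypothetical sample listing standing in for the real binary's output.
sample=$(mktemp)
printf '%s\n' inputs.cpu inputs.docker inputs.github outputs.azure_monitor > "$sample"
if check_plugins "$sample" "cpu docker github azure_monitor" \
                 "cloudwatch sqlserver nats clickhouse"; then
  echo "plugin set matches the curated policy"
fi
rm -f "$sample"
```

Matching on whole words keeps a name such as docker from being satisfied by a longer plugin name that merely starts with it.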

Why 1.38.2 (not 1.39.0)

telegraf 1.39.0's go.mod requires Go 1.26; AzL 4.0 currently ships Go 1.25.8, and 1.38.2 is the latest release that builds on it. We can bump to 1.39.0 once AzL golang reaches ≥ 1.26 (which also drops the logzio azure-monitor dependency).
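
The toolchain constraint amounts to a version comparison, sketched below. go_version_ok is a hypothetical helper, and 1.26.0 stands in for a future AzL toolchain. The sketch relies on sort -V, a GNU coreutils extension.

```shell
#!/bin/sh
# go_version_ok <required> <available>
# Returns 0 when <available> is at least <required>: after a version sort,
# the required version must be the lowest of the two.
go_version_ok() {
  lowest=$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)
  [ "$lowest" = "$1" ]
}

required=1.26                       # go directive of telegraf 1.39.0 (per the text above)
for toolchain in 1.25.8 1.26.0; do  # 1.26.0 is a hypothetical future toolchain
  if go_version_ok "$required" "$toolchain"; then
    echo "Go $toolchain: meets the go $required requirement"
  else
    echo "Go $toolchain: too old for go $required"
  fi
done
```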

Known follow-up

  • The reproducible vendor tarball (generate_source_tarball.sh, SHA512 1108fe48086a7051c5cb89935c6de1c675c3ea8212a979d147ad0c03aef327c6234fa9eee292e4f9594ba9ec2cb757fc9eff46630aea43551bca3d948b30b27f) must be uploaded to the lookaside store before CI source checks and package builds can fetch it; the comp.toml source URI already points at its final published path.
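
Once the tarball has been fetched, the pinned digest can be checked locally along these lines. A sketch only: verify_sha512, the VENDOR_TARBALL variable, and the default filename are assumptions (the archive's filename is not stated here). The digest is the one quoted above.

```shell
#!/bin/sh
# verify_sha512 <file> <expected-hex-digest>
# Returns 0 only if the file's SHA512 digest equals the expected value.
verify_sha512() {
  actual=$(sha512sum "$1" 2>/dev/null | cut -d ' ' -f 1)
  [ "$actual" = "$2" ]
}

expected=1108fe48086a7051c5cb89935c6de1c675c3ea8212a979d147ad0c03aef327c6234fa9eee292e4f9594ba9ec2cb757fc9eff46630aea43551bca3d948b30b27f
# Hypothetical default filename; override with VENDOR_TARBALL=/path/to/archive.
tarball=${VENDOR_TARBALL:-telegraf-1.38.2-vendor.tar.gz}

if [ ! -f "$tarball" ]; then
  echo "skipping: $tarball not found"
elif verify_sha512 "$tarball" "$expected"; then
  echo "OK: $tarball matches the pinned SHA512"
else
  echo "MISMATCH: $tarball does not match the pinned SHA512" >&2
fi
```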

WithEnoughCoffee and others added 2 commits July 1, 2026 21:06
Restore telegraf (absent from AzL 4.0) as a general-purpose metrics agent,
packaged per the Fedora Go guidelines for upstreaming.

- Curated ("Balanced") custom build via GO_BUILDTAGS: a general-purpose subset
  (~108 of ~415 plugins) is compiled in, including the full first-party Azure
  plugin set and the github input. This is a deliberate AzL deviation from the
  full upstream/AZL3 build, produced via upstream's supported `custom` build
  tag; the complete vendor tree is still shipped for reproducibility and easy
  plugin additions. Curation drops ~248 third-party modules from the linked
  binary (defense-in-depth: unlinked code is not runtime-reachable).
- Reproducible vendor tarball via generate_source_tarball.sh (deterministic tar
  flags, fixed SOURCE_DATE_EPOCH=0); pinned by SHA512. The script is an
  out-of-band maintainer tool and is never invoked during rpmbuild. Parser
  hardened with a catch-all case to avoid an infinite loop on unexpected args.
- rpmautospec %autorelease / %autochangelog (release calculation = autorelease),
  per the convention for newly authored specs.
- System user via sysusers.d with home /etc/telegraf, matching upstream
  InfluxData (useradd -M -d /etc/telegraf). The home field is unused for config
  loading: the unit reads config via explicit -config /etc/telegraf/telegraf.conf
  -config-directory /etc/telegraf/telegraf.d, and no curated plugin writes $HOME.
- Ships the upstream systemd unit as-is (no sandboxing drop-in), matching
  upstream InfluxData; curated hardware inputs (smart, ipmi_sensor, ping) rely on
  sudo/CAP_NET_RAW that aggressive Protect*/NoNewPrivileges settings would break.
- Config, sysusers and env-file drop-ins shipped as Source files;
  telegraf.conf installed 0644 root:root (world-readable, as on Fedora) and
  state dir /var/lib/telegraf installed 0770 root:telegraf, matching upstream
  InfluxData (scripts/rpm/post-install.sh).
Co-authored-by: Tobias Brick <39196763+tobiasb-ms@users.noreply.github.com>
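
For readers unfamiliar with the two script-level points in this commit message (deterministic tar flags and the catch-all argument case), a minimal sketch follows. It is not the contents of generate_source_tarball.sh: the function names and the --src/--out flags are hypothetical, and the tar options are a common GNU tar recipe for reproducible archives, assumed rather than taken from the script.

```shell
#!/bin/sh
# parse_args: hypothetical flag parser for --src <dir> and --out <file>.
parse_args() {
  src=
  out=
  while [ $# -gt 0 ]; do
    case $1 in
      --src|--out)
        # Guard the value so "shift 2" can never fail and leave $1 in place.
        [ $# -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
        if [ "$1" = --src ]; then src=$2; else out=$2; fi
        shift 2
        ;;
      *)
        # Catch-all: without this branch an unrecognised argument is never
        # shifted away and the loop spins forever.
        echo "unknown argument: $1" >&2
        return 2
        ;;
    esac
  done
}

# make_tarball <dir> <out.tar.gz>
# Archives <dir> so that identical content yields identical bytes: sorted
# member order, fixed mtime, fixed ownership, and no gzip name/timestamp.
# Assumes GNU tar >= 1.28 (for --sort).
make_tarball() {
  tmp=$(mktemp) || return 1
  tar --sort=name --format=gnu --mtime="@${SOURCE_DATE_EPOCH:-0}" \
      --owner=0 --group=0 --numeric-owner \
      -C "$(dirname "$1")" -cf "$tmp" "$(basename "$1")" &&
    gzip -n < "$tmp" > "$2"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}
```

A caller would run parse_args "$@" and then make_tarball "$src" "$out". Because every file timestamp is overridden, rebuilding after a fresh checkout should produce the same SHA512 as long as the file contents and modes are unchanged.
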
Copilot AI review requested due to automatic review settings July 2, 2026 15:44
@github-actions

github-actions Bot commented Jul 2, 2026


🔒❌ Lock files are out of date

FIX: run this and commit the result:

azldev component update -p telegraf

Or download the fix patch and apply it:

gh run download 28602873058 -R microsoft/azurelinux -n locks-patch
git apply locks.patch

Changed components (1)

| Component | New upstream commit |
| --- | --- |
| telegraf | - |

Copilot AI left a comment

Contributor


Pull request overview

This PR restores telegraf (an InfluxData metrics/monitoring agent that shipped in Azure Linux 3.0) to Azure Linux 4.0 as a new local-spec component. Because telegraf is not packaged by Fedora, it is maintained here as an AzL-authored local spec built from a pinned upstream source tarball plus a reproducible vendored-Go dependency archive. The build is deliberately "curated": only ~108 of telegraf's ~415 plugins are compiled in via a single %global buildtags macro, shrinking the linked dependency/CVE surface while retaining the full vendor tree for reproducibility and easy plugin additions. It follows the established go-vendor-tools pattern already used by rootlesskit/git-lfs.

Changes:

  • Add the telegraf component: local spec (%gometa/%gobuild, curated build tags, Go Vendor Tools license macros), sysusers, env file, logrotate, systemd unit, generated default config, and a reproducible vendor-tarball generator.
  • Wire up sources (upstream tarball from GitHub + vendor archive from the AzL lookaside), lock file, and the rendered spec/manifest under specs/t/telegraf/.
  • Add telegraf to the base-packages publish group (→ rpm-base channel).

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| base/comps/telegraf/telegraf.comp.toml | New component: local spec + two source-files (upstream tarball, lookaside vendor archive). |
| base/comps/telegraf/telegraf.spec | Hand-authored curated spec; matches the go-vendor-tools pattern; rpmautospec release/changelog. |
| base/comps/telegraf/telegraf.sysusers | Declarative telegraf system user (home /etc/telegraf, nologin). |
| base/comps/telegraf/telegraf.default | EnvironmentFile for $TELEGRAF_OPTS. |
| base/comps/telegraf/go-vendor-tools.toml | askalono detector + manual license overrides for the vendor tree. |
| base/comps/telegraf/generate_source_tarball.sh | Out-of-band maintainer tool to produce the reproducible vendor archive. |
| specs/t/telegraf/* | Rendered spec, source-referenced files, and sources SHA512 manifest (consistent with the base copies). |
| locks/telegraf.lock | Component lock with input fingerprint. |
| base/comps/components-publish-channels.toml | Adds telegraf to base-packages (publishes to rpm-base). |

Note: the rendered spec, sources manifest, and lock file appear self-consistent with the component definition, and the vendor/source hashes match across comp.toml, sources, and the spec; final consistency (rendered-spec drift, lock fingerprint) is validated by the repo's Check Rendered Specs / Update Locks CI. The PR description also flags a required lookaside upload of the vendor tarball before source checks/builds can fetch it.

"tcpdump",
"tcsh",
"teckit",
"telegraf",