fix(client): update icmp/ping logic to determine pinger privileged mode (#1346)

* fix(pinger): update logic to determine pinger privileged mode

* add some unit tests for pinger

Signed-off-by: Zee Aslam <zeet6613@gmail.com>

* undo accidental removal

Signed-off-by: Zee Aslam <zeet6613@gmail.com>

* check for cap_net_raw by trying to open a raw socket and checking for permission error

Signed-off-by: Zee Aslam <zeet6613@gmail.com>

* revert syscall after testing. It is unable to build a binary on windows

Signed-off-by: Zee Aslam <zeet6613@gmail.com>

* remove extra import

* review icmp section of readme. No changes required

Signed-off-by: Zee Aslam <zeet6613@gmail.com>

* Update client/client.go

Co-authored-by: TwiN <twin@linux.com>

* Update client/client.go

Match function name

Co-authored-by: TwiN <twin@linux.com>

* Update client/client.go

Remove extra line

Co-authored-by: TwiN <twin@linux.com>

---------

Signed-off-by: Zee Aslam <zeet6613@gmail.com>
Co-authored-by: TwiN <twin@linux.com>
2026-02-04 11:11:44 +00:00 · 2025-11-04 20:42:20 -05:00
parent 2ebb74ae1e
commit 5fdc489113
3 changed files with 68 additions and 24 deletions
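
The rule this commit introduces (always privileged on Windows, otherwise privileged only when the effective user ID is 0) can be sketched as a pure function so that every branch is testable regardless of the host. This is a minimal sketch, not the code in the diff: `shouldRunPrivileged` and its `goos`/`euid` parameters are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// shouldRunPrivileged is a hypothetical, parameterized variant of
// ShouldRunPingerAsPrivileged: it receives the OS name and effective
// user ID as arguments instead of reading them from the environment.
func shouldRunPrivileged(goos string, euid int) bool {
	// Windows always uses privileged mode.
	if goos == "windows" {
		return true
	}
	// Everywhere else, use privileged mode only when running as root.
	return euid == 0
}

func main() {
	// Evaluate the rule for the current process.
	fmt.Println(shouldRunPrivileged(runtime.GOOS, os.Geteuid()))
}
```

Passing the inputs explicitly would let a table-driven test cover the Windows, root, and non-root cases on any machine, whereas the test in this commit can only exercise the branch that matches the environment it runs in.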
--- a/README.md
+++ b/README.md
@@ -373,13 +373,13 @@ Where:
  - Using the example configuration above, the key would be `core_ext-ep-test`.
 - `{success}` is a boolean (`true` or `false`) value indicating whether the health check was successful or not.
 - `{error}` (optional): a string describing the reason for a failed health check. If {success} is false, this should contain the error message; if the check is successful, this will be ignored.
- `{duration}` (optional): the time that the request took as a duration string (e.g. 10s). 
+- `{duration}` (optional): the time that the request took as a duration string (e.g. 10s).

 You must also pass the token as a `Bearer` token in the `Authorization` header.


 ### Suites (ALPHA)
-Suites are collections of endpoints that are executed sequentially with a shared context. 
+Suites are collections of endpoints that are executed sequentially with a shared context.
 This allows you to create complex monitoring scenarios where the result from one endpoint can be used in subsequent endpoints, enabling workflow-style monitoring.

 Here are a few cases in which suites could be useful:
@@ -421,7 +421,7 @@ suites:
    context:
      price: "19.99"  # Initial static value in context
    endpoints:
-      # Step 1: Create an item and store the item ID 
+      # Step 1: Create an item and store the item ID
      - name: create-item
        url: https://api.example.com/items
        method: POST
@@ -435,7 +435,7 @@ suites:
        alerts:
          - type: slack
            description: "Failed to create item"
-            
+
      # Step 2: Update the item using the stored item ID
      - name: update-item
        url: https://api.example.com/items/[CONTEXT].itemId
@@ -446,7 +446,7 @@ suites:
        alerts:
          - type: slack
            description: "Failed to update item"
-        
+
      # Step 3: Fetch the item and validate the price
      - name: get-item
        url: https://api.example.com/items/[CONTEXT].itemId
@@ -457,7 +457,7 @@ suites:
        alerts:
          - type: slack
            description: "Item price did not update correctly"
-            
+
      # Step 4: Delete the item (always-run: true to ensure cleanup even if step 2 or 3 fails)
      - name: delete-item
        url: https://api.example.com/items/[CONTEXT].itemId
@@ -536,7 +536,7 @@ System-wide announcements allow you to display important messages at the top of

 Types:
 - **outage**: Indicates service disruptions or critical issues (red theme)
- **warning**: Indicates potential issues or important notices (yellow theme)  
+- **warning**: Indicates potential issues or important notices (yellow theme)
 - **information**: General information or updates (blue theme)
 - **operational**: Indicates resolved issues or normal operations (green theme)
 - **none**: Neutral announcements with no specific severity (gray theme, default if none are specified)
@@ -548,7 +548,7 @@ announcements:
    type: outage
    message: "Scheduled maintenance on database servers from 14:00 to 16:00 UTC"
  - timestamp: 2025-08-15T16:15:00Z
-    type: operational  
+    type: operational
    message: "Database maintenance completed successfully. All systems operational."
  - timestamp: 2025-08-15T12:00:00Z
    type: information
@@ -709,7 +709,7 @@ endpoints:
 > 📝 Note that if running in a container, you must volume mount the certificate and key into the container.

 ### Tunneling
-Gatus supports SSH tunneling to monitor internal services through jump hosts or bastion servers. 
+Gatus supports SSH tunneling to monitor internal services through jump hosts or bastion servers.
 This is particularly useful for monitoring services that are not directly accessible from where Gatus is deployed.

 SSH tunnels are defined globally in the `tunneling` section and then referenced by name in endpoint client configurations.
@@ -746,7 +746,7 @@ endpoints:
      - "[STATUS] == 200"
 ```

-> ⚠️ **WARNING**:: Tunneling may introduce additional latency, especially if the connection to the tunnel is retried frequently. 
+> ⚠️ **WARNING**:: Tunneling may introduce additional latency, especially if the connection to the tunnel is retried frequently.
 > This may lead to inaccurate response time measurements.


@@ -2182,7 +2182,7 @@ Here's an example of what the notifications look like:
 | `alerting.telegram`                   | Configuration for alerts of type `telegram`                                                | `{}`                       |
 | `alerting.telegram.token`             | Telegram Bot Token                                                                         | Required `""`              |
 | `alerting.telegram.id`                | Telegram User ID                                                                           | Required `""`              |
-| `alerting.telegram.topic-id`          | Telegram Topic ID in a group corresponds to `message_thread_id` in the Telegram API        | `""`                       |    
+| `alerting.telegram.topic-id`          | Telegram Topic ID in a group corresponds to `message_thread_id` in the Telegram API        | `""`                       |
 | `alerting.telegram.api-url`           | Telegram API URL                                                                           | `https://api.telegram.org` |
 | `alerting.telegram.client`            | Client configuration. <br />See [Client configuration](#client-configuration).             | `{}`                       |
 | `alerting.telegram.default-alert`     | Default alert configuration. <br />See [Setting a default alert](#setting-a-default-alert) | N/A                        |
@@ -2844,7 +2844,7 @@ will send a `POST` request to `http://localhost:8080/playground` with the follow


 ### Recommended interval
-To ensure that Gatus provides reliable and accurate results (i.e. response time), Gatus limits the number of 
+To ensure that Gatus provides reliable and accurate results (i.e. response time), Gatus limits the number of
 endpoints/suites that can be evaluated at the same time.
 In other words, even if you have multiple endpoints with the same interval, they are not guaranteed to run at the same time.

@@ -2952,8 +2952,8 @@ endpoints:
 ```

 The `[BODY]` placeholder contains the output of the query, and `[CONNECTED]`
-shows whether the connection was successfully established. You can use Go template 
-syntax. 
+shows whether the connection was successfully established. You can use Go template
+syntax.


 ### Monitoring an endpoint using ICMP
@@ -2970,7 +2970,7 @@ endpoints:
 Only the placeholders `[CONNECTED]`, `[IP]` and `[RESPONSE_TIME]` are supported for endpoints of type ICMP.
 You can specify a domain prefixed by `icmp://`, or an IP address prefixed by `icmp://`.

-If you run Gatus on Linux, please read the Linux section on https://github.com/prometheus-community/pro-bing#linux
+If you run Gatus on Linux, please read the Linux section on [https://github.com/prometheus-community/pro-bing#linux]
 if you encounter any problems.


@@ -3088,7 +3088,7 @@ endpoints:
      - "[CERTIFICATE_EXPIRATION] > 240h"
 ```

-> ⚠ The usage of the `[DOMAIN_EXPIRATION]` placeholder requires Gatus to use RDAP, or as a fallback, send a request to the official IANA WHOIS service 
+> ⚠ The usage of the `[DOMAIN_EXPIRATION]` placeholder requires Gatus to use RDAP, or as a fallback, send a request to the official IANA WHOIS service
 > [through a library](https://github.com/TwiN/whois) and in some cases, a secondary request to a TLD-specific WHOIS server (e.g. `whois.nic.sh`).
 > To prevent the WHOIS service from throttling your IP address if you send too many requests, Gatus will prevent you from
 > using the `[DOMAIN_EXPIRATION]` placeholder on an endpoint with an interval of less than `5m`.
@@ -3117,7 +3117,7 @@ concurrency: 0

 **Use cases for higher concurrency:**
 - You have a large number of endpoints to monitor
- You want to monitor endpoints at very short intervals (< 5s)  
+- You want to monitor endpoints at very short intervals (< 5s)
 - You're using Gatus for load testing scenarios

 **Legacy configuration:**
@@ -3201,7 +3201,7 @@ ui:
  default-sort-by: group
 ```
 Note that if a user has already sorted the dashboard by a different field, the default sort will not be applied unless the user
-clears their browser's localstorage. 
+clears their browser's localstorage.


 ### Exposing Gatus on a custom path
--- a/client/client.go
+++ b/client/client.go
@@ -13,6 +13,7 @@ import (
 	"net"
 	"net/http"
 	"net/smtp"
+	"os"
 	"runtime"
 	"strings"
 	"time"
@@ -343,12 +344,7 @@ func Ping(address string, config *Config) (bool, time.Duration) {
 	pinger := ping.New(address)
 	pinger.Count = 1
 	pinger.Timeout = config.Timeout
-	// Set the pinger's privileged mode to true for every GOOS except darwin
-	// See https://github.com/TwiN/gatus/issues/132
-	//
-	// Note that for this to work on Linux, Gatus must run with sudo privileges.
-	// See https://github.com/prometheus-community/pro-bing#linux
-	pinger.SetPrivileged(runtime.GOOS != "darwin")
+	pinger.SetPrivileged(ShouldRunPingerAsPrivileged())
 	pinger.SetNetwork(config.Network)
 	err := pinger.Run()
 	if err != nil {
@@ -364,6 +360,25 @@ func Ping(address string, config *Config) (bool, time.Duration) {
 	return true, 0
 }

+// ShouldRunPingerAsPrivileged determines whether to run the pinger in privileged mode.
+// Privileged mode is used when running as root, and always on Windows. See https://pkg.go.dev/github.com/macrat/go-parallel-pinger#Pinger.SetPrivileged
+func ShouldRunPingerAsPrivileged() bool {
+	// Set the pinger's privileged mode to false for darwin (unless running as root).
+	// See https://github.com/TwiN/gatus/issues/132
+	// Linux should also be set to false, but there are potential complications.
+	// See https://github.com/TwiN/gatus/pull/748 and https://github.com/TwiN/gatus/issues/697#issuecomment-2081700989
+	//
+	// Note that, in certain cases, Gatus must run with sudo privileges for this to work on Linux.
+	// See https://github.com/prometheus-community/pro-bing#linux
+	if runtime.GOOS == "windows" {
+		return true
+	}
+	// To actually check for cap_net_raw capabilities, we would need to add "kernel.org/pub/linux/libs/security/libcap/cap" to gatus.
+	// Or use a syscall and check for permission errors, but this requires platform specific compilation
+	// As a backstop we can simply check the effective user id and run as privileged when running as root
+	return os.Geteuid() == 0
+}
+
 // QueryWebSocket opens a websocket connection, write `body` and return a message from the server
 func QueryWebSocket(address, body string, headers map[string]string, config *Config) (bool, []byte, error) {
 	const (
--- a/client/client_test.go
+++ b/client/client_test.go
@@ -6,6 +6,8 @@ import (
 	"io"
 	"net/http"
 	"net/netip"
+	"os"
+	"runtime"
 	"testing"
 	"time"

@@ -129,6 +131,33 @@ func TestPing(t *testing.T) {
 	}
 }

+func TestShouldRunPingerAsPrivileged(t *testing.T) {
+	// Don't run in parallel since we're testing system-dependent behavior
+	if runtime.GOOS == "windows" {
+		result := ShouldRunPingerAsPrivileged()
+		if !result {
+			t.Error("On Windows, ShouldRunPingerAsPrivileged() should return true")
+		}
+		return
+	}
+
+	// Non-Windows tests
+	result := ShouldRunPingerAsPrivileged()
+	isRoot := os.Geteuid() == 0
+
+	// Test cases based on current environment
+	if isRoot {
+		if !result {
+			t.Error("When running as root, ShouldRunPingerAsPrivileged() should return true")
+		}
+	} else {
+		// When not root, the function only checks the effective user ID, so it should return false.
+		// Log the result; at a minimum this verifies the function runs without panicking.
+		t.Logf("Non-root privileged result: %v", result)
+	}
+}
+
+
 func TestCanPerformStartTLS(t *testing.T) {
 	type args struct {
 		address     string