[mux] Failed to start mux.service - MUX Cable Container due to start-pre operation timed out #22017

Open
@stepanblyschak

Description

The MUX service failed to start because its ExecStartPre= scripts took longer than 90 sec to complete.

Observed behaviour

2025 Feb 15 16:40:25.527995 r-leopard-72 INFO systemd[1]: Starting mux.service - MUX Cable Container...
2025 Feb 15 16:41:55.639499 r-leopard-72 WARNING systemd[1]: mux.service: start-pre operation timed out. Terminating.
2025 Feb 15 16:41:55.646631 r-leopard-72 WARNING systemd[1]: mux.service: Control process exited, code=killed, status=15/TERM
2025 Feb 15 16:41:58.808679 r-leopard-72 WARNING write_standby: Applying state to interfaces {'Ethernet0': 'standby', 'Ethernet16': 'standby', 'Ethernet160': 'standby', 'Ethernet168': 'standby', 'Ethernet176': 'standby', 'Ethernet184': 'standby', 'Ethernet192': 'standby', 'Ethernet200': 'standby', 'Ethernet208': 'standby', 'Ethernet216': 'standby', 'Ethernet224': 'standby', 'Ethernet232': 'standby', 'Ethernet24': 'standby', 'Ethernet240': 'standby', 'Ethernet32': 'standby', 'Ethernet40': 'standby', 'Ethernet48': 'standby', 'Ethernet56': 'standby', 'Ethernet64': 'standby', 'Ethernet72': 'standby', 'Ethernet8': 'standby', 'Ethernet80': 'standby', 'Ethernet88': 'standby'}
2025 Feb 15 16:41:58.825119 r-leopard-72 WARNING systemd[1]: mux.service: Failed with result 'timeout'.
2025 Feb 15 16:41:58.825726 r-leopard-72 ERR systemd[1]: Failed to start mux.service - MUX Cable Container.

mux.service.j2:

User={{ sonicadmin_user }}
ExecStartPre=/usr/local/bin/write_standby.py
ExecStartPre=/usr/local/bin/mark_dhcp_packet.py
ExecStartPre=/usr/bin/{{docker_container_name}}.sh start
ExecStart=/usr/bin/{{docker_container_name}}.sh wait
ExecStop=/usr/bin/{{docker_container_name}}.sh stop
ExecStopPost=/usr/local/bin/write_standby.py --shutdown mux
Restart=always
RestartSec=30

The write_standby.py script waits up to 90 sec for the IP tunnel to be created in ASIC DB. As the logs show, the tunnel was created and the script wrote the standby config to APPL_DB:

2025 Feb 15 16:41:58.808679 r-leopard-72 WARNING write_standby: Applying state to interfaces {'Ethernet0': 'standby', 'Ethernet16': 'standby', 'Ethernet160': 'standby', 'Ethernet168': 'standby', 'Ethernet176': 'standby', 'Ethernet184': 'standby', 'Ethernet192': 'standby', 'Ethernet200': 'standby', 'Ethernet208': 'standby', 'Ethernet216': 'standby', 'Ethernet224': 'standby', 'Ethernet232': 'standby', 'Ethernet24': 'standby', 'Ethernet240': 'standby', 'Ethernet32': 'standby', 'Ethernet40': 'standby', 'Ethernet48': 'standby', 'Ethernet56': 'standby', 'Ethernet64': 'standby', 'Ethernet72': 'standby', 'Ethernet8': 'standby', 'Ethernet80': 'standby', 'Ethernet88': 'standby'}

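However, mux.service itself still failed: systemd had already reported the start-pre timeout at 16:41:55, about 3 sec before the script logged that it was applying the standby state at 16:41:58.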
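
To make the timing explicit, here is a small Python sketch that computes the intervals from the syslog timestamps quoted above. Only the three timestamps come from the logs; the variable and function names are illustrative.

```python
from datetime import datetime

# Timestamp layout used by the syslog lines quoted in this issue.
FMT = "%Y %b %d %H:%M:%S.%f"

# Values copied from the log excerpt above.
start = datetime.strptime("2025 Feb 15 16:40:25.527995", FMT)          # "Starting mux.service"
timed_out = datetime.strptime("2025 Feb 15 16:41:55.639499", FMT)      # "start-pre operation timed out"
state_applied = datetime.strptime("2025 Feb 15 16:41:58.808679", FMT)  # write_standby "Applying state"


def elapsed(a, b):
    """Return the number of seconds between two datetimes."""
    return (b - a).total_seconds()


print(f"start -> timeout:        {elapsed(start, timed_out):.3f} s")
print(f"start -> state applied:  {elapsed(start, state_applied):.3f} s")
print(f"timeout -> state applied: {elapsed(timed_out, state_applied):.3f} s")
```

The start-pre phase ran for just over 90 sec before systemd terminated it, and the standby state was applied roughly 3 sec after that.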

Expected behaviour

mux.service should start without failure. It should apply this config in response to the tunnel-created event instead of polling with a timeout.
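
As a rough illustration of the event-driven approach, the following Python sketch applies the config once a "tunnel created" notification arrives, rather than polling in a loop. It is not SONiC code: `TunnelWatcher`, `on_tunnel_created`, and `apply_standby_when_ready` are hypothetical names, and `threading.Event` stands in for whatever database notification mechanism would actually be used.

```python
import threading


class TunnelWatcher:
    """Hypothetical stand-in for a subscription to tunnel-creation events."""

    def __init__(self):
        self._created = threading.Event()

    def on_tunnel_created(self):
        # Would be called by the notification source when the tunnel appears.
        self._created.set()

    def wait(self, timeout=None):
        # Blocks until the event fires; returns False if the timeout expires first.
        return self._created.wait(timeout)


def apply_standby_when_ready(watcher, apply_fn, timeout=None):
    """Run apply_fn once the tunnel exists; return whether it was applied."""
    if watcher.wait(timeout):
        apply_fn()
        return True
    return False


# Demo: simulate the tunnel being created shortly after startup.
watcher = TunnelWatcher()
applied = []
threading.Timer(0.05, watcher.on_tunnel_created).start()
ok = apply_standby_when_ready(watcher, lambda: applied.append("standby"), timeout=2.0)
print(ok, applied)
```

In this shape the waiting no longer has to finish inside the ExecStartPre= window, so a slow tunnel creation would delay the config rather than fail the service start.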

Version

202411 hash 4e5026e48
