NAME
upsmon - UPS monitor and shutdown controller
SYNOPSIS
upsmon -h
upsmon -c command [-P pid]
upsmon [-D] [-F | -B] [-K] [-p] [-u user]
DESCRIPTION
upsmon is the client process that is responsible for the most important part of UPS monitoring—shutting down the system when the power goes out. It can call out to other helper programs for notification purposes during power events.
upsmon can monitor multiple systems using a single process. Every UPS that is defined in the upsmon.conf(5) configuration file is assigned a power value and a type (primary or secondary).
OPTIONS
- -h
-
Display the help message.
- -c command
-
Send the command command to the existing upsmon process. Valid commands are:
- fsd
-
shutdown all primary-mode UPSes (use with caution)
- stop
-
stop monitoring and exit
- reload
-
reread upsmon.conf(5) configuration file. See "reloading nuances" below if this doesn’t work.
- -P pid
-
Send the command signal above using specified PID number, rather than consulting the PID file. This can help define service units which start main
upsmon
as a foreground process so it does not have to rely on a PID file. - -D
-
Raise the debugging level. upsmon will run in the foreground by default, and will print information on stdout about the monitoring process. Use this option multiple times for more details.
- -F
-
upsmon will run in the foreground, regardless of debugging settings.
- -B
-
upsmon will run in the background, regardless of debugging settings.
- -K
-
Test for the shutdown flag. If it exists and contains the magic string from upsmon, then upsmon will exit with
EXIT_SUCCESS
. Any other condition will make upsmon exit withEXIT_FAILURE
.You can test for a successful exit from
upsmon -K
in your shutdown scripts to know when to call upsdrvctl(8) to shut down the UPS. - -p
-
Run privileged all the time. Normally upsmon will split into two processes. The majority of the code runs as an unprivileged user, and only a tiny stub runs as root. This switch will disable that mode, and run the old "all root all the time" system.
This is not the recommended mode, and you should not use this unless you have a very good reason.
- -u user
-
Set the user for the unprivileged monitoring process. This has no effect when using -p.
The default user is set at configure time with configure --with-user=…. Typically this is nobody, but other distributions will probably have a specific nut user for this task. If your notification scripts need to run as a specific user, set it here.
You can also set this in the upsmon.conf(5) file with the RUN_AS_USER directive.
UPS DEFINITIONS
In the upsmon.conf(5), you must specify at least one UPS that will be monitored. Use the MONITOR directive.
MONITOR 'system' 'powervalue' 'username' 'password' 'type'
The system refers to a upsd(8) server, in the form
upsname[@hostname[:port]]
. The default hostname is "localhost". Some
examples follow:
-
"su700@mybox" means a UPS called "su700" on a system called "mybox". This is the normal form.
-
"fenton@bigbox:5678" is a UPS called "fenton" on a system called "bigbox" which runs upsd(8) on port "5678".
The powervalue refers to how many power supplies on this system are being driven this UPS. This is typically set to 1, but see the section on power values below.
The username is a section in your upsd.users(5) file. Whatever password you set in that section must match the password set in this file.
The type set in that section must also match the type here — primary or secondary. In general, a primary process is one running on the system with the UPS actually plugged into a serial port, and a secondary is drawing power from the UPS but can’t talk to it directly. See the section on UPS types for more.
NOTIFY EVENTS
upsmon senses several events as it monitors each UPS. They are called notify events as they can be used to tell the users and admins about the change in status. See the additional NOTIFY-related sections below for information on customizing the delivery of these messages.
- ONLINE
-
The UPS is back on line.
- ONBATT
-
The UPS is on battery.
- LOWBATT
-
The UPS battery is low (as determined by the driver).
- FSD
-
The UPS has been commanded into the "forced shutdown" mode.
- COMMOK
-
Communication with the UPS has been established.
- COMMBAD
-
Communication with the UPS was just lost.
- SHUTDOWN
-
The local system is being shut down.
- REPLBATT
-
The UPS needs to have its battery replaced.
- NOCOMM
-
The UPS can’t be contacted for monitoring.
- NOPARENT
-
upsmon
parent process died - shutdown impossible. - CAL
-
UPS calibration in progress.
- NOTCAL
-
UPS calibration finished.
- OFF
-
UPS administratively OFF or asleep.
- NOTOFF
-
UPS no longer administratively OFF or asleep.
- BYPASS
-
UPS on bypass (powered, not protecting).
- NOTBYPASS
-
UPS no longer on bypass.
- ECO
-
UPS in ECO or similar mode (as defined and named by vendor); other names include High Efficiency mode and Energy Saver System.
For example, Eaton docs define High Efficiency mode as a sort of hardware-monitored bypass with switch-over under 10ms into Online mode in case of troubles (so what was known as Line-Interactive in the older days), which can be chosen instead of always running in a dual-conversion mode (safer, but more wasteful and with a toll on battery life). Older devices only implemented one or the other.
- NOTECO
-
UPS no longer in ECO mode (see above).
- ALARM
-
UPS has one or more active alarms (look at
ups.alarm
for details). - NOTALARM
-
UPS is no longer in an alarm state (no active alarms).
- OTHER
-
UPS has at least one unclassified status token.
- NOTOTHER
-
UPS has no unclassified status tokens anymore.
- SUSPEND_STARTING
-
OS is entering sleep/suspend/hibernate mode.
- SUSPEND_FINISHED
-
OS just finished sleep/suspend/hibernate mode, de-activating obsolete UPS readings to avoid an unfortunate shutdown.
NOTIFY COMMAND
In upsmon.conf(5), you can configure a program called the NOTIFYCMD that will handle events that occur.
NOTIFYCMD
"path to program"
NOTIFYCMD "/usr/local/bin/notifyme"
Remember to wrap the path in "quotes" if it contains any spaces.
The program you run as your NOTIFYCMD can use the environment variables NOTIFYTYPE and UPSNAME to know what has happened and on which UPS. It also receives the notification message (see below) as the first (and only) argument, so you can deliver a pre-formatted message too.
Note that the NOTIFYCMD will only be called for a given event when you set the EXEC flag by using the notify flags, as detailed below.
NOTIFY FLAGS
By default, all notify events (see above) generate a global message (wall) to all users, plus they are logged via the syslog. Except for Windows where upsmon only writes to the syslog by default. You can change this with the NOTIFYFLAG directive in the configuration file:
NOTIFYFLAG
notifytype flags
Examples:
-
NOTIFYFLAG ONLINE SYSLOG
-
NOTIFYFLAG ONBATT SYSLOG+WALL
-
NOTIFYFLAG LOWBATT SYSLOG+WALL+EXEC
The flags that can be set on a given notify event are:
- SYSLOG
-
Write this message to the syslog.
- WALL
-
Send this message to all users on the system via wall(1).
- EXEC
-
Execute the NOTIFYCMD.
- IGNORE
-
Don’t do anything. If you use this, don’t use any of the other flags.
You can mix these flags. "SYSLOG+WALL+EXEC" does all three for a given event.
NOTIFY MESSAGES
upsmon comes with default messages for each of the NOTIFY events. These can be changed with the NOTIFYMSG directive.
NOTIFYMSG
type "message"
Examples:
-
NOTIFYMSG ONLINE "UPS %s is getting line power"
-
` NOTIFYMSG ONBATT "Someone pulled the plug on %s"`
The first instance of %s is replaced with the identifier of the UPS that generated the event. These messages are used when sending walls to the users directly from upsmon, and are also passed to the NOTIFYCMD.
POWER VALUES
The "current overall power value" is the sum of all UPSes that are currently able to supply power to the system hosting upsmon. Any UPS that is either on line or just on battery contributes to this number. If a UPS is critical (on battery and low battery) or has been put into "forced shutdown" mode, it no longer contributes.
A "power value" on a MONITOR line in the config file is the number of power supplies that the UPS runs on the current system.
MONITOR
upsname powervalue username password type
Normally, you only have one power supply, so it will be set to 1.
MONITOR myups@myhost 1 username mypassword primary
On a large server with redundant power supplies, the power value for a UPS may be greater than 1. You may also have more than one of them defined.
MONITOR ups-alpha@myhost 2 username mypassword primary
MONITOR ups-beta@myhost 2 username mypassword primary
You can also set the power value for a UPS to 0 if it does not supply any power to that system. This is generally used when you want to use the upsmon notification features for a UPS even though it’s not actually running the system that hosts upsmon. Don’t set this to "primary" unless you really want to power this UPS off when this instance of upsmon needs to shut down for its own reasons.
MONITOR faraway@anotherbox 0 username mypassword secondary
The "minimum power value" is the number of power supplies that must be receiving power in order to keep the computer running.
MINSUPPLIES
value
Typical PCs only have 1, so most users will leave this at the default.
MINSUPPLIES 1
If you have a server or similar system with redundant power, then this value will usually be set higher. One that requires three power supplies to be running at all times would simply set it to 3.
MINSUPPLIES 3
When the current overall power value drops below the minimum power value, upsmon starts the shutdown sequence. This design allows you to lose some of your power supplies in a redundant power environment without bringing down the entire system while still working properly for smaller systems.
UPS TYPES
upsmon and upsd(8) don’t always run on the same system. When they
do, any UPSes that are directly attached to the upsmon host should be
monitored in "primary" mode. This makes upsmon take charge of that equipment,
and it will wait for the "secondary" systems to disconnect before shutting
down the local system. This allows the distant systems (monitoring over
the network) to shut down cleanly before upsdrvctl shutdown
runs
and turns them all off.
When upsmon runs as a secondary, it is relying on the distant system to tell it about the state of the UPS. When that UPS goes critical (on battery and low battery), it immediately invokes the local shutdown command. This needs to happen quickly. Once all secondaries disconnect from the distant upsd(8) server, its primary-mode upsmon will start its own shutdown process. Your secondary systems must all quiesce and shut down before the primary turns off the shared power source, or filesystem damage may result.
upsmon deals with secondaries that get wedged, hang, or otherwise fail to disconnect from upsd(8) in a timely manner with the HOSTSYNC timer. During a shutdown situation, the primary upsmon will give up after this interval and it will shut down anyway. This keeps the primary from sitting there forever (which would endanger that host) if a secondary should break somehow. This defaults to 15 seconds.
If your primary system is shutting down too quickly, set the FINALDELAY interval to something greater than the default 15 seconds. Don’t set this too high, or your UPS battery may run out of power before the primary upsmon process shuts down that system.
TIMED SHUTDOWNS
For those rare situations where the shutdown process can’t be completed between the time that low battery is signalled and the UPS actually powers off the load, use the upssched(8) helper program. You can use it along with upsmon to schedule a shutdown based on the "on battery" event. upssched can then come back to upsmon to initiate the shutdown once it has run on battery too long.
This can be complicated and messy, so stick to the default critical UPS handling if you can.
REDUNDANT POWER SUPPLIES
If you have more than one power supply for redundant power, you may also have more than one UPS feeding your computer. upsmon can handle this. Be sure to set the UPS power values appropriately and the MINSUPPLIES value high enough so that it keeps running until it really does need to shut down.
For example, the HP NetServer LH4 by default has 3 power supplies installed, with one bay empty. It has two power cords, one per side of the box. This means that one power cord powers two power supply bays, and that you can only have two UPSes supplying power.
Connect UPS "alpha" to the cord feeding two power supplies, and UPS "beta" to the cord that feeds the third and the empty slot. Define alpha as a powervalue of 2, and beta as a powervalue of 1. Set the MINSUPPLIES to 2.
When alpha goes on battery, your current overall power value will stay at 3, as it’s still supplying power. However, once it goes critical (on battery and low battery), it will stop contributing to the current overall power value. That means the value will be 1 (beta alone), which is less than 2. That is insufficient to run the system, and upsmon will invoke the shutdown sequence.
However, if beta goes critical, subtracting its contribution will take the current overall value from 3 to 2. This is just high enough to satisfy the minimum, so the system will continue running as before. If beta returns later, it will be re-added and the current value will go back to 3. This allows you to swap out UPSes, change a power configuration, or whatever, as long as you maintain the minimum power value at all times.
MIXED OPERATIONS
Besides being able to monitor multiple UPSes, upsmon can also monitor them as different roles. If you have a system with multiple power supplies serviced by separate UPS batteries, it’s possible to be a primary on one UPS and a secondary on the other. This usually happens when you run out of serial or USB ports and need to do the monitoring through another system nearby.
This is also complicated, especially when it comes time to power down a UPS that has gone critical but doesn’t supply the local system. You can do this with some scripting magic in your notify command script, but it’s beyond the scope of this manual.
FORCED SHUTDOWNS
When upsmon is forced to bring down the local system, it sets the "FSD" (forced shutdown) flag on any UPSes that it is running in primary mode. This is used to synchronize secondary systems in the event that a primary which is otherwise OK needs to be brought down due to some pressing event on the UPS manager system.
You can manually invoke this mode on the system with primary-mode upsmon
by starting another copy of the program with -c fsd
command line argument.
This is useful when you want to initiate a shutdown before the critical
stage through some external means, such as upssched(8).
Warning
|
Please note that by design, since we require power-cycling the load and don’t want some systems to be powered off while others remain running if the "wall power" returns at the wrong moment as usual, the "FSD" flag can not be removed from the data server unless its daemon is restarted. If we do take the first step in critical mode, then we normally intend to go all the way — shut down all the servers gracefully, and power down the UPS. |
Keep in mind that some UPS devices and corresponding drivers would also latch or otherwise resurface the "FSD" state again even if "wall power" is available, but the remaining battery charge is below a threshold configured as "safe" in the device (usually if you manually power on the UPS after a long power outage). This is by design of respective UPS vendors, since in such situation they can not guarantee that if a new power outage happens, their UPS would safely shut down your systems again. So it is deemed better and safer to stay dark until batteries become sufficiently charged.
When it is time to shut down, upsmon creates POWERDOWNFLAG to communicate to the operating system that the UPS should be commanded off late in the shutdown sequence. This file is removed if present when upsmon starts, so that the next normal shutdown does not cause the UPS to be turned off. (The file can’t in general be removed during shutdown because the filesystem might be read only. If the file is in a RAM-backed filesystem, the it won’t be present and the check to remove it won’t fire.)
SIMULATING POWER FAILURES
To test a synchronized shutdown without pulling the plug on your UPS(es), you need only set the forced shutdown (FSD) flag on them. You can do this by calling upsmon again to set the flag, i.e.:
upsmon -c fsd
After that, the primary and the secondary will do their usual shutdown sequence as if the battery had gone critical, while you can time how long it takes for them. This is much easier on your UPS equipment, and it beats crawling under a desk to find the plug.
Note you can also use a dummy SHUTDOWNCMD setting to just report that the systems would shut down at this point, without actually disrupting their work.
Warning
|
after such "dummy" experiments you may have to restart the NUT data
server upsd to clear its "FSD" flag for the devices and clients involved,
and make sure no files named by POWERDOWNFLAG option (e.g. /etc/killpower )
remain on the upsmon primary systems under test. |
DEAD UPSES
In the event that upsmon can’t reach upsd(8), it declares that UPS "dead" after some interval controlled by DEADTIME in the upsmon.conf(5). If this happens while that UPS was last known to be on battery, it is assumed to have gone critical and no longer contributes to the overall power value.
upsmon will alert you to a UPS that can’t be contacted for monitoring with a "NOCOMM" notifier by default every 300 seconds. This can be changed with the NOCOMMWARNTIME setting.
Also upsmon normally reports polling failures for each device that are in place
for each POLLFREQ loop (e.g. "Data stale" or "Driver not connected") to
system log as configured. If your devices are expected to be AWOL for an
extended timeframe, you can use POLLFAIL_LOG_THROTTLE_MAX to reduce the
stress on syslog traffic and storage, by posting these messages only once
in every several loop cycles, and when the error condition has changed or
cleared. A negative value means standard behavior (log on every loop,
effectively same as when max=1
), and a zero value means to never repeat
the message (log only on start and end/change of the failure state).
Note that this throttle only applies to one latest-active error state per monitored device.
UPS ALARMS
UPS manufacturers and UPS drivers may implement device-specific alarms to notify the user about potentially severe conditions of the UPS. The status "ALARM" will be set on such UPS as a common denominator for the different implementations throughout the various UPS drivers and generally anytime that a "ups.alarm" variable is seen reported by the specific UPS driver.
Alarms can be acted upon by the user using the "ALARM" and "NOTALARM" notifiers,
which are reported by the upsmon
when the "ALARM" status occurs or disappears.
As alarms raised by the UPS are usually severe in nature, the upsmon
watches
a UPS in such a state more closely, bumping up the polling frequency as needed.
When connection loss occurs in such an alarm state, the upsmon
will by default
consider the volatile UPS as critical/dead and this may lead to false shutdowns
if the actual alarm was in fact mundane in nature (e.g. caused by a HE/ECO mode).
This can be changed by utilizing the ALARMCRITICAL
setting within upsmon.conf
.
RELOADING NUANCES
upsmon usually gives up root powers for the process that does most of the work, including handling signals like SIGHUP to reload the configuration file. This means your upsmon.conf(8) file must be readable by the non-root account that upsmon switches to.
If you want reloads to work, upsmon must run as some user that has permissions to read the configuration file. I recommend making a new user just for this purpose, as making the file readable by "nobody" (the default user) would be a bad idea.
See the RUN_AS_USER section in upsmon.conf(8) for more on this topic.
Additionally, you can’t change the SHUTDOWNCMD or POWERDOWNFLAG definitions with a reload due to the split-process model. If you change those values, you must stop upsmon and start it back up. upsmon will warn you in the syslog if you make changes to either of those values during a reload.
ENVIRONMENT VARIABLES
NUT_DEBUG_LEVEL sets default debug verbosity if no -D arguments were provided on command line, but does not request that the daemon runs in foreground mode.
NUT_CONFPATH is the path name of the directory that contains
upsmon.conf
and other configuration files. If this variable is not set,
upsmon uses a built-in default, which is often /usr/local/ups/etc
.
FILES
SEE ALSO
Server:
CGI programs:
Internet resources:
The NUT (Network UPS Tools) home page: https://www.networkupstools.org/