A watchdog is a special chip, circuit or part of a microcomputer chip. Its purpose is to reset
the product if the product ever becomes 'lost', stuck or erratic. Generally this
circuit is used in embedded computer systems like those in a microwave oven or a piece
of medical equipment such as a blood pressure monitor (rather than in a
consumer PC or laptop). Part of a watchdog system is computer software to monitor correct
operation of the product. The computer watches the system and the watchdog watches
the computer. Sometimes the watchdog is also called a watchdog timer because
of the electrical circuit used to implement it.
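The relationship between the software and the timer can be illustrated with a small simulation. This is only a sketch under stated assumptions: a real watchdog is a hardware counter, and the names SimulatedWatchdog, kick, tick and run are hypothetical ones chosen for this example. Healthy software restarts the countdown on every pass of its main loop; if it hangs, the countdown expires and a reset is forced.

```python
class SimulatedWatchdog:
    """Toy model of a watchdog timer (real ones are hardware counters)."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks
        self.reset_count = 0

    def kick(self):
        """Called by healthy software to restart the countdown."""
        self.remaining = self.timeout_ticks

    def tick(self):
        """Advance one clock tick; return True if a reset was forced."""
        self.remaining -= 1
        if self.remaining <= 0:
            self.reset_count += 1
            self.remaining = self.timeout_ticks  # countdown restarts after a reset
            return True
        return False


def run(watchdog, total_ticks, hang_at=None):
    """Simulate a main loop that kicks the watchdog until it hangs at hang_at."""
    resets = []
    hung = False
    for t in range(total_ticks):
        if t == hang_at:
            hung = True  # assumed fault: the software stops servicing the timer
        if not hung:
            watchdog.kick()
        if watchdog.tick():
            resets.append(t)
            hung = False  # the reset returns the software to a known state
    return resets
```

In this model, calling run(SimulatedWatchdog(3), 10, hang_at=4) returns the tick numbers at which a reset was forced, while a run with no hang returns an empty list.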
Why do we need a watchdog timer to reset a product? If a piece of medical equipment
or other embedded equipment were to hang or get stuck it could present a threat
in an emergency. A PC application freezing up, such as a word processor, is
generally more benign. Watchdogs, or watchdog timers, are frequently used in critical equipment to help monitor the correct
operation of the microcomputer chip. Watchdog timers are used in non-critical consumer
equipment also. Here they are designed in to provide the customer with a better product
experience. A PC could benefit from a watchdog timer in many dedicated applications,
but they are not generally used because the office applications that typically run
on PCs are not designed to implement them and most office applications do not
require them.
Why would equipment operation freeze, hang or become erratic? Embedded equipment
can freeze or get stuck for a number of reasons. An electrostatic discharge may
cause this in equipment not specifically designed to resist such events. An electrostatic
discharge can occur when you rub your feet across a rug and touch a product in
dry weather. It can also occur around high-voltage equipment. In general there has
been more and more pressure to put more circuitry in smaller packages. With the
smaller packages comes the difficulty of designing products that are ESD resistant,
hence the impetus to let a watchdog timer recover from the event rather than prevent
it with an ESD-resistant design.
Another event that may be of concern is inappropriate connections to equipment,
or data input that the equipment was not designed to handle. These can cause
the computer programs or other hardware in equipment with an embedded processor to hang or freeze. Hardware
failures in the computer chip or system can also cause the microcontrollers in equipment
to malfunction or hang. A watchdog timer may be able to reset the system out
of this problem or provide diagnostics.
Finally, programming errors or bugs can be 'caught'. The most difficult types of
bugs to catch are those that occur very infrequently and cause some catastrophic
problem, such as the product freezing. A functioning watchdog timer may be able to reset out of such an event
without the customer even knowing that there was a problem.
This brings us to another issue with watchdog timers.
What happens after the watchdog timer detects that the microchip or other parts
of the product are not functioning properly? These circuits are hardwired to first
cause a system-level reset. This is to bring the system back to a known state. It is
important for the whole system to reset together so that all parts of the equipment are
in 'sync'. After this reset the design of the equipment can proceed in one of two directions,
with many intermediate possibilities between them. The two directions are:
- Stop the equipment. Something bad happened. Possibly inform the customer that the
equipment is malfunctioning.
- Try to proceed as if nothing happened from where the equipment was before the watchdog
event. If this is designed in up front, the customer may never know that anything unusual
happened.
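The two directions can be sketched as a decision made at startup. This is an illustrative sketch, not a prescribed design: the names ResetCause, Policy and startup_action are hypothetical, and it assumes the product saved a checkpoint of its state before the event. When no checkpoint is available, the sketch falls back to stopping, since there is nothing to resume from.

```python
from enum import Enum


class ResetCause(Enum):
    POWER_ON = "power_on"
    WATCHDOG = "watchdog"


class Policy(Enum):
    HALT = "halt"      # stop the equipment and inform the customer
    RESUME = "resume"  # try to continue from where the equipment was


def startup_action(cause, policy, checkpoint=None):
    """Return an (action, state) pair describing what to do after a reset."""
    if cause is not ResetCause.WATCHDOG:
        return ("start_fresh", None)
    # A watchdog reset occurred. Resuming is only possible if a checkpoint exists.
    if policy is Policy.HALT or checkpoint is None:
        return ("halt_and_notify", None)
    return ("resume", checkpoint)
```

A real product would likely sit somewhere between these two extremes, for example resuming some operating modes and halting in others.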
The exact system response to a watchdog event, and the choices available, depend on the product
and application. For instance, in some products it may be critical that certain
tasks continue uninterrupted in the case of a watchdog event. Be aware that
it may not be possible for a watchdog to resume operation in 100% of the product's operating modes,
or it may not be economically feasible to even try. On the other hand, a properly
implemented watchdog function can be a strong selling point for a product. Although the consumer may not
specifically care about a watchdog circuit, they may appreciate the resulting security and confidence
if the benefit is properly explained.
In many systems diagnostics are collected and stored by the computer to help
diagnose watchdog events. This is of benefit to the manufacturer in improving the product.
Important information to collect may be the subroutine or sub-program
the computer chip was running when the problem occurred, external levels, pressures,
and other collected data, internal voltages and the states of internal switches
and nodes. The time and date of the event may be collected and a history can even
be built. The data can be valuable in a redesign or for equipment forensics and liability questions.
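One way to picture such a diagnostic record and its history is sketched here. The field names and the EventHistory class are hypothetical, chosen to mirror the items listed above, and the sketch assumes a small fixed-size log in which the oldest entries are discarded first.

```python
from collections import deque
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class WatchdogEvent:
    """Snapshot taken when a watchdog event is detected."""
    timestamp: datetime
    routine: str  # subroutine that was running when the problem occurred
    external_readings: dict = field(default_factory=dict)  # levels, pressures
    internal_voltages: dict = field(default_factory=dict)
    switch_states: dict = field(default_factory=dict)


class EventHistory:
    """Bounded log of watchdog events; the oldest entries drop off first."""

    def __init__(self, max_events=16):
        self._events = deque(maxlen=max_events)

    def record(self, event):
        self._events.append(event)

    def __len__(self):
        return len(self._events)

    def routines_by_frequency(self):
        """List (routine, count) pairs, most frequent first."""
        counts = {}
        for event in self._events:
            counts[event.routine] = counts.get(event.routine, 0) + 1
        return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Ranking the routines by how often they appear in the history is one simple way a manufacturer might decide where to look first in a redesign.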
Perhaps in your next design you will consider including a watchdog timer and properly
applying it to the chip and system.