MTBF stands for Mean Time Between Failures.

MTBF numbers are a statistical estimate of how long, on average, the devices in a set should operate before failure. They are not useful for predicting when a specific device will fail. MTBF numbers are usually stated in hours.
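As a rough illustration of reading MTBF at the level of a whole set of devices, the sketch below estimates how many failures to expect in a fleet. It assumes a constant failure rate, which is a simplification, and the fleet size, MTBF figure, and function name are hypothetical.

```python
def expected_failures(device_count, hours_in_service, mtbf_hours):
    """Expected failures across a fleet, assuming a constant failure rate."""
    return device_count * hours_in_service / mtbf_hours

# Hypothetical fleet: 1,000 devices running around the clock for one year.
HOURS_PER_YEAR = 24 * 365
failures = expected_failures(1000, HOURS_PER_YEAR, 1_000_000)
print(f"Expected failures in one year: {failures:.2f}")
```

Under these assumptions roughly nine of the 1,000 devices would be expected to fail in a year, but the figure says nothing about which ones or when.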

MTBF is often erroneously described as Mean Time Before Failure.

Environmental Factors and MTBF

MTBF numbers can be radically altered by environmental factors such as improper power and cooling. MTBF numbers can also be lowered by excessive vibration or by many forms of misuse.

For example, hard drives installed in notebook PCs have lower actual MTBFs than hard drives in desktop PCs due to environmental factors.

Defining Time in MTBF

The time used in MTBF may not be actual clock time; it may be the time during which the system is actually in use.

A machine that is operated 8 hours a day may last three times as long, in calendar time, as an identical machine that is operated 24 hours a day. The MTBF for these machines would be the same, because both accumulate the same number of in-service hours before failure.
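The arithmetic behind this comparison can be sketched as follows. The 24,000-hour figure and the function name are hypothetical and chosen only to make the numbers easy to follow.

```python
def calendar_days_to_reach(operating_hours, hours_per_day):
    """Calendar days needed to accumulate a given number of in-service hours."""
    return operating_hours / hours_per_day

# Hypothetical: both machines fail after 24,000 in-service hours.
IN_SERVICE_HOURS = 24_000
days_part_time = calendar_days_to_reach(IN_SERVICE_HOURS, 8)    # 8 hours a day
days_full_time = calendar_days_to_reach(IN_SERVICE_HOURS, 24)   # 24 hours a day

print(f"8 h/day:  {days_part_time:.0f} calendar days")
print(f"24 h/day: {days_full_time:.0f} calendar days")
print(f"Ratio:    {days_part_time / days_full_time:.1f}")
```

Both machines log the same in-service hours, so the MTBF is unchanged; only the calendar time differs, by a factor of three.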

The MTBF Curve

In the real world, MTBF numbers are not steady over the lifetime of a set of devices.

Systems tend to experience a high failure rate during an initial burn-in period. During this period, manufacturing defects lead to a large number of early system failures.

During the operational lifespan of a set of devices, the MTBF numbers improve. Failures become much rarer, because the systems that failed early have been repaired or replaced.

Eventually, the devices will begin to wear out. MTBF numbers will steadily grow worse until the devices are replaced with new equipment.
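One common way to model this three-phase pattern is to add together Weibull hazard (failure rate) functions with different shape parameters. The sketch below is only an illustration of that idea: the Weibull model is an assumption not taken from the text above, and every shape and scale value is hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Weibull failure rate at time t (t > 0), in failures per hour."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def combined_failure_rate(t):
    """Sum of three competing failure modes (all parameters hypothetical)."""
    early = weibull_hazard(t, 0.5, 2_000.0)     # shape < 1: rate falls over time
    steady = weibull_hazard(t, 1.0, 50_000.0)   # shape = 1: constant rate
    wear = weibull_hazard(t, 5.0, 40_000.0)     # shape > 1: rate rises over time
    return early + steady + wear

# Sample the rate early in life, in mid-life, and late in life.
for hours in (100, 10_000, 40_000):
    print(f"{hours:>6} h: {combined_failure_rate(hours):.6f} failures/hour")
```

With these parameters the failure rate is highest at 100 hours, lowest at 10,000 hours, and higher again at 40,000 hours, matching the early-failure, operational, and wear-out phases described above.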

Calculating MTBF

When a product has an estimated MTBF of one million hours, that does not mean that the product was tested for one million hours. MTBF numbers are generally derived by estimating the MTBF of individual components and by drawing on past experience with similar products.
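As a minimal sketch of how component estimates can be combined, the example below assumes that every component has a constant failure rate and that the failure of any one component fails the whole product (a series system). Under those assumptions the failure rates add. The component names and MTBF values are hypothetical.

```python
def series_mtbf(component_mtbfs):
    """MTBF of a series system, assuming constant component failure rates."""
    total_failure_rate = sum(1.0 / mtbf for mtbf in component_mtbfs)
    return 1.0 / total_failure_rate

# Hypothetical component MTBFs, in hours.
components = {
    "power supply": 500_000,
    "controller board": 1_000_000,
    "fan": 1_000_000,
}
product_mtbf = series_mtbf(components.values())
print(f"Estimated product MTBF: {product_mtbf:,.0f} hours")
```

Note that the combined figure (250,000 hours here) is lower than the MTBF of any single component, since each added component is another way for the product to fail.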

MTBF numbers can be generated for a datacenter or IT shop by looking at a historical record of service outages.
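A minimal sketch of that calculation is shown below: total time in service during the observation period divided by the number of outages. The function name, the observation period, and the outage records are all hypothetical, and a real outage log would need its own parsing.

```python
from datetime import datetime

def mtbf_from_outages(period_start, period_end, outages):
    """Observed MTBF in hours: uptime divided by the number of outages.

    outages is a list of (outage_start, outage_end) datetime pairs.
    """
    if not outages:
        raise ValueError("at least one outage is needed to compute MTBF")
    total_hours = (period_end - period_start).total_seconds() / 3600
    down_hours = sum((end - start).total_seconds() for start, end in outages) / 3600
    return (total_hours - down_hours) / len(outages)

# Hypothetical 30-day observation period with three recorded outages.
outage_log = [
    (datetime(2023, 1, 5, 10, 0), datetime(2023, 1, 5, 12, 0)),   # 2 hours
    (datetime(2023, 1, 14, 1, 0), datetime(2023, 1, 14, 5, 0)),   # 4 hours
    (datetime(2023, 1, 25, 8, 0), datetime(2023, 1, 25, 14, 0)),  # 6 hours
]
observed = mtbf_from_outages(datetime(2023, 1, 1), datetime(2023, 1, 31), outage_log)
print(f"Observed MTBF: {observed:.0f} hours")
```

Here the service was up for 708 of the 720 hours and failed three times, giving an observed MTBF of 236 hours. A short observation period with few outages gives a noisy estimate.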