Systems are designed not to fail, but they inevitably do. Even if your own contribution to a project is solid, complex systems have aspects outside your control: perhaps a memory leak in a third-party component, or a cosmic ray from a galaxy far, far away whose ionizing radiation flips a bit.
If we cannot prevent a failure, we can try to detect it and take corrective action. Microcontrollers commonly provide a Watchdog Timer that does just that: a hardware counter that resets the chip unless the software restarts the counter in time. We can use similar concepts in higher-level software.
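To make the idea concrete, here is a minimal sketch of a software watchdog in Python, built on the standard library's `threading.Timer`. The names `Watchdog`, `kick`, `on_timeout`, and `demo` are made up for this illustration and do not correspond to any particular microcontroller or library API. Unlike a hardware watchdog, which typically resets the whole chip, this one only calls a function you supply when the deadline passes.

```python
import threading
import time


class Watchdog:
    """Hypothetical software watchdog: runs on_timeout unless kick() arrives in time."""

    def __init__(self, timeout_s, on_timeout):
        self._timeout_s = timeout_s
        self._on_timeout = on_timeout
        self._lock = threading.Lock()
        self._timer = None

    def _arm(self):
        # Each countdown is a fresh one-shot timer running on its own thread.
        self._timer = threading.Timer(self._timeout_s, self._on_timeout)
        self._timer.daemon = True  # do not keep the process alive just for the timer
        self._timer.start()

    def start(self):
        with self._lock:
            if self._timer is None:
                self._arm()

    def kick(self):
        """Signal that the monitored code is still alive; restarts the countdown."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._arm()

    def stop(self):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None


def demo():
    expired = threading.Event()
    dog = Watchdog(timeout_s=0.5, on_timeout=expired.set)
    dog.start()

    # Healthy phase: the work loop kicks the watchdog well within the timeout.
    for _ in range(3):
        time.sleep(0.05)
        dog.kick()
    fired_while_healthy = expired.is_set()

    # Simulated hang: no more kicks, so the watchdog should expire.
    hang_detected = expired.wait(timeout=2.0)
    dog.stop()
    return fired_while_healthy, hang_detected


if __name__ == "__main__":
    print(demo())
```

In a real service the `on_timeout` callback would take the corrective action, for example logging the stall and restarting a worker. One limitation of this sketch is that the timer lives in the same process as the code it monitors, so it cannot catch a hang of the whole process the way an independent hardware timer can.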