Being aware of the myriad ways in which network I/O can fail goes a long way toward creating a robust system and diagnosing problems. Continue reading Perils of Network I/O
Category: Reliability
Perils of Disk I/O
Being aware of the myriad ways in which disk I/O can fail goes a long way toward creating a robust system and diagnosing problems.
Memory Errors
When you write to a memory cell, will you always read back the same value? The vast majority of the time the answer is yes. For an individual using a personal computer the odds and consequences of failure are small enough to be irrelevant. But in some systems it can be a regular concern.
Watchdog
Systems are designed not to fail, but they inevitably do. Even if your own contribution to a project is solid, complex systems will have aspects outside of your control. Perhaps a third-party component leaks memory. Or the ionizing radiation of a cosmic ray from a galaxy far, far away flips a bit.
If we cannot prevent a failure, we can try to detect it and take corrective action. Microcontrollers commonly provide a watchdog timer that does just that: it resets the system unless the software periodically signals that it is still running. We can use similar concepts in higher-level software.
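As a rough illustration of the idea in higher-level software, here is a minimal sketch of a software watchdog in Python. The `Watchdog` class, its `kick` method, and the `demo` function are hypothetical names chosen for this example, not part of any library; only the standard `threading` and `time` modules are used.

```python
import threading
import time


class Watchdog:
    """Hypothetical helper: calls on_timeout unless kick() arrives in time."""

    def __init__(self, timeout, on_timeout):
        self.timeout = timeout          # seconds allowed between kicks
        self.on_timeout = on_timeout    # corrective action to run on expiry
        self._lock = threading.Lock()
        self._timer = None

    def start(self):
        self.kick()

    def kick(self):
        """Restart the countdown; healthy code calls this periodically."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.timeout, self.on_timeout)
            self._timer.daemon = True   # do not keep the process alive
            self._timer.start()

    def stop(self):
        """Cancel the countdown, e.g. during a clean shutdown."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None


def demo():
    expired = threading.Event()
    dog = Watchdog(0.2, expired.set)
    dog.start()

    # Healthy phase: work finishes and kicks well within the timeout.
    for _ in range(3):
        time.sleep(0.05)
        dog.kick()
    healthy = not expired.is_set()

    # Simulated hang: no more kicks, so the watchdog should fire.
    fired = expired.wait(1.0)
    dog.stop()
    return healthy, fired


if __name__ == "__main__":
    print(demo())
```

Here the corrective action only sets an event so the behavior is easy to observe. In a real service the callback would more likely log the failure and exit the process so that an external supervisor can restart it, since a timer thread inside a wedged process cannot always be trusted to recover it.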