I regard debugging and troubleshooting as among the most difficult engineering skill sets, and one that is mostly learned through experience, mistakes, and practice. At the same time, it's easy to have so much experience that you get a little too clever for your own good and miss the obvious.
Like most of you, I have the usual collection of IR-based remote controls by my TV area — excuse me, “media center” — for the display, the DVD, and for other control boxes. When the remote that controls my TV started to act erratically, and then died, I assumed it was the batteries; in this case, a pair of standard AAA cells.
No problem. One of the other remotes also used AAA cells, and was working fine. I did the quick swap (each remote had a different brand of battery, so I didn't have to worry about mixing up the good and bad ones, a common occurrence when doing a good/bad interchange).
After the swap, the remote still didn't work. OK, then, I thought I knew the problem: the remote control had gone bad. It was either an outright failure or it was poor contacts on the pushbuttons, which is a very common problem due to degeneration and oozing of the elastomeric sheet with conductive dots used for the keyboard-backing assembly and pushbuttons.
I was about to crack open the remote's case to look for any obvious problem (unlikely it would actually be visible, but what the heck, why not?) and clean the keyboard assembly with alcohol — something I have done on many remotes, cordless phones, and similar units with these elastomeric keyboards. So I put the batteries from the working remote back in that unit, and surprise: It no longer worked. Now I was really puzzled and ready to take everything apart, figuring that maybe a short circuit in the bad remote had killed those batteries as well.
Luckily, I was interrupted and delayed from my plan. When I got back to the project, I remembered the first rule of debugging: when you are not sure what is going on, do nothing. Stop, think, collect the facts you have already observed, review their timeline, and ask questions. And that's what I did.
Long story short: In one of those oddball coincidences, the batteries in the supposedly good remote were also marginal, but only a little less so than the ones in the remote which originally stopped working. As soon as I put fresh and tested batteries in both units, they worked fine. My plan to open and clean the first unit, or look for an internal short, would have been useless.
Admittedly, in the scheme of things, this debugging problem was pretty trivial, both literally and figuratively. No timing violations, no noise or EMI considerations, no grounding issues, no driver sourcing/sinking problems — just a set of marginal batteries tested against another set of slightly less marginal ones.
But I assumed there was a more serious problem, because I overthought the problem, and linked the observed symptoms to a complicated conclusion based on my previous experience with similar units. The simpler question to ask first would have been, “How do I really know the second set of batteries is good enough?”
Have you ever jumped to conclusions about a simple problem, primarily because your troubleshooting experience caused you to move too far, too fast?
This article was originally published on EBN's sister publication EDN.