Save Those Failed Components

An often overlooked discipline for component engineers is the retrieval of failed components from customer service technicians. Many times when products are sent back for repair, the technicians fail to keep running records of the failed components. Or, worse yet, they fail to push the trouble-causing parts back to the engineers for evaluation.

Valuable information is lost when the same components keep failing and no one initiates further investigation. The technicians unsolder the bad parts and, more often than not, toss them into the waste to keep their work surfaces clean.

In the past, I have alerted customer service to keep any failed components in a separate box and to keep a record of how the component failed and under what conditions. When I examined the box and saw more than one component of the same part number, I would ask the repair technicians if the failure symptoms were the same and how often they had returns for the same problem. If the parts had the identical date code and there were no failures from other date codes, I would call the manufacturer to see if other companies had reported similar issues involving parts with similar date codes. Many times, the manufacturer would ask for the parts back, so it could evaluate them itself. This would not be possible if we had not kept the parts.
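The review of the failed-parts box described above could be sketched in a few lines of Python. This is only an illustration, not a tool the article describes: the `FailedPart` record, the `repeat_failures` function, and every part number, date code, and symptom in the sample log are hypothetical. It groups logged failures by part number and marks the cases where every failure shares one date code, which is the situation that would prompt a call to the manufacturer.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FailedPart:
    """One entry in the technician's failed-parts log (hypothetical layout)."""
    part_number: str
    date_code: str
    symptom: str


def repeat_failures(log, min_count=2):
    """Return a summary of part numbers that failed at least min_count times."""
    by_part = defaultdict(list)
    for entry in log:
        by_part[entry.part_number].append(entry)

    report = {}
    for part_number, entries in by_part.items():
        if len(entries) < min_count:
            continue  # a single failure is not yet a trend
        date_codes = Counter(e.date_code for e in entries)
        symptoms = Counter(e.symptom for e in entries)
        report[part_number] = {
            "count": len(entries),
            "date_codes": dict(date_codes),
            "symptoms": dict(symptoms),
            # True when all failures share one date code
            "single_date_code": len(date_codes) == 1,
        }
    return report


# Hypothetical log entries, for illustration only
log = [
    FailedPart("U7-REG-3V3", "1218", "no output"),
    FailedPart("U7-REG-3V3", "1218", "no output"),
    FailedPart("C12-10UF", "1140", "short"),
    FailedPart("U7-REG-3V3", "1218", "output low"),
]
report = repeat_failures(log)
for part_number, summary in report.items():
    print(part_number, summary)
```

A spreadsheet would serve just as well. The point is that the grouping is only possible when the technicians record the part number, date code, and symptom for every part they remove.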

This best practice continues with the completion of a failure mode analysis request form. When the manufacturer receives the form and the failed parts and acknowledges receipt of both, the official failure mode and effects analysis (FMEA) begins. The component engineer is responsible for tracking the progress of the off-site analysis and informing the key players as information becomes available. Meanwhile, the design engineers are told of the suspect part, so they can decide whether to use components with that date code in prototypes. The materials department is also advised and given the option of purging its work-in-process inventories.

Meanwhile, the assembled boards where the parts are used are considered suspect. If it is possible and practical to do so, parts with different date codes are substituted for the suspect parts. If this seems like a lot of trouble, consider the potential trouble and costs that are being avoided. Even if no action is taken other than beginning the FMEA effort, at least there is a record that an attempt was made to flag what could be a very dangerous situation.

If a component's failure has nothing to do with the lot number or date code, look for modifications made after engineering release that may have changed the stress factors. A slight increase in voltage or current due to a substitute part used elsewhere on the circuit may be pushing your component beyond its specified operating limits.
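As a rough illustration of how a small change elsewhere on the circuit can push a part past its rating, consider the power dissipated in a resistor, P = V²/R. The sketch below is an assumption-laden example, not taken from any real design: the 100 ohm value, the 0.25 W rating, and both voltages are hypothetical, and the function names are invented for this illustration.

```python
def resistor_power(voltage, resistance):
    """Power dissipated in a resistor, P = V^2 / R, in watts."""
    if resistance <= 0:
        raise ValueError("resistance must be positive")
    return voltage ** 2 / resistance


def exceeds_rating(voltage, resistance, rated_power):
    """True when the dissipated power is above the part's rated power."""
    return resistor_power(voltage, resistance) > rated_power


# Hypothetical values: a 100 ohm resistor rated at 0.25 W
RESISTANCE = 100.0
RATED_POWER = 0.25

as_released = exceeds_rating(4.8, RESISTANCE, RATED_POWER)         # about 0.230 W
after_substitution = exceeds_rating(5.2, RESISTANCE, RATED_POWER)  # about 0.270 W

print("Over rating as released:", as_released)
print("Over rating after substitution:", after_substitution)
```

In this made-up case, a 0.4 V rise (roughly 8 percent) raises the dissipation by about 17 percent, because power scales with the square of the voltage. That is enough to move the part from inside its rating to outside it.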

In reality, there could be a dozen causes for the failure, but when the same component part number keeps failing in a device over and over again, there is a common cause that should be identified as soon as possible. Having customer service keep the errant parts is the only way to make the failure analysis study credible.

If your company is not doing this, consider implementing this simple direction. This will allow you not only to catch the failure trends now, but also to avoid repeat performances in the future. Customer service will appreciate that its reports are valuable resources that will reduce failure rates, returns, and unhappy customers.

9 comments on “Save Those Failed Components”

  1. Barbara Jorgensen
    November 13, 2012

    Douglas: I would think the component makers would embrace and even encourage failure feedback. Think of how much they could learn from that kind of input. For example, when one of our cell phones stops working, we are required to send it back before our insurance/replacement policy kicks in. I hope that's being used for analysis somewhere to build a better phone. They even provide the box and pre-paid postage. Would a similar model work at the component level?

  2. SP
    November 13, 2012

    Yes, FMEA is quite good to do for these failed components. But do companies have that much time in reality? They just want their orders to get completed.

  3. dalexander
    November 13, 2012

    @Barbara, Failure Analysis is a very well developed area of all manufacturing disciplines. Aircraft malfunctions, automobile accidents, and general component and product failures are key sources for learning how to ruggedize or develop new products. In most cases the OEM needs the failed parts to perform the root cause analysis (RCA) in order to get to the real cause of the failure. For semiconductors, decapsulation and etching single layers of deposited materials on the surface of the wafer help to localize the failure area. The failed component's physical examination is absolutely necessary to determine root cause. It is a very engaging area of research as the operator is unravelling mystery after mystery through the processes of empirical scientific investigation methodologies. I have a very high regard for FA people. The skill set and knowledge are extensive.

  4. dalexander
    November 13, 2012

    @SP The time it takes to determine the root cause of a failure is time well spent, as it will actually promote higher yield at the production level. If a component that is prone to fail under normal operating conditions is not caught and examined, then the component will be used in volume production and will likely produce many field failures as well. The company has to take the time to examine every repeat failure. You should be looking for multiple failures by the same component. The rework techs can really help identify these and, as a result, lower the return rate as well.

  5. SP
    November 14, 2012

    @Douglas, I am in complete agreement with you. Actually, when we buy a component, say a crystal or an IC or any other component, we expect it to have passed a quality check at the manufacturer. But I agree that sometimes, or many times, although the components are fully certified by the manufacturer, they still fail when they are put on the board. Sometimes it's a component issue, and many times it's a design fault in the associated circuitry. But yes, I completely agree that it saves time if we test it out on a prototype or a few boards before we get into production.

  6. SemiMike
    November 14, 2012

    Could this issue be resolved by better use of tracking code captures to a cloud-based fuzzy data shared system? Not knowing which components in a sub-system were defective, could it be that sub-system makers have or could have a component ID tracking system that relates to their sub-system tracing ID? Could that information not be uploaded to a shared DB (shared with that sub-system vendor's component vendors only, to present an opportunity to cross check lot codes etc.)? If many sub-system vendors had such systems, then “crowd-sourcing” could highlight chronic issues, and in end user interest, a team of component makers and their board customers COULD escalate non-random issues without major FA effort at that point.

  7. SemiMike
    November 14, 2012

    Another approach, supportive of the “save the failures” idea, is for salvage companies to perhaps provide that component ID DB input, regardless of any FA effort, in exchange for perhaps a “membership” fee to view that data?

  8. dalexander
    November 15, 2012

    @MClayton, that is the kind of thinking that requires a value-effort trade-off evaluation. If there was a central depository with restricted access for “suspect” components, then there would have to be a real-time feedback loop that also removes parts that failed due to workmanship or operator errors. I think if you just had a “Yelp” type site open to the general public, then it could be pretty effective, as the front end could be managed like a Wikipedia where you take all information with a grain of salt. But if you had an integrated search engine that could identify a specific part number entry, find the referenced part and date code in the community DB, and highlight the trouble symptom or failure mode, then you would go a long way to legitimizing the entry as not being bogus or malicious in order to defame or malign a manufacturer or a product in general. The responsibility for accuracy and timeliness would be shared by the community of participants. The contributors would be registered by name, and the data they entered could be sorted by their name. If they were malicious contributors, then they could be identified and banned from the site. I like your idea. Do you have the resources to initiate such a website? Let me know if you want to pursue this. It is a great idea. Congratulations on having exactly the kind of mind that starts a tech ball rolling.

  9. t.alex
    November 18, 2012

    It is in fact a very good and valuable practice to write down which components failed. Sometimes one may not know exactly which part failed, but taking notes on the symptoms really helps a lot in troubleshooting. In general, this also applies to the whole development process. For example, when a particular firmware version fails and another version works, that can give the development people a lot of insight.
