Electronics Manufacturability and Reliability with QA Guru Cheryl Tulkoff

Zachariah Peterson
|  Created: October 24, 2022  |  Updated: March 5, 2024

Let’s talk about electronics reliability with QA guru Cheryl Tulkoff.

In this episode, Cheryl and I talk about risk assessment, planning not only for success but also for failure, and understanding the difference between quality and reliability. This discussion will be informative for every PCB designer who wants to stay ahead of the game. Watch through to the end, and make sure to check the additional resources below.

Listen to Podcast:

Download this episode (right-click and save)

Watch the Video:

Show Highlights:

  • Cheryl shares her rewarding career experience in the electronics industry
    • She worked at IBM where she was immersed in electronic manufacturing from beginning to end
    • She also worked at DfR Solutions and National Instruments, where she built her skills and knowledge in electronics manufacturability, quality, and reliability consulting
  • Producing a successful electronic product requires awareness of potential problems at the chip, board, system, and environment levels
  • Cheryl explains why unique or non-aligned standards exist in the industry – no one size fits all
  • A great piece of advice for all PCB designers is to know what you are designing and who you are designing it for, look at the risks, and then manage them appropriately
  • Planning for success may also include celebrating failures. Budgeting for failure analysis is often neglected due to the “success-driven roadmap” mentality
  • Failure should be part of design management
  • Cheryl and Zach talk about the “Startup Culture”
  • Software reliability and hardware reliability go hand in hand
  • What rate of failure is tolerable? Defining what is quality and reliability separately for the product you are designing
  • Manufacturers cannot ensure reliability for you
  • Cheryl shares her experience being involved in litigation as an expert witness
  • Redundancy practices in the industry: how common are they?
  • What can designers do to mitigate failures?
    • Understanding what you are designing and who you are designing it for
    • Collect as much feedback as possible – from users, industry experts, and professional organizations

Links and Resources:

Claim the special offer for Podcast listeners only

Transcript:

Cheryl Tulkoff:
It was just that case. It was built fine, failed badly in application, and the designer was blaming the assembler for a field reliability issue when they had fundamentally picked the wrong materials for a very hostile environment. So you're arguing about who's really at fault there and who should have known.

Zach Peterson:
Hello everyone, and welcome to the Altium OnTrack podcast. I am here talking today with Cheryl Tulkoff, an electronics reliability consultant and overall QA guru, and formerly working at a number of well-known companies. I think this is going to be a really great conversation looking at QA, QC and PCB manufacturing, and some of the things that designers should know as they make that transition to higher volumes. Cheryl, thank you so much for joining us today.

Cheryl Tulkoff:
Yeah, thank you for having me.

Zach Peterson:
We were talking beforehand, but you used to work at National Instruments. I first recall seeing your name from your work at DfR Solutions, and I remember seeing your name on a number of technical papers looking at reliability. Maybe, if you have a moment, you could tell the audience about your background and how you came to be a reliability consultant.

Cheryl Tulkoff:
Well, it's definitely been a long, interesting, and rewarding path. By background, I'm a mechanical engineer from Georgia Tech, but I really started my career in electronics from the beginning. Right out of college, I relocated to Austin to work for IBM, at a time when IBM was first making laptops and PS/2s and still making mainframes. And it was a really neat time. IBM was really vertically integrated, so Austin had everything. We had bare board manufacturing. We had circuit board assembly and test. We had reliability. We had box build as well as chip fabrication and design.

So it was a great way to really get immersed in electronics manufacturing from beginning to end. So that's how I got my start. And from there, I also had the opportunity to work inside a fab, doing failure analysis at the die level. And then of course working for National Instruments. And then with DfR, what that really brought was the ability to bring everything together and really focus on reliability and quality, which I found really fascinating and still a pretty small group of people that kind of specialize in this area. And through DfR, I got a chance to take all these skills and then apply them across different industries.

So there, we were doing consulting really to everyone, defense, medical, aerospace, consumer electronics. And so you get to see all the different ways in which people use the same types of things and the different ways in which they fail based on their environment.

Zach Peterson:
That's interesting. So starting at IBM, you said you got to see really everything from beginning to end, and it's not just the PCB assembly. I mean, you mentioned they had chip fabrication. So really right down to the bare bones of what we use every day, which is of course the integrated circuit.

Cheryl Tulkoff:
Absolutely.

Zach Peterson:
And so looking at the entire, I guess, the entire process for getting to an electronic product, you must see a lot of potential failures that we don't really see as designers, unless we're literally testing this stuff all the time.

Cheryl Tulkoff:
Yeah, that's true. And that's, I think, really one of the neat things about being able to work in all these different areas of printed circuit board assembly processes and seeing sometimes the misalignment. If you think back to chip manufacturing versus the assembly processes, even today, the standards and the way things are qualified are different at the chip level than at the board level, and then of course at the system and environment level.

So there's some inherent gaps in the system, even from the get go that even if you were doing everything according to the playbook, if you will, there's still things to learn there. So it's a pretty fascinating look. That's one of the things that I find really inspiring is the ability to look at those gaps and help people at least avoid the pitfalls we've solved already, especially across industries. There tend to be lots of silos in this business. And so it's giving people that perspective or awareness that you don't have to reinvent or resolve every problem. Let's look for the new things or the new stressors.

Zach Peterson:
So you said that the standards by which, I guess, reliability is quantified, if you will, or assessed in these different levels of an electronics assembly, whether it's the chip and there's probably something totally different for components that are not integrated circuits and then the assembly itself, all of those things are siloed. And I would agree with you. I only agree because I don't know all of those standards for integrated circuits. I certainly don't know all the standards for all the different components.

In working with a client, I might hear a few alpha numeric strings that I then realize, "Oh, that's a technical standard from somewhere." But why is that all siloed? I mean, is it really just those reliability problems are so siloed themselves, that it makes sense to have these individual kind of very siloed standards? Or is there really a need to standardize across these different domains?

Cheryl Tulkoff:
Well, it's a little bit of both. So there are definitely areas where it makes complete sense to have unique or non-aligned standards and ways of testing and evaluating things and that comes down to how things are going to be used. One of the things I think is really fascinating is convergence, how you see electronics in everything now.

It's not just the high tech things. It's day-to-day life. Even refrigerators now. My refrigerator is on the Internet of Things; I get updates from it, and it tells me when the filter needs to be changed. Electronics have become mundane in that they're in literally everything we touch, and they don't all need to be built to the same standards. What's important for reliability of your watch versus your refrigerator, versus your heart implant, versus a satellite is very different. You don't necessarily want one size fits all, but what you also see is that all these industries are trying to use the same off-the-shelf components, and it comes down to the same thing: they want to minimize cost. And that's fine, but as designers and test engineers and reliability experts, we need to help those people understand some of the risks inherent in that, how you can design around it, how you can test, or how you can mitigate some of the risk if you choose to go down that path.

So that's where some of that misalignment naturally happens. So even at the component level, something could be designed for consumer electronics, but end up in a vehicle, but there are things that those people do to ensure that that component will survive that way versus buying something that might be inherently more expensive, but built to a higher standard.

Zach Peterson:
That's really interesting. I mean, you're saying something like... I'm just going to throw a name out here. You're saying something like an Arduino that maybe isn't meant for high reliability or I guess production grade equipment ending up in a system that someone's life could depend on. Is that the level of modularity that people are trying to achieve? And then creating reliability problems because of it.

Cheryl Tulkoff:
Well, it's not always problems. You can safely do those things if you understand the risks inherent in it and design and mitigate appropriately for it. And that's really one of the, I think, really neat spaces that's a little under-resourced in the industry: the whole risk assessment, risk mitigation, risk-based thinking. Standards like the ISO standards and the quality standards have really used that as their core tenet. Think about what you're designing and look at the risks, and then manage them appropriately. You can use those components, but they require different test methods, different evaluation methods, different things to ensure that the product is going to have the lifetime and reliability you need. If you do nothing, then of course you're setting yourself up for potential failure.

Zach Peterson:
So I have to wonder here, to what extent you start to see companies go down this path that you're talking about and not be able to anticipate the potential problems that they'll run into and they just jump into these design decisions whether it's cost driven, let's say. They want to minimize cost and this is the cheapest path forward. Or maybe it's market pressure. Our competitors are already on the market-

Cheryl Tulkoff:
Supply chain.

Zach Peterson:
Yeah, perfect point. Yeah, supply chain. We can't build this ourselves, but we can buy this off the shelf, so let's just buy this off the shelf. How often do you see companies run into reliability problems because they fail to anticipate those potential failures that could arise from their design choices?

Cheryl Tulkoff:
Absolutely, it does happen. I mentioned supply chain; that's just one of the things that's been impacting the industry dramatically, the availability of the components that you do want. Do you choose not to proceed if you can't get a specific component because it's simply not available or you can't get it in time? But that's where the risk assessment and the test evaluation come in, and where, sadly... I guess it's kind of a cultural and industry thing as well: we're hopelessly optimistic. If you see a typical design plan or test plan, they always plan for success.

If you look at it, if you look at a typical company, they have all these phases and gates through the design process and sign offs and tests that are run and you're looking for a specific outcome and it's always like, "Yep, we're going to do this test. It's going to pass." Next gate. There's typically not enough time or budget left to do failure analysis or if something goes wrong, how to recover.

So that's where I typically see the mistakes happen. They get to one of these gates, they have a failure, and now they're up against a commitment, a budget, a customer need. And now they're like, "Well, where can I cut?" Typically the first thing that gets cut is some part of a test or some part of a quality plan, and that's where I typically see the most common failures, because companies have gotten really good at doing design FMEAs, or failure modes and effects analysis.

So those have become part of general practice, especially in high reliability. The challenge is to do those with a fully cross-functional group of people who can help identify risks you might not think of.
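
For readers who haven't run a design FMEA, the ranking usually comes down to a Risk Priority Number (RPN = severity × occurrence × detection). Here is a minimal sketch in Python; the failure modes and 1-10 ratings are hypothetical illustrations, not data from the episode:

```python
# Minimal design-FMEA sketch: rank hypothetical failure modes by
# Risk Priority Number (RPN = severity x occurrence x detection).
# All entries and ratings below are made-up illustrations.
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (negligible) .. 10 (catastrophic)
    occurrence: int  # 1 (rare) .. 10 (frequent)
    detection: int   # 1 (easily caught) .. 10 (hard to detect)

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection

modes = [
    FailureMode("Solder joint fatigue under thermal cycling", 7, 5, 6),
    FailureMode("Connector corrosion in humid environment", 6, 4, 7),
    FailureMode("Firmware watchdog never fires", 9, 2, 8),
]

# Highest-RPN items get mitigation effort (and test budget) first.
for m in sorted(modes, key=lambda m: m.rpn, reverse=True):
    print(f"RPN {m.rpn:4d}  {m.name}")
```

A cross-functional review team would debate and adjust each rating; the ranking is only as good as the perspectives in the room, which is Cheryl's point about cross-functional groups.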

Zach Peterson:
That's really interesting, because what you just said about planning for success immediately got me thinking about engagements that I've had with other companies. And you're absolutely right. Their roadmaps are always success driven. There is never a contingency or a "What if this thing that we want to build doesn't work?" Even right down to front-end engineering and having to do simulations. We're doing something totally new. We need to go through a round of simulations. It's going to take X number of days. But what if that thing that you think you're simulating and is going to work perfectly doesn't actually work? I'm hearing the parallel here now with reliability on the back end. Once you actually go to build something and then try and take it to market, what if that doesn't work?

How do companies manage the potential for failure? Because if everybody's so optimistic in thinking, "This is our roadmap and we're sticking to it," of course everybody is surprised and has their hair on fire as soon as something doesn't go to plan. How do companies manage that NPI process when failure is probably inevitable somewhere?

Cheryl Tulkoff:
The industry has gotten so much better at this in the past couple of years, embracing and actually celebrating failure. There still is obviously a stigma against failing. But I know at the companies I've worked at, there was an active movement to say, "Hey, you're pushing boundaries here. Expect failure. No one gets things right the first time. If you do, that probably means you're not pushing yourself hard enough." If everything you try works great immediately, you're probably doing things that are a little too easy or that have been done before.

So part of that is actually saying, "Yes. Okay, it's good. We've learned something. We were expecting that. We've planned for it. Now how do we resolve it and move on?" And so that's been, I think, a really neat shift to see, and there has been a lot of media writing about accepting that failure should be embraced and celebrated for the learning opportunity it truly is.

Zach Peterson:
What's the expression? Fail hard and learn fast, I think? I've heard that before.

Cheryl Tulkoff:
Yeah. And there's a couple around it. The first is fail fast and the other one is, "Don't make the same mistake twice." Right?

Zach Peterson:
Oh yeah.

Cheryl Tulkoff:
Which is what I see and feel bad about in the reliability world: it's fine to get it wrong the first time when you're trying something new, but once it's been solved, share that information and don't make that mistake again. Don't have it happen on another product line or in another area because you weren't paying attention.

Zach Peterson:
Yeah, absolutely. Going back to product roadmaps, or setting milestones for projects, for just a moment: in order to account for what is possibly going to be a failure, what does that look like in a milestone? Do we just secretly extend out those anticipated completion dates to account for the fact that we might have to do it twice because we failed at something somewhere along the way?

Cheryl Tulkoff:
Well, some of that. There are a couple of practices that can really help with that. The first is obviously understanding that failure should be part of your design planning process. You need to have it in there, you need to budget for it and allocate for it. But some of that also comes down to basic project management best practices. In a lot of places, it's X number of days and a date. And the reality is, it's a range. Any given phase gate could have a date plus or minus X number of days, a budget plus or minus. You can actually model that based on your company's experience.

What does your typical schedule look like? How often do you go over or under? Those kinds of postmortems on your projects are where you can actually set those ranges up. Or you can use some of the tools that are built in; most of the really good project planners now will help you do that kind of simulation. That gives you boundaries and helps set the stage a little bit better, so you don't get into this absolute crunch where you have no way out because you didn't plan for it, and now you've got to figure out how to afford it and what the implications would be.

And that's really where the risk analysis part comes in as well. Beforehand: okay, if something fails, do I have the labs lined up? Do I have the extra parts needed? Do I have the people? Just putting those things in a plan so that if that does happen, you've got a ready-to-go action that you can jump on, rather than having to stop and not even have a clue as to how to proceed.
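
The "date plus or minus X days" modeling Cheryl describes can be as simple as a Monte Carlo pass over the phase gates. A minimal sketch, assuming triangular duration estimates and made-up numbers:

```python
# Monte Carlo sketch of a phase-gate schedule: each gate gets a
# (low, likely, high) duration in days instead of a single date.
# Durations below are hypothetical, not from any real project.
import random

phases = {                 # (optimistic, most likely, pessimistic)
    "schematic": (10, 15, 30),
    "layout":    (15, 20, 45),
    "prototype": (20, 30, 60),   # includes slack for failure analysis
    "test":      (10, 15, 40),
}

def one_run() -> float:
    # Triangular distributions are a common, simple choice here.
    return sum(random.triangular(lo, hi, mode)
               for lo, mode, hi in phases.values())

runs = sorted(one_run() for _ in range(10_000))
print(f"median finish: {runs[len(runs) // 2]:.0f} days")
print(f"90th percentile: {runs[int(len(runs) * 0.9)]:.0f} days")
```

The gap between the median and the 90th percentile is the contingency the roadmap should carry; a single-date plan silently assumes that gap is zero.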

Zach Peterson:
This is one of those things that I have seen startups run into where they're overly optimistic and they probably were always on time at their previous corporate job doing engineering or NPI or whatever it is. So I think they're used to succeeding, but then they get into something where they're having to build an advanced piece of hardware. And then all of a sudden supply chain bites them or all of a sudden the thing they thought they were going to build, the simulation says otherwise, or their prototype says otherwise. Now they've got to go back to the drawing board. They're totally off their project roadmap, or they're going to miss that meeting with that VC who's going to fund them for the next 10 years or whatever the case may be.

It's actually more common than I think startup founders would like to admit, especially if this is their first jump into hardware. I'm actually seeing that a lot more often. The startup founders that I've met most recently, they actually come from software and they're jumping into hardware for the first time. And this whole idea of, "It's a physical thing, it can fail" seems foreign at first until they actually produce their first prototype and something doesn't work.

Cheryl Tulkoff:
Oh my gosh, you raised six fascinating points in there. It's like, where to begin unpacking that one? So first, the startup culture. The pressure there is obviously real when you're literally in survival mode with what you're trying to do. And so-

Zach Peterson:
On top of that, there's another cost pressure there because if it's just you or you and a partner, it's probably your money. And so forget about the VCs for a second. This is my life savings I'm putting into this. This has to work.

Cheryl Tulkoff:
And so the pressure is phenomenal. Actually, I found that space so interesting that I went back to grad school in 2015, at UT Austin, in a program specifically around technology commercialization that was attempting to address those types of things. How do you launch successfully? How do you make these partnerships? It's a fascinating space, because you want to nurture startups since you don't know where those next great, fantastic things will come from, but how do you do that smartly and successfully? That's where I really think some of the future is in modeling, when you consider some of the tools we have out there now that can do a lot before you start to invest a lot of money in physical hardware. Is the design even capable of doing what I want to do, and what are some of the risks?

And then you also mentioned software. I find that fascinating because the electronics world typically has been hardware focused, and hardware failure focused in particular: if something goes wrong, it's got to be the hardware. So we've got lots of practices and standards around hardware and not so much around software. And if you look today, every electrical or electronic device or system is really an integration. Nothing runs without software anymore.

So software reliability and software quality are a big thing, and I'd say an even smaller group of people specialize in that area. And software is very underrepresented as a reason for failure of systems. So it's interesting to see, as more of these things come together, how we get better at looking for software faults rather than treating every failure as a hardware thing, and what folks with that background can do to successfully integrate systems together, while also learning about the challenges of being in a hardware space.

Zach Peterson:
Yeah, definitely. I think there's almost this assumption that if you're the hardware person, it's like the software person gave it to me, so it must work. They must have done whatever magic they do in code to ensure that all the logic is correct. And I get the binary and flash it on.

Cheryl Tulkoff:
That's not the case.

Zach Peterson:
Yeah, that's true. It could be. I mean, I was dealing with a project a few months ago where we didn't actually know if the hardware was defective, if we had made a design mistake somewhere, or if we just had defective components because we had to get them overseas. There's that whole risk associated with non-authorized distributors and all that. So there's always that question mark of, "Did they give us something bogus?" Or did the vendor give us an old version of the firmware that was out of date or wrong to begin with? There are all these question marks. At some point we had no way of verifying. We just kind of had to take the vendor at their word: "No, this software is correct. You got it from us. It's correct. Don't worry about it. It's your hardware, it's your problem."

Cheryl Tulkoff:
Yeah. There's a lot of finger pointing that happens when that system integration comes together. And that's really, I think, where there's a lot of opportunity in the design space to improve some of the techniques, and another great place for modeling. Earlier you had mentioned, how do you assess risk when you're trying something that's never been done before? How do you think of the scenarios and know which things could go wrong and how you might respond to them? That's very real, and that's where sometimes the modeling may create ridiculous situations, but it can also help identify those things people really didn't conceive of or dismissed with "Oh, that would never happen" or "I didn't think of that."

Inject those faults into the testing sequence and try it out, because you're never going to try every possible thing that could happen in a system. You'd never get a product out the door if you looked at every potential way it could be used or misused. Then add in counterfeiting and supply chain and other things that are going on, and you've got this incredibly complex suite of parameters that you're trying to control or think about.

Zach Peterson:
Yeah. Okay. So you brought up a couple of things that are interesting here for the design engineer who's trying to do something totally new and innovative and wants to assess that risk. So I think simulations are an important part of it, especially if it's electromagnetics or even mechanical, right? You brought up FMEA. Those types of tools qualify the physical design on the front end before you produce a prototype. So is that something that you think design teams or engineers, or even just startups, will have to rely on more in order to properly assess that risk? Or is it still like, "Hey, we're going to produce this prototype. We're going to stress it until it breaks, find out why it broke, update it, and we're going to get to a viable product that way."

Is that still the path forward? I'd like to hear your thoughts here, because at some point they're going to try and scale and they may need to come talk to you. I think you might want to know what they did on the front end to try and assess the reliability.

Cheryl Tulkoff:
Absolutely. There's always going to be a need for both. And again, this comes back to what your risk tolerance is and what you're designing for, as to what percentage you can rely on modeling and how much you really need to spend on the physical side, because you're never going to be able to rely on modeling alone for a medical device or a space device. You've simply got to do physical testing with prototypes and real things. But there is a lot you can do on the modeling side that can help reduce how many samples you might need, how much you have to spend, or how far along the design process you have to be before you uncover anything. And that's really where the modeling helps tailor and reduce cost and speed up your efforts. It's not a replacement for hardware testing, but it certainly can help drive understanding that makes you more mindful of where you spend your money and how long it's going to take you to do things.

Zach Peterson:
Well, I almost think of the front end simulation as your baseline expectation. They set the bar for what you should expect to see in terms of operation, but possibly also help you identify something that you might want to test later when you actually do produce something physical, right?

Cheryl Tulkoff:
Absolutely. Maybe your intuition or industry standard says you need to test A, B, and C. But then simulation comes along and says you need to actually also test D, E, and F.

Or further out in this direction, or another combination of factors. One of the biggest challenges that I see routinely in working with clients and companies is that quality and reliability can be very nebulous to people. Everybody says, "Oh, I need high reliability," or "I want high quality." But you need to put some definition behind that: what are you designing for? How long does it have to last? What failure rate is tolerable? You see so many people jump right in without a clear understanding of that, and those answers drive very different behaviors in design, in component selection, and in the assembly processes.

So if you don't go in with a well-defined target to achieve, how do you know if you've been successful, and how do you really get there? That's where I see things go wrong, or go over budget, or not work as intended: they weren't clear on what they really needed and were just trying to hit a target or a budget number without understanding some of the trade-offs there.
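
To make "what failure rate is tolerable" concrete, here is a small worked sketch that turns a reliability target into an allowable failure rate, assuming a constant-failure-rate (exponential) model; the 99%-over-5-years target is a hypothetical example:

```python
# Turn a reliability target into an allowable failure rate, assuming
# a constant-failure-rate (exponential) model: R(t) = exp(-lambda*t).
# The 99% / 5-year target below is a hypothetical example.
import math

target_reliability = 0.99            # 99% still working...
mission_hours = 5 * 365 * 24         # ...after 5 years of operation

lam = -math.log(target_reliability) / mission_hours   # failures/hour
fit = lam * 1e9                      # FIT = failures per 1e9 device-hours

print(f"allowable failure rate: {lam:.3e} per hour ({fit:.0f} FIT)")
# A watch, a refrigerator, and a heart implant would each plug in
# very different targets here, which is the point of defining one.
```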

Zach Peterson:
So I think reliability is probably a little bit easier to quantify or define than quality, because with reliability, I can say, I know that this board could possibly receive a mechanical shock of this force, let's say, or I know it could potentially be exposed to this much of a thermal shock, right? Those are things I can put a number to. And then if I had the resources, I could throw it in a simulator, or I could take an actual prototype to a test facility and test it. So I think those things are easier to define, but how do you define quality?

Cheryl Tulkoff:
It's such a fascinating topic all by itself, just distinguishing the difference between quality and reliability. And there is no one answer, because it depends on the industry. What is good quality and high reliability for a toy would not be acceptable for an automobile. So first, there's a lot of "it depends." And there's also a lot of confusion between the terms, because something can be built with high quality, meaning defect free, and be completely unreliable in its application.

So they can be completely divorced as well. You picked the cheapest part, and it was built and it tests perfectly, but it was intended to operate in a really stressful environment and falls apart after the first summer of hot temperature exposure or something. So reliability is the concept of the end-use environment with time being part of it, and quality is really a point in time versus that duration out there. So first is coming to grips with how they're different, and then how they differ by industry and end-use environment as well.

Zach Peterson:
Let me ask you this, with a little background first. Several years ago, before I started doing design professionally, I was interviewing as a quality engineer for an aerospace company. And it felt like the conversation centered a lot around yield, however you're going to define that. They wanted to teach me statistical process control. In my head at the time, that's what it felt like quality was: you're doing statistical process control to try and make sure that yield gets as close to a hundred percent as possible. Is that the right way to think about quality in terms of electronics manufacturing, or is there something deeper?

Cheryl Tulkoff:
Yeah, there's a lot more to it than that. Obviously those are key elements. When you talk about statistical process control, first pass yield, or end yield, those are important elements in reducing defects in-

Zach Peterson:
So there are different types of yield?

Cheryl Tulkoff:
Oh yes, there are definitely different types of yield. First pass yield is whether everything passed test the first time; if you find an error and it gets repaired, then there are other throughput yields. There are different ways of defining yield in the industry. But when you think about SPC and the types of things you're describing, that's defect reduction. There's a whole other element of quality that's forward looking, the prevention to begin with, and those are important elements as well. So it's not just how you find and reduce variation, but how you take that and prevent it from happening to begin with. That's another part, and that's where design plays a key role, because as designers, the types of components you select and the way you lay them out on the board can make it very difficult to achieve a high first pass yield, no matter how good the factory is in terms of its process capabilities and process control.

So there's a lot of interaction in there between the design process and the actual yield. Quality is really a combination of a lot of factors: the design, the partnership that you have, the tools and processes you have, how heavily it can be automated. That's one of the exciting things about all these new sensors; we've got all this data to reduce variation and improve control. But still, if someone has done a poor design job, there's only so much you can do to overcome that.
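
For anyone new to these terms, a quick sketch of first pass yield versus rolled throughput yield, with hypothetical step yields, shows why a few "pretty good" process steps still compound into real fallout:

```python
# First pass yield (FPY) vs. rolled throughput yield (RTY) sketch.
# Step yields below are hypothetical numbers for illustration.
from math import prod

units_in = 1000
units_passing_first_time = 930       # no rework, no repair
fpy = units_passing_first_time / units_in
print(f"first pass yield: {fpy:.1%}")

# Rolled throughput yield multiplies first-pass yield across every
# process step, so small per-step losses compound across the line.
step_fpy = {"paste print": 0.995, "placement": 0.99,
            "reflow": 0.985, "test": 0.97}
rty = prod(step_fpy.values())
print(f"rolled throughput yield: {rty:.1%}")
```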

Zach Peterson:
I think if you're a designer, you're separated from some of this. There's a big buffer between you and the kinds of things you're talking about with quality, because not all designers worked at an OEM, and not all designers work for a manufacturer, let's say handling designs for clients or in the CAM department or wherever else. They're external vendors. They're design firms. That's all they do. They create the design data, they hand it off, and they don't even cross their fingers and hope it works. It's like, "Okay, I gave you what you asked for, that's it." And they're out of there.

So if someone's on the front end in that situation and they're working to create something that is going to eventually scale and go through this kind of battery of reliability assessments that we've been referencing, what's their role? What can they do? Or does their role just end at some point and it's all on the manufacturer beyond that to ensure reliability?

Cheryl Tulkoff:
Well, first, I'll say the manufacturer cannot ensure reliability for you at all, so you can-

Zach Peterson:
Love it.

Cheryl Tulkoff:
There's no amount of money you're going to be able to pay that will make a reliable product out of a poor design, no matter how good the OEM is. They really have very little control over that. They control the quality piece, and ideally you have a relationship with them. So the best thing to do, and you mentioned it earlier: most designers anymore don't really have a lot of experience in the factory or the production processes. We've separated that out. Designers may be in one part of the world and their manufacturers in another, and they may never have seen it or understood it. They're optimizing their design for the design purposes, not necessarily for the full life or the buildability.

But the best thing to do is first to educate yourself on what that looks like, and really partner. If you're going to be building something physical and you're not doing it yourself in your own company, you've got a partner out there; work with them early in the design process.

Don't wait until the end, when you think you've got your finished product and you fling it over the wall, fingers crossed, hoping for the best. It's too late. By then you want to move on to other things, and changing the design or components can be very challenging. This really needs to be early involvement and a partnership if you want to make an affordable and reliable design, because you just see it so frequently: "Well, the contract manufacturer will take care of that."

They will, and it'll cost you more, and they can do the quality piece. But if you've inherently not chosen an appropriate type of package or PCB laminate for your in-use environment, they can build it great. Again, it'll be perfect quality and completely unreliable for your in-use application.

Zach Peterson:
So in terms of what a designer can do, I guess not everybody works directly with the manufacturer, so at some point they end up relying on standards, like IPC class 2 and class 3. Those are some definite things that are at least intended to help ensure certain levels of reliability. How well do you think designers can rely on standards and then just say, "Well, we built it as specified in the standard. It should be just fine"? What's wrong with that thinking, and where does the spec start to fail? I see a big smile on your face.

Cheryl Tulkoff:
Oh yeah, because that question comes up so many times. And I do think standards play a great role in helping prevent the known problems. But when you look at standards, some of the challenges with relying on them are, first, that they lag in time. If you are a designer and you're trying something new or pushing boundaries, the standards are not going to help you. The problems haven't been discovered yet. They haven't been documented. So they're always going to be behind the curve.

And then second, if you look at how industry standards are put together and compiled, it's volunteer based and it's representative of an entire industry. So again, what is great and agreeable in the consumer space is much different than in higher reliability spaces like space, transport, or medical devices.

So they also become a kind of minimal ground level. Just because you pass the bare minimum does not mean that you're going to have a highly reliable product, which is why you see some of the differentiation in IPC standards, as well as in ISO standards, where they have extra add-ons if you're in aerospace or medical. But they're always going to be behind the times and not necessarily representative of your particular needs or applications.

Zach Peterson:
So in terms of the role of someone like yourself, a reliability consultant, when does that come into play? Do manufacturers or OEMs, maybe not individual designers, contact someone like yourself once they have a problem with an assembly and everyone is pulling out their hair trying to figure out why this thing won't work? Or do you see companies actually being proactive and thinking about this ahead of time? Or is it a case kind of like we've just said, where it's like, "Well, we did the bare minimum, meaning we designed it to the spec, so what's wrong? We don't understand why this is going on."

Cheryl Tulkoff:
So I see companies all across the spectrum. It's more heavily weighted toward the failure side, which is a little bit sad as a professional. It's one of those things we mentioned earlier: during the design process there's so little time or acknowledgement that things might fail. So people bypass some of these things and move on. But we always find the money and the resources when something has broken catastrophically to go do something about it. So we try to shift that upstream.

I do see a lot of places where it's, "We thought we did everything and it broke. Why? Can you help us figure it out? Where did we go wrong?" But I also have the chance to work with a lot of companies at the design stage, doing design reviews very early on, even at the concept stage, first helping frame: What am I designing for? How long? What is the risk? What do I need as a reliability target? And then, here's my concept design; are there things we could do to improve manufacturability? Can I move components, create some spacing, plug some vias? Whatever.

There's a whole litany of things you can do that reduce the opportunity for errors in manufacturing if you're involved early. So it can be anywhere. Lots of companies recognize that, and they hire those things out because they don't necessarily need that skill full time, depending on the number of designs they do. It's also a pretty specialized field. Again, I see all kinds of things, unfortunately, at the back end too, sometimes even litigation when things have gone badly awry.

Zach Peterson:
Wow. So you mentioned litigating. Have you ever actually had to be an expert witness?

Cheryl Tulkoff:
I have, several times. I actually had to testify in court for one that didn't get settled, which was a pretty fascinating experience to be involved in. And it's also really sad, because one of the things these cases have in common is that failure to set expectations. You mentioned earlier thinking that the manufacturer is going to be able to guarantee your reliability, and that was one of them. It was just that case. It was built fine, failed badly in application, and the designer was blaming the assembler for a field reliability issue when they had fundamentally picked the wrong materials for a very hostile environment. So you're arguing about who's really at fault there and who should have known.

Zach Peterson:
Really? Okay. So the person who was responsible for picking materials, in terms of assembly... it sounds like solder in this case?

Cheryl Tulkoff:
Well, in this case it was a laminate-based issue and the designer had said, "This is what I want." And that's what the assembler gave them. And it just wasn't appropriate for their end use application. So they built it fine. It just didn't hold up well.

Zach Peterson:
I see. So again, failure at the designer level to anticipate what the demands are in the end application.

Cheryl Tulkoff:
There was an element of that, but there's also the element, again, of that idea of the partnership. If you're going to pick an OEM or contract manufacturer to make your device, consulting with them on material selection is the right thing to do, because sometimes people are like, "Oh, well, I read about this great thing. I want X on my product," when X wasn't really intended for that use environment.

So sometimes you'll see that, in their desire to be very precise about what they want on their design, they specify something that is really not the right thing. If they had spoken to someone about their end use or availability, they might have gotten recommendations for something more appropriate. But they're like, "Oh, well, I'm going to name it to get exactly what I want." They got that. It just didn't work out great.

Zach Peterson:
Is that the type of thing that could have been identified early on with environmental or functional testing on a batch of prototypes? Or was this a sample-size issue, where a prototype batch would've just been too few units to even notice it, and it just so happened that the failure ended up causing some incident that then needed to be litigated later?

Cheryl Tulkoff:
Yeah. The challenge with this one is that I don't think it would've shown up in typical prototype testing, because of both the sample size and the difficulty of finding this particular failure mode. This particular one was a failure mode called CAF, conductive anodic filament formation, within the PCB. It's very difficult to find and debug to begin with, and certainly in small sample sizes. So I think it would have been a challenge, but that's where some of the modeling software, and just knowing the rules of practice around spacing of conductors, relative voltage potential differences between them, and the type of laminate you choose, come in. Knowing those things, that risk was there for this environment and the choices they made. They just weren't aware of it.
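
As a very rough illustration of the rules of practice Cheryl mentions, a CAF screen can compare the voltage gradient across conductor spacing against a laminate-dependent limit. The limits and net pairs below are placeholders for illustration, not qualified design data:

```python
# Rough CAF (conductive anodic filament) screening sketch: flag net
# pairs whose voltage gradient (V per mm of spacing) exceeds a
# laminate-dependent limit. All limits below are PLACEHOLDERS for
# illustration only, not qualified design data.
CAF_LIMIT_V_PER_MM = {           # hypothetical allowable gradients
    "standard FR-4":      25.0,
    "CAF-resistant FR-4": 60.0,
}

def caf_gradient(voltage_v: float, spacing_mm: float) -> float:
    return voltage_v / spacing_mm

pairs = [  # (net pair, potential difference V, conductor spacing mm)
    ("48V_RAIL vs GND", 48.0, 0.8),
    ("3V3 vs GND",       3.3, 0.5),
]

laminate = "standard FR-4"
for name, v, s in pairs:
    g = caf_gradient(v, s)
    status = "REVIEW" if g > CAF_LIMIT_V_PER_MM[laminate] else "ok"
    print(f"{name}: {g:.1f} V/mm on {laminate} -> {status}")
```

A real assessment would also weigh humidity, hole wall quality, and glass weave style; the point of the sketch is only that voltage, spacing, and laminate choice interact, so the same layout can pass on one material and be a risk on another.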

Zach Peterson:
I think this is one of those issues where there is a standard for it, but there is a big asterisk with a bunch of context next to the standard that you have to understand.

Cheryl Tulkoff:
It depends. Yes.

Zach Peterson:
Right. So I think designers educating themselves a little more on the failure modes themselves is really more valuable than just knowing the standard ahead of time. I'm agreeing with you that standards are helpful, but they're not the thing that's going to make your product immune to reliability problems. It's almost as if understanding the failure modes a little better can help you make some smarter design decisions, like what we're talking about: picking the right laminate to suppress CAF.

Cheryl Tulkoff:
Yeah. And it's always about knowing that in-use environment, what types of things you need to be worried about, and how you might find them. Right? Back to that idea of the risk assessment: knowing, is it going to see corrosive agents? Is it going to be condensing? Will there be water or temperature extremes? Knowing those things makes you think about how you might test differently, where you're vulnerable, what you can tolerate, and whether you need redundancy or something like that. Again, back to that concept: you've got to know what you're designing for to be successful at it.

Zach Peterson:
Yeah. And then with redundancy, I swear, the only time I hear anyone talk about redundancy is in space, which, I mean, I just wrapped up an engagement with a client recently, but they work in commercial space. And every time we would talk, go over and do design reviews and talk about system architecture and stuff like that, it was like, "Oh yeah, that's going to be redundant. There's going to be another board on the actual system." I'm not going into what the system is, but there's going to be another board. In case this one fails, we have a second one.

Cheryl Tulkoff:
Yeah. Redundancy can come in a number of different forms, because some people think of it as just having an entirely duplicate system or backup. But there are other industries in which redundancy has become quite common in different ways. Think about data centers and the internet these days, and how much uptime and reliability people want in their access to websites and services: companies like Amazon and Google who are in that space, among many smaller players.

So that's power supplies and fans, the typical things they're looking at, or parallel banks, and then location. Think about where data centers are located so they're not vulnerable to weather and environmental constraints, where you have backups in different parts of the world. So there are different ways to think about redundancy, and there are some surprising places where you'll find redundant practices.
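
The arithmetic behind that kind of parallel redundancy is worth seeing once: with n independent units, the system fails only if all of them do. A minimal sketch with a made-up unit reliability:

```python
# Why redundancy helps: with n independent parallel units, the system
# survives unless ALL of them fail. Unit reliability below is made up.
r_unit = 0.95                        # e.g., one power supply or fan
for n in (1, 2, 3):
    r_system = 1 - (1 - r_unit) ** n
    print(f"{n} unit(s): system reliability {r_system:.4%}")
# 1 unit: 95%; 2 units: 99.75%; 3 units: 99.9875%. This assumes the
# failures really are independent, which is the hard part in practice.
```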

Zach Peterson:
Even right down to what happens on the board and not just duplicating systems?

Cheryl Tulkoff:
Absolutely. So that's the concept. Even some components are designed with redundancy built into them, or at least fail safe modes. So if something fails, it doesn't fail catastrophically, which is again, back to your risk tolerance.

Zach Peterson:
Sure. Well, I think the takeaway here is understand your risk tolerance because this is certainly the case in finance, but I guess we don't really think of it as being the case in electronics. But I think people massively underestimate their own risk tolerance. And then when it happens to them, whatever it is, whenever it happens to them, suddenly they start to overestimate how conservative they should be.

Cheryl Tulkoff:
Yeah. We're really bad, just in general, at risk assessment and mitigation. I mean, you see it all over the news. One incident will happen on the news, and then all of a sudden everybody's terrified it's going to happen to them. But yet we get in our vehicles and drive every day, when that's one of the most hazardous things you can do as a person, and we think nothing of that as a risk. So part of it is that it's just not part of our everyday thinking or practice.

Zach Peterson:
Well, we're getting to the edge of our time allocation, but I'd like to wrap up with some words of wisdom from you. As the industry undergoes this kind of generational shift, what can newer designers do to educate themselves on reliability challenges and modes of failure, and what can they do in their designs to ensure they don't run into the types of problems where the manufacturer has to call you in to fix it after it's already failing?

Cheryl Tulkoff:
Well, point number one, which we've talked about several times so far, is understand what you're really trying to design for. What does that really look like from your customer's standpoint? Not yours, but your end-use customer. What do they think is tolerable and acceptable, and what are they going to need? And then, don't go it alone. There are so many resources out there. That's been one of the nice things about the explosion of webinars and podcasts; there are a lot of resources for general education. There are the industry standards themselves you need to become familiar with. There are lots of professional societies: SMTA, IPC, IMNA, some targeted to specific industries and some broad based. And then partner. Whatever you're designing, the more you can get early involvement and early feedback from other people, the more successful you'll be.

Zach Peterson:
I think that's all great advice. I like what you said about joining certain professional societies. I definitely agree. I think designers, anyone who's going to do this professionally, whether they're at the design side or the manufacturing side should be members of some kind of professional society, because it's excellent for continuing education. Cheryl, thank you so much for joining us. This has been a real treat and it's certainly good, selfishly for myself, because the manufacturing process and things like quality control and reliability are all areas where I've tried to educate myself at a deeper level. And I think a lot of other designers need to do that as well. So thank you very much.

Cheryl Tulkoff:
Thank you.

Zach Peterson:
And to everyone out there listening, make sure that you subscribe to our YouTube channel. Make sure you hit that like button. And of course you will be able to keep up with all of our upcoming episodes. We're going to have some great links in the show notes, so you can go learn more about what Cheryl does and you can go learn more about reliability in electronics manufacturing. That's all we have for today. Thanks for joining us, folks. Don't stop learning, stay on track, and we'll see you next time.

About Author

Zachariah Peterson has an extensive technical background in academia and industry. He currently provides research, design, and marketing services to companies in the electronics industry. Prior to working in the PCB industry, he taught at Portland State University and conducted research on random laser theory, materials, and stability. His background in scientific research spans topics in nanoparticle lasers, electronic and optoelectronic semiconductor devices, environmental sensors, and stochastics. His work has been published in over a dozen peer-reviewed journals and conference proceedings, and he has written 2500+ technical articles on PCB design for a number of companies. He is a member of IEEE Photonics Society, IEEE Electronics Packaging Society, American Physical Society, and the Printed Circuit Engineering Association (PCEA). He previously served as a voting member on the INCITS Quantum Computing Technical Advisory Committee working on technical standards for quantum electronics, and he currently serves on the IEEE P3186 Working Group focused on Port Interface Representing Photonic Signals Using SPICE-class Circuit Simulators.
