Power Electronics Reliability: State of the Art and Outlook

Huai Wang, Senior Member, IEEE, Frede Blaabjerg, Fellow, IEEE

Abstract—This paper aims to provide an update of the reliability aspects of research on power electronic components and hardware systems. It introduces the latest advances in the understanding of failure mechanisms, testing methods, accumulated damage modeling, and mission-profile-based reliability prediction. Component-level examples (e.g. Si IGBT modules, SiC MOSFETs, GaN devices, capacitors, and magnetic components) are used for illustration purposes, in addition to system-level studies. The limitations and associated open questions are discussed to identify future research opportunities in power electronics reliability.

*Index Terms*—Power electronics, failure mechanism, reliability prediction, physics-of-degradation, accelerated degradation testing, control, condition monitoring.

## I. INTRODUCTION

Efficiency and power density have been at the heart of the power electronics community over the course of the last five decades. The improvements in efficiency and power density are heavily driven by power semiconductor technologies, circuit topologies, and control methods. Fig. 1 shows the historical development of power semiconductors, power electronics, and reliability engineering. With the increasing demands of application-driven research, reliability has become a considerable practical challenge in power electronics, among others listed in Fig. 1. Reliability is an important performance factor that is considered during the design, manufacturing, and field operation of power electronic converters. Over the past decade, the business model in industry has been transitioning from product suppliers to holistic service providers. This demands life-cycle-cost reduction and operation optimization of power electronic converters through innovative design, in-depth understanding of failure mechanisms and mission profiles, and predictive maintenance.

In 2013-2014, the authors reviewed three aspects of research in power electronics reliability in [1,2]. In May 2015, IEEE Transactions on Power Electronics published a special issue on Robust Design and Reliability of Power Electronics. In the same year, an edited book on the Reliability of Power Electronic Converter Systems was published by IET [3]. Research activities and outcomes worldwide have increased significantly in the last decade. Many of the results are reviewed in [4–13]. In [4], the primary research activities in the power cycling tests of power modules between 1994 and 2015 are summarized, showing the variety of packaging technologies, control strategies, failure mechanisms of interest, and failure analysis methods. In [5–7], the methods for estimating the junction temperature and health condition of power modules are reviewed. Following the adoption of wide-bandgap devices in commercial applications, new packaging technologies are discussed in [8, 9]. The chip-level and packaging-level failure mechanisms, testing, and condition monitoring health precursors of SiC devices are summarized in [10]. The failure mechanisms of GaN devices are discussed in [11]. In addition to active components, capacitor reliability and condition monitoring methods are discussed in [12, 13], respectively. These overview papers provide a foundation to understand the state-of-the-art research on power electronics reliability.


This paper focuses on aspects of power electronics reliability research that are less addressed in [1–13]. Most of the references discussed in this paper are drawn from the last five years. For a more comprehensive list of references and discussions, please refer to [1–13]. It should be noted that the scope of this paper is limited as follows:

- 1) The component failure mechanisms discussed are mainly the short-circuit of SiC MOSFETs and GaN devices, and the wear-out of magnetic components. The failure mechanisms of Si IGBT modules and capacitors, and the degradation of SiC MOSFETs and GaN devices are discussed in detail in [10–12].
- 2) The discussed stressors are limited to the electro-thermal stresses and humidity. Other stressors (e.g. vibration and environmental contaminants) and related failure mechanisms are not included.
- 3) A comprehensive reliability prediction procedure for power electronic converters is not included. Instead, the references which give detailed discussions of this topic are introduced. In addition, the reliability prediction methods are only briefly presented. For more details, please refer to [2].

This paper starts with a description of the relevance of component-level failure mechanisms in power electronic applications. Next, component degradation curves, lifetime model limitations, relevance, and time constraints of accelerated testing are discussed. Mission-profile-based reliability prediction and its challenges in thermal model simplification, and accumulated damage modeling are also presented. Finally, a specific outlook of power electronics reliability research is provided.

Huai Wang and Frede Blaabjerg are with the Department of Energy Technology, Aalborg University, 9220 Aalborg, Denmark. (Email: hwa@et.aau.dk; fbl@et.aau.dk)

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JESTPE.2020.3037161, IEEE Journal of Emerging and Selected Topics in Power Electronics



Fig. 1. The historical development of power semiconductors, power electronics, and reliability engineering.

## II. FAILURE IN POWER ELECTRONIC COMPONENTS AND SYSTEMS

In reliability engineering, failure is typically classified into sudden failure and degraded failure. Degradation over time may ultimately cause a sudden failure if no corrective action is taken. Sudden failure is caused by design defects, manufacturing issues, single-event effects, overstress, or misuse. In contrast, degraded failure is caused by long-term degradation. For a power electronic system, failure can come from both hardware and software, due to intrinsic and extrinsic factors. This paper focuses on hardware failures; consequently, software reliability [14] and human reliability analysis [15] are not included. Nevertheless, they are essential to a complete system-level reliability design and analysis.

Theoretically, multiple failure mechanisms exist for a single power electronic component [10–12], [16]. At the application level, it might not be realistic to consider all of the reported failure mechanisms. A more effective method is to address a few dominant failure mechanisms that are relevant in field operation. However, the problem here is that the dominant failure mechanisms may vary with the design criteria, and environmental and operational conditions of different applications.

#### A. Examples of field experiences

An example of field experience in photovoltaic (PV) applications is presented in [17], which is based on three data reports (primarily for central inverters). Report 1 includes more than 3,500 service tickets (i.e. abnormal states of PV systems) that were requested during 2010-2012 with a total operation of 2,800 inverter-years [18]. Report 2 is from an older set of PV systems monitored during 2008-2010 with a total of 1,650 inverter-years. Report 3 is based on about 400 service tickets. According to [17, Fig. 1], inverters account for 43% to 70% of PV plant service requests. Inverter component failure breakdown based on the three data sources is analyzed in [17, Fig. 3]. Control software failure stands out in all three reports. This specific failure area may be the consequence of hardware malfunction, besides software or firmware issues. The inverters may resume normal operation upon a manual restart [18]. Although the widely studied hardware components (e.g. IGBTs and capacitors) remain important in terms of their failure percentage as a single type of component, other failure areas including control card PCBs, fans, and fuses also rank high. However, in-depth failure analyses are unavailable, and the causes of failure and the dominant stressors are still open to question. In particular, it would be interesting to study humidity-related failure mechanisms due to the condensation effect of frequent cold starts in the intermittent operation of PV inverters.

In [19], an example of wind power converter failure analysis is given. The phase modules are one area of failure, including IGBT modules, gate driver boards, DC-link capacitors, and busbars. The database used for this analysis includes 2,734 wind turbines that were commissioned between 1997 and 2015, operating in 23 countries from 11 suppliers. As shown in [19, Fig. 3], most wind turbines are either in the early or middle operation phase when considering a typical service life of 25 years. According to [19, Fig. 5], except for other unknown failure sources, phase modules are responsible for the largest fraction of power converter failures (i.e. 22%), followed by the cooling system, control board, and the main circuit breaker. On average, 0.16 phase-module failures per year per turbine are observed for these turbines. The detailed results question the widespread assumption that thermal-related wear-out is the dominant failure mechanism of these phase modules. Instead, they indicate a correlation between the failure rate and the seasonally changing absolute humidity level, which suggests that humidity and condensation may be the main stressors causing failure. These observations are consistent with the findings in [20].
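The reported average rate can be turned into rough planning figures. The sketch below assumes, purely for illustration, a constant failure rate (a homogeneous Poisson process); the seasonal humidity correlation reported in [19] indicates that this is only an approximation. The function names are hypothetical.

```python
import math

def mean_time_between_failures(rate_per_year: float) -> float:
    """Mean time between failures in years, assuming a constant failure rate."""
    return 1.0 / rate_per_year

def prob_at_least_one_failure(rate_per_year: float, years: float) -> float:
    """P(at least one failure) within `years` for a homogeneous Poisson process."""
    return 1.0 - math.exp(-rate_per_year * years)

# 0.16 phase-module failures per turbine per year, as reported in [19].
rate = 0.16
mtbf_years = mean_time_between_failures(rate)       # 6.25 years
p_one_year = prob_at_least_one_failure(rate, 1.0)   # roughly 0.15
```

Under this assumption, a single turbine sees a phase-module failure about once every six years on average, which is well within the early-to-middle part of a 25-year service life.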

These examples represent two large databases for converter-specific failure analyses, which are rarely made publicly available in other power electronic applications. Both findings suggest the need for further investigation of humidity-related failure mechanisms for power electronic components [21, 22]. Nevertheless, they do not necessarily imply that the thermal-related wear-out failure mechanisms are irrelevant. Most of the analyzed wind turbines in [19] are still in the early-to-middle phase of their service life. A proper design and manufacturing process aims to prevent or to a large extent reduce wear-out failure. Thermal-related wear-out prediction can inform decisions in the design and operation phases to prevent the associated failure from becoming dominant during the expected service life.

#### B. Selected failure mechanisms

1) Short-circuit and overstress failure of SiC MOSFETs: The reliability performance of SiC MOSFETs is of significant consideration in power electronic applications. Extrinsic failures due to defectivity or high process variability, and intrinsic failures due to material degradation are both relevant [23]. On the one hand, field experiences are limited by the early stage of technology adoption. On the other hand, unique reliability issues compared to Si devices give a lower cycling capability with conventional packaging and reduced short-circuit withstand time [24]. The power cycling capability of SiC MOSFETs has been reported to be lower than that of IGBTs with the same current rating and package technology in [25] by testing and in [26] by simulation. New packaging technologies [9] are likely to overcome the challenges of power cycling capability. Nevertheless, the compromised short-circuit withstand time with the reduced SiC die size is still relevant in power electronic applications.

Much effort has been made in the short-circuit testing [27-33], [34] of SiC MOSFETs and its protection in power converter design [34-38]. In [27], comparative testing of two types of 1.2 kV/24 A SiC MOSFETs with a Si MOSFET is given. Only a gradual reduction of gate-source voltage is observed for the testing samples, which is caused by the gate leakage current triggered by excessive power dissipation during the short-circuit operation, resulting in lower gate-oxide reliability due to a thinner gate oxide. In [28], 1.2 kV/36 A SiC devices are tested with short-pulse high-voltage and long-pulse low-voltage. Two different failure mechanisms are observed: first, thermal runaway due to high off-state drain leakage current; and second, gate-source shorting due to break up of top layers. The observed failure mechanisms are consistent with [29] in their case studies of 1.2 kV/180 A and 1.2 kV/300 A high-current SiC MOSFET modules. Under 600 V testing, the thermal runaway becomes dominant according to the switching waveforms shown in Fig. 2(a). The saturation current reaches about 18 times the nominal value. It then drops quickly before the switch is turned off, which indicates a significant increase in the chip temperature. A delayed failure occurs at 2 $\mu$s after the turn-off due to excessive drain leakage current at high temperature. When the testing voltage increases to 800 V, gate-source shorting is observed. This is due to the increased gate leakage current triggered by high local temperatures close to the gate oxide. The corresponding switching waveforms are shown in Fig. 2(b). A safe operation area is proposed in terms of gate-source voltage and short-circuit current [29]. A case study of a 10 kV/10 A SiC MOSFET tested under 6 kV is presented in [30]. The testing results do not show signs of measurable thermally generated currents or degradation in the gate structure. The impact of thermal stress on short-circuit capability is studied in [31, 32] based on the testing of 1.2 kV SiC MOSFETs under different case temperatures. This shows that the short-circuit critical energy and withstand time slightly decrease with the case temperature. Moreover, the relationship between the number of repetitive short-circuits to failure and the case temperature is presented [32]. The gate-source voltage drop from the initial value of 15 V at the time of 2 $\mu$s in short-circuit operation is used as a failure criterion. This shows that with the increase of case temperature, the number of repetitive short-circuits to failure increases, because the energy dissipated during the short-circuit operation decreases with the increase of case temperature due to the negative temperature coefficient of the MOSFETs. The impact of degradation on the short-circuit withstand capability of a 1.2 kV SiC MOSFET module has been investigated [33] based on comparisons between new samples and aged samples from power cycling tests. This study shows that the gate-oxide degradation has a more significant impact on the short-circuit withstand time than the packaging-level degradation.
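The short-circuit energy referred to in these studies is the time integral of the product of drain-source voltage and drain current over the short-circuit pulse. A minimal sketch of that calculation is given here, using trapezoidal integration; the waveform is synthetic and idealized (it is not data from [27]–[33]), and the function name is hypothetical.

```python
def short_circuit_energy(t, v_ds, i_d):
    """Trapezoidal integral of v_ds * i_d over time; returns energy in joules."""
    if not (len(t) == len(v_ds) == len(i_d)):
        raise ValueError("t, v_ds and i_d must have the same length")
    energy = 0.0
    for k in range(1, len(t)):
        p_prev = v_ds[k - 1] * i_d[k - 1]   # instantaneous power at sample k-1
        p_curr = v_ds[k] * i_d[k]           # instantaneous power at sample k
        energy += 0.5 * (p_prev + p_curr) * (t[k] - t[k - 1])
    return energy

# Synthetic, idealized pulse (NOT measured data): 600 V across the device
# and a constant 1 kA drain current for 2 us, sampled every 10 ns.
dt = 10e-9
t = [k * dt for k in range(201)]
v_ds = [600.0] * len(t)
i_d = [1000.0] * len(t)
e_sc = short_circuit_energy(t, v_ds, i_d)   # 600 V * 1 kA * 2 us = 1.2 J
```

With measured waveforms, the same routine can be applied up to the instant of failure to estimate the critical energy for a given case temperature.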

These studies give us a better understanding of the short-circuit failure mechanisms of SiC MOSFETs, and mutual effects with chip-level and packaging-level degradation. It can be noted that testing conditions still vary in terms of the DC supply voltage, gate drive voltage, and case temperature due to the lack of standards. Moreover, the study of high-voltage or high-current devices is still limited due to the scarcity of the samples and higher demands on the testing facilities. With the increasing commercial applications of SiC devices, it is expected that more studies on short-circuits will be needed. At the application level, fast and robust short-circuit protection [34–38] is essential to utilize the promising properties of SiC power devices without compromising the device-level design.

(a) Short-circuit thermal runaway failure occurs at 600 V testing voltage.

(b) Short-circuit gate breakdown failure occurs at 800 V testing voltage.

Fig. 2. Short-circuit testing of a 1.2-kV/300-A SiC MOSFET module under different DC voltages and a case temperature of 25°C. $I_D$ – drain current, $V_{GS}$ – gate-source voltage, $V_{DS}$ – drain-source voltage. Gate driver supply +20 V/-6 V with an external gate resistance of 5 $\Omega$ [29].

Overvoltage and overcurrent caused by parasitics and sustained oscillations are also important issues related to reliability and EMI [39] in SiC applications. Excessive EMI may lead to failure to fulfill the relevant standards and regulations. Advanced packaging technologies [9] and converter-level reduction of parasitics and couplings between the power loop and gate driver loop can address this issue [39].

2) Wear-out and short-circuit failure mechanisms of GaN devices: Like SiC MOSFETs, both the extrinsic and intrinsic failures of GaN devices need to be considered for power electronic applications. The level of extrinsic failures of GaN devices due to material defectivity has still not reached the level of the Si counterparts [40]. The internal GaN-specific degradation mechanisms during off-state, switching-state, and on-state are presented in [11], [41]. In [41], a power cycling test is designed to exclude thermo-mechanical effects by setting temperature swings less than 25°C. This work studies the threshold voltage shift and dynamic on-state resistance change due to the trapping effect. Thermo-mechanical-related power cycling tests are performed in [42–45] with temperature swings of 100°C or above. Drain-to-source off-state leakage current (IDSS) failure is reported in [42, 43]. The failure mechanism is identified in [44] by post-failure analysis tools and detailed FEM simulations. This reveals that the thermo-mechanical stresses induced by an acceptable temperature range can cause multilayer cracks in the die of the GaN device.


Short-circuit failure mechanisms are studied in [46-49]. The drain-to-source voltage distinguishes two failure modes of GaN devices. A case study on a 650 V p-GaN device is given in [46, 47]. The short-circuit withstand time is in the range of a few $\mu$s or more when the testing voltage is below 350 V. This reduces to a few hundred ns when the voltage is above 350 V. The first failure mode is due to the temperature increase of the entire device (i.e. more than 1,000 K), causing damage to the surface metal [48, 49]. The second failure mode is related to local thermal destruction in a relatively small region of the device [46], [48, 49]. Specifically, it is observed that the right-hand side of the gate field plate is the most sensitive part of the testing samples [49]. From an application perspective, it can be noted that destructive short-circuits could happen within sub-$\mu$s under moderate voltage stresses. Therefore, the design of an ultrafast overcurrent protection solution [50] is even more challenging than that for SiC MOSFETs. The drain-to-source voltage stress is a vital factor determining short-circuit capability and dominant failure mechanisms of both SiC MOSFETs and GaN devices.

3) Reliability of magnetic components in power electronic applications: The transformers and inductors that are used in power electronic converters are generally recognized as having fewer failure issues than capacitors and power semiconductor switches. Nevertheless, design-to-limit could become a standard practice to meet the demand for cost reduction and power density increase. In these design scenarios, thermal-related failure and insulation-related wear-out become relevant within the service life of power electronic converters. Moreover, insulation degradation or sudden breakdown is further accelerated by thermal stress, moisture, and dust levels.

Reliability studies of medium-frequency to high-frequency magnetic components for power electronic applications are rarely available. The first studies are presented in [51,52] on the degradation testing and lifetime modeling of planar transformers. Two groups of 12 samples are tested under 200°C and 180°C, respectively, for 3,500 hours and 2,500 hours. The primary-side inductance reduces by 15-40%. Two of the possible failure mechanisms are core material degradation and expansion of the glue between two cores. Further investigations are needed to confirm the relevant failure mechanisms.

Understanding the relevant failure modes and failure mechanisms is the first step to prioritizing resources for reliability testing, lifetime and reliability prediction, as well as condition monitoring.

## III. DEGRADATION AND LIFETIME OF POWER ELECTRONIC COMPONENTS

#### A. Component degradation curve and End-of-Life (EOL) criteria

The degradation of power electronic components typically occurs at the material or interconnection level, which needs advanced physics analysis tools [44] to identify the location and structure change. Nevertheless, because the changes of materials and interconnections affect the electro-thermal parameters, these parameters can be used as health precursors to indirectly estimate the degradation level. For example, on-state saturation voltage and junction-to-case thermal resistance are widely used health precursors for IGBT modules, and they are applied for both accelerated degradation testing [53] and condition monitoring [6]. Capacitance, Equivalent Series Resistance (ESR), dissipation factor, and insulation resistance are usually used for capacitors [12, 13].

Fig. 3 shows a generic degradation curve of a precursor corresponding to a specific failure mechanism, where y is the value of the health precursor and $\Delta y$ is its parameter shift. The y-axis shows the absolute percentage change of y to its initial value $y_0$. The degradation curve has three possible distinctive stages: I, II, and III. These stages correspond to the time intervals in which the health precursor remains constant, increases or decreases linearly, and increases or decreases at an accelerated pace.

It should be noted that Fig. 3 is given for illustration purposes only. Not all of the three stages necessarily appear for a specific precursor. A typical degradation curve is a combination of one or more of the three stages. The slope and percentage of each period vary for different failure precursors and stress conditions.

The time-to-failure of an individual component due to wear-out is determined by one or more defined EOL criteria, which are chosen by considering the component-level destruct limit, system-level specification, and a certain margin, as shown in Fig. 3. In practice, this is achieved in terms of the percentage of change of y to its initial value $y_0$. It is reported in [4] that a 5% increase of on-state saturation voltage or a 20% increase of junction-to-case thermal resistance is used as EOL in most of the analyzed 70 publications on power cycling. This is consistent with the recommendations by [53]. Nevertheless, 13% of the publications use a 20% increase of the on-state saturation voltage as EOL. For electrolytic capacitors, a 20% capacitance drop and a 2-3 times increase of ESR or dissipation factor are commonly used [12]. The EOL in terms of capacitance drop for film capacitors is usually within the range of 2-10%.
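As an illustration of how an EOL criterion expressed as a percentage change of y translates into a time-to-failure, the sketch below scans a sampled degradation curve and interpolates the first crossing of the threshold. The data points are hypothetical on-state voltage readings, not measurements from any cited work, and the function name is an assumption.

```python
def time_to_failure(times, values, eol_fraction):
    """Return the first time at which |y - y0| / |y0| reaches eol_fraction.

    Linear interpolation is used between the two samples that bracket the
    crossing. Returns None if the EOL criterion is never reached.
    """
    if eol_fraction <= 0:
        raise ValueError("eol_fraction must be positive")
    y0 = values[0]
    prev_t, prev_shift = times[0], 0.0
    for t, y in zip(times[1:], values[1:]):
        shift = abs(y - y0) / abs(y0)
        if shift >= eol_fraction:
            # Interpolate between the previous sample and this one.
            frac = (eol_fraction - prev_shift) / (shift - prev_shift)
            return prev_t + frac * (t - prev_t)
        prev_t, prev_shift = t, shift
    return None

# Hypothetical on-state voltage (V) versus number of power cycles:
# flat (Stage I), slow linear rise (Stage II), accelerated rise (Stage III).
cycles = [0, 100_000, 200_000, 300_000, 350_000, 400_000]
v_on = [2.00, 2.00, 2.02, 2.04, 2.08, 2.16]

n_eol_5 = time_to_failure(cycles, v_on, 0.05)    # 5% increase criterion
n_eol_20 = time_to_failure(cycles, v_on, 0.20)   # never reached in this data
```

In this synthetic case the 5% criterion is met shortly after the curve enters Stage III, while a 20% criterion is not reached within the recorded data, which mirrors how the choice of EOL changes the reported time-to-failure.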

The variety in EOL in terms of the percentage of change depends mainly on when the degradation accelerates and enters Stage III, as shown in Fig. 3. This does not usually correspond to the time when the component destruct limit is reached or a system-level failure occurs. Therefore, the change rate of y plays an essential role in defining the EOL, but not necessarily the absolute value of y. This implies that component-level and system-level failure would occur soon if no action is taken at the EOL. Meanwhile, the EOL can also be decided according to the component parameter constraints considering system-level requirements.

Fig. 3. Three possible stages of a component degradation curve. $\Delta y$ is the parameter shift of the health precursor y, $y_0$ is the initial value of y before testing or in use.

#### B. (Percentile) lifetime definition

Lifetime is a widely used reliability metric for a population of items defined at a specific reliability level. For an individual item, time-to-failure is used, corresponding to when the EOL is reached. Fig. 4 shows a Weibull plot [54] of an unreliability curve. The curve is plotted based on the time-to-failure data of a limited number of samples. The confidence intervals illustrate the uncertainties. The solid line and dashed line represent the 50% and 95% confidence levels, respectively. $B_X$ is used to define the percentile lifetime corresponding to the time when there is X% of accumulated failure. It can be noted that the obtained lifetime varies with X% and confidence level. B0.1\_95% implies a 95% probability that at most 0.1% of the population of items will have failed by that time. In other words, statistically, there is a 5% risk that more than 0.1% of the items will have failed by that time. B10\_50% implies a 50% probability that at most 10% of the items will have failed by that time. Moreover, the plots shown in Fig. 4 are for specific stress conditions. Therefore, a comprehensive lifetime definition should include at least the following four aspects:

- 1) The environmental and operational conditions,
- 2) The EOL used for determining the time-to-failure,
- 3) The corresponding percentage of accumulated failure (X%),
- 4) The confidence level.

Nevertheless, such comprehensive information is rarely provided in the literature when lifetime is stated, which means that we do not know the implication for failure and the associated risk. If any aspect of the information is missing, then the lifetime comparison would be in question. Moreover, the $B_X$ lifetime with a certain confidence level is a single point in the unreliability curve. The same $B_X$ lifetime for two different items does not necessarily imply equal reliability during the service life because the slope of their degradation curves could be different.
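The point that one $B_X$ value does not fix the whole unreliability curve can be illustrated numerically. The sketch below assumes a two-parameter Weibull distribution with scale $\eta$ and shape $\beta$, and it works with point estimates only (no confidence bounds). All numerical values and function names are hypothetical.

```python
import math

def weibull_unreliability(t, eta, beta):
    """F(t) = 1 - exp(-(t/eta)^beta) for a two-parameter Weibull distribution."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def bx_lifetime(x_percent, eta, beta):
    """Time at which X% of the population has failed (point estimate)."""
    return eta * (-math.log(1.0 - x_percent / 100.0)) ** (1.0 / beta)

def eta_for_bx(bx, x_percent, beta):
    """Scale parameter that yields a given B_X lifetime for a given shape."""
    return bx / (-math.log(1.0 - x_percent / 100.0)) ** (1.0 / beta)

# Two hypothetical items with the same B10 of 10 years but different shapes.
b10 = 10.0
eta_a = eta_for_bx(b10, 10.0, beta=2.0)
eta_b = eta_for_bx(b10, 10.0, beta=5.0)

# Accumulated failure after 5 years differs by almost an order of magnitude.
f_a_5y = weibull_unreliability(5.0, eta_a, 2.0)   # about 2.6%
f_b_5y = weibull_unreliability(5.0, eta_b, 5.0)   # about 0.33%
```

Both items reach 10% accumulated failure at the same time, yet the item with the smaller shape parameter accumulates far more failures early in the service life.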



Fig. 4. Illustration of percentile lifetime definition under different confidence levels for a specific wear-out failure mechanism based on time-to-failure data in a Weibull plot.

#### C. Limitations of existing lifetime models

The most widely-considered stressors in empirical lifetime models for power electronic components are temperature, temperature swing, and voltage. The capacitor lifetime model is given by (1), and a simplified version by (2) [11]. The lifetime model for the packaging related wear-out of power modules is given by (3). Relative Humidity (RH) is an additional stressor, as described by [55]. The acceleration factor  $AF_{\rm RH}$  is presented by (4).

$$L = L_{\rm ref} \times \left(\frac{V}{V_{\rm ref}}\right)^{-n_1} \times \exp\left[\left(\frac{E_{\rm a}}{K_{\rm B}}\right) \left(\frac{1}{T} - \frac{1}{T_{\rm ref}}\right)\right] \times \text{other factors} \tag{1}$$

$$L = L_{\rm ref} \times \left(\frac{V}{V_{\rm ref}}\right)^{-n_1} \times 2^{\frac{T_{\rm ref} - T}{n_2}} \times \text{other factors} \qquad (2)$$

$$N = A \times (\Delta T)^{-\beta_1} \times \exp\left(\frac{E_{\rm a}}{K_{\rm B}T}\right) \times t_{\rm on}^{\beta_2} \times \text{other factors} \tag{3}$$

$$AF_{\rm RH} = \left(\frac{RH}{RH_{\rm ref}}\right)^{-n_3} \tag{4}$$

where L is the predicted lifetime at the stress level of interest, and N is the predicted number of cycles to failure at the stress level of interest. V is the voltage stress, T is the temperature, $\Delta T$ is the temperature variation, $t_{on}$ is the heating time of the thermal cycle, $E_a$ is the activation energy, $K_B$ is the Boltzmann constant ($8.62 \times 10^{-5}$ eV/K), and RH is the relative humidity. $L_{ref}$ is the lifetime at the reference operating conditions of $V_{ref}$ and $T_{ref}$. $RH_{ref}$ is the reference RH for the acceleration factor $AF_{RH}$. The parameters $L_{ref}$, A, $E_a$, $n_1$, $n_2$, $n_3$, $\beta_1$, and $\beta_2$ are constants to be obtained from time-to-failure data. The other factors are not widely considered or are unknown. For example, a humidity derating factor, as shown in (4), is added to (2) in [55]. In [56], three other factors related to the power module design parameters are included. Physics-based lifetime models are also available for power modules, as given in [3, Ch.5]. These models provide insights into improving component design through physics analyses and they can verify the assumed failure mechanisms used for the empirical models. At the application level, empirical lifetime models, as given by (1) - (4), are commonly used thanks to their simple form and direct correlation with the electro-thermal stresses and other stressors. They also give insights into lifetime extension by reducing the stresses through power converter design and control.
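For readers who want to evaluate (1) to (4) numerically, a minimal sketch follows. The "other factors" are omitted, temperatures in the Arrhenius terms are assumed to be absolute temperatures in kelvin, and every parameter value in the example is hypothetical rather than fitted to data.

```python
import math

K_B = 8.62e-5  # Boltzmann constant in eV/K, as given in the text

def capacitor_lifetime(l_ref, v, v_ref, t_k, t_ref_k, n1, e_a):
    """Eq. (1) without 'other factors'; t_k and t_ref_k in kelvin, e_a in eV."""
    voltage_term = (v / v_ref) ** (-n1)
    arrhenius_term = math.exp((e_a / K_B) * (1.0 / t_k - 1.0 / t_ref_k))
    return l_ref * voltage_term * arrhenius_term

def capacitor_lifetime_simplified(l_ref, v, v_ref, t, t_ref, n1, n2):
    """Eq. (2) without 'other factors'; t and t_ref in the same unit."""
    return l_ref * (v / v_ref) ** (-n1) * 2.0 ** ((t_ref - t) / n2)

def cycles_to_failure(a, delta_t, t_k, t_on, beta1, beta2, e_a):
    """Eq. (3) without 'other factors'; t_k in kelvin, e_a in eV."""
    return a * delta_t ** (-beta1) * math.exp(e_a / (K_B * t_k)) * t_on ** beta2

def humidity_acceleration_factor(rh, rh_ref, n3):
    """Eq. (4)."""
    return (rh / rh_ref) ** (-n3)

# Hypothetical example for Eq. (2): 5,000 h at 105 degC and rated voltage,
# operated at rated voltage and 85 degC with n2 = 10.
life_hours = capacitor_lifetime_simplified(
    l_ref=5000.0, v=1.0, v_ref=1.0, t=85.0, t_ref=105.0, n1=3.0, n2=10.0
)  # 5000 * 2**2 = 20000 h
```

At the reference conditions each model returns its reference value, which is a convenient sanity check when implementing fitted parameters.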

The following limitations of existing lifetime models are identified according to the literature and to experience:

- 1) Limited information is available for EOL, $B_X$ lifetime definition, and confidence level when a lifetime model is presented. In [55], [57], comprehensive degradation data analyses are presented for capacitor testing and power module testing, respectively. These analyses reveal that the obtained $B_X$ values vary significantly with different definitions and confidence levels.
- 2) Limited information is available for the associated failure mechanism when a lifetime model is presented. The power cycling tests are designed to trigger different failure mechanisms in bond wires and solder joints of power modules in [58, 59]. For capacitors, lifetime models for different failure mechanisms [11] are unavailable, to the best of our knowledge.
- 3) The applicable stress ranges of a lifetime model are rarely investigated in reliability prediction. Fig. 5 illustrates the boundaries for the application of a specific lifetime model. $S_{\rm L}$ and $S_{\rm H}$ represent the lower and higher boundaries of the considered stressor. The underlying assumption of the models given in (1) – (3) is that the dominant failure mechanism remains the same as that described by the lifetime model within the predicted lifetime. $L_1$ and $L_2$ in Fig. 5 show the interval when the lifetime model is applicable. Consequently, if the predicted lifetime is longer than $L_2$, then the results should be interpreted cautiously. For example, the lifetime model in (2) is applied to aluminum electrolytic capacitors. In [60], it is stated that the lifetime model is applicable for up to 15 years (i.e. $L_2 = 15$ years). One reason for this is that a new failure mechanism (e.g. sealing rubber degradation) may become dominant after 15 years of operation, instead of the considered electrolyte evaporation and electro-chemical reaction. It also indicates that the applicable range of the voltage stress in (2) is at the rated voltage or above for small-sized capacitors. The voltage stress under the rated one has a negligible impact on their degradation. In [61], the applicable ranges of different factors are given in the IGBT power module lifetime model based on the testing conditions. Nevertheless, for many other studies in the literature, such information is unavailable. Therefore, further investigations of the specific boundaries shown in Fig. 5 are suggested.
- 4) Limited numbers of failure mechanisms are considered. The failure mechanisms associated with the models (1) to (3) represent only a fraction of the relevant failure mechanisms. Failure during the early-to-middle service life could be responsible for a considerable percentage, as revealed by the field experiences in [19] and the discussions on power cycling tests in [62]. Other factors may also play an important role, such as humidity or corrosive atmosphere. As discussed in Section II, these models are useful for designing power electronic systems to avoid the wear-out failure mechanisms becoming life-limiting factors. However, they are only part of the picture for the reliability prediction and reliability design. Other reliability tools are also widely applied, such as Failure Mode and Effect Analysis (FMEA), robustness design, protection, fault-tolerance, and burn-in testing.

Fig. 5. Illustration of the lifetime model application boundaries in terms of stress level and service life. $S_L$ and $S_H$ are the lower and higher stress boundaries in which the lifetime model is valid; $L_1$ and $L_2$ are the lower and higher service life boundaries in which the specific failure mechanism described by the lifetime model is dominant.
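The applicability boundaries discussed in limitation 3) can be enforced mechanically whenever a lifetime model is evaluated. The following is a minimal sketch of such a guard; the class and function names are hypothetical, the value $L_2 = 15$ years is the one quoted from [60], and the remaining bounds are placeholders chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class ModelBounds:
    s_low: float    # S_L: lowest stress level covered by the model
    s_high: float   # S_H: highest stress level covered by the model
    l_min: float    # L_1: shortest lifetime for which the mechanism dominates
    l_max: float    # L_2: longest lifetime for which the mechanism dominates

def check_prediction(stress, predicted_life, bounds):
    """Return a list of notes; an empty list means the prediction is in range."""
    notes = []
    if stress < bounds.s_low:
        notes.append("stress below S_L")
    elif stress > bounds.s_high:
        notes.append("stress above S_H")
    if predicted_life < bounds.l_min:
        notes.append("predicted lifetime below L_1")
    elif predicted_life > bounds.l_max:
        notes.append("predicted lifetime above L_2")
    return notes

# Stress expressed as voltage relative to the rated voltage (placeholder range);
# lifetimes in years, with L_2 = 15 years as stated in [60].
bounds = ModelBounds(s_low=1.0, s_high=1.2, l_min=0.0, l_max=15.0)
notes = check_prediction(stress=1.0, predicted_life=22.0, bounds=bounds)
```

A non-empty result does not mean that the prediction is wrong, only that it lies outside the region in which the model has been validated and should be interpreted cautiously.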

#### D. Implications on component-level accelerated testing

Time and relevance are the two key considerations in component-level accelerated testing. For a given testing method and stressor S, the testing time can be reduced by increasing the stress. However, when the stress level is above the  $S_{\rm H}$  shown in Fig. 5, this risks triggering new failure mechanisms that do not appear in normal operation. Conversely, testing under conditions close to normal operation is very time-consuming and is therefore often not a preferable choice.

1) Testing for relevant failure mechanism and stressor S: Much effort has been devoted to power cycling that triggers specific failure mechanisms individually, such as bond wire wear-out, chip solder fatigue, and baseplate solder fatigue. This is implemented by setting proper testing conditions in terms of junction temperature, temperature swing, and cycling period [4]. Besides component degradation curve analysis, the Weibull plot [54, Ch.5] can be used to verify whether the time-to-failure data from accelerated testing belong to a single failure mechanism: if they do, the slope of the unreliability curve (i.e. shaping factor) shown in Fig. 4 tends to be consistent. In [63], the testing results under different temperature swings have a shaping factor of 37.5 and 14.8, respectively. This corresponds to the dominant failure mechanism of chip solder fatigue and bond wire wear-out, respectively. Humidity-related failure mechanisms are investigated in [55], [64, 65] for SiC power modules, Si IGBT modules, and film capacitors, respectively. As discussed in Section II on the field experiences, humidity-related failures are likely to be dominant over those caused by thermal stresses for the surveyed systems in the early-to-middle stage of service life. More studies are needed to confirm the associated failure mechanisms and design the relevant accelerated tests. Moreover, research to separate different failure mechanisms for capacitors and magnetic components in accelerated testing is still at an early stage.
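The shaping-factor comparison described above can be sketched numerically. The Python snippet below is a minimal illustration, not the procedure of [54] or [63]: it estimates the Weibull shaping factor $\beta$ and scale $\eta$ by median rank regression with Bernard's approximation, assuming complete (uncensored) time-to-failure data. The function name and the data used in any example are hypothetical.

```python
import math


def weibull_fit_median_rank(times_to_failure):
    """Estimate the Weibull shaping factor (beta) and scale (eta).

    Fits a least-squares line through (ln t_i, ln(-ln(1 - F_i))), where F_i is
    Bernard's median rank approximation. Assumes uncensored data, n >= 2.
    """
    t = sorted(times_to_failure)
    n = len(t)
    if n < 2:
        raise ValueError("at least two failure times are required")
    x = [math.log(ti) for ti in t]
    y = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    beta = sxy / sxx                         # slope of the Weibull plot
    eta = math.exp(x_mean - y_mean / beta)   # from intercept = -beta * ln(eta)
    return beta, eta
```

Fitting each test group separately and comparing the resulting $\beta$ values gives a first indication of whether the groups share one dominant failure mechanism; clearly different slopes, such as the 37.5 and 14.8 reported in [63], suggest that they do not.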

2) Testing with conditions more relevant to  $S_H$  and  $S_L$ : As discussed in [4], since 1994, most studies on power cycling have focused on the stressors of junction temperature swing  $\Delta T_j$  larger than 40°C. In comparison, the impact of the heating time on the power cycling capability is less investigated.  $\Delta T_j$  is usually less than 40°C in normal operation of a power converter. There are line frequency power cycles (i.e. 20 ms or 16.7 ms) and lower frequency cycles due to changes in load and ambient condition.

Several studies have aimed to address these limitations [57], [66–71]. The power cycling is conducted with the heating time  $t_{\rm on}$  as low as 70 ms,  $\Delta T_{\rm j}$  = 70°C, and the mean junction temperature  $T_{\rm jm}$  = 122°C for two types of IGBT modules in [68, 69]. The impact of  $t_{\rm on}$  on the cycling capability of Aluminum (Al) wire bonds and chip solder joints within the range of 70 ms to 60 s is modeled separately. In [57], a type of transfer molded Intelligent Power Module (IPM) is tested with  $t_{\rm on}$  from 147.5 ms to 2.5 s,  $\Delta T_{\rm j}$  = 80.8°C, and  $T_{\rm jm}$  = 122°C. In [67], the tests are with  $t_{\rm on}$  down to 10 ms,  $\Delta T_{\rm j}$  down to 23°C, and  $T_{\rm jm}$  = 100°C. Fig. 6 shows that the test with  $t_{\rm on}$  = 10 ms lasts for more than 500 million cycles. This study also reveals that the impact of  $t_{\rm on}$  becomes insignificant when it is less than 40 ms. The results from other testing groups are shown in Fig. 6 with different  $T_{\rm jm}$  and current per bond wire foot  $I_{\rm bf}$  normalized to  $T_{\rm jm}$  = 100°C and  $I_{\rm bf}$  = 5.87 A [67]. When  $\Delta T_{\rm j}$  is between 23°C and 27°C, a 10°C reduction of  $\Delta T_{\rm j}$  would increase the cycling capability more than 10 times. The high number of cycles to failure and distinct slope changes imply that it is near the elastic deformation range, where the resultant degradation due to  $\Delta T_{\rm j}$  is negligible. Instead, the absolute junction temperature plays a significant role in such a range. Power cycling tests with a combination of high  $\Delta T_{\rm j}$  and low  $\Delta T_{\rm j}$  are proposed in [70, 71] to investigate the impact of low  $\Delta T_{\rm j}$  within a realistic testing period.  $\Delta T_{\rm j}$  is down to 32.8°C with  $T_{\rm jm}$  of 56°C in three different testing stages. This reveals that the low  $\Delta T_{\rm j}$  has a minor impact on Stage I and II of the degradation curve shown in Fig. 3. This effect is manifested when the testing samples reach Stage III.

For capacitors, the time for degradation testing is usually on the scale of thousands of hours [55], [60]. The acceleration factor is relatively lower compared to many tests designed for power semiconductor switches. In [60], the technical note finds that the lifetime model of Aluminum electrolytic capacitors depends on the ranges of both temperature and voltage. However, testing to verify the dependence is still not found in the literature.

Fig. 6. The normalized lifetime of 1,200 A IGBT modules with IHM-B package (140 mm × 190 mm) from manufacturer A and B with  $t_{\rm on} \leq 40$  ms (current per bond foot  $I_{\rm bf}$  = 5.87 A and mean junction temperature at the beginning of the test  $T_{\rm jm}$  = 100°C, Hartmann refers to the model in [66]) [67].  $N_{\rm f}$  – number of cycles to failure,  $\Delta T_{\rm j}$  – amplitude of junction temperature swing,  $T_{\rm jm}$  – mean junction temperature.

3) Testing with relevant control strategy for stressor S: Empirical lifetime models (1)-(4) are usually derived from tests under two or more stress levels. The stress control strategies have an implication on how to apply the derived models. For power cycling of IGBT modules, the control strategy with a constant switching-on duration of DC pulse current and constant switching-off duration is adopted by IEC standard IEC 60749-34 (2011) [72] and ECPE AQG324 guide [53].  $T_{\rm jm}$  and  $\Delta T_{\rm j}$  used for lifetime derivation are the values at the beginning of the testing. During the testing, they increase with possible parameter shifts due to degradation (e.g. thermal impedances of thermal interface materials, on-state voltage, and thermal resistance of baseplate solder and chip solder). In addition to this standard control strategy, another three methods (i.e. constant minimum and maximum case temperature, constant power loss, and constant  $\Delta T_{\rm j}$ ) are also discussed in [73]. The standard control strategy is most relevant to power electronic applications because parameter shifts can occur and result in higher thermal stresses during the service life of a power converter. Therefore, the impact of these degradation-dependent parameters does not need to be considered for the lifetime prediction in applications. If the lifetime model is obtained from power cycling test based on the other three control strategies, then either part or all the parameter shifts need to be included in the thermal stress modeling for lifetime prediction. This would increase the application-level modeling complexity. In this regard, passive thermal cycling of power modules without current stress ignores this degradation effect.

Capacitor testing can be implemented with or without ripple current. For the first category, a DC bias voltage, AC ripple current, and ambient temperature are kept constant. Similarly, the hot-spot temperature and peak voltage stress both increase during the test due to the changes of ESR and capacitance. The lifetime models (2) and (3) include the degradation effect if this control strategy is applied. For the second category, the effect of degradation-dependent ESR and capacitance is not considered in the lifetime model.

The control strategy for accelerated stressors affects the obtained lifetime models and converter-level modeling methods. As compared in [73], the cycle-to-failure based on constant  $\Delta T_{\rm j}$  could be three times that of the standard power cycling control strategy.

4) Testing with appropriate sample size: The sample size that is used for testing is closely related to the confidence interval. With the increase of sample size, the interval shown in Fig. 4 between the 95% confidence level and 50% confidence level becomes narrower. This implies a higher certainty level. Typically, 6 to 10 samples for each testing run are recommended as a compromise between meaningful statistical analysis and the required resources. For instance, 10 and 12 samples are used for capacitor and planar transformer testing in [51], [55], respectively. Nevertheless, it is reported that 86% of the power cycling tests have only one sample per test run based on the 70 publications discussed in [4]. This implies that meaningful lifetime information is unavailable in some cases even though a significant amount of resources is used. Consequently, more attention to the design of accelerated testing in terms of sample size is necessary.

5) Testing with superimposed stress conditions: At the application level, another aspect is the correlation between the testing under static stress level and field operation with dynamic environmental and operational conditions. Two questions need to be raised here: how can the effect of two or more different stress levels be combined? And, does the dominant failure mechanism remain the same under dynamic stress profiles?

The degradation curve shown in Fig. 3 represents a single stress level. The accumulated damage model is used to analyze the joint damage of two or multiple stress levels. The time-to-failure corresponds to when the accumulated damage reaches one. The Palmgren-Miner rule that was presented in 1945 [74] is a linear accumulated damage model that is widely used for lifetime prediction. This rule assumes no interaction between stress levels in terms of amplitude, duration, and time sequence. Most of the wear-out failure predictions of active devices and passive components in power electronics under dynamic stress conditions are based on the Palmgren-Miner rule. The nonlinear accumulated damage models that are used in other areas in reliability engineering are summarized in [75]. The validity of the linear accumulated damage model is questioned in [76-78]. Simulations in [76] show that the numbers of cycles to failure of bond wires are similar between the linear and nonlinear models. In [77], power cycling tests under three sets of ( $\Delta T_{\rm j}$ , minimum junction temperature  $T_{\rm jmin}$ , and  $t_{\rm on}$ ) are performed first to obtain the reference numbers of cycles to failure. The conditions at the beginning of the tests are Test 1 - (100°C, 50°C, 2s), Test 2 - (50°C, 100°C, 1s), and Test 3 - (100°C, 50°C, 15s). Each test has six samples. Test 4 runs for 50% of the referenced lifetime of Test 1 under the same condition, then for the rest of the time under the condition of Test 2 until the end of life (EOL) is reached. Test 5 runs for 50% of the referenced lifetime of Test 3 under the same condition, then until EOL under the condition of Test 2. Test 6 runs periodically with 20% of the reference lifetime of Test 1 and Test 2, respectively, under the same conditions until failure. The results from Test 4, Test 5, and Test 6 show, on average, 9.9%, 18.6%, and 0.6% more cycles to failure, respectively, than the values calculated based on the linear accumulated damage model. In [78], power cycling tests with superimposed stress levels are conducted with more frequent changes of the sequences than Test 6 in [77]. The conclusions of both studies are consistent, with bond wire wear-out assumed as the dominant failure mechanism [77, 78]. A different conclusion may be drawn if the failure mechanism changes under different stress levels, such as the case studies in [63]. It is worthwhile further investigating the validity of the linear accumulated damage model given that the reported number of case studies and testing conditions are still limited. Furthermore, no conclusive explanation is yet available for the observed results and the variations among the comparisons. In [79], a nonlinear damage model for Aluminum electrolytic capacitors is applied for the lifetime prediction in a motor drive application.
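The linear accumulated damage (Palmgren-Miner) rule discussed above sums the consumed life fractions $n_i / N_i$ and predicts failure when the sum reaches one. The Python sketch below illustrates this bookkeeping for a two-stage test with the same structure as Test 4; the function names and the reference cycles-to-failure values in the usage note are hypothetical and are not taken from [77].

```python
def miner_damage(blocks):
    """Accumulated damage D = sum(n_i / N_i).

    blocks: iterable of (applied_cycles, cycles_to_failure) pairs, one per stress level.
    """
    return sum(n / n_f for n, n_f in blocks)


def remaining_cycles(history, n_f_next):
    """Cycles left at the next stress level before D reaches one (0 if D >= 1 already)."""
    return max(0.0, 1.0 - miner_damage(history)) * n_f_next
```

As a usage example with hypothetical reference lifetimes of 100,000 cycles at the first stress level and 1,000,000 cycles at the second, running 50,000 cycles at the first level consumes half of the life, so `remaining_cycles([(50_000, 100_000)], 1_000_000)` predicts 500,000 remaining cycles at the second level. Deviations of measured results from such predictions are what the tests in [77, 78] quantify.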

Application-oriented testing under a representative mission profile can be used to check if the dominant failure mechanism remains the same with dynamic stress levels. In [80–82], power modules and capacitors are tested with a sequence of stress levels emulating PV applications. The daily profile of solar irradiance and ambient temperature is used in [82] with a scaling factor to accelerate the test. The purpose of these tests is not to derive a lifetime model because the results are the joint effect of multiple stress levels in a single run. Instead, it is possible to verify the assumed failure mechanisms and the lifetime prediction results. To support such a superimposed stress test, advanced mission profile emulators are necessary. For example, in [83–85], the emulators for motor drives and Modular Multi-level Converters (MMCs) are proposed, respectively.

#### IV. MISSION-PROFILE-BASED RELIABILITY PREDICTION

In [1, 2], the methods for reliability prediction are reviewed. System-level reliability modeling methods, such as Reliability Block Diagram (RBD), Fault Tree Analysis (FTA), and Markov Analysis (MA) are compared in [2]. Much progress has been made towards reliability prediction based on long-term environmental and operational conditions (i.e. mission profile). The methods and modeling procedures demonstrated in [86, 87] represent the state of the art for case studies on PV inverter and MMC, respectively.

A mission-profile-based reliability prediction tool is developed in [88]. This tool is based on six modeling steps starting from system-level mission profile to component-level reliability modeling and system-level reliability modeling. This tool features a modular design with flexible input levels of mission profiles. As shown in Fig. 7, mission profiles along the modeling process are divided into four levels. Level 1 is system level (e.g. drive cycles of a car, ambient temperature, humidity). Level 2 is the power electronic converter level, which can be directly measured or obtained from the system-level mission profile modeling shown as Step 2 in Fig. 7 (e.g. derive the output power, frequency, voltage of a drive from the drive cycles and motor models). Level 3 is down to the components of interest (e.g. the local ambient temperature, voltage and current of a capacitor). Level 4 is for stresses that are the inputs of the lifetime models or failure rate models used for component-level reliability modeling. For dynamic thermal profiles, Rainflow counting [89] is widely used to extract the periodic temperature swings for the packaging-related wear-out of power semiconductor switches. The use of Rainflow counting is discussed in detail in [86, 87]. Depending on the available mission profile information, the tool runs part or all of the six steps. For instance, if all of the required information at Level 3 mission profile is available, then the first three steps can be excluded and the software tool starts directly from Step 4 component mission profile modeling. Fig. 8 shows a capture of the Graphic User Interface (GUI) of the software tool customized for motor drive applications. The user can select the available mission profile levels, configure the motor and drive, and select the component, assembly, or system of interest for reliability analysis.
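To illustrate how Rainflow counting turns a dynamic temperature profile into the swings and mean values needed by a lifetime model, a minimal Python sketch is given below. It follows the widely used three-point counting rule (as described, for example, in ASTM E1049-85) and is not claimed to be the implementation used in [86–88]; the function names are hypothetical.

```python
def find_reversals(series):
    """Reduce a sampled profile to its turning points (first and last samples are kept)."""
    points = []
    for value in series:
        if points and value == points[-1]:
            continue  # ignore plateaus
        if len(points) >= 2 and (points[-1] - points[-2]) * (value - points[-1]) > 0:
            points[-1] = value  # still moving in the same direction
        else:
            points.append(value)
    return points


def rainflow(series):
    """Three-point rainflow counting.

    Returns a list of (range, mean, count) tuples, where count is 1.0 for a
    full cycle and 0.5 for a half cycle.
    """
    cycles = []
    stack = []
    for point in find_reversals(series):
        stack.append(point)
        while len(stack) >= 3:
            x = abs(stack[-1] - stack[-2])  # most recent range
            y = abs(stack[-2] - stack[-3])  # previous range
            if x < y:
                break
            mean = 0.5 * (stack[-2] + stack[-3])
            if len(stack) == 3:
                # Range y contains the starting point: count a half cycle
                cycles.append((y, mean, 0.5))
                stack.pop(0)
            else:
                # Range y is enclosed: count a full cycle and drop its two points
                cycles.append((y, mean, 1.0))
                last = stack.pop()
                stack.pop()
                stack.pop()
                stack.append(last)
    # Whatever remains on the stack is counted as half cycles
    for a, b in zip(stack, stack[1:]):
        cycles.append((abs(b - a), 0.5 * (a + b), 0.5))
    return cycles
```

Applied to a junction temperature profile, each returned tuple corresponds to a temperature swing $\Delta T_{\rm j}$ and a mean temperature $T_{\rm jm}$, which can then be fed into a lifetime model together with an accumulated damage rule.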

The details of mission-profile-based reliability prediction methods and the corresponding software tools can be found in [86–88]. A few additional aspects are discussed below:

#### A. Mission profile data

The mission profile represents all of the relevant stresses that the item of interest withstands during the whole service life, including stresses during manufacturing, burn-in test, transportation, installation, and field operation. Lifetime and reliability prediction often uses limited information, constrained by the available data and the modeling complexity. The limitations come from both the considered stressors and the time span of the analyzed data. Predictions based on data for the whole service life and models for all kinds of failure mechanisms are rarely feasible. In the PV inverter case studies, annual solar irradiance and ambient temperature profiles are used [86], [91]. In addition to the time span, the sampling frequency is also critical, and this is relevant to the failure mechanisms with different timescales. As studied in [92], the predicted lifetime for an offshore wind application varies significantly with the sampling frequency of the wind speed data. Nevertheless, guidelines on the appropriate mission profile resolution are unavailable.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JESTPE.2020.3037161, IEEE Journal of Emerging and Selected Topics in Power Electronics

Fig. 8. The Graphic User Interface (GUI) of the developed software tool customized for motor drive application with the six-step modeling method [88].

A mission profile logger is developed in [90] to collect operational and environmental data from field operations. The prototype is shown in Fig. 9(a). This measures the raw data of voltages and currents of a phase-leg and climatic conditions. The data can be pre-processed, stored locally, and collected by wireless networks. Fig. 9(b) shows an example of the measured climatic data inside and outside of the enclosure of a 60 kW PV inverter.

While the mission profile enables more realistic reliability prediction, it increases the complexity and computational burden, especially considering the dynamic thermal profile modeling [86, 87]. Consequently, it could be more feasible to simplify mission profiles by identifying the representative profiles. Moreover, considering the varieties in thermal time constants and failure mechanisms of different components (e.g. IGBT modules and capacitors), the use of different data resolutions can reduce the computational burden. A simplified mission profile is usually sufficient because products are commonly designed for applications with a wide range of operating conditions. However, for benchmarking or warranty analysis of a specific application, a more detailed mission profile is preferred.

#### B. Simplified electro-thermal modeling

Thermal modeling is a key step for thermal-related wearout failure mechanisms. There are two significant challenges here:

1) The local ambient temperature in Level 3 mission profile shown in Fig. 7 is dependent on the cooling, layout, power loss, and thermal couplings of the converter of interest. Therefore, system-level electro-thermal modeling is needed to obtain the component-level thermal stress. In the case study of [86], the thermal stresses of capacitors and MOSFETs mainly depend on their local ambient temperatures (i.e. the enclosure temperature and the air temperature around the components). Fig. 10 shows that the enclosure temperature is more than 30°C higher than the outside ambient temperature at heavy loads. The maximum rises of component internal temperatures due to self-heating are 5°C and 15°C, respectively, for the capacitor and MOSFETs. This implies that the accuracy of the enclosure temperature modeling is even more critical than that of the temperature rise due to self-heating in this example. In another case study of MMC in [87], there is a maximum of 17°C difference in local ambient temperatures among the 24 sub-modules. The setup and measurements are shown in Fig. 11. This implies that there are likely to be life-limiting sub-modules in the MMC setup. In both [86, 87], converter-level Finite Element Method (FEM) simulations are performed to obtain the system-level thermal models, including mutual coupling among the key heating sources.

(a) Photo of the mission profile logger prototype.

(b) Temperature and humidity data logged from a 60 kW PV inverter located in the solar park in Nordborg, Denmark. The legend indicates the locations of the sensors Inside (In)/Outside (Out) Top (Tp) side and Down (Dn) side of the inverter cabinet.

Fig. 9. An example of the mission profile logger and the obtained field measurement data in [90].



Fig. 10. A two-day thermal stress profile of a PV micro-inverter with natural cooling [86].  $T_{\rm a}$  – ambient temperature outside the PV inverter enclosure;  $T_{\rm e}$  – enclosure temperature;  $T_{\rm jS1}$  to  $T_{\rm jS5}$  are the estimated junction temperature of the MOSFETs of the inverter;  $T_{\rm Cdc}$  – hot-spot temperature of the DC-link capacitor of the inverter.

Moreover, system-level thermal modeling can identify hot-spots in the system. This provides insights into improving reliability by heat re-distribution.

2) Thermal modeling with long-term dynamic profiles is time-consuming. Therefore, in applications where the ambient climatic conditions or loading conditions are highly intermittent and dynamic (e.g. wind power, PV, and e-mobility), it is necessary to consider the impact of thermal capacitance. A frequency-domain thermal modeling method is proposed in [93]. This simplifies the modeling at different timescales by coordinating different thermal time constants and the frequencies of power loss profiles. In [94], a simplification method is proposed for thermal modeling with periodic power loss profiles. An analytical error model is derived for different degrees of simplifications. Because the ambient temperature change is relatively slow, IGBT module junction temperatures are assumed to change simultaneously with the ambient temperature in [93,94]. Nevertheless, the thermal time constants of capacitors or large heatsinks range from a few minutes to hours, which are comparable with ambient temperature dynamics. A modified thermal model is proposed to incorporate the effect in thermal modeling in [95]. The snap-in Aluminum electrolytic capacitor that is used for the case study is shown in Fig. 12(a). The comparative results with the experimental characterizations are shown in Fig. 12(b). With the application of SiC and GaN devices, new challenges come from the nonlinearity of thermal impedance in a wide range of temperature stresses [96, 97], as well as PCB board-level thermal modeling and optimization [98].
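The influence of thermal capacitance on the response to ambient temperature dynamics can be illustrated with a first-order lumped model, $C\,\mathrm{d}T/\mathrm{d}t = P_{\rm loss} - (T - T_{\rm a})/R$. The Python sketch below is a deliberately simplified assumption for illustration only; it is not the frequency-domain method of [93] nor the modified model of [95], and the function name and parameter values are hypothetical.

```python
import math


def simulate_hotspot(t_ambient, p_loss, r_th, c_th, dt, t_init=None):
    """First-order lumped thermal model: C dT/dt = P - (T - Ta) / R.

    t_ambient, p_loss: sequences sampled every dt seconds, held constant within a step.
    r_th in K/W, c_th in J/K. Returns the temperature at each sample instant,
    starting with the initial value.
    """
    tau = r_th * c_th               # thermal time constant in seconds
    decay = math.exp(-dt / tau)     # exact step response over one interval
    temp = t_ambient[0] if t_init is None else t_init
    trace = [temp]
    for ta, p in zip(t_ambient, p_loss):
        steady = ta + p * r_th      # steady-state temperature for this step
        temp = steady + (temp - steady) * decay
        trace.append(temp)
    return trace
```

With a time constant of several minutes or more, a step in the ambient temperature reaches the hot spot only gradually, which is why the assumption of simultaneous change with the ambient temperature is less suitable for components such as capacitors or large heatsinks.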



(a) Prototype of the down-scaled MMC setup and its 24 sub-modules.



(b) The measured ambient temperature profiles of the sub-modules with forced air cooling.  $T_{\rm la1}$  to  $T_{\rm la24}$  are the ambient temperature of the 24 sub-modules, respectively.  $\Delta T_{\rm la}$  is the maximum ambient temperature difference.

Fig. 11. An example of system-level thermal modeling of a Modular Multi-level Converter (MMC) [87].

#### C. Limitations of existing mission-profile-based reliability prediction methods

The limitations of the existing mission-profile-based reliability prediction methods are discussed below:

- 1) Limited failure mechanisms and failure sources are considered. The discussed hardware failure is mainly for the thermal-related wear-out. Other failures due to humidity, vibration, and contamination are not usually quantitatively analyzed. Sudden hardware failures and software failures are yet to be included. For example, single-event-burnout due to cosmic rays is relevant for electronic device applications at high elevations [99].
- 2) There is a lack of humidity aspect modeling. As discussed previously, humidity-related failure mechanisms [21] may become dominant, especially for applications with intermittent loading profiles and frequent shutdowns and start-ups. A transient hygrothermal modeling method is proposed in analogy with electrical domain models in [100]. It is therefore worthwhile investigating the application of such models in power electronic converters.
- 3) The linear accumulated damage model is dominantly used in wear-out failure prediction. To what extent it is a reasonable approximation is still open to question. Various nonlinear damage models are available, as discussed in [75]. However, their relevance and the necessity to use them are not yet known.

(a) Capacitor testing sample and its thermal model parameters.  $R_1$ ,  $R_2$ ,  $C_1$ , and  $C_2$  are the thermal resistances and thermal capacitances, respectively.

(b) Experimental verification of the model considering the impact of ambient temperature dynamics. Fig. 12(a) is the capacitor hot-spot temperature,  $T_{\rm a}$  is the capacitor ambient temperature,  $P_{\rm loss}$  is the power loss in the capacitor.

Fig. 12. The thermal modeling method considering the thermal dynamics of ambient temperature proposed in [95].

- 4) Failure interactions among components and external factors are not considered in the system-level reliability analysis. As shown in [86, 87], the RBD method is applied to calculate the system-level reliability based on the component-level reliability and system redundancy level. In practice, failure causes could be complicated and mutually affected by multiple factors. The assumption of the isolated failure of each component needs to be rigorously justified [101]. In [102], a Markov reliability modeling method is applied for multi-phase DC-DC converters to analyze different failure scenarios.
- 5) Activities in system-level reliability demonstrations are expanding but still limited in the literature. In [103, 104], converter-level reliability testing is performed for motor drives and PV inverters, respectively. The system-level testing outcomes could identify the weakest links and demonstrate that the reliability is no less than the designed target.
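The RBD calculation mentioned in item 4) can be sketched as follows. This minimal Python example assumes statistically independent component failures, which is precisely the assumption that item 4) cautions needs to be justified. The function names and all parameter values are hypothetical and are not taken from [86, 87].

```python
import math


def weibull_reliability(t, eta, beta):
    """Component reliability R(t) = exp(-(t / eta)^beta)."""
    return math.exp(-((t / eta) ** beta))


def series_reliability(reliabilities):
    """Series RBD: the system works only if every block works (independence assumed)."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result


def parallel_reliability(reliabilities):
    """Parallel RBD (full redundancy): the system fails only if all blocks fail."""
    unreliability = 1.0
    for r in reliabilities:
        unreliability *= 1.0 - r
    return 1.0 - unreliability
```

For example, two components with reliabilities of 0.9 and 0.8 at a given time yield a series-system reliability of 0.72, whereas two redundant blocks of 0.9 each yield 0.99. Failure interactions among components would invalidate these simple products.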

#### V. OUTLOOK FOR POWER ELECTRONICS RELIABILITY RESEARCH

Power electronics reliability research has been booming in recent years, both at universities and in companies. Among others, design-margin reduction, predictive maintenance, cost of unreliability, and the increasing use of power electronic converters in critical applications are the main drivers. Lifecycle performance optimization and cost reduction are of interest, as are the time-zero performance and design cost. The recognition of the importance of data also plays a significant role in reliability studies. This section will describe several future research opportunities.

#### A. Physics-of-degradation and condition monitoring

Failure-free operation may become a requirement for power electronic converters as they are increasingly used in reliability-, availability-, or safety-critical applications. If failure is not an option, then the understanding and modeling of the degradation process become important. A degradation curve retains all the information from accelerated degradation testing, as shown in Fig. 3, while a lifetime model is based on a single data point (i.e. EOL) of the degradation curve of each testing sample. Understanding the physics-of-degradation enables identifying key features of the degradation process.

Condition monitoring is likely to become an even more important tool in reliability engineering if failure is not allowed within the service life. Degradation modeling is essential to predictive maintenance. The following two key challenges remain to be addressed:

- 1) Most condition monitoring methods are limited to a single type of component or an individual component [6,7,13], [105]. Nevertheless, at the application level, degradation could occur on multiple components concurrently. Different sources of parameter shifts happen simultaneously. However, the mutual effects cannot be verified based on the assumption that a single component degrades at a time. Moreover, a full converter or an assembly is usually replaced in the presence of component failure. It is therefore essential to monitor the health status of the entire converter or the assembly parts of interest. Converter-level health precursors and converter-level signal measurements are alternative solutions.
- 2) Robust and cost-effective condition monitoring methods are required. Even though a wide range of health precursors and implementations are proposed in the literature, they are rarely adopted in field operation. Complexity, cost, design constraint, accessibility, and effectiveness under field operation environment are of great concern. In [106, 107], challenges in circuit parasitics and data analytics of a condition monitoring unit for IGBT power modules are addressed. A converter-level power device on-state voltage measurement method is developed in [108], which reduces the design cost and provides a converter-level plug-and-play solution without the need to connect to the monitored power devices. Condition monitoring provides an ancillary service, so it is critical to minimize the new risk brought by the added sensors, measurement circuits, and/or software algorithms. In addition, this service may only require intermittent operation. It is needed most during Stage III of the degradation curve shown in Fig. 3. Anomaly detection from similar systems in operation, leveraging special operation states (e.g. start-up transient [109]), and system-level health precursors (e.g. efficiency) are among the most promising methods. In [110], a digital-twin based condition monitoring concept is proposed, representing a future direction of combining physical power electronic circuit models and data-driven approaches. A composite health precursor is proposed in [111] by an optimal weighting of multiple relevant parameters. This method is expected to improve the sensitivity and reduce noise in the remaining useful lifetime prediction based on multiple sources of degradation data. Artificial Intelligence (AI) based condition monitoring methods [112] hold great potential with the continued digitization and increasing amounts of available data in power electronics applications.

#### B. Accumulated damage modeling

Fig. 13 illustrates the validity of the linear accumulated damage model under different types of degradation curves. For the sake of simplicity, Figs. 13(a) and 13(c) show the degradation curves for two stress levels A and B, having only Stage II and Stage III illustrated in Fig. 3, respectively. More general cases could be a combination of the curves shown in Figs. 13(a) and 13(c). Even though the degradation curve at the beginning phase of testing may differ, as shown in [78], this is negligible because it usually covers only a short period of the whole service life. Stress Levels A and B correspond to the referenced time-to-failure  $L_A$  and  $L_B$ , respectively. Three different scenarios of the nonlinear degradation curve under Stress Level B are considered in Fig. 13(c), namely: Scenario 1, Scenario 2, and Scenario 3. Figs. 13(b) and 13(d) show the degradation curves with the hour/cycle/mileage normalized to  $L_A$  and  $L_B$ . As shown in Fig. 13(b), the degradation curves overlap under the two stress levels due to the normalization of the x-axis. Scenario 1 in Fig. 13(c) corresponds to when the two degradation curves of Level A and Level B are the same. Scenario 2 and Scenario 3 represent the cases for two different degradation curves shown in Fig. 13(d). All of the curves reach EOL criteria at 100% with testing under the individual level of stress conditions. At x% of the time-to-failure, the stress level changes from A to B.

Fig. 13. Illustration for accumulated damage based on degradation curves under a sequence of two stress levels for a given failure mechanism. The normalization of the x-axis in (b) and (d) is with respect to the time-to-failure  $L_A$  and  $L_B$ , respectively, for the curves under stress Level A and Level B.

With the linear degradation curves, the superimposed stress of Level A and Level B results in the same time-to-failure as predicted by the linear accumulated damage model, as shown in Fig. 13(b). With the nonlinear degradation curves, the health precursor's change at different intervals in the x-axis is different due to the nonlinearity. For instance, the testing under Level A stress condition at the very beginning causes a negligible change of  $\Delta y$ , compared to a significant change in approaching the EOL  $L_A$ , as shown in Fig. 13(c). Nevertheless, this nonlinearity in the degradation curve is already considered in lifetime models, which does not necessarily imply that a nonlinear accumulated damage model is needed. In Scenario 1 for Stress Level B, its degradation curve is the same as that of Stress Level A in Fig. 13(d). The resultant changes of  $\Delta y$  under Stress Level A and B are the same at x% of operation. The degradation curve ends at 100% under Stress Level B of operation after x%. In this scenario, the linear accumulated damage model is valid. In Scenario 2, the change of  $\Delta y$  at x% of operation under Stress Level A is equal to that until  $x_{B2}\%$  under Stress Level B. The degradation curve from x% of operation under Stress Level B is equivalent to shifting the degradation curve from  $x_{B2}\%$ , resulting in a time-to-failure more than 100%, which is higher than that predicted by the linear accumulated damage model. Similarly, in Scenario 3, it can be derived that the superimposed stress results in a time-to-failure less than 100%.

According to Fig. 13, it is possible to check whether the linear accumulated damage model is valid by analysing the degradation curves from the testing under individual stress conditions. For instance, the degradation curves presented in [77] feature a large portion of linear degradation, as in Fig. 13(a), and a sharp and short period of nonlinear degradation, as in Fig. 13(c). Unlike the case shown in Fig. 13(a), the boundary between the linear degradation and the nonlinear degradation could correspond to different percentage changes of $\Delta y$. This implies a slightly different slope of the degradation curves for Stress Level A and B in Fig. 13(b). According to Fig. 3, the degradation curve consisting of Stage III can be assumed linear within a short time interval. Therefore, as indicated by the tests in [78] and the $6^{th}$ test in [77], the linear accumulated damage model is valid for the cases under more frequent changes of stress levels, which differs from the $4^{th}$ and $5^{th}$ tests in [77]. The perspectives provided in Fig. 13 are based on theoretical analysis and are consistent with superimposed stress testing observations, focusing on the validity of the Palmgren-Miner rule. However, more tests are necessary in the future to verify the assumptions.
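As a minimal numerical sketch of the argument above, the following Python snippet computes the sum of normalized operating times at failure for a sequence of Stress Level A followed by Stress Level B. The power-law curve shape $y = u^{p}$ (with $u$ the time normalized to $L_A$ or $L_B$ and $y = 1$ the EOL criterion), the exponents, and the function name are illustrative assumptions and are not taken from Fig. 13 or from [77], [78]. A sum of 1 corresponds to the prediction of the linear accumulated damage model.

```python
def damage_sum_at_failure(x, p_a, p_b):
    """Sum of normalized operating times at EOL for a sequence A -> B.

    Assumed (hypothetical) degradation curves: y = u**p, where u is the time
    normalized to the time-to-failure of the respective stress level and
    y = 1 is the EOL criterion. x is the fraction of L_A spent under Level A.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be within [0, 1]")
    if p_a <= 0 or p_b <= 0:
        raise ValueError("curve exponents must be positive")
    # Health-precursor change accumulated under Level A before the switch.
    y_at_switch = x ** p_a
    # Normalized position on the Level B curve that shows the same change.
    u_b_equivalent = y_at_switch ** (1.0 / p_b)
    # Fraction consumed under A plus the fraction still remaining under B.
    return x + (1.0 - u_b_equivalent)


# Level A curve exponent fixed at 3; Level B exponent varied (assumed values).
for label, p_b in [("Scenario 1", 3.0), ("Scenario 2", 2.0), ("Scenario 3", 5.0)]:
    total = damage_sum_at_failure(x=0.5, p_a=3.0, p_b=p_b)
    print(f"{label}: normalized time-to-failure = {total:.3f}")
```

Under these assumptions, identical curve shapes give a sum of 1 regardless of the nonlinearity (Scenario 1). A Level B curve that reaches the same $\Delta y$ earlier in its normalized life gives a sum above 1 (Scenario 2), and one that reaches it later gives a sum below 1 (Scenario 3).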

### C. Humidity-induced failure mechanisms, and failure mechanisms of SiC and GaN switches, and magnetic components

The humidity-induced failure mechanisms of power electronic systems have received less attention than electro-thermal and thermo-mechanical induced failures. Nevertheless, they are highly relevant to field applications, as discussed in Section II. Consequently, more research on the modeling of humidity impact, and on micro-climatic design and control for power electronic systems, is suggested.

Even with the increasing number of studies on the failure mechanisms and accelerated testing of SiC and GaN switches [10, 11], the relationship between these component-level investigations and converter-level applications is still largely unknown. Field experience is limited because SiC- and GaN-based industrial systems are still at an early stage of operation. It is therefore expected that the commercial use of SiC devices in automotive applications will provide more field experience. The decreased short-circuit withstand time of SiC and GaN devices demands protection schemes with sub-$\mu$s or 1-2 $\mu$s decision-making and action.

The understanding of failure mechanisms of magnetic components in power electronic applications is still limited. The early studies presented in [51,52] show the need to further investigate the failure mechanisms and degradation models, especially for planar magnetic components in high-density converters and medium-voltage medium-frequency isolation transformers.





(a) Schematic of the thermoreflectance measurement configuration, including the Charge Coupled Device (CCD) detector, illumination source, microscope objective, and the temperature-controlled stage.



(b) Measured drain current and voltage, calculated power dissipation, and measured and simulated temperature of the CGH40025F Cree GaN HEMT operating in a class-A amplifier. The temperature is measured with a temporal resolution of 50 ns.

Fig. 14. Thermoreflectance measurement method proposed in [113] and an example of the measured temperature profile.

### D. High resolution and fast temperature measurement

Thermal stress is related to various failure mechanisms. Temperature-Sensitive-Electrical-Parameters (TSEP) are widely studied for temperature estimation, but they are limited to a temporal resolution of 100 $\mu$s [114]. In [113], a non-invasive thermal measurement technique called thermoreflectance thermography is proposed with nanosecond temporal and sub-micron spatial resolution. This technique is promising for temperature measurement during short circuits and in MHz-level high-frequency converters. Fig. 14(a) shows a schematic diagram of the measurement setup. Fig. 14(b) shows the measured temperature profiles with a temporal resolution of 50 ns. However, more studies are necessary to raise its maturity level and extend it to other applications.
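To put these resolutions in perspective, consider a converter switching at $f_{sw} = 1$ MHz. This value is an assumed example of MHz-level operation and is not a figure from [113] or [114].

$$T_{sw} = \frac{1}{f_{sw}} = 1\ \mu\text{s}, \qquad \frac{100\ \mu\text{s}}{T_{sw}} = 100, \qquad \frac{T_{sw}}{50\ \text{ns}} = 20$$

Under this assumption, a temporal resolution of 100 $\mu$s spans 100 switching periods, whereas a resolution of 50 ns provides 20 temperature samples within each switching period.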

### E. Reduction of reliability testing time

It is necessary to reduce the required time of accelerated degradation testing and reliability demonstration testing. Two promising directions are briefly discussed below:

- 1) Virtual-physical hybrid testing method. A digital platform can be built to simulate the degradation of power electronic components under accelerated or normal stress levels. Physical testing is used to calibrate the models used in the digital platform. Since a significant part of the work is done by simulation, it is possible to reduce the required time without compromising the relevance of the results.
- 2) Accelerated testing combined with early wear-out prediction. Prediction of degradation curves from early test results has been applied to film capacitors [115] and batteries [116, 117]. In a recent case study [117], it has been demonstrated that the time required to identify high-cycle-life charging protocols among 224 battery candidates is reduced to 3.2% of the time expected without early prediction. It is therefore worth exploring the concept for the degradation testing of power electronic components and converters.
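The early-prediction idea in item 2) can be sketched as follows: fit a degradation model to the first part of a test and extrapolate it to the EOL criterion. The power-law model $y = a t^{b}$, the synthetic data, the 5% EOL threshold, and all function names are assumptions made for illustration. This is not the method of [115]-[117].

```python
import math


def fit_power_law(times, changes):
    """Least-squares fit of log(y) = log(a) + b*log(t); returns (a, b)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(y) for y in changes]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_mean - b * x_mean)
    return a, b


def predict_time_to_eol(a, b, eol_change):
    """Time at which the fitted curve y = a*t**b reaches the EOL criterion."""
    return (eol_change / a) ** (1.0 / b)


# Synthetic, noise-free early-test data (hypothetical): parameter change in %
# recorded during the first 300 h of an accelerated degradation test.
true_a, true_b = 1e-4, 1.5
early_hours = [50, 100, 150, 200, 250, 300]
early_change = [true_a * t ** true_b for t in early_hours]

a_hat, b_hat = fit_power_law(early_hours, early_change)
t_eol = predict_time_to_eol(a_hat, b_hat, eol_change=5.0)
print(f"fitted exponent b = {b_hat:.3f}, predicted time to EOL = {t_eol:.0f} h")
```

In this sketch the test could be stopped after 300 h instead of running to EOL. In practice, the extrapolation is only meaningful if the chosen model remains valid up to EOL, which has to be checked against complete test runs.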

### F. Reliability modeling of power-electronics-based power systems

The impact of power electronic converters on the reliability of power systems is still an open question as their penetration level increases. Hundreds or thousands of power electronic converters may be used in a local electrical network. Consequently, their reliability performance plays a role in the availability and security of the electric network. The interconnections, interactions, and control dynamics among different converters all add extra layers of complexity. This brings new challenges in scaling up the mission-profile-based reliability prediction methods discussed in Section IV. FMEA is a useful tool to prioritise the available resources to deal with the top-ranked failure causes in a system. Furthermore, new multi-time-scale modeling approaches are needed to simplify the modeling procedures while maintaining reasonable prediction accuracy. Real-time simulation of large-scale systems could also become feasible for reliability studies, extending its scope beyond the electromagnetic domain.

#### VI. CONCLUSIONS

This paper reviews the latest research in power electronics reliability, covering field experience, failure mechanisms, component-level end-of-life criteria, testing methods, and system-level mission-profile-based reliability prediction methods. The discussion of failure mechanisms has focused on the short circuits of SiC MOSFETs and GaN devices, and on magnetic components. Perspectives on the degradation curve, lifetime definition, lifetime model applicability boundary, and linear accumulated damage model validity are provided. In addition, the limitations of existing lifetime models and mission-profile-based reliability prediction methods are discussed. Finally, future research demands and opportunities are identified, which are: a) physics-of-degradation and condition monitoring; b) accumulated damage modeling; c) humidity-induced failure mechanisms, and failure mechanisms of SiC and GaN switches, and magnetic components in power electronic applications; d) high resolution and fast temperature measurement; e) reduction of reliability testing time; and f) reliability modeling of power-electronics-based power systems. For further background information on these aspects and other relevant research topics, we refer the reader to [1-13].

### REFERENCES

- [1] H. Wang, M. Liserre, and F. Blaabjerg, "Toward reliable power electronics: Challenges, design tools, and opportunities," *IEEE Ind. Electron. Mag.*, vol. 7, no. 2, pp. 17–26, Jun. 2013.
- [2] H. Wang, M. Liserre, F. Blaabjerg, P. de Place Rimmen, J. B. Jacobsen, T. Kvisgaard, and J. Landkildehus, "Transitioning to physics-of-failure as a reliability driver in power electronics," *IEEE J. Emerg. Sel. Topics Power Electron.*, vol. 2, no. 1, pp. 97–114, Mar. 2014.
- [3] H. S.-H. Chung, H. Wang, F. Blaabjerg, and M. Pecht (editors), *Reliability of power electronic converter systems*. Institution of Engineering and Technology, 2015.
- [4] C. Durand, M. Klingler, D. Coutellier, and H. Naceur, "Power cycling reliability of power module: A survey," *IEEE Trans. Device Mater. Rel.*, vol. 16, no. 1, pp. 80–97, Jan. 2016.
- [5] N. Baker, M. Liserre, L. Dupont, and Y. Avenas, "Improved reliability of power modules: A review of online junction temperature measurement methods," *IEEE Ind. Electron. Mag.*, vol. 8, no. 3, pp. 17–27, Sep. 2014.
- [6] Y. Avenas, L. Dupont, N. Baker, H. Zara, and F. Barruel, "Condition monitoring: A decade of proposed techniques," *IEEE Ind. Electron. Mag.*, vol. 9, no. 4, pp. 22–36, Dec. 2015.
- [7] H. Oh, B. Han, P. McCluskey, C. Han, and B. D. Youn, "Physics-of-failure, condition monitoring, and prognostics of insulated gate bipolar transistor modules: A review," *IEEE Trans. Power Electron.*, vol. 30, no. 5, pp. 2413–2426, May 2015.
- [8] R. Khazaka, L. Mendizabal, D. Henry, and R. Hanna, "Survey of high-temperature reliability of power electronics packaging components," *IEEE Trans. Power Electron.*, vol. 30, no. 5, pp. 2456–2464, May 2015.
- [9] H. Lee, V. Smet, and R. Tummala, "A review of SiC power module packaging technologies: Challenges, advances, and emerging issues," *IEEE J. Emerg. Sel. Topics Power Electron.*, Mar. 2020.
- [10] Z. Ni, X. Lyu, O. P. Yadav, B. N. Singh, S. Zheng, and D. Cao, "Overview of real-time lifetime prediction and extension for SiC power converters," *IEEE Trans. Power Electron.*, vol. 35, no. 8, pp. 7765–7794, Aug. 2020.
- [11] M. Meneghini, I. Rossetto, C. De Santi, F. Rampazzo, A. Tajalli, A. Barbato, M. Ruzzarin, M. Borga, E. Canato, E. Zanoni *et al.*, "Reliability and failure analysis in power GaN-HEMTs: An overview," in *Proc. IEEE Int. Rel. Physics Symp. (IRPS)*, 2017, pp. 3B–2.1–3B–2.8.
- [12] H. Wang and F. Blaabjerg, "Reliability of capacitors for DC-link applications in power electronic converters-an overview," *IEEE Trans. Ind. Appl.*, vol. 50, no. 5, pp. 3569–3578, Sep./Oct. 2014.
- [13] Z. Zhao, P. Davari, W. Lu, H. Wang, and F. Blaabjerg, "An overview of condition monitoring techniques for capacitors in DC-link applications," *IEEE Trans. Power Electron., early access*, doi: 10.1109/TPEL.2020.3023469.
- [14] M. R. Lyu et al., Handbook of software reliability engineering, 1996.
- [15] L. Podofillini, "Human reliability analysis," *Handbook of Safety Principles*, 2017.
- [16] M. Ciappa, "Selected failure mechanisms of modern power modules," *Microelectron. Rel.*, vol. 42, no. 4-5, pp. 653–667, Apr. 2002.
- [17] P. Hacke, S. Lokanath, P. Williams, A. Vasan, P. Sochor, G. Tamizh-Mani, H. Shinohara, and S. Kurtz, "A status review of photovoltaic power conversion equipment reliability, safety, and quality assurance protocols," *Renewable Sustainable Energy Rev.*, vol. 82, pp. 1097–1112, Feb. 2018.
- [18] A. Golnas, "PV system reliability: An operator's perspective," *IEEE J. Photovolt.*, vol. 3, no. 1, pp. 416–421, Jan. 2013.
- [19] K. Fischer, K. Pelka, A. Bartschat, B. Tegtmeier, D. Coronado, C. Broer, and J. Wenske, "Reliability of power converters in wind turbines: Exploratory analysis of failure and operating data from a worldwide turbine fleet," *IEEE Trans. Power Electron.*, vol. 34, no. 7, pp. 6332–6344, Jul. 2019.
- [20] K. Fischer, T. Stalin, H. Ramberg, J. Wenske, G. Wetter, R. Karlsson, and T. Thiringer, "Field-experience based root-cause analysis of power-converter failure in wind turbines," *IEEE Trans. Power Electron.*, vol. 30, no. 5, pp. 2481–2492, May 2015.
- [21] P. Drexhage, "Effect of humidity and condensation on power electronics systems," Semikron Application Note AN 16-001, 2016.

- [22] H. Wang, D. A. Nielsen, and F. Blaabjerg, "Degradation testing and failure analysis of DC film capacitors under high humidity conditions," *Microelectron. Rel.*, vol. 55, no. 9-10, pp. 2007–2011, Aug.-Sep. 2015.
- [23] T. Aichinger and M. Schmidt, "Gate-oxide reliability and failure-rate reduction of industrial SiC MOSFETs," in *Proc. IEEE International Reliability Physics Symposium*, 2020, pp. 1–6.
- [24] R. Singh, "Reliability and performance limitations in SiC power devices," *Microelectron. Rel.*, vol. 46, no. 5-6, pp. 713–730, May 2006.
- [25] C. Herold, J. Sun, P. Seidel, L. Tinschert, and J. Lutz, "Power cycling methods for SiC MOSFETs," in *Proc. Int. Symp. Power Semicond. Devices IC's*, 2017, pp. 367–370.
- [26] B. Hu, J. O. Gonzalez, L. Ran, H. Ren, Z. Zeng, W. Lai, B. Gao, O. Alatise, H. Lu, C. Bailey *et al.*, "Failure and reliability analysis of a SiC power module based on stress comparison to a Si device," *IEEE Trans. Device Mater. Rel.*, vol. 17, no. 4, pp. 727–737, Oct. 2017.
- [27] T.-T. Nguyen, A. Ahmed, T. Thang, and J.-H. Park, "Gate oxide reliability issues of SiC MOSFETs under short-circuit operation," *IEEE Trans. Power Electron.*, vol. 30, no. 5, pp. 2445–2455, May 2015.
- [28] G. Romano, A. Fayyaz, M. Riccio, L. Maresca, G. Breglio, A. Castellazzi, and A. Irace, "A comprehensive study of short-circuit ruggedness of silicon carbide power mosfets," *IEEE J. Emerg. Sel. Topics Power Electron.*, vol. 4, no. 3, pp. 978–987, May 2016.
- [29] P. D. Reigosa, F. Iannuzzo, H. Luo, and F. Blaabjerg, "A short-circuit safe operation area identification criterion for SiC MOSFET power modules," *IEEE Trans. Ind. Appl.*, vol. 53, no. 3, pp. 2880–2887, May/Jun. 2016.
- [30] E.-P. Eni, S. Bęczkowski, S. Munk-Nielsen, T. Kerekes, R. Teodorescu, R. R. Juluri, B. Julsgaard, E. VanBrunt, B. Hull, S. Sabri *et al.*, "Short-circuit degradation of 10-kV 10-A SiC MOSFET," *IEEE Trans. Power Electron.*, vol. 32, no. 12, pp. 9342–9354, Dec. 2017.
- [31] Z. Wang, X. Shi, L. M. Tolbert, F. Wang, Z. Liang, D. Costinett, and B. J. Blalock, "Temperature-dependent short-circuit capability of silicon carbide power MOSFETs," *IEEE Trans. Power Electron.*, vol. 31, no. 2, pp. 1555–1566, Feb. 2016.
- [32] H. Du, P. D. Reigosa, L. Ceccarelli, and F. Iannuzzo, "Impact of repetitive short-circuit tests on the normal operation of SiC MOSFETs considering case temperature influence," *IEEE J. Emerg. Sel. Topics Power Electron.*, vol. 8, no. 1, pp. 195–205, Sep. 2020.
- [33] P. D. Reigosa, H. Luo, and F. Iannuzzo, "Implications of ageing through power cycling on the short-circuit robustness of 1.2-kV SiC MOSFETS," *IEEE Trans. Power Electron.*, vol. 34, no. 11, pp. 11182–11190, Sep. 2019.
- [34] S. Ji, M. Laitinen, X. Huang, J. Sun, W. Giewont, F. Wang, and L. M. Tolbert, "Short-circuit characterization and protection of 10-kV SiC MOSFET," *IEEE Trans. Power Electron.*, vol. 34, no. 2, pp. 1755–1764, Feb. 2019.
- [35] D.-P. Sadik, J. Colmenares, G. Tolstoy, D. Peftitsis, M. Bakowski, J. Rabkowski, and H.-P. Nee, "Short-circuit protection circuits for silicon-carbide power transistors," *IEEE Trans. Ind. Electron.*, vol. 63, no. 4, pp. 1995–2004, Apr. 2016.
- [36] D. Rothmund, D. Bortis, and J. W. Kolar, "Highly compact isolated gate driver with ultrafast overcurrent protection for 10 kV SiC MOSFETs," *CPSS Trans. Power Electron. Appl.*, vol. 3, no. 4, pp. 278–291, Dec. 2018.
- [37] K. Sun, J. Wang, R. Burgos, and D. Boroyevich, "Design, analysis, and discussion of short circuit and overload gate-driver dual-protection scheme for 1.2-kV, 400-A SiC MOSFET modules," *IEEE Trans. Power Electron.*, vol. 35, no. 3, pp. 3054–3068, Mar. 2020.
- [38] P. Hofstetter and M.-M. Bakran, "The two-dimensional short-circuit detection protection for SiC MOSFETs in urban rail transit application," *IEEE Trans. Power Electron.*, vol. 35, no. 6, pp. 5692–5701, Jun. 2020.
- [39] B. Zhang and S. Wang, "A survey of EMI research in power electronics systems with wide bandgap semiconductor devices," *IEEE J. Emerg. Sel. Topics Power Electron.*, vol. 8, no. 1, pp. 626–643, Mar. 2020.
- [40] M. Ťapajna and C. Koller, Reliability Issues in GaN Electronic Devices, Chapter 6 of Nitride Semiconductor Technology (eds F. Roccaforte and M. Leszczynski). Wiley, 2020.
- [41] M. González-Sentís, P. Tounsi, A. Bensoussan, and A. Dufour, "Degradation indicators of power-GaN-HEMT under switching power-cycling," *Microelectron. Rel.*, vol. 100, p. 113412, Sep. 2019.
- [42] S. Song, S. Munk-Nielsen, C. Uhrenfeldt, and I. Trintis, "Failure mechanism analysis of a discrete 650v enhancement mode GaN-on-Si power device with reverse conduction accelerated power cycling test," in *Proc. IEEE Appl. Power Electron. Conf. Expo.*, 2017, pp. 756–760.
- [43] S. Song, S. Munk-Nielsen, and C. Uhrenfeldt, "Failure mechanism analysis of off-state drain-to-source leakage current failure of a commercial 650 v discrete GaN-on-Si HEMT power device by accelerated power cycling test," *Microelectron. Rel.*, vol. 76, pp. 539–543, Sep. 2017.

- [44] S. Song, S. Munk-Nielsen, and C. Uhrenfeldt, "How can a cutting-edge gallium nitride high-electron-mobility transistor encounter catastrophic failure within the acceptable temperature range?" *IEEE Trans. Power Electron.*, vol. 35, no. 7, pp. 6711–6718, Jun. 2020.
- [45] J. Franke, G. Zeng, T. Winkler, and J. Lutz, "Power cycling reliability results of GaN HEMT devices," in *Proc. IEEE Int. Symp. Power Semicond. Devices IC's*, 2018, pp. 467–470.
- [46] C. Abbate, G. Busatto, A. Sanseverino, D. Tedesco, and F. Velardi, "Experimental study of the instabilities observed in 650 v enhancement mode GaN HEMT during short circuit," *Microelectron. Rel.*, vol. 76, pp. 314–320, Sep. 2017.
- [47] M. Fernández, X. Perpina, J. Roig-Guitart, M. Vellvehi, F. Bauwens, M. Tack, and X. Jorda, "Short-circuit study in medium-voltage GaN cascodes, p-GaN HEMTs, and GaN MISHEMTs," *IEEE Trans. Ind. Electron.*, vol. 64, no. 11, pp. 9012–9022, Nov. 2017.
- [48] C. Abbate, G. Busatto, A. Sanseverino, D. Tedesco, and F. Velardi, "Failure analysis of 650 v enhancement mode GaN HEMT after short circuit tests," *Microelectron. Rel.*, vol. 88, pp. 677–683, Sep. 2018.
- [49] C. Abbate, G. Busatto, A. Sanseverino, D. Tedesco, and F. Velardi, "Failure mechanisms of enhancement mode GaN power HEMTs operated in short circuit," *Microelectron. Rel.*, vol. 100, p. 113454, Sep. 2019.
- [50] X. Lyu, H. Li, Y. Abdullah, K. Wang, B. Hu, Z. Yang, J. Liu, J. Wang, L. Liu, and S. Bala, "A reliable ultra-fast short circuit protection method for E-mode GaN HEMT," *IEEE Trans. Power Electron.*, vol. 35, no. 9, pp. 8926–8933, Sep. 2020.
- [51] Z. Shen, Q. Wang, Y. Shen, and H. Wang, "First observations in degradation testing of planar magnetics," in *Proc. IEEE Appl. Power Electron. Conf. Exposit.*, 2019, pp. 1436–1443.
- [52] Z. Shen, Q. Wang, and H. Wang, "Degradation analysis of planar magnetics," in *Proc. IEEE Appl. Power Electron. Conf. Exposit.*, 2020, pp. 2687–2693.
- [53] ECPE Guideline AQG 324, Qualification of power modules for use in power electronics converter units in motor vehicles, 2019.
- [54] P. O'Connor and A. Kleyner, *Practical reliability engineering*. John Wiley & Sons, ISBN: 978-0-470-97982-2, 2012.
- [55] H. Wang, P. D. Reigosa, and F. Blaabjerg, "A humidity-dependent lifetime derating factor for DC film capacitors," in *Proc. IEEE Energy Conversion Congr. Exposit.*, 2015, pp. 3064–3068.
- [56] R. Bayerer, T. Herrmann, T. Licht, J. Lutz, and M. Feller, "Model for power cycling lifetime of IGBT modules-various factors influencing lifetime," in *Proc. CIPS*, 2008, pp. 1–6.
- [57] U.-M. Choi, F. Blaabjerg, and S. Jørgensen, "Study on effect of junction temperature swing duration on lifetime of transfer molded power IGBT modules," *IEEE Trans. Power Electron.*, vol. 32, no. 8, pp. 6434–6443, Aug. 2017.
- [58] R. Schmidt and U. Scheuermann, "Separating failure modes in power cycling tests," in *Proc. IEEE Int. CIPS*, 2012, pp. 1–6.
- [59] M. Junghaenel, R. Schmidt, J. Strobel, and U. Scheuermann, "Investigation on isolated failure mechanisms in active power cycle testing," in *Proc. PCIM Eur.*, 2015, pp. 1–8.
- [60] Nippon Chemi-Con, "Judicious use of Aluminum electrolytic capacitor," technical note, CAT. No. E1001-L, 2011.
- [61] U. Scheuermann and R. Schmidt, "A new lifetime model for advanced power modules with sintered chips and optimized al wire bonds," in *Proc. PCIM*, 2013, pp. 810–813.
- [62] U. Scheuermann and M. Junghaenel, "Limitation of power module lifetime derived from active power cycling tests," in *Proc. Int. Conf. Integr. Power Electron. Syst.*, 2018, pp. 1–10.
- [63] U. Scheuermann and U. Hecht, "Power cycling lifetime of advanced power modules for different temperature swings," *Proc. PCIM Europe*, vol. 5964, May 2002.
- [64] D.-P. Sadik, J.-K. Lim, F. Giezendanner, P. Ranstad, and H.-P. Nee, "Humidity testing of SiC power mosfets-an update," in *Proc. EPE*, 2017.
- [65] S. Kremp and O. Schilling, "Humidity robustness for high voltage power modules: Limiting mechanisms and improvement of lifetime," *Microelectron. Rel.*, vol. 88, pp. 447–452, Sep. 2018.
- [66] S. Hartmann and E. Özkol, "Bond wire life time model based on temperature dependent yield strength," in *Proc. PCIM Eur.*, 2012.
- [67] G. Zeng, R. Alvarez, C. Kunzel, and J. Lutz, "Power cycling results of high power IGBT modules close to 50 Hz heating process," in *Proc. EPE*, 2019, pp. 1–10.

- [68] U. Scheuermann and R. Schmidt, "Impact of load pulse duration on power cycling lifetime of Al wire bonds," *Microelectron. Rel.*, vol. 53, no. 9-11, pp. 1687–1691, Sep. 2013.
- [69] M. Junghaenel and U. Scheuermann, "Impact of load pulse duration on power cycling lifetime of chip interconnection solder joints," *Microelectron. Rel.*, vol. 76, pp. 480–484, Sep. 2017.
- [70] W. Lai, M. Chen, L. Ran, O. Alatise, S. Xu, and P. Mawby, "Low ΔT<sub>j</sub> stress cycle effect in IGBT power module die-attach lifetime modeling," *IEEE Trans. Power Electron.*, vol. 31, no. 9, pp. 6575–6585, Sep. 2016.
- [71] W. Lai, M. Chen, L. Ran, S. Xu, N. Jiang, X. Wang, O. Alatise, and P. Mawby, "Experimental investigation on the effects of narrow junction temperature cycles on die-attach solder layer in an IGBT module," *IEEE Trans. Power Electron.*, vol. 32, no. 2, pp. 1431–1441, Feb. 2017.
- [72] "IEC 60749-34: 2011 semiconductor devices-mechanical and climatic test methods-part 34 - power cycling," 2011.
- [73] U. Scheuermann and S. Schuler, "Power cycling results for different control strategies," *Microelectron. Rel.*, vol. 50, no. 9-11, pp. 1203–1209, Sep. 2010.
- [74] M. A. Miner, "Cumulative damage in fatigue," J. Appl. Mech., vol. 12, pp. 159–164, 1945.
- [75] A. Fatemi and L. Yang, "Cumulative fatigue damage and life prediction theories: a survey of the state of the art for homogeneous materials," *Int. J. fatigue*, vol. 20, no. 1, pp. 9–34, Jan. 1998.
- [76] P. Rajaguru, H. Lu, and C. Bailey, "Application of nonlinear fatigue damage models in power electronic module wirebond structure under various amplitude loadings," *Advances in Manufacturing*, vol. 2, no. 3, pp. 239–250, Mar. 2014.
- [77] G. Zeng, C. Herold, T. Methfessel, M. Schäfer, O. Schilling, and J. Lutz, "Experimental investigation of linear cumulative damage theory with power cycling test," *IEEE Trans. Power Electron.*, vol. 34, no. 5, pp. 4722–4728, May 2019.
- [78] U.-M. Choi, K. Ma, and F. Blaabjerg, "Validation of lifetime prediction of IGBT modules based on linear damage accumulation by means of superimposed power cycling tests," *IEEE Trans. Ind. Electron.*, vol. 65, no. 4, pp. 3520–3529, Apr. 2018.
- [79] H. Wang, P. Davari, H. Wang, D. Kumar, F. Zare, and F. Blaabjerg, "Lifetime estimation of DC-link capacitors in adjustable speed drives under grid voltage unbalances," *IEEE Trans. Power Electron.*, vol. 34, no. 5, pp. 4064–4078, May 2019.
- [80] M. Dbeiss and Y. Avenas, "Power semiconductor ageing test bench dedicated to photovoltaic applications," *IEEE Trans. Ind. Appl.*, vol. 55, no. 3, pp. 3003–3010, May/Jun. 2019.
- [81] M. Dbeiss, Y. Avenas, H. Zara, L. Dupont, and F. Al Shakarchi, "A method for accelerated aging tests of power modules for photovoltaic inverters considering the inverter mission profiles," *IEEE Trans. Power Electron.*, vol. 34, no. 12, pp. 12226–12234, Dec. 2019.
- [82] A. Sangwongwanich, Y. Shen, A. Chub, E. Liivik, D. Vinnikov, H. Wang, and F. Blaabjerg, "Mission profile-based accelerated testing of DC-link capacitors in photovoltaic inverters," in *Proc. IEEE Appl. Power Electron. Conf. Exposit.*, 2019, pp. 2833–2840.
- [83] I. Vernica, F. Blaabjerg, and K. Ma, "Mission profile emulator for the power electronics systems of motor drive applications," in *Proc. ECCE Europe*, 2017, pp. 1–10.
- [84] K. Ma and Y. Song, "Power-electronic-based electric machine emulator using direct impedance regulation," *IEEE Trans. Power Electron.*, vol. 35, no. 10, pp. 10673–10680, Oct. 2020.
- [85] Z. Wang, H. Wang, Y. Zhang, and F. Blaabjerg, "A viable mission profile emulator for power modules in modular multilevel converters," *IEEE Trans. Power Electron.*, vol. 34, no. 12, pp. 11580–11593, Dec. 2019.
- [86] Y. Shen, A. Chub, H. Wang, D. Vinnikov, E. Liivik, and F. Blaabjerg, "Wear-out failure analysis of an impedance-source pv microinverter based on system-level electrothermal modeling," *IEEE Trans. Ind. Electron.*, vol. 66, no. 5, pp. 3914–3927, May 2019.
- [87] Y. Zhang, H. Wang, Z. Wang, F. Blaabjerg, and M. Saeedifard, "Mission profile-based system-level reliability prediction method for modular multilevel converters," *IEEE Trans. Power Electron.*, Jul. 2020.
- [88] I. Vernica, H. Wang, and F. Blaabjerg, "Design for reliability and robustness tool platform for power electronic systems-study case on motor drive applications," in *Proc. IEEE Appl. Power Electron. Conf. Exposit.*, 2018, pp. 1799–1806.
- [89] ASTM E1049-85, "Standard practices for cycle counting in fatigue analysis," ASTM International, West Conshohocken, PA, 2017.
- [90] S. K. Chaudhary, P. Ghimire, F. Blaabjerg, P. B. Thøgersen, and P. de Place Rimmen, "Development of field data logger for recording mission profile of power converters," in *Proc. Eur. Conf. Power Electron. Appl.*, 2015, pp. 1–10.

- [91] Y. Yang, H. Wang, F. Blaabjerg, and K. Ma, "Mission profile based multi-disciplinary analysis of power modules in single-phase transformerless photovoltaic inverters," in *Proc. ECCE Europe*, 2013, pp. 1–10.
- [92] Y. Zhang, H. Wang, Z. Wang, Y. Yang, and F. Blaabjerg, "The impact of mission profile models on the predicted lifetime of IGBT modules in the modular multilevel converter," in *Proc. Annu. Conf. IEEE Ind. Electron. Soc.*, 2017, pp. 7980–7985.
- [93] K. Ma, N. He, M. Liserre, and F. Blaabjerg, "Frequency-domain thermal modeling and characterization of power semiconductor devices," *IEEE Trans. Power Electron.*, vol. 31, no. 10, pp. 7183–7193, Oct. 2016.
- [94] Y. Zhang, H. Wang, Z. Wang, Y. Yang, and F. Blaabjerg, "Simplified thermal modeling for IGBT modules with periodic power loss profiles in modular multilevel converters," *IEEE Trans. Ind. Electron.*, vol. 66, no. 3, pp. 2323–2332, Mar. 2019.
- [95] H. Wang, R. Zhu, H. Wang, M. Liserre, and F. Blaabjerg, "A thermal modeling method considering ambient temperature dynamics," *IEEE Trans. Power Electron.*, vol. 35, no. 1, pp. 6–9, Jan. 2020.
- [96] G. Mandrusiak, X. She, A. M. Waddell, and S. Acharya, "On the transient thermal characteristics of silicon carbide power electronics modules," *IEEE Trans. Power Electron.*, vol. 33, no. 11, pp. 9783–9789, Nov. 2018.
- [97] A. Tsibizov, I. Kovačević-Badstübner, B. Kakarla, and U. Grossner, "Accurate temperature estimation of SiC power MOSFETs under extreme operating conditions," *IEEE Trans. Power Electron.*, vol. 35, no. 2, pp. 1855–1865, Feb. 2020.
- [98] Y. Shen, H. Wang, F. Blaabjerg, H. Zhao, and T. Long, "Thermal modeling and design optimization of PCB vias and pads," *IEEE Trans. Power Electron.*, vol. 35, no. 1, pp. 882–900, Jan. 2020.
- [99] U. Schilling, "Cosmic ray failures in power electronics," in *Semikron Application Note AN 17-003*, 2017.
- [100] R. Bayerer, M. Lassmann, and S. Kremp, "Transient hygrothermal-response of power modules in inverters-the basis for mission profiling under climate and power loading," *IEEE Trans. Power Electron.*, vol. 31, no. 1, pp. 613–620, Jan. 2016.
- [101] S. Peyghami, H. Wang, P. Davari, and F. Blaabjerg, "Mission-profile-based system-level reliability analysis in DC microgrids," *IEEE Trans. Ind. Appl.*, vol. 55, no. 5, pp. 5055–5067, Sep./Oct. 2019.
- [102] S. V. Dhople, A. Davoudi, A. D. Dominguez-Garcia, and P. L. Chapman, "A unified approach to reliability assessment of multiphase DC-DC converters in photovoltaic energy conversion systems," *IEEE Trans. Power Electron.*, vol. 27, no. 2, pp. 739–751, Feb. 2012.
- [103] J. Pippola, I. Vaalasranta, T. Marttila, J. Kiilunen, and L. Frisk, "Product level accelerated reliability testing of motor drives with input power interruptions," *IEEE Trans. Power Electron.*, vol. 30, no. 5, pp. 2614–2622, May 2015.
- [104] J. Flicker, G. Tamizhmani, M. K. Moorthy, R. Thiagarajan, and R. Ayyanar, "Accelerated testing of module-level power electronics for long-term reliability," *IEEE J. Photovolt.*, vol. 7, no. 1, pp. 259–267, Jan. 2017.
- [105] T. Krone, L. Dang Hung, M. Jung, and A. Mertens, "Advanced condition monitoring system based on on-line semiconductor loss measurements," in *Proc. IEEE Energy Convers. Congr. Expo.*, 2016, pp. 1–8.
- [106] B. Rannestad, A. E. Maarbjerg, K. Frederiksen, S. Munk-Nielsen, and K. Gadgaard, "Converter monitoring unit for retrofit of wind power converters," *IEEE Trans. Power Electron.*, vol. 33, no. 5, pp. 4342–4351, May 2018.
- [107] B. Rannestad, K. Fischer, P. Nielsen, K. Gadgaard, and S. Munk-Nielsen, "Virtual temperature detection of semiconductors in a megawatt field converter," *IEEE Trans. Ind. Electron.*, vol. 67, no. 2, pp. 1305–1315, Feb. 2020.
- [108] H. Wang and Y. Peng, "Non-invasive front-end for power electronic monitoring," *filed patent*, PA 2020 70235, Apr. 2020.
- [109] Z. Wang, Y. Zhang, H. Wang, and F. Blaabjerg, "Capacitor condition monitoring based on the DC-side start-up of modular multilevel converters," *IEEE Trans. Power Electron.*, vol. 35, no. 6, pp. 5589–5593, Jun. 2020.
- [110] Y. Peng, S. Zhao, and H. Wang, "A digital twin based estimation method for health indicators of DC-DC converters," *IEEE Trans. Power Electron.*, vol. 36, no. 2, pp. 2105–2118, Feb. 2021.
- [111] S. Zhao, S. Chen, F. Yang, E. Ugur, B. Akin, and H. Wang, "A composite failure precursor for condition monitoring and remaining useful life prediction of discrete power devices," *IEEE Trans. Ind. Informatics.*

- [112] S. Zhao, F. Blaabjerg, and H. Wang, "An overview of artificial intelligence applications for power electronics," *IEEE Trans. Power Electron., early access*, doi: 10.1109/TPEL.2020.3024914.
- [113] C. Matei, J. Urbonas, H. Votsi, D. Kendig, and P. Aaen, "Dynamic temperature measurements of a GaN DC/DC boost converter at MHz frequencies," *IEEE Trans. Power Electron.*, vol. 35, no. 8, pp. 8303–8310, Aug. 2020.
- [114] Y. Avenas, L. Dupont, and Z. Khatir, "Temperature measurement of power semiconductor devices by thermo-sensitive electrical parameters-a review," *IEEE Trans. Power Electron.*, vol. 27, no. 6, pp. 3081–3092, Jun. 2012.
- [115] S. Zhao, S. Chen, and H. Wang, "Degradation modeling for reliability estimation of DC film capacitors subject to humidity acceleration," *Microelectron. Rel.*, vol. 100, p. 113401, Sep. 2019.
- [116] K. A. Severson, P. M. Attia, N. Jin, N. Perkins, B. Jiang, Z. Yang, M. H. Chen, M. Aykol, P. K. Herring, D. Fraggedakis *et al.*, "Data-driven prediction of battery cycle life before capacity degradation," *Nature Energy*, vol. 4, no. 5, pp. 383–391, Mar. 2019.
- [117] P. M. Attia, A. Grover, N. Jin, K. A. Severson, T. M. Markov, Y.-H. Liao, M. H. Chen, B. Cheong, N. Perkins, Z. Yang *et al.*, "Closed-loop optimization of fast-charging protocols for batteries with machine learning," *Nature*, vol. 578, no. 7795, pp. 397–402, Feb. 2020.



Huai Wang (M'12-SM'17) received a BE degree in electrical engineering from Huazhong University of Science and Technology, Wuhan, China in 2007 and a PhD degree in power electronics from the City University of Hong Kong in 2012. He is currently Professor with the Center of Reliable Power Electronics (CORPE), Department of Energy Technology at Aalborg University, Denmark. He was a Visiting Scientist with the ETH Zurich, Switzerland, from Aug. to Sep. 2014, and with the Massachusetts Institute of Technology (MIT), USA, from Sep. to Nov. 2013. He was with the ABB Corporate Research Center, Switzerland in 2009. His research addresses the fundamental challenges in modelling and validation of power electronic component failure mechanisms and application issues in system-level predictability, condition monitoring, circuit architecture, and robustness design.

Dr. Wang received the Richard M. Bass Outstanding Young Power Electronics Engineer Award from the IEEE Power Electronics Society in 2016, and the Green Talents Award from the German Federal Ministry of Education and Research in 2014. He is currently the Chair of the IEEE PELS/IAS/IES Chapter in Denmark. He serves as an Associate Editor of IET Electronics Letters, IEEE JOURNAL OF EMERGING AND SELECTED TOPICS IN POWER ELECTRONICS, and IEEE TRANSACTIONS ON POWER ELECTRONICS.



Frede Blaabjerg (S'86-M'88-SM'97-F'03) was with ABB-Scandia, Randers, Denmark, from 1987 to 1988. From 1988 to 1992, he was a PhD student at Aalborg University, where he received his PhD degree in Electrical Engineering in 1995. He became an Assistant Professor in 1992, an Associate Professor in 1996, and a Full Professor of power electronics and drives in 1998. Since 2017, he has been a Villum Investigator. He is doctor honoris causa at University Politehnica Timisoara (UPT), Romania and Tallinn Technical University (TTU) in Estonia.

His current research interests include power electronics and its applications such as in wind turbines, PV systems, reliability, harmonics, and adjustable speed drives. He has published more than 600 journal papers in the fields of power electronics and its applications. He is the co-author of four monographs and editor of 10 books in power electronics and its applications.

He has received 32 IEEE Prize Paper Awards, the IEEE PELS Distinguished Service Award in 2009, the EPE-PEMC Council Award in 2010, the IEEE William E. Newell Power Electronics Award 2014, the Villum Kann Rasmussen Research Award 2014, the Global Energy Prize in 2019, and the 2020 IEEE Edison Medal. He was the Editor-in-Chief of the IEEE Transactions on Power Electronics from 2006 to 2012. He was a Distinguished Lecturer for the IEEE Power Electronics Society from 2005 to 2007 and for the IEEE Industry Applications Society from 2010 to 2011 as well as 2017 to 2018. In 2019-2020, he serves as President of the IEEE Power Electronics Society. He is Vice-President of the Danish Academy of Technical Sciences. He was nominated in 2014-2019 by Thomson Reuters to be among the 250 most-cited researchers in engineering in the world.