Wednesday, June 07, 2017 / Perth Australia / By Niekie Jooste
In this edition of "The WelderDestiny Compass":
Don't worry, we are not going to delve into the deep dark recesses of the statistician's soul today. Luckily, we are only going to superficially think about what statistics is, and what it is not.
Unfortunately, there are many moral hazards when trying to apply statistics to people, so we will briefly give that thought some air.
The other issue is that in a statistical process, like welding, making absolute claims about anything is fraught with danger. Suddenly the issues become a lot more complex than just grabbing onto a number and holding it aloft as if it is definitive of the particular weld.
If you would like to add your ideas to this week’s discussion, then please send me an e-mail with your ideas, (Send your e-mails to: firstname.lastname@example.org) or complete the comment form on the page below.
Now let's get stuck into this week’s topics...
It is not uncommon to see a newspaper headline, or hear a politician, claiming that half the hospitals are below average, so the medical system needs to be reformed. That could be half the schools, or whatever would best score points for the politician making the statement.
Either the politician does not have a basic understanding of statistics, or s/he is just trying to play politics rather than actually contribute to society.
In any group of "objects" that we are measuring, the characteristic we are measuring will have a range of values. Roughly half the values will fall above the average value and half will fall below it. (Strictly speaking, it is the median that splits the group exactly in half, but for reasonably symmetrical distributions the mean does much the same.) In other words, it does not matter how "good" or "bad" the objects are that we are measuring, we can always find an average, with approximately half the "objects" above average and half below average.
An average value is a simple statistical calculation; it is not a moral judgement.
This group of objects we are measuring would typically be called the "population" that we are trying to measure. There will almost always be a range of values measured, but most of the time this variation is just "part of the system" and does not actually tell you much about the individual objects. It tells you more about the system. When there is a wide variance in the values measured, then it tells us that "the system" results in wide variation. It does not necessarily tell us why, or on which objects we should focus to improve the system. Variation in such a system is called "common cause" variation.
If one or more of the objects displayed a measured value that was significantly different to the average value, then we may want to decide if there is something special about this object that results in the "wayward" reading. We want to know if the variation of this object is due to a "special cause" rather than it just being part of the usual common cause variation.
There are many ways of trying to make this decision, but the most common, and the one that works in most instances, is to calculate a "standard deviation" for the system and then see where this special object's value lies relative to that standard deviation.
For a normal distribution, which covers most cases, about 68% of the measured values will fall within plus and minus one standard deviation of the mean. About 95% of the measured values will fall within plus and minus two standard deviations, and about 99.7% within plus and minus three standard deviations.
Depending on circumstances, we could decide that anything falling outside of two or three standard deviations is due to "special cause" variation. It then starts making sense to isolate this object for "special attention", to try to understand what its "problem" is.
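The decision rule above can be sketched in a few lines of Python using only the standard library. The readings below are invented for illustration; any real screening would use logged measurements and a cut-off chosen for the system at hand.

```python
# Sketch: flag "special cause" readings as values lying more than
# k standard deviations from the mean. Illustrative data only.
from statistics import mean, stdev

def special_causes(values, k=2):
    """Return the values lying more than k standard deviations from the mean."""
    m = mean(values)
    s = stdev(values)
    return [v for v in values if abs(v - m) > k * s]

readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.1, 14.5, 10.0, 9.7, 10.3]
print(special_causes(readings, k=2))  # → [14.5]
```

Note that a single large outlier inflates the standard deviation itself, which is one reason the choice between a two- and three-sigma cut-off is a judgement call rather than a fixed rule.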
More often than not, it is just not a special cause problem. It is a systematic problem. To solve the problems associated with a wide range of measured values, we therefore need to apply our efforts to the system, not the individual objects.
Don't you just love performance review systems in large corporates? Everyone gets certain "personal targets" that they need to achieve for the current financial year. If you meet your objectives, then you get an average rating. If you significantly over-achieve, you get a high rating. If you under-achieve, then you get a below average rating. Your salary increase for the next year is then usually tied to the "performance rating" that you were given.
This system might sound like a great way to "motivate" people, but my experience (both as an employee and manager) is that these systems result in a lot more de-motivation than motivation. The reason is that this way of applying statistics is not only wrong from a statistical perspective, but it also does not take account of human nature.
If we wanted to do this properly, we would have to have a "performance distribution" for all the people, and then we would have to decide which people are below or above the average based on "special cause" variation rather than common cause variation. Statistically it will only make sense to then treat these "special cause" people in any way differently. Given that almost all the employees will fall within the common cause range, this will just not be worth the effort from a cost versus reward perspective.
From a human nature perspective, this exercise is not only inefficient, but downright disastrous. If we assume that the whole system was objective and actually gave a "true value" of the person's performance (which these systems never are), then we could decide that everyone within plus and minus one standard deviation was "average", that anybody above that range was a high performer, and that anybody below it was a low performer.
This means that 84% of your employees received an average or below-average rating. Very few people actually want to hear that they are "average". From a human nature perspective, we all want to hear that we are in some way "above average". From a statistical perspective we know this is not possible, but that does not change our emotional reaction to the great news that we are indeed average. Congratulations!
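The 84% figure comes straight from the normal distribution: everyone at or below "mean plus one standard deviation" gets an average-or-below rating. A quick check, using only the standard library:

```python
# Sketch: the share of a normally distributed workforce rated
# "average or below" when the high-performer cut-off sits at
# one standard deviation above the mean.
from statistics import NormalDist

performance = NormalDist(mu=0, sigma=1)          # standardised scores
share_average_or_below = performance.cdf(1)      # P(score <= mean + 1 sigma)
print(round(share_average_or_below * 100, 1))    # → 84.1
```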
So, these systems are designed to de-motivate 84% of the workforce. How much did they pay a consultant to come up with that plan?
When capturing welding parameter values to place on the procedure qualification record (PQR) there are a myriad of different practices. As an example, let us look only at the voltage value that is recorded.
As the voltage drops significantly across the welding cables, you will record lower values the closer you get to the welding arc, and higher values the further away in the circuit you measure. Then you will notice that there are normal fluctuations of the voltage reading during the welding operation. Every now and again, there are spikes and dips in the voltage. Do you record the maximum "spikes" and minimum "dips" as a range? Do you disregard the spikes and try to record what you consider the "normal range" of maximum and minimum values? Do you try to average the value in your mind and record only this average value, as a single value for each run, or each electrode?
When performing calculations to apply allowable parameter ranges on the welding procedure specification (WPS), we also see all kinds of practices. Some people will take the maximum "spike" measured and use that as the maximum value from which to perform calculations. If this spike was only experienced for a fraction of a second throughout an entire weld of possibly 20 weld runs, how meaningful is that value?
Some people will take the maximum and minimum values and calculate an average value for the run based on that. If you have 20 runs, you end up with 20 averages. Do you now use the maximum and minimum values of those averages to calculate your ranges? What if you have single runs with widely different maximum and minimum values from all the others? How representative are your calculations then?
Some people will calculate an average value for all of the different runs and base their WPS ranges on that value. We could argue that this is the most representative value for the weld as a whole, but that leaves the welder with a rather narrow range to hold to while welding in production.
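The three practices above can be put side by side in a short sketch. The per-run voltage readings below are invented; real numbers would come from the PQR instrumentation, and the point is only how far apart the three "qualified ranges" can land for the same weld.

```python
# Sketch: three common ways of turning per-run voltage logs into a
# WPS range. Illustrative data only.
from statistics import mean

runs = [
    [23.8, 24.1, 24.0, 26.5, 23.9],   # run 1 (26.5 is a brief spike)
    [24.2, 24.0, 23.7, 24.1, 24.3],   # run 2
    [23.9, 24.4, 24.0, 23.8, 24.2],   # run 3
]

# 1. Spike-to-dip: absolute extremes across every sample.
spike_range = (min(min(r) for r in runs), max(max(r) for r in runs))

# 2. Per-run averages, then the range of those averages.
run_means = [mean(r) for r in runs]
mean_range = (min(run_means), max(run_means))

# 3. One overall average for the whole weld.
overall = mean(v for r in runs for v in r)

print(spike_range, mean_range, round(overall, 2))
```

Here the spike-to-dip range spans nearly 3 V, the range of run averages spans well under half a volt, and the single overall average is a point value with no range at all, which is exactly the narrowness problem noted above.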
Now we need to consider how the ranges are applied in production. Does the inspector monitor the voltage for a full run and apply the maximum and minimum spikes to the range on the WPS? Does the inspector average the value in his head for a single run, and apply that value to the range on the WPS?
I am sure that you get the picture here. A single parameter can be measured, controlled and calculated in widely differing ways, so the idea that a minor deviation from the "qualified range" will somehow make the weld unsuitable is not necessarily true. By the same token, a weld could be unsuitable even when it is measured to be within the WPS range.
We've said it before, but we say it again. Welding is a process with high variability. This makes it an ideal candidate for monitoring by advanced sensors and technology. There is much value that can be added by such technologies.
It also means that many welding related decisions are not nearly as black and white as they appear from the codes. Thoughtful decisions are risk based rather than absolute.
Yours in welding
Do You Have Thoughts About This Week's E-Zine?
Now is your opportunity to contribute to the topics in this week's The WelderDestiny Compass. If you have thoughts or examples that you would like to share with other readers of the e-zine, then please contribute by entering the title of your contribution in the box below. Feel free to make a brief or more expansive contribution to our discussion...
What are your experiences with measuring and monitoring of welds? What are your experiences with performance review systems? Please share your stories, opinions, insights and even fears or wishes regarding today's topics.
Click below to see contributions from other readers of this edition of The WelderDestiny Compass...