I know... this sounds like a post that only a compensation geek could love. Nevertheless, here I go.
During a recent presentation to a group of HR professionals on building a compensation program, one attendee asked about reviewing and reconciling the market data collected for different jobs. Is it OK, as she had been told, to follow the practice of removing the highest and lowest data points in the set, as a form of quality control, and simply use the remaining "middle set" to determine market value?
Not the first time I've heard the question. This quick and dirty approach to "cleaning" a market data set appears to be pretty widely used and accepted. Perhaps it is even taught somewhere as a best practice. And I realize that when you're dealing with a ton of jobs and a TON of market data, shortcuts are attractive and perhaps even necessary. But my experience suggests that a "one-size-fits-all" approach like this is far less than ideal, for a number of reasons and in many situations (not least those where you only have two or three data sources to begin with, so that removing the high and the low leaves you with one data point or none).
But more than that, what if it turns out that one of those highest (or lowest) data points...
- Reflects the best fit between survey job descriptions and the scope and responsibilities of your job?
- Represents the most robust set of data (in terms of the largest "n") among all the data lines collected?
- Is the closest fit to your geographical location and the true boundaries of your labor market for the job?
- Comes from the survey source whose participants best reflect your direct labor market competitors?
Might be smart to include it? Even if it yanks the job's overall market value in one direction or the other?
A data point that appears, at first glance, to be an outlier isn't necessarily wrong or inappropriate. It might, in fact, be the best in your set. So indiscriminately lopping off the high and low points without closer review may result in your elimination of the best and most relevant pieces of data for the job.
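To make the point concrete, here is a minimal sketch in Python. The survey names, salaries, and sample sizes are all invented for illustration, and weighting by "n" is only one assumed way of letting data quality count. It covers just one of the criteria above and is no substitute for reviewing job match, geography, and participant mix.

```python
from statistics import mean

# Hypothetical market data lines for one job: (source, salary, sample size n).
# Every figure here is invented for illustration.
data_lines = [
    ("Survey A", 58_000, 40),
    ("Survey B", 61_000, 35),
    ("Survey C", 63_000, 50),
    ("Survey D", 74_000, 400),  # highest value, but by far the largest n
]


def trimmed_mean(lines):
    """Drop the single highest and lowest data points, then average the rest."""
    if len(lines) < 3:
        raise ValueError("need at least three data lines to trim both ends")
    ordered = sorted(lines, key=lambda line: line[1])
    return mean(value for _, value, _ in ordered[1:-1])


def n_weighted_mean(lines):
    """Average all data points, weighting each by its sample size."""
    total_n = sum(n for _, _, n in lines)
    return sum(value * n for _, value, n in lines) / total_n


trimmed = trimmed_mean(data_lines)
weighted = n_weighted_mean(data_lines)
print(f"Trimmed (middle set) estimate: {trimmed:,.0f}")
print(f"n-weighted estimate:           {weighted:,.0f}")
```

In this made-up set, the high point is also the most robust one. Trimming discards it and settles on the two smallest samples, while the estimate that keeps it lands several thousand dollars higher. Which number is right depends on the review, not on the arithmetic.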
Many (most?) organizations use market data as the foundation of their salary structure and the mechanism by which relative job value is determined. Which makes those market value estimates pretty important. That being the case, it behooves us to do as careful and thoughtful a job as can be managed in reviewing and fine-tuning the data.
Don't give the quality control process short shrift!
Image: Creative Commons Photo "Money Whirlpool" by Patrick Hoesly
Nicely put, Ann. Many tend to rush past the quality control element of survey analysis and blindly apply "techniques" that some bright spark told them about eons ago. Trouble is, they were also told to "think", to look and listen to the data - because there is always a story to be told.
Thanks for reminding us all.
Posted by: Chuck Csizmar | March 24, 2010 at 09:57 PM
Thanks, Chuck! And I love and often use the story analogy as well - every data set does indeed have a story to tell. The question is, do you listen ... or simply hack away the inconvenient outliers!
Posted by: Ann Bares | March 25, 2010 at 09:07 AM
Great post! I would never think of doing such a thing, but I guess another HR practitioner who is not solely dedicated to compensation wouldn't understand the impact of applying this practice.
Posted by: Windsor Lewis - CCP, GRP, PHR | March 25, 2010 at 05:53 PM
Thanks, Windsor. Based on the number of times I have encountered people using this technique (some of them compensation specialists even), I'm guessing it is pretty common.
Posted by: Ann Bares | March 25, 2010 at 05:59 PM
Sorry, Windsor. Being a sucker for a good pun, I just can't resist posting a link to Winsorized Means (http://en.wikipedia.org/wiki/Winsorized_mean), highly relevant to this thread.
Posted by: E. James (Jim) Brennan | March 26, 2010 at 12:13 PM
For some reason, the link above was defective. This should work better:
http://en.wikipedia.org/wiki/Winsorized_mean
Posted by: E. James (Jim) Brennan | March 26, 2010 at 12:19 PM
Funny that you mentioned this. I actually do lop off data that skew a particular job family. Most of the time, when I do this, it's because there are a few very veteran data points (employees in the same job for the last 30 years) that are skewing the market. But then again, I do have my hands on the data dump/raw data, so I can do further analysis. I agree you wouldn't remove outliers willy-nilly, but if you have a strong case, it can be done.
Posted by: Juliana | April 01, 2010 at 07:54 PM