Hey there! Have you ever tried to compare two fields, like usage over the last 30 days versus the prior 30 days (days 30 to 60), only to get a bunch of false flags from small changes in low numbers? For instance, a drop from 10 to 5 registers as a -50% change, exactly the same as a drop from 100 to 50, even though it’s far less meaningful. That’s exactly what I’m trying to tackle.
To get around this issue, I’ve been thinking about using a weighted average. I’m working in SQL, and I’ve tried a few different methods, but I’m not sure if there’s an industry standard for this type of problem.
Here are some of the approaches I’ve taken so far:
1. Symmetric percent change, taking the difference relative to the average of the two periods:

   (sum_last_30_day_usage - sum_30_to_60_day_usage) / NULLIF((sum_last_30_day_usage + sum_30_to_60_day_usage) / 2.0, 0)

2. Raw percent change, scaled by the natural log of total volume so that larger totals carry more weight:

   ((sum_last_30_day_usage - sum_30_to_60_day_usage) / NULLIF(sum_30_to_60_day_usage, 0)) * LN(sum_last_30_day_usage + sum_30_to_60_day_usage + 1)
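For concreteness, here’s a sketch of how I’m running both of these in one query. The table and column names (usage_summary, account_id) are just placeholders for my real schema, and the NULLIF guards return NULL instead of erroring when a period has zero usage:

```sql
-- Sketch of both scores side by side. usage_summary, account_id, and the
-- two sum_* columns are placeholders for my actual schema; LN() is the
-- natural log in Postgres/Snowflake (SQL Server calls it LOG()).
SELECT
    account_id,
    sum_last_30_day_usage,
    sum_30_to_60_day_usage,
    -- Approach 1: symmetric percent change. The difference is taken relative
    -- to the two-period average, so the score is bounded in [-2, 2] and
    -- treats growth and decline symmetrically.
    (sum_last_30_day_usage - sum_30_to_60_day_usage) * 1.0
        / NULLIF((sum_last_30_day_usage + sum_30_to_60_day_usage) / 2.0, 0)
        AS symmetric_pct_change,
    -- Approach 2: raw percent change scaled by LN(total + 1), so scores on
    -- tiny totals shrink toward zero while large totals keep most of their
    -- raw percent change. The * 1.0 avoids integer division on int columns.
    ((sum_last_30_day_usage - sum_30_to_60_day_usage) * 1.0
        / NULLIF(sum_30_to_60_day_usage, 0))
        * LN(sum_last_30_day_usage + sum_30_to_60_day_usage + 1)
        AS volume_weighted_pct_change
FROM usage_summary;
```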
I’m curious whether anyone has a better solution, or whether there’s a standard way to handle this type of analysis. Do you have any experience with this?
The tension is that I don’t want to disregard smaller values entirely, but at the same time I need to remove the noise they create in my analysis.
What do you think?