How to Ensure Your Data is Clean

by | Jul 9, 2021

Google Analytics is a great tool for analyzing data, but the data is rarely 100% clean. Oftentimes it’s filled with spam, inaccurate links, and misrepresented numbers that give you a less than clear picture of your performance online.

Having clean data is important if you want to accurately read the success of your campaign, and it gives you a squeaky clean picture to help plan your future online campaigns.

So here are some of the basics.  

What is clean data? 

When analyzing data, it is important for your data to be as accurate as possible. Clean data is what you get when you filter out the data and traffic sources in Google Analytics that don’t accurately reflect your customer behavior. This includes spam, inaccurate numbers, and false links.

Why does it matter? 

Analyzing accurate data gives you a clearer picture of customer behavior, so you get an honest, true story of your traffic and sales and can sort through the noise that is not applicable to your business or campaigns.

So how can you give your data a clean sweep? 

Add UTM Parameters to Campaign URLs

When you run a PPC campaign, send a newsletter, place banner ads on other websites, etc., you can create special URLs tagged with UTM parameters (such as utm_source, utm_medium, and utm_campaign) that help you filter traffic by campaign type and gain insight into how your campaign performed. This is also helpful if you run your campaigns on several different channels. You will want to know which channels are performing the best and which could use more love. Read more about the types of parameters you can add to your site here.
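As a rough illustration, tagging a campaign URL can be sketched in a few lines of Python. The function name add_utm, the example.com address, and the campaign values are made up for this example; only the utm_source, utm_medium, and utm_campaign parameter names are the standard ones.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_utm(url, source, medium, campaign):
    """Return url with utm_source, utm_medium and utm_campaign appended."""
    parts = urlparse(url)
    # Keep any query parameters the URL already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunparse(parts._replace(query=urlencode(query)))


# Hypothetical newsletter link for a summer sale.
print(add_utm("https://www.example.com/sale", "newsletter", "email", "summer_sale"))
# https://www.example.com/sale?utm_source=newsletter&utm_medium=email&utm_campaign=summer_sale
```

Using one helper for every channel keeps the naming consistent, which makes the campaigns much easier to compare later.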

Secure Your Site 

If your site is still not SSL secured, there is a high chance that referral traffic from secure (HTTPS) sites will be masked as direct traffic, because browsers typically drop the referrer when a visitor moves from an HTTPS page to an HTTP one. A quick solution is to purchase an SSL certificate to secure your site. This will help ensure that Google Analytics is treating the traffic appropriately.

Exclude Bots and Spam 

Google Analytics is an intuitive tool that does a good job of filtering out spam on its own, but setting up manual filters helps target areas GA misses. You will also want to exclude information that isn’t relevant or helpful to your overall picture.

To start, select the box under “bot filtering” in the Admin area of Google Analytics.

Bots and spiders have a sneaky way of scraping your content and making their way into your analytics as weird-looking traffic sources in your data. 

From there, you can identify which websites in your reports are bogus or created by bots.

Filter Out Ghost Spam

This spam comes from sources with fake hostnames. To clean these intruders out, add a filter that only includes your own hostname, which stops ghost spam at the gate.
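An include filter like this usually boils down to a regular expression that matches only the hostnames you actually serve. The Python sketch below shows the idea and lets you test a pattern before relying on it. The hostnames and the helper name is_valid_hostname are placeholders, not anything Google Analytics provides.

```python
import re

# Hypothetical hostnames you actually serve; replace with your own.
VALID_HOSTNAMES = ["example.com", "www.example.com", "shop.example.com"]

# Escape the dots and anchor both ends so only exact hostnames match.
pattern = "^(" + "|".join(re.escape(h) for h in VALID_HOSTNAMES) + ")$"
hostname_filter = re.compile(pattern, re.IGNORECASE)


def is_valid_hostname(hostname):
    """True if a hit's hostname would pass the include filter."""
    return hostname_filter.match(hostname) is not None


for host in ["www.example.com", "free-traffic.xyz", "example.com.spam.net"]:
    print(host, is_valid_hostname(host))
```

Anchoring the pattern matters: without the ^ and $, a fake hostname that merely contains your domain would slip through.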

Clean Out Your Language

Another way to filter out spam is to clean up your language data. Oftentimes, pesky spam traffic reports its language as “C”, which doesn’t correspond to any real language.
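One way to spot these values is to check whether the reported language looks like a normal language code at all. The sketch below assumes that genuine values resemble "en", "en-us", or "zh-hans-cn"; that shape is an assumption made for this example, and the helper name is hypothetical.

```python
import re

# Assumption: a real value is a 2-3 letter code, optionally followed by
# hyphen-separated parts such as a region ("en-us", "pt-br", "zh-hans-cn").
VALID_LANGUAGE = re.compile(r"^[a-z]{2,3}(-[a-z0-9]{2,8})*$", re.IGNORECASE)


def is_suspicious_language(value):
    """True if the value does not look like a normal language code."""
    return VALID_LANGUAGE.match(value.strip()) is None


for value in ["en-us", "fr", "c", "Buy cheap traffic now"]:
    print(repr(value), is_suspicious_language(value))
```

Placeholder values such as "(not set)" are flagged too, so review the matches before excluding anything.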

Eliminate Fake Referrals


Lastly, fake referrals will show up in your analytics. 

To view a list of your referral websites and potential false links, go to Acquisition > All Traffic > Referrals.

Often these sites register as a Source in Google Analytics in the hope that you will visit them.

A simple way to spot these sites is to look for a 100% bounce rate, a session duration of 0 minutes, 0 transactions, and $0 revenue.
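If you export the referral report, the same check can be run over every row at once. The following is a minimal sketch: the ReferralRow fields and the sample sources are invented for illustration and do not mirror any particular export format.

```python
from dataclasses import dataclass


@dataclass
class ReferralRow:
    source: str
    bounce_rate: float          # 1.0 means a 100% bounce rate
    avg_session_seconds: float
    transactions: int
    revenue: float


def looks_fake(row):
    """Flag referrals that bounce every time and never engage or buy."""
    return (
        row.bounce_rate >= 1.0
        and row.avg_session_seconds == 0
        and row.transactions == 0
        and row.revenue == 0
    )


# Made-up rows for illustration only.
rows = [
    ReferralRow("partner-blog.example", 0.42, 95.0, 3, 120.0),
    ReferralRow("get-free-visitors.example", 1.0, 0.0, 0, 0.0),
]
suspects = [row.source for row in rows if looks_fake(row)]
print(suspects)  # ['get-free-visitors.example']
```

A flagged source is only a candidate; a brand-new legitimate referrer with a single bounced visit would match as well, so check the site before filtering it.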

Filter Out Internal Traffic 

In-house employees will be scrolling and working on your site, whether that is your marketing manager, a content creator, or a data analyst.

It’s important to remove these people from your overall analysis so you get a better understanding of the customers that are visiting your site, rather than your own team. 

Your team’s traffic will inflate your numbers and give you an unrealistic view of the visitors to your site.

This is easily resolved by excluding the IP addresses of your team.
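The logic behind an IP exclusion can be sketched with Python's standard ipaddress module. The networks below come from address ranges reserved for documentation and stand in for your real office addresses; the helper name is hypothetical.

```python
import ipaddress

# Placeholder office networks (documentation ranges); replace with your own.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),   # e.g. the whole office range
    ipaddress.ip_network("198.51.100.7/32"),  # e.g. one remote employee
]


def is_internal(ip):
    """True if the visitor IP belongs to one of the internal networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in INTERNAL_NETWORKS)


for ip in ["203.0.113.25", "198.51.100.7", "192.0.2.10"]:
    print(ip, is_internal(ip))
```

Listing a whole range rather than individual addresses saves you from updating the filter every time someone new joins the team.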

Organize Channel Groupings

Google Analytics will give you a list of all websites that direct traffic back to your site. 

Google organizes this traffic into channel groups such as Organic Search, Paid Search, Direct, Referral, and Social.

There are times, however, when Google does not apply the right Channel.

This is why you will want to go into your Channel list and manually adjust any websites that are categorized incorrectly. This will ultimately give you a more accurate picture of your data.
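To make the idea concrete, here is a simplified sketch of how traffic might be sorted into channels, with a manual override list for sources that land in the wrong group. The rules are loosely modeled on common source and medium conventions and are an assumption, not Google's exact definitions. The override entries and function name are hypothetical.

```python
import re

PAID_SEARCH = re.compile(r"^(cpc|ppc|paidsearch)$")
SOCIAL = re.compile(r"^(social|social-network|social-media|sm)$")

# Hypothetical manual corrections for sources that get misclassified.
OVERRIDES = {
    "m.facebook.com": "Social",
    "newsletter.example.com": "Email",
}


def channel_for(source, medium):
    """Return a channel name for one source / medium pair."""
    source, medium = source.lower(), medium.lower()
    if source in OVERRIDES:  # manual adjustments win over the rules
        return OVERRIDES[source]
    if source == "(direct)" and medium in ("(none)", "(not set)"):
        return "Direct"
    if medium == "organic":
        return "Organic Search"
    if PAID_SEARCH.match(medium):
        return "Paid Search"
    if SOCIAL.match(medium):
        return "Social"
    if medium == "referral":
        return "Referral"
    return "(Other)"


print(channel_for("google", "organic"))
print(channel_for("m.facebook.com", "referral"))
```

Checking the overrides first means a source you have corrected by hand never falls back into the wrong group.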

Create Content Groups 

If you have an eCommerce website that covers a lot of different categories, it is wise to section off those categories so you can get reporting for each one. This will help you greatly improve your eCommerce SEO and sales.

For example, if you have a clothing eCommerce site with a lot of different category pages, creating categories like “mens, womens, and kids” and then subcategories like “women’s dresses, women’s shirts, women’s pants” will help you track your sales more accurately. Learn how to do that here.
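Content groups are often defined by matching page paths against rules. The sketch below assumes a URL structure like /womens/dresses/; that structure, the group names, and the helper name are assumptions for illustration, so adapt the patterns to your own site.

```python
import re

# Hypothetical rules; the most specific patterns come first.
CONTENT_GROUP_RULES = [
    ("Womens Dresses", re.compile(r"^/womens/dresses(/|$)")),
    ("Womens Shirts", re.compile(r"^/womens/shirts(/|$)")),
    ("Womens Pants", re.compile(r"^/womens/pants(/|$)")),
    ("Womens", re.compile(r"^/womens(/|$)")),
    ("Mens", re.compile(r"^/mens(/|$)")),
    ("Kids", re.compile(r"^/kids(/|$)")),
]


def content_group(path):
    """Return the first group whose pattern matches the page path."""
    for name, pattern in CONTENT_GROUP_RULES:
        if pattern.match(path):
            return name
    return "(not set)"


for path in ["/womens/dresses/red-maxi", "/womens/shoes", "/mens", "/about"]:
    print(path, "->", content_group(path))
```

Because the first match wins, the subcategory rules have to sit above the broader category rule they belong to.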

Although these adjustments may seem big, they are actually small steps that take little time. They will ensure that the data you are analyzing is clean and accurate. Oftentimes bots, spam, and inaccurate numbers bog down our data, leaving it unreliable and deceptive.

Having squeaky clean data gives you a solid canvas to track, plan, and execute a marketing campaign that works for your business.
