Debunking GIGO - Does garbage product data in = garbage product data out?

Posted by Maria West on Jul 1, 2021 2:18:41 PM

“Garbage In, Garbage Out…” or so the saying goes...

The concept asserts that the quality of the information coming out cannot be better than the quality of the information going in. This saying is used in data science and manufacturing, and often when talking about product data, like in this article, or this one.

Today, we are here to bust that myth as it pertains to product data and eCommerce. With the right tools in place, you can put garbage product data in and output high-quality product data, ready for eCommerce. 


How does product data quality erode over time?

Nobody sets out to let a company's product data fall into chaos, or knowingly facilitates it. Still, it happens easily, one small iteration at a time, as different staff complete manual actions in different ways day after day. This slow creep fills the database with errors and inconsistencies.

When assessing a product catalog, you can often see different individuals’ “unique handwriting” in the data: different spellings, dimensions placed at the front of a title in one record and at the end in another, spaces or no spaces, capitals or no capitals. Without agreed conventions or naming principles, these differences will continue to erode the consistency of the data.
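As a rough illustration of what an agreed naming convention could look like in practice, the sketch below rewrites product titles into one format. The convention itself (title case, dimensions at the end, written as "120 x 60 cm") and the function name `normalize_title` are assumptions made for this example, not rules from any particular system.

```python
import re

# Hypothetical convention: dimensions go at the END of the title,
# written as "<width> x <depth> <unit>" with a lowercase unit.
DIMENSIONS = re.compile(r"(\d+)\s*[xX]\s*(\d+)\s*(cm|mm)\b", re.IGNORECASE)


def normalize_title(title: str) -> str:
    """Rewrite a product title to follow the example convention."""
    # Collapse repeated whitespace and trim the ends.
    title = " ".join(title.split())

    match = DIMENSIONS.search(title)
    if match is None:
        # No dimensions found: only fix the capitalization.
        return title.title()

    width, depth, unit = match.groups()
    # Cut the dimensions out wherever they appear in the title...
    name = " ".join((title[:match.start()] + " " + title[match.end():]).split())
    # ...and append them again in the single agreed format.
    return f"{name.title()} {width} x {depth} {unit.lower()}"


print(normalize_title("120X60CM oak  desk"))
print(normalize_title("Oak desk 120 x 60 cm"))
```

Both calls print the same title, which is the point: two people's "handwriting" end up as one consistent record.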

Issues in the data often stem from a lack of structure at the point where the data is created.

A core system such as an ERP can be set up to be restrictive, with limitations at the data entry point that help preserve the integrity of your data. Alternatively, the data entry point can be loose and fluid, making it easier for staff to learn and use day to day - but with a greater chance of discrepancies and a loss of data integrity over time.


Identifying the problem

Errors and inconsistencies in your product data may not significantly impact your business until you start selling online, or scale your current eCommerce by adding more products and brands to your site or listing them on more channels. At that point, data integrity issues you weren’t aware of can become apparent very fast, once you start assessing your data through the lens of “ready for eCommerce.”

It’s possible you even thought your product data was clean, but when you tried to use it for eCommerce, the cracks started to show and the errors became apparent.

If you put your raw product data as it is straight into your eCommerce platform (such as BigCommerce or Shopify), you will indeed find the concept of Garbage In = Garbage Out to be true. With garbage product data on an eCommerce site, the filters will be long and illogical, the navigation will be difficult, and the site will be inconsistent and hard for visitors to shop.


Unearthing ‘gold’ from your product data with Vesta eCommerce

If you’ve found yourself in this position, Vesta can reverse the damage. With Vesta, you can put “garbage” (inconsistent, messy data) in and get consistent, clean data out. No matter how bad the data may look, 9 times out of 10 the gold that will deliver value for your business is still within it, and we can help you uncover it.


The first step in the road to data recovery is ...

Start with a hero product that represents your eCommerce vision. Depending on your business needs, there may be a different hero product for each of your categories. The hero product defines the attributes needed and how you will show them on your product pages, including details like:

  • Swatches
  • Text fields
  • Dropdowns 
  • The position of the attributes on the page



Think about the priority order of the attributes: which are the most important and have the most influence on purchase intent? Define each of those attributes, along with the naming conventions and the structure for each attribute’s labels and values.

Different categories may need different attributes and ordering. For example, people looking to buy power tools will be interested in information such as engine power and tank volume, whereas those purchasing a sofa will be looking at fabrics, colors, and dimensions.
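One way to write a hero product down, sketched here under assumptions, is as a small per-category list of attribute definitions. The class `AttributeSpec`, its fields, the category keys, and the chosen display types and priorities are all hypothetical and only illustrate the idea of recording label, display type, and priority in one place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeSpec:
    label: str     # attribute name shown on the product page
    display: str   # "swatch", "text", or "dropdown"
    priority: int  # 1 = most important, shown first


# Hypothetical hero products, using the two categories from the example above.
HERO_PRODUCTS = {
    "power_tools": [
        AttributeSpec("Engine power", "text", 1),
        AttributeSpec("Tank volume", "text", 2),
    ],
    "sofas": [
        AttributeSpec("Color", "swatch", 2),
        AttributeSpec("Fabric", "dropdown", 1),
        AttributeSpec("Dimensions", "text", 3),
    ],
}


def page_order(category: str) -> list[str]:
    """Return the attribute labels for a category in display order."""
    specs = sorted(HERO_PRODUCTS[category], key=lambda spec: spec.priority)
    return [spec.label for spec in specs]


print(page_order("sofas"))
print(page_order("power_tools"))
```

Keeping the priority as data rather than relying on list order means the display order can change without rewriting the attribute definitions.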



You may build up your hero product from your existing eCommerce presence, take some inspiration from competitors’ websites, or draw it up from scratch. You may want to do some testing, checking with colleagues that you have included all critical information, or asking customers what they are most interested in seeing when purchasing your products.

Once this clear endpoint is defined, Vesta will assess your data and determine the data transformations and extractions needed to get it from its current, raw format to the final format that matches your hero product. We call this set of data manipulations a Cleaning Profile. It outputs the data in a consistent, structured, categorized, and optimized form, ready for your eCommerce store.
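To make the idea of a set of data manipulations more concrete, here is a minimal sketch that models a profile as an ordered list of small transformation functions applied to each record. This is not Vesta's implementation; every name in it (`apply_profile`, `rename_fields`, `SOFA_PROFILE`, the field names) is an assumption made for illustration.

```python
from typing import Callable

Record = dict[str, str]
Step = Callable[[Record], Record]


def trim_values(record: Record) -> Record:
    """Strip stray whitespace around every value."""
    return {key: value.strip() for key, value in record.items()}


def rename_fields(mapping: dict[str, str]) -> Step:
    """Build a step that maps vendor field names onto the agreed names."""
    def step(record: Record) -> Record:
        return {mapping.get(key, key): value for key, value in record.items()}
    return step


def lowercase_field(field: str) -> Step:
    """Build a step that lowercases one field, if it is present."""
    def step(record: Record) -> Record:
        if field not in record:
            return record
        return {**record, field: record[field].lower()}
    return step


def apply_profile(record: Record, profile: list[Step]) -> Record:
    """Run every step of the profile over the record, in order."""
    for step in profile:
        record = step(record)
    return record


# Hypothetical profile for one category.
SOFA_PROFILE: list[Step] = [
    trim_values,
    rename_fields({"colour": "color", "Color": "color"}),
    lowercase_field("color"),
]

print(apply_profile({"colour": "  Navy Blue "}, SOFA_PROFILE))
print(apply_profile({"Color": "NAVY BLUE", "name": "Sofa"}, SOFA_PROFILE))
```

Because the same profile runs over every record, two differently formatted inputs come out with the same field name and the same value format.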


The next step is to... 

Put in place the structure and restrictions for each of the attributes you defined at the hero product stage, either by training staff or by adding field restrictions to the system that holds your product data, so that you maintain data integrity over time.

Any vendor data collected by Vesta will be cleaned by the same Cleaning Profile, giving a uniform result. For in-house changes and enrichment, you need these rules and defined ways of working to maintain a consistent result. You don’t want the system to be so restrictive that it is hard to use and significantly hampers efficiency. Still, through some iteration mixed with staff training, you can reach the right balance for system input limitations.
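As a hedged sketch of what field restrictions at the data entry point might look like, the function below checks a new entry against a few rules and reports every problem it finds instead of stopping at the first one. The specific rules (a required title, an 80-character limit, a fixed list of colors) and all names are hypothetical; the right rules depend on your own hero product.

```python
# Hypothetical restrictions for illustration only.
ALLOWED_COLORS = {"black", "grey", "navy blue"}
MAX_TITLE_LENGTH = 80


def validate_entry(entry: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the entry is accepted."""
    problems = []

    title = entry.get("title", "")
    if not title.strip():
        problems.append("title is required")
    elif len(title) > MAX_TITLE_LENGTH:
        problems.append(f"title is longer than {MAX_TITLE_LENGTH} characters")

    color = entry.get("color", "")
    if color not in ALLOWED_COLORS:
        problems.append(f"color must be one of {sorted(ALLOWED_COLORS)}")

    return problems


print(validate_entry({"title": "Oak Desk", "color": "black"}))
print(validate_entry({"title": "", "color": "Blk"}))
```

Returning all problems at once is one way to keep the restrictions usable for staff: they can fix an entry in a single pass rather than resubmitting it rule by rule.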

If you have found yourself with a catalog of substandard product data that is costing you a large amount of resources to manage or is holding back your progress in eCommerce, book a call to discuss how Vesta can automate the cleaning process to get your catalog optimized for eCommerce.


Topics: Data Cleansing, product data, product attributes, product information, Product detail page
