SigmaWay Blog

SigmaWay Blog aggregates original and third-party content for the site's users. It covers articles on Process Improvement, Lean Six Sigma, Analytics, Market Intelligence, Training, IT Services, and the industries that SigmaWay serves.

Garbage In, Garbage Out in Data Science!

Whether you are a data analyst at a firm or a developer training a machine learning model, you work with data. More precisely, you need data! Data is the foundation on which everything else is built, because your decisions and results depend on the output you get from it. That makes data important, and like everything else, it follows the principle of Garbage In, Garbage Out.

Many people make mistakes while adding data to their dataset in the hope of getting better results.

However, they end up with a messy dataset and a greater risk of damaging their product.

The six most common mistakes are: not enough data, low-quality classes, low-quality data, unbalanced classes, unbalanced data, and no validation or testing.

These mistakes can be fixed, which in turn helps produce good results.

Just remember that your dataset is as important as the model you train on it. Without a balanced dataset, getting a well-finished product is next to impossible.
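Two of the mistakes listed above, unbalanced classes and no validation or testing, are easy to check for before any training starts. Here is a minimal sketch in plain Python, not taken from the linked article: the helper names `class_balance` and `train_validation_split`, the toy labels, and the 20% validation share are all assumptions made for illustration.

```python
import random
from collections import Counter


def class_balance(labels):
    """Return per-class counts and the ratio of the largest to the smallest class."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    ratio = max(counts.values()) / min(counts.values())
    return counts, ratio


def train_validation_split(rows, validation_fraction=0.2, seed=0):
    """Shuffle a copy of rows and hold out a fraction of them for validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical toy dataset: 90 "cat" labels and 10 "dog" labels.
labels = ["cat"] * 90 + ["dog"] * 10

counts, ratio = class_balance(labels)
print(counts["cat"], counts["dog"], ratio)

train, validation = train_validation_split(labels)
print(len(train), len(validation))
```

A ratio far above 1 suggests the classes are unbalanced, and the held-out validation rows give you something to test the model against that it has not seen during training. What counts as "too unbalanced" depends on the problem, so treat any threshold as a judgment call.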

To learn how to fix these mistakes, visit: https://hackernoon.com/stop-feeding-garbage-to-your-model-the-6-biggest-mistakes-with-datasets-and-how-to-avoid-them-3cb7532ad3b7
