
SigmaWay Blog

SigmaWay Blog aggregates original and third-party content for the site's users. It covers Process Improvement, Lean Six Sigma, Analytics, Market Intelligence, Training, IT Services, and the industries that SigmaWay serves.

The Art of Predictive Modelling 

Your perspective on data depends on the type of task you want to accomplish. These tasks can be broadly classified as:

Analytics: Helps you explore what happened and why.

Monitoring: Looks at things as they occur to find abnormalities.

Prediction: Estimates what might happen in the future.

Some of the most popular algorithms that can be applied to predict future trends are:

The Ensemble Model: It combines the outputs of multiple models to arrive at a decision. However, one has to understand how to pick the right models and which problem one wants to solve.

Unsupervised Clustering Algorithms: These algorithms help to group similar people and objects together.

Regression Algorithms: These are used to predict future values for a product or service.
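As a rough illustration of the regression item above, the following sketch fits a straight line to a short series by ordinary least squares and extrapolates one step ahead. The monthly sales figures and the function names are invented for illustration and do not come from the linked article.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Hypothetical data: units sold in months 1 to 5.
months = [1, 2, 3, 4, 5]
units_sold = [110, 120, 130, 140, 150]

slope, intercept = fit_line(months, units_sold)
forecast = predict(6, slope, intercept)
print(f"Forecast for month 6: {forecast:.1f}")
```

A straight line is only the simplest choice; real sales data may call for seasonality terms or a different model altogether.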

There is no ideal formula for finding the most suitable method for predictive analytics. A strong level of business expertise is required to master the 'art' of predictive modelling. Read more at: http://www.analyticbridge.com/profiles/blogs/the-ultimate-guide-for-choosing-algorithms-for-predictive

 


Data Value Chain for GeoSpatial Data

The value of data has changed over time. Companies have realized that collecting, analyzing, sharing, and selling data, and extracting actionable insights from it, is critical to the development of their organization. Engineers and product managers capture and analyze geospatial data to develop creative solutions and thus increase productivity. Using a framework known as the 'Data Value Chain', people can follow geospatial data from the instant it is collected and throughout its lifecycle. Where data intersects with analytics, this information can be turned into decisions. A technological ecosystem built around a geospatial system provides new ways to work, reduce costs, accelerate schedules, and supply high-value deliverables along the value chain. Read more at: http://dataconomy.com/2017/02/power-of-data-value-chain/


Data Science Challenges in Production Environment 

Very little time is spent thinking about how to deploy a data science model into production. As a result, many companies fail to capture the value that should come from their efforts and investments. In a production environment, data arrives continuously, results are computed, and models are trained frequently. The challenges faced by companies fall into four categories:

Small Data Teams: They mostly use small data and often don't retrain models; the business team is involved in the development project.

Packagers: They often build their framework from scratch and practice informal A/B testing; they are generally not involved with the business team.

Industrialization Maniacs: These teams are IT-led and have automated processes for deployment and maintenance; business teams are not involved in monitoring and development.

The Big Data Lab: They use more complex technologies; business teams are involved before and after the deployment of the data product.

Companies should understand that working in production is different from working with SQL databases in development. Moreover, real-time learning and multi-language environments will make the process more complex. Strong collaboration between business and IT teams will also increase efficiency. Read more at: http://dataconomy.com/2017/02/value-from-data-science-production/

 


Rise of Data Science Platforms

"Data science platform" has become a buzzword of the decade. So, what is it? The sole purpose of a data science platform is to encapsulate all of the data science work by incorporating the tools required to collect, analyze, and visualize data, build and deploy models, and generate reports. This toolkit makes it convenient to maintain, reproduce, and scale up a project and to produce results dynamically. Adoption of data science platforms is expected to almost double by 2018 as more companies realize their potential benefits. Many data-driven businesses face the challenge of using data science tools effectively and lack an integrated approach to their data science technology stack, which makes it hard to find value in the data. On the other hand, companies that have already established data science platforms are excelling in the field.

Read more at: http://dataconomy.com/2017/02/tech-wave-data-science-platforms/


Deploying Machine Learning On Real Time Systems

The three critical steps involved in deploying a machine learning algorithm and exposing it to the real world are:

Define a goal based on a metric: Decide whether you want human-level intelligence or merely an acceptable level, as this decision will affect the time and engineering cost of your system. Also define a metric to measure the performance of your model.

Build the system: Build a minimum viable system without worrying much about accuracy. Then adopt an incremental strategy to improve the system by solving the problems you face in each iteration.

Refine the system with more data: Initial metric values are not reliable indicators of real-life performance, and your data and users might change, so monitor system performance regularly. Update the system with new data and fine-tune the model accordingly.
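The monitoring step above can be sketched as a small helper that tracks accuracy over the most recent predictions and flags when it drops below a chosen threshold. The class name, window size, and threshold are hypothetical choices for illustration, not part of the linked guide.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions (hypothetical helper)."""

    def __init__(self, window_size=100, threshold=0.8):
        # Each entry is True when the prediction matched the actual outcome.
        self.outcomes = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Return windowed accuracy, or None before any outcome is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        acc = self.accuracy()
        return acc is not None and acc < self.threshold


# Invented (predicted, actual) pairs standing in for live traffic.
monitor = AccuracyMonitor(window_size=5, threshold=0.8)
for predicted, actual in [(1, 1), (0, 0), (1, 0), (1, 0), (0, 0)]:
    monitor.record(predicted, actual)

print(monitor.accuracy(), monitor.needs_retraining())
```

Accuracy is used here only because it is easy to read; whichever metric was chosen in the first step would take its place.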

Read more at: http://www.erogol.com/short-guide-deploy-machine-learning/


Enhancing Artificial Intelligence using Ensemble Training

Sometimes even machine learning algorithms behave surprisingly poorly: an image recognition model can be confused by an adversarial instance, i.e. an input generated by changing a few pixels, either by taking the derivative of the model output or by exploiting genetic algorithms. Adversarial instances lie in low-probability regions, in contrast to the limited set of instances from high-probability regions on which the model was trained. One possible approach to this problem is ensemble training: letting multiple models back each other up. As we look forward to developing more artificially intelligent systems, such problems will become increasingly common.
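To make the idea of models backing each other up concrete, here is a minimal sketch using plain majority voting over three toy classifiers. The classifiers, thresholds, and inputs are invented for illustration, and simple voting is only one way to combine models; it is not necessarily the scheme used in the linked article.

```python
from collections import Counter


def majority_vote(models, x):
    """Return the label predicted by most of the models for input x."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Three hypothetical classifiers with slightly different decision boundaries
# on a single numeric feature.
models = [
    lambda x: "cat" if x < 0.50 else "dog",
    lambda x: "cat" if x < 0.60 else "dog",
    lambda x: "cat" if x < 0.70 else "dog",
]

original = 0.45   # every model labels this input "cat"
perturbed = 0.55  # a small change that crosses only the first model's boundary

print("first model alone:", models[0](perturbed))
print("ensemble:", majority_vote(models, perturbed))
```

The perturbation flips the first model but not the vote, because the other two models draw their boundaries elsewhere. An attacker who can perturb the input far enough would still fool all three.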

You can read more at: http://www.erogol.com/ensembling-against-adversarial-instances/


Effective Quality Management using Hypothesis Test

Business hypothesis testing is a foundational theoretical concept, and a good understanding of it helps you achieve business goals. For instance, it provides a mathematical way to answer questions such as whether you should spend on advertising or whether increasing the price of a product will affect your customers. Data collection is one part of the game, but correct data processing and interpretation form the final stage of your decision-making process. Hypothesis testing is used to infer whether the data provide enough evidence to support a claim. There are various test methods:

Parametric tests: z-test, t-test, F-test.

Non-parametric tests: Wilcoxon rank-sum test, Kruskal-Wallis test, and permutation test.
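Of the tests listed above, the permutation test is the easiest to write from scratch. The sketch below estimates a two-sided p-value for the difference in means between two groups by repeatedly shuffling the pooled observations. The order values, the function name, and the number of permutations are assumptions made for illustration.

```python
import random
from statistics import mean


def permutation_test(a, b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical daily orders with and without an advertising campaign.
with_ads = [10, 11, 12, 13, 14, 15]
without_ads = [1, 2, 3, 4, 5, 6]

p_value = permutation_test(with_ads, without_ads)
print(f"p-value: {p_value:.4f}")
```

A small p-value suggests the observed difference would be unlikely if advertising had no effect; the cut-off for "small" is a business decision made before looking at the data.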

Read more at: http://www.datasciencecentral.com/profiles/blogs/importance-of-hypothesis-testing-in-quality-management


Hadoop Architecture for Big Data Analytics

 

The emergence of massive unstructured data sources such as Facebook and Twitter has created a need for distributed processing systems for Big Data analytics. Hadoop (a Java-based programming framework) has become the first choice of developers and industry experts, mainly because it is highly scalable, flexible, and cheap. An application is broken down into many small parts, which run on thousands of nodes to achieve fast computation and reduce overall operation time. The Hadoop architecture continues to operate even if a node fails. Its design allows you to process large volumes of data and extract computationally expensive features of users and customers.
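The idea of breaking an application into small independent parts can be sketched in a few lines. The example below is written in the spirit of the MapReduce model that Hadoop is known for, but it does not use any Hadoop API and runs sequentially in a single process; the sample posts and function names are invented for illustration.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(records, n_chunks):
    """Deal the records out into n_chunks roughly equal parts."""
    return [records[i::n_chunks] for i in range(n_chunks)]


def map_chunk(chunk):
    """Work done independently on one part: count the words it contains."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Combine two partial results into one."""
    return left + right


# Hypothetical input standing in for a large collection of posts.
posts = [
    "big data is big",
    "data lakes hold data",
    "hadoop scales out",
    "big clusters",
]

chunks = split_into_chunks(posts, 3)
# On a cluster each chunk would be processed on a different node;
# here the parts are simply processed one after another.
partial_counts = [map_chunk(chunk) for chunk in chunks]
total = reduce(merge_counts, partial_counts, Counter())

print(total.most_common(2))
```

Because each chunk is processed without reference to the others, a failed part can be recomputed on its own, which is the property that lets such a system keep operating when a node fails.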

Read more at: http://www.datasciencecentral.com/forum/topics/how-to-use-hadoop-for-data-science


Scaling Data Models in Production Environment

Often the outputs of models developed by data scientists end up in a report that summarizes the state of the business and is used by stakeholders to make decisions. However, a system that can predict future outcomes in real time is also needed. This can be achieved by integrating the model into a production environment, but doing so requires advanced engineering skills, and data scientists cannot do it alone. The deployment process broadly follows seven steps:

1. Refactor the model code

2. Walk through the code and determine how it slots into the engineering cycle

3. Rewrite it in a production stack language or PMML

4. Implement it in the tech stack

5. Test performance

6. Tweak the model based on test results

7. Slowly roll out the model
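The final step, a slow rollout, is often implemented by sending a fixed share of users to the new model. The sketch below shows one possible way to do this with a stable hash of the user ID; the function names, the percentage scheme, and the stand-in models are assumptions for illustration rather than the method described in the linked article.

```python
import hashlib


def in_rollout(user_id: str, percent: int) -> bool:
    """Return True if this user falls into the first `percent` of 100 buckets."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent


def predict(user_id, features, old_model, new_model, percent):
    """Route the request to the new model only for users inside the rollout."""
    model = new_model if in_rollout(user_id, percent) else old_model
    return model(features)


# Hypothetical stand-ins for the existing and the newly deployed model.
def old_model(features):
    return "old"


def new_model(features):
    return "new"


users = [f"user-{i}" for i in range(1000)]
served_new = sum(
    predict(u, {}, old_model, new_model, percent=10) == "new" for u in users
)
print(f"{served_new} of {len(users)} users were served by the new model")
```

Hashing the user ID keeps each user on the same model between requests, and raising `percent` only ever adds users to the new model, so the rollout can be widened step by step.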

Today, many companies are adopting tools that speed up this process so they can reap the benefits of data-driven decision making.

Read more at: https://www.datascience.com/blog/navigating-the-pitfalls-of-model-deployment

 


Winning Data Strategy using Industrialized Machine Learning

The first building block of a winning business strategy is a map based on the business value of each question and an estimate of how much time it would take to get high-quality answers to that question. The idea is to break the business questions into groups that correspond to real-time data systems. This allows you to focus on one specific system at a time to build a strong strategy, and to optimize the sequence in which the sub-questions are answered according to their current business value. A pattern of actions for a data strategy begins with a hypothesis and the collection of relevant data, followed by building models to explain the data and evaluating their credibility for future predictions. The entire process is carried out on enterprise-scale digital infrastructure using Industrialized Machine Learning (IML). This approach can also have a huge impact on the natural resources and healthcare industries.

Read more at: https://blogs.csc.com/2016/07/05/how-to-build-and-execute-a-real-data-strategy/

 


A Neural Network Approach To Raise Your E-Book Business 

E-book businesses generate a lot of revenue every day, but it is sometimes difficult for authors to earn a decent amount because of a lack of preparation and research. No matter how unique and interesting your content is, if it doesn't appear on the first or second page of search results, it's highly unlikely that a visitor will ever read it. The story doesn't end there: one must also cleverly select a title and cover that attract the reader, as they shape the way readers think. A neural network approach that uses Doc2Vec to determine the best words for a title can be adopted to increase revenue. It involves training a shallow two-layer neural network, which operates in unsupervised mode and forms clusters of the most similar words (using the cosine similarity metric) based on context.
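The cosine similarity metric mentioned above is simple to compute once word vectors are available. The sketch below ranks words by their similarity to a target word. The three-dimensional vectors are made up for illustration; in practice they would come from a trained model such as Doc2Vec, which is not used here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors):
    """Rank every other word by its cosine similarity to `word`."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings; a trained model would supply real ones.
vectors = {
    "guide": [0.9, 0.1, 0.0],
    "handbook": [0.8, 0.2, 0.1],
    "recipe": [0.1, 0.9, 0.3],
}

ranking = most_similar("guide", vectors)
for word, score in ranking:
    print(f"{word}: {score:.3f}")
```

Cosine similarity compares direction rather than length, so two words count as similar when they appear in similar contexts regardless of how frequent each one is.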

Read more about the technical details here: http://www.datasciencecentral.com/profiles/blogs/use-neural-networks-to-find-the-best-words-to-title-your-ebook


Automatic Debt Management System 

Big Data Analytics and Business Intelligence are changing the way businesses interact with customers. Modern big data solutions have enabled automated decision making in the client-handling processes of debt management systems. Correct implementation of these tools provides a more personalized experience for each customer and helps avoid infringements. Debt management automation has proven to be a successful way to balance meticulous efficiency with customer satisfaction. Such a CRM automates much of the process, so a small team needs only days to complete debt collection. Analytics has not just accelerated debt collection but also enhanced customer relations.

You can read more at: http://www.dataminingblog.com/what-could-big-data-mean-for-debt-management/

 

 


Essence of Qualitative Research

Global markets are becoming more complex each day, and it has therefore become essential for business intelligence teams to apply advanced methods of data interpretation. These teams tend to believe that only decisions based on quantitative data can be justified. However, quantitative research can go wrong in several ways, and the truth comes out only when you meet people, talk to them, and involve them in creative exercises.

Read more at: http://www.dataversity.net/science-big-data-art-interpretation/


Big Data Integration for Advanced Analytics 

Modern Big Data consumption requires data to be integrated before it actually reaches the business intelligence tools. This includes leveraging complex and unstructured data and enabling raw data to flow securely through the business. Today, even the smallest companies produce huge amounts of data across systems that need to communicate with each other, and they therefore require a platform to pipe all these data sources into data lakes.

Read more at: http://www.dataversity.net/dont-put-cart-horse-comes-big-data/


Building Consumer Intelligence System

A great customer experience is clearly one of the signs of a healthy business model. Machine Learning and Data Analytics play a fundamental role in building consumer intelligence systems. Capturing data is important, and there is no single magic source from which to collect it. Telecoms are making billions by selling data. You need to ensure that the data is relevant to your business. Once you have the right data, you are ready to model, design, engineer, and deploy your 360-degree customer view platform and deliver an enhanced customer experience for your organization.

You can read more at: http://www.datasciencecentral.com/m/blogpost?id=6448529%3ABlogPost%3A508502

 


Real Time Analytics on Streaming Data

Today the world has become smaller and faster; with increasing computation speed, decisions are made in seconds instead of days. Product information and comparisons are available on any device at any time. Real-time analytics involves solving problems quickly as they happen, or even before they happen. Companies now have more insight into their assets. Several industries are using streaming data and putting real-time analytics to work. The big data revolution has further accelerated the demand for real-time analytics to analyze customer behavior. Gone are the days when decisions were based on data stored on a disk; actions are now taken on streaming data. Read more at: http://www.datasciencecentral.com/profiles/blogs/do-you-know-what-is-powerful-real-time-analytics
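Acting on streaming data usually means updating statistics one event at a time instead of re-reading stored records. As a minimal sketch of that idea, the class below keeps a running mean and variance using Welford's online method. The readings and names are invented for illustration, and the generator merely stands in for a live feed.

```python
class RunningStats:
    """Mean and variance updated one observation at a time (Welford's method)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        """Population variance of the values seen so far."""
        return self._m2 / self.count if self.count else 0.0


def event_stream():
    """Stand-in for a live feed; the readings are invented for illustration."""
    yield from [2, 4, 4, 4, 5, 5, 7, 9]


stats = RunningStats()
for reading in event_stream():
    stats.update(reading)
    # A real system could act here, for example by raising an alert
    # when a reading is far from the running mean.

print(stats.mean, stats.variance)
```

Only three numbers are kept in memory no matter how many events arrive, which is what makes this style of computation suitable for unbounded streams.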

 


Building 21st Century Data Science Teams

A traditional data science department comprises Data Scientists, Data Engineers, and Infrastructure Engineers. The drawback of this model is that each role always depends on another and is likely to blame the others for task failures because they didn't do their job well. These conflicts may be reflected in the quality of the final data product. So, what went wrong? You probably don't have big data. Jeff Magnusson (Director of Algorithms Platform at Stitch Fix) suggested a clever approach to forming a "High Functioning Data Science Department": build an environment that allows autonomy, ownership, and focus for everyone involved, while at the same time clearly distinguishing the roles of Data Scientists and Data Engineers. Data scientists can't suddenly become talented engineers, nor will engineers be ignorant of all business logic; the partnership is inherent to the success of this model. You can read more at: http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/

 
