What is Data Science?
Data Science, as the name suggests, is a field of science that deals with data. It combines the power of computers and mathematics to analyze data, extract important information from it, and process that information into a useful output.
How can we use Data Science?
There are two ways in which we can use data science:
- Finding a solution to a problem by analyzing the data.
- Analyzing the data to come up with new ideas that can be implemented, or with new problems that can be solved with it.
Classifications of Data Science
Data Science can be classified into the following:
- Data Collection
- Data Analysis
- Data Visualization
We will take a brief look at each of these three…
Data Collection
In philosophy, data are the things that are known or assumed as facts and that form the basis of reasoning and calculation. Collecting data is something humans have been doing for ages.
Our ancestors recorded data on rocks and stones to keep count of their cattle, to preserve memories of their lives, or to pass the knowledge they had gained on to the next generation.
In the modern world, the basic purpose of collecting data is to use it to find solutions to existing problems.
We collect data mainly in these forms:
- Sound data
- Visual data
- Text data
Types of data
The two main types of data are:
Structured data
Structured data is information that is organized in a fixed format. For example, a data set that stores names and roll numbers in two separate columns is structured.
Unstructured data
Unstructured data is information that has not been processed into a predefined format. Examples are IoT sensor data, emails, chats, etc.
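To make the difference concrete, here is a minimal Python sketch. The names, roll numbers and email text are invented for illustration. It shows that structured records can be queried directly, while unstructured text has to be given some structure first.

```python
# Structured data: every record has the same named fields (columns).
# The names and roll numbers are made up for this example.
students = [
    {"name": "Asha", "roll_number": 1},
    {"name": "Ravi", "roll_number": 2},
]

# Because the layout is known in advance, a lookup table is a one-liner.
roll_by_name = {row["name"]: row["roll_number"] for row in students}

# Unstructured data: free text with no fixed fields.
email = "Hi team, Ravi (roll no. 2) will be absent tomorrow. Thanks, Asha"

# To get anything out of it we first have to impose some structure,
# here with a crude whitespace word count.
word_count = len(email.split())

print(roll_by_name["Ravi"], word_count)
```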
Data Analysis
Now that we have collected the data, we need to analyze it to find a solution to our problem.
Data analysis is the process of examining data with tools like R, Python, MATLAB, etc. The libraries available in these programming languages let us analyze data by plotting graphs or charts.
For example, consider the problem of housing price prediction. Imagine we have a dataset containing the prices of houses over the past 10 years, and we would like to use it to predict the price of a house in the coming year.
One way to do this is to plot a graph with the years on the x-axis and the house prices on the y-axis. Plotted like that, the data reveals a pattern: whether prices are increasing or decreasing over time.
Using this trend, we can then estimate the likely change in price for a house in the coming years.
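The trend idea can also be sketched in code. The example below assumes the trend is a straight line and fits it with ordinary least squares. This is one possible technique, not necessarily what a real project would use. The years and prices are toy values invented for the sketch.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data: ten years of prices that rise by the same amount every year.
years = list(range(2011, 2021))
prices = [100 + 5 * i for i in range(10)]

slope, intercept = fit_line(years, prices)

# Extend the fitted line one year past the data to get a prediction.
predicted_next_year = slope * 2021 + intercept
print(slope, predicted_next_year)
```

Real price data is rarely this tidy, so the fitted line would only approximate the points and the prediction should be treated as a rough estimate.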
Data Visualization
Data visualization is the use of graphical representations to explain data. It helps the data analyst spot patterns, trends and outliers in the data.
The data analyst can also use visualization techniques to present their findings to the customer in the form of graphs, charts and maps.
Some of the Python libraries for data visualization are:
- Plotly
- Seaborn
- Ggplot
- Altair
- Matplotlib
- Bokeh
- Folium
If we are not using a programming language for visualization, we can use tools such as:
- Google charts
- Tableau
- Xplenty
- Hubspot
- Whatagraph
Data visualization example
As an example, consider visualizing data about five machines A, B, C, D and E for the period 01-10-2020 to 07-10-2020, done in the Python programming language using the Plotly library.
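A minimal sketch of such a chart follows. It makes several assumptions: the dates are read as 1 to 7 October 2020, the measured quantity is taken to be daily output in units, and the values come from a placeholder formula rather than real readings. The helper names `build_dataset` and `plot` are hypothetical. Plotly has to be installed separately, so it is imported only inside `plot`.

```python
from datetime import date, timedelta

MACHINES = ["A", "B", "C", "D", "E"]


def build_dataset():
    """Return (days, output) filled with placeholder values, not real data."""
    days = [date(2020, 10, 1) + timedelta(days=i) for i in range(7)]
    # Hypothetical daily output per machine; a simple formula stands in
    # for actual measurements.
    output = {
        machine: [50 + 10 * idx + day for day in range(7)]
        for idx, machine in enumerate(MACHINES)
    }
    return days, output


def plot(days, output):
    """Draw one line per machine (requires the plotly package)."""
    import plotly.graph_objects as go  # imported here so the data code runs without it

    fig = go.Figure()
    for machine, values in output.items():
        fig.add_trace(go.Scatter(x=days, y=values, mode="lines+markers", name=machine))
    fig.update_layout(
        title="Daily output per machine",
        xaxis_title="Date",
        yaxis_title="Units",
    )
    fig.show()


days, output = build_dataset()
# plot(days, output)  # uncomment to open the interactive chart
```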
Subsets of Data Science
Artificial Intelligence
Artificial Intelligence (AI) is the intelligence that enables machines to think like a human and find solutions to problems with little or no human intervention. There are mainly 3 types of AI:
Artificial Narrow Intelligence (ANI)
Narrow AI is the most common form of AI that machines have these days. ANI allows a machine to do a particular task, or a small set of tasks, on its own with very little or no human intervention.
It doesn’t have emotions, feelings or consciousness, and it cannot do a wide variety of tasks unless it is programmed for them.
Examples:
- Self-driving cars
- Autopilot
- Spam Filters
- Chatbots
Artificial General Intelligence (AGI)
So far, this type of AI can only be seen in sci-fi movies. It would exhibit human-level intelligence, would be hard to distinguish from a normal human, and would be able to show emotional intelligence.
Such machines could think like human beings and solve problems based on the situation rather than just on system needs. In other words, if a particular solution to a problem might be harmful to someone else, the machine might choose another solution.
Artificial Super Intelligence (ASI)
This type of AI would have an intelligence level far superior to that of humans and would be able to think much faster than us. Such machines would have greater problem-solving skills and would update themselves, each version more capable than the one before, all in just a matter of days.
They would have the ability to evolve quickly and become better versions of themselves. This type of intelligence could even be a threat to our existence.
Machine Learning
Machine learning is the process of teaching a machine to accept inputs and do calculations, based on algorithms built upon statistics and probability, to come up with an output that is close or equal to the expected output.
We can see machine learning in our day-to-day life. For example, the recommendation system on YouTube and the ads on Instagram are based on machine learning: data about what the user clicks and likes the most is fed into a system, the system learns the user’s interests, and it suggests the content the user is most likely to be interested in.
Machine learning is classified mainly into 3 types of learning:
1. Supervised Learning
Let’s say we want our machine to pick out images of apples from a set of other images. In supervised learning, we initially provide the ML model with input images along with labels that name the fruit in each image.
An ML model is a set of algorithms that learns different features from input data and gives an output.
The model compares each image with its label and learns the features that map a particular image to a particular label.
When we then give the model a new image, it can identify the same features it saw in the training data and map the image to the matching label.
Common supervised learning problems:
- Classification: Classification assigns the output to categories that were previously given to the model as labels. For example: 0 or 1, cat or mouse, apple or mango.
- Regression: Regression is used to predict a continuous quantity. Examples are predicting the current temperature in a room and predicting stock market prices.
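As a small illustration of classification, the sketch below uses a 1-nearest-neighbour rule, one of the simplest supervised methods. Instead of images it works on two invented numeric features per fruit (weight and a redness score), which stand in for the features a model would learn from pictures. All values are made up.

```python
import math

# Labelled training data: (weight in grams, redness score from 0 to 1) -> label.
# Every number here is invented for illustration.
training = [
    ((150, 0.9), "apple"),
    ((160, 0.8), "apple"),
    ((300, 0.2), "mango"),
    ((320, 0.3), "mango"),
]


def classify(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(training, key=lambda item: math.dist(item[0], features))
    return nearest[1]


print(classify((155, 0.85)))
print(classify((310, 0.25)))
```

In this toy setup the weight dominates the distance because it has a much larger scale than the redness score. Real projects usually rescale the features first.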
2. Unsupervised Learning
In unsupervised learning, the model is provided with input data without any labels. The model would categorize the data into different groups based on similar features. Unsupervised learning is mainly used for two types of problems:
- Clustering: Clustering identifies features that are similar across data points and groups the data by these similarities, without being told what the groups are. For example, clustering people into different groups based on the spread of COVID-19 in their area.
- Association: Association finds relationships between items in the data. For example, associating a particular product with a buyer based on another product they bought recently (mapping).
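The clustering idea can be sketched with k-means, a common clustering algorithm, chosen here as an example. The sketch works on one-dimensional data. The case counts and the starting centroids are invented, and no labels are given to the algorithm.

```python
def k_means_1d(values, centroids, iterations=10):
    """Tiny k-means on 1-D data; returns (centroids, groups)."""
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        # Assignment step: put each value with its nearest centroid.
        for v in values:
            idx = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            groups[idx].append(v)
        # Update step: move each centroid to the mean of its group
        # (an empty group keeps its old centroid).
        centroids = [
            sum(g) / len(g) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids, groups


# Invented case counts for six areas.
cases = [2, 3, 4, 40, 42, 45]
centroids, groups = k_means_1d(cases, centroids=[0.0, 50.0])
print(groups)
```

The algorithm separates the low-count areas from the high-count areas on its own, which is the sense in which the model "classifies" unlabelled data.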
3. Reinforcement Learning
It’s like teaching a child what’s right and wrong. If the child does something right, we show our appreciation with chocolates, gifts, etc., and we give feedback if the child does something wrong. The next time, the child knows whether an action is good or bad from the feedback or rewards received before for doing the same thing.
So, reinforcement learning is a reward-based system in which an agent interacts with an environment by performing actions and learns from the rewards (negative or positive) it obtains from an interpreter. There is no predefined data and no supervision; the agent follows a trial-and-error method. It has to come up with an output by itself, and we only tell it whether that output is right or wrong.
Examples:
Self-driving cars, where the environment is the road and the interpreter (error signal generator) is a human in the driving seat. The human sends a signal based on the direction the car takes on its own, the lane changes it makes, or whether it follows the rules while parking.
An automated machine that sorts products into different groups based on their weight. The person who monitors the task generates an error signal: negative if the machine classifies a product wrongly, and positive if it classifies it correctly.
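The sorting-machine example can be turned into a small trial-and-error sketch. It uses an epsilon-greedy strategy with running-average value estimates, which is one simple reinforcement learning approach. The 500 g threshold, the bin names and the function names are all hypothetical.

```python
import random

ACTIONS = ["light_bin", "heavy_bin"]


def observe(weight):
    """What the agent sees: a coarse reading (500 g threshold is invented)."""
    return "heavy" if weight >= 500 else "light"


def monitor_reward(weight, action):
    """The human monitor's signal: +1 for the correct bin, -1 otherwise."""
    correct = "heavy_bin" if weight >= 500 else "light_bin"
    return 1 if action == correct else -1


def train(weights, steps=200, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    states = ["light", "heavy"]
    # Estimated value of each action in each state, all unknown (0) at first.
    q = {s: {a: 0.0 for a in ACTIONS} for s in states}
    counts = {s: {a: 0 for a in ACTIONS} for s in states}
    for step in range(steps):
        weight = weights[step % len(weights)]
        state = observe(weight)
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore: try something at random
        else:
            action = max(ACTIONS, key=lambda a: q[state][a])  # exploit best guess
        reward = monitor_reward(weight, action)
        # Keep a running average of the rewards seen for this state and action.
        counts[state][action] += 1
        q[state][action] += (reward - q[state][action]) / counts[state][action]
    # The learned policy: the best-valued action for each state.
    return {s: max(ACTIONS, key=lambda a: q[s][a]) for s in states}


policy = train([200, 800, 350, 900])
print(policy)
```

The agent is never told which bin is correct. It only receives the reward signal and gradually prefers the actions that earned positive rewards.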
In addition to these, there is another type of learning called semi-supervised in which some data is labelled and others are unlabelled.
Deep Learning
Deep learning is a subset of machine learning where we use artificial neural networks for doing the supervised, unsupervised, and reinforcement learning tasks.
Artificial Neural Networks (ANNs) are inspired by the neurons in the human brain. In deep learning, we connect multiple layers of neurons. Each layer learns particular features from its input, and its output is passed through a function (usually a non-linear activation function) that helps pick out the useful features. The result becomes the input to the next layer, and so on, until the final layer produces the output.
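The layer-by-layer flow described above can be sketched as a forward pass through a tiny network. The weights and biases below are hand-picked for illustration (a real network learns them during training), and the sigmoid is just one common choice of activation function.

```python
import math


def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of the inputs, adds its
    bias, and passes the result through the activation function."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]


# Hand-picked parameters: a hidden layer with 2 neurons, an output layer with 1.
hidden_w = [[0.5, -0.5], [1.0, 1.0]]
hidden_b = [0.0, -1.0]
output_w = [[1.0, -1.0]]
output_b = [0.0]

x = [1.0, 1.0]
hidden = layer_forward(x, hidden_w, hidden_b)          # output of the first layer
output = layer_forward(hidden, output_w, output_b)     # becomes input to the next
print(hidden, output)
```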
Benefits of using neural networks
A neural network can have many layers, each with a number of neurons, so a single poorly performing neuron usually has little effect on the overall performance. Also, what the network learns (the features identified from the input data) is stored in the network itself in the form of numbers, so we don’t need a separate database for it.
Neural networks can also be adapted to many different tasks. They can be used to solve multiple kinds of problems, much as our brain does many different things by firing different sets of neurons.
The two main areas in which Deep Learning is used the most are:
Computer vision
Computer vision is a field of artificial intelligence that uses deep learning to learn about the visual world. An image is a collection of pixel values, which a computer represents as numbers in a matrix.
These numbers are fed into the neural networks which would then learn the features of the image and would be able to either classify an image or to detect an object in the image.
A type of neural network called Convolutional Neural Networks (CNNs) is used commonly for this. Some of the most common applications of computer vision are:
- Defect detection in manufacturing
- Self-driving cars
- Intruder detection
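To make the pixel-matrix idea concrete, the sketch below slides a small kernel over a tiny grayscale image, which is the basic operation a convolutional layer performs (strictly speaking it is cross-correlation, which deep learning libraries usually call convolution). The image and the kernel are hand-written for illustration. A CNN would learn its kernels from data.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return
    the sum of element-wise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# A 4x4 grayscale "image": dark (0) on the left, bright (255) on the right.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

# A simple kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)
```

The large values in the middle column of the result mark where the brightness changes, which is the kind of feature later layers can build on.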
Natural Language Processing
NLP is a field of artificial intelligence that uses the power of neural networks to understand human language in a useful way. NLP can be used to read, understand, and create natural language. Some of the applications of NLP are:
- Google Translate
- MS Word, Grammarly – for grammar check or spellcheck
- Siri, Alexa – Personal Voice Assistant
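Before any model can work with language, the text has to be turned into numbers. The sketch below shows one very simple way to do that, a bag-of-words count. It is only a first preprocessing step and not how the products listed above work internally. The sample sentence is invented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text):
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))


counts = bag_of_words("Data science is fun. Data is everywhere!")
print(counts.most_common(2))
```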
If you have decided to go ahead with data science, you can refer to our next article on data science.
Thank you
About the Author
Deepak Jose is a B-Tech CS student with a passion for Data Science. He loves learning about Data Science, coding, and science in general, and does data analysis and visualization as a hobby. Even though he is on the Computer Science path, he always finds time to learn about space, automobiles, geography, energy, architecture, arts, etc. He loves solving problems and learning about new inventions.