Up to 80% of a Machine Learning project is about collecting data.
Data can be many things.
In Machine Learning, data are collections of facts:
Type | Examples |
---|---|
Numbers | Prices. Dates. |
Measurements | Size. Height. Weight. |
Words | Names and Places. |
Observations | Counting Cars. |
Descriptions | It is cold. |
Human intelligence needs data:
A real estate broker needs data about sold houses to estimate prices.
Artificial Intelligence also needs data:
A Machine Learning program needs data to estimate prices.
Data can help us to see and understand.
Data can help us to find new opportunities.
Data can help us to resolve misunderstandings.
Healthcare and life sciences collect public health data and patient data to learn how to improve patient care and save lives.
The most successful companies in many sectors are data-driven. They use sophisticated data analytics to learn how the company can perform better.
Banks and insurance companies collect and evaluate data about customers, loans and deposits to support strategic decision-making.
The most common data to collect are Numbers and Measurements.
Often data are stored in arrays representing the relationship between values.
This table contains house prices versus size:
Size | Price |
---|---|
50 | 7 |
60 | 8 |
70 | 8 |
80 | 9 |
90 | 9 |
100 | 9 |
110 | 10 |
120 | 11 |
130 | 14 |
140 | 14 |
150 | 15 |
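As a minimal sketch, the table could be held in two parallel JavaScript arrays, where the same index in each array refers to the same house. The names `sizes`, `prices`, and `houses` are illustrative and are not taken from the original page.

```javascript
// House data from the table: the same index refers to the same house.
const sizes = [50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150];
const prices = [7, 8, 8, 9, 9, 9, 10, 11, 14, 14, 15];

// Pair each size with its price to make the relationship explicit.
const houses = sizes.map((size, i) => ({ size: size, price: prices[i] }));

console.log(houses[0]);     // the first house in the table
console.log(houses.length); // the number of houses
```

Parallel arrays are convenient for charting libraries, while the array of objects keeps each house's values together.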
Quantitative data are numerical, such as prices, sizes and weights.
Qualitative data are descriptive, such as "It is cold."
Pie chart with two slices: Sample (8) and Population (92).
A Census is when we collect data for every member of a group.
A Sample is when we collect data for some members of a group.
If we wanted to know how many Americans smoke cigarettes, we could ask every person in the US (a census), or we could ask 10 000 people (a sample).
A census is accurate, but hard to do. A sample is less accurate, but easier to do.
A Population is the group of individuals (or objects) we want to collect information from.
A Census is information about every individual in a population.
A Sample is information about a part of the population, chosen to represent the whole.
In order for a sample to represent a population, it must be collected randomly.
A Random Sample is a sample where every member of the population has an equal chance of appearing in the sample.
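One way to draw such a sample is sketched below, using a partial Fisher-Yates shuffle so that every member is equally likely to be picked. The helper names `randomSample` and `mean` are hypothetical, and treating the 11 house prices from the earlier table as the whole population is an assumption made only for illustration.

```javascript
// Draw a simple random sample of n items without replacement.
// Works on a copy, so the original population array is left unchanged.
function randomSample(population, n) {
  if (n < 0 || n > population.length) {
    throw new RangeError("n must be between 0 and the population size");
  }
  const pool = population.slice();
  for (let i = 0; i < n; i++) {
    // Pick a random index from the part of the pool not yet chosen.
    const j = i + Math.floor(Math.random() * (pool.length - i));
    const tmp = pool[i];
    pool[i] = pool[j];
    pool[j] = tmp;
  }
  return pool.slice(0, n);
}

// Average of an array of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Assumption: the house prices from the table are the whole population.
const population = [7, 8, 8, 9, 9, 9, 10, 11, 14, 14, 15];
const sample = randomSample(population, 4);

console.log("Census mean:", mean(population)); // uses every member
console.log("Sample mean:", mean(sample));     // varies from run to run
```

The sample mean changes on every run because `Math.random()` is not seeded, which is the inaccuracy a sample trades for being easier to collect.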
Sampling Bias (Error) occurs when samples are collected in such a way that some individuals are less (or more) likely to be included in the sample than others.
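The effect of such a bias can be illustrated with the house prices from the earlier table. The sketch below assumes, purely for illustration, that those 11 houses are the whole population and that only the four smallest houses could be reached when collecting the sample. The names `allPrices`, `biasedSample`, and `average` are hypothetical.

```javascript
// House prices from the table, ordered from the smallest house to the largest.
const allPrices = [7, 8, 8, 9, 9, 9, 10, 11, 14, 14, 15];

// Average of an array of numbers.
function average(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Biased collection: only the four smallest houses can end up in the sample.
const biasedSample = allPrices.slice(0, 4);

console.log("Census average:", average(allPrices));           // about 10.36
console.log("Biased sample average:", average(biasedSample)); // 8
```

Because the larger, more expensive houses had no chance of being included, the sample average underestimates the average of the whole group.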