An Overview of Machine Learning
You've probably heard a lot about machine learning over the past few years. It's used everywhere these days — from recommending music to automatically trading stocks. It sounds like science fiction — a computer autonomously making predictions.
The way most machine learning works is that an algorithm learns the mapping between some input numbers and a target we're trying to predict. This is called supervised machine learning — we're supervising the algorithm to predict something for us.
For example, let's say we're trying to predict the stock market. We first set a target (what we're trying to predict). This target might be tomorrow's closing stock price.
We then use the other data we have (in this case, today's opening and closing prices) to predict the target.
You can see the opening and closing prices (called the predictors) and the target in the table below:
| Date | Open | Close | Target |
| --- | --- | --- | --- |
| 2022-09-07 | 3909.42 | 3979.87 | 4006.17 |
| 2022-09-08 | 3959.93 | 4006.17 | 4067.36 |
| 2022-09-09 | 4022.93 | 4067.36 | 4107.27 |
| 2022-09-12 | 4083.66 | 4107.27 | 3932.69 |
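As a rough illustration of how the Target column relates to the other columns, here is a minimal Python sketch that derives each day's target from the next day's close. The `add_targets` helper and the tuple layout are assumptions made for this example; only the prices come from the table.

```python
# Rows from the table: (date, open, close).
rows = [
    ("2022-09-07", 3909.42, 3979.87),
    ("2022-09-08", 3959.93, 4006.17),
    ("2022-09-09", 4022.93, 4067.36),
    ("2022-09-12", 4083.66, 4107.27),
]


def add_targets(rows):
    """Pair each day with the next day's close (hypothetical helper).

    The last day has no following row, so its target is None.
    """
    labeled = []
    for i, (date, open_price, close) in enumerate(rows):
        next_close = rows[i + 1][2] if i + 1 < len(rows) else None
        labeled.append((date, open_price, close, next_close))
    return labeled


for row in add_targets(rows):
    print(row)
```

The final row's target is `None` here because the table's last target (3932.69) comes from the following trading day, which is not among these four rows.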
If we're trying to determine if the stock price will go up or down tomorrow based on today's price, we might make up some rules:
- If today's price is higher than the average over the last month, then the stock will go back down.
- If today's closing price is a lot lower than today's opening price, then the price will go back up tomorrow.
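To make the idea of hand-written rules concrete, here is a small Python sketch of the two rules in the list. It is only an illustration: the function name `rule_based_guess`, the use of the closing price as "today's price", and the 1% threshold for "a lot lower" are all assumptions, not part of any real trading method.

```python
from statistics import mean


def rule_based_guess(open_price, close_price, last_month_closes, big_drop=0.01):
    """Apply the two hand-written rules; return "up", "down", or None.

    big_drop is an assumed threshold: a close more than 1% below the
    open counts as "a lot lower".
    """
    # Rule 1: today's close is above the one-month average, so guess a fall.
    if close_price > mean(last_month_closes):
        return "down"
    # Rule 2: today's close is far below today's open, so guess a rebound.
    if close_price < open_price * (1 - big_drop):
        return "up"
    # Neither rule applies, so make no guess.
    return None
```

Rules like these have to be invented and tuned by hand, which is the part a learning algorithm takes over.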
With supervised machine learning, we use an algorithm to learn these rules automatically from historical data. This creates a trained model that we can then test on historical data it hasn't seen to analyze its performance. Once we've tested our model and seen that it works, we can use it to predict the future (like tomorrow's stock price!).
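The train-then-test workflow can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real model: it "learns" a single number (the average day-to-day change in closing price) from the earlier rows of the table, then checks its prediction against the most recent row. The helper names `fit_average_change` and `mean_absolute_error` are made up for this example.

```python
# (close, target) pairs from the table, in date order.
history = [
    (3979.87, 4006.17),
    (4006.17, 4067.36),
    (4067.36, 4107.27),
    (4107.27, 3932.69),
]

# Hold the most recent row out of training so we can test on it.
train, test = history[:-1], history[-1:]


def fit_average_change(pairs):
    """'Train' a toy model: learn the average change from close to target."""
    return sum(target - close for close, target in pairs) / len(pairs)


def mean_absolute_error(pairs, avg_change):
    """Average size of the prediction error on the given rows."""
    errors = [abs((close + avg_change) - target) for close, target in pairs]
    return sum(errors) / len(errors)


avg_change = fit_average_change(train)
test_error = mean_absolute_error(test, avg_change)
print("learned average change:", round(avg_change, 2))
print("error on held-out day:", round(test_error, 2))
```

On these four rows the learned average change is positive, but the held-out day moved in the opposite direction. Testing on data the model has not seen is how that kind of failure gets noticed before the model is used on the future.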
Machine learning algorithms split into two broad categories:
Here are some of the most popular machine learning algorithms:
- Linear regression assumes that there is a linear relationship between your target and your predictors. It's one of the most commonly used machine learning algorithms.
- Decision trees work by repeatedly splitting your data up into two groups (a lot like decision trees you might use in real life!).
- Neural networks use internal weights to transform your data from the input into the target. Neural networks are used in deep learning, which has led to many recent AI breakthroughs, like GPT-3.
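To give a feel for the simplest of these, here is a minimal Python sketch of linear regression with a single predictor, fitted by ordinary least squares. The `fit_line` function is written for this example only, and it uses the Close and Target columns from the table earlier in this email. Four rows are far too few to learn anything useful, so treat it as a demonstration of the mechanics.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Close (predictor) and Target columns from the table.
closes = [3979.87, 4006.17, 4067.36, 4107.27]
targets = [4006.17, 4067.36, 4107.27, 3932.69]

slope, intercept = fit_line(closes, targets)
print(f"target ~ {slope:.3f} * close + {intercept:.2f}")
```

The fitted slope and intercept are the "rules" the algorithm learned: to predict a new target, multiply the close by the slope and add the intercept.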
This may sound complicated, and machine learning algorithms do involve a lot of math internally, ranging from linear algebra to calculus.
Luckily, you don't need to know all of the math to actually use these algorithms. In most programming languages and data tools, libraries that implement common machine learning algorithms have already been developed:
- Scikit-learn in Python implements popular algorithms.
- R has packages that implement most machine learning algorithms.
- JavaScript now has machine learning libraries, including TensorFlow.js.
- Excel enables us to do linear regression.
If you want to learn more about machine learning, check out our Dataquest courses and projects.
You can also find tutorials on our blog and YouTube.
—Vik
Founder, Dataquest
P.S.: This section is new, and we're working to improve it! Please reply to this email, and let us know what you think.
How to Start Your Career in Data: Find Your Role, Learn Skills, Build a Portfolio
In this video, we'll cover how to find the right data role, how to learn the skills you need to get hired, and how to show your skills to employers.
- You'll learn about the five main data roles: data analyst, data scientist, data engineer, business analyst, and machine learning engineer.
- We'll cover the exact skills you need to get a job in each role. We'll talk about a four-step learning method that will help you learn the skills.
- Then, we'll cover how to build a project portfolio that will advance your career.
Python Online Practice: 67 Free Ways to Improve Your Skills
Whether you’re just starting your learning journey or looking to brush up before a job interview, getting the right Python practice can make a big difference. Studies on learning have repeatedly shown that people learn best by doing. So here are 67 ways to practice Python by writing actual code.

Implementing a B-Tree Data Structure
Rudolf Bayer and Edward M. McCreight coined the term B-tree data structure at the Boeing Research Labs in 1971. They published a scientific paper titled "Organization and Maintenance of Large Ordered Indices" and introduced a new data structure for fast data retrieval from disks. Although the B-tree data structure has evolved over the decades, understanding its concepts is still valuable. Here’s what we’ll cover in this tutorial:
- B-tree data structures
- B-tree properties
- Traversing in a B-tree
- Searching in a B-tree
- Inserting a key in a B-tree
- Deleting a key in a B-tree
@giovanni.srg shared a high-quality machine learning project on Support Vector Classifier with Python, where he used the SVC algorithm to classify the level of customer satisfaction of an airline. The project is notable for its excellent explanations, insightful plots, and the high accuracy achieved by the algorithm.
Go Premium and Accomplish Your Goals.
Your goals are within reach. Subscribe and follow Dataquest's proven paths to grow your career.

Dataquest • 548 Market St #73537 San Francisco, CA, 94104