So you need some data, eh? Oftentimes it’s extremely difficult to find open-source data sets or public APIs with exactly what you’re looking for. In these situations, the go-to is web scraping. In this blog, let’s dive into what web scraping is and walk through it step by step!
Web scraping allows you to gather data from the website of your choice! However, each website has a different HTML structure, so web scrapers are often built to explore one specific website. It’s important to learn the following things about the website of your choice:
The fastest and simplest way to evaluate a model is to perform a train-test split. This procedure, as its name suggests, splits the data into a training set and a testing set, trains the model on the training set, and checks its accuracy on the testing set. However, can you rely on this alone when finalizing your model?
The simplest answer is no because of something called the accuracy paradox.
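To see why accuracy alone can mislead, here is a minimal sketch of the accuracy paradox. The labels are made up for illustration (95 negatives, 5 positives), and the "model" is a hypothetical baseline that always predicts the majority class; the helper functions `accuracy` and `recall` are written here by hand rather than taken from any library.

```python
# Made-up, heavily imbalanced labels: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# Hypothetical baseline "model" that always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model found."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


print(accuracy(y_true, y_pred))  # 0.95
print(recall(y_true, y_pred))    # 0.0
```

The baseline scores 95% accuracy while finding none of the positive cases, which is why a second metric such as recall is worth checking alongside accuracy.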
Let’s dive into the process for evaluating your machine learning model and using the best, most effective metrics to do so.
Right off the bat, we take our data and begin the exciting (read: laborious) process…
The normal distribution is obviously very close to all of our hearts: it’s widely used and has unprecedented power.
However, lurking in the shadow of the normal distribution is the under-utilized, potential-filled binomial distribution. The most common example of this distribution is the classic coin example:
“If we flip a fair coin with a fixed probability, and we flip the same coin n times, what is the probability of getting a certain number of heads?”
In other words, if we flip a coin n times and each flip lands on heads with probability p, the number of heads (x) follows a binomial distribution…
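As a rough sketch of the coin example, the probability of exactly k heads in n flips is "n choose k" times p^k times (1 - p)^(n - k). The function name `binomial_pmf` and the sample numbers (10 flips of a fair coin) are chosen here for illustration only.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative numbers: chance of exactly 5 heads in 10 flips of a fair coin.
print(binomial_pmf(5, 10, 0.5))  # 0.24609375

# The probabilities over every possible head count (0 through n) sum to 1.
print(sum(binomial_pmf(k, 10, 0.5) for k in range(11)))
```

So even with a fair coin, landing on exactly half heads in 10 flips happens only about a quarter of the time.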
Data Enthusiast with a background in Engineering.