
Machine Learnt: Said No One

  • Writer: Eashwar Sathyamurthy
  • Dec 2, 2023
  • 4 min read

Updated: Mar 23, 2024


One of the major problems with developing neural net models is that they are always learning.


Before freaking out at the above statement, give me a chance to explain what I was thinking. Imagine you are in second grade, learning “Addition.” First, you start with numbers between 0 and 10. Then, you move on to bigger numbers. By doing addition again and again on different numbers (integers and fractions), you train yourself to add quickly and efficiently, (hopefully) correcting yourself every time you make an error. Then you are given homework of 10 textbook exercises and find you solved 9 out of 10 correctly. You analyze the error, take an exam, and this time solve every problem correctly. At this point, you say to yourself, “I can do addition.” Just for the sake of making my point, let’s assume a second grader does all these things.


In the above example, at what point did you master addition? You finished learning it and convinced yourself: give me any two or more numbers and I can add them correctly. In truth, you can never say exactly when you mastered anything. You are simply confident that you have, based on all your experience doing addition and learning from your mistakes. People still make addition errors after second grade, but those are slips of attention, not gaps in knowledge, and they do not need to go back to addition exercises to retrain themselves. Why is that? Because we figured out the logic behind addition; humans are merely prone to human error. That is also why it is unacceptable for a calculator to get addition wrong: a wrong answer only means it is a faulty calculator (cruel world), because the logic coded inside it is incorrect.


There are a lot of similarities between this example and how we train and develop neural net models. The steps are the same: develop, train, and test the model. (I do realize I just compressed an entire Andrew Ng course on Machine Learning into that one sentence.) This blog is not about how to make neural nets learn better, but about when they are finished learning. Even Andrew Ng, at the end of every lecture in his course, says the iconic line, “Don’t worry about it if you don’t understand.” Suppose we develop a neural net model to add two numbers (what’s the point of calculators, then?) and it achieves 98% accuracy on both the training and test datasets (in an ideal world). Does that mean voila, we are done? We are missing one crucial point: did the neural network learn the logic behind addition? Essentially, what we have done is train a bunch of weights to map inputs to outputs with 98% accuracy, where the input is two numbers and the output is their sum. While some might argue that it is an absolute stroke of genius to develop a neural net model for something as groundbreaking as addition, you are absolutely correct. Let’s move on!
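To make this concrete, here is a minimal sketch of what such an “addition model” might look like in PyTorch. The architecture, data range, and hyperparameters are my own illustrative choices, not anything from the original post; treat it as a toy, not a recipe.

```python
# A minimal sketch of a neural net that maps two numbers to their sum.
# All choices here (sizes, ranges, learning rate) are illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Training data: 1,000 pairs of numbers in [0, 10), labeled with their sum.
x = torch.rand(1000, 2) * 10
y = x.sum(dim=1, keepdim=True)

# A tiny feed-forward network: a bunch of weights, not the logic of addition.
model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

for _ in range(500):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# The answer comes out close to 7, but only approximately:
# a learned mapping from inputs to outputs, not a rule.
print(model(torch.tensor([[3.0, 4.0]])))
```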


One key ambiguity that often sneaks under the radar is the notion of 98% accuracy. It's like claiming you're a math whiz because you aced 98% of your addition problems, but here's where it gets interesting. If your test set is just a cozy 100 addition problems, that's impressive: you only stumbled on 2! Toss 10,000 problems into the mix, though, and suddenly that same 98% translates to a not-so-cool 200 problems where your neural network went, “Oops!” But wait, there's more to the story. Achieving 98% accuracy on small training and test sets is like acing a single math quiz drawn from an infinite pool of possible questions. Of course, we can't train and test our neural net on infinite possibilities, just as we can't spend our entire lives doing addition. It's the classic case of the infinite test cases that never made it into our neural network's training room. If only we had the time!
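Both points can be shown in a few lines, continuing the hypothetical sketch above (it reuses the `torch` import and the `model` and data from that snippet): the same 2% error rate means very different absolute error counts at different scales, and inputs outside the training range stand in for all the cases the network never saw.

```python
# Same 2% error rate, very different absolute counts of wrong answers.
for n in (100, 10_000):
    print(f"{n} problems at 98% accuracy -> {int(0.02 * n)} wrong")

# Probing beyond the [0, 10) training range of the sketch above: a model
# that had learned the logic of addition would extrapolate; one that
# merely fit weights to its training data usually will not.
big = torch.tensor([[100.0, 250.0]])
print(model(big))  # typically nowhere near 350
```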


But how do we resolve this ambiguity? How can we confirm or validate that a neural network has genuinely learned the logic behind addition? The honest answer starts with admitting how hard it is to understand the reasoning inside neural networks, which is why they are so often labeled “black boxes.” In simple words, the black box phenomenon refers to a situation where something works, but we don't exactly know how. It's like using a TV remote without understanding the electronics inside: you press a button and the TV responds, but you're not sure about the magic happening inside the remote. Similarly, neural networks can make predictions and decisions, but it can be challenging to explain, step by step, how they arrive at those results.


It seems convenient to label something a black box when we don't know its intricate workings. One of the most badass cinematic introductions I have ever read is in this paper. It literally says, I quote, “Since the dawn of time, human beings have asked some fundamental questions: who are we? Why are we here? Is there life after death? Unable to answer any of these, in this paper, we will consider cohomology classes …” In a world full of mysteries, we often embrace the term “black box” to neatly package the unknown. It's a label that says, “Hey, I'm complex, don't bother figuring me out!” But just because something is labeled a black box doesn't mean we shouldn't strive to peek inside. After all, it is the pursuit of understanding that transforms the mysterious into the understood.
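As a hedged illustration of what “peeking inside” can look like (again continuing the same hypothetical sketch, reusing `x`, `y`, and the earlier imports): if we shrink the addition model down to a single linear layer, the box becomes shallow enough to open, and the learned parameters read off as the rule they approximate.

```python
# Peeking inside: a single linear layer trained on the same toy data.
linear = nn.Linear(2, 1)
opt = torch.optim.Adam(linear.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

for _ in range(2000):
    opt.zero_grad()
    loss = loss_fn(linear(x), y)
    loss.backward()
    opt.step()

# The parameters land near weights [1, 1] and bias 0, i.e.
# output = 1*a + 1*b + 0: the logic of addition, made visible.
print(linear.weight.data, linear.bias.data)
```

Of course, this only works because the model is one layer deep; for the deep nets we actually care about, opening the box is an open research problem, which is precisely the point.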


This blog did not answer the question of whether we can confirm that a neural net has learned the underlying logic. But if it made the audience pause for a moment on the fact that we cannot theoretically prove a neural net will keep giving correct answers to additions in the future, or for any other use case it was trained on, then I am satisfied. Remember: if someone claims they built a neural net that predicts the future with 100% certainty, how can you believe them unless you already know the actual future?

