Conditional probability is better than probability, if you have the relevant information
1. Introduction
Anyone who has ever studied probability has heard the long-established definition: “Probability can be defined as the number of favorable outcomes divided by the total number of outcomes.” I can still hear my 4th grade teacher reiterating this!
While this definition is correct, it often makes me wonder: how accurate is it in the real world? How accurate is it when we have more information about the favorable outcomes, that is, when we can attach “conditions” to them?
Adding “conditions” is like slicing your original pie of favorable outcomes again and again, one condition at a time, until you are left with the slice that truly represents the favorable outcomes for you. The image below attempts to depict this concept briefly.
What better way to illustrate this than with what is on the minds of many international data science students studying in the US and looking for jobs! The original number of available jobs is depicted on the extreme left.
1st Condition : Introducing the 1st condition for “Work Experience Duration” refines the slice for the number of available jobs for a new joiner.
2nd Condition : Furthermore, introducing the 2nd condition for “Nationality/Citizenship” refines the slice even more.
3rd Condition : After the 3rd condition, the small dark blue slice in the extreme right chart is the most accurate representation of the number of available jobs (number of favorable outcomes).
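As a rough sketch of the slicing idea, the snippet below filters a tiny, made-up list of job postings by the first two conditions named above. Every posting, the field names (`min_years_experience`, `open_to_international`) and the one-year threshold are hypothetical and chosen only for illustration; none of it is real job-market data.

```python
# Toy data: all postings and field names below are hypothetical.
jobs = [
    {"title": "Data Analyst",       "min_years_experience": 0, "open_to_international": True},
    {"title": "Junior Data Sci.",   "min_years_experience": 0, "open_to_international": False},
    {"title": "ML Engineer I",      "min_years_experience": 1, "open_to_international": True},
    {"title": "Data Scientist II",  "min_years_experience": 3, "open_to_international": True},
    {"title": "Staff Data Sci.",    "min_years_experience": 5, "open_to_international": False},
    {"title": "Analytics Engineer", "min_years_experience": 2, "open_to_international": False},
    {"title": "BI Analyst",         "min_years_experience": 0, "open_to_international": False},
    {"title": "Research Assistant", "min_years_experience": 1, "open_to_international": False},
]

# 1st condition: work experience (assumed cut-off: at most one year required).
entry_level = [j for j in jobs if j["min_years_experience"] <= 1]

# 2nd condition: nationality/citizenship, applied on top of the first slice.
entry_and_open = [j for j in entry_level if j["open_to_international"]]

# Each condition shrinks the pie of favorable outcomes.
for label, subset in [("all jobs", jobs),
                      ("after 1st condition", entry_level),
                      ("after 2nd condition", entry_and_open)]:
    print(f"{label}: {len(subset)} of {len(jobs)}")
```

With this toy list the pie shrinks from 8 postings to 5 and then to 2, which mirrors how each added condition narrows the slice in the charts.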
Before proceeding into why conditional probability may be better than probability, let’s do a quick recap of the definitions.
2. Definition of probability and conditional probability
Probability :
P(A) = Number of favorable outcomes for A / Total number of outcomes
Conditional probability :
Now consider two events A and B. Conditional probability is built on the idea of “an event given another event”. When one says “A given B”, it means the probability of event A occurring given that event B has occurred. In other words, we attach the “condition of B” to A.
P(A|B) = P(A intersection B) / P(B), where P(B) > 0 and
P(A intersection B)* is the probability of both event A and event B occurring.
*(A intersection B) and (A’ intersection B) are mutually exclusive, where A’ is the complement of A (A not occurring). Hence (A intersection B) union (A’ intersection B) = B.
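To make both definitions concrete, here is a minimal sketch that counts outcomes for one roll of a fair six-sided die. The die and the choice of events (A: the roll is even, B: the roll is greater than 3) are assumptions introduced purely for illustration.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (an assumed toy example).
outcomes = {1, 2, 3, 4, 5, 6}
A = {o for o in outcomes if o % 2 == 0}  # event A: the roll is even
B = {o for o in outcomes if o > 3}       # event B: the roll is greater than 3


def prob(event, space):
    """Favorable outcomes divided by total outcomes, within the given space."""
    return Fraction(len(event & space), len(space))


p_a = prob(A, outcomes)            # 3 favorable out of 6
p_b = prob(B, outcomes)            # 3 favorable out of 6
p_a_and_b = prob(A & B, outcomes)  # {4, 6}: 2 favorable out of 6
p_a_given_b = p_a_and_b / p_b      # P(A|B) = P(A intersection B) / P(B)

# Same answer if B is treated as the new, smaller sample space.
assert p_a_given_b == prob(A, B)

# (A intersection B) and (A' intersection B) are disjoint and together make up B.
A_complement = outcomes - A
assert (A & B) | (A_complement & B) == B

print(p_a, p_a_given_b)  # 1/2 2/3
```

Knowing that the roll is greater than 3 moves the probability of an even roll from 1/2 to 2/3, which is the "slicing" effect of a condition in its simplest form.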
With these slightly confusing definitions out of the way, I will move on to why I think conditional probability is actually better.
3. Example — Inspiration for this article
To start off, the idea for this article came to me the other day while I was watching a Bollywood movie, during a scene in which two old friends discuss the probability of bumping into each other!
Let me introduce some more information about this scene:
- 1st friend : Police officer, originally from Mumbai, who was traveling to Kalimpong, a small town, for a case.
- 2nd friend : Math professor, a resident of Kalimpong.
- The friends know each other because they studied at the same university.
- In the scene, the friends meet at a cafe where the professor goes every day.
With this information in place, let’s go back to the probability of the two of them bumping into each other in Kalimpong.
Police Officer : “Bro, what are the chances!”
Math Professor : “One out of 95675”
Police Officer : “Wrong! You did not count me”
Math Professor : “I did. The current population is 95674”
Hmm… so let’s break down this logic :
Initial Probability Calculation:
- The math professor calculated the probability of meeting his friend, the police officer, as 1/95,675.
- This assumes that each of the 95,674 residents of Kalimpong, and the visiting police officer, is equally likely to be the person the professor meets.
Why is this calculation inaccurate:
- This calculation assumes that meeting the police officer is the same as meeting ANY OTHER resident of Kalimpong!
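The professor's figure can be reproduced in a couple of lines. This is only a sketch of the uniform assumption described above; the variable names are illustrative, and the population figure is the one quoted in the dialogue.

```python
from fractions import Fraction

residents = 95_674              # population quoted by the professor
people_in_town = residents + 1  # plus the visiting police officer

# Uniform assumption: each of these people is equally likely to be the one met.
p_naive = Fraction(1, people_in_town)

print(p_naive)         # 1/95675
print(float(p_naive))  # roughly 1e-05
```

Nothing in this calculation uses what we know about the two friends, which is exactly the gap that conditioning is meant to fill.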
Introducing conditional probability:
Let’s consider some specific scenarios
I. Relevant Information:
- The police officer is a resident of Mumbai who traveled to Kalimpong.
- The math professor goes to this cafe every day.
- The police officer happened to go to the same cafe this one time.
II. Conditional Events:
Event A: The professor and police officer meet in Kalimpong.
Event B: The police officer travels from Mumbai to Kalimpong.
Probability of the two friends meeting :
1. The probability of the police officer traveling from Mumbai to Kalimpong, P(B), depends on factors such as:
- How often does he travel for work?
- How often does he get assigned to cases from small towns?
- Let’s assume this probability is 0.1%.
2. The probability of the two friends meeting, given that the police officer is in Kalimpong, P(A|B), depends on factors such as:
- How often do they both go to the cafe?
- How popular is the cafe?
- The professor regularly goes to the cafe.
- Let’s assume this probability is 1%.
Final calculations :
- The probability that the police officer travels to Kalimpong and the two friends meet there is P(A intersection B) = P(A|B) × P(B) = 1% × 0.1% = 0.001%.
- By contrast, once we know the police officer is in Kalimpong, the probability of the two friends meeting is P(A|B), the assumed 1%, which is 1,000 times larger.
- This is a simplistic representation of the concept, but the point is to always look for more relevant information to refine your probability.
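The arithmetic behind these numbers can be sketched as follows. It reads the assumed 1% as P(A|B), the chance of meeting once the officer is in town, and the assumed 0.1% as P(B); both values are the article's illustrative guesses rather than measured quantities, and the variable names are my own.

```python
from fractions import Fraction

# Both inputs are illustrative assumptions, not measured values.
p_b = Fraction(1, 1000)         # P(B): the officer travels to Kalimpong (0.1%)
p_a_given_b = Fraction(1, 100)  # P(A|B): they meet, given he is there (1%)

# Multiplication rule, a rearrangement of P(A|B) = P(A intersection B) / P(B).
p_a_and_b = p_a_given_b * p_b

# Dividing the joint probability by P(B) recovers the conditional probability.
assert p_a_and_b / p_b == p_a_given_b

print(f"P(A and B) = {float(p_a_and_b):.5%}")   # 0.00100%
print(f"P(A | B)   = {float(p_a_given_b):.5%}")  # 1.00000%
```

Using `Fraction` keeps the percentages exact; plain floats would work just as well for a back-of-the-envelope estimate like this one.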
4. Conclusion
Probability is simple and complicated at the same time! However, there is always some refining we can do with any additional information we are given. In real-world situations, always look for how additional information can help you add conditions that make your probabilities more accurate.
Thank you for reading and I hope this article was useful to you!
From Assumptions to Accuracy: The Role of Conditional Probability in Real-World Predictions was originally published in Towards Data Science on Medium.