Choosing between frequentist and Bayesian approaches is the great debate of the last century, with a recent surge in Bayesian adoption in the sciences.
Number of articles referring to Bayesian statistics on sciencedirect.com (April 2024) — graph by the author
What’s the difference?
The philosophical difference is actually quite subtle; some even propose that the great Bayesian critic, Fisher, was himself a Bayesian in some regard. While there are countless articles that delve into the formulaic differences, what are the practical benefits? What does Bayesian analysis offer to the lay data scientist that the plethora of widely adopted frequentist methods do not already? This article aims to give a practical introduction to the motivation, formulation, and application of Bayesian methods. Let’s dive in.
Prior Beliefs
While frequentists deal with describing the exact distribution of the data, the Bayesian viewpoint is more subjective. Subjectivity and statistics?! Yes, the two are actually compatible.
Let’s start with something simple, like a coin flip. Suppose you flip a coin 10 times, and get heads 7 times. What is the probability of heads?
P(heads) = 7/10 (0.7)?
Obviously, we are hampered by a small sample size here. From a Bayesian point of view, however, we are allowed to encode our beliefs directly, asserting that if the coin is fair, the chance of heads or tails must be equal, i.e. 1/2. While in this example the choice seems pretty obvious, the debate is more nuanced when we get to more complex, less obvious phenomena.
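To make this concrete, here is a minimal sketch of how the prior and the 7-out-of-10 data combine; the Beta(10, 10) prior is a hypothetical choice standing in for the "the coin is probably fair" belief, not a prescribed one:

```python
# Observed data: 7 heads in 10 flips
heads, flips = 7, 10

# Frequentist point estimate: the sample proportion
mle = heads / flips  # 0.7

# Bayesian: encode the "the coin is probably fair" belief as a Beta prior
# centred at 0.5. Beta(10, 10) is a hypothetical choice of prior strength --
# it takes a fair amount of evidence to move us away from 0.5.
a_prior, b_prior = 10, 10

# Beta prior + Binomial likelihood -> Beta posterior (conjugacy)
a_post = a_prior + heads
b_post = b_prior + (flips - heads)
posterior_mean = a_post / (a_post + b_post)

print(f"MLE: {mle:.2f}, posterior mean: {posterior_mean:.2f}")
# The posterior mean (~0.57) lands between the prior belief (0.5)
# and the raw data estimate (0.7).
```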
Yet, this simple example is a powerful starting point, highlighting both the greatest benefit and shortcoming of Bayesian analysis:
Benefit: Dealing with a lack of data. Suppose you are modeling the spread of an infection in a country where data collection is scarce. Will you use the small amount of data to derive all your insights? Or would you want to factor in commonly seen patterns from similar countries into your model, i.e. informed prior beliefs? Although the choice is clear, it leads directly to the shortcoming.
Shortcoming: the prior belief is hard to formulate. For example, if the coin is not actually fair, it would be wrong to assume that P(heads) = 0.5, and there is almost no way to find the true P(heads) without a long-run experiment. In this case, assuming P(heads) = 0.5 would actually be detrimental to finding the truth. Yet every statistical model (frequentist or Bayesian) must make assumptions at some level, and the ‘statistical inferences’ in the human mind are actually a lot like Bayesian inference, i.e. constructing prior belief systems that factor into our decisions in every new situation. Additionally, formulating wrong prior beliefs is often not a death sentence from a modeling perspective either, if we can learn from enough data (more on this in later articles).
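As a rough illustration of that last point, here is a minimal sketch with a deliberately wrong, hypothetical Beta(2, 10) prior that favours tails, and a simulated coin whose true P(heads) is 0.8 — the data gradually override the bad prior:

```python
import numpy as np

# Suppose the coin is actually biased: true P(heads) = 0.8,
# but we (wrongly) start from a prior that strongly favours tails.
a_prior, b_prior = 2, 10   # hypothetical prior, mean ~0.17

rng = np.random.default_rng(0)
for n_flips in [10, 100, 1000]:
    heads = rng.binomial(n_flips, 0.8)          # simulate an experiment of this size
    a_post = a_prior + heads
    b_post = b_prior + (n_flips - heads)
    print(n_flips, "flips -> posterior mean", round(a_post / (a_post + b_post), 2))
# With only 10 flips the bad prior dominates; by 1000 flips the
# posterior mean has been pulled close to the true value of 0.8.
```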
Bayes’ Rule
So what does all this look like mathematically? Bayes’ rule lays the groundwork. Let’s suppose we have a parameter θ that defines some model which could describe our data (e.g. θ could represent the mean, variance, slope w.r.t. a covariate, etc.). Bayes’ rule states that

P(θ = t | data) = P(data | θ = t) × P(θ = t) / P(data)

where:
P(θ = t | data) represents the conditional probability that θ is equal to t, given our data (a.k.a. the ‘posterior’).
Conversely, P(data | θ = t) represents the probability of observing our data if θ = t (a.k.a. the ‘likelihood’).
Finally, P(θ = t) is simply the probability that θ takes the value t (the infamous ‘prior’), and P(data) is just a normalizing constant that makes the posterior sum to one.
So what’s this mysterious t? It can take many possible values, depending on what θ means. In fact, you want to try a lot of values and check the likelihood of your data for each. This is a key step, and you really, really hope that you checked the best possible values for θ, i.e. those which cover the region where the likelihood of seeing your data is highest (the global maximum, for those who care).
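As a minimal sketch of what “checking the likelihood for each value” means, here θ is P(heads) from the coin example and the candidate values are arbitrary ones picked for illustration (using scipy’s Binomial pmf):

```python
from scipy import stats

# For each candidate value t of theta = P(heads), compute the likelihood:
# the probability of observing 7 heads in 10 flips if theta really were t.
for t in [0.3, 0.5, 0.7, 0.9]:
    likelihood = stats.binom.pmf(7, n=10, p=t)
    print(f"P(data | theta = {t}) = {likelihood:.3f}")
# theta = 0.7 gives the highest likelihood of these candidates --
# the observed data are most probable under that value.
```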
And that’s the crux of everything Bayesian inference does!
Form a prior belief for possible values of θ,
Scale it with the likelihood at each θ value, given the observed data, and
Return the computed result i.e. the posterior, which tells you the probability of each tested θ value.
Graphically, this looks something like:
Prior (left) scaled with the likelihood (middle) forms the posterior (right) (figures adapted from Andrew Gelman’s book). Here, θ encodes the east-west location coordinate of a plane. The prior belief is that the plane is more towards the east than the west. The data challenge the prior, and the posterior thus lies somewhere in the middle. [image using data generated by author]
This highlights the next big advantages of Bayesian stats:
We have an idea of the entire shape of θ’s distribution (e.g., how wide is the peak, how heavy are the tails, etc.), which can enable more robust inferences. Why? Simply because we can not only better understand but also quantify the uncertainty (as compared to a traditional point estimate with a standard deviation).
Since the process is iterative, we can constantly update our beliefs (estimates) as more data flows into our model, making it much easier to build fully online models.
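Here is a minimal sketch of both points for the coin example, assuming scipy, a flat Beta(1, 1) starting prior, and made-up day-2 data: the full posterior gives an uncertainty interval, and yesterday’s posterior becomes today’s prior.

```python
from scipy import stats

# Day 1: 7 heads in 10 flips, starting from a flat Beta(1, 1) prior
a, b = 1 + 7, 1 + 3                       # posterior parameters after day 1
day1 = stats.beta(a, b)
print("day 1: mean", round(day1.mean(), 2),
      "94% interval", [round(float(x), 2) for x in day1.ppf([0.03, 0.97])])

# Day 2: 40 more flips arrive, 22 of them heads.
# Online updating: yesterday's posterior is today's prior.
a, b = a + 22, b + (40 - 22)
day2 = stats.beta(a, b)
print("day 2: mean", round(day2.mean(), 2),
      "94% interval", [round(float(x), 2) for x in day2.ppf([0.03, 0.97])])
# The interval narrows as data accumulate -- the uncertainty is quantified
# explicitly rather than summarised by a single point estimate.
```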
Easy enough! But not quite…
This process involves a lot of computation, where you have to calculate the likelihood for each possible value of θ. Okay, maybe this is easy if, say, θ lies in a small range like [0, 1]. We can just use the brute-force grid method, testing values at discrete intervals (10 values at 0.1 spacing, 100 values at 0.01 spacing, or more… you get the idea) to map the entire space with the desired resolution.
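For the coin example, the whole grid recipe fits in a few lines; this is a minimal sketch assuming numpy/scipy and the same hypothetical “roughly fair” Beta(10, 10) prior used earlier:

```python
import numpy as np
from scipy import stats

# Brute-force grid approximation: evaluate prior x likelihood on a grid
# of candidate theta values, then normalise to get the posterior.
theta = np.linspace(0, 1, 101)                  # candidate values of P(heads)
prior = stats.beta(10, 10).pdf(theta)           # hypothetical "roughly fair" prior
likelihood = stats.binom.pmf(7, n=10, p=theta)  # P(7 heads in 10 flips | theta)
unnormalised = prior * likelihood
posterior = unnormalised / unnormalised.sum()   # make the grid weights sum to 1

print("posterior mode:", theta[np.argmax(posterior)])
print("posterior mean:", round(float(np.sum(theta * posterior)), 3))
```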
But what if the space is huge, and god forbid additional parameters are involved, like in any real-life modeling scenario?
Now we have to test not only the possible parameter values but also all their possible combinations, i.e. the solution space expands exponentially, rendering a grid search computationally infeasible. Luckily, physicists have worked on the problem of efficient sampling, and advanced algorithms exist today (e.g. Metropolis-Hastings MCMC, Variational Inference) that are able to quickly explore high-dimensional parameter spaces and concentrate their effort where the posterior probability is highest. You don’t have to code these complex algorithms yourself either; probabilistic programming tools like PyMC or STAN make the process highly streamlined and intuitive.
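As a taste, here is a minimal sketch of the coin-flip model in PyMC (assuming PyMC v5 and ArviZ are installed; the sampler settings are illustrative, not prescriptive):

```python
import pymc as pm
import arviz as az

with pm.Model():
    p = pm.Beta("p", alpha=1, beta=1)             # prior over P(heads)
    pm.Binomial("obs", n=10, p=p, observed=7)     # likelihood of 7 heads in 10 flips
    idata = pm.sample(2000, tune=1000, chains=4)  # MCMC sampling behind the scenes

print(az.summary(idata, var_names=["p"]))         # posterior mean, sd, credible interval
```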
STAN
STAN is my favorite, as it allows interfacing with more common data science languages like Python, R, Julia, MATLAB, etc., aiding adoption. STAN relies on state-of-the-art Hamiltonian Monte Carlo sampling techniques that virtually guarantee reasonably-timed convergence for well-specified models. In my next article, I will cover how to get started with STAN for simple as well as not-so-simple regression models, with a full Python code walkthrough. I will also cover the full Bayesian modeling workflow, which involves model specification, fitting, visualization, comparison, and interpretation.