Blog

  • Alien Base crypto soars nearly 60% in 24 hours – Analyzing the sudden jump

    Benjamin Njiri

    Alien Base crypto surged nearly 60% in 24 hours. 
    The altcoin is less than a month old but has over 80K users.

    Alien Base [ALB] was one of the new and best-performing assets on 28 October.


  • Pundit Believes XRP Could Eclipse Bitcoin Soon for These Reasons


    Newton Gitonga

    XRP Explosion In The Cards As Ripple Renews Optimism in the U.S. Market With Latest Acquisition

    Former Ripple executive and crypto enthusiast Matt Hamilton has suggested that XRP may soon outpace Bitcoin as the leading cryptocurrency.


  • Metaplanet Becomes The Second-Largest Bitcoin Corporate Holder In Asia After Buying Another $10 Million In BTC


    Brenda Ngari

    Crypto Community Fears Centralization As MicroStrategy’s Bitcoin Holdings Hit 132,500 BTC

    Japanese firm Metaplanet has acquired an additional $10 million in Bitcoin (BTC), bringing its total holdings of the flagship crypto to over 1,000 BTC as it continues its October BTC shopping spree. The Tokyo-based company announced on Monday that it had purchased 156.78 BTC for approximately $10.4 million, bringing its total holdings to 1,018 BTC (worth roughly $68.8 million). […]


  • AI Strategy Makes Millions Buying POPCAT and MAGA, Now Buys PCHAIN for Another 12,000x Gains by January 2025


    Guest Author


    AI-driven strategies are redefining crypto now. AI-based crypto projects are beginning to gain a better reputation than other types of cryptocurrencies. POPCAT and MAGA are examples of these projects. The same AI system that discovered POPCAT and MAGA has identified a new opportunity, PropiChain (PCHAIN). Positioned to generate 12,000x returns by January 2025, PCHAIN’s revolutionary […]


  • PropiChain (PCHAIN) Amongst Best 3 RWA and Crypto AI Altcoins Poised to Mint Millionaires by 2025

    Guest Author


    2025 is just a stone’s throw away, and as we keep getting closer, we see that some of the latest trends in RWA (Real-World Assets) and Crypto AI altcoins are making waves. Among these tokens is PropiChain, which is praised alongside other crypto AI projects for bringing good returns. This article focuses on the best […]


  • Shiba At $5 By 2030? Analyst Calls It A Pipe Dream As He Grabs $0.0171 WallitIQ (WLTQ) Coin Set To Reach $3


    Guest Author


    With the Shiba Inu price growth slowing down significantly since its peak, it is facing increased scrutiny as many begin to question the long-term viability of Shiba Inu (SHIB). Although the coin has seen a slight uptick recently, analysts suggest that the dream of the Shiba Inu price reaching $5 by 2030 may be overly […]


  • GSM-Symbolic: Analyzing LLM Limitations in Mathematical Reasoning and Potential Solutions


    Alexander Watson

    What The Paper on LLM Reasoning Got Right — And What It Missed.

    Co-authors: Alex Watson, Yev Meyer, Dane Corneil, Maarten Van Segbroeck (Gretel.ai)

    Source: Gretel.ai

    Introduction

    Large language models (LLMs) have recently made significant strides in AI reasoning, including mathematical problem-solving. However, a recent paper titled “GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models” by Mirzadeh et al. raises questions about the true capabilities of these models when it comes to mathematical reasoning. We have reviewed the paper and found it to be a valuable contribution to the ongoing discussion about AI capabilities and limitations; however, our analysis suggests that its conclusions may not fully capture the complexity of the issue.

    The GSM-Symbolic Benchmark

    The authors introduce GSM-Symbolic, an enhanced benchmark derived from the popular GSM8K dataset. This new benchmark allows for the generation of diverse question variants, enabling a more nuanced evaluation of LLMs’ performance across various setups. The study’s large-scale analysis of 25 state-of-the-art open and closed models provides significant insights into how these models behave when faced with mathematical reasoning tasks.
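
    To make the idea concrete, here is a minimal sketch of how a GSM-Symbolic-style template can be instantiated into many variants of the same question. The template, names, and number ranges are invented for illustration; the actual benchmark uses its own templates and sampling constraints.

    ```python
    # A minimal sketch of GSM-Symbolic-style templating (illustrative only).
    import random

    TEMPLATE = (
        "{name} picked {n_apples} apples. {name} gave {n_given} of them to a friend "
        "and then picked {n_more} more. How many apples does {name} have now?"
    )

    def instantiate(seed):
        """Sample one variant of the template and return (question, answer)."""
        rng = random.Random(seed)
        name = rng.choice(["Sophie", "Liam", "Amara", "Kenji"])
        n_apples = rng.randint(5, 30)
        n_given = rng.randint(1, n_apples)
        n_more = rng.randint(1, 20)
        question = TEMPLATE.format(
            name=name, n_apples=n_apples, n_given=n_given, n_more=n_more
        )
        return question, n_apples - n_given + n_more

    for s in range(3):
        q, a = instantiate(s)
        print(q, "->", a)
    ```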

    Figure 1: GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models (Source: Mirzadeh et al., GSM-Symbolic Paper)

    Performance Variability and Model Comparisons

    One of the most surprising findings is the high variability in model performance across different instantiations of the same question. All models exhibit “significant variability in accuracy” when tested on GSM-Symbolic. This variability raises concerns about the reliability of currently reported GSM8K results, which rest on a single point estimate of accuracy.
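
    Reporting a distribution rather than a single number is straightforward. The sketch below assumes a hypothetical evaluate_model callable that returns accuracy on one instantiation of the question templates; everything else is standard library.

    ```python
    # Summarize accuracy across many template instantiations instead of one run.
    import random
    import statistics

    def accuracy_over_variants(evaluate_model, n_variants=50):
        """Evaluate on many template instantiations and summarize the spread."""
        accuracies = [evaluate_model(variant_seed=s) for s in range(n_variants)]
        return statistics.mean(accuracies), statistics.stdev(accuracies)

    # Dummy evaluator standing in for a real harness, just to show the output shape.
    mean_acc, std_acc = accuracy_over_variants(
        lambda variant_seed: random.Random(variant_seed).uniform(0.74, 0.86)
    )
    print(f"accuracy = {mean_acc:.3f} +/- {std_acc:.3f}")
    ```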

    Figure 3: GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models (Source: Mirzadeh et al., GSM-Symbolic Paper)

    Not all models are created equal. Llama-3–8b and GPT-4o are clear outliers in that they don’t exhibit as significant a drop on the new benchmark as other models like gemma-2–9b, phi-3, phi-3.5 and mathstral-7b. This observation suggests two important points:

    1. Llama-3–8b and GPT-4o generally demonstrate a more robust understanding of mathematical concepts, although they are still not immune to performance variations.
    2. The training data for Llama-3–8b and GPT-4o likely has not been contaminated (or at least not to the same extent) with GSM8K data. In this context, data contamination refers to the unintentional inclusion of test or benchmark data in a model’s training set, leading to artificially inflated model performance during evaluation. If contamination had occurred, as the authors hypothesize for some models, we would expect to see very high performance on GSM8K but significantly lower performance on even slight variations of these problems.
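
    A crude way to probe for this kind of contamination is to look for long verbatim n-gram overlaps between a model’s training corpus and the benchmark questions. The helper below is a simple heuristic sketch with toy strings, not the methodology used in the paper.

    ```python
    # A rough n-gram overlap check between a training document and a benchmark
    # question. High overlap is only a heuristic signal of contamination, not proof.
    def ngrams(text, n=8):
        tokens = text.lower().split()
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

    def overlap_score(train_doc, test_question, n=8):
        """Fraction of the question's n-grams that also appear in the training doc."""
        test_set = ngrams(test_question, n)
        if not test_set:
            return 0.0
        return len(test_set & ngrams(train_doc, n)) / len(test_set)

    # Toy example: a training snippet that partially repeats a test question.
    train = "natalia sold clips to 48 of her friends in april and then sold half as many clips in may"
    test = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
    print(overlap_score(train, test, n=5))
    ```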

    These findings highlight an opportunity for improvement through the use of synthetic data, where properly designed synthetic datasets can address both of these points for anyone training models:

    1. To mitigate potential data contamination issues, there’s no need to use the original GSM8K data in training when high-quality synthetic versions can be generated. These synthetic datasets retain the mathematical reasoning challenges of GSM8K without reusing the exact problems or solutions, thus preserving the integrity of the model’s evaluation.
    2. Even more importantly, it’s possible to generate synthetic data that surpass the quality of both the OpenAI GSM8K and Apple GSM-Symbolic datasets. This approach can lead to a more robust understanding of mathematical concepts, addressing the performance variability observed in current models.
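
    One way to produce such synthetic problems is to prompt a strong LLM to write fresh questions at a target difficulty instead of reusing benchmark items. The sketch below only builds the prompt and delegates generation to a `complete` callable; the prompt wording and the callable are assumptions, not Gretel’s actual generation pipeline.

    ```python
    # Sketch of prompting an LLM for a fresh GSM8K-style problem instead of
    # reusing benchmark items. `complete` stands in for any chat-completion client.
    SYNTH_PROMPT = (
        "Write a new grade-school math word problem that requires {n_steps} arithmetic "
        "steps to solve. Do not copy any existing benchmark problem. Return the problem, "
        "a step-by-step solution, and the final numeric answer."
    )

    def synthesize_problem(complete, n_steps=4):
        """Ask an LLM for one synthetic problem with a worked solution."""
        return complete(SYNTH_PROMPT.format(n_steps=n_steps))

    # Dummy backend so the sketch runs without an API key.
    print(synthesize_problem(lambda prompt: "(model output would appear here)"))
    ```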

    Sensitivity to Changes and Complexity

    The authors show that LLMs are more sensitive to changes in numerical values than to changes in proper names within problems, suggesting that the models’ understanding of the underlying mathematical concepts may not be as robust as previously thought. As the complexity of questions increases (measured by the number of clauses), the performance of all models degrades and the variance in their performance increases. This highlights the importance of using diverse training data, which is something synthetic data can help with. As the authors demonstrate, there is logically no reason why an AI model should perform worse on a given set of problems simply because the numbers change or the number of clauses varies slightly.

    Figure 4: GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models (Source: Mirzadeh et al., GSM-Symbolic Paper)

    The GSM-NoOp Challenge

    Perhaps the most concerning finding is the introduction of GSM-NoOp, a dataset designed to challenge the reasoning capabilities of LLMs. By adding seemingly relevant but ultimately inconsequential information to problems, the authors observed substantial performance drops across all models, up to 65% for some. The authors propose that this points to current LLMs relying more on a type of pattern matching than on true logical reasoning.
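
    The flavor of the perturbation is easy to reproduce: take a solvable problem and splice in a statement that sounds relevant but does not affect the answer. The distractor templates below are illustrative only, not the ones used to build GSM-NoOp.

    ```python
    # Sketch of a GSM-NoOp-style perturbation: insert a clause that sounds
    # relevant but does not change the answer.
    import random

    DISTRACTORS = [
        "Five of the {item} were a bit smaller than average.",
        "The {item} were bought on a Tuesday.",
    ]

    def add_noop_clause(question, item, seed=0):
        """Insert an inconsequential statement before the final question sentence."""
        distractor = random.Random(seed).choice(DISTRACTORS).format(item=item)
        head, sep, tail = question.rpartition(". ")
        return f"{head}. {distractor} {tail}" if sep else f"{distractor} {question}"

    q = "Oliver picks 44 kiwis on Friday and 58 kiwis on Saturday. How many kiwis does he have?"
    print(add_noop_clause(q, item="kiwis"))
    ```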

    Figure 6: GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models (Source: Mirzadeh et al., GSM-Symbolic Paper)

    A Critical Perspective on the Paper’s Conclusions

    While the GSM-Symbolic study provides valuable insights into the performance of LLMs on mathematical reasoning tasks, it’s important to critically examine the paper’s conclusions. The authors argue that the observed limitations suggest LLMs are not capable of true logical reasoning. However, this interpretation may be oversimplifying a complex issue.

    The paper’s argument for LLMs relying on pattern matching rather than reasoning seems less definitive when examined closely. It’s clear that these models are not perfect reasoners — if they were, they would achieve 100% accuracy on GSM8K. But the leap from imperfect performance to a lack of reasoning capability is not necessarily justified.

    There are at least two potential explanations for why LLMs, like humans, sometimes get questions wrong:

    1. The model tries to strictly pattern match a problem to something it has seen before, and fails if it can’t.
    2. The model tries to follow a logical program but has a certain (compounding) probability of making an error at each step, as expected based on the fact that it literally samples tokens.

    The paper seems to lean towards explanation (1), but doesn’t make a convincing case for why this should be preferred over explanation (2). In fact, (2) is more akin to human-like reasoning and potentially more interesting from a research perspective.

    Let’s examine each main finding of the paper through this critical lens:

    GSM-Symbolic Performance

    The GSM-Symbolic approach is a valuable method for dataset expansion, validating the potential of synthetic data generation techniques like those used by Gretel. However, it’s worth noting that model performance doesn’t completely fall apart on these new variants — it just gets somewhat worse. If the models were strictly pattern matching, we might expect performance to drop to near zero on these new variants. The observed behavior seems more consistent with a model that can generalize to some degree but makes more errors on unfamiliar problem structures.

    Even human experts are not infallible. On the MATH benchmark, for instance, former math olympians typically scored 18/20 or 19/20, making small arithmetic errors. This suggests that error-prone reasoning, rather than a lack of reasoning capability, might be a more accurate description of both human and LLM performance.

    Varying Difficulty

    The paper’s findings on performance degradation with increasing question complexity are consistent with the idea of compounding errors in a multi-step reasoning process. As the number of steps increases, so does the probability of making an error at some point in the chain. This behavior is observed in human problem-solving as well and doesn’t necessarily indicate a lack of reasoning ability.
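
    A quick back-of-the-envelope calculation shows how fast per-step errors compound: if each reasoning step succeeds independently with probability p, the chance of a fully correct answer is p raised to the number of steps.

    ```python
    # If each step is correct with probability p, an n-step solution is correct
    # with probability p**n, so accuracy should fall as clause count grows.
    for p in (0.99, 0.95, 0.90):
        row = ", ".join(f"{k} steps: {p**k:.3f}" for k in (2, 4, 8))
        print(f"p = {p:.2f} -> {row}")
    ```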

    GSM-NoOp Challenge

    The GSM-NoOp results may not be as directly related to reasoning capability as the paper suggests. In real-world scenarios, we typically assume that all information provided in a problem statement is relevant. For instance, in the example question in Figure 7, a reasonable human might infer (as the LLMs did) that the size of the kiwis was only mentioned because they were discarded.

    Figure 7: GSM-Symbolic: Example GSM No-Op question. (Source: Mirzadeh et al., GSM-Symbolic Paper)

    The ability to discern relevant information from irrelevant information, especially when the irrelevant information is inserted with the intent to be misleading (i.e. seemingly relevant), is a separate skill from pure mathematical reasoning.

    The authors include a follow-up experiment (NoOp-NoOp) in which the models are implicitly “warned” of the misleading intent: they use few-shot examples that also contain irrelevant information. The subset of models evaluated in this experiment still shows a drop in performance. Several follow-up experiments could serve to better understand the phenomenon:

    1. Expand the NoOp-NoOp experiment to more models;
    2. Measure how well models perform when the prompt explicitly warns that some information may be irrelevant (see the sketch after this list);
    3. Fine-tune models on synthetic training examples that include irrelevant information in addition to examples that contain entirely relevant information.
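
    As a concrete illustration of the second follow-up, the sketch below prepends an explicit warning (and optional NoOp-style few-shot examples) to the question before it is sent to a model. The warning text is an assumption, not a prompt from the paper.

    ```python
    # Build a prompt that explicitly warns the model about irrelevant details.
    WARNING = (
        "Some statements in the problem below may be irrelevant to the question. "
        "Identify and ignore any irrelevant details before solving."
    )

    def build_warned_prompt(question, few_shot=()):
        """Assemble a prompt: warning, optional NoOp-style examples, then the question."""
        return "\n\n".join([WARNING, *few_shot, question])

    print(build_warned_prompt(
        "Oliver picks 44 kiwis on Friday and 58 on Saturday. Five of them are small. "
        "How many kiwis does Oliver have?"
    ))
    ```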

    Opportunities for Improvement: The Promise of Synthetic Data

    While the paper by Mirzadeh et al. highlights important limitations in current LLMs, at Gretel we have developed datasets that address many of the challenges identified in the paper:

    1. Synthetic GSM8K Dataset: Available on HuggingFace at gretelai/synthetic-gsm8k-reflection-405b (see the loading snippet after this list), this dataset focuses on generating more complex, multi-step reasoning versions of problems than those in the original human-generated dataset from OpenAI. It incorporates advanced prompting techniques, including Reflection and other cognitive models, to capture detailed reasoning processes. This approach has shown significant improvements, particularly on very hard problems, demonstrating its potential to enhance AI’s ability to handle complex, multi-step reasoning tasks. As covered in our blog, Gretel’s synthetic data created using these techniques achieved a 92.3% win rate on problem complexity and an 82.7% win rate on educational value over the standard Llama 3.1 405B parameter model outputs, as judged by GPT-4o. This demonstrates that LLM reasoning can be further unlocked with more sophisticated training examples and prompting techniques than the basic Chain-of-Thought used in the paper.
    Source: https://gretel.ai/blog/teaching-ai-to-think-a-new-approach-with-synthetic-data-and-reflection

    2. Synthetic Text-to-SQL Dataset: Generated by Gretel to help improve LLMs’ ability to interact with SQL-based databases, warehouses, and lakes, and available at gretelai/synthetic_text_to_sql, this dataset has proven highly effective in improving model performance on Text-to-SQL tasks. When used to fine-tune CodeLlama models, it led to 36%+ improvements on the BIRD benchmark, a challenging cross-domain Text-to-SQL evaluation platform. Further supporting the theory that today’s LLMs are trained on data that is too simple, leading to memorization, a single epoch of fine-tuning the Phi-3 and Llama 3.1 models on this dataset yielded a 300%+ improvement on BIRD benchmark problems labeled as “very hard”.
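
    Both datasets can be pulled straight from Hugging Face with the `datasets` library. The split name below is an assumption; check each dataset card for the exact schema and column names.

    ```python
    # Load the two Gretel datasets referenced above (requires `datasets`).
    from datasets import load_dataset

    gsm = load_dataset("gretelai/synthetic-gsm8k-reflection-405b", split="train")
    sql = load_dataset("gretelai/synthetic_text_to_sql", split="train")

    print(gsm)        # row count and column names
    print(gsm[0])     # one synthetic problem with its reasoning trace
    print(sql[0])     # one natural-language question paired with SQL
    ```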

    These results demonstrate that high-quality synthetic data can be a powerful tool in addressing the limitations of current LLMs in complex reasoning tasks.

    Future Directions

    In conclusion, the GSM-Symbolic paper provides valuable insights into the current limitations of LLMs in mathematical reasoning tasks. However, its conclusions should be approached critically. The observed behavior of LLMs could be interpreted in multiple ways, and it’s possible that the paper’s emphasis on pattern matching over reasoning may be oversimplifying a more complex issue.

    The limitations identified by the study are real and significant. The variability in performance, sensitivity to numerical changes, and struggles with irrelevant information all point to areas where current LLMs can be improved.

    However, as the stronger results of models such as GPT-4o and Llama 3.1 above suggest, by synthesizing diverse, challenging problem sets that push the boundaries of what AI models can tackle, we can develop LLMs that exhibit more robust, human-like reasoning capabilities.

    References

    1. I. Mirzadeh, K. Alizadeh, H. Shahrokhi, O. Tuzel, S. Bengio, and M. Farajtabar. GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models. 2024.



  • Data Minimization Does Not Guarantee Privacy


    Prakhar Ganesh

    Just because it was minimized doesn’t mean it’s secure!

    Based on our paper The Data Minimization Principle in Machine Learning by Prakhar Ganesh, Cuong Tran, Reza Shokri, and Ferdinando Fioretto

    The Data Minimization Principle

    The proliferation of data-driven systems and ML applications escalates a number of privacy risks, including those related to unauthorized access to sensitive information. In response, international data protection frameworks like the European General Data Protection Regulation (GDPR), the California Privacy Rights Act (CPRA), the Brazilian General Data Protection Law (LGPD), etc. have adopted data minimization as a key principle to mitigate these risks.

    Excerpts of the data minimization principle from six different data protection regulations across the world. Image by Author.

    At its core, the data minimization principle requires organizations to collect, process, and retain only personal data that is adequate, relevant, and limited to what is necessary for specified objectives. It is grounded in the expectation that not all data is essential; unnecessary data instead contributes to a heightened risk of information leakage. The data minimization principle builds on two core pillars: purpose limitation and data relevance.

    Purpose Limitation

    Data protection regulations mandate that data be collected for a legitimate, specific and explicit purpose (LGPD, Brazil) and prohibit using the collected data for any other incompatible purpose from the one disclosed (CPRA, USA). Thus, data collectors must define a clear, legal objective before data collection and use the data solely for that objective. In an ML setting, this purpose can be seen as collecting data for training models to achieve optimal performance on a given task.

    Data Relevance

    Regulations like the GDPR require that all collected data be adequate, relevant, and limited to what is necessary for the purposes it was collected for. In other words, data minimization aims to remove data that does not serve the purpose defined above. In ML contexts, this translates to retaining only data that contributes to the performance of the model.

    Data minimization. Image by Author
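
    To make this concrete, here is a minimal sketch of one way to operationalize relevance: train a model, rank input features (pixels) by importance, and zero out the least important half. It uses scikit-learn’s digits data for illustration and is not the optimization procedure from the paper.

    ```python
    # One simple way to operationalize "data relevance": keep only the features
    # (pixels) that contribute most to model performance.
    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
    baseline = model.score(X_te, y_te)

    # Keep the top 50% most important pixels and zero out ("minimize") the rest.
    keep = np.argsort(model.feature_importances_)[-X.shape[1] // 2:]
    mask = np.zeros(X.shape[1], dtype=bool)
    mask[keep] = True

    minimized_acc = (
        RandomForestClassifier(random_state=0)
        .fit(X_tr * mask, y_tr)
        .score(X_te * mask, y_te)
    )
    print(f"all pixels: {baseline:.3f}, 50% of pixels: {minimized_acc:.3f}")
    ```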

    Privacy expectations through minimization

    As you might have already noticed, there is an implicit expectation of privacy through minimization in data protection regulations. The data minimization principle has even been hailed by many in the public discourse (EDPS, Kiteworks, The Record, Skadden, k2view) as a principle to protect privacy.

    The EU AI Act states in Recital 69, “The right to privacy and to protection of personal data must be guaranteed throughout the entire lifecycle of the AI system. In this regard, the principles of data minimisation and data protection by design and by default, as set out in Union data protection law, are applicable when personal data are processed”.

    However, this expectation of privacy from minimization overlooks a crucial aspect of real-world data: the inherent correlations among various features. Information about individuals is rarely isolated, so merely minimizing data may still allow for confident reconstruction. This creates a gap: individuals or organizations that adopt current operationalizations of data minimization may expect improved privacy, despite using a framework that addresses only minimization.

    The Correct Way to Talk about Privacy

    Privacy auditing often involves performing attacks to assess real-world information leakage. These attacks serve as powerful tools to expose potential vulnerabilities; by simulating realistic scenarios, auditors can evaluate the effectiveness of privacy protection mechanisms and identify areas where sensitive information may be revealed.

    Some adversarial attacks that might be relevant in this situation include reconstruction and re-identification attacks. Reconstruction attacks aim to recover missing information from a target dataset. Re-identification attacks aim to re-identify individuals using partial or anonymized data.

    The Overall Framework. Image by Author

    The gap between Data Minimization and Privacy

    Consider the example of minimizing data from an image by removing pixels that do not contribute to the performance of the model. Solving that optimization gives you minimized data that looks something like this.

    Image by Author

    The trends in this example are interesting. As you’ll notice, the central vertical line is preserved in the image of the digit ‘1’, while the outer curves are retained for ‘0’. In other words, although 50% of the pixels are removed, it doesn’t seem like any information is lost. One can even show that this is the case by applying a very simple reconstruction attack using data imputation.

    Image by Author

    Despite minimizing the dataset by 50%, the images can still be reconstructed using overall statistics. This provides a strong indication of privacy risks and suggests that a minimized dataset does not equate to enhanced privacy!
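
    The corresponding attack is equally simple to sketch: treat the removed pixels as missing values and impute them from the statistics of similar reference data. Below is a minimal, self-contained version using per-pixel means from a held-out half of the digits data; the random mask and variable names are illustrative, not the setup from the paper.

    ```python
    # A simple reconstruction attack via imputation: fill "minimized" pixels
    # with per-pixel means computed from a reference dataset.
    import numpy as np
    from sklearn.datasets import load_digits

    X, _ = load_digits(return_X_y=True)
    X = X.astype(float)

    # "Reference" half: data with similar statistics available to the attacker.
    # "Target" half: the minimized release the attacker tries to reconstruct.
    X_ref, X_target = X[: len(X) // 2], X[len(X) // 2:]

    rng = np.random.default_rng(0)
    dropped = rng.choice(X.shape[1], size=X.shape[1] // 2, replace=False)

    X_released = X_target.copy()
    X_released[:, dropped] = np.nan      # these pixels were "minimized" away

    # Attack: fill the gaps with the reference data's per-pixel means.
    col_means = X_ref.mean(axis=0)
    X_rec = np.where(np.isnan(X_released), col_means, X_released)

    mae = np.abs(X_rec[:, dropped] - X_target[:, dropped]).mean()
    print(f"mean absolute error on reconstructed pixels (0-16 scale): {mae:.2f}")
    ```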

    So What Can We Do?

    While data protection regulations aim to limit data collection with an expectation of privacy, current operationalizations of minimization fall short of providing robust privacy safeguards. Notice, however, that this is not to say that minimization is incompatible with privacy; rather, the emphasis is on the need for approaches that incorporate privacy into their objectives, instead of treating it as an afterthought.

    We provide a deeper empirical exploration of data minimization and its misalignment with privacy, along with potential solutions, in our paper. We seek to answer a critical question: “Do current data minimization requirements in various regulations genuinely meet privacy expectations in legal frameworks?” Our evaluations reveal that the answer is, unfortunately, no.

