Tag: AI

  • LLMs Pitfalls

    Pier Paolo Ippolito

    An introduction to some of the key components around LLMs needed to build production-grade applications

    Originally appeared here:
    LLMs Pitfalls


  • 5 Things to do When Evaluating ELT/ETL Tools

    Eva Revear

    A list to make evaluating ELT/ETL tools a bit less daunting

    Photo by Volodymyr Hryshchenko on Unsplash

    We’ve all been there: you’ve attended (many!) meetings with sales reps from all of the SaaS data integration tooling companies and are granted 14-day access to try their wares. Now you have to decide what sorts of things to test in order to figure out definitively if the tool is the right commitment for you and the team.

    I wanted to throw together some notes on key evaluation questions, as well as a few ways to check functionality, as I’m confident that this is a process that I will encounter again and again, and I like to have a template for these types of things.

    These notes were primarily put together with cloud-based integration platforms in mind, such as (but not limited to) Fivetran, Airbyte, and Rivery, though they could apply to other cases as well!

    If you have a favorite way to test out new data tools, add them to the comments!

    1. Create a rubric

    You can find a million articles on evaluation criteria for data integration tooling (I really like this one!), but ultimately it comes down to your data platform and the problems within it that you are trying to solve.

    Gather the team together and determine what these things are. There are, of course, obvious features like required source and destination connectors that can be deal breakers, but maybe you’re also looking for a metadata solution that provides lineage, or trying to increase monitoring, or needing to scale something that was built in-house and is no longer holding its own.

    When you lay all of that out it also makes it easier to divide up the work of making these evaluations across team members to run in parallel.
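
    If it helps to make the rubric concrete, it can live in something as simple as a weighted scoring sheet. Below is a minimal sketch in Python; the criteria, weights, scores, and tool names are placeholders, not recommendations.

    import pandas as pd

    # Hypothetical criteria and weights -- replace with whatever your team lands on.
    criteria = {
        "required connectors": 0.30,
        "lineage / metadata": 0.15,
        "monitoring & alerting": 0.20,
        "scalability": 0.20,
        "pricing fit": 0.15,
    }

    # Scores (1-5) that each evaluator assigns per tool during the trial.
    scores = pd.DataFrame(
        {"Tool A": [5, 3, 4, 4, 2], "Tool B": [4, 4, 3, 5, 4]},
        index=list(criteria),
    )

    weights = pd.Series(criteria)
    weighted_totals = scores.mul(weights, axis=0).sum()
    print(weighted_totals.sort_values(ascending=False))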

    2. Start a simple pipeline running right away

    Pick something pretty simple and get it up and running on day one. This will help create an overall picture of logging, metadata, latency, CDC, and all the other things that come with a pipeline.

    If you are lucky, you might even run into a platform error over the course of the 14 days and see how that is handled from the tooling company’s side. If you are dealing with an open-source option, it can also help you understand if you are equipped to manage such issues in-house.

    Key questions

    • Do the documentation and UI guide you through setting up permissions and keys, scheduling, schema setup, etc. in a way that’s intuitive, or do you have to reach out to the technical rep for help?
    • If platform errors do occur, are they obvious via logs, or is it hard to tell whether you or the platform is the problem?
    • How quickly are customers notified, and issues resolved when the platform goes down?

    3. Create a few end to end transforms

    Some tools come with built-in dbt integrations; some allow for fully custom Python-based transformations. Translating a few transforms, maybe even a somewhat complex one, end to end from your existing solution can give you a good idea of how heavy a lift it will be to move everything over, if it is possible at all.

    Key Questions

    • Can you land the data in the same format that it is landing in now, or will it change in ways that majorly impact downstream dependencies?
    • Are there types of transformations that you do prior to landing data that can’t be done in the tool (joining in supplemental data sources, parsing messy multi-level JSON, etc.) that will now have to be done in the database post-landing?

    4. Throw a non-native data source at it

    Try to process something from a non-natively supported source or format (dummy up some fixed-width files, or maybe pick an in-house tool that exports data in an unconventional way), or at least talk through how you could with your technical sales representative. Even if that’s not an issue right now, it is worthwhile to at least understand what the options are for putting that functionality into place should something come up.
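
    If you want a quick way to dummy up a fixed-width file for this test, a few lines of Python are enough. This is only a sketch; the column layout and file name are made up purely for illustration.

    import pandas as pd

    # Write a tiny fixed-width file: 10-char name, 8-char date, 6-char amount.
    rows = [
        ("alice     ", "20240101", "  9.99"),
        ("bob       ", "20240102", " 12.50"),
    ]
    with open("dummy_fwf.txt", "w") as f:
        for name, date, amount in rows:
            f.write(f"{name}{date}{amount}\n")

    # Read it back with explicit column boundaries -- the same layout the tool
    # (or your custom connector) would need to be told about.
    df = pd.read_fwf(
        "dummy_fwf.txt",
        colspecs=[(0, 10), (10, 18), (18, 24)],
        names=["name", "date", "amount"],
    )
    print(df)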

    Key Questions

    • When an unsupported source comes up, will you have enough flexibility from the tool to build a solution within its framework?
    • When you start adding custom functionality to the framework, do the same logging, error handling, state management, etc. apply?

    5. Force an error

    Somewhere along one of the test pipelines that you’ve built, throw in a badly formatted file, add bad code into a transform, change the schema, or wreak havoc in some other creative way to see what happens.

    Third-party tools like these can be black boxes in some respects, and nothing is more frustrating when a pipeline goes down than incomprehensible error messages.

    Key questions

    • Do error messages and logs make it clear what went wrong and where?
    • What happens to the data that was in the pipeline once you put a fix in place? Does anything get lost, or loaded more times than it should have?
    • Are there options to redirect bad data and allow the rest of the pipeline to keep going?

    A couple of bonuses

    Have a non-technical user ingest a Google sheet

    Needing to integrate data from a manually uploaded spreadsheet is a somewhat more common use case than DEs often like to think it is. A tool should make this easy for the producing business team to do without the DEs getting involved at all.

    Read through the Reddit threads on the tool

    I have found Reddit to be very useful when looking at tooling options. Folks are typically very reasonable in their assessment of positive and negative experiences with a tool, and open to answering questions. At the end of the day even a thorough trial phase will miss things, and this can be an easy way to see if you have some blind spots.


    Originally appeared here:
    5 Things to do When Evaluating ELT/ETL Tools


  • Unlocking Growth: 3 Years at Meta — Transformative Lessons for Work and Life

    Mandy Liu

    Generated using Microsoft Image Creator by Author

    Working at Meta was fast-paced, challenging, and intellectually stimulating. For three transformative years, I worked as a Product Data Scientist in Integrity and account security at the bustling London office. Now, as I look back, I find myself continually drawn to the invaluable lessons gleaned during my time there. Beyond the confines of the office, these insights have seamlessly integrated into both my professional and personal growth.

    Lesson 1: Always start with why

    Meta champions meaningful contributions over mere activity. Twice a year, employees undergo rigorous assessments based on how much tangible impact they’ve made in the last 6 months. The focus isn’t solely on tasks completed or volumes achieved; rather, it’s about the consequential outcomes stemming from one’s actions.

    It means that it doesn’t matter what you’ve done or how much you’ve done; what matters is what happens because of what you’ve done.

    As a Data Scientist (DS), it’s not about the hours logged or the volume of insights generated; what truly counts is the transformative impact of your discoveries.

    Consider this: by uncovering a segment of users experiencing unusually short session times, you exposed a critical engineering bug affecting a specific device type. As a result, your findings catalyzed the Engineering team to rectify the issue and correctly measure session time, leading to a staggering increase of 20 million more minutes in session time measured per day. That’s the tangible ‘so what’ impact that defines your contribution. (Note that this is a completely made-up example.)

    In this “impact” culture, I found myself continually pondering the potential outcome of my projects, delving into the ‘why’ behind my actions. Yet, grappling with the clarity of this ‘why’ wasn’t always a straightforward journey; uncertainty often clouded the path. But one thing was clear: you should determine your direction of travel and not mistake motion for progress.

    Starting with “why” works remarkably well for stakeholder management, too. Often, people begin by detailing their process, only circling back to their purpose later, if at all. By probing their “why”, you gain insight into their priorities and motivations, facilitating more meaningful and effective interactions.

    For example, when someone comes to you with a request, initiate by probing: “What problems are you trying to solve and when do you want it delivered?” This can go two ways. They might stumble a bit, struggling to articulate their goals. That’s ok. Ask questions to help them clarify, like “I understand that budget is a constraint, but what would you want the campaign to achieve in the first place?” If they need time to comb things through, politely decline their requests and send them back to their thinking chair — many times they won’t come back because they’ll realize it’s not urgent or important.

    But if they come back with something like, “We wanna figure out which UI boosts user traffic through A/B testing and launch the new UI by Thanksgiving”, then you smile, nod and dive into measuring that impact!

    Lesson 2: Measure everything

    Generated using Microsoft Image Creator by Author

    You can’t improve what you don’t measure.

    — Peter Drucker

    At Meta, numbers are the name of the game. If it’s not quantified, it might as well not exist. Product teams zero in on a key metric they’re aiming to boost or shrink — whether it’s revenue, active users, or incoming tickets. And the team’s triumph hinges on their ability to nudge that metric in the right direction. By rallying around one primary metric, everyone’s on the same page.

    I’ve seen projects get canceled after a few months because the team couldn’t quantify goals that tied to the broader company goals.

    In Lesson 1, we covered how starting with “why” can crack open insights into others’ work and requests. Here’s another golden question to follow up on: “How do you measure success?”

    This question often links closely with the “why”. For example, if the aim is to determine which UI drives more user traffic, the measurement can be the number of users who land on the page.

    Moreover, using numbers is a powerful way to navigate requests. As data people, we’re often tasked with “pulling some numbers” or “visualizing x in a dashboard.” I’ve found it helpful to challenge stakeholders to articulate the decisions they’d make based on those numbers. I’ll throw out scenarios like, “Imagine we have this number, say it’s 10%, 50%, or 80%. How would your decisions change accordingly?” It’s surprising how often stakeholders pause or dance around without a clear answer. 7 out of 10 times, they’ll say, “Let me think about it some more,” and then… crickets.

    By doing that, you’ve effectively sifted out requests that fall into the “nice-to-have” category rather than the “must-have” ones.

    Lesson 3: Prioritize, ruthlessly

    While Meta’s impact culture attracts criticism around creating a high-pressure environment and focusing on short-term results, it’s been a goldmine for learning about prioritization. Every time I kick off a project or find myself knee-deep in it, I loop back to these tough questions:

    • What outcomes are we aiming for with this analysis?
    • Do these outcomes align with our stakeholders’ goals or the bigger picture for our team, department, or the company? Think active users, revenue, retention.
    • How do we gauge the impact, and how significant is it? (Remember Lesson 2!)
    • And lastly, who needs to take action based on what we uncover?

    Once you’ve dissected your projects like this, a handy step is to plot them on an effort vs. impact matrix and find that sweet spot for prioritization. Aim to minimize Money Pit ventures while maximizing Easy Wins and Incremental Gains in the short term. Save room for those Big Bets that promise long-term payoffs.

    Image created by Author using Google Drawing
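
    If you would rather generate the matrix than draw it by hand, a quick scatter plot does the job. This is only a sketch; the project names and effort/impact scores are hypothetical.

    import matplotlib.pyplot as plt

    # Hypothetical projects scored 1-10 on effort and impact.
    projects = {
        "Fix session-time bug": (2, 8),    # easy win
        "New onboarding funnel": (8, 9),   # big bet
        "Dashboard refresh": (3, 3),       # incremental gain
        "Legacy ETL rewrite": (9, 3),      # money pit
    }

    fig, ax = plt.subplots(figsize=(6, 6))
    for name, (effort, impact) in projects.items():
        ax.scatter(effort, impact)
        ax.annotate(name, (effort, impact), textcoords="offset points", xytext=(5, 5))

    # Quadrant lines at the midpoint of the 1-10 scale.
    ax.axvline(5.5, color="grey", linestyle="--")
    ax.axhline(5.5, color="grey", linestyle="--")
    ax.set(xlabel="Effort", ylabel="Impact", xlim=(0, 10.5), ylim=(0, 10.5),
           title="Effort vs. impact matrix")
    plt.show()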

    When new projects come into your pipeline, evaluate the effort and impact and compare them against your existing projects. If a newcomer looks more promising, don’t hesitate to shuffle things around — sometimes that means dropping an old project to make room for the shiny new one.

    As company priorities often shift, it is ok to de-prioritize or re-prioritize projects midway. But here’s the trick: keep that constant evaluation going strong. Break down each project with those trusty questions, so you can keep your eyes locked on the prize — the most impactful stuff.

    Takeaways

    Lessons

    1. Always start with why
    2. Measure everything
    3. Prioritize, ruthlessly

    Questions for requests

    1. Why — What problems are you trying to solve?
    2. When — When do you expect it to be delivered?
    3. How — How do you measure success?
    4. What — Let’s say we have this number and it is 10%, 50% or 80%. How would your decisions change accordingly?

    Questions for prioritization

    1. What — What outcomes are we aiming for with this project?
    2. Why — Do these outcomes align with our stakeholders’ goals or the bigger picture for our team, department, or the company?
    3. How — How do we gauge the impact, and just how significant is it?
    4. Who — Who needs to take action based on what we uncover?

    Disclaimer: This article reflects my experience and opinions only. It does not represent the views or opinions of any company or organization.


    Originally appeared here:
    Unlocking Growth: 3 Years at Meta — Transformative Lessons for Work and Life


  • How to Use and Test WizardLM2: Microsoft’s New LLM

    Eivind Kjosbakken

    Learn how to run and test Microsoft’s new LLM, WizardLM2, and use it to perform tasks like question-answering and information extraction

    Originally appeared here:
    How to Use and Test WizardLM2: Microsoft’s New LLM


  • Getting Started with the Dev Containers Extension

    Rami Krispin

    This is a step-by-step tutorial for getting started with the Dev Containers extension. This tutorial is the first of a sequence of…

    Originally appeared here:
    Getting Started with the Dev Containers Extension


  • How does temperature impact next token prediction in LLMs?

    Ankur Manikandan

    TLDR
    1. At a temperature of 1, the probability values are the same as those derived from the standard softmax function.
    2. Raising the temperature inflates the probabilities of the less likely tokens, thereby broadening the range of potential candidates (or diversity) for the model’s next token prediction.
    3. Lowering the temperature, on the other hand, makes the probability of the most likely token approach 1.0, boosting the model’s confidence. Decreasing the temperature effectively eliminates the uncertainty within the model.

    Google Colab notebook.

    Introduction
    Large Language Models (LLMs) are versatile generative models suited for a wide array of tasks. They can produce consistent, repeatable outputs or generate creative content by placing unlikely words together. The “temperature” setting allows users to fine-tune the model’s output, controlling the degree of predictability.

    Let’s take a hypothetical example to understand the impact of temperature on the next token prediction.

    We asked an LLM to complete the sentence, “This is a wonderful _____.” Let’s assume the potential candidate tokens are:

    |   token    | logit |
    |------------|-------|
    | day        |  5.0  |
    | space      |  2.2  |
    | furniture  |  2.0  |
    | experience |  4.5  |
    | problem    |  3.0  |
    | challenge  |  2.7  |

    The logits are passed through a softmax function so that the sum of the values is equal to one. Essentially, the softmax function generates probability estimates for each token.

    Standard softmax function:

    softmax(x_i) = exp(x_i) / Σ_{j=1}^{n} exp(x_j)

    Let’s calculate the probability estimates in Python.

    import numpy as np
    import seaborn as sns
    import pandas as pd
    import matplotlib.pyplot as plt
    from ipywidgets import interactive, FloatSlider


    def softmax(logits):
        exps = np.exp(logits)
        return exps / np.sum(exps)


    data = {
        "tokens": ["day", "space", "furniture", "experience", "problem", "challenge"],
        "logits": [5, 2.2, 2.0, 4.5, 3.0, 2.7]
    }
    df = pd.DataFrame(data)
    df['probabilities'] = softmax(df['logits'].values)
    df

    | No. |   tokens   | logits | probabilities |
    |-----|------------|--------|---------------|
    |  0  | day        |  5.0   |   0.512106    |
    |  1  | space      |  2.2   |   0.031141    |
    |  2  | furniture  |  2.0   |   0.025496    |
    |  3  | experience |  4.5   |   0.310608    |
    |  4  | problem    |  3.0   |   0.069306    |
    |  5  | challenge  |  2.7   |   0.051343    |

    ax = sns.barplot(x="tokens", y="probabilities", data=df)
    ax.set_title('Softmax Probability Estimates')
    ax.set_ylabel('Probability')
    ax.set_xlabel('Tokens')
    plt.xticks(rotation=45)
    for bar in ax.patches:
        ax.text(bar.get_x() + bar.get_width() / 2, bar.get_height(), f'{bar.get_height():.2f}',
                ha='center', va='bottom', fontsize=10, rotation=0)
    plt.show()

    The softmax function with temperature is defined as follows:

    softmax_T(x_i) = exp(x_i / T) / Σ_{j=1}^{n} exp(x_j / T)

    where T is the temperature, x_i is the i-th component of the input vector (logits), and n is the number of components in the vector.

    def softmax_with_temperature(logits, temperature):
        if temperature <= 0:
            temperature = 1e-10  # Prevent division by zero or negative temperatures
        scaled_logits = logits / temperature
        exps = np.exp(scaled_logits - np.max(scaled_logits))  # Numerical stability improvement
        return exps / np.sum(exps)


    def plot_interactive_softmax(temperature):
        probabilities = softmax_with_temperature(df['logits'], temperature)
        plt.figure(figsize=(10, 5))
        bars = plt.bar(df['tokens'], probabilities, color='blue')
        plt.ylim(0, 1)
        plt.title(f'Softmax Probabilities at Temperature = {temperature:.2f}')
        plt.ylabel('Probability')
        plt.xlabel('Tokens')
        # Add text annotations
        for bar, probability in zip(bars, probabilities):
            yval = bar.get_height()
            plt.text(bar.get_x() + bar.get_width()/2, yval, f"{probability:.2f}", ha='center', va='bottom', fontsize=10)
        plt.show()


    interactive_plot = interactive(plot_interactive_softmax, temperature=FloatSlider(value=1, min=0, max=2, step=0.01, description='Temperature'))
    interactive_plot

    At T = 1:

    At a temperature of 1, the probability values are the same as those derived from the standard softmax function.

    At T > 1:

    Raising the temperature inflates the probabilities of the less likely tokens, thereby broadening the range of potential candidates (or diversity) for the model’s next token prediction.

    At T < 1:

    Lowering the temperature, on the other hand, makes the probability of the most likely token approach 1.0, boosting the model’s confidence. Decreasing the temperature effectively eliminates the uncertainty within the model.
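
    To see all three regimes side by side, one option is to reuse the softmax_with_temperature function defined above and print the distribution at a few temperatures (a minimal sketch using the same example DataFrame):

    for t in [0.5, 1.0, 2.0]:
        probs = softmax_with_temperature(df['logits'].values, t)
        print(f"T = {t}")
        for token, p in zip(df['tokens'], probs):
            print(f"  {token:<10} {p:.3f}")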

    Conclusion

    LLMs leverage the temperature parameter to offer flexibility in their predictions. The model behaves predictably at a temperature of 1, closely following the original softmax distribution. Increasing the temperature introduces greater diversity, amplifying less likely tokens. Conversely, decreasing the temperature makes the predictions more focused, increasing the model’s confidence in the most probable token by reducing uncertainty. This adaptability allows users to tailor LLM outputs to a wide array of tasks, striking a balance between creative exploration and deterministic output.

    Unless otherwise noted, all images are by the author.


    Originally appeared here:
    How does temperature impact next token prediction in LLMs?
