An introduction to some of the key components surrounding LLMs to produce production-grade applications
Originally appeared here:
LLMs Pitfalls
We’ve all been there: you’ve attended (many!) meetings with sales reps from all of the SaaS data integration tooling companies and have been granted 14-day access to try their wares. Now you have to decide what to test in order to figure out definitively whether the tool is the right commitment for you and the team.
I wanted to throw together some notes on key evaluation questions, as well as a few ways to check functionality, as I’m confident that this is a process that I will encounter again and again, and I like to have a template for these types of things.
These notes were collected primarily with cloud-based integration platforms in mind, such as (but not limited to) Fivetran, Airbyte, and Rivery, but they could apply to other cases as well!
If you have a favorite way to test out new data tools, add them to the comments!
You can find a million articles on evaluation criteria for data integration tooling (I really like this one!), but ultimately it comes down to your data platform and the problems within it that you are trying to solve.
Gather the team together and determine what these things are. There are, of course, obvious features like required source and destination connectors that can be deal breakers, but maybe you’re also looking for a metadata solution that provides lineage, or trying to increase monitoring, or needing to scale something that was built in-house and is no longer holding its own.
When you lay all of that out it also makes it easier to divide up the work of making these evaluations across team members to run in parallel.
Pick something pretty simple and get it up and running on day one. This will help create an overall picture of logging, metadata, latency, change data capture (CDC), and all the other things that come with a pipeline.
If you are lucky, you might even run into a platform error over the course of the 14 days and see how that is handled from the tooling company’s side. If you are dealing with an open source option, it can also help you understand whether you are equipped to manage such issues in house.
Key questions
Some tools come with built-in dbt integrations, while others allow for fully custom Python-based transformations. Translating a few transforms end to end from your existing solution, maybe even a somewhat complex one, can give you a good idea of how heavy a lift it will be to move everything over, if it is possible at all.
Key questions
Try to process something from a non-natively supported source or format (dummy up some fixed-width files, or maybe pick an in-house tool that exports data in an unconventional way), or at least talk through how you could with your technical sales representative. Even if that’s not an issue right now, it is worthwhile to at least understand what the options are for putting that functionality into place if something does come up.
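If you want to go the fixed-width route, for instance, a few lines of Python are enough to dummy up a test file; the field names and column widths below are invented purely for illustration, not taken from any particular tool.

# Write a small dummy fixed-width file to test handling of a non-native format.
# Field names and column widths are arbitrary, made up for this trial run.
rows = [
    ("1001", "2024-04-01", "42.50"),
    ("1002", "2024-04-02", "13.75"),
]
with open("dummy_fixed_width.txt", "w") as f:
    for order_id, order_date, amount in rows:
        f.write(f"{order_id:<10}{order_date:<12}{amount:>8}\n")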
Key questions
Somewhere along one of the test pipelines that you’ve built, throw in a badly formatted file, add bad code into a transform, change the schema, or wreak havoc in some other creative way to see what happens.
Third-party tools like these can be black boxes in some respects, and when a pipeline goes down, nothing is more frustrating than incomprehensible error messages.
Key questions
Have a non-technical user ingest a Google Sheet
Needing to integrate data from a manually uploaded spreadsheet is a somewhat more common use case than DEs often like to think it is. A tool should make this easy for the producing business team to do without the DEs getting involved at all.
Read through the Reddit threads on the tool
I have found Reddit to be very useful when looking at tooling options. Folks are typically very reasonable in their assessment of positive and negative experiences with a tool, and open to answering questions. At the end of the day, even a thorough trial phase will miss things, and this can be an easy way to see if you have some blind spots.
Originally appeared here:
5 Things to do When Evaluating ELT/ETL Tools
Working at Meta was fast-paced, challenging, and intellectually stimulating. For three transformative years, I worked as a Product Data Scientist in Integrity and account security at the bustling London office. Now, as I look back, I find myself continually drawn to the invaluable lessons gleaned during my time there. Beyond the confines of the office, these insights have seamlessly integrated into both my professional and personal growth.
Meta champions meaningful contributions over mere activity. Twice a year, employees undergo rigorous assessments based on how much tangible impact they’ve made in the last 6 months. The focus isn’t solely on tasks completed or volumes achieved; rather, it’s about the consequential outcomes stemming from one’s actions.
It means that it doesn’t matter what you’ve done or how much you’ve done; what matters is what happens because of what you’ve done.
As a Data Scientist (DS), it’s not about the hours logged or the volume of insights generated; what truly counts is the transformative impact of your discoveries.
Consider this: by uncovering a segment of users experiencing unusually short session times, you exposed a critical engineering bug affecting a specific device type. As a result, your findings catalyzed the Engineering team to rectify the issue and correctly measure session time, leading to a staggering increase of 20 million more minutes in session time measured per day. That’s the tangible ‘so what’ impact that defines your contribution. (Note that this is a completely made-up example.)
In this “impact” culture, I found myself continually pondering the potential outcome of my projects, delving into the ‘why’ behind my actions. Yet, grappling with the clarity of this ‘why’ wasn’t always a straightforward journey; uncertainty often clouded the path. But one thing was clear: you should determine your direction of travel and not mistake motion for progress.
Starting with “why” works remarkably well for stakeholder management, too. Often, people begin by detailing their process, only circling back to their purpose later, if at all. By probing their “why”, you gain insight into their priorities and motivations, facilitating more meaningful and effective interactions.
For example, when someone comes to you with a request, initiate by probing: “What problems are you trying to solve, and when do you want it delivered?” This can go two ways. They might stumble a bit, struggling to articulate their goals. That’s ok. Ask questions to help them clarify, like “I understand that budget is a constraint, but what would you want the campaign to achieve in the first place?” If they need time to comb through things, politely decline their request and send them back to their thinking chair — many times they won’t come back, because they’ll realize it’s not urgent or important.
But if they come back with something like, “We wanna figure out which UI boosts user traffic through A/B testing and launch the new UI by Thanksgiving”, then you smile, nod and dive into measuring that impact!
You can’t improve what you don’t measure.
— Peter Drucker
At Meta, numbers are the name of the game. If it’s not quantified, it might as well not exist. Product teams zero in on a key metric they’re aiming to boost or shrink — whether it’s revenue, active users, or incoming tickets. And the team’s triumph hinges on their ability to nudge that metric in the right direction. By rallying around one primary metric, everyone’s on the same page.
I’ve seen projects get canceled after a few months because the team couldn’t quantify goals that tied to the broader company goals.
In Part 1, we covered how starting with “why” can crack open insights into others’ work and requests. Here’s another golden question to follow up on: “How do you measure success?”
This question often links closely with the “why”. For example, if the aim is to determine which UI drives more user traffic, the measurement can be the number of users who land on the page.
Moreover, using numbers is a powerful way to navigate requests. As data people, we’re often tasked with “pulling some numbers” or “visualizing x in a dashboard.” I’ve found it helpful to challenge stakeholders to articulate the decisions they’d make based on those numbers. I’ll throw out scenarios like, “Imagine we have this number, say it’s 10%, 50%, or 80%. How would your decisions change accordingly?” It’s surprising how often stakeholders pause or dance around without a clear answer. 7 out of 10 times, they’ll say, “Let me think about it some more,” and then… crickets.
By doing that, you’ve effectively sifted out requests that fall into the “nice-to-have” category rather than the “must-have” ones.
While Meta’s impact culture attracts criticism for creating a high-pressure environment and focusing on short-term results, it’s been a goldmine for learning about prioritization. Every time I kick off a project or find myself knee-deep in it, I loop back to the same tough questions about the impact I expect and the effort it will take.
Once you’ve dissected your projects like this, a handy step is to plot them on an effort vs. impact matrix and find that sweet spot for prioritization. Aim to minimize Money Pit ventures while maximizing Easy Wins and Incremental Gains in the short term. Save room for those Big Bets that promise long-term payoffs.
When new projects come into your pipeline, evaluate their effort and impact and compare them against your existing projects. If a newcomer looks more promising, don’t hesitate to shuffle things around — sometimes that means dropping an old project to make room for the shiny new one.
As company priorities often shift, it is ok to de-prioritize or re-prioritize projects midway. But here’s the trick: keep that constant evaluation going strong. Break down each project with those trusty questions, so you can keep your eyes locked on the prize — the most impactful stuff.
Disclaimer: This article reflects my experience and opinions only. It does not represent the views or opinions of any company or organization.
Originally appeared here:
Unlocking Growth: 3 Years at Meta — Transformative Lessons for Work and Life
How I used a custom training dataset and information retrieval for global storytelling. 好样的! Bravo! वाह! ¡Guau! 브라보!
Originally appeared here:
FanFabler: Fine-Tuning Llama 3 to Be a Multilingual Fanfic Writing Assistant
Learn how to run and test Microsoft’s new LLM, WizardLM2, and use it to perform tasks like question-answering and information extraction
Originally appeared here:
How to Use and Test WizardLM2: Microsoft’s New LLM
This is a step-by-step tutorial for getting started with the Dev Containers extension. This tutorial is the first of a sequence of…
Originally appeared here:
Getting Started with the Dev Containers Extension
A Practical Guide to Building and Evaluating Protein Language Models
Originally appeared here:
Building Transformer Models for Proteins From Scratch
TL;DR
1. At a temperature of 1, the probability values are the same as those derived from the standard softmax function.
2. Raising the temperature inflates the probabilities of the less likely tokens, thereby broadening the range of potential candidates (or diversity) for the model’s next token prediction.
3. Lowering the temperature, on the other hand, pushes the probability of the most likely token toward 1.0, boosting the model’s confidence. As the temperature approaches zero, the uncertainty in the prediction is effectively eliminated.
Introduction
Large Language Models (LLMs) are versatile generative models suited for a wide array of tasks. They can produce consistent, repeatable outputs or generate creative content by placing unlikely words together. The “temperature” setting allows users to fine-tune the model’s output, controlling the degree of predictability.
Let’s take a hypothetical example to understand the impact of temperature on the next token prediction.
We asked an LLM to complete the sentence, “This is a wonderful _____.” Let’s assume the potential candidate tokens are:
| token      | logit |
|------------|-------|
| day        | 5.0   |
| space      | 2.2   |
| furniture  | 2.0   |
| experience | 4.5   |
| problem    | 3.0   |
| challenge  | 2.7   |
The logits are passed through a softmax function so that the sum of the values is equal to one. Essentially, the softmax function generates probability estimates for each token.
Let’s calculate the probability estimates in Python.
import numpy as np
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
from ipywidgets import interactive, FloatSlider

def softmax(logits):
    exps = np.exp(logits)
    return exps / np.sum(exps)

data = {
    "tokens": ["day", "space", "furniture", "experience", "problem", "challenge"],
    "logits": [5, 2.2, 2.0, 4.5, 3.0, 2.7]
}
df = pd.DataFrame(data)
df['probabilities'] = softmax(df['logits'].values)
df
| No. | tokens | logits | probabilities |
|-----|------------|--------|---------------|
| 0 | day | 5.0 | 0.512106 |
| 1 | space | 2.2 | 0.031141 |
| 2 | furniture | 2.0 | 0.025496 |
| 3 | experience | 4.5 | 0.310608 |
| 4 | problem | 3.0 | 0.069306 |
| 5 | challenge | 2.7 | 0.051343 |
ax = sns.barplot(x="tokens", y="probabilities", data=df)
ax.set_title('Softmax Probability Estimates')
ax.set_ylabel('Probability')
ax.set_xlabel('Tokens')
plt.xticks(rotation=45)

for bar in ax.patches:
    ax.text(bar.get_x() + bar.get_width() / 2, bar.get_height(), f'{bar.get_height():.2f}',
            ha='center', va='bottom', fontsize=10, rotation=0)

plt.show()
The softmax function with temperature is defined as follows:

\[
\text{softmax}_T(x_i) = \frac{e^{x_i / T}}{\sum_{j=1}^{n} e^{x_j / T}}
\]

where \(T\) is the temperature, \(x_i\) is the \(i\)-th component of the input vector (the logits), and \(n\) is the number of components in the vector.
def softmax_with_temperature(logits, temperature):
    if temperature <= 0:
        temperature = 1e-10  # Prevent division by zero or negative temperatures
    scaled_logits = logits / temperature
    exps = np.exp(scaled_logits - np.max(scaled_logits))  # Numerical stability improvement
    return exps / np.sum(exps)
def plot_interactive_softmax(temperature):
    probabilities = softmax_with_temperature(df['logits'], temperature)
    plt.figure(figsize=(10, 5))
    bars = plt.bar(df['tokens'], probabilities, color='blue')
    plt.ylim(0, 1)
    plt.title(f'Softmax Probabilities at Temperature = {temperature:.2f}')
    plt.ylabel('Probability')
    plt.xlabel('Tokens')
    # Add text annotations
    for bar, probability in zip(bars, probabilities):
        yval = bar.get_height()
        plt.text(bar.get_x() + bar.get_width()/2, yval, f"{probability:.2f}", ha='center', va='bottom', fontsize=10)
    plt.show()

interactive_plot = interactive(plot_interactive_softmax, temperature=FloatSlider(value=1, min=0, max=2, step=0.01, description='Temperature'))
interactive_plot
At T = 1:
At a temperature of 1, the probability values are the same as those derived from the standard softmax function.
At T > 1:
Raising the temperature inflates the probabilities of the less likely tokens, thereby broadening the range of potential candidates (or diversity) for the model’s next token prediction.
At T < 1:
Lowering the temperature, on the other hand, pushes the probability of the most likely token toward 1.0, boosting the model’s confidence. As the temperature approaches zero, the uncertainty in the prediction is effectively eliminated.
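To see this concretely without the slider, a quick loop over a few temperatures (reusing the df and softmax_with_temperature defined above) prints the resulting distributions. With these logits, the probability of “day” moves from roughly 0.51 at T = 1 to about 0.71 at T = 0.5, and drops to about 0.34 at T = 2.

# Non-interactive spot check of the temperature effect, reusing the
# df and softmax_with_temperature defined earlier in this post.
for temperature in [0.5, 1.0, 2.0]:
    probs = softmax_with_temperature(df['logits'].values, temperature)
    print(f"\nT = {temperature}")
    print(pd.Series(probs, index=df['tokens']).round(3).to_string())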
Conclusion
LLMs leverage the temperature parameter to offer flexibility in their predictions. The model behaves predictably at a temperature of 1, closely following the original softmax distribution. Increasing the temperature introduces greater diversity, amplifying less likely tokens. Conversely, decreasing the temperature makes the predictions more focused, increasing the model’s confidence in the most probable token by reducing uncertainty. This adaptability allows users to tailor LLM outputs to a wide array of tasks, striking a balance between creative exploration and deterministic output.
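In practice, most inference APIs expose this knob directly. As a rough illustration (not part of the original walkthrough), here is a minimal sketch of sampling the same prompt at two temperatures with the Hugging Face transformers library; the model name and generation settings are arbitrary choices made only for this example.

# Minimal sketch: sampling one prompt at a low and a high temperature.
# Assumes the `transformers` package is installed; "gpt2" is just a small,
# convenient model picked for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("This is a wonderful", return_tensors="pt")

# Low temperature: the distribution sharpens around the most likely tokens.
focused = model.generate(**inputs, do_sample=True, temperature=0.2, max_new_tokens=5)

# High temperature: less likely tokens get a real chance, so output varies more.
diverse = model.generate(**inputs, do_sample=True, temperature=1.5, max_new_tokens=5)

print(tokenizer.decode(focused[0], skip_special_tokens=True))
print(tokenizer.decode(diverse[0], skip_special_tokens=True))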
Unless otherwise noted, all images are by the author.
Originally appeared here:
How does temperature impact next token prediction in LLMs?
One small step towards autonomous crater-based navigation
Originally appeared here:
Lunar Crater Detection: Computer Vision in Space