So, you recently discovered Hugging Face and its host of open source models such as BERT, Llama and BART, along with the generative language models from Mistral AI, Facebook, Salesforce and other companies. Now you want to experiment with fine tuning some Large Language Models for your side projects. Things start off great, but then you discover how computationally greedy they are, and you do not have a GPU handy.
Google Colab generously offers you a way to access free computation so you can solve this problem. The downside is that you need to do it all inside a transitory, browser-based environment. To make matters worse, the whole session is time limited, so it seems that no matter what you do, you are going to lose your precious fine-tuned model and all your results when the kernel is eventually shut down and the environment nuked.
Never fear. There is a way around this: make use of Google Drive to save any of your intermediate results or model parameters. This will allow you to continue experimentation at a later stage, or take and use a trained model for inference elsewhere.
To do this you will need a Google account with sufficient Google Drive space for both your training data and your model checkpoints. I will presume you have created a folder called data in Google Drive containing your dataset, and another empty folder called checkpoints.
Inside your Google Colab Notebook you then mount your Drive using the following command:
from google.colab import drive

drive.mount('/content/drive')
You now list the contents of your data and checkpoints directories with the following two commands in a new cell:
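Assuming your data and checkpoints folders sit at the top level of My Drive (which mounts at /content/drive/MyDrive by default), the two listing commands would look something like this:

!ls /content/drive/MyDrive/data
!ls /content/drive/MyDrive/checkpoints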
If these commands work, you now have access to these directories inside your notebook. If they do not work, you might have missed the authorisation step. The drive.mount command above should have spawned a pop-up window that requires you to click through and authorise access. You may have missed the pop-up, or not selected all of the required access rights. Try re-running the cell and checking.
Once you have that access sorted, you can then write your scripts such that models and results are serialised into the Google Drive directories so they persist over sessions. In an ideal world, you would code your training job so that any script that takes too long to run can load partially trained models from the previous session and continue training from that point.
A simple way to achieve this is to write a save function and a load function that your training scripts use. The training process should always check for a partially trained model before initialising a new one. Here is an example save function:
import os
import torch

def save_checkpoint(epoch, model, optimizer, scheduler, loss, model_name, overwrite=True):
    # Bundle the model state and some training metadata into one dictionary.
    checkpoint = {
        'epoch': epoch,
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
        'scheduler_state_dict': scheduler.state_dict(),
        'loss': loss
    }
    direc = get_checkpoint_dir(model_name)
    if overwrite:
        file_path = direc + '/checkpoint.pth'
    else:
        file_path = direc + '/epoch_' + str(epoch) + '_checkpoint.pth'
    if not os.path.isdir(direc):
        try:
            os.mkdir(direc)
        except OSError:
            print("Error: directory does not exist and cannot be created")
            # Fall back to a flat file named after the intended directory.
            file_path = direc + '_epoch_' + str(epoch) + '_checkpoint.pth'
    torch.save(checkpoint, file_path)
    print(f"Checkpoint saved at epoch {epoch}")
In this instance we save the model state along with some metadata (the epoch and the loss) inside a dictionary structure. We include an option to either overwrite a single checkpoint file or create a new file for every epoch. We are using the torch.save function, but in principle you could use other serialisation methods. The key idea is that your program can open the checkpoint file and determine how many epochs of training have already been completed, and so decide whether to continue training or move on.
Similarly, in the load function we pass in a reference to the model we wish to use. If there is already a serialised model, we load its parameters into our model and return the number of epochs it was trained for. That epoch value determines how many additional epochs of training are required. If there is no checkpoint, we get the default value of zero epochs and know the model still has the parameters it was initialised with.
def load_checkpoint(model_name, model, optimizer, scheduler):
    direc = get_checkpoint_dir(model_name)
    if os.path.exists(direc):
        # Restore the most recent checkpoint into the supplied model, optimizer and scheduler.
        file_path = get_path_with_max_epochs(direc)
        checkpoint = torch.load(file_path, map_location=torch.device('cpu'))
        model.load_state_dict(checkpoint['model_state_dict'])
        optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
        scheduler.load_state_dict(checkpoint['scheduler_state_dict'])
        epoch = checkpoint['epoch']
        loss = checkpoint['loss']
        print(f"Checkpoint loaded from epoch {epoch}")
        return epoch, loss
    else:
        print("No checkpoint found, starting from epoch 1.")
        return 0, None
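Both functions above lean on two small helpers, get_checkpoint_dir and get_path_with_max_epochs. A minimal sketch of how they could be implemented is below; it assumes the checkpoints live under the mounted Drive checkpoints folder and that per-epoch files follow the epoch_<n>_checkpoint.pth naming used in save_checkpoint.

import os
import re

# Assumed location of the checkpoints folder created in Google Drive earlier.
CHECKPOINT_ROOT = '/content/drive/MyDrive/checkpoints'

def get_checkpoint_dir(model_name):
    # One sub-directory per model/experiment name.
    return os.path.join(CHECKPOINT_ROOT, model_name)

def get_path_with_max_epochs(direc):
    # Use the single rolling checkpoint if it exists, otherwise pick the
    # per-epoch file with the highest epoch number in its name.
    rolling = os.path.join(direc, 'checkpoint.pth')
    if os.path.isfile(rolling):
        return rolling
    epochs_and_names = []
    for name in os.listdir(direc):
        match = re.match(r'epoch_(\d+)_checkpoint\.pth$', name)
        if match:
            epochs_and_names.append((int(match.group(1)), name))
    latest_name = max(epochs_and_names)[1]
    return os.path.join(direc, latest_name)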
The save_checkpoint and load_checkpoint functions need to be called inside your training loop, and you must make sure the epoch value returned by load_checkpoint is used as the starting point for your training iterations. The result is a training process that can be restarted when a kernel dies, picking up and continuing from where it left off.
That core training loop might look something like the following:
EPOCHS = 10

for exp in experiments:
    model, optimizer, scheduler = initialise_model_components(exp)
    train_loader, val_loader = generate_data_loaders(exp)
    start_epoch, prev_loss = load_checkpoint(exp, model, optimizer, scheduler)
    for epoch in range(start_epoch, EPOCHS):
        print(f'Epoch {epoch + 1}/{EPOCHS}')
        # ALL YOUR TRAINING CODE HERE
        save_checkpoint(epoch + 1, model, optimizer, scheduler, train_loss, exp)
Note: in this example I am experimenting with training several different model setups (held in a list called experiments), potentially with different training datasets. The supporting functions initialise_model_components and generate_data_loaders take care of providing the correct model and data for each experiment.
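For completeness, here is a minimal sketch of the shape those two supporting functions might take. Everything specific in it, including the BERT sequence classification model, the AdamW learning rate, the dummy TensorDatasets and the experiment registry dictionaries, is an assumption for illustration rather than the actual setup; in this sketch the experiments list would simply be list(EXPERIMENT_MODELS).

import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoModelForSequenceClassification, get_linear_schedule_with_warmup

# Hypothetical experiment registry: checkpoint names and datasets are placeholders.
EXPERIMENT_MODELS = {'bert_baseline': 'bert-base-uncased'}
EXPERIMENT_DATASETS = {
    'bert_baseline': (
        TensorDataset(torch.randint(0, 1000, (64, 32)), torch.randint(0, 2, (64,))),  # dummy train set
        TensorDataset(torch.randint(0, 1000, (16, 32)), torch.randint(0, 2, (16,))),  # dummy val set
    )
}

def initialise_model_components(exp):
    # Build a fresh model, optimiser and scheduler for this experiment.
    model = AutoModelForSequenceClassification.from_pretrained(EXPERIMENT_MODELS[exp], num_labels=2)
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps=0, num_training_steps=1000)
    return model, optimizer, scheduler

def generate_data_loaders(exp):
    # Wrap this experiment's train/validation datasets in DataLoaders.
    train_ds, val_ds = EXPERIMENT_DATASETS[exp]
    return DataLoader(train_ds, batch_size=16, shuffle=True), DataLoader(val_ds, batch_size=16)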
The core training loop above lets us reuse the same overall code structure to train and serialise each of these models, ensuring that every model reaches the desired number of epochs of training. If we restart the process, it will iterate through the experiment list again, but it will skip any experiments that have already reached the maximum number of epochs, because range(start_epoch, EPOCHS) is empty once the loaded epoch count equals EPOCHS.
Hopefully you can use this boilerplate code to set up your own process for experimenting with training deep learning language models inside Google Colab. Please comment and let me know what you are building and how you use this code.
Massive thank you to Aditya Pramar for his initial scripts that prompted this piece of work.