Code

FMEval is a library to evaluate Large Language Models (LLMs) and select the best LLM for your use case. The library powers LLM evaluations in SageMaker Studio and Amazon Bedrock.

The library can help evaluate LLMs for the following tasks:

The library is also easy to extend, allowing you to bring your own evaluation algorithms, tasks, data, and LLM-based systems to evaluate.
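As a rough illustration of what a bring-your-own evaluation can look like, the sketch below scores an arbitrary prompt-to-text system with a simple exact-match metric. It is not FMEval's actual API; the `invoke_model` callable and the `exact_match_accuracy` helper are hypothetical stand-ins.

```python
# Minimal sketch of a custom evaluation loop; `invoke_model` stands in for
# whatever LLM-based system you want to evaluate and is purely hypothetical.
from typing import Callable, List, Tuple


def exact_match_accuracy(
    invoke_model: Callable[[str], str],
    dataset: List[Tuple[str, str]],
) -> float:
    """Score an LLM-based system by exact match against reference answers."""
    hits = 0
    for prompt, reference in dataset:
        prediction = invoke_model(prompt).strip().lower()
        hits += int(prediction == reference.strip().lower())
    return hits / len(dataset)


# Trivial stand-in "model" so the example runs end to end.
dataset = [("Capital of France?", "Paris"), ("2 + 2 = ?", "4")]
print(exact_match_accuracy(lambda p: "Paris" if "France" in p else "4", dataset))
```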

Proper estimation of predictive uncertainty is fundamental in applications that involve critical decisions. Uncertainty can be used to assess the reliability of model predictions, trigger human intervention, or decide whether a model can be safely deployed in the wild.

Fortuna is a library for uncertainty quantification that makes it easy for users to run benchmarks and bring uncertainty to production systems. Fortuna provides calibration and conformal methods starting from pre-trained models written in any framework, and it further supports several Bayesian inference methods starting from deep learning models written in Flax. The API is designed to be intuitive for practitioners unfamiliar with uncertainty quantification, and is highly configurable.
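To make the conformal idea concrete, here is a minimal split conformal prediction sketch in plain NumPy, assuming you already have class probabilities from a pre-trained model and a held-out calibration set. It illustrates the general recipe rather than Fortuna's own API; the data is randomly generated.

```python
# Sketch of split conformal prediction for classification, assuming predicted
# class probabilities from a pre-trained model (any framework) are available.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical calibration data: predicted probabilities and true labels.
cal_probs = rng.dirichlet(np.ones(3), size=500)   # shape (n_cal, n_classes)
cal_labels = rng.integers(0, 3, size=500)         # shape (n_cal,)

alpha = 0.1  # target miscoverage level (i.e. 90% coverage)

# Nonconformity score: one minus the probability assigned to the true class.
scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]

# Conformal quantile with the finite-sample correction.
n = len(scores)
q_level = np.ceil((n + 1) * (1 - alpha)) / n
q_hat = np.quantile(scores, q_level, method="higher")

# Prediction set for a new example: keep every class whose score is below q_hat.
test_probs = rng.dirichlet(np.ones(3))
prediction_set = np.where(1.0 - test_probs <= q_hat)[0]
print("prediction set:", prediction_set)
```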

Check the documentation for a quickstart, examples and references.

Bias detection and mitigation for datasets and models, from the AWS service Amazon SageMaker Clarify (see the official page).

Biases are imbalances in the training data or the prediction behavior of the model across different groups, such as age or income bracket. Biases can result from the data or algorithm used to train your model. For instance, if an ML model is trained primarily on data from middle-aged individuals, it may be less accurate when making predictions involving younger and older people. Detecting and measuring biases in your data and model is a first step toward addressing them.
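As a small, self-contained illustration of measuring such an imbalance, the sketch below computes the difference in positive label proportions between two groups on a toy dataset. The column names and data are hypothetical, and the snippet is not tied to the Clarify API.

```python
# Illustrative computation of a simple pre-training bias metric: the difference
# in positive label proportions between two groups. Column names are made up.
import pandas as pd

df = pd.DataFrame(
    {
        "age_group": ["middle", "middle", "young", "young", "senior", "senior"],
        "loan_approved": [1, 1, 1, 0, 0, 0],
    }
)

def positive_proportion(frame: pd.DataFrame, group: str) -> float:
    subset = frame[frame["age_group"] == group]
    return subset["loan_approved"].mean()

# A value close to 0 suggests balanced outcomes; large values indicate imbalance.
dpl = positive_proportion(df, "middle") - positive_proportion(df, "young")
print(f"Difference in positive proportions (middle vs. young): {dpl:.2f}")
```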

AdaTune is a library to perform gradient-based hyperparameter tuning for training deep neural networks. AdaTune currently supports tuning of the learning_rate parameter, but some of the methods implemented here can be extended to other hyperparameters such as momentum or weight_decay. AdaTune provides the following gradient-based hyperparameter tuning algorithms: HD, RTHO, and our newly proposed algorithm, MARTHE. The repository also contains other commonly used non-adaptive learning_rate schedules, such as staircase-decay, exponential-decay, and cosine-annealing-with-restarts. The library is implemented in PyTorch.
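The core idea behind HD (hypergradient descent) can be written in a few lines: for plain SGD, the hypergradient of the loss with respect to the learning rate is the dot product of the current and previous parameter gradients. The sketch below applies that update on a toy quadratic objective; it hand-rolls the rule rather than going through AdaTune's own classes, and the step counts and coefficients are arbitrary choices.

```python
# Hand-rolled sketch of the hypergradient-descent (HD) learning-rate update
# on a toy quadratic objective; only meant to show the core update rule.
import torch

torch.manual_seed(0)
w = torch.randn(10, requires_grad=True)
lr = 0.01            # learning rate being adapted online
beta = 1e-4          # hyper learning rate for the HD update
prev_grad = None

def loss_fn(w: torch.Tensor) -> torch.Tensor:
    return 0.5 * (w ** 2).sum()   # toy objective with minimum at w = 0

for step in range(100):
    loss = loss_fn(w)
    grad, = torch.autograd.grad(loss, w)

    # HD: move the learning rate along the hypergradient, which for plain SGD
    # is the dot product of the current and previous parameter gradients.
    if prev_grad is not None:
        lr = lr + beta * torch.dot(grad, prev_grad).item()

    with torch.no_grad():
        w -= lr * grad
    prev_grad = grad.detach()

print(f"final loss: {loss_fn(w).item():.4f}, adapted lr: {lr:.4f}")
```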

The package implements the three algorithms presented in the paper Forward and Reverse Gradient-Based Hyperparameter Optimization (2017):

This package contains:
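For intuition on the reverse-mode approach, the sketch below (written here in PyTorch for brevity) differentiates a validation loss through a short unrolled run of SGD to obtain the gradient with respect to the learning rate. The toy objectives and the five-step unroll are illustrative assumptions, not the interface of the package itself.

```python
# Sketch of a reverse-mode hypergradient: differentiate a validation loss
# through a short unrolled run of SGD with respect to the learning rate.
import torch

torch.manual_seed(0)
target_train = torch.randn(10)   # defines the (toy) training objective
target_val = torch.randn(10)     # defines the (toy) validation objective

lr = torch.tensor(0.1, requires_grad=True)   # hyperparameter of interest
w = torch.zeros(10, requires_grad=True)      # initial model parameters

# Unroll a few SGD steps, keeping the graph so gradients can flow back to lr.
for _ in range(5):
    train_loss = 0.5 * ((w - target_train) ** 2).sum()
    grad = torch.autograd.grad(train_loss, w, create_graph=True)[0]
    w = w - lr * grad            # functional update keeps lr in the graph

# Reverse-mode pass: gradient of the validation loss w.r.t. the learning rate.
val_loss = 0.5 * ((w - target_val) ** 2).sum()
hypergrad = torch.autograd.grad(val_loss, lr)[0]
print(f"d(val_loss)/d(lr) = {hypergrad.item():.4f}")
```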