
What is DECILE?

Data-Efficient Learning with Less Data
State-of-the-art AI and deep learning are notoriously data hungry. This comes at significant cost: large resource bills (multiple expensive GPUs and cloud compute), long training times (often several days), and substantial human labeling time and expense. DECILE attempts to solve this by answering the following questions. Can we train state-of-the-art deep models with only a sample (say 5 to 10%) of massive datasets, with negligible impact on accuracy? Can we do this while reducing training time and cost by an order of magnitude, and/or significantly reducing the amount of labeled data required?

Why DECILE?

Addressing critical challenges in modern AI and deep learning

💰

Reduce Training Costs

State-of-the-art deep learning requires expensive GPUs and cloud infrastructure, costing thousands of dollars per experiment.

🏷️

Lower Labeling Expenses

Manual data annotation is time-consuming and expensive, often requiring domain experts for quality labels.

⚖️

Handle Noisy Data

Real-world datasets contain noise, outliers, and class imbalances that degrade model performance.

⚡

Accelerate Development

Training on massive datasets takes days or weeks, slowing down research iteration and deployment cycles.

Modules



CORDS

Reduce end-to-end training time from days to hours, and hours to minutes, using coresets and data selection. CORDS implements a number of state-of-the-art data subset selection and coreset algorithms, including GLISTER, GradMatchOMP, GradMatchFixed, CRAIG, SubmodularSelection, and RandomSelection.
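To show where any such selector plugs into training, here is a minimal, library-agnostic sketch in plain PyTorch. The random selector below is a hypothetical stand-in, not CORDS's API; a coreset method such as GradMatch or CRAIG would replace it at the marked point.

```python
# Minimal sketch of subset-based training in plain PyTorch.
# All names below are illustrative, not CORDS's API.
import torch
from torch import nn
from torch.utils.data import DataLoader, SubsetRandomSampler, TensorDataset

# Toy dataset: 10,000 points, 20 features, 2 classes.
X = torch.randn(10_000, 20)
y = (X[:, 0] > 0).long()
dataset = TensorDataset(X, y)

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

fraction = 0.1                        # train on ~10% of the data per epoch
budget = int(fraction * len(dataset))

def select_subset() -> torch.Tensor:
    """Placeholder selector: random indices. A coreset method would
    instead pick indices whose gradients approximate the full-data
    gradient -- this is the point where a real selector plugs in."""
    return torch.randperm(len(dataset))[:budget]

for epoch in range(5):
    loader = DataLoader(dataset, batch_size=128,
                        sampler=SubsetRandomSampler(select_subset()))
    for xb, yb in loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
```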



DISTIL

DISTIL is a library featuring many state-of-the-art active learning algorithms. Implemented in PyTorch, it provides fast and efficient implementations of these algorithms and lets users modularly insert active learning selection into their pre-existing training loops with minimal change. Most importantly, it delivers promising results in achieving high model performance with less labeled data. If you are looking to cut labeling costs, DISTIL should be your go-to for getting the most out of your data.
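As a rough illustration of what one selection round looks like, the sketch below performs entropy-based uncertainty sampling in plain PyTorch. The names and setup are illustrative assumptions, not DISTIL's API; DISTIL packages this and stronger strategies behind a modular interface.

```python
# One toy active-learning round: score the unlabeled pool by predictive
# entropy and query the most uncertain points for labeling.
import torch
from torch import nn

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))

unlabeled = torch.randn(5_000, 20)   # pool of unlabeled examples
query_budget = 100                   # labels we can afford this round

with torch.no_grad():
    probs = torch.softmax(model(unlabeled), dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)

# Query the most uncertain points, send them to annotators, then append
# the resulting (x, label) pairs to the labeled set and retrain.
query_idx = entropy.topk(query_budget).indices
to_label = unlabeled[query_idx]
```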



SUBMODLIB

Summarize massive datasets using submodular optimization.
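As a toy illustration of submodular summarization, the sketch below greedily maximizes a facility-location objective in NumPy, picking the points that best "cover" the rest of the dataset. SUBMODLIB provides optimized implementations of this and many other submodular functions; nothing here mirrors its actual API.

```python
# Naive greedy maximization of a facility-location objective:
#   f(S) = sum_i max_{j in S} sim(i, j)
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 16))            # dataset to summarize
norms = np.linalg.norm(X, axis=1)
sim = (X @ X.T) / np.outer(norms, norms)  # pairwise cosine similarity

budget, n = 10, len(X)
selected = []
cover = np.zeros(n)   # best similarity of each point to the summary so far
for _ in range(budget):
    # Marginal gain of each candidate j: improvement in total coverage.
    gains = np.maximum(sim, cover[:, None]).sum(axis=0) - cover.sum()
    gains[selected] = -np.inf             # never re-pick a selected point
    j = int(gains.argmax())
    selected.append(j)
    cover = np.maximum(cover, sim[:, j])

print("summary indices:", selected)
```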



SPEAR

SPEAR is a Python library that reduces data-labeling effort using data programming. It implements several recent approaches such as Snorkel, ImplyLoss, and learning to reweight. In addition to data labeling, it integrates semi-supervised approaches for training and inference.
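For intuition, here is a toy data-programming sketch in plain Python: weak labeling functions vote on each example, and a simple majority vote aggregates them. SPEAR (like Snorkel) learns labeling-function accuracies rather than taking a raw majority; the names below are illustrative, not SPEAR's API.

```python
# Weak labeling functions (LFs) vote SPAM/HAM or abstain; an aggregator
# turns the vote matrix into weak labels for downstream training.
import numpy as np

ABSTAIN, HAM, SPAM = -1, 0, 1

def lf_keyword(text):   # fires on a spammy keyword
    return SPAM if "win money" in text.lower() else ABSTAIN

def lf_short(text):     # very short messages look like ham
    return HAM if len(text.split()) < 4 else ABSTAIN

def lf_shouting(text):  # all-caps messages look like spam
    return SPAM if text.isupper() else ABSTAIN

texts = ["WIN MONEY NOW CLICK", "see you at lunch", "ok", "Win money fast!!!"]
votes = np.array([[lf(t) for lf in (lf_keyword, lf_short, lf_shouting)]
                  for t in texts])

def majority(row):
    # Ties break toward the lower label here; real aggregators instead
    # model each LF's accuracy to weight its vote.
    valid = row[row != ABSTAIN]
    return ABSTAIN if len(valid) == 0 else int(np.bincount(valid).argmax())

labels = [majority(row) for row in votes]
print(labels)   # weak labels to train a classifier on
```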




Targeted subset selection


ML Efficiency for Large Models (MeLM)

Today's world needs orders-of-magnitude more efficient ML to address environmental and energy crises, optimize resource consumption, and improve sustainability. With the end of Moore's Law and Dennard scaling, we can no longer expect more and faster transistors at the same cost and power budget.

PI: Ganesh Ramakrishnan

🎯

Optimizing Large Language Models through Singular Vector-Based Fine-Tuning

Advancing parameter-efficient fine-tuning techniques by exploring singular vector-guided updates to adapt large-scale pre-trained models for specific downstream tasks; a toy sketch follows the focus areas below.

  • Parameter Efficiency in Model Fine-Tuning
  • Comparison and Evaluation of PEFT Techniques
  • Task-Specific Sparsity Patterns and Performance
  • Scalability and Adaptation in Large Language Models
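To ground the idea, here is a hypothetical toy layer that freezes the singular vectors of a pre-trained weight matrix and trains only its singular values. The project explores richer variants of this idea; the sketch mirrors no specific codebase.

```python
# Factor a frozen pre-trained weight W = U diag(s) V^T and adapt only
# the spectrum s, so the trainable parameter count drops dramatically.
import torch
from torch import nn

class SVDLinear(nn.Module):
    def __init__(self, pretrained_weight: torch.Tensor):
        super().__init__()
        U, s, Vh = torch.linalg.svd(pretrained_weight, full_matrices=False)
        self.register_buffer("U", U)      # frozen singular vectors
        self.register_buffer("Vh", Vh)
        self.s = nn.Parameter(s.clone())  # trainable spectrum only

    def forward(self, x):
        # (U * s) scales U's columns, so this equals x @ U diag(s) Vh.
        return x @ (self.U * self.s) @ self.Vh

W = torch.randn(64, 64)        # stand-in for a pre-trained weight matrix
layer = SVDLinear(W)
print(sum(p.numel() for p in layer.parameters()))  # 64 params, not 4096
```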
🧠

Pathway to Algorithmic Generalization (Memory-Augmented Transformers)

Exploring memory-augmented Transformers (Memformers) as adaptive optimizers by implementing Linear First-Order Optimization Methods (LFOMs); a toy sketch follows the focus areas below.

  • Leveraging Memory Augmentation for Advanced Optimization
  • Comparative Performance Against Classical Optimization Techniques
  • Transformers as Meta-Optimizers
  • Theoretical Foundations and Convergence Analysis
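For intuition, the toy sketch below runs an LFOM-style update in which each step is a fixed linear combination of recent gradients; a Memformer would instead produce such coefficients adaptively from its memory state. All choices below are hand-picked illustrative assumptions.

```python
# A toy LFOM on an ill-conditioned quadratic: the update mixes the
# gradient history, generalizing momentum-style first-order methods.
import torch

def quadratic_loss(x):
    # Curvatures 1 and 10 make plain gradient descent awkward to tune.
    return 0.5 * (torch.tensor([1.0, 10.0]) * x * x).sum()

x = torch.tensor([3.0, 2.0], requires_grad=True)
history = []                   # gradient memory (most recent last)
coeffs = [0.08, 0.04, 0.02]    # weights on grads g_t, g_{t-1}, g_{t-2}

for step in range(50):
    loss = quadratic_loss(x)
    grad, = torch.autograd.grad(loss, x)
    history.append(grad)
    update = sum(c * g for c, g in zip(coeffs, reversed(history)))
    with torch.no_grad():
        x -= update

print(x.detach(), quadratic_loss(x).item())  # x should approach zero
```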
📐

Geodesic Sharpness in Transformers

Advancing symmetry-aware sharpness metrics to improve generalization predictions for Transformer models by leveraging Riemannian geometry; a naive baseline probe is sketched after the focus areas below.

  • Developing Symmetry-Invariant Sharpness Measures
  • Comparative Analysis of Geodesic Sharpness
  • Evaluating Transformer Symmetries in Attention Mechanisms
  • Potential for Sharpness-Aware Optimization
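As a baseline for what sharpness metrics measure, here is a naive Euclidean probe: the average loss increase under small random weight perturbations. Such probes are confounded by parameter symmetries (rescalings that change the weights but not the function the network computes), which is precisely what a geodesic, symmetry-aware measure is designed to factor out. The sketch is illustrative only.

```python
# Naive Euclidean sharpness probe: average loss rise under random
# weight perturbations of radius rho.
import copy
import torch
from torch import nn

model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.MSELoss()
X, y = torch.randn(256, 10), torch.randn(256, 1)

@torch.no_grad()
def sharpness(model, rho=0.05, trials=20):
    base = loss_fn(model(X), y).item()
    rises = []
    for _ in range(trials):
        probe = copy.deepcopy(model)
        for p in probe.parameters():
            p.add_(rho * torch.randn_like(p))   # Euclidean perturbation
        rises.append(loss_fn(probe(X), y).item() - base)
    return sum(rises) / trials

print(f"sharpness estimate: {sharpness(model):.4f}")
```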
⚙️

Efficiently Adapting Pre-Trained Models for Multiple Tasks

Investigating task arithmetic as an efficient technique for editing pre-trained models, focusing on adding, combining, or removing task-specific capabilities with minimal interference; a minimal sketch follows the focus areas below.

  • Developing Task Arithmetic for Efficient Model Adaptation
  • Investigating Weight Disentanglement Mechanisms
  • Examining Kernel-Based Approaches to Task Localization
  • Understanding the Role of Pre-Training in Task Disentanglement
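A minimal sketch of the mechanics, assuming nothing beyond standard PyTorch state_dicts: a task vector is the weight delta from fine-tuning, and scaled addition or negation of task vectors composes or removes capabilities. Real usage operates on large pre-trained checkpoints; the tiny models below are stand-ins.

```python
# Task arithmetic on state_dicts: tau = theta_finetuned - theta_pre.
import torch
from torch import nn

def make_model():
    return nn.Linear(8, 2)

pretrained = make_model()
finetuned_a = make_model()   # stands in for a checkpoint tuned on task A
finetuned_b = make_model()   # ... and on task B

def task_vector(ft, base):
    return {k: ft.state_dict()[k] - base.state_dict()[k]
            for k in base.state_dict()}

tau_a = task_vector(finetuned_a, pretrained)
tau_b = task_vector(finetuned_b, pretrained)

# Multi-task edit: theta = theta_pre + alpha * (tau_a + tau_b).
alpha = 0.5
edited = make_model()
edited.load_state_dict({k: pretrained.state_dict()[k]
                        + alpha * (tau_a[k] + tau_b[k])
                        for k in tau_a})
# Negating a task vector (theta_pre - alpha * tau_a) would instead
# suppress task A's behavior.
```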

News


Team




Interns

Research Publications


CORDS

FairPO: Fair Preference Optimization for Multi-Label Learning

Soumen Kumar Mondal, Akshit Varmora, Prateek Chanda, Ganesh Ramakrishnan

In the OPT 2025 Workshop at the Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

Unified Wisdom: Harnessing Collaborative Learning to Improve Efficacy of Knowledge Distillation

Durga S, Atharva Abhijit Tambat, Ganesh Ramakrishnan, Pradeep Shenoy

In Transactions on Machine Learning Research (TMLR), 2025

Integrations: Informed Subset Selection Based Generation for Medical Imaging in Resource Constrained Setting

Bhavik Kanekar, Atharv Savarkar, Ganesh Ramakrishnan, Kshitij S. Jadhav

22nd IEEE International Symposium on Biomedical Imaging (ISBI 2025)

Bayesian Coreset Optimization for Personalized Federated Learning

Prateek Chanda, Shrey Modi, Ganesh Ramakrishnan

International Conference on Learning Representations (ICLR) 2024

Submodularity in data subset selection and active learning

Kai Wei, Rishabh Iyer, Jeff Bilmes

International Conference on Machine Learning (ICML) 2015

Learning From Less Data: A Unified Data Subset Selection and Active Learning Framework for Computer Vision

Vishal Kaushal, Rishabh Iyer, Suraj Kothiwade, Rohan Mahadev, Khoshrav Doctor, and Ganesh Ramakrishnan

7th IEEE Winter Conference on Applications of Computer Vision (WACV), 2019, Hawaii, USA

GLISTER: Generalization based Data Subset Selection for Efficient and Robust Learning

Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, and Rishabh Iyer

35th AAAI Conference on Artificial Intelligence, AAAI 2021

Fast multi-stage submodular maximization

Kai Wei, Rishabh K. Iyer, Jeff A. Bilmes

International Conference on Machine Learning (ICML 2014)

Submodular subset selection for large-scale speech training data

Kai Wei et al.

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014

Coresets for Data-efficient Training of Machine Learning Models

Baharan Mirzasoleiman, Jeff Bilmes, and Jure Leskovec

International Conference on Machine Learning (ICML), July 2020

Coresets for Robust Training of Deep Neural Networks against Noisy Labels

Baharan Mirzasoleiman, Kaidi Cao, Jure Leskovec

In Advances in Neural Information Processing Systems (NeurIPS), 2020


DISTIL

Submodularity in data subset selection and active learning

Kai Wei, Rishabh Iyer, Jeff Bilmes

International Conference on Machine Learning (ICML) 2015

Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds

Jordan T. Ash et al.

8th International Conference on Learning Representations (ICLR), 2020

GLISTER: Generalization based Data Subset Selection for Efficient and Robust Learning

Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, and Rishabh Iyer

In Proceedings of the 35th AAAI Conference on Artificial Intelligence, AAAI 2021

An Interactive Multi-Label Consensus Labeling Model for Multiple Labeler Judgments

Ashish Kulkarni, Narasimha Raju Uppalapati, Pankaj Singh, Ganesh Ramakrishnan

In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, AAAI 2018

Learning From Less Data: Diversified Subset Selection and Active Learning in Image Classification Tasks

Vishal Kaushal, Rishabh Iyer, Anurag Sahoo, Khoshrav Doctor, Narasimha Raju, Ganesh Ramakrishnan

In Proceedings of The 7th IEEE Winter Conference on Applications of Computer Vision (WACV), 2019, Hawaii, USA

A New Active Labeling Method for Deep Learning

Dan Wang, Yi Shang

International Joint Conference on Neural Networks (IJCNN), 2014

Deep Bayesian Active Learning with Image Data

Yarin Gal, Riashat Islam, Zoubin Ghahramani

34th International Conference on Machine Learning (ICML), 2017

Active Learning for Convolutional Neural Networks: A Core-Set Approach

Ozan Sener, Silvio Savarese

6th International Conference on Learning Representations (ICLR), 2018

Adversarial Active Learning for Deep Networks: a Margin Based Approach

Melanie Ducoffe, Frederic Precioso

arXiv, 2018.


SUBMODLIB

Bandit Guided Submodular Curriculum for Adaptive Subset Selection

Prateek Chanda, Prayas Agrawal, Saral Sureka, Lokesh Reddy Polu, Atharv Kshirsagar, Ganesh Ramakrishnan

In Proceedings of the Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025).

A Framework towards Domain Specific Video Summarization

Vishal Kaushal, Sandeep Subramanian, Suraj Kothawade, Rishabh Iyer, Ganesh Ramakrishnan

In Proceedings of The 7th IEEE Winter Conference on Applications of Computer Vision (WACV), 2019, Hawaii, USA.

Demystifying Multi-Faceted Video Summarization: Tradeoff Between Diversity, Representation, Coverage and Importance

Vishal Kaushal, Rishabh Iyer, Anurag Sahoo, Pratik Dubal, Suraj Kothawade, Rohan Mahadev, Kunal Dargan, Ganesh Ramakrishnan

In Proceedings of The 7th IEEE Winter Conference on Applications of Computer Vision (WACV), 2019, Hawaii, USA.

Synthesis of Programs from Multimodal Datasets

Shantanu Thakoor, Simoni Shah, Ganesh Ramakrishnan, Amitabha Sanyal

In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, Louisiana, USA.

Beyond clustering: Sub-DAG Discovery for Categorising Documents

Ramakrishna Bairi, Mark Carman and Ganesh Ramakrishnan

In Proceedings of the 25th International Conference on Information and Knowledge Management (CIKM 2016), Indianapolis, USA

Building Compact Lexicons for Cross-Domain SMT by mining near-optimal Pattern Sets

Pankaj Singh, Ashish Kulkarni, Himanshu Ojha, Vishwajeet Kumar, Ganesh Ramakrishnan

In Proceedings of the 20th Pacific Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2016.


SPEAR

Data Programming using Continuous and Quality-Guided Labeling Functions

Oishik Chatterjee, Ganesh Ramakrishnan, Sunita Sarawagi

In Proceedings of The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020), New York, USA.

An Interactive Multi-Label Consensus Labeling Model for Multiple Labeler Judgments

Ashish Kulkarni, Narasimha Raju Uppalapati, Pankaj Singh, Ganesh Ramakrishnan

In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, Louisiana, USA.

Synthesis of Programs from Multimodal Datasets

Shantanu Thakoor, Simoni Shah, Ganesh Ramakrishnan, Amitabha Sanyal

In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, Louisiana, USA.

Comparison between Explicit Learning and Implicit Modeling of Relational Features in Structured Output Spaces

Ajay Nagesh, Naveen Nair and Ganesh Ramakrishnan

In Proceedings of the 23rd International Conference on Inductive Logic Programming (ILP), 2013, Rio de Janeiro, Brazil.

Towards Efficient Named-Entity Rule Induction for Customizability

Ajay Nagesh, Ganesh Ramakrishnan, Laura Chiticariu, Rajasekar Krishnamurthy, Ankush Dharkar, Pushpak Bhattacharyya

In Proceedings of the 2012 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2012, Jeju, Korea.

Rule Ensemble Learning Using Hierarchical Kernels in Structured Output Spaces

Naveen Nair, Amrita Saha, Ganesh Ramakrishnan, Shonali Krishnaswamy

In Proceedings of the Twenty-Sixth Conference on Artificial Intelligence (AAAI), 2012, Toronto, Canada.

What Kinds of Relational Features are Useful for Statistical Learning?

Amrita Saha, Ashwin Srinivasan, Ganesh Ramakrishnan

In Proceedings of the 22nd International Conference on Inductive Logic Programming (ILP), 2012, Dubrovnik

Probing the Space of Optimal Markov Logic Networks for Sequence Labeling

Naveen Nair, Ajay Nagesh, Ganesh Ramakrishnan

In Proceedings of the 22nd International Conference on Inductive Logic Programming (ILP), 2012

Efficient Rule Ensemble Learning using Hierarchical Kernels

Pratik Jawanpuria, Saketha Nath and Ganesh Ramakrishnan

In Proceedings of the 28th International Conference on Machine Learning, 2011

Pruning Search Space for Weighted First Order Horn Clause Satisfiability

Naveen Nair, Chander Jayaraman, Kiran TVS and Ganesh Ramakrishnan

In Proceedings of the 20th International Conference on Inductive Logic Programming (ILP), Florence, Italy

BET: An Inductive Logic Programming Workbench

Srihari Kalgi, Chirag Gosar, Prasad Gawde, Ganesh Ramakrishnan, Chander Iyer, Kiran T V S, Kekin Gada and Ashwin Srinivasan

In Proceedings of the 20th International Conference on Inductive Logic Programming (ILP), Florence, Italy

Parameter Screening and Optimisation for ILP using Designed Experiments

Ashwin Srinivasan, Ganesh Ramakrishnan

In the Journal of Machine Learning Research, 11 (2010), 3481–3516

An Investigation into Feature Construction to Assist Word Sense Disambiguation

Lucia Specia, Ashwin Srinivasan, Ganesh Ramakrishnan, Sachindra Joshi and Maria das Gracas Volpe Nunes

In Machine Learning 76(1): 109-136 (2009)

Feature Construction using Theory-Guided Sampling and Randomised Search

Sachindra Joshi, Ganesh Ramakrishnan, and Ashwin Srinivasan

In Proceedings of the 18th International Conference on Inductive Logic Programming (ILP 2008), Prague, Czech Republic, September 10-12, 2008

SMART: Submodular Data Mixture Strategy for Instruction Tuning

H S V N S Kowndinya Renduchintala, Sumit Bhatia, Ganesh Ramakrishnan

In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024 - Findings)

DictDis: Dictionary Constrained Disambiguation for Improved NMT

Ayush Maheshwari, Preethi Jyothi, Ganesh Ramakrishnan

In Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024 - Findings)

Speeding up NAS with Adaptive Subset Selection

Vishak Prasad C, Colin White, Paarth Jain, Sibasis Nayak, Ganesh Ramakrishnan

In Proceedings of The 2024 International Conference on Automated Machine Learning (AutoML 2024)

Gradient Coreset for Federated Learning

Durga Sivasubramanian, Lokesh Nagalapatti, Rishabh Iyer, Ganesh Ramakrishnan

In Proceedings of The 12th IEEE Winter Conference on Applications of Computer Vision (WACV 2024)

M2IoU: A Min-Max Distance based Loss Function for Bounding Box Regression in Medical Imaging

Kalash Shah, Anurag Kumar Shandilya, Bhavik Kanekar, Akshat Gautam, Pavni Tandon, Ganesh Ramakrishnan, Kshitij Jadhav

In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM), 2024

INSITE: labelling medical images using submodular functions and semi-supervised data programming

Akshat Gautam, Anurag Shandilya, Akshit Srivastava, Venkatapathy Subramanian, Ganesh Ramakrishnan, Kshitij Jadhav

In Proceedings of the 21st IEEE International Symposium on Biomedical Imaging (ISBI), 2024

Adaptive mixing of auxiliary losses in supervised learning

Durga Sivasubramanian, Ayush Maheshwari, Pradeep Shenoy, Prathosh AP, Ganesh Ramakrishnan

In Proceedings of The 37th AAAI Conference on Artificial Intelligence (AAAI 2023)

Discrete Continuous Optimization Framework for Simultaneous Clustering and Training in Mixture Models

Parth Vipul Sangani, Arjun Shashank Kashettiwar, Pritish Chakraborty, Bhuvan Reddy Gangula, Durga S, Ganesh Ramakrishnan, Rishabh K Iyer, Abir De

In Proceedings of the International Conference on Machine Learning, ICML 2023

INGENIOUS: Using Informative Data Subsets for Efficient Pre-Training of Large Language Models

HSVNS Kowndinya Renduchintala, Krishnateja Killamsetty, Sumit Bhatia, Milan Aggarwal, Ganesh Ramakrishnan, Rishabh Iyer, Balaji Krishnamurthy

In Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), long paper

AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning

KrishnaTeja Killamsetty, Guttu Sai Abhishek, Aakriti, Ganesh Ramakrishnan, Alexandre V. Evfimievski, Lucian Popa, Rishabh K. Iyer

In Proceedings of the Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022)

SPEAR: Semi-supervised Data Programming in Python

Guttu Sai Abhishek, Harshad Ingole, Parth Laturia, Vineeth Dorna, Ayush Maheshwari, Rishabh Iyer, Ganesh Ramakrishnan

In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi (Demo paper)

Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training

Ashish Mittal, Durga Sivasubramanian, Rishabh Iyer, Preethi Jyothi and Ganesh Ramakrishnan

In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi (Findings)