Drift Detection Algorithms in Federated Learning: FedDrift and FedDrift Eager


INTRODUCTION

As discussed earlier, traditional drift detection algorithms perform well only when the data is centralized. Federated learning, in contrast, trains local models on client devices and aggregates their weights to update the central (global) model at the server at the end of each communication round.

This renders the traditional methods impractical, as they require large volumes of data to be windowed and tested. That is not possible in federated learning, where each client device holds only a small local dataset: enough to train a model locally and send its weights back to the server, but not enough to run a centralized drift test.

Therefore, we examine concept drift in federated learning, where data is heterogeneous both over time and across clients. The algorithms below train multiple global models simultaneously, with each client comparing every model's loss on its local data against the loss from the previous round (an adaptive variant of FedAverage). The FedDrift algorithm is designed to address this issue by dynamically adapting to changes in data distribution across clients over time.

The two algorithms are FedDrift Eager and FedDrift.

FedDrift Eager: Proactive Adaptation in Federated Learning

The FedDrift framework houses the FedDrift Eager algorithm, specifically designed to tackle concept drift proactively within federated learning environments. This algorithm leverages a multi-model approach to accommodate the emergence of various concepts over time. Let's delve deeper into the intricacies of FedDrift Eager.

The primary objective is to minimize the loss function across all clients and time periods, ensuring that the trained models can adapt effectively to evolving data distributions.

The Objective: Mitigating Concept Drift

The FedDrift Eager algorithm strives to minimize the loss function across clients and time. It achieves this by training models adept at handling these changes in data distributions caused by concept drift. This necessitates a system where models specialize in handling specific data distributions and dynamically adjust to new concepts as they arise. The ultimate goal is to guarantee robust model performance even in the face of fluctuating and evolving data distributions.
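One way to write this objective down, as a sketch in our own notation rather than the paper's exact formulation: with $M$ global models $h_1, \dots, h_M$, $P$ clients, $T$ time steps, a per-example loss $\ell$, and an assignment function $c(p, t)$ choosing a model for client $p$ at time $t$,

```latex
\min_{h_1,\dots,h_M,\; c}\;\;
\frac{1}{TP} \sum_{t=1}^{T} \sum_{p=1}^{P}
\mathbb{E}_{(x,y) \sim \mathcal{D}_p^{(t)}}
\Big[ \ell\big( h_{c(p,t)}(x),\, y \big) \Big]
```

where $\mathcal{D}_p^{(t)}$ is client $p$'s data distribution at time $t$. A single-model approach forces $c(p,t)$ to be constant; the multi-model approaches below let it vary per client and per round.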

Single vs. Multiple Model Approaches: A Comparison

The traditional single-model approach utilizes a single global model for inference on all clients. This approach aims to minimize the average loss across all clients and time periods. However, its effectiveness diminishes significantly in scenarios with substantial concept drift. In contrast, the multiple-model solution tackles this challenge by training numerous global models for distinct concepts. It dynamically groups clients and assigns each cluster the most suitable model for inference. By employing specialized models for different concepts, this approach minimizes loss and offers a more robust solution for handling concept drift.

FedDrift Eager: A Multi-Step Process

The FedDrift Eager algorithm unfolds in several key steps. First, clients are grouped (clustered) based on their current data distribution. Each client is then assigned a model based on the one that yields the lowest loss on its local data. Subsequently, clients perform local updates on their designated models using their local data before sending these updates to the server. The server aggregates these updates to refine the global models. To ensure the models remain specialized for the current data distributions, the algorithm incorporates a mechanism for detecting concept drift and creating new clusters as needed.
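The assignment step is simple to express in code. Here is a minimal sketch, assuming each global model is a callable mapping inputs to predictions and `loss_fn` scores predictions against labels; these names are illustrative, not an API from the paper:

```python
import numpy as np

def assign_client_to_model(models, local_X, local_y, loss_fn):
    """Pick the global model with the lowest loss on this client's
    local data; returns the index of the best-fitting model."""
    losses = [loss_fn(model(local_X), local_y) for model in models]
    return int(np.argmin(losses))

# Tiny usage example: two linear "models" scored by mean squared error
X, y = np.random.randn(32, 4), np.random.randn(32)
models = [lambda X: X @ np.ones(4), lambda X: X @ np.zeros(4)]
mse = lambda pred, y: float(np.mean((pred - y) ** 2))
best = assign_client_to_model(models, X, y, mse)  # index of best model
```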

A Detailed Look at the Algorithm Steps

The process commences with the creation of multiple global models, each catering to a different concept. In every communication round, clients compute the loss of each model on their local data, assign themselves to the model with the lowest loss, and perform local updates on that model using their local data. The updated model parameters are then sent to the server, which aggregates the updates from all clients and uses the aggregated parameters to refine the global models.

An additional layer of the algorithm is a drift detection mechanism. It identifies significant changes in the data distribution, creates new clusters as required, and assigns clients to the most suitable clusters based on their current data distribution.
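The drift check itself can be as simple as comparing this round's best achievable loss against the previous round's. A minimal sketch, assuming a scalar loss summary and a tolerance `delta` (both our simplifications of the paper's test):

```python
def detect_drift_and_maybe_spawn(best_loss_now, prev_best_loss, delta,
                                 models, new_model_fn):
    """If even the best existing model has degraded by more than `delta`
    since the previous round, treat the change as a new concept: append
    a freshly initialized model (a new cluster) and return its index.
    Returns None when no drift is flagged."""
    if best_loss_now > prev_best_loss + delta:
        models.append(new_model_fn())
        return len(models) - 1
    return None
```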

The Advantages of FedDrift Eager

The FedDrift Eager algorithm boasts several advantages. It proactively adapts to concept drift by creating new models and clusters as needed, guaranteeing that models remain specialized for the current data distribution. This proactive approach translates to improved performance and ensures the effectiveness of the trained models even when data distributions change. Additionally, the algorithm exhibits scalability, making it adept at handling a large number of clients with heterogeneous data distributions. This scalability makes it a powerful and robust solution for real-world federated learning applications.

FedDrift: Combating Concept Drift in Federated Learning

The FedDrift algorithm employs a proactive, multi-model approach to effectively adapt to new concepts that surface over time. The core of FedDrift lies in dynamic client clustering, model assignment, and continuous vigilance against concept drift. It fosters the creation of new clusters and models on-the-fly, ensuring sustained model performance.

Single vs. Multiple Model Approaches: A Tale of Two Strategies

Federated Learning traditionally employs a single-model solution, utilizing one global model for all clients. While this approach aims to minimize the average loss across clients and time, it may falter in scenarios with significant concept drift due to the heterogeneous and dynamic nature of the data distributions. In stark contrast, the FedDrift algorithm, championing a multiple-model solution, trains multiple global models, each specializing in a specific concept. Clients are dynamically clustered based on their data distribution, and each cluster is assigned the most appropriate model for inference. This approach significantly enhances model performance by ensuring that each client leverages a model tailored to its current data, effectively handling concept drift.

FedDrift in Action: A Step-by-Step Breakdown

The FedDrift algorithm unfolds in a series of key steps. The first step is the creation of multiple global models, each representing a distinct concept. During each communication round, clients compute the loss for every model on their local data and assign themselves to the model with the lowest loss, the best fit for their current data distribution.

Clients then perform local training on their assigned model, updating its parameters based on the local dataset and thereby reflecting the latest data distribution. The updated model parameters are transmitted to the central server, where they are aggregated to refine the global models.

The algorithm also incorporates a mechanism for detecting significant deviations in data distribution (concept drift). When a drift is detected, new clusters are formed, clients are reassigned based on their current data distribution, and new models may be initialized to address the emerging concepts. Finally, the updated global models are disseminated to the clients for the subsequent training round, guaranteeing that clients consistently have access to the most suitable model for their data distribution.
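On the server side, each cluster's updates can be combined with a FedAvg-style weighted average. A sketch using flat parameter vectors (a simplification; the function and variable names here are illustrative):

```python
import numpy as np

def server_round(client_weights, client_sizes, client_assignments, num_models):
    """Aggregate uploaded parameters per model cluster.

    client_weights[i]     -- flat parameter vector from client i
    client_sizes[i]       -- number of local samples at client i
    client_assignments[i] -- index of the model client i trained this round
    Returns one aggregated vector per model, or None for models
    no client trained this round.
    """
    aggregated = []
    for m in range(num_models):
        members = [i for i, a in enumerate(client_assignments) if a == m]
        if not members:
            aggregated.append(None)  # cluster untouched this round
            continue
        sizes = np.array([client_sizes[i] for i in members], dtype=float)
        stacked = np.stack([client_weights[i] for i in members])
        # Weighted average of member updates, proportional to local data size
        aggregated.append((sizes[:, None] * stacked).sum(axis=0) / sizes.sum())
    return aggregated
```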

Advantages: The Power of FedDrift

The FedDrift algorithm boasts several advantages, solidifying its position as a robust solution for federated learning in dynamic environments: it adapts proactively to concept drift by creating new clusters and models on the fly, it keeps each client paired with a model specialized for its current data distribution, and it scales to large numbers of clients with heterogeneous, non-IID data.

Conclusion

The FedDrift algorithm stands out as a powerful and scalable solution for tackling concept drift in federated learning environments. By dynamically clustering clients and training multiple models, it guarantees that the trained models remain effective despite fluctuating data distributions. This approach proves to be pivotal for real-world federated learning applications where data is non-IID and susceptible to frequent concept drifts, ensuring that models can adapt and perform well in dynamic environments.

Summary

FedDrift employs a bottom-up approach: it isolates clients that detect drift and then iteratively merges clients corresponding to the same concept. FedDrift Eager is a special case that assumes only one new concept emerges at a time. Applying a drift detection test globally at the server, by aggregating errors across clients, leads to poor performance.
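As a rough sketch of that bottom-up merging, clusters can be compared through a loss matrix: `L[i][j]` is model i's loss on cluster j's data, and two clusters merge when each model performs nearly as well on the other's data as that cluster's own model does. The symmetric test and threshold `delta` below follow our reading of the approach and are simplified:

```python
def find_mergeable_pair(L, delta):
    """L[i][j]: loss of model i evaluated on the data of cluster j.
    Return the first pair (i, j) of clusters whose models appear to fit
    the same concept, i.e. whose cross-cluster losses are within `delta`
    of the own-cluster losses; None if no pair qualifies."""
    n = len(L)
    for i in range(n):
        for j in range(i + 1, n):
            dist = max(L[i][j] - L[j][j],
                       L[j][i] - L[i][i],
                       0.0)
            if dist < delta:
                return (i, j)  # candidate clusters to merge
    return None
```

Repeating this test until no pair qualifies yields the iterative, hierarchical merging described above, with each merge replacing two clusters by one shared model.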