Implementation of several Personalized Federated Learning (PFL) and Federated Learning (FL) architectures via PySyft. This was done for my Master's Thesis.
- Implementations of FL and PFL through model aggregation and passing of model dictionaries through PySyft
- Description and analysis of performance metrics of FL and PFL architecture
- Discussion of results and (dis)advantages of implemented architectures
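The core of FL-style aggregation is averaging the clients' model dictionaries parameter-wise. The sketch below is a minimal, pure-Python illustration of that idea (here with scalar parameters instead of tensors); it is not the PySyft code from the notebooks, and the function name `fedavg` is chosen for illustration.

```python
def fedavg(client_states, weights=None):
    """Average client model state dicts parameter-wise (FedAvg-style).

    client_states: list of dicts mapping parameter name -> scalar value.
    weights: optional per-client weights (e.g. proportional to local
             dataset size); defaults to a uniform average.
    """
    n = len(client_states)
    if weights is None:
        weights = [1.0 / n] * n
    # The aggregated model has one entry per parameter name,
    # computed as the weighted sum over all clients.
    return {
        key: sum(w * state[key] for w, state in zip(weights, client_states))
        for key in client_states[0]
    }

# Example: two clients, uniform weighting.
global_state = fedavg([{"w": 1.0, "b": 0.0}, {"w": 3.0, "b": 2.0}])
```
In the real setting the dict values are tensors (e.g. a PyTorch `state_dict`), but the aggregation rule is the same element-wise weighted sum.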
The following are uploaded as CSV files in the Results folder:
- Metrics of implementations
- Summary metrics
The thesis document, including thorough analysis, summaries, and discussion will be uploaded here once finished.
The notebooks folder contains all Jupyter Notebooks used.
The dataset linked under sources needs to be downloaded to replicate the results locally. Both the full dataset (Full) and a randomly sampled 10% (10%) version were used.
Each dataset has three variants:
- Regular (unchanged)
- Factor 2 (where the first data slice's numerical values are multiplied by 2 and the second's are divided by 2)
- Random Noise (where random noise was added to the first data slice, and the second remains the same)
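The two modifications can be sketched as simple transforms over the numerical columns of each slice. This is an illustrative pure-Python version assuming the description above (scale one slice up and the other down, or perturb one slice with Gaussian noise); the exact scaling and noise distribution used in the notebooks may differ.

```python
import random

def factor_2(slice_a, slice_b):
    # "Factor 2" variant: multiply the first slice's numerical values
    # by 2 and divide the second slice's by 2.
    return [x * 2 for x in slice_a], [x / 2 for x in slice_b]

def random_noise(slice_a, scale=0.1, seed=0):
    # "Random Noise" variant: add Gaussian noise to the first slice only;
    # the second slice is left unchanged. scale/seed are illustrative.
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, scale) for x in slice_a]
```
Such skews make the two clients' data distributions diverge, which is what distinguishes the personalized architectures from plain FedAvg in the results.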
Each dataset modification was run with four architecture variants; each notebook corresponds to one such variant:
- Central (Non-Federated)
- FedAvg (FL)
- APFL (A version of PFL)
- FedPer (Another version of PFL)
The dataset used was originally provided by Criteo for the Criteo Display Advertising Challenge.
The data was accessed here.
The DLRM model used for this implementation was based on this public Kaggle submission for the challenge. The submission is a TensorFlow implementation of this paper.
All Implementations are simplified versions of published Federated Learning architectures:
- Federated Averaging (FedAvg)
- Adaptive Personalized Federated Learning (APFL)
- Federated Personalization (FedPer)
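As one concrete example of the personalized variants, APFL keeps a local model alongside the global one and serves each client a convex mixture of the two, with the mixing weight α adapted per client. Below is a minimal scalar sketch of that mixing and the adaptive α step; the function names and learning rate are illustrative, and the notebooks apply the same rule to full parameter tensors.

```python
def apfl_mix(alpha, v_local, w_global):
    """APFL personalized parameter: convex mix of local and global copies."""
    return alpha * v_local + (1.0 - alpha) * w_global

def update_alpha(alpha, v_local, w_global, grad_mixed, lr=0.1):
    # Adaptive alpha step (scalar case): the derivative of the loss at the
    # mixed parameter with respect to alpha is (v - w) * grad_mixed.
    alpha = alpha - lr * (v_local - w_global) * grad_mixed
    # Keep the mixing weight a valid convex coefficient.
    return min(max(alpha, 0.0), 1.0)
```
FedPer, by contrast, personalizes by splitting the network itself: shared base layers are aggregated across clients while each client keeps its own head layers local.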
Syft 0.8.4 was used. For documentation and installation instructions, refer to this document and the PySyft GitHub community.