Update README.md

HengZhang 2024-11-29 14:21:34 +08:00
parent 87b4ad10d1
commit 987ad34a17
1 changed files with 181 additions and 74 deletions

README.md

@ -1,152 +1,259 @@
# FedDGCN: A Scalable Federated Learning Framework for Traffic Flow Prediction

This is the official repository of **FedDGCN: A Scalable Federated Learning Framework for Traffic Flow Prediction.**

![Overview](./figures/overview.jpg)

**FedDGCN** extends [FederatedScope](https://github.com/alibaba/FederatedScope) to support federated traffic flow prediction.

> **Note:** This is an early version of **FedDGCN**. The full version will be released after testing is completed.

---

## Table of Contents

- [1. Environment Setup](#1-environment-setup)
  - [Step 1: Create a Conda Environment](#step-1-create-a-conda-environment)
  - [Step 2: Install PyTorch](#step-2-install-pytorch)
  - [Step 3: Install FederatedScope](#step-3-install-federatedscope)
- [2. Run the Code](#2-run-the-code)
  - [Step 1: Prepare the Datasets](#step-1-prepare-the-datasets)
  - [Step 2: Configure the Settings](#step-2-configure-the-settings)
  - [Step 3: Run the Experiments](#step-3-run-the-experiments)
- [3. Visualize Results](#3-visualize-results)
- [4. Citation](#4-citation)
- [5. Acknowledgements](#5-acknowledgements)

---
## 1. Environment Setup

### Step 1: Create a Conda Environment

We recommend using a **Conda** virtual environment. This project supports **Python 3.9** (recommended) and **Python 3.10**.

> **Warning:** Python 3.11 and later versions are not compatible.

```bash
conda create -n FedDGCN python=3.9
conda activate FedDGCN
```

### Step 2: Install PyTorch

Download the appropriate version of [PyTorch](https://pytorch.org/get-started/locally/) based on your device.
This project has been tested with **PyTorch 2.4.0 (recommended)** and **PyTorch 2.0.0** with **CUDA 12**. Compatibility with other versions is not guaranteed.

### Step 3: Install FederatedScope

Clone this repository and install it:

```bash
git clone https://github.com/your-repo/FS-TFP.git
cd FS-TFP
pip install -e .
```

Additionally, install the following packages to avoid warnings:

```bash
pip install torch_geometric community rdkit
```
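Before moving on, it can help to confirm that the environment matches the versions listed above. The following is a minimal sketch and not part of the repository: the helper names `python_supported` and `missing_packages` are hypothetical, and the default package list is an assumption that simply mirrors the installs in this section.

```python
import importlib.util
import sys

# Python (major, minor) versions this README lists as supported.
SUPPORTED = {(3, 9), (3, 10)}


def python_supported(version_info=sys.version_info):
    """Return True if the interpreter's (major, minor) is 3.9 or 3.10."""
    return (version_info[0], version_info[1]) in SUPPORTED


def missing_packages(names=("torch", "federatedscope", "torch_geometric")):
    """Return the top-level module names from `names` that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    print("Python OK:", python_supported())
    print("Missing packages:", missing_packages() or "none")
```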
---

## 2. Run the Code

### Step 1: Prepare the Datasets

Download the PeMS datasets from the **[STSGCN repository](https://github.com/Davidham3/STSGCN)**.
After downloading, extract the datasets and place them in the `./data/trafficflow` directory.

The directory structure should be as follows:

```
FS-TFP/data/trafficflow
├─PeMS03
├─PeMS04
├─PeMS07
└─PeMS08
```
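A quick way to confirm the layout above before launching an experiment is to check for the four folders. This is a small sketch under the assumption that it is run from the repository root; `missing_datasets` is a hypothetical helper, not a function provided by the project.

```python
from pathlib import Path

# Folder names taken from the directory tree shown above.
EXPECTED = ("PeMS03", "PeMS04", "PeMS07", "PeMS08")


def missing_datasets(root="./data/trafficflow"):
    """Return the expected dataset folders that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_datasets()
    print("All datasets found." if not missing else f"Missing: {missing}")
```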
### Step 2: Configure the Settings

The run scripts for the four datasets are located in the `./scripts/trafficflow_exp_scripts/` directory.
Each dataset has a YAML configuration file: `{D3, D4, D7, D8}.yaml`.

You can customize the parameters or use the presets. Key configurable parameters include:

```yaml
# Line 3: GPU device to use (for multi-GPU machines)
device: 0

# Line 8: Total number of training rounds
total_round_num: <number_of_rounds>

# Line 9: Number of clients
client_num: <number_of_clients>

# Line 65: Training loss function
# Options: L1Loss, RMSE, MAPE
criterion:
  type: <loss_function>
```

> **Warning:** Processing the **PeMSD7** dataset may require more than **32GB RAM**.
> If your system lacks sufficient RAM, increase the size of the swap partition.
### Step 3: Run the Experiments

Use the following commands to run **FedDGCN**:

```bash
# Run experiments on different datasets
python federatedscope/main.py --cfg scripts/trafficflow_exp_scripts/D3.yaml  # PeMSD3
python federatedscope/main.py --cfg scripts/trafficflow_exp_scripts/D4.yaml  # PeMSD4
python federatedscope/main.py --cfg scripts/trafficflow_exp_scripts/D7.yaml  # PeMSD7
python federatedscope/main.py --cfg scripts/trafficflow_exp_scripts/D8.yaml  # PeMSD8
```
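To run the four experiments back to back, the commands above can be wrapped in a short driver. This is only a sketch: `build_command` and `run_all` are hypothetical helpers, the script is assumed to be started from the repository root, and running the datasets sequentially is a design choice made here to keep peak memory low, since PeMSD7 alone may need more than 32GB of RAM. Call `run_all()` to start the runs.

```python
import subprocess
import sys

# Config names taken from the commands above.
DATASETS = ("D3", "D4", "D7", "D8")


def build_command(name):
    """Build the command line for one dataset config."""
    cfg = f"scripts/trafficflow_exp_scripts/{name}.yaml"
    return [sys.executable, "federatedscope/main.py", "--cfg", cfg]


def run_all(names=DATASETS):
    """Run the experiments one after another, stopping at the first failure."""
    for name in names:
        # check=True raises CalledProcessError if a run exits with an error.
        subprocess.run(build_command(name), check=True)
```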
If you see output similar to the image below, congratulations! You have successfully run the experiment:

![Experiment Output](./figures/exp.png)

---

## 3. Visualize Results

The experiment logs will be saved in the **`exp`** folder.
We provide a script, **`global.py`**, in the same folder. Replace the old logs with the new logs from the experiments and run the script to visualize the results:

```bash
python exp/global.py
```

This will generate a **baseline.jpg** file that visualizes the logs.

The script needs matplotlib for plotting. To install it:

```bash
pip install matplotlib
```
---

## 4. Citation

TBD

---

## 5. Acknowledgements

Special thanks to the authors of [FederatedScope](https://github.com/alibaba/FederatedScope), upon which this project is built.
# How to Improve Our Framework?

We welcome the community to help expand and improve our framework! This guide outlines how to customize configurations, models, datasets, trainers, and loss functions.

---

## 📋 How to Add Configurations
1. **Create a YAML file**
   Use the `./scripts` folder as a reference. We recommend copying an existing configuration and modifying it.

2. **Update Core Configurations**
   Go to `./federatedscope/core/configs/`, locate `cfg_model`, and add your configurations with default values. For example:

   ```python
   cfg.model.num_nodes = 0
   cfg.model.rnn_units = 64
   cfg.model.dropout = 0.1
   ```

3. **Add Nested Parameters**
   If needed, create nested parameters using `CN()`:

   ```python
   from federatedscope.core.configs.config import CN

   cfg.model.next = CN()
   cfg.model.next.default = 1
   ```

4. **Sync Parameters Across Configs**
   Ensure the parameters in `cfg_model` align with those in other configs (e.g., `cfg_trafficflow`) to avoid compatibility issues across systems (Windows, Linux, etc.).

5. **Customize YAML**
   After adding parameters to config files (e.g., `cfg_data`, `cfg_training`), set their values in the corresponding YAML file.
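Steps 2 and 5 depend on each other: a default declared in the config files makes a key known, and the YAML file then overrides its value. The sketch below illustrates that relationship with plain dictionaries. It assumes yacs-style merge semantics, where a YAML key without a declared default is rejected; `merge_overrides` is a hypothetical helper, not FederatedScope's actual merge routine, and the override values are made up for illustration.

```python
def merge_overrides(defaults, overrides, path=""):
    """Return a copy of `defaults` updated with `overrides`.

    Unknown keys raise KeyError, mirroring the assumed rule that a YAML
    key must first be declared with a default value.
    """
    merged = dict(defaults)
    for key, value in overrides.items():
        full_key = f"{path}{key}"
        if key not in defaults:
            raise KeyError(f"Non-existent config key: {full_key}")
        if isinstance(defaults[key], dict) and isinstance(value, dict):
            # Recurse into nested sections such as `model`.
            merged[key] = merge_overrides(defaults[key], value, f"{full_key}.")
        else:
            merged[key] = value
    return merged


# Defaults, as they would be declared in cfg_model (values from step 2).
defaults = {"model": {"num_nodes": 0, "rnn_units": 64, "dropout": 0.1}}

# Values a YAML file might set (hypothetical numbers).
yaml_values = {"model": {"num_nodes": 170, "dropout": 0.2}}

cfg = merge_overrides(defaults, yaml_values)
print(cfg["model"])  # rnn_units keeps its default; the other two are overridden
```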
---
## 🛠️ How to Add a Model
1. **Create Your Model**
   Save your model in the `federatedscope/trafficflow/model/` folder (or another location of your choice).
   Then set `model:type` in the configuration YAML file to your model's name.

2. **Register the Model**
   In `federatedscope/core/auxiliaries/model_builder.py`, add a branch for your model to the existing `if`/`elif` chain (around line 214):

   ```python
   # Fragment: belongs inside the existing if/elif chain on model_config.type
   elif model_config.type.lower() in ['your_model']:
       from federatedscope.trafficflow.model.your_model import YourModel
       model = YourModel(model_config)
   ```
---
## 📊 How to Add a Dataset
1. **Create a DataLoader**
   Implement a function in `federatedscope/trafficflow/dataloader/` to generate data for clients. The function should return the client data together with the (possibly modified) config, as in the call below. The client data is a list of dictionaries (one per client) with `['train']`, `['val']`, and `['test']` datasets. Each dataset should be a `torch.utils.data.TensorDataset` containing `x` and `label` tensors.

2. **Register the DataLoader**
   Update `federatedscope/core/data/utils.py` (around line 108) to import and call your dataloader:

   ```python
   # Fragment: belongs inside the existing if/elif chain on config.data.type
   elif config.data.type.lower() in ['trafficflow']:
       from federatedscope.trafficflow.dataloader.traffic_dataloader import load_traffic_data
       dataset, modified_config = load_traffic_data(config, client_cfgs)
   ```
---
## 🎓 How to Customize Your Trainer
1. **Create a Custom Trainer**
   Implement your trainer in `federatedscope/trafficflow/trainer/`. Inherit from an existing trainer, e.g.:

   ```python
   from federatedscope.core.trainers.torch_trainer import GeneralTorchTrainer as Trainer


   class TrafficflowTrainer(Trainer):
       # Override methods or hooks as needed
       pass
   ```

2. **Reference Examples**
   Review existing trainers (for example, those under `federatedscope/core/trainers/`) for additional guidance.
---
## 🔧 How to Add a Custom Loss Function
1. **Implement the Loss Function**
   Create a file in `federatedscope/contrib/loss/` and write your custom loss function.

2. **Register the Loss Function**
   Register your loss function using `register_criterion`:

   ```python
   from federatedscope.register import register_criterion

   # `call_my_criterion` is the builder function you wrote in step 1.
   # Register it once for each loss name it can build.
   register_criterion('RMSE', call_my_criterion)
   register_criterion('MAPE', call_my_criterion)
   ```

3. **Update Configuration**
   Set the loss function in the configuration YAML file using `criterion:type`.
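For reference, the RMSE and MAPE names registered above conventionally denote the two formulas sketched below. This is a plain-Python illustration of the standard definitions, not the repository's implementation. Skipping near-zero targets in MAPE (the `eps` parameter) is an assumption made here, because the percentage error is undefined when the true value is zero. `mape` returns a fraction, so multiply by 100 for a percentage.

```python
import math


def rmse(pred, true):
    """Root mean squared error over two equal-length sequences."""
    if len(pred) != len(true) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred))


def mape(pred, true, eps=1e-8):
    """Mean absolute percentage error, skipping targets with |t| <= eps."""
    terms = [abs((p - t) / t) for p, t in zip(pred, true) if abs(t) > eps]
    if not terms:
        raise ValueError("no non-zero targets to evaluate")
    return sum(terms) / len(terms)
```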
---
By following these steps, you can extend and customize the framework to meet your needs. Happy coding! 🎉