# FederatedScope-GNN: Towards a Unified, Comprehensive and Efficient Package for Federated Graph Learning
FederatedScope-GNN (FS-G) is a unified, comprehensive, and efficient package for federated graph learning. We provide a hands-on tutorial here; for a more detailed tutorial, please refer to the [FGL Tutorial](https://federatedscope.io/docs/graph/).
## Quick Start
Let's start with a two-layer GCN on FedCora to get familiar with FS-G.
### Step 1. Installation
The installation of FS-G follows that of FederatedScope; please refer to [Installation](https://github.com/alibaba/FederatedScope#step-1-installation).
After installing the minimal version of FederatedScope, install the extra dependencies ([PyG](https://github.com/pyg-team/pytorch_geometric), rdkit, and nltk) required by the FGL application version:
```bash
conda install -y pyg==2.0.4 -c pyg
conda install -y rdkit=2021.09.4=py39hccf6a74_0 -c conda-forge
conda install -y nltk
```
Now, you have successfully installed the FGL version of FederatedScope.
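To double-check that the extra dependencies are importable, you can run a quick (optional) sanity check:
```bash
python -c "import torch_geometric, rdkit, nltk; print(torch_geometric.__version__)"
```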
### Step 2. Run with an example config
Now, we train a two-layer GCN on FedCora with FedAvg.
```bash
python federatedscope/main.py --cfg federatedscope/gfl/baseline/example.yaml
```
For more details about customized configurations, see **Advanced**.
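Configuration fields can also be overridden from the command line by appending `key value` pairs after the yaml file; the option names below are taken from the example configuration shown in **Advanced**:
```bash
# E.g., pick GPU 0 and shorten training to 100 communication rounds
python federatedscope/main.py --cfg federatedscope/gfl/baseline/example.yaml \
    device 0 federate.total_round_num 100
```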
## Reproduce the results in our paper
We provide scripts (which run a grid search to find the optimal hyper-parameters) for reproducing the experimental results in our paper; a note on the positional arguments follows the examples below.
* For node-level tasks, please refer to `federatedscope/gfl/baseline/repro_exp/node_level/`:
```bash
# Example of FedAvg (Louvain splitter)
cd federatedscope/gfl/baseline/repro_exp/node_level/
bash run_node_level.sh 0 cora louvain
# Example of FedAvg (random splitter)
bash run_node_level.sh 0 cora random
# Example of FedOpt
bash run_node_level_opt.sh 0 cora louvain gcn 0.25 4
# Example of FedProx
bash run_node_level_prox.sh 0 cora louvain gcn 0.25 4
```
* For link-level tasks, please refer to `federatedscope/gfl/baseline/repro_exp/link_level/`:
```bash
cd federatedscope/gfl/baseline/repro_exp/link_level/
# Example of FedAvg
bash run_link_level_KG.sh 0 wn18 rel_type
# Example of FedOpt
bash run_link_level_opt.sh 0 wn18 rel_type gcn 0.25 16
# Example of FedProx
bash run_link_level_prox.sh 7 wn18 rel_type gcn 0.25 16
```
* For graph-level tasks, please refer to `federatedscope/gfl/baseline/repro_exp/graph_level/`:
```bash
cd federatedscope/gfl/baseline/repro_exp/graph_level/
# Example of FedAvg
bash run_graph_level.sh 0 proteins
# Example of FedOpt
bash run_graph_level_opt.sh 0 proteins gcn 0.25 4
# Example of FedProx
bash run_graph_level_prox.sh 0 proteins gcn 0.25 4
```
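As a reading aid only (inferred from the example configuration in **Advanced** rather than from the scripts themselves, so please check each script for the authoritative argument order), the positional arguments of the node-level scripts appear to be:
```bash
# <GPU id> <dataset> <splitter> for FedAvg,
# plus <model> <lr> <local_update_steps> for FedOpt / FedProx
bash run_node_level_opt.sh 0 cora louvain gcn 0.25 4
```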
## Advanced
### Start with built-in functions
You can easily run FS-G with a customized `yaml` file:
```yaml
# Whether to use GPU
use_gpu: True
# Which GPU to use
device: 0
# Federated learning related options
federate:
  # `standalone` or `distributed`
  mode: standalone
  # Whether to evaluate on the server side (global test set) instead of on each client
  make_global_eval: True
  # Number of clients the dataset is split into
  client_num: 5
  # Number of communication rounds
  total_round_num: 400
# Dataset related options
data:
  # Root directory where the data is stored
  root: data/
  # Dataset name
  type: cora
  # Use the Louvain algorithm to split `Cora`
  splitter: 'louvain'
  # Full-batch training, so batch_size should be `1`
  batch_size: 1
# Model related options
model:
  # Model type
  type: gcn
  # Hidden dimension
  hidden: 64
  # Dropout rate
  dropout: 0.5
  # Number of classes of `Cora`
  out_channels: 7
# Criterion related options
criterion:
  # Criterion type
  type: CrossEntropyLoss
# Trainer related options
trainer:
  # Trainer type
  type: nodefullbatch_trainer
# Training related options
train:
  # Number of local update steps
  local_update_steps: 4
  # Optimizer related options
  optimizer:
    # Learning rate
    lr: 0.25
    # Weight decay
    weight_decay: 0.0005
    # Optimizer type
    type: SGD
# Evaluation related options
eval:
  # Frequency of evaluation
  freq: 1
  # Evaluation metrics: accuracy and number of correct predictions
  metrics: ['acc', 'correct']
```
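For instance, if you save the configuration above as `my_cora_gcn.yaml` (the file name is only for illustration), it can be launched in the same way as the built-in example:
```bash
python federatedscope/main.py --cfg my_cora_gcn.yaml
```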
### Start with customized functions
FS-G also provides a `register` mechanism so that you can plug your own modules into the FL pipeline. Here is an example of how to run your own model and data with FS-G.
* Load your data (write in `federatedscope/contrib/data/`):
```python
import copy
import numpy as np

from torch_geometric.datasets import Planetoid

from federatedscope.core.splitters.graph import LouvainSplitter
from federatedscope.register import register_data


def my_cora(config=None):
    path = config.data.root

    num_split = [232, 542, np.iinfo(np.int64).max]
    dataset = Planetoid(path,
                        'cora',
                        split='random',
                        num_train_per_class=num_split[0],
                        num_val=num_split[1],
                        num_test=num_split[2])
    global_data = copy.deepcopy(dataset)[0]
    dataset = LouvainSplitter(config.federate.client_num)(dataset[0])

    data_local_dict = dict()
    for client_idx in range(len(dataset)):
        data_local_dict[client_idx + 1] = dataset[client_idx]

    # Index 0 keeps the global data (clients are indexed from 1)
    data_local_dict[0] = global_data

    return data_local_dict, config


def call_my_data(config):
    if config.data.type == "mycora":
        data, modified_config = my_cora(config)
        return data, modified_config


register_data("mycora", call_my_data)
```
* Build your model (write in `federatedscope/contrib/model/`):
```python
import torch
import torch.nn.functional as F
from torch.nn import ModuleList
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

from federatedscope.register import register_model


class MyGCN(torch.nn.Module):
    def __init__(self,
                 in_channels,
                 out_channels,
                 hidden=64,
                 max_depth=2,
                 dropout=.0):
        super(MyGCN, self).__init__()
        self.convs = ModuleList()
        for i in range(max_depth):
            if i == 0:
                self.convs.append(GCNConv(in_channels, hidden))
            elif (i + 1) == max_depth:
                self.convs.append(GCNConv(hidden, out_channels))
            else:
                self.convs.append(GCNConv(hidden, hidden))
        self.dropout = dropout

    def forward(self, data):
        if isinstance(data, Data):
            x, edge_index = data.x, data.edge_index
        elif isinstance(data, tuple):
            x, edge_index = data
        else:
            raise TypeError('Unsupported data type!')

        for i, conv in enumerate(self.convs):
            x = conv(x, edge_index)
            # No activation/dropout after the last layer
            if (i + 1) == len(self.convs):
                break
            x = F.relu(F.dropout(x, p=self.dropout, training=self.training))
        return x


def gcnbuilder(model_config, input_shape):
    x_shape, num_label, num_edge_features = input_shape
    model = MyGCN(x_shape[-1],
                  model_config.out_channels,
                  hidden=model_config.hidden,
                  max_depth=model_config.layer,
                  dropout=model_config.dropout)
    return model


def call_my_net(model_config, local_data):
    # Please name your gnn model with the prefix 'gnn_'
    if model_config.type == "gnn_mygcn":
        model = gcnbuilder(model_config, local_data)
        return model


register_model("gnn_mygcn", call_my_net)
```
* Run the following command to start (a yaml-based alternative is sketched after it):
```bash
python federatedscope/main.py --cfg federatedscope/gfl/baseline/example.yaml data.type mycora model.type gnn_mygcn
```
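Equivalently, the overrides can live in a dedicated yaml file. The sketch below reuses the option names from the example configuration in **Advanced**, and assumes `model.layer` is the option consumed as `model_config.layer` by `gcnbuilder` above:
```yaml
data:
  root: data/
  type: mycora        # resolved by `call_my_data`
model:
  type: gnn_mygcn     # resolved by `call_my_net`
  hidden: 64
  layer: 2            # passed to MyGCN as `max_depth`
  out_channels: 7
  dropout: 0.5
```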
## Publications
If you find FS-G useful for research or development, please cite the following [paper](https://arxiv.org/abs/2204.05562):
```latex
@inproceedings{federatedscopegnn,
  title = {FederatedScope-GNN: Towards a Unified, Comprehensive and Efficient Package for Federated Graph Learning},
  author = {Zhen Wang and Weirui Kuang and Yuexiang Xie and Liuyi Yao and Yaliang Li and Bolin Ding and Jingren Zhou},
  booktitle = {Proc.\ of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'22)},
  year = {2022}
}
```