# FederatedScope-GNN: Towards a Unified, Comprehensive and Efficient Package for Federated Graph Learning
FederatedScope-GNN (FS-G) is a unified, comprehensive, and efficient package for federated graph learning. We provide a hands-on tutorial here; for a more detailed tutorial, please refer to the [FGL Tutorial](https://federatedscope.io/docs/graph/).
## Quick Start
Let's start with a two-layer GCN on FedCora to get familiar with FS-G.
### Step 1. Installation
The installation of FS-G follows that of FederatedScope; please refer to [Installation](https://github.com/alibaba/FederatedScope#step-1-installation).
After installing the minimal version of FederatedScope, install the extra dependencies ([PyG](https://github.com/pyg-team/pytorch_geometric), rdkit, and nltk) required by the FGL application version:
```bash
conda install -y pyg==2.0.4 -c pyg
conda install -y rdkit=2021.09.4=py39hccf6a74_0 -c conda-forge
conda install -y nltk
```
Now, you have successfully installed the FGL version of FederatedScope.
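To double-check that the extra dependencies are importable, you can run a quick (optional) sanity check:
```bash
python -c "import torch_geometric, rdkit, nltk; print(torch_geometric.__version__)"
```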
### Step 2. Run with an example config
Now, we train a two-layer GCN on FedCora with FedAvg.
```bash
python federatedscope/main.py --cfg federatedscope/gfl/baseline/example.yaml
```
For more details about customized configurations, see **Advanced**.
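Configuration fields can also be overridden from the command line by appending `key value` pairs after the yaml file; the option names below are taken from the example configuration shown in **Advanced**:
```bash
# E.g., pick GPU 0 and shorten training to 100 communication rounds
python federatedscope/main.py --cfg federatedscope/gfl/baseline/example.yaml \
    device 0 federate.total_round_num 100
```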
## Reproduce the results in our paper
We provide scripts (which run a grid search to find the optimal hyper-parameters) for reproducing the experimental results in our paper; a note on the positional arguments follows the examples below.
* For node-level tasks, please refer to `federatedscope/gfl/baseline/repro_exp/node_level/`:
```bash
# Example of FedAvg (Louvain splitter)
cd federatedscope/gfl/baseline/repro_exp/node_level/
bash run_node_level.sh 0 cora louvain
# Example of FedAvg (random splitter)
bash run_node_level.sh 0 cora random
# Example of FedOpt
bash run_node_level_opt.sh 0 cora louvain gcn 0.25 4
# Example of FedProx
bash run_node_level_prox.sh 0 cora louvain gcn 0.25 4
```
* For link-level tasks, please refer to `federatedscope/gfl/baseline/repro_exp/link_level/`:
```bash
cd federatedscope/gfl/baseline/repro_exp/link_level/
# Example of FedAvg
bash run_link_level_KG.sh 0 wn18 rel_type
# Example of FedOpt
bash run_link_level_opt.sh 0 wn18 rel_type gcn 0.25 16
# Example of FedProx
bash run_link_level_prox.sh 7 wn18 rel_type gcn 0.25 16
```
* For graph-level tasks, please refer to `federatedscope/gfl/baseline/repro_exp/graph_level/`:
```bash
cd federatedscope/gfl/baseline/repro_exp/graph_level/
# Example of FedAvg
bash run_graph_level.sh 0 proteins
# Example of FedOpt
bash run_graph_level_opt.sh 0 proteins gcn 0.25 4
# Example of FedProx
bash run_graph_level_prox.sh 0 proteins gcn 0.25 4
```
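As a reading aid only (inferred from the example configuration in **Advanced** rather than from the scripts themselves, so please check each script for the authoritative argument order), the positional arguments of the node-level scripts appear to be:
```bash
# <GPU id> <dataset> <splitter> for FedAvg,
# plus <model> <lr> <local_update_steps> for FedOpt / FedProx
bash run_node_level_opt.sh 0 cora louvain gcn 0.25 4
```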
## Advanced
### Start with built-in functions
You can easily run FS-G with a customized `yaml` file:
```yaml
# Whether to use GPU
use_gpu: True
# Which GPU to use
device: 0
# Federated learning related options
federate:
  # `standalone` or `distributed`
  mode: standalone
  # Whether to evaluate on the server side (global test set) instead of on each client
  make_global_eval: True
  # Number of clients the dataset is split into
  client_num: 5
  # Number of communication rounds
  total_round_num: 400
# Dataset related options
data:
  # Root directory where the data is stored
  root: data/
  # Dataset name
  type: cora
  # Use the Louvain algorithm to split `Cora`
  splitter: 'louvain'
  # Full-batch training, so batch_size should be `1`
  batch_size: 1
# Model related options
model:
  # Model type
  type: gcn
  # Hidden dimension
  hidden: 64
  # Dropout rate
  dropout: 0.5
  # Number of classes of `Cora`
  out_channels: 7
# Criterion related options
criterion:
  # Criterion type
  type: CrossEntropyLoss
# Trainer related options
trainer:
  # Trainer type
  type: nodefullbatch_trainer
# Training related options
train:
  # Number of local update steps
  local_update_steps: 4
  # Optimizer related options
  optimizer:
    # Learning rate
    lr: 0.25
    # Weight decay
    weight_decay: 0.0005
    # Optimizer type
    type: SGD
# Evaluation related options
eval:
  # Frequency of evaluation
  freq: 1
  # Evaluation metrics: accuracy and number of correct predictions
  metrics: ['acc', 'correct']
```
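For instance, if you save the configuration above as `my_cora_gcn.yaml` (the file name is only for illustration), it can be launched in the same way as the built-in example:
```bash
python federatedscope/main.py --cfg my_cora_gcn.yaml
```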
### Start with customized functions
FS-G also provides a `register` mechanism so that you can plug your own modules into the FL pipeline. Here is an example of how to run your own model and data with FS-G.
* Load your data (write in `federatedscope/contrib/data/`):
```python
import copy
import numpy as np

from torch_geometric.datasets import Planetoid

from federatedscope.core.splitters.graph import LouvainSplitter
from federatedscope.register import register_data


def my_cora(config=None):
    path = config.data.root

    num_split = [232, 542, np.iinfo(np.int64).max]
    dataset = Planetoid(path,
                        'cora',
                        split='random',
                        num_train_per_class=num_split[0],
                        num_val=num_split[1],
                        num_test=num_split[2])
    global_data = copy.deepcopy(dataset)[0]
    dataset = LouvainSplitter(config.federate.client_num)(dataset[0])

    data_local_dict = dict()
    for client_idx in range(len(dataset)):
        data_local_dict[client_idx + 1] = dataset[client_idx]

    # Index 0 keeps the global data (clients are indexed from 1)
    data_local_dict[0] = global_data

    return data_local_dict, config


def call_my_data(config):
    if config.data.type == "mycora":
        data, modified_config = my_cora(config)
        return data, modified_config


register_data("mycora", call_my_data)
```
* Build your model (write in `federatedscope/contrib/model/`):
```python
import torch
import torch.nn.functional as F
from torch.nn import ModuleList
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

from federatedscope.register import register_model


class MyGCN(torch.nn.Module):
    def __init__(self,
                 in_channels,
                 out_channels,
                 hidden=64,
                 max_depth=2,
                 dropout=.0):
        super(MyGCN, self).__init__()
        self.convs = ModuleList()
        for i in range(max_depth):
            if i == 0:
                self.convs.append(GCNConv(in_channels, hidden))
            elif (i + 1) == max_depth:
                self.convs.append(GCNConv(hidden, out_channels))
            else:
                self.convs.append(GCNConv(hidden, hidden))
        self.dropout = dropout

    def forward(self, data):
        if isinstance(data, Data):
            x, edge_index = data.x, data.edge_index
        elif isinstance(data, tuple):
            x, edge_index = data
        else:
            raise TypeError('Unsupported data type!')

        for i, conv in enumerate(self.convs):
            x = conv(x, edge_index)
            # No activation/dropout after the last layer
            if (i + 1) == len(self.convs):
                break
            x = F.relu(F.dropout(x, p=self.dropout, training=self.training))
        return x


def gcnbuilder(model_config, input_shape):
    x_shape, num_label, num_edge_features = input_shape
    model = MyGCN(x_shape[-1],
                  model_config.out_channels,
                  hidden=model_config.hidden,
                  max_depth=model_config.layer,
                  dropout=model_config.dropout)
    return model


def call_my_net(model_config, local_data):
    # Please name your gnn model with the prefix 'gnn_'
    if model_config.type == "gnn_mygcn":
        model = gcnbuilder(model_config, local_data)
        return model


register_model("gnn_mygcn", call_my_net)
```
* Run the following command to start (a yaml-based alternative is sketched after it):
```bash
python federatedscope/main.py --cfg federatedscope/gfl/baseline/example.yaml data.type mycora model.type gnn_mygcn
```
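Equivalently, the overrides can live in a dedicated yaml file. The sketch below reuses the option names from the example configuration in **Advanced**, and assumes `model.layer` is the option consumed as `model_config.layer` by `gcnbuilder` above:
```yaml
data:
  root: data/
  type: mycora        # resolved by `call_my_data`
model:
  type: gnn_mygcn     # resolved by `call_my_net`
  hidden: 64
  layer: 2            # passed to MyGCN as `max_depth`
  out_channels: 7
  dropout: 0.5
```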
## Publications
If you find FS-G useful for research or development, please cite the following [paper](https://arxiv.org/abs/2204.05562):
```latex
@inproceedings{federatedscopegnn,
  title = {FederatedScope-GNN: Towards a Unified, Comprehensive and Efficient Package for Federated Graph Learning},
  author = {Zhen Wang and Weirui Kuang and Yuexiang Xie and Liuyi Yao and Yaliang Li and Bolin Ding and Jingren Zhou},
  booktitle = {Proc.\ of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'22)},
  year = {2022}
}
```