# Model Migration Tutorial

This tutorial uses [STGODE](https://github.com/square-coder/STGODE/tree/main) as the example.

1. [Create the model](#create-the-model)
2. [Update imports](#update-imports)
3. [Modify the outermost Module and create the config file](#modify-the-outermost-module-and-create-the-config-file)
4. [Passing in the graph](#passing-in-the-graph)
5. [Add your model to model_selector](#add-your-model-to-model_selector)
6. [Adjust input and output shapes](#adjust-input-and-output-shapes)
7. [Start training](#start-training)

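# Create the model

Decide on a model name, `{model_name}`. All-uppercase is recommended. Remember this name, because it is reused for the model folder, the config folder, and the `model_selector` entry.

Create a folder named `{model_name}` under the `model` folder. The model files go here.

STGODE has two model files, `model.py` and `odegcn.py`. Copy both into `TrafficWheel/model/STGODE`. The later sections import the main file as `model.STGODE.STGODE`, so `model.py` is evidently renamed to `STGODE.py` after copying.
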
# Update imports

In the original repository, `model.py` imports the `ODEG` module from `odegcn.py`.

```python
from odegcn import ODEG
```

After migrating to TrafficWheel, the import must point to `odegcn` inside our model folder.

```python
from model.STGODE.odegcn import ODEG
```

# Modify the outermost Module and create the config file

We only care about the outermost `nn.Module` class. For STGODE this is `class ODEGCN(nn.Module):`

```python
class ODEGCN(nn.Module):
    """ the overall network framework """
    def __init__(self, num_nodes, num_features, num_timesteps_input,
                 num_timesteps_output, A_sp_hat, A_se_hat):
        """
        Args:
            num_nodes : number of nodes in the graph
            num_features : number of features at each node in each time step
            num_timesteps_input : number of past time steps fed into the network
            num_timesteps_output : desired number of future time steps output by the network
            A_sp_hat : normalized adjacency spatial matrix
            A_se_hat : normalized adjacency semantic matrix
        """

        super(ODEGCN, self).__init__()
        # spatial graph
        self.sp_blocks = nn.ModuleList(
            [nn.Sequential(
                STGCNBlock(in_channels=num_features, out_channels=[64, 32, 64],
                           num_nodes=num_nodes, A_hat=A_sp_hat),
                STGCNBlock(in_channels=64, out_channels=[64, 32, 64],
                           num_nodes=num_nodes, A_hat=A_sp_hat)) for _ in range(3)
            ])
        # semantic graph
        self.se_blocks = nn.ModuleList([nn.Sequential(
            STGCNBlock(in_channels=num_features, out_channels=[64, 32, 64],
                       num_nodes=num_nodes, A_hat=A_se_hat),
            STGCNBlock(in_channels=64, out_channels=[64, 32, 64],
                       num_nodes=num_nodes, A_hat=A_se_hat)) for _ in range(3)
        ])

        self.pred = nn.Sequential(
            nn.Linear(num_timesteps_input * 64, num_timesteps_output * 32),
            nn.ReLU(),
            nn.Linear(num_timesteps_output * 32, num_timesteps_output)
        )
```

Start with `__init__`, that is, the model definition, and look at what it receives. Here it takes quite a few arguments:

```python
def __init__(self, num_nodes, num_features, num_timesteps_input,
             num_timesteps_output, A_sp_hat, A_se_hat):
```

Fortunately the author documented them. TrafficWheel passes only a single dictionary, `args`, when it constructs a model, and every parameter is read from that dictionary.

Change the `__init__` signature to take the `args` dictionary and read each parameter with `args['key']`, like this:

```python
def __init__(self, args):
    super(ODEGCN, self).__init__()
    num_nodes = args['num_nodes']
    num_features = args['num_features']
    num_timesteps_input = args['history']
    num_timesteps_output = args['horizon']
    A_sp_hat, A_se_hat = get_A_hat(args)  # covered later
```

Here `args` is read from the config file `config/STGODE/PEMSDX.yaml`. The easiest way to write the config is to create a folder under `config` with the same name as the model (`STGODE`), then copy parameters from the yaml file of an existing model. In this example, `PEMSD4.yaml` was copied from STGNCDE into `config/STGODE`. The file name must match the dataset name, which can be any of PEMSD{3,4,7,8}.

The `model` section of the STGNCDE config looks like this:

```yaml
model:
  type: type1
  g_type: agc
  input_dim: 2
  output_dim: 1
  embed_dim: 10
  hid_dim: 128
  hid_hid_dim: 128
  num_layers: 2
  cheb_k: 2
  solver: rk4
```

Keep the `input_dim` and `output_dim` entries. `num_nodes` is copied from the `data` section into `model` automatically, so you do not need to write it.

Delete the remaining parameters, then add the keys that `__init__` reads, together with their values:

```yaml
model:
  input_dim: 1
  output_dim: 1
  history: 12
  horizon: 12
  num_features: 1
```

If a submodule needs new parameters later, add them to the config, pass `args` to the submodule, and read them from `args` in the submodule's `__init__`.

YAML supports integers, booleans `{True, False}`, strings, and lists such as `[[64,64,64],[64,64,64],[64,64,64]]`. The values are converted to the corresponding Python types automatically when they are read through `args`.

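As a rough illustration of what the model ends up receiving, the sketch below builds the `args` dictionary by hand. The `config` literal stands in for the parsed YAML file, and `build_model_args` is a hypothetical helper, not a TrafficWheel function; it only mimics the documented behaviour of copying `num_nodes` from the `data` section into `model`.

```python
# Hypothetical stand-in for the parsed config/STGODE/PEMSD4.yaml.
# The real file may contain more sections and keys.
config = {
    "data": {"num_nodes": 307},
    "model": {
        "input_dim": 1,
        "output_dim": 1,
        "history": 12,
        "horizon": 12,
        "num_features": 1,
    },
}


def build_model_args(config):
    """Return a copy of the model section with num_nodes taken from data."""
    args = dict(config["model"])  # shallow copy; the config itself is untouched
    args["num_nodes"] = config["data"]["num_nodes"]
    return args


args = build_model_args(config)

# The same lookups that the migrated __init__ performs.
num_nodes = args["num_nodes"]
num_timesteps_input = args["history"]
num_timesteps_output = args["horizon"]
print(num_nodes, num_timesteps_input, num_timesteps_output)
```

Because the values are read with plain dictionary lookups, a key that `__init__` reads but the YAML file lacks fails with a `KeyError` when the model is constructed.
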
# Passing in the graph

One tricky part of rewriting `__init__` is that the original code passes in two matrices as tensors. YAML cannot represent tensors, and writing a whole matrix into the YAML file is not practical either.

We therefore write our own function to build the matrices. Create a new file in `model/STGODE`; I named it `adj.py`. In it, define a function `get_A_hat` that takes `args` and returns the two required matrices.

Import this function in `STGODE.py` and call it.

```py
from model.STGODE.adj import get_A_hat
```

```python
def __init__(self, args):
    ...  # omitted
    A_sp_hat, A_se_hat = get_A_hat(args)
```

Next, look at how the original code builds the matrices and passes them to `__init__`.

Lines 77 to 87 of `run_stode.py` in the original code:

```python
data, mean, std, dtw_matrix, sp_matrix = read_data(args)
train_loader, valid_loader, test_loader = generate_dataset(data, args)
A_sp_wave = get_normalized_adj(sp_matrix).to(device)
A_se_wave = get_normalized_adj(dtw_matrix).to(device)

net = ODEGCN(num_nodes=data.shape[1],
             num_features=data.shape[2],
             num_timesteps_input=args.his_length,
             num_timesteps_output=args.pred_length,
             A_sp_hat=A_sp_wave,
             A_se_hat=A_se_wave)
```

The code first calls `read_data(args)`, which returns the two matrices `dtw_matrix` and `sp_matrix` among its results, and then calls `get_normalized_adj(matrix)` to normalize each of them.

Both `read_data` and `get_normalized_adj` come from `utils.py`, so that is where to find them:

```python
from utils import generate_dataset, read_data, get_normalized_adj
```

Copy these two functions into `adj.py` in TrafficWheel. `read_data`, together with the `files` table it uses, looks like this:

```python
# imports used by the code below
import csv
import os

import numpy as np
import torch
from fastdtw import fastdtw
from tqdm import tqdm

files = {
    'pems03': ['PEMS03/pems03.npz', 'PEMS03/distance.csv'],
    'pems04': ['PEMS04/pems04.npz', 'PEMS04/distance.csv'],
    'pems07': ['PEMS07/pems07.npz', 'PEMS07/distance.csv'],
    'pems08': ['PEMS08/pems08.npz', 'PEMS08/distance.csv'],
    'pemsbay': ['PEMSBAY/pems_bay.npz', 'PEMSBAY/distance.csv'],
    'pemsD7M': ['PeMSD7M/PeMSD7M.npz', 'PeMSD7M/distance.csv'],
    'pemsD7L': ['PeMSD7L/PeMSD7L.npz', 'PeMSD7L/distance.csv']
}

def read_data(args):
    """read data, generate spatial adjacency matrix and semantic adjacency matrix by dtw

    Args:
        sigma1: float, default=0.1, sigma for the semantic matrix
        sigma2: float, default=10, sigma for the spatial matrix
        thres1: float, default=0.6, the threshold for the semantic matrix
        thres2: float, default=0.5, the threshold for the spatial matrix

    Returns:
        data: tensor, T * N * 1
        dtw_matrix: array, semantic adjacency matrix
        sp_matrix: array, spatial adjacency matrix
    """
    filename = args.filename
    file = files[filename]
    filepath = "./data/"
    if args.remote:
        filepath = '/home/lantu.lqq/ftemp/data/'
    data = np.load(filepath + file[0])['data']

    num_node = data.shape[1]
    mean_value = np.mean(data, axis=(0, 1)).reshape(1, 1, -1)
    std_value = np.std(data, axis=(0, 1)).reshape(1, 1, -1)
    data = (data - mean_value) / std_value
    mean_value = mean_value.reshape(-1)[0]
    std_value = std_value.reshape(-1)[0]

    if not os.path.exists(f'data/{filename}_dtw_distance.npy'):
        data_mean = np.mean([data[:, :, 0][24*12*i: 24*12*(i+1)] for i in range(data.shape[0]//(24*12))], axis=0)
        data_mean = data_mean.squeeze().T
        dtw_distance = np.zeros((num_node, num_node))
        for i in tqdm(range(num_node)):
            for j in range(i, num_node):
                dtw_distance[i][j] = fastdtw(data_mean[i], data_mean[j], radius=6)[0]
        for i in range(num_node):
            for j in range(i):
                dtw_distance[i][j] = dtw_distance[j][i]
        np.save(f'data/{filename}_dtw_distance.npy', dtw_distance)

    dist_matrix = np.load(f'data/{filename}_dtw_distance.npy')

    mean = np.mean(dist_matrix)
    std = np.std(dist_matrix)
    dist_matrix = (dist_matrix - mean) / std
    sigma = args.sigma1
    dist_matrix = np.exp(-dist_matrix ** 2 / sigma ** 2)
    dtw_matrix = np.zeros_like(dist_matrix)
    dtw_matrix[dist_matrix > args.thres1] = 1

    if not os.path.exists(f'data/{filename}_spatial_distance.npy'):
        with open(filepath + file[1], 'r') as fp:
            # float('inf') rather than np.float('inf'): the np.float alias was removed in NumPy 1.24
            dist_matrix = np.zeros((num_node, num_node)) + float('inf')
            file = csv.reader(fp)
            for line in file:  # skip the header row
                break
            for line in file:
                start = int(line[0])
                end = int(line[1])
                dist_matrix[start][end] = float(line[2])
                dist_matrix[end][start] = float(line[2])
            np.save(f'data/{filename}_spatial_distance.npy', dist_matrix)

    dist_matrix = np.load(f'data/{filename}_spatial_distance.npy')
    # normalization
    std = np.std(dist_matrix[dist_matrix != float('inf')])
    mean = np.mean(dist_matrix[dist_matrix != float('inf')])
    dist_matrix = (dist_matrix - mean) / std
    sigma = args.sigma2
    sp_matrix = np.exp(- dist_matrix**2 / sigma**2)
    sp_matrix[sp_matrix < args.thres2] = 0
    # np.save(f'data/{filename}_sp_c_matrix.npy', sp_matrix)
    # sp_matrix = np.load(f'data/{filename}_sp_c_matrix.npy')

    print(f'average degree of spatial graph is {np.sum(sp_matrix > 0)/2/num_node}')
    print(f'average degree of semantic graph is {np.sum(dtw_matrix > 0)/2/num_node}')
    return torch.from_numpy(data.astype(np.float32)), mean_value, std_value, dtw_matrix, sp_matrix
```

Because the function is long, we trim what we do not need. In the end only the two matrices `dtw_matrix` and `sp_matrix` are returned, so the remote-path branch, the scalar `mean_value` and `std_value` reshapes, the print statements, and the extra return values are removed. Loading and normalizing `data` stays, because the DTW distance is computed from it.

```python
def read_data(args):
    """read data, generate spatial adjacency matrix and semantic adjacency matrix by dtw

    Args:
        sigma1: float, default=0.1, sigma for the semantic matrix
        sigma2: float, default=10, sigma for the spatial matrix
        thres1: float, default=0.6, the threshold for the semantic matrix
        thres2: float, default=0.5, the threshold for the spatial matrix

    Returns:
        dtw_matrix: array, semantic adjacency matrix
        sp_matrix: array, spatial adjacency matrix
    """
    files = {
        'pems03': ['PEMS03/pems03.npz', 'PEMS03/distance.csv'],
        'pems04': ['PEMS04/pems04.npz', 'PEMS04/distance.csv'],
        'pems07': ['PEMS07/pems07.npz', 'PEMS07/distance.csv'],
        'pems08': ['PEMS08/pems08.npz', 'PEMS08/distance.csv'],
        'pemsbay': ['PEMSBAY/pems_bay.npz', 'PEMSBAY/distance.csv'],
        'pemsD7M': ['PeMSD7M/PeMSD7M.npz', 'PeMSD7M/distance.csv'],
        'pemsD7L': ['PeMSD7L/PeMSD7L.npz', 'PeMSD7L/distance.csv']
    }

    filename = args.filename
    file = files[filename]
    filepath = "./data/"
    data = np.load(filepath + file[0])['data']

    num_node = data.shape[1]
    mean_value = np.mean(data, axis=(0, 1)).reshape(1, 1, -1)
    std_value = np.std(data, axis=(0, 1)).reshape(1, 1, -1)
    data = (data - mean_value) / std_value

    if not os.path.exists(f'data/{filename}_dtw_distance.npy'):
        data_mean = np.mean([data[:, :, 0][24 * 12 * i: 24 * 12 * (i + 1)] for i in range(data.shape[0] // (24 * 12))],
                            axis=0)
        data_mean = data_mean.squeeze().T
        dtw_distance = np.zeros((num_node, num_node))
        for i in tqdm(range(num_node)):
            for j in range(i, num_node):
                dtw_distance[i][j] = fastdtw(data_mean[i], data_mean[j], radius=6)[0]
        for i in range(num_node):
            for j in range(i):
                dtw_distance[i][j] = dtw_distance[j][i]
        np.save(f'data/{filename}_dtw_distance.npy', dtw_distance)

    dist_matrix = np.load(f'data/{filename}_dtw_distance.npy')

    mean = np.mean(dist_matrix)
    std = np.std(dist_matrix)
    dist_matrix = (dist_matrix - mean) / std
    sigma = args.sigma1
    dist_matrix = np.exp(-dist_matrix ** 2 / sigma ** 2)
    dtw_matrix = np.zeros_like(dist_matrix)
    dtw_matrix[dist_matrix > args.thres1] = 1

    if not os.path.exists(f'data/{filename}_spatial_distance.npy'):
        with open(filepath + file[1], 'r') as fp:
            dist_matrix = np.zeros((num_node, num_node)) + float('inf')
            file = csv.reader(fp)
            for line in file:  # skip the header row
                break
            for line in file:
                start = int(line[0])
                end = int(line[1])
                dist_matrix[start][end] = float(line[2])
                dist_matrix[end][start] = float(line[2])
            np.save(f'data/{filename}_spatial_distance.npy', dist_matrix)

    dist_matrix = np.load(f'data/{filename}_spatial_distance.npy')
    # normalization
    std = np.std(dist_matrix[dist_matrix != float('inf')])
    mean = np.mean(dist_matrix[dist_matrix != float('inf')])
    dist_matrix = (dist_matrix - mean) / std
    sigma = args.sigma2
    sp_matrix = np.exp(- dist_matrix ** 2 / sigma ** 2)
    sp_matrix[sp_matrix < args.thres2] = 0

    return dtw_matrix, sp_matrix
```

The original function also receives `args`, but it reads parameters as attributes, for example `args.sigma1`. We pass a dictionary instead, so every access like `args.sigma1` becomes `args['sigma1']`, and the corresponding parameters must be added to the config.

```python
sigma = args['sigma1']
dtw_matrix[dist_matrix > args['thres1']] = 1
sigma = args['sigma2']
sp_matrix[sp_matrix < args['thres2']] = 0
```

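The difference between the two access styles is easy to check in isolation. The sketch below uses a hand-built `argparse.Namespace` with the default values listed in the `read_data` docstring; the object and its values are illustrative and not taken from either code base.

```python
from argparse import Namespace

# Attribute-style arguments, as the original STGODE scripts use them.
# Values follow the defaults quoted in the read_data docstring.
ns_args = Namespace(sigma1=0.1, sigma2=10, thres1=0.6, thres2=0.5)
print(ns_args.sigma1)        # attribute access works on a Namespace

# Dictionary-style arguments, as TrafficWheel passes them.
dict_args = vars(ns_args)    # vars() exposes the Namespace attributes as a dict
print(dict_args['sigma1'])   # key access works on a dict

# Attribute access on a plain dict fails, which is why every
# `args.xxx` in the copied code has to become `args['xxx']`.
try:
    dict_args.sigma1
except AttributeError as err:
    print('AttributeError:', err)
```
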
Note that in the PEMSD3 distance file, `from` and `to` are sensor IDs rather than zero-based indices. They must be mapped to the indices 0 to 357 through the sensor ID list in the data folder (the text names it `PEMSD3.txt`; the code below opens it as `data/PEMS03/PEMS03.txt`).

```csv
from,to,distance
317842,318711,0.872
318721,315955,1.322
...
```

```py
# Compute spatial_distance; skipped when a cached file already exists
if not os.path.exists(f'data/PEMS0{filename[-1]}/PEMS0{filename[-1]}_spatial_distance.npy'):
    if num_node == 358:
        with open(f'data/PEMS0{filename[-1]}/PEMS0{filename[-1]}.txt', 'r') as f:
            id_dict = {int(i): idx for idx, i in enumerate(f.read().strip().split('\n'))}  # sensor ID -> index
        # read the CSV file with pandas, skipping the header row
        df = pd.read_csv(filepath + file[1], skiprows=1, header=None)
        dist_matrix = np.zeros((num_node, num_node)) + float('inf')
        for _, row in df.iterrows():
            start = id_dict[int(row[0])]
            end = id_dict[int(row[1])]
            dist_matrix[start][end] = float(row[2])
            dist_matrix[end][start] = float(row[2])
        np.save(f'data/PEMS0{filename[-1]}/PEMS0{filename[-1]}_spatial_distance.npy', dist_matrix)
    else:
        # read the CSV file with pandas, skipping the header row
        df = pd.read_csv(filepath + file[1], skiprows=1, header=None)
        dist_matrix = np.zeros((num_node, num_node)) + float('inf')
        for _, row in df.iterrows():
            start = int(row[0])
            end = int(row[1])
            dist_matrix[start][end] = float(row[2])
            dist_matrix[end][start] = float(row[2])
        np.save(f'data/PEMS0{filename[-1]}/PEMS0{filename[-1]}_spatial_distance.npy', dist_matrix)
```

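The ID-to-index mapping can be tried on a tiny made-up input without any data files. The sketch below uses only the standard library and plain lists in place of NumPy and pandas; the three sensor IDs and two edges are invented for the example.

```python
import csv
import io

# Made-up miniature inputs: three sensor IDs and two edges between them.
id_list_text = "317842\n318711\n315955\n"
distance_csv = "from,to,distance\n317842,318711,0.872\n318711,315955,1.322\n"

# Sensor ID -> zero-based index, in the order the IDs are listed.
id_to_index = {int(sensor_id): idx
               for idx, sensor_id in enumerate(id_list_text.strip().split("\n"))}

num_node = len(id_to_index)
inf = float("inf")
dist_matrix = [[inf] * num_node for _ in range(num_node)]

reader = csv.reader(io.StringIO(distance_csv))
next(reader)  # skip the header row
for src, dst, dist in reader:
    start, end = id_to_index[int(src)], id_to_index[int(dst)]
    dist_matrix[start][end] = float(dist)
    dist_matrix[end][start] = float(dist)  # the matrix is filled symmetrically

print(id_to_index)
print(dist_matrix[0][1], dist_matrix[1][2], dist_matrix[0][2])
```

Pairs with no entry in the distance file stay at infinity, which is what the later normalization step filters out with `dist_matrix != float('inf')`.
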
With `read_data()` done, `get_normalized_adj()` is simple: it takes an ndarray and returns a tensor, so it can be copied as is.

```py
def get_normalized_adj(A):
    """
    Returns a tensor, the degree normalized adjacency matrix.
    """
    alpha = 0.8
    D = np.array(np.sum(A, axis=1)).reshape((-1,))
    D[D <= 10e-5] = 10e-5    # Prevent infs
    diag = np.reciprocal(np.sqrt(D))
    A_wave = np.multiply(np.multiply(diag.reshape((-1, 1)), A),
                         diag.reshape((1, -1)))
    A_reg = alpha / 2 * (np.eye(A.shape[0]) + A_wave)
    return torch.from_numpy(A_reg.astype(np.float32))
```

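Read off the code of `get_normalized_adj`, the function computes the following, where $D$ is the diagonal matrix of row sums of $A$ (each entry clipped from below at $10^{-4}$, written `10e-5` in the code), $I$ is the identity matrix, and $\alpha = 0.8$:

$$
\hat{A} = \frac{\alpha}{2}\left(I + D^{-1/2} A D^{-1/2}\right)
$$

As a quick check, take a two-node graph with a single edge, $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. Both degrees are 1, so $D^{-1/2} A D^{-1/2} = A$, and every entry of $\hat{A}$ equals $0.4$.
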
Finally, the original code calls `get_normalized_adj()` outside `read_data()` to obtain the normalized tensors, which is cumbersome for us. We apply `get_normalized_adj(matrix).to(device)` directly in the `return` statement of `read_data` instead, which keeps the model code clean.

```py
data, mean, std, dtw_matrix, sp_matrix = read_data(args)
train_loader, valid_loader, test_loader = generate_dataset(data, args)
A_sp_wave = get_normalized_adj(sp_matrix).to(device)
A_se_wave = get_normalized_adj(dtw_matrix).to(device)
```

becomes

```py
return get_normalized_adj(sp_matrix).to(args['device']), get_normalized_adj(dtw_matrix).to(args['device'])
```

The spatial matrix is returned first so that the order matches the call `A_sp_hat, A_se_hat = get_A_hat(args)` in `__init__`.

`args['device']` does not need to be written in the config; it is set by the run command.

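How the command-line value reaches `args` is not shown in this tutorial. One plausible arrangement, sketched below with a hypothetical `merge_device` helper and an `argparse` parser that only knows the `--device` flag from the run command, is to copy the parsed value into the model dictionary before the model is built.

```python
import argparse


def merge_device(model_args, argv):
    """Return a copy of model_args with 'device' taken from the command line.

    Hypothetical helper for illustration; TrafficWheel's run.py may do this differently.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('--device', default='cpu')
    cli, _unknown = parser.parse_known_args(argv)  # ignore flags this sketch does not define
    merged = dict(model_args)
    merged['device'] = cli.device
    return merged


# Values the YAML file would provide (defaults quoted in the read_data docstring).
yaml_args = {'sigma1': 0.1, 'sigma2': 10, 'thres1': 0.6, 'thres2': 0.5}
args = merge_device(yaml_args, ['--model', 'STGODE', '--device', 'cuda:0'])
print(args['device'])
```
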
In the end, the `get_A_hat()` function in `model/STGODE/adj.py` looks like this:

```python
import os

import numpy as np
import pandas as pd
from fastdtw import fastdtw
from tqdm import tqdm


def get_A_hat(args):
    """read data, generate spatial adjacency matrix and semantic adjacency matrix by dtw

    Args:
        sigma1: float, default=0.1, sigma for the semantic matrix
        sigma2: float, default=10, sigma for the spatial matrix
        thres1: float, default=0.6, the threshold for the semantic matrix
        thres2: float, default=0.5, the threshold for the spatial matrix

    Returns:
        A_sp_hat: tensor, normalized spatial adjacency matrix
        A_se_hat: tensor, normalized semantic (DTW) adjacency matrix
    """
    filepath = './data/'
    num_node = args['num_nodes']
    # files is indexed by node count here, so its keys must be node counts
    # (e.g. 307 for PEMSD4) rather than the dataset names of the original table
    file = files[num_node]
    filename = file[0][:6]

    data = np.load(filepath + file[0])['data']
    num_node = data.shape[1]
    mean_value = np.mean(data, axis=(0, 1)).reshape(1, 1, -1)
    std_value = np.std(data, axis=(0, 1)).reshape(1, 1, -1)
    data = (data - mean_value) / std_value

    # Compute dtw_distance; skipped when a cached file already exists
    if not os.path.exists(f'data/PEMS0{filename[-1]}/PEMS0{filename[-1]}_dtw_distance.npy'):
        data_mean = np.mean([data[:, :, 0][24 * 12 * i: 24 * 12 * (i + 1)] for i in range(data.shape[0] // (24 * 12))],
                            axis=0)
        data_mean = data_mean.squeeze().T
        dtw_distance = np.zeros((num_node, num_node))
        for i in tqdm(range(num_node)):
            for j in range(i, num_node):
                dtw_distance[i][j] = fastdtw(data_mean[i], data_mean[j], radius=6)[0]
        for i in range(num_node):
            for j in range(i):
                dtw_distance[i][j] = dtw_distance[j][i]
        np.save(f'data/PEMS0{filename[-1]}/PEMS0{filename[-1]}_dtw_distance.npy', dtw_distance)

    dist_matrix = np.load(f'data/PEMS0{filename[-1]}/PEMS0{filename[-1]}_dtw_distance.npy')

    mean = np.mean(dist_matrix)
    std = np.std(dist_matrix)
    dist_matrix = (dist_matrix - mean) / std
    sigma = args['sigma1']
    dist_matrix = np.exp(-dist_matrix ** 2 / sigma ** 2)
    dtw_matrix = np.zeros_like(dist_matrix)
    dtw_matrix[dist_matrix > args['thres1']] = 1

    # Compute spatial_distance; skipped when a cached file already exists
    if not os.path.exists(f'data/PEMS0{filename[-1]}/PEMS0{filename[-1]}_spatial_distance.npy'):
        if num_node == 358:
            with open(f'data/PEMS0{filename[-1]}/PEMS0{filename[-1]}.txt', 'r') as f:
                id_dict = {int(i): idx for idx, i in enumerate(f.read().strip().split('\n'))}  # sensor ID -> index
            # read the CSV file with pandas, skipping the header row
            df = pd.read_csv(filepath + file[1], skiprows=1, header=None)
            dist_matrix = np.zeros((num_node, num_node)) + float('inf')
            for _, row in df.iterrows():
                start = id_dict[int(row[0])]
                end = id_dict[int(row[1])]
                dist_matrix[start][end] = float(row[2])
                dist_matrix[end][start] = float(row[2])
            np.save(f'data/PEMS0{filename[-1]}/PEMS0{filename[-1]}_spatial_distance.npy', dist_matrix)
        else:
            # read the CSV file with pandas, skipping the header row
            df = pd.read_csv(filepath + file[1], skiprows=1, header=None)
            dist_matrix = np.zeros((num_node, num_node)) + float('inf')
            for _, row in df.iterrows():
                start = int(row[0])
                end = int(row[1])
                dist_matrix[start][end] = float(row[2])
                dist_matrix[end][start] = float(row[2])
            np.save(f'data/PEMS0{filename[-1]}/PEMS0{filename[-1]}_spatial_distance.npy', dist_matrix)

    # Load from the cache so the spatial matrix is used even when the block above was skipped
    dist_matrix = np.load(f'data/PEMS0{filename[-1]}/PEMS0{filename[-1]}_spatial_distance.npy')
    # normalization
    std = np.std(dist_matrix[dist_matrix != float('inf')])
    mean = np.mean(dist_matrix[dist_matrix != float('inf')])
    dist_matrix = (dist_matrix - mean) / std
    sigma = args['sigma2']
    sp_matrix = np.exp(- dist_matrix ** 2 / sigma ** 2)
    sp_matrix[sp_matrix < args['thres2']] = 0

    # spatial first, semantic second, matching `A_sp_hat, A_se_hat = get_A_hat(args)`
    return get_normalized_adj(sp_matrix).to(args['device']), get_normalized_adj(dtw_matrix).to(args['device'])
```

Call it in the model's `__init__`; it returns the two tensor matrices.

```py
A_sp_hat, A_se_hat = get_A_hat(args)
```

Inner `nn.Module`s usually need no changes, because the parameters needed to initialize them are already defined in the outermost `nn.Module`. The exception is an inner `nn.Module` that needs an additional graph.

# Add your model to model_selector

At this point, register your model in `model/model_selector.py`.

Import the outermost `nn.Module` of your model, then add an entry that follows the pattern of the existing models:

```python
from model.DDGCRN.DDGCRN import DDGCRN
from model.TWDGCN.TWDGCN import TWDGCN
# ...
from model.STGODE.STGODE import ODEGCN  # newly added


def model_selector(model):
    match model['type']:
        case 'DDGCRN': return DDGCRN(model)
        case 'TWDGCN': return TWDGCN(model)
        # ...
        case 'STGODE': return ODEGCN(model)  # newly added
```

Each `case` value is the `{model_name}` chosen at the start of this tutorial. The `match` statement requires Python 3.10 or later.

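The dispatch itself can be tried without the real models. The sketch below replaces them with a stub class (`StubModel`, invented for this example) and uses a dictionary lookup instead of `match`, which also runs on Python versions before 3.10. The explicit error for unknown names is an addition; the selector shown in this section does not include one.

```python
class StubModel:
    """Stand-in for a real nn.Module; it only remembers the args it was given."""

    def __init__(self, args):
        self.args = args


# Maps {model_name} to the class that builds it. In the real selector
# these would be DDGCRN, TWDGCN, ODEGCN, and so on.
MODEL_REGISTRY = {
    'DDGCRN': StubModel,
    'TWDGCN': StubModel,
    'STGODE': StubModel,
}


def model_selector(model):
    """Build the model named by model['type'], passing the whole dict through."""
    try:
        model_cls = MODEL_REGISTRY[model['type']]
    except KeyError:
        raise ValueError(f"unknown model type: {model['type']!r}") from None
    return model_cls(model)


net = model_selector({'type': 'STGODE', 'num_nodes': 307})
print(type(net).__name__, net.args['num_nodes'])
```
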
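# Adjust input and output shapes

Once your model is added to `model/model_selector.py`, you can start debugging. First create a run configuration in PyCharm with the following parameters:

```bash
python run.py --model STGODE --dataset PEMSD4 --mode train --device cuda:0
```

Then set a breakpoint in your `forward`. If the program raises an error before reaching it, `__init__` is still misconfigured and the model could not be initialized; check it again.

Once initialization succeeds, you can start modifying `forward`.
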
TrafficWheel passes the model a tensor of shape **`[batch_size, time_step, num_nodes, dim]`**. Typically `batch_size` is 64, `time_step` is 12, `num_nodes` depends on the dataset, and `dim`, the input dimension, is 3: the first channel is the flow data, the second is the daily embedding, and the third is the weekly embedding.

Check what tensor shape the `forward` of the original model expects. In most cases the daily and weekly embeddings are not needed.

STGODE expects the shape `(batch_size, num_nodes, num_timesteps, num_features)`, which differs from what TrafficWheel passes in, so the input must be converted.

Taking PEMSD4 as the example, first slice with `x = x[..., 0:1]`, which changes the shape of `x` from (64,12,307,3) to (64,12,307,1) and keeps only the flow channel.

Then `x = x.permute(0, 2, 1, 3)` converts `[batch_size, time_step, num_nodes, dim]` into `(batch_size, num_nodes, num_timesteps, num_features)`. The input dimension `dim` and the feature count `num_features` mean the same thing.

```py
def forward(self, x):
    """
    Args:
        x : input data of shape (batch_size, num_nodes, num_timesteps, num_features) == (B, N, T, F)
    Returns:
        prediction for future of shape (batch_size, num_nodes, num_timesteps_output)
    """
    # TrafficWheel input (B, T, N, dim) -> STGODE layout (B, N, T, 1)
    x = x[..., 0:1].permute(0, 2, 1, 3)
```

Finally, fix the output. Set a breakpoint at the final `return` of `forward`. If the input is correct you can debug right away and, with luck, reach the breakpoint on the first run. Otherwise you may hit various problems (usually issues in the original repository). In that case, step through the `forward` of each module and look for mismatched tensor shapes, which is where the problem usually lies. Fix these bugs until the program reaches the breakpoint at `return`.

```python
def forward(self, x):
    """
    Args:
        x : input data of shape (batch_size, num_nodes, num_timesteps, num_features) == (B, N, T, F)
    Returns:
        prediction for future of shape (batch_size, num_nodes, num_timesteps_output)
    """
    ...

    return self.pred(x)
```

The `return` wraps the result in `self.pred`. To inspect the result more easily, you can set a breakpoint in the outer code instead (usually line 55 of `trainer/Trainer.py`, watching the `output` variable). Here, though, the original code already documents the output shape in the docstring, so reading it is enough.

The output shape is (batch_size, num_nodes, num_timesteps_output), whereas **TrafficWheel expects the output shape `[batch_size, time_step, num_nodes, dim]`**, so it must be converted.

For PEMSD4 the original shape is (64,307,12). `permute(0,2,1)` turns it into (64,12,307), and `unsqueeze(dim=-1)` then turns it into (64,12,307,1), which matches the expected output shape.

```py
return self.pred(x).permute(0,2,1).unsqueeze(dim=-1)
```

Common operations for adjusting shapes are listed below. Note that each of them returns a new tensor, so the result must be assigned back to a variable.

For example, `x = x.permute(0,2,1,3)` is correct, while a bare `x.permute(0,2,1,3)` leaves `x` unchanged.

| Operation         | Shape before  | Shape after                    |
| ----------------- | ------------- | ------------------------------ |
| permute(0,2,1,3)  | (64,12,307,3) | (64,307,12,3)                  |
| x[..., 0:1]       | (64,12,307,3) | (64,12,307,1)                  |
| x[..., 0]         | (64,12,307,3) | (64,12,307), selects channel 0 |
| squeeze(dim=-1)   | (64,12,307,1) | (64,12,307)                    |
| unsqueeze(dim=-1) | (64,12,307)   | (64,12,307,1)                  |

These are the most commonly used ones; more can be found in the official PyTorch documentation.

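The rows of the table, and the two conversions used in `forward`, can be verified on dummy tensors. This is a minimal sketch that assumes PyTorch is installed; the tensors are zeros with the PEMSD4 shapes quoted in this section, not real data.

```python
import torch

x = torch.zeros(64, 12, 307, 3)  # [batch_size, time_step, num_nodes, dim]

# permute returns a new tensor; without assignment x keeps its shape.
x.permute(0, 2, 1, 3)
assert x.shape == (64, 12, 307, 3)

# Rows of the table.
assert x.permute(0, 2, 1, 3).shape == (64, 307, 12, 3)
assert x[..., 0:1].shape == (64, 12, 307, 1)
assert x[..., 0].shape == (64, 12, 307)
assert x[..., 0:1].squeeze(dim=-1).shape == (64, 12, 307)
assert x[..., 0].unsqueeze(dim=-1).shape == (64, 12, 307, 1)

# Input conversion: TrafficWheel layout -> STGODE layout (B, N, T, F).
model_in = x[..., 0:1].permute(0, 2, 1, 3)
assert model_in.shape == (64, 307, 12, 1)

# Output conversion: STGODE prediction (B, N, T_out) -> TrafficWheel layout.
pred = torch.zeros(64, 307, 12)
model_out = pred.permute(0, 2, 1).unsqueeze(dim=-1)
assert model_out.shape == (64, 12, 307, 1)
print(model_in.shape, model_out.shape)
```
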
# Start training

Once everything is ready (the input and output shapes of `forward` match), you can train the model. You can batch-train from a Jupyter notebook or configure a run task in PyCharm, whichever you prefer. The training command is:

```bash
python run.py --model STGODE --dataset PEMSD4 --mode train --device cuda:0
```