Add data example.
parent 4344dddc6d
commit 3caf887484
31 README.md
@@ -22,8 +22,19 @@ pip install -r requirements.txt
## Data Preparation
The traffic data files for Los Angeles (METR-LA) and the Bay Area (PEMS-BAY), i.e., `metr-la.h5` and `pems-bay.h5`, are available at [Google Drive](https://drive.google.com/open?id=10FOTa6HXPqX8Pf5WRoRwcFnW9BrNZEIX) or [Baidu Yun](https://pan.baidu.com/s/14Yy9isAIZYdU__OYEQGa_g), and should be
put into the `data/` folder.
The `*.h5` files store the data in `pandas.DataFrame` using the `HDF5` file format. Here is an example:

|                     | sensor_0 | sensor_1 | sensor_2 | sensor_n |
|:-------------------:|:--------:|:--------:|:--------:|:--------:|
| 2018/01/01 00:00:00 |   60.0   |   65.0   |   70.0   |    ...   |
| 2018/01/01 00:05:00 |   61.0   |   64.0   |   65.0   |    ...   |
| 2018/01/01 00:10:00 |   63.0   |   65.0   |   60.0   |    ...   |
|         ...         |    ...   |    ...   |    ...   |    ...   |

Here is an article about [Using HDF5 with Python](https://medium.com/@jerilkuriakose/using-hdf5-with-python-6c5242d08773).
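
As a rough illustration of the layout in the example table, the sketch below builds a small `pandas.DataFrame` with the same shape: a timestamp index at 5-minute steps and one column per sensor. The values are copied from the table. `make_example_frame` and `load_traffic_df` are hypothetical helper names, and reading with `pd.read_hdf` assumes that the optional PyTables dependency is installed and that each file holds a single pandas object.

```python
import pandas as pd


def make_example_frame() -> pd.DataFrame:
    """Build a tiny frame shaped like the example table (timestamps x sensors)."""
    index = pd.date_range("2018-01-01 00:00:00", periods=3, freq="5min")
    return pd.DataFrame(
        {
            "sensor_0": [60.0, 61.0, 63.0],
            "sensor_1": [65.0, 64.0, 65.0],
            "sensor_2": [70.0, 65.0, 60.0],
        },
        index=index,
    )


def load_traffic_df(path: str) -> pd.DataFrame:
    """Hypothetical helper: read one of the `*.h5` files into a DataFrame.

    Assumes the file stores a single pandas object, so no `key` is passed.
    """
    return pd.read_hdf(path)


if __name__ == "__main__":
    df = make_example_frame()
    print(df.shape)                    # rows are timestamps, columns are sensors
    print(df.index[1] - df.index[0])   # spacing between consecutive readings
```

With the data files in place, `load_traffic_df("data/metr-la.h5")` would be expected to return a frame of the same form, with one column per sensor id.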

Run the following commands to generate the train/val/test datasets at `data/{METR-LA,PEMS-BAY}/{train,val,test}.npz`.
```bash
# Create data directories
mkdir -p data/{METR-LA,PEMS-BAY}

@@ -34,10 +45,16 @@ python -m scripts.generate_training_data --output_dir=data/METR-LA --traffic_df_
# PEMS-BAY
python -m scripts.generate_training_data --output_dir=data/PEMS-BAY --traffic_df_filename=data/pems-bay.h5
```
The generated train/val/test datasets will be saved at `data/{METR-LA,PEMS-BAY}/{train,val,test}.npz`.
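
To give a feel for what a train/val/test split of a time series can look like, here is a minimal sketch in plain Python. It is not taken from `scripts.generate_training_data`: the window lengths, the 70/10/20 split ratios, and the function names are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Window = Tuple[List[float], List[float]]


def make_windows(series: Sequence[float], input_len: int, horizon: int) -> List[Window]:
    """Slide over `series`, pairing `input_len` past values with the next `horizon` values."""
    windows: List[Window] = []
    last_start = len(series) - input_len - horizon
    for start in range(last_start + 1):
        x = list(series[start:start + input_len])
        y = list(series[start + input_len:start + input_len + horizon])
        windows.append((x, y))
    return windows


def chronological_split(samples: List[Window], train_frac: float = 0.7, test_frac: float = 0.2):
    """Split samples in time order: train first, then val, then test (assumed ratios)."""
    n = len(samples)
    n_train = round(n * train_frac)
    n_test = round(n * test_frac)
    n_val = n - n_train - n_test
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    readings = [float(v) for v in range(24)]  # toy single-sensor series
    samples = make_windows(readings, input_len=3, horizon=2)
    train, val, test = chronological_split(samples)
    print(len(train), len(val), len(test))
```

Splitting in time order rather than shuffling keeps every test window later than every training window, which is the usual choice for forecasting tasks.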

## Graph Construction
Because the current implementation relies on pre-calculated road network distances between sensors, it only
supports sensor ids in Los Angeles (see `data/sensor_graph/sensor_info_201206.csv`).
```bash
python -m scripts.gen_adj_mx --sensor_ids_filename=data/sensor_graph/graph_sensor_ids.txt --normalized_k=0.1 \
    --output_pkl_filename=data/sensor_graph/adj_mx.pkl
```
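
One common way to turn pairwise road distances into a weighted adjacency matrix is a thresholded Gaussian kernel. The sketch below illustrates that idea in plain Python. Whether `scripts.gen_adj_mx` computes exactly this is an assumption, as is reading `normalized_k` as the cutoff below which weights are set to zero. The function name and the toy distances are hypothetical.

```python
import math
from typing import List


def gaussian_adjacency(dist: List[List[float]], normalized_k: float = 0.1) -> List[List[float]]:
    """Map distances to weights exp(-(d / sigma)^2) and zero out weights below `normalized_k`.

    `sigma` is the standard deviation of the finite distances;
    `math.inf` marks sensor pairs with no known road distance.
    """
    finite = [d for row in dist for d in row if math.isfinite(d)]
    mean = sum(finite) / len(finite)
    sigma = math.sqrt(sum((d - mean) ** 2 for d in finite) / len(finite))
    if sigma == 0.0:
        raise ValueError("all distances are equal; the kernel width is undefined")
    adj = []
    for row in dist:
        out = []
        for d in row:
            w = math.exp(-((d / sigma) ** 2)) if math.isfinite(d) else 0.0
            out.append(w if w >= normalized_k else 0.0)  # sparsify weak links
        adj.append(out)
    return adj


if __name__ == "__main__":
    toy = [
        [0.0, 1.0, math.inf],
        [1.0, 0.0, 5.0],
        [math.inf, 5.0, 0.0],
    ]
    for row in gaussian_adjacency(toy):
        print(row)
```

A larger `normalized_k` drops more of the weak, long-distance edges and yields a sparser graph.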

The locations of the sensors in Los Angeles (METR-LA) are available at [data/sensor_graph/graph_sensor_locations.csv](https://github.com/liyaguang/DCRNN/blob/master/data/sensor_graph/graph_sensor_locations.csv).

## Run the Pre-trained Model on METR-LA
```bash
@@ -62,14 +79,6 @@ Each epoch takes about 5min or 10 min on a single GTX 1080 Ti for METR-LA or PEM

There is a chance that the training loss will explode. A temporary workaround is to restart from the last saved model before the explosion, or to decrease the learning rate earlier in the learning rate schedule.
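
The second workaround can be pictured with a simple step-decay schedule: moving the decay milestones to earlier epochs lowers the learning rate sooner. The sketch below is illustrative only. The base rate, milestones, decay factor, and the function name `step_decay_lr` are assumptions and do not reflect the repository's actual configuration.

```python
from typing import Sequence


def step_decay_lr(base_lr: float, epoch: int, milestones: Sequence[int], decay: float = 0.1) -> float:
    """Return the learning rate at `epoch`, multiplying by `decay` at each milestone reached."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * (decay ** passed)


if __name__ == "__main__":
    later = [20, 30, 40]    # hypothetical original schedule
    earlier = [10, 20, 30]  # same decay, applied ten epochs sooner
    for epoch in (5, 15, 25):
        print(epoch, step_decay_lr(0.01, epoch, later), step_decay_lr(0.01, epoch, earlier))
```

With the earlier milestones, the rate at epoch 15 is already one decay step lower than under the later schedule, which may help keep the loss from diverging.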

More details are being added ...