diff --git a/README.md b/README.md
index 2001833..2ee235c 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,8 @@ pip install -r requirements.txt
 
 ## Data Preparation
 The traffic data files for Los Angeles (METR-LA) and the Bay Area (PEMS-BAY), i.e., `metr-la.h5` and `pems-bay.h5`, are available at [Google Drive](https://drive.google.com/open?id=10FOTa6HXPqX8Pf5WRoRwcFnW9BrNZEIX) or [Baidu Yun](https://pan.baidu.com/s/14Yy9isAIZYdU__OYEQGa_g), and should be put into the `data/` folder.
-Besides, the locations of sensors Los Angeles are available at [data/sensor_graph/graph_sensor_locations.csv](https://github.com/liyaguang/DCRNN/blob/master/data/sensor_graph/graph_sensor_locations.csv).
+The `*.h5` files store the data in a `pandas.DataFrame` using the `HDF5` file format. Here is an article about [Using HDF5 with Python](https://medium.com/@jerilkuriakose/using-hdf5-with-python-6c5242d08773).
+
 ```bash
 # Create data directories
 mkdir -p data/{METR-LA,PEMS-BAY}
@@ -36,6 +37,7 @@ python -m scripts.generate_training_data --output_dir=data/PEMS-BAY --traffic_df
 
 The generated train/val/test dataset will be saved at `data/{METR-LA,PEMS-BAY}/{train,val,test}.npz`.
 
+The locations of sensors in Los Angeles are available at [data/sensor_graph/graph_sensor_locations.csv](https://github.com/liyaguang/DCRNN/blob/master/data/sensor_graph/graph_sensor_locations.csv).
 
 ## Run the Pre-trained Model on METR-LA
 ```bash
diff --git a/scripts/generate_training_data.py b/scripts/generate_training_data.py
index 585e080..608f8b6 100644
--- a/scripts/generate_training_data.py
+++ b/scripts/generate_training_data.py
@@ -116,7 +116,7 @@ if __name__ == "__main__":
     parser.add_argument(
         "--traffic_df_filename",
         type=str,
-        default="data/df_highway_2012_4mon_sample.h5",
+        default="data/metr-la.h5",
         help="Raw traffic readings.",
     )
     args = parser.parse_args()
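
For reference, a minimal sketch of how the `metr-la.h5` file described in the new README sentence can be loaded and inspected, assuming it stores a single `pandas.DataFrame` (rows indexed by timestamp, one column per sensor); the exact key and layout are an assumption based on the README's description, not part of the diff:

```python
# Minimal sketch: load and inspect the raw METR-LA traffic readings.
# Assumes data/metr-la.h5 holds a single pandas.DataFrame whose index is
# the observation timestamps and whose columns are sensor IDs (assumption
# based on the README's description of the *.h5 files).
import pandas as pd

df = pd.read_hdf("data/metr-la.h5")  # with a single dataset, no key is needed

print(df.shape)        # (number of timesteps, number of sensors)
print(df.index[:3])    # first few timestamps
print(df.columns[:3])  # first few sensor IDs
```

With the changed default in `scripts/generate_training_data.py`, `--traffic_df_filename` now points at this same file, so `python -m scripts.generate_training_data --output_dir=data/METR-LA` reads `data/metr-la.h5` unless the flag is overridden.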