Taxi Demand Forecasting

Predict future taxi demand!

Prize: 100,000 / Participants: 274 / Ended 10 months ago

I am publishing my Jupyter notebook (2nd on the Public LB, 1st on the Private LB).

Thank you all for a great competition, and thank you to the organizers.
I tried a lot of ideas and had fun participating.
Here is the code.

Environment

 Python 3.9.8
 pandas 2.0.3
 numpy 1.21.6
 matplotlib 3.5.2
 sklearn 1.1.1
 tensorflow 2.11.0

Method overview

 A deep learning model with four LSTM layers followed by two fully connected layers.

Key ideas

 1) Added holidays as an explanatory variable. There were 13 kinds of holidays, so non-holidays were coded as 0 and holidays as the numbers 1-13.
 2) Concatenated the past taxi pickup counts for all zones with the weather data and fed the result into the LSTM.
 3) At the final fully connected layers, additionally fed in the weather for the prediction target time plus the month, day, weekday, hour, and holiday.
 4) Used four weather variables: temperature, humidity, wind speed, and precipitation.
 5) Putting BatchNormalization at the input and between the LSTM layers sped up training and seemed to generalize slightly better.

Other notes

 I ran this code several times, and the results differ on every run, with a standard deviation of roughly 0.5 in RMSE.
 For that reason, the 1st-place result may have been partly luck; apologies for that.
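
 For reference, a minimal sketch of how one might reduce the run-to-run variation by fixing random seeds. This is my addition, not part of the winning run, and GPU kernels may stay nondeterministic unless op determinism is also enforced:

# Illustrative only: seed fixing was not used in this notebook.
import os, random
import numpy as np
import tensorflow as tf

os.environ['PYTHONHASHSEED'] = '0'
random.seed(0)
np.random.seed(0)
tf.random.set_seed(0)
# Optionally (TF 2.9+), force deterministic kernels at some speed cost:
# tf.config.experimental.enable_op_determinism()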

I'm a bit embarrassed by the rough coding, but the full code follows.

MIT License

Copyright © 2023 Satoshi Kodama

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

import gc
import datetime

# Check the environment
import pandas as pd
import numpy as np
#import statsmodels
!python3 --version
print(pd.__version__)
print(np.__version__)
#print(statsmodels.__version__)
import matplotlib
print(matplotlib.__version__)
import matplotlib.pyplot as plt
Python 3.9.8
2.0.3
1.21.6
3.5.2
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
import sklearn

from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout, Bidirectional, Flatten, BatchNormalization
from tensorflow.keras.utils import plot_model
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras import Input, Model
2023-12-19 13:53:54.812281: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-12-19 13:53:55.012782: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-12-19 13:53:55.908083: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:
2023-12-19 13:53:55.908216: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:
2023-12-19 13:53:55.908223: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
print(sklearn.__version__)
print(tf.__version__)
1.1.1
2.11.0
# If using Google Colab

#from google.colab import drive
#drive.mount('/content/drive')
# Define the New York public holidays

holydays_data = [
datetime.date(2017, 1, 1),  # "New Year's Day"
datetime.date(2017, 1, 16), # 'Martin Luther King Jr. Day'
datetime.date(2017, 2, 13), # "Lincoln's Birthday "
datetime.date(2017, 2, 20), # "Washington's Birthday"
datetime.date(2017, 5, 29), # 'Memorial Day'
datetime.date(2017, 7, 4),  # 'Independence Day'
datetime.date(2017, 9, 4),  # 'Labor Day'
datetime.date(2017, 10, 9), # 'Columbus Day'
datetime.date(2017, 11, 7), # 'Election Day'
datetime.date(2017, 11, 11),# 'Veterans Day'
datetime.date(2017, 11, 23),# 'Thanksgiving'
datetime.date(2017, 12, 25),# 'Christmas Day'
datetime.date(2018, 1, 1),  # "New Year's Day"
datetime.date(2018, 1, 15), # 'Martin Luther King Jr. Day'
datetime.date(2018, 2, 12), # "Lincoln's Birthday "
datetime.date(2018, 2, 19), # "Washington's Birthday"
datetime.date(2018, 5, 28), # 'Memorial Day'
datetime.date(2018, 7, 4),  # 'Independence Day'
datetime.date(2018, 9, 3),  # 'Labor Day'
datetime.date(2018, 10, 8), # 'Columbus Day'
datetime.date(2018, 11, 6), # 'Election Day'
datetime.date(2018, 11, 12),# 'Veterans Day'
datetime.date(2018, 11, 22),# 'Thanksgiving'
datetime.date(2018, 12, 25),# 'Christmas Day'
datetime.date(2019, 1, 1),  # "New Year's Day"
datetime.date(2019, 1, 21), # 'Martin Luther King Jr. Day'
datetime.date(2019, 2, 12), # "Lincoln's Birthday "
datetime.date(2019, 2, 18), # "Washington's Birthday"
datetime.date(2019, 5, 27), # 'Memorial Day'
datetime.date(2019, 7, 4),  # 'Independence Day'
datetime.date(2019, 9, 2),  # 'Labor Day'
datetime.date(2019, 10, 14),# 'Columbus Day'
datetime.date(2019, 11, 5), # 'Election Day'
datetime.date(2019, 11, 11),# 'Veterans Day'
datetime.date(2019, 11, 28),# 'Thanksgiving'
datetime.date(2019, 12, 25) # 'Christmas Day'
]
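
Incidentally, these dates could be generated instead of hard-coded. Here is a sketch using the third-party holidays package, purely as an assumption on my part: it is not used in this notebook, and its NY calendar should be checked against the list above.

# Hypothetical alternative (pip install holidays); not part of this notebook.
# subdiv='NY' assumes a recent version of the package.
import holidays
ny_holidays = holidays.US(subdiv='NY', years=[2017, 2018, 2019])
holydays_data_alt = sorted(ny_holidays.keys())  # datetime.date objects, like holydays_data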
# Confirm that the GPU is recognized
physical_devices = tf.config.list_physical_devices('GPU')
if len(physical_devices) > 0:
    for device in physical_devices:
        tf.config.experimental.set_memory_growth(device, True)
        print('{} memory growth: {}'.format(device, tf.config.experimental.get_memory_growth(device)))
else:
    print("Not enough GPU hardware devices available")
PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU') memory growth: True
2023-12-19 13:53:56.821992: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-12-19 13:53:56.991445: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-12-19 13:53:56.991768: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
# Load and preprocess the data
import pandas as pd

path = "/home/kodama3104/ws_011/data/" # folder containing the data files (train, zone, test, weather)
#path = "/content/drive/MyDrive/data/" # data folder when using Google Colab

train_data = pd.read_csv(path + "train_data.csv", index_col='tpep_pickup_datetime')
weather_data = pd.read_csv(path + "nyc_weather_2017_2019.csv")
zone_data = pd.read_csv(path + "taxi_zones.csv")
print(train_data.shape)
print(weather_data.shape)
print(zone_data.shape)
(51072, 79)
(40261, 20)
(79, 3)
/tmp/ipykernel_31951/1210795780.py:8: DtypeWarning: Columns (5) have mixed types. Specify dtype option on import or set low_memory=False.
  weather_data = pd.read_csv(path + "nyc_weather_2017_2019.csv")
# Convert the dates to datetime

train_data.index = pd.to_datetime(train_data.index)
weather_data['DATE'] = pd.to_datetime(weather_data['DATE'])
x_col = train_data.columns
x_num = train_data.shape[1]
weather_data = weather_data.set_index('DATE')
weather_data = weather_data[weather_data['REPORT_TYPE']=='FM-15']
# Resample the weather data onto the same 30-minute grid as the taxi pickup data

weather_30min = weather_data.resample('30min').nearest()
weather_30min.loc['2017-01-01 00:00:00'] = weather_30min.iloc[0] # the resampled index starts after midnight, so add the 00:00 slot
weather_30min.index = pd.to_datetime(weather_30min.index)
weather_30min = weather_30min.sort_index(ascending=True)
print(weather_data.shape)
print(weather_30min.shape)
(25660, 19)
(51408, 19)
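
For intuition, resample('30min').nearest() snaps the irregular FM-15 report times onto a regular 30-minute grid by copying the nearest observation in time. A toy illustration with made-up values:

# Illustrative only, with made-up values:
s = pd.Series([1.0, 2.0],
              index=pd.to_datetime(['2017-01-01 00:51', '2017-01-01 01:51']))
print(s.resample('30min').nearest())
# 00:30 -> 1.0, 01:00 -> 1.0, 01:30 -> 2.0  (each slot takes the closest report in time)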
weather_30min = weather_30min.fillna(method = 'ffill')
train_data    = train_data.fillna(method = 'ffill')
# Drop weather variables that seem to have little influence
weather_30min = weather_30min.drop(['REPORT_TYPE','SOURCE','HourlyPresentWeatherType',
                                    'HourlySkyConditions','HourlyWindGustSpeed','REM','HourlyDryBulbTemperature',
                                    'HourlyAltimeterSetting','HourlyPressureChange','HourlyPressureTendency',
                                    'HourlySeaLevelPressure','HourlyDewPointTemperature','HourlyWindDirection',
                                    'HourlyStationPressure','HourlyVisibility'],axis=1)

# Add month, weekday, day, hour, and minute as features

weather_30min['month'] = weather_30min.index.month
weather_30min['week'] = weather_30min.index.weekday
weather_30min['day'] = weather_30min.index.day
weather_30min['hour'] = weather_30min.index.hour
weather_30min['minute'] = weather_30min.index.minute

train_data['month'] = train_data.index.month
train_data['week'] = train_data.index.weekday
train_data['day'] = train_data.index.day
train_data['hour'] = train_data.index.hour
train_data['minute'] = train_data.index.hour # NOTE: this stores the hour again; .minute was presumably intended
# Add a holiday column: 0 for non-holidays, and a number 1-13 indicating which holiday

def holiday_labels(index):
    # Returns 0 for non-holiday rows; on holiday dates, a running counter
    # that increments per row and wraps from 13 back to 1 supplies the label.
    flags = []
    count = 0
    for ts in index:
        if datetime.date(ts.year, ts.month, ts.day) in holydays_data:
            count += 1
            if count == 13:
                count = 1
            flags.append(count)
        else:
            flags.append(0)
    return flags

weather_30min['holydays'] = holiday_labels(weather_30min.index)
train_data['holydays']    = holiday_labels(train_data.index)
#for i in range(train_data.shape[1]):
#    plt.figure(figsize=(18,2))
#    plt.plot(train_data.iloc[:train_data.shape[0], i])
#    plt.show()
#    plt.clf()
# Replace non-numeric tokens in the weather data with numbers

for elem in weather_30min.select_dtypes(include=object).columns :
    print(elem)
    weather_30min[elem] = weather_30min[elem].str.replace('s', '0')
    weather_30min[elem] = weather_30min[elem].str.replace('V', '0')
    weather_30min[elem] = weather_30min[elem].str.replace('RB', '0')
    weather_30min[elem] = weather_30min[elem].str.replace('T', '0.001') # 'T' marks trace precipitation in the NOAA data

weather_30min = weather_30min.astype('float32')
HourlyPrecipitation
#for i in range(weather_30min.shape[1]):
#    plt.figure(figsize=(18,2))
#    plt.plot(weather_30min.iloc[:48*92, i])
#    plt.show()
#    plt.clf()
weather_30min.head(1)
HourlyPrecipitation HourlyRelativeHumidity HourlyWetBulbTemperature HourlyWindSpeed month week day hour minute holydays
DATE
2017-01-01 0.0 44.0 38.0 10.0 1.0 6.0 1.0 0.0 0.0 1.0
weather_30min.tail(1)
HourlyPrecipitation HourlyRelativeHumidity HourlyWetBulbTemperature HourlyWindSpeed month week day hour minute holydays
DATE
2019-12-07 23:30:00 0.0 50.0 27.0 8.0 12.0 5.0 7.0 23.0 30.0 0.0
lag_max = int(48*7)      # input window length: past 30-min steps used for prediction (7 days)
predict_len = int(48*7)  # prediction horizon in 30-min steps (7 days)
# Split the weather data into the training period (train), the training windows' target times (future), and the test period (test)
wea_30_train  = weather_30min[:-predict_len]
wea_30_future = weather_30min[lag_max + predict_len-1:-predict_len] # weather at each training window's target time
wea_30_test   = weather_30min[-predict_len:]
wea_30_train  = wea_30_train.astype('float32')
wea_30_future = wea_30_future.astype('float32')
wea_30_test   = wea_30_test.astype('float32')

print(train_data.shape)
print(wea_30_train.shape)
print(wea_30_future.shape)
print(wea_30_test.shape)
(51072, 85)
(51072, 10)
(50401, 10)
(336, 10)
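
The row counts line up by construction; a quick arithmetic check, using only the variables defined above:

# Illustrative check: one wea_30_future row per training window
# total rows - (lag_max + predict_len - 1) leading rows - predict_len test rows
print(len(weather_30min) - (lag_max + predict_len - 1) - predict_len)  # 50401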
# Concatenate the taxi pickup counts with the weather data

df_train = pd.concat([train_data, wea_30_train['HourlyPrecipitation'],wea_30_train['HourlyRelativeHumidity']
                       ,wea_30_train['HourlyWindSpeed'],wea_30_train['HourlyWetBulbTemperature']
                      ],axis='columns', ignore_index=True)

# df_train = pd.concat([train_data, wea_30_train],axis='columns', ignore_index=True)

df_train = df_train.astype('float32')
df_train.head(1)
0 1 2 3 4 5 6 7 8 9 ... 79 80 81 82 83 84 85 86 87 88
2017-01-01 53.0 16.0 45.0 38.0 12.0 6.0 2.0 47.0 31.0 238.0 ... 1.0 6.0 1.0 0.0 0.0 1.0 0.0 44.0 10.0 38.0

1 rows × 89 columns

df_train.tail(1)
0 1 2 3 4 5 6 7 8 9 ... 79 80 81 82 83 84 85 86 87 88
2019-11-30 23:30:00 12.0 1.0 9.0 11.0 4.0 4.0 0.0 23.0 7.0 25.0 ... 11.0 5.0 30.0 23.0 23.0 0.0 0.0 42.0 9.0 27.0

1 rows × 89 columns

df_train.shape
(51072, 89)
# Generate the training inputs, training targets, and test inputs
def gen_dataset(dataset, lag_max, predict_len):
    X, y = [], []
    test_X = []

    for i in range(len(dataset) - lag_max - predict_len + 1):
        a = i + lag_max
        X.append(dataset[i:a, :]) # input window
        y.append(dataset[a + predict_len - 1, :])   # target: the row predict_len steps after the window ends

    for i in range(len(dataset),len(dataset)+predict_len):
        end = i - predict_len + 1
        start = end - lag_max
        test_X.append(dataset[start:end, :]) # one trailing window per forecast step
    return np.array(X), np.array(y), np.array(test_X)

X_train, y_train, X_test = gen_dataset(df_train.values, lag_max, predict_len)
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)

X_train = X_train.astype(np.float32)
y_train = y_train.astype(np.float32)
X_test  = X_test.astype(np.float32)
(50401, 336, 89)
(50401, 89)
(336, 336, 89)
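
As a sanity check of the windowing logic, here is a toy run with made-up data (not the competition data); the same arithmetic gives 51072 - 336 - 336 + 1 = 50401 training windows above:

# Illustrative only:
toy = np.arange(20, dtype=np.float32).reshape(10, 2)   # 10 timesteps, 2 features
toy_X, toy_y, toy_test = gen_dataset(toy, 3, 2)        # lag_max=3, predict_len=2
print(toy_X.shape, toy_y.shape, toy_test.shape)        # (6, 3, 2) (6, 2) (2, 3, 2)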
del df_train,  wea_30_train, weather_30min
gc.collect()
0
y_train = y_train[:,:x_num] # keep only the 79 taxi-count columns as targets
validation_len = 48*1       # hold out the last day (48 half-hour steps) for validation

X_valid = X_train[-validation_len:]
X_train = X_train[:-validation_len]

w_valid = wea_30_future[-validation_len:]
w_train = wea_30_future[:-validation_len]

y_valid = y_train[-validation_len:]
y_train = y_train[:-validation_len]

print(X_train.shape, w_train.shape)
print(X_valid.shape, w_valid.shape)

print(y_train.shape)
print(y_valid.shape)
(50353, 336, 89) (50353, 10)
(48, 336, 89) (48, 10)
(50353, 79)
(48, 79)
# Model hyperparameters

# Number of units in each LSTM layer
hidden_num1 = 512

# Inputs
input1 = Input(( X_train.shape[1], X_train.shape[2])) # history: taxi pickup counts joined with weather
input2 = Input(  w_train.shape[1] ) # weather and calendar features at the prediction target time

# Model
x  = BatchNormalization()(input1)
x  = LSTM(hidden_num1, return_sequences = True)(x)
x  = BatchNormalization()(x)
x  = LSTM(hidden_num1, return_sequences = True)(x)
x  = BatchNormalization()(x)
x  = LSTM(hidden_num1, return_sequences = True)(x)
x  = BatchNormalization()(x)
x  = LSTM(hidden_num1, return_sequences = True)(x)

x1 = Flatten()(x)
x2 = input2 # target-time weather and calendar features
x3 = tf.keras.layers.Concatenate(axis=1)([x1, x2])
x3 = Dense(128, activation="relu")(x3)
x3 = Dense(128, activation="relu")(x3)

# Output layer: one linear unit per taxi zone
outputs = Dense(y_train.shape[1], activation="linear")(x3)

model = Model(inputs=[input1,input2], outputs=[outputs])
model.summary()
2023-12-19 13:54:22.235870: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-12-19 13:54:22.292678: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-12-19 13:54:22.301207: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-12-19 13:54:22.302398: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-12-19 13:54:23.363925: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-12-19 13:54:23.364416: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-12-19 13:54:23.364430: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1700] Could not identify NUMA node of platform GPU id 0, defaulting to 0.  Your kernel may not have been built with NUMA support.
2023-12-19 13:54:23.364613: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-12-19 13:54:23.364655: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1613] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9368 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3080 Ti, pci bus id: 0000:01:00.0, compute capability: 8.6
Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_1 (InputLayer)           [(None, 336, 89)]    0           []                               
                                                                                                  
 batch_normalization (BatchNorm  (None, 336, 89)     356         ['input_1[0][0]']                
 alization)                                                                                       
                                                                                                  
 lstm (LSTM)                    (None, 336, 512)     1232896     ['batch_normalization[0][0]']    
                                                                                                  
 batch_normalization_1 (BatchNo  (None, 336, 512)    2048        ['lstm[0][0]']                   
 rmalization)                                                                                     
                                                                                                  
 lstm_1 (LSTM)                  (None, 336, 512)     2099200     ['batch_normalization_1[0][0]']  
                                                                                                  
 batch_normalization_2 (BatchNo  (None, 336, 512)    2048        ['lstm_1[0][0]']                 
 rmalization)                                                                                     
                                                                                                  
 lstm_2 (LSTM)                  (None, 336, 512)     2099200     ['batch_normalization_2[0][0]']  
                                                                                                  
 batch_normalization_3 (BatchNo  (None, 336, 512)    2048        ['lstm_2[0][0]']                 
 rmalization)                                                                                     
                                                                                                  
 lstm_3 (LSTM)                  (None, 336, 512)     2099200     ['batch_normalization_3[0][0]']  
                                                                                                  
 flatten (Flatten)              (None, 172032)       0           ['lstm_3[0][0]']                 
                                                                                                  
 input_2 (InputLayer)           [(None, 10)]         0           []                               
                                                                                                  
 concatenate (Concatenate)      (None, 172042)       0           ['flatten[0][0]',                
                                                                  'input_2[0][0]']                
                                                                                                  
 dense (Dense)                  (None, 128)          22021504    ['concatenate[0][0]']            
                                                                                                  
 dense_1 (Dense)                (None, 128)          16512       ['dense[0][0]']                  
                                                                                                  
 dense_2 (Dense)                (None, 79)           10191       ['dense_1[0][0]']                
                                                                                                  
==================================================================================================
Total params: 29,585,203
Trainable params: 29,581,953
Non-trainable params: 3,250
__________________________________________________________________________________________________
# Loss: MSE, optimizer: Adam
opt = tf.keras.optimizers.Adam( learning_rate=0.0001)
model.compile(loss="mse", optimizer=opt)
# Plot the loss curves
def history_graph(history):
    loss = history.history['loss']
    val_loss = history.history["val_loss"]
    plt.plot(loss, label='train')
    plt.plot(val_loss, label='validation')
    plt.ylabel('loss')
    plt.xlabel('epochs')
    plt.show()
# Training
early_stopping =  EarlyStopping(monitor='val_loss', min_delta=0.0, patience=10)

history = model.fit([X_train, w_train] , y_train,
                    validation_data = ([X_valid, w_valid], y_valid),
                    epochs = 300,
                    callbacks=[early_stopping],
                    batch_size = 64)
Epoch 1/300
2023-12-19 13:54:34.505869: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:428] Loaded cuDNN version 8700
2023-12-19 13:54:35.021752: I tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:630] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2023-12-19 13:54:35.035516: I tensorflow/compiler/xla/service/service.cc:173] XLA service 0x7ff058118d00 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-12-19 13:54:35.035551: I tensorflow/compiler/xla/service/service.cc:181]   StreamExecutor device (0): NVIDIA GeForce RTX 3080 Ti, Compute Capability 8.6
2023-12-19 13:54:35.062816: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:268] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2023-12-19 13:54:35.302645: I tensorflow/compiler/jit/xla_compilation_cache.cc:477] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
184/787 [======>.......................] - ETA: 56s - loss: 3625.9412
# Plot the loss curves
history_graph(history)
# Prediction
def predict_test(model, data1, data2):
    dt = pd.date_range(start='2019-12-01 00:00:00', end='2019-12-07 23:30:00',freq='30T')
    pred = model.predict([data1, data2])

    df_pred = pd.DataFrame(pred, columns=x_col)
    df_pred['tpep_pickup_datetime'] = dt
    rnn_pred = df_pred.set_index("tpep_pickup_datetime")
    return rnn_pred

predict_df = predict_test(model, X_test, wea_30_test)
11/11 [==============================] - 1s 34ms/step
# Show the predictions
predict_df.head(5)
0 1 2 3 4 5 6 7 8 9 ... 69 70 71 72 73 74 75 76 77 78
tpep_pickup_datetime
2019-12-01 00:00:00 20.524151 10.021410 8.623723 5.892232 5.599199 1.902857 3.138924 9.182867 3.296710 32.088543 ... 65.809982 5.078995 55.101887 205.071075 14.857251 10.543024 -3.234933 3.898872 12.069722 76.063858
2019-12-01 00:30:00 33.481152 11.196672 6.777467 3.840772 7.880636 2.677950 3.882199 9.041745 6.170432 22.862257 ... 57.680309 4.072216 54.963619 234.529877 20.588902 13.257601 -3.807296 0.079872 13.746686 83.828537
2019-12-01 01:00:00 42.203674 12.913897 8.758183 5.161508 7.966932 4.861914 1.806206 9.746524 8.666472 14.500937 ... 50.382183 2.432377 71.211182 252.702972 23.935171 17.963701 -2.904502 0.630962 9.963893 81.621979
2019-12-01 01:30:00 44.559643 14.488832 6.742269 4.959348 8.468466 5.595797 1.123459 11.294849 10.808940 2.638332 ... 35.153545 1.659405 74.073509 230.377274 25.196012 20.871161 -0.390971 1.259590 2.504596 71.689178
2019-12-01 02:00:00 32.880737 14.082376 8.775491 3.238468 5.866222 2.661674 1.542237 9.066517 8.514407 -4.506200 ... 23.671654 3.578797 71.901833 169.839447 19.969809 15.573454 2.060479 5.451656 2.956710 51.237854

5 rows × 79 columns

submit_df = predict_df
submit_df = submit_df.where(submit_df > 0.0, 0.0) # clip negative predictions to zero
submit_df = submit_df.round()                     # round to integer pickup counts
submit_df.to_csv('./submit/submission_LSTM_BEST.csv')
#submit_df.to_csv('/content/drive/MyDrive/submit/submission_180.csv')
# Overlay the last three weeks of training data with the one-week forecast for each zone
for i in range(submit_df.shape[1]):
    plt.figure(figsize=(18,2))
    plt.plot(train_data.iloc[-48*21:, i])
    plt.plot(submit_df.iloc[:, i])
    plt.show()
    plt.clf()
[Per-zone figures: the last three weeks of training data overlaid with the one-week forecast]