TF-Agents: malloc error during training

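I am running into a malloc error when trying to train a DQN with the tf-agents library.

Specs: M1 Mac, macOS 12, TF 2.6.2, tf-agents 0.10.0, Python 3.8 (same result with 3.9). I use a custom environment wrapped into a TF env. Everything else is a default tf-agents component, with no customization.

The error occurs in roughly 9 out of 10 runs; occasionally the training loop completes successfully. When it does fail, it always fails on the last line of the code, at the call to agent_tf.train(experience).

Many thanks for any suggestions!

Error: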

python3(11957,0x307c33000) malloc: Incorrect checksum for freed object 0x7fc8a9875110: probably modified after being freed.
Corrupt value: 0x7fc8ce5c5a80
python3(11957,0x307c33000) malloc: *** set a breakpoint in malloc_error_break to debug
[1]    11957 abort      python3 main_loop.py
/Users/jankolnik/miniconda3/envs/ml_int/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 3 leaked semaphore objects to clean up at shutdown
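
The abort above comes from native code, so no Python traceback is printed. One way to narrow it down, sketched here as a suggestion rather than a fix, is the standard-library faulthandler module, which dumps the Python stack of every thread when the process receives a fatal signal such as SIGABRT. The helper name enable_crash_tracebacks is made up for this illustration; it would be called at the top of main_loop.py before the training loop starts.

```python
import faulthandler


def enable_crash_tracebacks():
    """Dump the Python traceback of all threads if the process dies on a fatal signal.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL.
    The malloc failure ends in abort(), which raises SIGABRT.
    """
    # Hypothetical helper: call once, before building the agent and driver.
    faulthandler.enable()
    return faulthandler.is_enabled()


enabled = enable_crash_tracebacks()
print("faulthandler enabled:", enabled)
```

The same effect is available without touching the script by running python3 -X faulthandler main_loop.py. The dump only shows Python frames; if the corruption happens on a native worker thread, it shows where the Python threads were at that moment, which is still enough to confirm whether the abort coincides with agent_tf.train(experience).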

Code:

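# Imports for the tf-agents components used below.
# environment, environment_tf, params and train_data are my own modules/objects (not shown).
import tensorflow as tf
import tf_agents.networks.q_network
from tf_agents.agents.dqn.dqn_agent import DqnAgent
from tf_agents.drivers import dynamic_episode_driver
from tf_agents.environments import tf_py_environment
from tf_agents.metrics import tf_metrics
from tf_agents.replay_buffers import tf_uniform_replay_buffer
from tf_agents.utils import common
from tqdm import tqdm

# ENV init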
env_tf = environment_tf.Environment(
    n_actions=len(environment.Actions),
    random_start_on_close=params.random_start_on_close.value,
    bars=params.bars_count.value,
    data=train_data,
    reward_on_close_only=params.reward_on_close_only.value
)
env_tf = tf_py_environment.TFPyEnvironment(env_tf)

# NET init
net = tf_agents.networks.q_network.QNetwork(input_tensor_spec=env_tf.observation_spec(),
                                            action_spec=env_tf.action_spec(),
                                            fc_layer_params=(50, 2),
                                            activation_fn=tf.keras.activations.relu)
tgt_net = tf_agents.networks.q_network.QNetwork(input_tensor_spec=env_tf.observation_spec(),
                                            action_spec=env_tf.action_spec(),
                                            fc_layer_params=(50, 2),
                                            activation_fn=tf.keras.activations.relu)

# AGENT init
train_step_counter = tf.Variable(0)
global_step = tf.compat.v1.train.get_or_create_global_step()
epsilon = tf.compat.v1.train.polynomial_decay(
    params.epsilon_start.value,
    global_step,
    decay_steps=params.episodes.value * 5,
    end_learning_rate=params.epsilon_stop.value)

agent_tf = DqnAgent(
    action_spec=env_tf.action_spec(),
    gamma=params.gamma.value,
    target_update_period=params.target_net_sync.value,
    q_network=net,
    target_q_network=tgt_net,
    optimizer=tf.keras.optimizers.Adam(learning_rate=params.learning_rate.value),
    td_errors_loss_fn=common.element_wise_squared_loss,
    train_step_counter=train_step_counter,
    time_step_spec=env_tf.time_step_spec(),
    epsilon_greedy=epsilon
)
agent_tf.initialize()

# MEMORY init
memory = tf_uniform_replay_buffer.TFUniformReplayBuffer(
    batch_size=1,
    max_length=params.reply_size.value,
    data_spec=agent_tf.collect_data_spec
)

# DRIVER TF
train_metrics = [
            tf_metrics.NumberOfEpisodes(),
            tf_metrics.EnvironmentSteps(),
            tf_metrics.AverageReturnMetric(),
            tf_metrics.AverageEpisodeLengthMetric()
]
driver = dynamic_episode_driver.DynamicEpisodeDriver(num_episodes=1,
                                                     env=env_tf,
                                                     policy=agent_tf.collect_policy,
                                                     observers=[memory.add_batch] + train_metrics)

# MEMORY -> DATASET
sample = memory.as_dataset(sample_batch_size=params.batch_size.value,
                           single_deterministic_pass=False,
                           num_parallel_calls=3,
                           num_steps=2).prefetch(3)
iterator = iter(sample)
agent_tf.train = common.function(agent_tf.train)

# MAIN LOOP
time_step = env_tf.reset()
driver.run(num_episodes=30, time_step=time_step)

for i in tqdm(range(params.episodes.value)):
    time_step, _ = driver.run(time_step=time_step)
    experience, _ = next(iterator)
    loss, _ = agent_tf.train(experience)

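Answer:

This appears to be a known bug on Macs that use the Metal API (Metal plays the role that OpenGL, Vulkan, or DirectX play on Windows). You won't be able to fix it on your end; it is a problem with how Metal support was implemented, and it affects both Intel Macs and M1 Macs.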

https://github.com/apple/tensorflow_macos/issues/19

https://github.com/apple/tensorflow_macos/issues/177
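
Before attributing a crash to the Metal path, it may help to confirm which TensorFlow builds are actually installed and whether Python is running natively on Apple silicon. The following is a minimal sketch using only the standard library. It assumes the Apple-specific builds are published under the distribution names tensorflow-macos and tensorflow-metal; adjust the list if your setup uses different names. The function names are made up for this illustration.

```python
import platform
from importlib import metadata

# Distribution names to look for. The two Apple-specific names are an assumption.
PACKAGES = ("tensorflow", "tensorflow-macos", "tensorflow-metal", "tf-agents")


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def describe_install(packages=PACKAGES):
    """Collect platform details and package versions into one dictionary."""
    report = {
        "system": platform.system(),    # "Darwin" on macOS
        "machine": platform.machine(),  # "arm64" natively on M1, "x86_64" under Rosetta
        "python": platform.python_version(),
    }
    for name in packages:
        report[name] = installed_version(name)
    return report


report = describe_install()
for key, value in report.items():
    print(f"{key}: {value}")
```

If machine reports x86_64 on an M1 Mac, the interpreter is running under Rosetta, and the installed TensorFlow wheel is not the Apple silicon build.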

Answer:

Thx anarchy, the issues you linked pointed me to using the binary Bazel installer instead of running pip install tf-agents==0.10.0. Now I can use tf-agents and TF on Apple silicon, and it is fast!


Tags: Python, tensorflow, tf-agent
