Analysis of AQuaDQN hyperparameters

For each environment, we sample 1000 different configurations of hyperparameters at random, and report statistics over the configurations.
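The random sampling described above can be sketched as follows. This is a minimal illustration, not the sweep code itself: the names in `SEARCH_SPACE` mirror the hyperparameters analysed in this section, but the candidate values, the seed, and the helper `sample_configurations` are all hypothetical. Each value is drawn independently and uniformly, so duplicates are possible in this sketch.

```python
import random

# Hypothetical search space. The keys follow the hyperparameters analysed in
# this section; the candidate values are placeholders, not the real sweep.
SEARCH_SPACE = {
    "aquadem_learning_rate": [1e-5, 1e-4, 1e-3],
    "temperature": [0.1, 1.0, 10.0],
    "num_actions": [5, 10, 20],
    "dqn_learning_rate": [1e-5, 1e-4, 1e-3],
    "n_step": [1, 3, 5],
    "epsilon": [0.0, 0.01, 0.1],
}


def sample_configurations(space, num_configs, seed=0):
    """Draw `num_configs` configurations, picking each value uniformly at random."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    return [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(num_configs)
    ]


# One batch of 1000 configurations, as would be drawn for a single environment.
configs = sample_configurations(SEARCH_SPACE, num_configs=1000)
```

Each sampled configuration would then be trained once per environment, and its final success rate recorded for the statistics reported below.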

Success rate learning curves


Hyperparameters influence breakdown

For each hyperparameter, we aggregate all configurations for each possible value of the hyperparameter and compute the following statistic:
the average final success rate over the top 50% of configurations (since poorly performing configurations are uninformative).
Where a bar appears to be "missing" from a plot, the success rate is 0 for all the corresponding configurations.
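As an illustration, the statistic can be computed along the following lines. This is a sketch under stated assumptions: the top 50% is taken within each group of configurations sharing a hyperparameter value, the group size is rounded up when it is odd, and the function names and toy `results` data are hypothetical.

```python
from collections import defaultdict


def top_half_mean(success_rates):
    """Average of the best 50% of the given final success rates."""
    ranked = sorted(success_rates, reverse=True)
    keep = (len(ranked) + 1) // 2  # assumption: round up for odd group sizes
    top = ranked[:keep]
    return sum(top) / len(top)


def influence_breakdown(results, hyperparameter):
    """Map each value of `hyperparameter` to its top-50% average success rate.

    `results` is a list of (configuration dict, final success rate) pairs.
    """
    groups = defaultdict(list)
    for config, final_success_rate in results:
        groups[config[hyperparameter]].append(final_success_rate)
    return {value: top_half_mean(rates) for value, rates in groups.items()}


# Toy data (made up): four runs with n_step=1 and two runs with n_step=3.
results = [
    ({"n_step": 1}, 0.0),
    ({"n_step": 1}, 0.25),
    ({"n_step": 1}, 0.5),
    ({"n_step": 1}, 1.0),
    ({"n_step": 3}, 0.0),
    ({"n_step": 3}, 0.0),
]
breakdown = influence_breakdown(results, "n_step")
```

In this toy example the group with `n_step=3` gets a statistic of 0, which is the situation where a bar would look absent from the plot.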

Analysis of aquadem learning rate


Analysis of temperature


Analysis of number of actions


Analysis of aquadem input dropout rate


Analysis of aquadem hidden dropout rate


Analysis of DQN learning rate


Analysis of n step


Analysis of epsilon


Analysis of min demonstration reward


Analysis of demonstration ratio