Examples
Random Controller
The following example demonstrates the use of a random controller, which can be used to test any MELGYM environment. This is useful for sanity checks or to ensure that the environment behaves as expected before integrating more complex controllers.
import melgym
import gymnasium as gym


def rand_control(env, n_episodes=1):
    """
    Random controller for testing purposes.

    Args:
        env (gym.Env): Gymnasium environment.
        n_episodes (int): Number of episodes to run. Default is 1.
    """
    for _ in range(n_episodes):
        obs, _ = env.reset()
        done = trunc = False
        while not (done or trunc):
            # Sample a random action from the environment's action space
            action = env.action_space.sample()
            obs, reward, done, trunc, info = env.step(action)
            print(f"Action: {action}, Reward: {reward}, Info: {info}")
            env.render()


if __name__ == '__main__':
    env = gym.make('pressure-v0')
    rand_control(env)
    env.close()
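For a quick sanity check, it can be more convenient to summarize an episode than to print every step. The sketch below wraps the same loop in a helper that returns the episode length and the cumulative reward. `ToyPressureEnv` and `run_random_episode` are hypothetical names introduced only for illustration; the toy class is not part of MELGYM and mimics just the Gymnasium `reset`/`step` signatures so that the snippet runs without a simulator installed.

```python
import random


class ToyPressureEnv:
    """Hypothetical stand-in with Gymnasium-style reset/step; not part of MELGYM."""

    def __init__(self, horizon=20, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.pressure = 1.0

    def sample_action(self):
        # Plays the role of env.action_space.sample() in a real environment.
        return self.rng.uniform(-1.0, 1.0)

    def reset(self):
        self.t = 0
        self.pressure = 1.0
        return self.pressure, {}

    def step(self, action):
        self.t += 1
        self.pressure += 0.1 * action
        # Assumed reward: penalize the distance from an arbitrary set point of 1.0.
        reward = -abs(self.pressure - 1.0)
        terminated = False
        truncated = self.t >= self.horizon  # stop after a fixed number of steps
        return self.pressure, reward, terminated, truncated, {"step": self.t}


def run_random_episode(env, sample_action):
    """Run one episode with random actions and return (steps, total_reward)."""
    obs, _ = env.reset()
    done = trunc = False
    steps, total_reward = 0, 0.0
    while not (done or trunc):
        obs, reward, done, trunc, info = env.step(sample_action())
        steps += 1
        total_reward += reward
    return steps, total_reward


toy_env = ToyPressureEnv()
steps, total_reward = run_random_episode(toy_env, toy_env.sample_action)
print(f"steps={steps}, total_reward={total_reward:.3f}")
```

With a real environment, the same helper could presumably be called as `run_random_episode(env, env.action_space.sample)`, since it relies only on the standard `reset` and `step` return values.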
Training and Evaluation of a DRL Agent
The following script shows how to train and evaluate a reinforcement learning agent in the MELGYM environment pressure-v0:
import melgym
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make('pressure-v0')

# Training
agent = PPO('MlpPolicy', env)
agent.learn(total_timesteps=10_000)

# Evaluation
obs, _ = env.reset()
done = trunc = False
while not (done or trunc):
    env.render()
    act, _ = agent.predict(obs)
    obs, rew, done, trunc, info = env.step(act)

env.close()
In this example, the agent learns to control the pressure of a specific control volume based on its current pressure values. The control action involves adjusting the exhaust flow rate accordingly.
Environment instantiation follows the standard procedure of any Gymnasium-based environment. For the agent, we use the Stable-Baselines3 implementation of Proximal Policy Optimization (PPO). The agent is initialized and trained for a user-defined number of timesteps. Once the training phase is complete, the model is evaluated by running it through an entire simulation episode.
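A single episode gives only one sample of the agent's performance. As a minimal sketch, the evaluation loop can be repeated over several episodes and the returns averaged. The names `ToyPressureEnv`, `ProportionalAgent`, and `evaluate` are hypothetical and exist only so the snippet runs on its own; the stand-in agent exposes the same `predict(obs) -> (action, state)` shape as the trained PPO model, and `deterministic=True` corresponds to the Stable-Baselines3 `predict` argument that disables action sampling.

```python
import statistics


class ToyPressureEnv:
    """Hypothetical stand-in with Gymnasium-style reset/step; not part of MELGYM."""

    def __init__(self, horizon=10, setpoint=1.0):
        self.horizon = horizon
        self.setpoint = setpoint
        self.t = 0
        self.pressure = 2.0

    def reset(self):
        self.t = 0
        self.pressure = 2.0
        return self.pressure, {}

    def step(self, action):
        self.t += 1
        self.pressure -= action  # the action stands in for an exhaust flow rate
        reward = -abs(self.pressure - self.setpoint)
        return self.pressure, reward, False, self.t >= self.horizon, {}


class ProportionalAgent:
    """Hypothetical agent exposing predict(obs) -> (action, state)."""

    def __init__(self, setpoint=1.0, gain=0.5):
        self.setpoint = setpoint
        self.gain = gain

    def predict(self, obs, deterministic=True):
        # Simple proportional rule; a trained model would use its policy network.
        return self.gain * (obs - self.setpoint), None


def evaluate(agent, env, n_episodes=3):
    """Return the mean and population standard deviation of episode returns."""
    returns = []
    for _ in range(n_episodes):
        obs, _ = env.reset()
        done = trunc = False
        total = 0.0
        while not (done or trunc):
            action, _ = agent.predict(obs, deterministic=True)
            obs, reward, done, trunc, _ = env.step(action)
            total += reward
        returns.append(total)
    return statistics.mean(returns), statistics.pstdev(returns)


mean_return, std_return = evaluate(ProportionalAgent(), ToyPressureEnv())
print(f"mean return: {mean_return:.4f} +/- {std_return:.4f}")
```

Because `evaluate` depends only on `predict`, `reset`, and `step`, the trained `agent` and the `env` from the script above should be usable in place of the stand-ins. Stable-Baselines3 also provides `evaluate_policy` in `stable_baselines3.common.evaluation` for the same purpose.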
Tip
This example provides a practical starting point for building and testing custom DRL controllers in MELGYM environments.