Reinforcement Learning-Based Excavator Kinematic Control Using the Proximal Policy Optimization (PPO) Algorithm
CENDIKIA ISHMATUKA S, Dr. Indah Soesanti, S.T., M.T.; Ahmad Ataka Awwalur Rizqi, S.T., Ph.D.
2023 | Undergraduate thesis (Skripsi) | S1 Electrical Engineering
An excavator is a complex, dynamic system that requires accurate control to operate efficiently and safely. Manual control by experienced operators, the traditional approach, is limited in effectiveness and efficiency in industrial processes, while conventional control methods such as PID require considerable effort to model the excavator's kinematics. Reinforcement learning offers an alternative in which the control policy is learned through trial and error. This study uses the Proximal Policy Optimization (PPO) algorithm as provided by the Stable Baselines3 library. The agent is trained in computer simulation, with PyBullet as the physics engine and OpenAI Gym as the environment API. During training, the model is evaluated and the reward function, observation space, and hyperparameters are adjusted to obtain the best-performing model. The results show that the resulting model can control the excavator effectively, in particular by driving the bucket's position and orientation to a desired target point, and can also follow specified trajectories with small position and orientation errors. Using a reinforcement learning model for excavator control can therefore save time and effort, because it does not require an in-depth understanding of the excavator's kinematics.
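The abstract does not state the reward function that was used, so the following is only a minimal sketch of how a reward for bucket position and orientation tracking might be shaped. It assumes a planar boom-arm-bucket chain (the swing joint is ignored), and the link lengths, weights, and function names (`bucket_pose`, `angle_diff`, `pose_reward`) are hypothetical and not taken from the thesis.

```python
import math

# Hypothetical link lengths in metres (boom, arm, bucket); not from the thesis.
LINK_LENGTHS = (5.7, 2.9, 1.5)


def bucket_pose(joint_angles, link_lengths=LINK_LENGTHS):
    """Planar forward kinematics: return (x, z, pitch) of the bucket tip."""
    x = z = pitch = 0.0
    for angle, length in zip(joint_angles, link_lengths):
        pitch += angle  # each joint angle is relative to the previous link
        x += length * math.cos(pitch)
        z += length * math.sin(pitch)
    return x, z, pitch


def angle_diff(a, b):
    """Smallest signed difference a - b, wrapped to [-pi, pi]."""
    return math.atan2(math.sin(a - b), math.cos(a - b))


def pose_reward(joint_angles, target, w_pos=1.0, w_ori=0.5):
    """Assumed shaping: negative weighted sum of position and orientation error."""
    x, z, pitch = bucket_pose(joint_angles)
    target_x, target_z, target_pitch = target
    pos_err = math.hypot(x - target_x, z - target_z)
    ori_err = abs(angle_diff(pitch, target_pitch))
    return -(w_pos * pos_err + w_ori * ori_err)
```

In a Gym-style environment, a function like `pose_reward` would typically be evaluated inside `step()` from the simulated joint states, so the agent is rewarded for reducing the pose error without being given an explicit kinematic model. The weights `w_pos` and `w_ori` are the kind of quantity the abstract describes adjusting during training.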
Keywords: Excavator, Reinforcement Learning, Proximal Policy Optimization