Integration of Wheel Friction and Bucket Inertia for Excavator Navigation Using Deep Reinforcement Learning

Andhika Indra Laksana

Andhika Indra Laksana, Dr.Eng. Ir. Igi Ardiyanto, S.T., M.Eng., IPM., ASEAN Eng., SMIEEE.; Ir. Prapto Nugroho, S.T., M.Eng., D.Eng., IPM.

2026 | Thesis | Master's (S2) in Electrical Engineering

Abstract

The modernization of the construction industry demands autonomous navigation systems that can adapt to variations in load dynamics and surface traction. The main challenge for excavators is operational failure on terrain with a low friction coefficient and under varying load inertia, both of which raise the risk of slipping and tipping. This research proposes a Deep Reinforcement Learning (DRL) architecture that explicitly integrates estimates of the wheel-ground friction coefficient (μ) and the bucket moment of inertia (I) into the agent's state observation to improve navigation stability. The system is implemented using the Yanmar ViO80-2PB model in the Webots simulator, with YOLOv8m as a real-time target localizer. The main contribution is an adaptive reward function that allows the agent to autonomously optimize movement torque based on load dynamics. System performance is evaluated quantitatively by comparing the off-policy Soft Actor-Critic (SAC) algorithm, the on-policy Proximal Policy Optimization (PPO) algorithm, human demonstration data, and baseline PPO and SAC models. Experimental results show that the integrated SAC is the most reliable model, achieving the highest Navigation Success Rate of 75% and significantly outperforming both human performance and the ideal models. SAC explores the policy space more smoothly, which enables it to mitigate angular oscillations. The research concludes that SAC with physical feature integration is the best architecture for heavy machinery robotic control requiring high precision and safety. Both proposed algorithms also significantly outperform the baseline models, demonstrating that including physical features in the learning process accelerates the agent's convergence toward an optimal control policy.
This research provides a foundation for the development of safer and more efficient autonomous heavy machinery control, particularly in handling variations in surface friction coefficient and payload fluctuations. Nevertheless, the study is limited to semi-ideal simulation settings without terrain slope, external obstacles, or real-world deployment. Future work will focus on domain randomization and sim-to-real transfer validation to assess generalization capability beyond the simulated environment.
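
The idea of feeding μ and I into the state observation and shaping the reward around them can be illustrated with a minimal sketch. It is not the thesis implementation: the normalization bounds (`mu_max`, `inertia_max`), the penalty weight `k_torque`, the quadratic torque penalty, and every function and class name below are assumptions chosen only for illustration, since the abstract does not specify the observation layout or the reward formula.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PhysicalFeatures:
    """Hypothetical container for the two estimated physical quantities."""
    mu: float       # estimated wheel-ground friction coefficient
    inertia: float  # estimated bucket moment of inertia (assumed kg*m^2)


def _clip01(x: float) -> float:
    """Clamp a value to the interval [0, 1]."""
    return min(max(x, 0.0), 1.0)


def augment_observation(base_obs: List[float], feats: PhysicalFeatures,
                        mu_max: float = 1.0,
                        inertia_max: float = 5000.0) -> List[float]:
    """Append normalized mu and I to the base observation vector.

    mu_max and inertia_max are assumed normalization bounds.
    """
    return list(base_obs) + [_clip01(feats.mu / mu_max),
                             _clip01(feats.inertia / inertia_max)]


def adaptive_reward(progress: float, torque: float, feats: PhysicalFeatures,
                    mu_max: float = 1.0, inertia_max: float = 5000.0,
                    k_torque: float = 0.1) -> float:
    """Progress reward minus a torque penalty that grows when the ground is
    slippery (low mu) or the bucket load is heavy (high I).

    The functional form is an assumption, not the reward used in the thesis.
    """
    slip_risk = 1.0 - _clip01(feats.mu / mu_max)
    load = _clip01(feats.inertia / inertia_max)
    penalty = k_torque * (1.0 + slip_risk + load) * torque ** 2
    return progress - penalty
```

Under these assumptions, the same torque command is penalized more heavily on a low-friction surface than on a high-friction one, which nudges the policy toward gentler actuation exactly where slipping is more likely. The augmented vector could be passed to any SAC or PPO implementation that accepts a flat continuous observation.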

Keywords: Excavator, Navigation, DRL, YOLOv8m, Friction Coefficient, Bucket Inertia
