Reinforcement Learning-Based Swarm Drone Path Planning System

NANDA RIANGGA DAMANIK, Andi Dharmawan, S.Si., M.Cs., Dr.; Bakhtiar Alldino A.S., S.Si., M.Cs.

2023 | Undergraduate Thesis (Skripsi) | S1 Electronics and Instrumentation

Swarm drones are groups of flying robots that work together to achieve a specific goal, one of which is path planning: guiding each aircraft to its destination without hitting obstacles. Heuristic path planning algorithms suffer from the curse of dimensionality, which motivated reinforcement learning approaches such as Deep Q-Learning (DQN). Because DQN struggles to plan feasible paths for a drone swarm, the method was extended into Dueling Double Deep Q-Learning (D3QN). A Gazebo-based simulation world was integrated with ArduPilot SITL to build the swarm drone path planning system. Sensor data such as the depth map, velocity, distance to the destination, distance to the geofence boundaries, and inter-agent distances are fed into a Q-network that selects actions in the form of three-dimensional velocity modifications. The research trained three quadcopters for 60,000 steps and evaluated two test scenarios. D3QN training achieved an average reward across all agents of -97.68, while DQN achieved only -108.22. In the four-obstacle test, D3QN produced an average reward of 37.36 and an average final distance to the destination of 23.54, while DQN produced -121.04 and 148.26, respectively. In the eight-obstacle test, D3QN produced an average reward of -51.91 and an average final distance to the destination of 90.96, while DQN produced -105.70 and 142.27, respectively. It is therefore concluded that D3QN outperforms DQN in path planning for drone swarms.
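For illustration only, the sketch below shows the two ingredients that distinguish D3QN from plain DQN as described in the abstract: a dueling value/advantage head and a double-DQN target that lets the online network choose the next action while the target network evaluates it. It is a minimal PyTorch sketch, not the thesis's implementation; the observation size, hidden width, and the nine-action discretisation of the velocity command are assumptions, and the depth-map input is replaced by a flattened feature vector.

```python
import torch
import torch.nn as nn


class DuelingQNetwork(nn.Module):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).

    Hypothetical sketch; assumes a flattened observation vector
    (e.g. velocity, distance to destination, geofence distances,
    inter-agent distances) rather than the thesis's depth-map input.
    """

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.feature = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.value = nn.Linear(hidden, 1)              # state value V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # advantages A(s, a)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        h = self.feature(obs)
        v = self.value(h)
        a = self.advantage(h)
        # Subtracting the mean advantage keeps Q identifiable.
        return v + a - a.mean(dim=1, keepdim=True)


def double_dqn_target(online, target, reward, next_obs, done, gamma=0.99):
    """Double-DQN target: online net picks the action, target net evaluates it."""
    with torch.no_grad():
        next_action = online(next_obs).argmax(dim=1, keepdim=True)
        next_q = target(next_obs).gather(1, next_action).squeeze(1)
        return reward + gamma * (1.0 - done) * next_q


if __name__ == "__main__":
    obs_dim, n_actions = 12, 9  # assumed sizes for illustration
    online = DuelingQNetwork(obs_dim, n_actions)
    target_net = DuelingQNetwork(obs_dim, n_actions)
    target_net.load_state_dict(online.state_dict())
    print(online(torch.randn(4, obs_dim)).shape)  # -> torch.Size([4, 9])
```

The mean-subtracted advantage aggregation and the action-selection/evaluation split are the two changes that separate D3QN from the plain DQN baseline compared in the results above.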

Keywords: Swarm drones, Machine learning, DQN, D3QN

  1. S1-2023-442385-abstract.pdf  
  2. S1-2023-442385-bibliography.pdf  
  3. S1-2023-442385-tableofcontent.pdf  
  4. S1-2023-442385-title.pdf