BATIK RVGAN: A GENERATIVE ADVERSARIAL NETWORK WITH RETRIEVAL FOR BATIK MOTIF SYNTHESIS

Agus Eko Minarno

Agus Eko Minarno; Prof. Ir. Hanung Adi Nugroho, S.T., M.Eng., Ph.D., IPM., SMIEEE.; Dr. Indah Soesanti, S.T., M.T.

2025 | Dissertation | Doctoral Program (S3) in Electrical Engineering

Abstract

Batik is a cultural heritage with high philosophical and economic value. Over time, however, batik motif development has stagnated, which has affected the market. Research using artificial intelligence has been proposed to generate new batik motifs, and the most frequently proposed method is the Generative Adversarial Network (GAN). In general, a GAN learns from original batik motifs and attempts to create synthetic ones: it learns the pixel distribution of the original motif images and reproduces that distribution from the latent space.

However, the pixel distribution learned during training does not always yield the expected synthetic motifs, because many components affect it. The components with the greatest influence on the pixel distribution in a GAN are the generator, the discriminator, and the loss function. An unsuitable architecture or parameter configuration for the generator and discriminator, or an unsuitable loss function, can cause motif synthesis to fail and may even lead to vanishing or exploding gradients.

One of the most significant contributions to synthetic batik motif generation is Batik GAN SL, which can create a new motif by combining two batik motifs. However, Batik GAN SL still has several problems. First, synthesis quality is low and motif details are not visible, which is caused by the low resolution of the dataset. Second, in practice, synthesizing from randomly selected motifs strongly affects the result: combining unrelated motifs degrades synthesis quality, so relevant motifs must be selected to obtain the desired synthesis. Third, Batik GAN SL suffers from pronounced vanishing and exploding gradients caused by a suboptimal architecture and hyperparameter configuration. In addition, the synthetic motifs contain considerable noise and artifacts, caused by pixel convergence failures in the GAN, particularly in the generator. Fourth, the loss function in Batik GAN SL measures only pixel similarity and does not account for human visual perception, so the resulting synthetic motifs are suboptimal.
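The fourth issue can be illustrated with a toy comparison between a pixel-wise loss and a loss computed on extracted features, which is the general idea behind perceptual losses. The sketch below is an assumption-laden illustration and not the loss used in Batik GAN SL or in this dissertation: `edge_features` is a hypothetical stand-in for a learned feature extractor, and the signals are one-dimensional lists rather than images.

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def edge_features(signal):
    """Toy feature extractor (hypothetical): absolute neighbour differences."""
    return [abs(signal[i + 1] - signal[i]) for i in range(len(signal) - 1)]


def pixel_loss(real, fake):
    """Compare raw values only."""
    return mse(real, fake)


def perceptual_loss(real, fake, extractor=edge_features):
    """Compare extracted features instead of raw values."""
    return mse(extractor(real), extractor(fake))


real = [0, 10, 0, 10]        # a striped "motif"
brighter = [5, 15, 5, 15]    # same stripes, uniformly brighter
flat = [5, 5, 5, 5]          # stripes lost entirely

# Both candidates have the same pixel loss, but only the flat one
# is penalised by the feature-based loss.
print(pixel_loss(real, brighter), perceptual_loss(real, brighter))
print(pixel_loss(real, flat), perceptual_loss(real, flat))
```

In this toy case the pixel-wise loss cannot tell a brightness shift from a complete loss of structure, while the feature-based loss can. A real perceptual loss would replace `edge_features` with activations from a trained network.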

Therefore, this research proposes improvements to the Batik GAN SL method in four aspects. First, a high-quality, high-resolution primary dataset is built for training both the GAN model and the image retrieval model. Second, a new image retrieval model based on a Convolutional Neural Network (CNN) is built to select relevant motifs before they are passed to the GAN. Third, the quality of the synthetic motifs is improved by optimizing the generator and the discriminator: the generator uses the Pixel Shuffle technique to improve pixel convergence during motif synthesis, while batch normalization is added to the discriminator to keep gradients from exploding. Fourth, for the loss function, a Perceptual Loss modified to use BatikNet as its feature extractor is proposed.
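As a rough illustration of the Pixel Shuffle step, the sketch below rearranges a tensor of shape (C·r², H, W) into (C, H·r, W·r), assuming the abstract refers to the standard sub-pixel rearrangement (the same layout that `torch.nn.PixelShuffle` uses). It works on plain nested lists for clarity and is not taken from the Batik RVGAN generator; the function name and argument names are illustrative.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Each group of r*r input channels is interleaved into one output
    channel whose height and width are r times larger.
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r block
            for j in range(r):      # column offset inside each r x r block
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# Four 1x1 channels become a single 2x2 channel.
low_res = [[[1]], [[2]], [[3]], [[4]]]
print(pixel_shuffle(low_res, 2))
```

Because upsampling is done by rearranging channels that a convolution has already produced, no values are interpolated or zero-filled in this step.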

The proposed method is named Batik RVGAN, a combination of RetrieVal and GAN (RVGAN). Batik RVGAN is evaluated in two parts: the image retrieval model is evaluated using precision and recall, while the GAN is evaluated using Fréchet Inception Distance (FID), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), and Learned Perceptual Image Patch Similarity (LPIPS).
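Two of the metrics named above are simple enough to sketch directly. The code below computes precision and recall for one retrieval query and PSNR for a pair of flattened images, using the textbook definitions; the function names, the 8-bit `max_value` default, and the sample inputs are assumptions for illustration, not the evaluation code used in the dissertation. FID, SSIM, and LPIPS need a pretrained network or windowed statistics and are left out.

```python
import math


def precision_recall(retrieved, relevant):
    """Precision and recall for a single retrieval query."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def psnr(reference, generated, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB for two flattened images."""
    if not reference or len(reference) != len(generated):
        raise ValueError("images must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical query: four images returned, three truly relevant in the dataset.
print(precision_recall(["a", "b", "c", "d"], ["a", "b", "e"]))
print(psnr([10, 20, 30], [12, 18, 30]))
```

Higher precision, recall, PSNR, and SSIM indicate better results, while lower FID and LPIPS indicate that synthetic images are closer to the real ones.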

Tests of Batik RVGAN on four datasets (Batik MLCV, Batik Nitik Sarimbit 120, Batik Nitik 252, and Batik ITB) show that a generator architecture optimized with Pixel Shuffle and hyperparameter tuning improves the quality of the synthetic motifs and reduces noise, as reflected in lower FID values. In the discriminator, adding batch normalization keeps gradients balanced during training. The modified Perceptual Loss has a positive effect on weight updates in both the generator and the discriminator, so the synthetic motifs converge better. Overall, the Batik RVGAN model achieves an FID of 12.61.

In addition, the synthetic motifs were evaluated qualitatively through a questionnaire in which 325 respondents selected the best synthetic motif among those produced by Batik GAN SL, Batik GAN CL, and Batik RVGAN. More than 80% of respondents chose the Batik RVGAN motifs as the best. The CNN-based image retrieval model also performed better than models in previous studies: trained on the primary dataset Batik Nitik 960, it achieved a precision of 0.99.

The contributions and novelty of this research are two new datasets, Batik Nitik 960 and Batik Sarimbit 120; a new CNN-based image retrieval model; the new Batik RVGAN model; and a new BatikNet model for Perceptual Loss. In addition, Batik RVGAN outperforms Batik GAN SL and Batik GAN CL when tested on four datasets. The Perceptual Loss method built on BatikNet proved effective in classifying original and synthetic motifs. The synthetic motifs produced by Batik RVGAN were well received by the 325 respondents, with more than 80% of votes in their favor. Moreover, the synthesized motifs were curated by three batik experts, who stated that they are acceptable as part of batik motif development while preserving the original motifs and their philosophy.

Keywords: Batik, BatikNet, Generative Adversarial Network, Image Retrieval, Convolutional Neural Network, Autoencoder, Perceptual.

S3-2025-495159-abstract.pdf
S3-2025-495159-bibliography.pdf
S3-2025-495159-tableofcontent.pdf
S3-2025-495159-title.pdf
