Longterm Wiki

Authentication systems

paper

Authors

Hüseyin Fuat Alsan · Taner Arsan

Credibility Rating

3/5
Good (3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

This arXiv preprint demonstrates curriculum learning applied to multimodal deep learning for post-disaster analytics. It is relevant to AI safety through its focus on improving model robustness and reliability in high-stakes emergency response scenarios.

Paper Details

Citations
0
1 influential
Year
2023

Metadata

arXiv preprint · primary source

Abstract

This paper explores post-disaster analytics using multimodal deep learning models trained with a curriculum learning method. Studying post-disaster analytics is important as it plays a crucial role in mitigating the impact of disasters by providing timely and accurate insights into the extent of damage and the allocation of resources. We propose a curriculum learning strategy to enhance the performance of multimodal deep learning models. Curriculum learning emulates the progressive learning sequence in human education by training deep learning models on increasingly complex data. Our primary objective is to develop a curriculum-trained multimodal deep learning model, with a particular focus on visual question answering (VQA) capable of jointly processing image and text data, in conjunction with semantic segmentation for disaster analytics using the FloodNet dataset (https://github.com/BinaLab/FloodNet-Challenge-EARTHVISION2021). To achieve this, a U-Net model is used for semantic segmentation and image encoding. A custom-built text classifier is used for visual question answering. Existing curriculum learning methods rely on manually defined difficulty functions. We introduce a novel curriculum learning approach termed Dynamic Task and Weight Prioritization (DATWEP), which leverages a gradient-based method to automatically decide task difficulty during curriculum learning training, thereby eliminating the need for explicit difficulty computation. The integration of DATWEP into our multimodal model shows improvement on VQA performance. Source code is available at https://github.com/fualsan/DATWEP.

Summary

This paper proposes a curriculum learning approach for post-disaster analytics using multimodal deep learning models that jointly process images and text. The authors introduce Dynamic Task and Weight Prioritization (DATWEP), a novel gradient-based curriculum learning method that automatically determines task difficulty during training without manual specification. The approach combines a U-Net for semantic segmentation and image encoding with a custom text classifier for visual question answering, and is evaluated on the FloodNet dataset for flood damage assessment.
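The core mechanism, dynamically reweighting task losses via gradients rather than a hand-crafted difficulty function, can be illustrated with a minimal sketch. This is a hypothetical toy setup, not the authors' exact DATWEP formulation: two task losses (segmentation and VQA) are combined with softmax weights over learnable logits, and gradient ascent on those logits shifts weight toward the currently harder task.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Learnable logits for two tasks: [segmentation, VQA].
# Hypothetical setup for illustration, not the paper's exact method.
logits = np.zeros(2)
lr = 0.1

def weighted_loss_and_grad(task_losses, logits):
    w = softmax(logits)
    total = float(w @ task_losses)
    # Analytic gradient of the weighted total w.r.t. each logit:
    # d(total)/d(logit_j) = w_j * (L_j - total)
    grad = w * (task_losses - total)
    return total, grad, w

# Example: the VQA loss (index 1) is currently higher than segmentation.
losses = np.array([0.4, 1.2])
for _ in range(20):
    total, grad, w = weighted_loss_and_grad(losses, logits)
    # Gradient *ascent* on the logits increases the weight of the
    # above-average-loss (harder) task, emulating dynamic prioritization.
    logits += lr * grad

w = softmax(logits)
print(w)  # weight on the harder (VQA) task ends up above 0.5
```

In a real training loop the task losses would change each step as the model learns, so the weights would shift back and forth as tasks become easier or harder.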

Cited by 1 page

Page | Type | Quality
AI Capability Threshold Model | Analysis | 72.0

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 98 KB
Dynamic Task and Weight Prioritization Curriculum Learning for Multimodal Imagery

Hüseyin Fuat Alsan a (huseyinfuat.alsan@stu.khas.edu.tr), Taner Arsan a (arsan@khas.edu.tr)

a Computer Engineering Department, Kadir Has University, Istanbul, Turkey

Corresponding Author:

Hüseyin Fuat Alsan

Computer Engineering Department, Kadir Has University, Istanbul, Turkey

Tel: (+90) 0536 825 29 11

Email: huseyinfuat.alsan@stu.khas.edu.tr

###### Abstract

This paper explores post-disaster analytics using multimodal deep learning models trained with a curriculum learning method. Studying post-disaster analytics is important as it plays a crucial role in mitigating the impact of disasters by providing timely and accurate insights into the extent of damage and the allocation of resources. We propose a curriculum learning strategy to enhance the performance of multimodal deep learning models. Curriculum learning emulates the progressive learning sequence in human education by training deep learning models on increasingly complex data. Our primary objective is to develop a curriculum-trained multimodal deep learning model, with a particular focus on visual question answering (VQA) capable of jointly processing image and text data, in conjunction with semantic segmentation for disaster analytics using the FloodNet dataset. To achieve this, a U-Net model is used for semantic segmentation and image encoding. A custom-built text classifier is used for visual question answering. Existing curriculum learning methods rely on manually defined difficulty functions. We introduce a novel curriculum learning approach termed Dynamic Task and Weight Prioritization (DATWEP), which leverages a gradient-based method to automatically decide task difficulty during curriculum learning training, thereby eliminating the need for explicit difficulty computation. The integration of DATWEP into our multimodal model shows improvement on VQA performance. Source code is available at https://github.com/fualsan/DATWEP.

###### keywords:

Deep Learning, Multimodal Deep Learning, Curriculum Learning, Semantic Segmentation, Visual Question Answering

## 1 Introduction

Multimodal deep learning uses, and relates between, different data types, which can include images, text, audio, time series, and even tabular data. It can be seen as a generalization method for deep learning at the data level. Relations between different types (modalities) can be automatically constructed via the pattern recognition commonly employed by deep learning models. However, training with multimodal data can be challenging and often requires efficient training algorithms. Curriculum learning is considered in this work to increase multimodal performance. Instead of randomly shuffling the data samples, curriculum learning schedules training in a meaningful order, from the data samples that are easy for deep learning models to learn to the data samples that are hard to learn. A difficulty measurement function is required
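The scheduling idea described above can be sketched in a few lines. This is a toy illustration with hand-assigned difficulty scores (in practice the score would come from a difficulty measurement function, which DATWEP aims to make unnecessary); the sample names are hypothetical.

```python
import random

# Toy samples with per-sample difficulty scores (assigned by hand here;
# a real curriculum would compute these with a difficulty function).
samples = [
    {"id": "flooded_road", "difficulty": 0.9},
    {"id": "clear_field",  "difficulty": 0.1},
    {"id": "damaged_roof", "difficulty": 0.6},
]

def curriculum_order(samples):
    """Schedule training from easiest to hardest sample."""
    return sorted(samples, key=lambda s: s["difficulty"])

def random_order(samples, seed=0):
    """Baseline: the usual random shuffling of training data."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    return shuffled

order = [s["id"] for s in curriculum_order(samples)]
print(order)  # ['clear_field', 'damaged_roof', 'flooded_road']
```

A full curriculum would typically also grow the training pool over epochs (easy subset first, then progressively adding harder samples) rather than just sorting a single pass.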

... (truncated, 98 KB total)