Hossein Kashiani

I am a fourth-year Ph.D. student in the IS-WiN Lab at Clemson University (CU), advised by Dr. Fatemeh Afghah. Before joining CU, I worked as a research assistant on biometrics at West Virginia University (WVU). I completed my Master's degree in Electrical Engineering at Iran University of Science & Technology (IUST). My research focuses on improving the generalization of machine learning models to unseen domains, with applications in anomaly detection, biometrics, healthcare, visual perception, and scene understanding.

Email / Google Scholar / GitHub / LinkedIn / Twitter

News
  • [2024/11] Two papers have been accepted at WACV 2025.
  • [2024/11] Successfully passed the Qualifying Exam.
  • [2024/06] My new paper CATFace has been accepted by IEEE Transactions on Biometrics, Behavior, and Identity Science.
  • [2024/02] AAFACE has been accepted at IEEE ICIP 2023.
  • [2023/11] MedViT has been featured in Computers in Biology and Medicine.
  • [2023/09] Our new method on Morph Attack Detection has been accepted by IJCB 2023.
  • [2023/03] Our face morphing detector ranks among the top in NIST's Face Recognition Vendor Test (FRVT).
Research

My current research focuses on advancing Multimodal Large Language Models to make few-shot anomaly detection more robust and adaptable, and to provide a detailed description of each detected irregularity. My prior work at CU includes developing advanced prompting techniques for Vision-Language models and robust multi-class anomaly detection.

At WVU, in collaboration with CITER and NIST, I explored advanced security measures for automated face recognition systems, focusing on robust detection and generation of face morphing attacks. Additionally, I worked on multimodal biometric recognition under long-range constraints to address issues related to turbulent, low-quality imagery from extended distances.

At IUST, I worked on advancing Vision Transformers for healthcare. This included developing hybrid Transformers with locality inductive biases for medical imaging; enhancing adversarial robustness with data augmentation techniques such as Moment Exchanger and Patch Momentum Changer; and improving the modeling of local and global dependencies within Transformer architectures.

Beyond these areas, my research extends to visual perception and scene understanding, including object tracking, detection, and segmentation. This work covers enhancing the robustness, generalization, and adaptability of object detection and semantic segmentation models for autonomous vehicles, and developing a multi-teacher knowledge distillation framework that improves the performance of lightweight perception models so that they operate effectively in challenging environments.

Publications
Style-Pro: Style-Guided Prompt Learning for Generalizable Vision-Language Models.
NA Talemi, H Kashiani, F Afghah
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025.
CVF / arxiv / Poster / bibtex

We propose a style-guided prompt learning framework to enhance generalization in Vision-Language models.

ROADS: Robust Prompt-driven Multi-Class Anomaly Detection under Domain Shift.
H Kashiani, NA Talemi, F Afghah
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025.
CVF / arxiv / Poster / bibtex

We propose a robust multi-class anomaly detection framework with a class-aware prompt integration mechanism to mitigate inter-class interference and a domain adapter to handle domain shifts.

CATFace: Cross-Attribute-Guided Transformer With Self-Attention Distillation for Low-Quality Face Recognition.
NA Talemi, H Kashiani, NM Nasrabadi
IEEE Transactions on Biometrics, Behavior, and Identity Science, 2024.
arxiv / IEEE / bibtex

We propose a novel multi-branch network with cross-attribute-guided fusion and self-attention distillation, improving face recognition in low-quality images using soft biometric attributes.

MedViT: A Robust Vision Transformer for Generalized Medical Image Classification.
ON Manzari, H Ahmadabadi, H Kashiani, SB Shokouhi, A Ayatollahi
Computers in Biology and Medicine, 2023.
arxiv / Elsevier / bibtex / code

This study proposes a robust and efficient CNN-Transformer hybrid model, combining CNN locality with the global connectivity of vision Transformers. Additionally, we enhance robustness by learning smoother decision boundaries through feature mean and variance permutation within mini-batches.

AAFACE: Attribute-aware Attentional Network for Face Recognition.
NA Talemi, H Kashiani, and 4 more authors
IEEE International Conference on Image Processing (ICIP), 2023.
arxiv / IEEE / bibtex

We present a multi-branch network using attribute-aware integration to enhance face recognition through soft biometric prediction.

Towards Generalizable Morph Attack Detection with Consistency Regularization.
H Kashiani, NA Talemi, NM Nasrabadi
IEEE International Joint Conference on Biometrics (IJCB), 2023.
arxiv / IEEE / bibtex / Poster / code

We propose consistency regularization with morph-wise augmentations to improve the generalization and robustness of morph attack detection against unseen morph attacks in biometric systems.

Face Image Quality Vector Assessment for Biometrics Applications.
N Najafzadeh, H Kashiani, and 4 more authors
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023.
CVF / IEEE / bibtex

This paper proposes a multi-task neural network that generates a face quality vector, including nuisance factors, offering improved performance and detailed feedback for face image quality assessment.

Robust Ensemble Morph Detection with Domain Generalization.
H Kashiani, SM Sami, S Soleymani, NM Nasrabadi
IEEE International Joint Conference on Biometrics (IJCB), 2022.
arxiv / IEEE / bibtex / Poster / Video / FRVT Results / code

This paper proposes a robust ensemble of CNNs and Transformers for morph detection that enhances generalization to morph attacks and increases robustness against adversarial threats through multi-perturbation training.

Robust Transformer with Locality Inductive Bias and Feature Normalization.
ON Manzari, H Kashiani, HA Dehkordi, SB Shokouhi
Engineering Science and Technology, an International Journal, 2022.
arxiv / Elsevier / bibtex / code

This paper proposes a robust transformer model that incorporates locality inductive bias and feature normalization, enhancing generalization and robustness in feature extraction tasks.

Multi-expert Human Action Recognition with Hierarchical Super-class Learning.
HA Dehkordi, AS Nezhad, H Kashiani, SB Shokouhi, A Ayatollahi
Knowledge-Based Systems, 2022.
arxiv / Elsevier / bibtex

We propose a two-phase multi-expert classification method for human action recognition, addressing long-tailed distribution using super-class learning without extra data or manual annotation. A novel Graph-Based Class Selection (GCS) algorithm optimizes class configurations and inter-class dependencies.

Generalizing State-of-the-art Object Detectors for Autonomous Vehicles in Unseen Environments.
A Khosravian, A Amirkhani, H Kashiani, M Masih-Tehrani
Expert Systems with Applications, 2021.
Elsevier / bibtex

We address generalization issues in scene understanding for autonomous vehicles by employing GANs for weather modeling together with advanced augmentations. This improves the robustness and cross-domain generalization of object detection, especially under adverse weather conditions and natural distortions.

Lightweight Local Transformer for COVID-19 Detection Using Chest CT Scans.
HA Dehkordi, H Kashiani, AAH Imani, SB Shokouhi
International Conference on Computer and Knowledge Engineering, 2021.
arxiv / IEEE / bibtex / Video

This paper introduces a hybrid CNN-Transformer model for COVID-19 diagnosis using CT images, combining local and global feature extraction, and achieving superior performance with limited training data.

Robust Semantic Segmentation with Multi-Teacher Knowledge Distillation.
A Amirkhani, A Khosravian, M Masih-Tehrani, H Kashiani
IEEE Access, 2021.
IEEE / bibtex

We propose a multi-teacher KD framework in which several expert CNNs, trained on different settings, supervise a lightweight student model. This framework enhances the robustness and performance of the student by using diverse knowledge sources.

Visual Object Tracking Based on Adaptive Siamese and Motion Estimation Network.
H Kashiani, SB Shokouhi
Image and Vision Computing, 2019.
arxiv / Elsevier / bibtex

This work aims to improve motion and observation models in visual object tracking. We propose a motion estimation network to refine target location predictions, with a Siamese network detecting the most probable candidate. Additionally, a weighting CNN adaptively assigns weights to similarity scores, accounting for target appearance changes.

Patchwise Object Tracking via Structural Local Sparse Appearance Model.
H Kashiani, SB Shokouhi
International Conference on Computer and Knowledge Engineering, 2017.
arxiv / IEEE / bibtex

This paper proposes a robust tracking method that exploits relationships between target patches in adjacent frames using a sparse appearance model.

Academic Service