CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation
Nankai University; NKIARI, Shenzhen Futian; SICE, UESTC
CC4S: Encouraging Certainty and Consistency in Scribble-Supervised Semantic Segmentation
Deep learning-based solutions have achieved impressive performance in semantic segmentation but often require large amounts of training data with fine-grained annotations. To alleviate this requirement, a variety of weakly supervised annotation strategies have been proposed, among which scribble supervision is emerging as a popular one due to its user-friendly annotation process. However, the sparsity and diversity of scribble annotations make it nontrivial to train a network to produce deterministic and consistent predictions directly. To address these issues, in this paper we propose a holistic solution involving the design of network structure, loss, and training procedure, named CC4S, to improve Certainty and Consistency for Scribble-Supervised Semantic Segmentation. Specifically, to reduce uncertainty, CC4S embeds a random walk module into the network structure to make neural representations uniformly distributed within similar semantic regions, which works together with a soft entropy loss function to force the network to produce deterministic predictions. To encourage consistency, CC4S adopts self-supervised training and imposes a consistency loss on the eigenspace of the probability transition matrix in the random walk module (which we call the neural eigenspace). Such self-supervision inherits the category-level discriminability of the neural eigenspace and meanwhile helps the network focus on producing consistent predictions for the salient parts while neglecting semantically heterogeneous backgrounds. Finally, to further improve performance, CC4S uses the network predictions as pseudo-labels and retrains the network with an extra color-constraint regularizer. In comprehensive experiments, CC4S achieves performance comparable to fully supervised methods and shows promising robustness under extreme supervision cases.
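The two uncertainty-reducing ingredients named in the abstract translate naturally into code. Below is a minimal sketch, not the authors' implementation: a one-step random walk over a feature-affinity transition matrix, plus a soft entropy loss that penalizes uncertain per-pixel distributions. All tensor shapes and the temperature `tau` are illustrative assumptions.

```python
# Illustrative sketch of the CC4S ingredients described in the abstract;
# shapes and the temperature `tau` are assumptions, not the paper's values.
import torch
import torch.nn.functional as F

def random_walk_propagate(features, logits, tau=0.1):
    """Propagate class scores along a feature-similarity random walk.

    features: (B, C, H, W) neural representations
    logits:   (B, K, H, W) per-pixel class scores
    """
    B, C, H, W = features.shape
    f = F.normalize(features.flatten(2), dim=1)        # (B, C, HW)
    affinity = torch.bmm(f.transpose(1, 2), f) / tau   # (B, HW, HW)
    P = F.softmax(affinity, dim=-1)                    # row-stochastic transition matrix
    z = logits.flatten(2).transpose(1, 2)              # (B, HW, K)
    z = torch.bmm(P, z)                                # one random-walk step
    return z.transpose(1, 2).view(B, -1, H, W)

def soft_entropy_loss(logits):
    """Entropy of the per-pixel class distribution; minimizing it
    pushes the network toward deterministic predictions."""
    p = F.softmax(logits, dim=1)
    return -(p * torch.log(p.clamp_min(1e-8))).sum(dim=1).mean()
```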
Scaling Up Multi-domain Semantic Segmentation with Sentence Embeddings
The state-of-the-art semantic segmentation methods have achieved impressive performance on predefined close-set individual datasets, but their generalization to zero-shot domains and unseen categories is limited. Since labeling a large-scale dataset is challenging and expensive, training a robust semantic segmentation model on multiple domains has drawn much attention. However, inconsistent taxonomies hinder the naive merging of current publicly available annotations. To address this, we propose a simple solution to scale up the multi-domain semantic segmentation dataset with less human effort: we replace each class label with a sentence embedding, a vector-valued embedding of a sentence describing the class. This approach enables the merging of multiple datasets from different domains, each with varying class labels and semantics. We merged publicly available noisy and weak annotations with the most finely annotated data, over 2 million images in total, which enables training a model that matches the performance of state-of-the-art supervised methods on 7 benchmark datasets, despite not using any images from them. Instead of manually tuning a consistent label space, we utilize a vector-valued embedding of short paragraphs to describe the classes. By fine-tuning the model on standard semantic segmentation datasets, we also achieve a significant improvement over state-of-the-art supervised segmentation on NYUD-V2 (Silberman et al., in: European Conference on Computer Vision, Springer, pp 746–760, 2012) and PASCAL-Context (Everingham et al., Int J Comput Vis 111(1):98–136, 2015), at 60% and 65% mIoU, respectively. Our method can segment unseen labels based on the closeness of language embeddings, showing strong generalization to unseen image domains and labels. Additionally, it enables impressive performance improvements in some adaptation applications, such as depth estimation and instance segmentation. Code is available at https://github.com/YvanYin/SSIW.
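The core mechanism, replacing discrete class labels with sentence embeddings so that differently labeled datasets can be merged, can be sketched as follows. This is a hedged illustration, not the SSIW code: `text_encoder` stands for any sentence encoder mapping strings to vectors, and the class descriptions are made up.

```python
# Hedged sketch: classify pixels by cosine similarity to sentence embeddings
# of class descriptions, so taxonomies from different datasets can coexist.
import torch
import torch.nn.functional as F

class_descriptions = [
    "a paved road surface for vehicles",
    "a person walking or standing",
    "green vegetation such as trees and bushes",
]

def embed_sentences(sentences, text_encoder):
    """text_encoder is any model mapping a list of strings to (K, D) vectors
    (an assumed interface, not a specific library call)."""
    with torch.no_grad():
        emb = text_encoder(sentences)                  # (K, D)
    return F.normalize(emb, dim=-1)

def segment_with_embeddings(pixel_feats, class_emb):
    """pixel_feats: (B, D, H, W) visual features projected into the text space."""
    f = F.normalize(pixel_feats, dim=1)
    # Cosine similarity of every pixel to every class sentence embedding.
    logits = torch.einsum("bdhw,kd->bkhw", f, class_emb)
    return logits.argmax(dim=1)                        # (B, H, W) label map
```

Because classification reduces to nearest-embedding lookup, an unseen label only needs a new description sentence, which is what gives the method its zero-shot behavior.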
Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation with Its Class Label
Peking University, Beijing, China
Learning Generalized Medical Image Segmentation from Decoupled Feature Queries
Jarvis Research Center; Wuhan University; Guangxi Medical University
Progressive Feature Self-Reinforcement for Weakly Supervised Semantic Segmentation
Zhejiang Lab; Xidian University; Zhejiang University; University of Manchester
PyTorch Tutorial
Relevant Intrinsic Feature Enhancement Network for Few-Shot Semantic Segmentation
University of Chinese Academy of Sciences; Chinese Academy of Sciences; Alibaba Group
Scribble-Supervised Semantic Segmentation with Prototype-based Feature Augmentation
Hohai University, Nanjing, China
Cross-Domain Few-Shot Semantic Segmentation via Doubly Matching Transformation
Nanjing University of Aeronautics and Astronautics; State Key Laboratory of Integrated Services Networks, Xidian University
Prompt-and-Transfer: Dynamic Class-Aware Enhancement for Few-Shot Segmentation
Prompting Multi-Modal Image Segmentation with Semantic Grouping
Multi-modal image segmentation is one of the core issues in computer vision. The main challenge lies in integrating common information between modalities while retaining specific patterns for each modality. Existing methods typically perform full fine-tuning on RGB-based pre-trained parameters to inherit the powerful representation of the foundation model. Although effective, such a paradigm is not optimal due to weak transferability and scarce downstream data. Inspired by the recent success of prompt learning in language models, we propose the Grouping Prompt Tuning Framework (GoPT), which introduces explicit semantic grouping to learn modality-related prompts, adapting the frozen pre-trained foundation model to various downstream multi-modal segmentation tasks. Specifically, a class-aware uni-modal prompter is designed to balance intra- and inter-modal semantic propagation by grouping modality-specific class tokens, thereby improving the adaptability of spatial information. Furthermore, an alignment-induced cross-modal prompter is introduced to aggregate class-aware representations and share prompt parameters among different modalities to assist in modeling common statistics. Extensive experiments show the superiority of our GoPT, which achieves SOTA performance on various downstream multi-modal image segmentation tasks by training only < 1% of model parameters.
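The parameter-efficiency claim (< 1% trainable parameters) rests on the general prompt-tuning pattern: freeze the foundation model and learn only small banks of prompt tokens. A minimal sketch under assumed shapes is below; GoPT's semantic grouping and alignment-induced parameter sharing are not reproduced here.

```python
# Generic prompt-tuning sketch (not GoPT itself): only the per-modality
# prompt tokens are trainable; the backbone interface is an assumption.
import torch
import torch.nn as nn

class PromptedBackbone(nn.Module):
    def __init__(self, backbone, embed_dim=768, num_prompts=8, num_modalities=2):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False                    # freeze the foundation model
        # One small bank of learnable prompt tokens per modality (e.g. RGB, depth).
        self.prompts = nn.Parameter(
            torch.randn(num_modalities, num_prompts, embed_dim) * 0.02)

    def forward(self, tokens, modality_idx):
        """tokens: (B, N, D) patch embeddings of one modality."""
        B = tokens.size(0)
        prompt = self.prompts[modality_idx].unsqueeze(0).expand(B, -1, -1)
        # Prepend prompts so the frozen attention layers can attend to them.
        return self.backbone(torch.cat([prompt, tokens], dim=1))
```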
Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement
University of Science and Technology of China; Shanghai AI Laboratory
SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation
Research background: Traditional methods can only segment the categories present in the training set and cannot recognize unseen categories, and both two-stage and single-stage approaches have shortcomings. Two-stage frameworks are computationally inefficient and do not fully exploit contextual information. Single-stage frameworks make the backbone insensitive to spatial information under low-resolution inputs; adding extra networks to supply spatial information increases the computational cost, and so does increasing the number of segmentation categories.
High Quality Segmentation for Ultra High-resolution Images
**Abstract:** Segmenting 4K or 6K ultra high-resolution images requires extra computational consideration in image segmentation. Common strategies, such as down-sampling, patch cropping, and cascade models, cannot address well the balance between accuracy and computation cost. Motivated by the fact that humans distinguish among objects continuously from coarse to precise levels, we propose the Continuous Refinement Model (CRM) for the ultra high-resolution segmentation refinement task. CRM continuously aligns the feature map with the refinement target and aggregates features to reconstruct image details. Besides, our CRM shows significant generalization ability to fill the resolution gap between low-resolution training images and ultra high-resolution testing ones. We present quantitative performance evaluation and visualization to show that our proposed method is fast and effective on image segmentation refinement. Code is available at https://github.com/dvlab-research/Entity/tree/main/CRM.
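One way to picture "continuously aligning the feature map with the refinement target" is an implicit, coordinate-based refiner: sample features and the coarse mask at arbitrary continuous locations and let a small MLP predict the refined value, which decouples training resolution from testing resolution. The sketch below is illustrative, not the released CRM code; all module names and shapes are assumptions.

```python
# Illustrative coordinate-based refinement sketch (not the released CRM code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContinuousRefiner(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        # +1 input channel for the coarse mask value sampled at the same point.
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, 128), nn.ReLU(),
            nn.Linear(128, 1))

    def forward(self, feats, coarse_mask, coords):
        """feats: (B, C, h, w); coarse_mask: (B, 1, h, w);
        coords: (B, N, 2) query points in [-1, 1] (x, y) order."""
        grid = coords.unsqueeze(1)                                 # (B, 1, N, 2)
        f = F.grid_sample(feats, grid, align_corners=False)        # (B, C, 1, N)
        m = F.grid_sample(coarse_mask, grid, align_corners=False)  # (B, 1, 1, N)
        x = torch.cat([f, m], dim=1).squeeze(2).transpose(1, 2)    # (B, N, C+1)
        return torch.sigmoid(self.mlp(x)).squeeze(-1)  # refined value per query
```

Because the query grid is continuous, the same trained refiner can be evaluated at 4K/6K coordinates even when `feats` come from a low-resolution pass.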
Segment Anything
https://segment-anything.com
