The ten-volume set LNCS 15016-15025 constitutes the refereed proceedings of the 33rd International Conference on Artificial Neural Networks and Machine Learning, ICANN 2024, held in Lugano, Switzerland, during September 17–20, 2024. The 294 full papers and 16 short papers included in these proceedings were carefully reviewed and selected from 764 submissions. The papers cover the following topics:

Part I - theory of neural networks and machine learning; novel methods in machine learning; novel neural architectures; neural architecture search; self-organization; neural processes; novel architectures for computer vision; and fairness in machine learning.
Part II - computer vision: classification; computer vision: object detection; computer vision: security and adversarial attacks; computer vision: image enhancement; and computer vision: 3D methods.
Part III - computer vision: anomaly detection; computer vision: segmentation; computer vision: pose estimation and tracking; computer vision: video processing; computer vision: generative methods; and topics in computer vision.
Part IV - brain-inspired computing; cognitive and computational neuroscience; explainable artificial intelligence; robotics; and reinforcement learning.
Part V - graph neural networks; and large language models.
Part VI - multimodality; federated learning; and time series processing.
Part VII - speech processing; natural language processing; and language modeling.
Part VIII - biosignal processing in medicine and physiology; and medical image processing.
Part IX - human-computer interfaces; recommender systems; environment and climate; city planning; machine learning in engineering and industry; applications in finance; artificial intelligence in education; social network analysis; artificial intelligence and music; and software security.
Part X - workshop: AI in drug discovery; workshop: reservoir computing; special session: accuracy, stability, and robustness in deep neural networks; special session: neurorobotics; and special session: spiking neural networks.
Computer Vision: Anomaly Detection
  Hybrid Encoder for Anomaly Detection Based on Latent Feature Regularization.

Computer Vision: Segmentation
  DGFormer: A Dynamic Kernel with Gaussian Fusion Transformer for Semantic Image Segmentation.
  Integrating Audio-Visual Contexts with Refinement for Segmentation.
  Loci-Segmented: Improving Scene Segmentation Learning.
  Large Language Model for Action Anticipation.
  MFPNet: A Multi-scale Feature Propagation Network for Lightweight Semantic Segmentation.
  Weakly-Supervised Semantic Segmentation via Label Re-assignment in Dual-view Framework.

Computer Vision: Pose Estimation and Tracking
  DT2S-Pose: A Deeper Temporal-Spatial Skeleton Refine Model for Pedestrian Pose Estimation.
  DTG: Learning A Dynamic Token Graph for 3D Pose Forecasting.
  Dual-Branch Network with Online Knowledge Distillation for 3D Hand Pose Estimation.
  MovePose: A High-performance Human Pose Estimation Algorithm on Mobile and Edge Devices.
  Siamese visual tracking with correlation and awareness.

Computer Vision: Video Processing
  Alignment-Enhanced Network for Temporal Language Grounding in Videos.
  Boundary-aware Noise-resistant Video Moment Retrieval.
  Large Language Model for Action Anticipation.
  Learning Object Permanence from Videos via Latent Imaginations.
  SSFlowNet: Semi-supervised Scene Flow Estimation On Point Clouds With Pseudo Label.
  Video Understanding Using 2D-CNNs on Salient Spatio-temporal Slices.

Computer Vision: Generative Methods
  A robust cycle generative adversarial network with an improved atmospheric scatter model for image dehazing.
  CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Ground Image Synthesis.
  Dual Dreamer: Extending Single-view Dreamer with Few shot of Complementary Views.
  Hair Transfer with Efficient Heuristic Chain of Editing.
  MAGIC: Multi-prompt Any length video Generation model with controllable Inter-frame Correlation and low barrier.
  Make Audio Solely Drive Lip in Talking Face Video Synthesis.
  P2H-GAN: An Effective Method For Generating Handwritten Expressions.
  SCI-Font: Enhancing Content-Style Representation for Chinese Calligraphy Generation with Skeleton, Contour and Inexact Paired Data.

Topics in Computer Vision
  Driver Safety System: A Real-time Sleep Detection and Lane Detection Model using IoT and Deep Learning.
  Gaze target detection with Visual Prompt Tuning based on attention.
  Let Multi-Classification Help Deep Imbalanced Regression.
  ProGEO: Generating Prompts through Image-Text Contrastive Learning for Visual Geo-localization.

Product details

ISBN: 9783031723377
Published: 2024-09-17
Publisher: Springer International Publishing AG
Height: 235 mm
Width: 155 mm
Audience level: Research, P, 06
Language: English
Format: Paperback