ByteTrack: Multi-Object Tracking by Associating Every Detection Box.- Robust Multi-Object Tracking by Marginal Inference.- PolarMOT: How Far Can Geometric Relations Take Us in 3D Multi-Object Tracking?.- Particle Video Revisited: Tracking through Occlusions Using Point Trajectories.- Tracking Objects As Pixel-Wise Distributions.- CMT: Context-Matching-Guided Transformer for 3D Tracking in Point Clouds.- Towards Generic 3D Tracking in RGBD Videos: Benchmark and Baseline.- Hierarchical Latent Structure for Multi-modal Vehicle Trajectory Forecasting.- AiATrack: Attention in Attention for Transformer Visual Tracking.- Disentangling Architecture and Training for Optical Flow.- A Perturbation-Constrained Adversarial Attack for Evaluating the Robustness of Optical Flow.- Robust Landmark-Based Stent Tracking in X-Ray Fluoroscopy.- Social ODE: Multi-agent Trajectory Forecasting with Neural Ordinary Differential Equations.- Social-SSL: Self-Supervised Cross-Sequence Representation Learning Based on Transformers for Multi-agent Trajectory Prediction.- Diverse Human Motion Prediction Guided by Multi-level Spatial-Temporal Anchors.- Learning Pedestrian Group Representations for Multi-modal Trajectory Prediction.- Sequential Multi-View Fusion Network for Fast LiDAR Point Motion Estimation.- E-Graph: Minimal Solution for Rigid Rotation with Extensibility Graphs.- Point Cloud Compression with Range Image-Based Entropy Model for Autonomous Driving.- Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework.- MotionCLIP: Exposing Human Motion Generation to CLIP Space.- Backbone Is All Your Need: A Simplified Architecture for Visual Object Tracking.- Aware of the History: Trajectory Forecasting with the Local Behavior Data.- Optical Flow Training under Limited Label Budget via Active Learning.- Hierarchical Feature Embedding for Visual Tracking.- Tackling Background Distraction in Video Object Segmentation.- Social-Implicit: Rethinking Trajectory Prediction Evaluation and the Effectiveness of Implicit Maximum Likelihood Estimation.- TEMOS: Generating Diverse Human Motions from Textual Descriptions.- Tracking Every Thing in the Wild.- HULC: 3D HUman Motion Capture with Pose Manifold SampLing and Dense Contact Guidance.- Towards Sequence-Level Training for Visual Tracking.- Learned Monocular Depth Priors in Visual-Inertial Initialization.- Robust Visual Tracking by Segmentation.- MeshLoc: Mesh-Based Visual Localization.- S2F2: Single-Stage Flow Forecasting for Future Multiple Trajectories Prediction.- Large-Displacement 3D Object Tracking with Hybrid Non-local Optimization.- FEAR: Fast, Efficient, Accurate and Robust Visual Tracker.- PREF: Predictability Regularized Neural Motion Fields.- View Vertically: A Hierarchical Network for Trajectory Prediction via Fourier Spectrums.- HVC-Net: Unifying Homography, Visibility, and Confidence Learning for Planar Object Tracking.- RamGAN: Region Attentive Morphing GAN for Region-Level Makeup Transfer.- SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image.- Entropy-Driven Sampling and Training Scheme for Conditional Diffusion Generation.