• Ep. 247 - Part 1 - June 13, 2024

  • Jun 15 2024
  • Length: 48 mins
  • Podcast

  • Summary

  • ArXiv Computer Vision research for Thursday, June 13, 2024.


    00:21: FouRA: Fourier Low Rank Adaptation

    01:41: Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

    03:18: Few-Shot Anomaly Detection via Category-Agnostic Registration Learning

    04:57: Skim then Focus: Integrating Contextual and Fine-grained Views for Repetitive Action Counting

    06:46: ToSA: Token Selective Attention for Efficient Vision Transformers

    08:00: Computer vision-based model for detecting turning lane features on Florida's public roadways

    09:08: Improving Adversarial Robustness via Feature Pattern Consistency Constraint

    10:52: Research on Deep Learning Model of Feature Extraction Based on Convolutional Neural Network

    12:10: NeRF Director: Revisiting View Selection in Neural Volume Rendering

    13:36: Conceptual Learning via Embedding Approximations for Reinforcing Interpretability and Transparency

    15:03: Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality

    16:40: COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing

    18:16: Fusion of regional and sparse attention in Vision Transformers

    19:26: Zoom and Shift are All You Need

    20:17: EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding

    21:49: The Penalized Inverse Probability Measure for Conformal Classification

    23:24: OpenMaterial: A Comprehensive Dataset of Complex Materials for 3D Reconstruction

    24:47: Blind Super-Resolution via Meta-learning and Markov Chain Monte Carlo Simulation

    26:30: Computer Vision Approaches for Automated Bee Counting Application

    27:17: Dual Attribute-Spatial Relation Alignment for 3D Visual Grounding

    28:16: A Label-Free and Non-Monotonic Metric for Evaluating Denoising in Event Cameras

    29:43: Multiple Prior Representation Learning for Self-Supervised Monocular Depth Estimation via Hybrid Transformer

    31:25: Neural NeRF Compression

    32:29: Preserving Identity with Variational Score for General-purpose 3D Editing

    33:50: AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings

    34:51: Adaptive Temporal Motion Guided Graph Convolution Network for Micro-expression Recognition

    36:10: Enhancing Cross-Modal Fine-Tuning with Gradually Intermediate Modality Generation

    37:34: AMSA-UNet: An Asymmetric Multiple Scales U-net Based on Self-attention for Deblurring

    38:49: Cross-Modal Learning for Anomaly Detection in Fused Magnesium Smelting Process: Methodology and Benchmark

    40:45: A PCA based Keypoint Tracking Approach to Automated Facial Expressions Encoding

    42:02: Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious?

    43:28: FacEnhance: Facial Expression Enhancing with Recurrent DDPMs

    45:11: How structured are the representations in transformer-based vision encoders? An analysis of multi-object representations in vision-language models

    47:08: Suitability of KANs for Computer Vision: A preliminary investigation


