Title
Event-based Temporally Dense Optical Flow Estimation with Sequential Learning
Authors
Abstract
Event cameras provide an advantage over traditional frame-based cameras when capturing fast-moving objects without motion blur. They achieve this by recording changes in light intensity (known as events), allowing them to operate at a much higher frequency and making them suitable for capturing motion in highly dynamic scenes. Many recent studies have proposed methods to train neural networks (NNs) to predict optical flow from events. However, they often rely on a spatio-temporal representation constructed from events over a fixed interval, such as the 10Hz used in training on the DSEC dataset. This restricts the flow prediction to the same interval (10Hz), while the fast speed of event cameras, which can operate at up to 3kHz, remains largely unexploited. In this work, we show that temporally dense flow estimation at 100Hz can be achieved by treating flow estimation as a sequential problem using two different variants of recurrent networks: long short-term memory (LSTM) and spiking neural networks (SNNs). First, we use an NN model constructed similarly to the popular EV-FlowNet but with LSTM layers to demonstrate the efficiency of our training method. The model not only produces optical flow 10x more frequently than existing methods, but the estimated flows also have 13% lower error than predictions from the baseline EV-FlowNet. Second, we construct an EV-FlowNet SNN with leaky integrate-and-fire neurons to efficiently capture temporal dynamics. We find that the simple inherent recurrent dynamics of SNNs lead to a significant reduction in parameters compared to the LSTM model. In addition, because of its event-driven computation, the spiking model is estimated to consume only 1.5% of the energy of the LSTM model, highlighting the efficiency of SNNs in processing events and their potential for achieving temporally dense flow.
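The abstract attributes the SNN's parameter savings to the "simple inherent recurrent dynamics" of leaky integrate-and-fire (LIF) neurons: the membrane potential itself carries state across time steps, with no separate gating weights as in an LSTM. A minimal sketch of one LIF update step, processing an event-like input sequence step by step; the decay constant, threshold, and input shapes are illustrative assumptions, not the paper's actual hyperparameters.

```python
import numpy as np

def lif_step(v, x, decay=0.9, threshold=1.0):
    """One LIF time step: leak the membrane potential, integrate the
    input, emit a spike where the threshold is crossed, then hard-reset
    the neurons that spiked. The decaying potential v is the implicit
    recurrent state -- no extra recurrent weight matrices are needed."""
    v = decay * v + x                          # leaky integration
    spikes = (v >= threshold).astype(np.float32)
    v = v * (1.0 - spikes)                     # reset after a spike
    return v, spikes

# Drive the layer sequentially, as in sequential flow estimation:
rng = np.random.default_rng(0)
v = np.zeros(4)                                # membrane state persists over time
for t in range(10):
    x = rng.random(4) * 0.5                    # stand-in for per-step event features
    v, spikes = lif_step(v, x)
```

Because the state update is just a decay plus accumulation, the only learnable parameters in a spiking layer are the feedforward weights producing `x`, which is the source of the parameter reduction relative to an LSTM's four gate matrices.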