Learning Optical Flow with Auxiliary Cost Aggregation
Abstract
Optical flow represents the per-pixel motion between two adjacent frames in a video sequence. Over the past few years, deep learning-based approaches to optical flow estimation have overshadowed variational approaches, as they achieve real-time estimation with lower estimation error. Deep learning-based estimation models rely heavily on the cost volume, which is constructed through matrix multiplication and encodes dense matching information between the given inputs. Long-range correlation and occlusion, however, remain challenging, as information drawn from the cost volume is heavily weighted by the local correlation defined over a fixed window size. In this thesis, we propose to enrich the information used for the iterative residual flow decoding process with an Auxiliary Cost Aggregation (ACA) unit, which constructs an auxiliary cost volume from the top-k matches in the 4D cost volume and then augments it using Transformers. We also propose a post-refinement module that refines the predicted residual flow at the end of each iteration based on local feature coherence. Extensive experiments indicate that our model achieves better cross-dataset generalization than two baseline models, RAFT and GMA. On the Sintel and KITTI benchmarks, our model outperforms RAFT and performs comparably to other state-of-the-art (SOTA) models.
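To make the two core operations named above concrete, the following is a minimal NumPy sketch of how an all-pairs 4D cost volume can be built by matrix multiplication of per-pixel features, and how the top-k matches for each source pixel can then be extracted. The function names (`build_cost_volume`, `topk_matches`), shapes, and the 1/sqrt(D) scaling are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def build_cost_volume(f1, f2):
    """All-pairs correlation: dot product of every pixel feature in
    frame 1 with every pixel feature in frame 2, computed as one
    matrix multiplication. f1, f2 have shape (H, W, D)."""
    H, W, D = f1.shape
    a = f1.reshape(H * W, D)
    b = f2.reshape(H * W, D)
    cost = (a @ b.T) / np.sqrt(D)        # (H*W, H*W) correlation matrix
    return cost.reshape(H, W, H, W)      # 4D cost volume

def topk_matches(cost, k):
    """For each source pixel, keep the k highest correlation scores
    and their flattened target-pixel indices."""
    H, W = cost.shape[:2]
    flat = cost.reshape(H * W, H * W)
    idx = np.argsort(flat, axis=1)[:, ::-1][:, :k]   # top-k indices, descending
    val = np.take_along_axis(flat, idx, axis=1)      # corresponding scores
    return val, idx

# Toy example: random features for two tiny 4x4 frames with 8 channels.
rng = np.random.default_rng(0)
f1, f2 = rng.standard_normal((2, 4, 4, 8))
cv = build_cost_volume(f1, f2)
vals, idx = topk_matches(cv, k=3)
print(cv.shape, vals.shape)  # (4, 4, 4, 4) (16, 3)
```

In an actual model the top-k scores and target coordinates would then be fed to the aggregation unit; the fixed-window limitation discussed above arises when, instead of a global top-k, only a local neighborhood of the cost volume is sampled.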