References

[1] R. Arandjelovic, P. Gronat, A. Torii, T. Pajdla, and J. Sivic. NetVLAD: CNN architecture for weakly supervised place recognition. In CVPR, 2016.

[2] Y. Avrithis, Y. Kalantidis, G. Tolias, and E. Spyrou. Retrieving Landmark and Non-Landmark Images from Community Photo Collections. In ACM Multimedia, 2010.

[3] Artem Babenko and Victor Lempitsky. Aggregating deep convolutional features for image retrieval. In ICCV, 2015.

[4] Artem Babenko, Anton Slesarev, Alexandr Chigorin, and Victor Lempitsky. Neural codes for image retrieval. In ECCV, 2014.

[5] Vijay Badrinarayanan, Alex Kendall, and Roberto Cipolla. SegNet: A Deep Convolutional EncoderDecoder Architecture for Image Segmentation. arXiv preprint arXiv:1511.00561, 2015.

[6] M. Bujnak, Z. Kukelova, and T. Pajdla. New efficient solution to the absolute pose problem for camera with unknown focal length. In ACCV, 2010.

[7] Andrei Bursuc, Giorgos Tolias, and Herve Jegou. Kernel local descriptors with implicit rotation matching. In ICMR, 2015.

[8] F. Camposeco, T. Sattler, and M. Pollefeys. Minimal Solvers for Generalized Pose and Scale Estimation from Two Rays and One Point. In ECCV, 2016.

[9] D.M. Chen, G. Baatz, K. Koeser, S.S. Tsai, R. Vedantham, T. Pylvaenaeinen, K. Roimela, Xin Chen, J. Bach, M. Pollefeys, B. Girod, and R. Grzeszczuk. City-scale landmark identification on mobile devices. In CVPR, 2011.

[10] S. Choudhary and P. J. Narayanan. Visibility probability structure from sfm datasets and applications. In ECCV, 2012.

[11] A. Cohen, T. Sattler, and M. Pollefeys. Merging the Unmatchable: Stitching Visually Disconnected SfM Models. In ICCV, 2015

[12] A. Cohen, J. L. Schoenberger, P. Speciale, T. Sattler, J.-M. Frahm, and M. Pollefeys. Indoor-Outdoor 3D Reconstruction Alignment. In ECCV, 2016.

[13] Mark Cummins and Paul Newman. Highly Scalable Appearance-Only SLAM - FAB-MAP 2.0. In RSS, 2019.

[14] J. Delhumeau, P.-H. Gosselin, H. Jegou, and P. P´erez. Revisiting the VLAD image representation. In ACM Multimedia, Barcelona, Spain, October 2013.

[15] Yunchao Gong, Liwei Wang, Ruiqi Guo, and Svetlana Lazebnik. Multi-scale orderless pooling of deep convolutional activation features. In ECCV, 2014.

[16] Albert Gordo, Jon Almazan, Jerome Revaud, and Diane Larlus. Deep image retrieval: Learning global representations for image search. In ECCV, 2016.

[17] P. Gronat, G. Obozinski, J. Sivic, and T. Pajdla. Learning per-location classifiers for visual place recognition. In CVPR, 2013.

[18] R.M. Haralick, C.-N. Lee, K. Ottenberg, and M. Noelle. Review and analysis of solutions of the three point perspective pose estimation problem. IJCV, 13(3):331–356, 1994.

[19] R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge Univ. Press, 2nd edition, 2004.

[20] M. Havlena, A. Torii, and T. Pajdla. Efficient structure from motion by graph optimization. In ECCV, 2010.

[21] Ahmet Iscen, Giorgos Tolias, Philippe-Henri Gosselin, and Herve Jegou. A comparison of dense region detectors for image search and fine-grained classification. Image Processing, IEEE Transactions on, 24(8):2369–2381, 2015.

[22] H. Jegou, M. Douze, and C. Schmid. On the burstiness of visual elements. In CVPR, 2009.

[23] H. Jegou, M. Douze, C. Schmid, and P. Perez. Aggregating local descriptors into a compact image representation. In CVPR, pages 3304–3311, jun 2010.

[24] H. Jegou, F. Perronnin, M. Douze, J. Sanchez, P. Perez, and C. Schmid. Aggregating local image descriptors into compact codes. PAMI, 34(9):1704–1716, September 2012.

[25] K. Josephson and M. Byroed. Pose estimation with radial distortion and unknown focal length. In CVPR, 2009.

[26] Yannis Kalantidis, Clayton Mellina, and Simon Osindero. Cross-dimensional weighting for aggregated deep convolutional features. In arXiv:1512.04065, 2015.

[27] Alex Kendall, Vijay Badrinarayanan, and Roberto Cipolla. Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding. arXiv preprint arXiv:1511.02680, 2015.

[28] Alex Kendall and Roberto Cipolla. Modelling Uncertainty in Deep Learning for Camera Relocalization. In ICRA, 2016.

[29] Alex Kendall, Matthew Grimes, and Roberto Cipolla. PoseNet: A Convolutional Network for RealTime 6-DOF Camera Relocalization. In ICCV, 2015.

[30] K. Kim, A. Torii, and M. Okutomi. Multi-View Inverse Rendering under Arbitrary Illumination and Albedo. In ECCV, 2016.

[31] J. Knopp, J. Sivic, and T. Pajdla. Avoding Confusing Features in Place Recognition. In ECCV, 2010.

[32] Y. Li, N. Snavely, D. Huttenlocher, and P. Fua. Worldwide Pose Estimation Using 3D Point Clouds. In ECCV, 2012.

[33] Y. Li, N. Snavely, and D. P. Huttenlocher. Location Recognition using Prioritized Feature Matching. In ECCV, 2010.

[34] H. Lim, S. N. Sinha, M. F. Cohen, and M. Uyttendaele. Real-Time Image-Based 6-DOF Localization in Large-Scale Environments. In CVPR, 2012.

[35] S. Lynen, T. Sattler, M. Bosse, J. Hesch, M. Pollefeys, and R. Siegwart. Get Out of My Lab: Large-scale, Real-Time Visual-Inertial Localization. In Robotics Science and Systems (RSS), 2015.

[36] S. Middelberg, T. Sattler, O. Untzelmann, and L. Kobbelt. Scalable 6-DOF Localization on Mobile Devices. In ECCV, 2014.

[37] M. Muja and D. G. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In VISAPP, 2009.

[38] D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. In CVPR, 2006.

[39] K. Nozawa, A. Torii, and M. Okutomi. Stable two view reconstruction using the six-point algorithm. In ACCV, 2012.

[40] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object Retrieval with Large Vocabularies and Fast Spatial Matching. In CVPR, 2007.

[41] T. Quack, B. Leibe, and L. Van Gool. World-Scale Mining of Objects and Events from Community Photo Collections. In CIVR, 2008.

[42] Filip Radenovic, Giorgos Tolias, and Ondrej Chum. CNN image retrieval learns from BoW: Unsupervised fine-tuning with hard examples. In ECCV, 2016.

[43] Ali Sharif Razavian, Josephine Sullivan, Atsuto Maki, and Stefan Carlsson. A baseline for visual instance retrieval with deep convolutional networks. In arXiv:1412.6574, 2014.

[44] T. Sattler, M. Havlena, F. Radenovic, K. Schindler, and M. Pollefeys. Hyperpoints and Fine Vocabularies for Large-Scale Location Recognition. In ICCV, 2015.

[45] T. Sattler, M. Havlena, K. Schindler, and M. Pollefeys. Large-Scale Location Recognition And The Geometric Burstiness Problem. In CVPR, 2016.

[46] T. Sattler, B. Leibe, and L. Kobbelt. SCRAMSAC: Improving RANSAC’s Efficiency with a Spatial Consistency Filter. In ICCV, 2009.

[47] T. Sattler, B. Leibe, and L. Kobbelt. Fast Image-Based Localization using Direct 2D-to-3D Matching. In ICCV, 2011.

[48] T. Sattler, B. Leibe, and L. Kobbelt. Improving Image-Based Localization by Active Correspondence Search. In ECCV, 2012.

[49] T. Sattler, C. Sweeney, and M. Pollefeys. On Sampling Focal Length Values to Solve the Absolute Pose Problem. In ECCV, 2014.

[50] G. Schindler, M. Brown, and R. Szeliski. City-Scale Location Recognition. In CVPR, 2007.

[51] J. Sivic and A. Zisserman. Video Google: A Text Retrieval Approach to Object Matching in Videos. In ICCV, 2003.

[52] L. Svarm, O. Enqvist, M. Oskarsson, and F. Kahl. Accurate Localization and Pose Estimation for Large 3D Models. In CVPR, 2014.

[53] C. Sweeney, T. Sattler, M. Turk, T. H¨ollerer, and M. Pollefeys. Optimizing the Viewing Graph for Structure-from-Motion. In ICCV, 2016.

[54] G. Tolias and Y. Avrithis. Speeded-up relaxed spatial matching. In ICCV, 2011.

[55] G. Tolias, Y. Kalantidis, and Y. Avrithis. Symcity: Feature selection by symmetry for large scale image retrieval. In ACM Multimedia, 2012.

[56] G. Tolias, R. Sicre, and Herve Jegou. Particular object retrieval with integral max-pooling of cnn activations. In ICLR, 2016.

[57] Giorgos Tolias, Yannis Avrithis, and Herve Jegou. Image search with selective match kernels: aggregation across single and multiple images. IJCV, 2015.

[58] Giorgos Tolias, Teddy Furon, and Herve Jegou. Orientation covariant aggregation of local descriptors with embeddings. In ECCV, 2014.

[59] A. Torii, R. Arandjelovic, J. Sivic, M. Okutomi, and T. Pajdla. 24/7 place recognition by view synthesis. In CVPR, 2015.

[60] A. Torii, M. Havlena, and T. Pajdla. From Google street view to 3D city models. In OMNIVIS, 2009.

[61] A. Torii, M. Havlena, T. Pajdla, and B. Leibe. Measuring camera translation by the dominant apical angle. In CVPR, 2008.

[62] A. Torii, Z. Kukelova, M. Bujnak, and T. Pajdla. The six point algorithm revisited. In CVVT:E2M, 2010.

[63] A. Torii and T. Pajdla. Omnidirectional camera motion estimation. In VISAPP, 2008.

[64] A. Torii, J. Sivic, and T. Pajdla. Visual Localization by Linear Combination of Image Descriptors. In MVW, 2011.

[65] A. Torii, J. Sivic, T. Pajdla, and M. Okutomi. Visual Place Recognition with Repetitive Structures. In CVPR, 2013.

[66] O. Untzelmann, T. Sattler, S. Middelberg, and L. Kobbelt. A Scalable Collaborative Online System for City Reconstruction. In International Conference on Computer Vision (ICCV) Workshops, 2013.

[67] A. R. Zamir and M. Shah. Accurate Image Localization Based on Google Maps Street View. In ECCV, 2010.

[68] B. Zeisl, T. Sattler, and M. Pollefeys. Camera Pose Voting for Large-Scale Image-Based Localization. In ICCV, 2015.

[69] W. Zhang and J. Kosecka. Image based localization in urban environments. In 3DPVT, 2006.