Computer Vision Engineer

at the Intelligent Content Analysis Laboratory, Shahid Beheshti University, Tehran, Iran


Research Interests:

Action Recognition, Face Recognition, Weakly Supervised Semantic Segmentation, Object Recognition

1. News:


Dec-2019

    • 3rd place team award:

"Face Recognition" challenge, AAIC 2019, Amirkabir University of Technology, Tehran, Iran.

Reza Kahani, Mohammad Kamalzare

Certificate

May-2019

    • 3rd place team award:

"Tree Detection in Horizontal Digital Images" challenge, in conjunction with the 27th Iranian conference on Electrical Engineering, Yazd, Iran. supported by Yazd municipality

Reza Kahani, Mohammad Kamalzare, Mahmoud Rahat

Certificate

Mar-2019

    • A paper accepted in Multimedia Tools and Applications:

R. Kahani, A. Talebpour, and A. Mahmoudi-Aznaveh, "A correlation based feature representation for first-person activity recognition," Multimedia Tools and Applications, March 30, 2019.

Feb-2019

    • A paper submitted to ICIP-2019:

M. Kamalzare, R. Kahani, A. Talebpour, and A. Mahmoudi-Aznaveh, "The effect of scene context on weakly supervised semantic segmentation," arXiv preprint arXiv:1902.04356, 2019.

2. Education:

  • M.S. (2013-2016)

Computer Engineering, Artificial Intelligence,

Shahid Beheshti University (SBU), Tehran, Iran.

  • B.S. (2009-2013)

Computer Engineering, Software Engineering,

Quchan University of Technology (QIET), Quchan, Iran.

3. Publications:

2019

  • R. Kahani, A. Talebpour, and A. Mahmoudi-Aznaveh, "A correlation based feature representation for first-person activity recognition," Multimedia Tools and Applications, March 30, 2019.

Abstract— In this paper, a simple yet efficient activity recognition method for first-person video is introduced. The proposed method is appropriate for representing high-dimensional features such as those extracted from convolutional neural networks (CNNs). The per-frame (per-segment) extracted features are considered as a set of time series, and inter- and intra-time-series relations are employed to represent the video descriptors. To find the inter-time-series relations, the series are grouped and the linear correlation between each pair of groups is calculated. These relations can represent the scene dynamics and local motions. The introduced grouping strategy helps to considerably reduce the computational cost. Furthermore, we split the series in the temporal direction in order to preserve long-term motions and better focus on each local time window. In order to extract the cyclic motion patterns, which can be considered primary components of various activities, intra-time-series correlations are exploited. The representation method results in highly discriminative features which can be linearly classified. The experiments confirm that our method outperforms the state-of-the-art methods in recognizing first-person activities on three challenging first-person datasets.

Paper, Code
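As a rough, hedged sketch (not the authors' released code), the following illustrates the kind of descriptor the abstract describes: per-frame CNN features are treated as a set of time series, the series are grouped, and the linear correlations between pairs of groups within local temporal windows are concatenated into a video descriptor. The group count, the window count, and the averaging used to reduce each group to a single series are illustrative assumptions, and the intra-series correlations used for cyclic motion patterns are omitted for brevity.

    import numpy as np

    def correlation_descriptor(frame_feats, n_groups=8, n_windows=2):
        # frame_feats: (T, D) array of per-frame CNN features.
        # Group the feature dimensions and reduce each group to one
        # time series by averaging (an illustrative simplification).
        groups = np.array_split(frame_feats, n_groups, axis=1)
        series = np.stack([g.mean(axis=1) for g in groups])  # (n_groups, T)
        desc = []
        # Temporal split: correlate the group series inside each window.
        for win in np.array_split(series, n_windows, axis=1):
            corr = np.corrcoef(win)              # (n_groups, n_groups) Pearson
            iu = np.triu_indices(n_groups, k=1)  # unique group pairs
            desc.append(corr[iu])
        return np.concatenate(desc)

    # Example: 120 frames of 4096-d features.
    video = np.random.randn(120, 4096)
    print(correlation_descriptor(video).shape)  # (56,) = 2 windows x C(8, 2)

Note that in this sketch the descriptor size depends only on the number of groups and windows, not on the original feature dimension, which is consistent with the abstract's claims of a compact, linearly classifiable representation at low computational cost.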

2019

  • M. Kamalzare, R. Kahani, A. Talebpour, and A. Mahmoudi-Aznaveh, "The effect of scene context on weakly supervised semantic segmentation," arXiv preprint arXiv:1902.04356, 2019.

Abstract— Image semantic segmentation is the task of parsing an image into several regions, each of which corresponds to a semantic concept. In a weakly supervised setting, since only image-level labels are available, discriminating objects from the background is challenging and, in some cases, much more difficult. More specifically, objects that are commonly seen in one specific scene (e.g., a 'train' is typically seen on a 'railroad track') are much more likely to be confused with it. In this paper, we propose a method that adds target-specific scenes in order to overcome this problem. Specifically, we propose a scene recommender that suggests specific scene contexts to be added to the target dataset so that the model can be trained more accurately. Notably, this idea can serve as a complementary component of many other baseline methods. The experiments validate the effectiveness of the proposed method for the objects whose scene context is added.

Paper
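The mechanics of the recommender are not spelled out in the abstract, so the following is only a hypothetical sketch of the general idea: count how often each image-level object class co-occurs with each scene category (assuming scene labels come from an off-the-shelf scene classifier), then recommend the dominant scenes per class as contexts worth adding to the training set. All names and the co-occurrence heuristic are assumptions, not the paper's method.

    from collections import Counter, defaultdict

    def recommend_scenes(image_labels, image_scenes, top_k=1):
        # image_labels: {image_id: set of image-level object classes}
        # image_scenes: {image_id: scene category from a scene classifier}
        scene_counts = defaultdict(Counter)
        for img, classes in image_labels.items():
            for cls in classes:
                scene_counts[cls][image_scenes[img]] += 1
        # Recommend each class's most frequent (hence most confusable) scenes.
        return {cls: [s for s, _ in counts.most_common(top_k)]
                for cls, counts in scene_counts.items()}

    # Toy example: 'train' almost always appears on 'railroad'.
    labels = {1: {"train"}, 2: {"train"}, 3: {"boat"}}
    scenes = {1: "railroad", 2: "railroad", 3: "harbor"}
    print(recommend_scenes(labels, scenes))
    # {'train': ['railroad'], 'boat': ['harbor']}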

2016

  • R. Kahani, A. Talebpour, and A. Mahmoudi-Aznaveh, "Time Series Correlation for First-Person Videos," in Iranian Conference on Electrical Engineering, ICEE, 2016.

Abstract— In this paper, an efficient feature encoding for first-person video is introduced. The proposed method is appropriate for the abstraction of high-dimensional features such as those extracted from convolutional neural networks (CNNs). The per-frame extracted features are considered as time series, and the relations between them, in both the temporal and spatial directions, are employed to represent the video descriptors. To find the relations, the time series are grouped and the linear correlation between each pair of groups is calculated. Furthermore, we split the series in the temporal direction in order to better focus on each local time window. The experiments show that our method outperforms previous methods such as Bag of Visual Words (BoVW), Improved Fisher Vector (IFV), and the recently proposed Pooled Time Series (PoT) on the first-person DogCentric dataset. In addition, the presented method achieves a considerable improvement in computation time.

Paper, Code, Sample

Figures: feature representation framework; recognition accuracy on the DogCentric dataset; feature representation speed; final feature dimension.