Method
We used a Graph Convolutional Neural Network (GCN) to learn the tactile states. The graph is built from the distributed tactile sensors: each taxel is a node whose features are the tactile sensor values in the X, Y and Z directions, while the edge features carry spatial information about the hand, encoding the positions of nodes (taxels) relative to their neighbourhood.
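A minimal sketch of this graph representation is shown below. The layout of the four-taxel patch, the random placeholder values, and the exact contents of the 4-dimensional edge descriptors are illustrative assumptions; only the node feature dimensionality (X, Y, Z readings) and the per-edge feature size of 4 follow from the text.

```python
import numpy as np

# Hypothetical example: a small patch of 4 taxels.
num_taxels = 4

# Node features: tactile sensor readings in X, Y and Z for each taxel
# (placeholder values, one row per taxel/node).
node_features = np.random.randn(num_taxels, 3).astype(np.float32)

# Adjacency matrix: 1 where two taxels are connected, 0 otherwise.
adjacency = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
], dtype=np.float32)

# Edge features: one 4-dimensional descriptor per edge, carrying the spatial
# relation between the two taxels it connects (the exact encoding, e.g. a
# relative position vector plus distance, is an assumption here).
num_edges = int(adjacency.sum() // 2)
edge_features = np.random.randn(num_edges, 4).astype(np.float32)
```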
The adjacency matrix A contains a 1 where two nodes are connected (an edge is present) and a 0 where they are not (the edge is absent). Replacing the 1's with edge feature vectors would make the matrix multiplication in the graph convolution ill-defined. To address this, we propose mapping the edge features into the node feature space. The edge features are mapped using a single-layer perceptron with an input dimension of (number of edges × 4) and an output dimension of (number of nodes × 4); we refer to this layer as the 'edge feature encoder'. The output of the edge feature encoder can then be used for graph convolutions, as shown in the figure below.
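The following PyTorch sketch illustrates one way the edge feature encoder could be realised, assuming the stated input dimension of (number of edges × 4) and output dimension of (number of nodes × 4); the class name and the flatten/reshape convention are assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class EdgeFeatureEncoder(nn.Module):
    """Single-layer perceptron mapping edge features into node feature space.

    Input:  flattened edge features of size (num_edges * 4)
    Output: per-node encoding of size (num_nodes * 4), reshaped to (num_nodes, 4)
    """

    def __init__(self, num_edges: int, num_nodes: int, edge_dim: int = 4):
        super().__init__()
        self.num_nodes = num_nodes
        self.edge_dim = edge_dim
        self.encoder = nn.Linear(num_edges * edge_dim, num_nodes * edge_dim)

    def forward(self, edge_features: torch.Tensor) -> torch.Tensor:
        # edge_features: (num_edges, edge_dim) -> flatten -> project.
        x = self.encoder(edge_features.flatten())
        # Reshape to (num_nodes, edge_dim) so the output can be treated as a
        # node feature matrix and fed to a graph convolution.
        return x.view(self.num_nodes, self.edge_dim)
```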
Rather than concatenating the node features of different modalities and passing them through a single thread of Graph Convolutional layers, we propose performing simultaneous graph convolutions on features of different modalities, and hence different feature spaces, associated with the same graph.
In the proposed MT-GCN architecture, each node feature modality is processed in a parallel, independent thread of Graph Convolutional layers, as shown in the figure below. After the final Graph Convolutional layer of each modality's thread, the features are fused for further use.
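A minimal sketch of the two-thread case is given below, assuming a basic graph convolution H' = ReLU(Â H W) with a normalized adjacency matrix Â, two layers per thread, and fusion by concatenation; the thread depth, hidden size, and fusion operator are assumptions for illustration rather than the exact MT-GCN configuration.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """Basic graph convolution: H' = ReLU(A_hat @ H @ W)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h: torch.Tensor, a_hat: torch.Tensor) -> torch.Tensor:
        return torch.relu(a_hat @ self.linear(h))


class MTGCN(nn.Module):
    """Sketch of the multi-thread GCN: one stack of graph convolutions per
    node feature modality, with the thread outputs fused (here by
    concatenation, an assumption) after the final convolution of each thread."""

    def __init__(self, tactile_dim: int = 3, edge_enc_dim: int = 4,
                 hidden_dim: int = 16):
        super().__init__()
        # Thread 1: tactile node features (X, Y, Z sensor values).
        self.tactile_thread = nn.ModuleList([
            GCNLayer(tactile_dim, hidden_dim),
            GCNLayer(hidden_dim, hidden_dim),
        ])
        # Thread 2: edge features mapped into node feature space by the
        # edge feature encoder.
        self.edge_thread = nn.ModuleList([
            GCNLayer(edge_enc_dim, hidden_dim),
            GCNLayer(hidden_dim, hidden_dim),
        ])

    def forward(self, tactile_feats, edge_node_feats, a_hat):
        h1, h2 = tactile_feats, edge_node_feats
        for layer in self.tactile_thread:
            h1 = layer(h1, a_hat)
        for layer in self.edge_thread:
            h2 = layer(h2, a_hat)
        # Fuse the two modality threads for the downstream layers.
        return torch.cat([h1, h2], dim=-1)
```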