Hosting, Structure & Download

Data Hosting

We currently host the EnvoDat dataset on the CPS Server, which is maintained by the Chair of Cyber-Physical Systems, Montanuniversität Leoben. The raw data is curated in ROS bag format and post-processed into several other formats. To make storage and transfer easier, we have organized the data by sensing modality and scene. This structure allows you to download only the data formats, scenes, and feature characteristics that match your requirements.

Directory Structure

For each scene in EnvoDat, we organized the data in the hierarchical structure shown in Fig. (a) and Fig. (b) below, so that users download only the data format that meets their needs.

gt-poses-xxx.csv contains the ground truth trajectories, which include the position and orientation components of the robot's poses in the form: <timestamp>, <position_x>, <position_y>, <position_z>, <orientation_x>, <orientation_y>, <orientation_z>, <orientation_w>. In addition, the translation and rotation components are provided in the form: <timestamp>, <translation_x>, <translation_y>, <translation_z>, <rotation_x>, <rotation_y>, <rotation_z>, <rotation_w>.
Seq01 contains the recorded data for the first sequence.
imu-xxx-seq0x.csv contains the inertial data in the form: <index>, <timestamp>, <angular_velocity_x>, <angular_velocity_y>, <angular_velocity_z>, <linear_acceleration_x>, <linear_acceleration_y>, <linear_acceleration_z>.
lidar-xxx-seq0x contains the 3D lidar data for the corresponding sequence. The point clouds are provided in PCD format, whereas the 2D data layer images (NIR, Range, Reflective, Signal) are in PNG format.
xxx-seq0x.bag is the ROS bag file that contains all the recorded raw data.
rgbd-xxx-seq0x contains both the depth and the RGB images.
The depth and RGB data are further post-processed into CSV files (depth-xxx-seq01.csv and rgb-xxx-seq01.csv), using the following convention: <index>, <timestamp>, <frame_name>, <frame_width>, <frame_hight>.
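
As a minimal sketch of how the ground truth pose rows described above might be read, the snippet below parses the <timestamp>, <position_*>, <orientation_*> layout with Python's standard csv module. It assumes comma-separated values in the listed column order and tolerates an optional header row; the Pose class, the read_poses helper, and the sample rows are hypothetical and are not part of EnvoDat.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Pose:
    timestamp: float
    position: tuple      # (x, y, z)
    orientation: tuple   # quaternion as (x, y, z, w)


def read_poses(stream):
    """Parse rows of <timestamp>, <position_x..z>, <orientation_x..w>."""
    poses = []
    for row in csv.reader(stream):
        try:
            values = [float(v) for v in row[:8]]
        except ValueError:
            continue  # skip a header row or a malformed line
        if len(values) < 8:
            continue  # skip empty or incomplete rows
        t, px, py, pz, qx, qy, qz, qw = values
        poses.append(Pose(t, (px, py, pz), (qx, qy, qz, qw)))
    return poses


# Made-up sample rows for illustration only; not taken from the dataset.
sample = io.StringIO(
    "timestamp,position_x,position_y,position_z,"
    "orientation_x,orientation_y,orientation_z,orientation_w\n"
    "0.0,1.0,2.0,0.5,0.0,0.0,0.0,1.0\n"
    "0.1,1.1,2.0,0.5,0.0,0.0,0.1,0.99\n"
)
poses = read_poses(sample)
print(len(poses), poses[0].position)
```

To read a downloaded file instead, the same function can be given an open file handle, e.g. open(path, newline=""), where path points to a gt-poses CSV file.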

Fig. (a)

Fig. (b)

In addition, Fig. (b) shows the directory tree for the ready-to-deploy, fine-grained, polygon-based annotated data. It is available in different formats (e.g., OpenAI CLIP, Microsoft COCO, YOLOv*, and JSON). Other formats will be added as soon as they are available. We used a 70%/20%/10% split for the train, validation, and test sets, respectively.
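
The annotated data is already split, so no further partitioning is needed to use it. For readers who want to apply the same 70/20/10 ratio to their own frames, the following is a minimal sketch of a reproducible split. It is not necessarily the procedure we used: the split_frames helper, the fixed seed, and the frame names are illustrative assumptions.

```python
import random


def split_frames(frames, percents=(70, 20, 10), seed=0):
    """Shuffle frame names reproducibly and cut them into three subsets."""
    frames = sorted(frames)                # fixed starting order
    random.Random(seed).shuffle(frames)    # seeded shuffle for repeatability
    n_train = len(frames) * percents[0] // 100
    n_valid = len(frames) * percents[1] // 100
    return {
        "train": frames[:n_train],
        "valid": frames[n_train:n_train + n_valid],
        "test": frames[n_train + n_valid:],  # remainder goes to test
    }


# Hypothetical frame names for illustration only.
frames = [f"frame_{i:06d}.jpg" for i in range(100)]
splits = split_frames(frames)
print({name: len(items) for name, items in splits.items()})
```

Integer percentages are used instead of floating-point ratios so that the subset sizes do not depend on rounding behaviour.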

The contents of the individual formats are as follows:

README.dataset.txt contains information about the annotation, including a link to download the annotated data directly in the desired format.

CLIP:

test, train, and valid folders contain lists of object categories.
_tokenization.txt contains the natural language descriptive labels of objects in the annotated image frames.

COCO:

test, train, and valid folders contain annotations.coco.json and frame_0000xx.jpg files (images associated with the annotations).
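
A minimal sketch of how the polygons in an annotations.coco.json file could be grouped per image is given below. It relies only on the standard COCO keys (images, annotations, categories, and polygon segmentation lists of flattened x, y values); the polygons_by_image helper, the frame name, and the category name are made up for illustration.

```python
import json
from collections import defaultdict


def polygons_by_image(coco):
    """Map each image file name to a list of (category name, polygon points)."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    files = {im["id"]: im["file_name"] for im in coco["images"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        segmentation = ann.get("segmentation", [])
        if not isinstance(segmentation, list):
            continue  # skip run-length encoded masks; only polygons are handled
        for flat in segmentation:
            # COCO stores a polygon as [x1, y1, x2, y2, ...]
            points = list(zip(flat[0::2], flat[1::2]))
            grouped[files[ann["image_id"]]].append(
                (names[ann["category_id"]], points)
            )
    return dict(grouped)


# Made-up annotation content for illustration only.
coco = json.loads("""
{
  "images": [{"id": 1, "file_name": "frame_000001.jpg", "width": 640, "height": 480}],
  "categories": [{"id": 7, "name": "example_class"}],
  "annotations": [
    {"id": 1, "image_id": 1, "category_id": 7,
     "segmentation": [[10, 20, 30, 20, 30, 40]]}
  ]
}
""")
result = polygons_by_image(coco)
print(result)
```

For a downloaded split, the dictionary can be obtained with json.load on the opened annotations.coco.json file instead of the inline string.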

YOLO:

data.yaml contains the annotation metadata used to configure the YOLOv8 model for training, i.e., information about dataset paths, class names, etc.
test, train, and valid folders each contain images and labels subfolders, which hold the *.jpg files and their corresponding annotations.
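
As a minimal sketch of how one polygon label line could be converted to pixel coordinates, the snippet below assumes the common YOLO segmentation layout, in which each line holds a class index followed by normalized x, y pairs. Whether every label file in the dataset follows exactly this layout is an assumption; the parse_yolo_polygon helper, the sample line, and the image size are hypothetical.

```python
def parse_yolo_polygon(line, width, height):
    """Convert '<class> x1 y1 x2 y2 ...' (normalized) to pixel coordinates."""
    fields = line.split()
    class_id = int(fields[0])
    coords = [float(v) for v in fields[1:]]
    if len(coords) < 6 or len(coords) % 2:
        raise ValueError("expected at least three x, y pairs")
    points = [
        (x * width, y * height)
        for x, y in zip(coords[0::2], coords[1::2])
    ]
    return class_id, points


# Made-up label line and image size for illustration only.
class_id, points = parse_yolo_polygon("2 0.25 0.5 0.75 0.5 0.5 1.0", 640, 480)
print(class_id, points)
```

The class index can then be looked up in the class names listed in data.yaml.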