Audio data is provided by the San Diego Zoo Research Center.
The audio was collected in Peru in 2019.
30-60 minutes of audio per AudioMoth from 5 different AudioMoths, split into 1-minute intervals.
Roughly two weeks of audio data per deployment (approximately 30 deployments), with the devices recording for one minute every 10 minutes (a rough per-deployment total is sketched after these notes).
The audio recordings capture vocalizations of birds, insects, bats, and other wildlife at various times of the day.
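As a rough back-of-the-envelope estimate (only the 14-day duration and the one-minute-every-10-minutes duty cycle are taken from the notes above; everything else is simple arithmetic), a single deployment yields on the order of 30+ hours of audio:

```python
# Rough estimate of recorded audio per deployment under the stated duty cycle.
DAYS_PER_DEPLOYMENT = 14           # "roughly two weeks"
MINUTES_RECORDED_PER_HOUR = 6      # one minute recorded every 10 minutes

hours_per_day = 24 * MINUTES_RECORDED_PER_HOUR / 60          # 2.4 hours of audio per day
hours_per_deployment = DAYS_PER_DEPLOYMENT * hours_per_day   # ~33.6 hours per deployment

print(f"~{hours_per_deployment:.1f} hours of audio per deployment")
```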
Audio data was preprocessed using the animal2vec model by Dr. Julian Schäfer-Zimmermann.
The model was trained using the Xeno-Canto bird dataset (https://xeno-canto.org/).
More on the approach: https://arxiv.org/abs/2406.01253.
Audio files were transformed into vector embeddings using the animal2vec transformer model. Each one-minute input recording, sampled at 384 kHz, is embedded at a rate of 200 Hz, i.e., one embedding vector every 5 ms.
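A minimal sketch of the resulting tensor shapes, assuming the 384 kHz input rate and 200 Hz embedding rate stated above; the embedding dimension and the model call are placeholders, not the actual animal2vec API:

```python
# Sketch of per-clip embedding shapes; the model output is mocked, not produced by animal2vec.
import numpy as np

SAMPLE_RATE_HZ = 384_000   # AudioMoth recording rate
CLIP_SECONDS = 60          # one-minute clips
EMBED_RATE_HZ = 200        # one embedding frame every 5 ms
EMBED_DIM = 768            # hypothetical embedding dimension (assumption)

n_samples = SAMPLE_RATE_HZ * CLIP_SECONDS             # 23,040,000 audio samples per clip
n_frames = EMBED_RATE_HZ * CLIP_SECONDS               # 12,000 embedding frames per clip
samples_per_frame = SAMPLE_RATE_HZ // EMBED_RATE_HZ   # 1,920 samples (5 ms) per frame

# In practice the clip is passed through the animal2vec encoder;
# here the output is zero-filled only to illustrate its shape.
embeddings = np.zeros((n_frames, EMBED_DIM), dtype=np.float32)
print(embeddings.shape)  # (12000, 768)
```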
Embeddings are high-dimensional vector representations of audio data.
The goal is to further train the model on the Peruvian bird recordings to improve its performance on this data.