Orthoptera Sound Classification
List the French orthoptera species
The ASsociation pour la Caractérisation et l’ÉTude des Entomocénoses (ASCETE) provides a PDF listing the orthoptera species of France. From the raw text content of this file, we can extract a TSV listing the binomial name and French common name of each species, using ./src/extract_species_list_tsv.py.
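As a rough illustration of this extraction step, the sketch below turns raw text lines into TSV rows. It is not the content of ./src/extract_species_list_tsv.py: the line layout (a two-word binomial followed by the French name), the column headers, and the function names are all assumptions made for the example, and the real PDF text will likely need rules matched to its actual layout.

```python
import csv
import io
import re

# Hypothetical line shape: "<Genus> <species> <French common name>".
# The real PDF text may be laid out differently.
LINE_RE = re.compile(r"^([A-Z][a-z]+ [a-z]+)\s+(.+)$")


def parse_species_lines(lines):
    """Return (binomial, french_name) pairs for lines matching the assumed shape."""
    rows = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            rows.append((match.group(1), match.group(2).strip()))
    return rows


def to_tsv(rows):
    """Serialize the pairs as tab-separated text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["binomial", "french_name"])  # assumed column names
    writer.writerows(rows)
    return buf.getvalue()
```

Note that any line starting with a capitalized word followed by a lowercase word matches this pattern, so page headers or titles in the raw text would have to be filtered out separately.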
Build a reference audio dataset from Xeno-Canto
Using the xeno-canto-py helper functions to work with the Xeno-Canto API (modified for API v3 in this fork), we can bulk download a set of audio recordings for each orthoptera species, using ./src/construct_reference_dataset.py.
To run this step, you need to set the XENO_CANTO_API_KEY environment variable, e.g. in a .env file.
The audio files are downloaded into subfolders of dataset/audio, one per species, each named after the species' binomial name.
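One way to pick up the key is sketched below, using only the standard library. The helper names `parse_dotenv` and `get_api_key` are hypothetical, and the precedence (process environment first, then .env content) is an assumption; the actual script may load the variable differently, for example through a dotenv library.

```python
import os


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(environ=None, dotenv_text=""):
    """Return XENO_CANTO_API_KEY from the environment, else from .env content."""
    environ = os.environ if environ is None else environ
    key = environ.get("XENO_CANTO_API_KEY") or parse_dotenv(dotenv_text).get(
        "XENO_CANTO_API_KEY"
    )
    if not key:
        raise RuntimeError("XENO_CANTO_API_KEY is not set")
    return key
```

Failing early with a clear error avoids starting a bulk download that would be rejected by the API on the first request.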
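The folder layout can be sketched as follows, together with a small check of how many files each species folder holds after the download. Both function names are hypothetical, and using the binomial name verbatim (with its space) as the folder name is an assumption; the script may normalize names differently.

```python
from pathlib import Path


def species_audio_dir(binomial, root="dataset/audio"):
    """Return the folder for one species, e.g. dataset/audio/Gryllus campestris."""
    return Path(root) / binomial.strip()


def count_recordings(root="dataset/audio"):
    """Map each species subfolder name to the number of files it contains."""
    root = Path(root)
    return {
        d.name: sum(1 for f in d.iterdir() if f.is_file())
        for d in sorted(root.iterdir())
        if d.is_dir()
    }
```

Running `count_recordings()` after the download gives a quick view of which species have few or no reference recordings.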
Audio features extraction with Tadarida-D
Tadarida-D is a C++ program developed for the Vigie-Chiro program to extract features from audio files.
The objective is to build a classifier of Orthoptera sounds in the audible spectrum.
A Bash script, ./src/tadarida_bulk.sh, runs Tadarida-D on all audio files retrieved in the previous step.
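The bulk run can be pictured as one Tadarida-D call per species folder. The Python sketch below is not the content of tadarida_bulk.sh: the executable name `TadaridaD`, the assumption that it accepts a directory as its only argument, and both function names are illustrative, so check the Tadarida-D documentation for the actual invocation and options.

```python
import subprocess
from pathlib import Path


def build_commands(audio_root="dataset/audio", tadarida_bin="TadaridaD"):
    """Build one command per species subfolder (assumed CLI: <binary> <directory>)."""
    root = Path(audio_root)
    return [
        [tadarida_bin, str(d)]
        for d in sorted(root.iterdir())
        if d.is_dir()
    ]


def run_all(audio_root="dataset/audio", tadarida_bin="TadaridaD"):
    """Run the commands in order, stopping at the first failure."""
    for cmd in build_commands(audio_root, tadarida_bin):
        subprocess.run(cmd, check=True)
```

Keeping command construction separate from execution makes it possible to inspect the planned calls before launching a long feature-extraction run.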