Semi-supervised Labeling of Vehicle Traffic Dataset Using Active Learning for Traffic Analysis Applications
This repository contains all the references and files for this project.

Pre-requisites
This project depends on the following GitHub repositories. Download or install them before reproducing or extending the work.


Ultralytics YOLOv8 - used for feature extraction, inference, detection, tracking, and automated labeling. Install it with pip:
pip install ultralytics
https://github.com/ultralytics/ultralytics


scikit-learn - used for data clustering. Install it with pip (the package name on PyPI is scikit-learn, not sklearn):
pip install scikit-learn
https://github.com/scikit-learn/scikit-learn


TrackEval - used for evaluating detection and tracking performance. Download the repository from the link:
https://github.com/JonathonLuiten/TrackEval
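As a rough illustration of the automated-labeling role listed for Ultralytics YOLOv8 above, the sketch below runs a model on one image and writes its detections as YOLO-format label lines. The weight file name base.pt comes from this repository's model list; the confidence threshold of 0.5 and the helper names yolo_label_line and auto_label are assumptions made for illustration and are not the project's actual scripts.

```python
from pathlib import Path


def yolo_label_line(class_id, xc, yc, w, h):
    """Format one detection as a YOLO label line.

    Coordinates are the normalized box center (xc, yc), width, and height.
    """
    return f"{int(class_id)} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


def auto_label(weights, image_path, out_dir, min_conf=0.5):
    """Write pseudo-labels for one image (hypothetical helper).

    min_conf is an illustrative threshold, not a value taken from the project.
    """
    # Imported lazily so the formatter above works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    result = model.predict(image_path, conf=min_conf, verbose=False)[0]

    # boxes.cls holds class ids; boxes.xywhn holds normalized center/size boxes.
    lines = [
        yolo_label_line(cls, *xywhn)
        for cls, xywhn in zip(result.boxes.cls.tolist(), result.boxes.xywhn.tolist())
    ]

    out_path = Path(out_dir) / (Path(image_path).stem + ".txt")
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text("\n".join(lines))
    return out_path
```

A call such as auto_label("base.pt", "frame_0001.jpg", "labels") would write labels/frame_0001.txt; the image and output names here are placeholders.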


Folders and Files


/158,000 NCTS Images Version 1 contains the 158,000-image dataset created by Maclang et al.

/Training Images contains the image datasets selected for training models C and N. Non-clustered 30k contains the images used for model N, while Clustered 22k contains the images used for model C.

/Modifications contains the edits to the source code of the relevant repositories used in the project. More details can be found in the folder's readme file.

/Scripts contains the Python scripts used to automate project tasks.

/YOLO Training contains all the training sets and validation sets used in YOLO training, as well as training results.

/docs contains the presentations and documentation required by the EEE 196 and EEE 199 courses.

all_models.zip contains the 7 models developed in this project. Each weight file is named after the dataset it was trained on:


base.pt - initial 3,000 images

n1.pt - initial 3,000 images + 1 round of non-clustered active learning (500 images)

n2.pt - initial 3,000 images + 2 rounds of non-clustered active learning (1,000 images)

n3.pt - initial 3,000 images + 3 rounds of non-clustered active learning (1,500 images)

c1.pt - initial 3,000 images + 1 round of clustered active learning (500 images)

c2.pt - initial 3,000 images + 2 rounds of clustered active learning (1,000 images)

c3.pt - initial 3,000 images + 3 rounds of clustered active learning (1,500 images)
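The naming scheme above can be decoded programmatically, which may help when comparing models in evaluation scripts. The following is a minimal sketch, assuming the image counts in parentheses are cumulative additions on top of the initial 3,000 images (500 per round); the function name training_set_info is hypothetical and not part of the repository.

```python
INITIAL_IMAGES = 3000    # size of the initial labeled set, from the list above
IMAGES_PER_ROUND = 500   # images added per active-learning round (assumed cumulative)

STRATEGIES = {"n": "non-clustered", "c": "clustered"}


def training_set_info(weight_name):
    """Return (strategy, rounds, total_images) implied by a weight file name."""
    stem = weight_name[:-3] if weight_name.endswith(".pt") else weight_name
    if stem == "base":
        return ("none", 0, INITIAL_IMAGES)

    prefix, rounds_text = stem[:1], stem[1:]
    if prefix not in STRATEGIES or not rounds_text.isdigit():
        raise ValueError(f"unrecognized weight name: {weight_name!r}")

    rounds = int(rounds_text)
    total = INITIAL_IMAGES + rounds * IMAGES_PER_ROUND
    return (STRATEGIES[prefix], rounds, total)


if __name__ == "__main__":
    for name in ["base.pt", "n1.pt", "n2.pt", "n3.pt", "c1.pt", "c2.pt", "c3.pt"]:
        print(name, training_set_info(name))
```

Under that assumption, n3.pt and c3.pt are each trained on 4,500 images and differ only in how the added images were selected.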


Reference links

For official documentation on YOLOv8 and other Ultralytics components, see https://docs.ultralytics.com/#where-to-start

For community discussions on YOLOv8 implementations, see the Ultralytics discussion board or similar forums: https://github.com/orgs/ultralytics/discussions

CVAT annotation tool https://www.cvat.ai/

Previous study by Maclang et al., used as a reference: https://gitlab.eee.upd.edu.ph/miguel.lorenzo.orante/video-dataset-labeling-using-active-learning-with-applications-in-vehicle-classification-and-traffic-flow-rate-measurement