MODERN TECHNIQUES IN MACHINE LEARNING WITH IMAGES

Bruno Seki Schenberg; Rafael Colen de Almeida; Rogério de Oliveira

doi:10.56238/sevened2026.019-048

Keywords:

Computer Vision, Machine Learning, Detection and Classification, Video, 3D Images

Abstract

This chapter presents modern machine learning techniques applied to computer vision along two complementary fronts: the processing of 2D images and videos, and the classification of three-dimensional data (3D images). The first has strong applications in security and product sales, while the second is fundamental to areas such as parts manufacturing and medical image processing. The first part of the chapter explores the practical use of the YOLO ecosystem (specifically YOLO26) for object detection and scene classification, covering the implementation of inference pipelines, data annotation, fine-tuning, and data augmentation techniques that mitigate overfitting during training. The second part focuses on the challenges of processing 3D data, detailing the main geometric representations (meshes, point clouds, and voxel grids). It also presents the data preprocessing and the construction of a volumetric classifier from scratch, using the ResNet3D-18 architecture on the ModelNet10 dataset, along with a critical analysis of the model’s performance.
