MODERN TECHNIQUES IN MACHINE LEARNING WITH IMAGES
DOI: https://doi.org/10.56238/sevened2026.019-048
Keywords: Computer Vision, Machine Learning, Detection and Classification, Video, 3D Images
Abstract
This chapter presents modern machine learning techniques applied to computer vision along two complementary fronts: the processing of 2D images and videos, and the classification of three-dimensional data (3D images). The first has strong applications in security and product sales, while the second is fundamental to areas such as parts manufacturing and medical image processing. The first part of the chapter explores the practical use of the YOLO ecosystem (specifically YOLO26) for object detection and scene classification, covering the implementation of inference pipelines, data annotation, fine-tuning, and data augmentation techniques that mitigate overfitting during training. The second part focuses on the challenges of processing 3D data and details the main geometric representations (meshes, point clouds, and voxel grids). It also presents the preprocessing pipeline and the construction of a volumetric classifier from scratch, using the ResNet3D-18 architecture on the ModelNet10 dataset, along with a critical analysis of the model’s performance.
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.