Daily Log

3.12: Managed to understand the whole code base of OpenAI's CLIP repo. Planned to look at CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation, to understand how to implement Open-Vocabulary Segmentation (OVS) using CLIP. 3.13: 1. DETR. Got a basic understanding of DETR, an awesome end-to-end 2D object detection architecture. Its downsides are a long training period and difficulty detecting small objects, but it has advantages in:...

March 20, 2024 · 21 min · Banghao Chi

Real-time Object Recognition in Chess: Personalized Tuning and Hardware Acceleration

1. Selected and customized the YOLOv5 model for Chinese chess annotation data. 2. Conducted testing and analysis of the model. The results indicated exceptional recognition accuracy. However, efficiency was a significant shortfall: the model took approximately 6 seconds to process a single image. 3. Implemented model optimization. We replaced the YOLOv5 model with a more lightweight variant, YOLOv5-lite, and converted the model to the ONNX format to leverage hardware acceleration, thereby improving computational efficiency....

August 5, 2023 · 2 min · Banghao Chi