What are some Computer Vision Tips

Attention

For new contributors to the Perception_CV team: please first create your own branch and make sure all your work is done within it. Open a PR (pull request) only if your team leader asks you to do so.

For new team leaders of Perception_CV: the master branch should only contain stable code that has been confirmed to work. The master branch is what we will use for integration with other teams when the time comes.

Introduction

Full CV repo here: https://github.com/campusrover/Robotics_Computer_Vision

This repo was originally forked from https://github.com/ultralytics/yolov3 but heavily modified for our own use. Its purpose is to achieve custom object detection for the Brandeis Autonomous Robotics Course, and changes were made based on our objective of deploying CV on ROS. To download our most recent best trained weights, go to https://drive.google.com/file/d/1DquRwpNDaXgkON2gj9Oks8N2ZWgN2Z9q/view?usp=sharing, then unzip the file and copy the coco and weights directories into this repo, replacing everything.

Notes: I've put a lot of useful tools inside the ./utils directory; feel free to use them whenever you need them.

  • ./utils/vid_2_frm.py : The Python script that extracts all frames from a video; you can control the extraction rate by reading the comments and making a small modification (see the sketch after this list). The script also reports the fps of the source video, which is useful later when converting frames back into a video.

  • ./utils/frm_2_vid.py : The Python script that converts frames (ordered by filename) into a video; you should know the original/target video's fps to get the optimal output.

  • ./utils/xml_2_txt : The repo that converts .xml annotations into our desired .txt (YOLO) format; read and follow the README file inside.

  • ./utils/labelimg : The repo we use for labelling images, a great tool! Detailed README inside.

  • ./utils/check_missing_label.py : The Python script that checks whether any label is missing in the mixed annotation/image directory.

  • ./utils/rename_dataset.py : The Python script that does a mass rename in case different datasets' image and annotation names are the same and need to be distinguished.

  • ./list_img_path.py : The Python script that splits the dataset (images with their corresponding annotations) into a training set and a validation set at a 6:1 ratio (you can modify the ratio).

  • ./utils/img_2_gif.py : The Python script that converts images to a gif.

  • ./coco/dataset_clean.py : The Python script that cleans up mismatched images and labels before training and makes sure they are perfectly parallel.

  • ./utils/video_recorder_pi.py : The Python script that records videos with the Pi camera. It should live on the robot and be run over SSH.
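
To make the extraction rate concrete, here is a minimal sketch of what a frame extractor like ./utils/vid_2_frm.py does (illustrative only, not the actual script; the paths and the step parameter are placeholders):

    # Dump every Nth frame of a video to .jpg files and report the source fps.
    import cv2
    import os

    def video_to_frames(video_path, out_dir, step=1):
        os.makedirs(out_dir, exist_ok=True)
        cap = cv2.VideoCapture(video_path)
        fps = cap.get(cv2.CAP_PROP_FPS)
        print("source fps:", fps)        # handy later when converting frames back to video
        i = saved = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if i % step == 0:            # raise `step` to extract fewer frames
                cv2.imwrite(os.path.join(out_dir, "%06d.jpg" % saved), frame)
                saved += 1
            i += 1
        cap.release()
        return fps, saved

    # e.g. video_to_frames("pivideo.mp4", "frames/", step=5)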

Here are links to download our datasets (images and annotations) by class:

Doorplate Recognition:

  • custom_volen (provided by Haofan): https://drive.google.com/file/d/1A9yI5PdLeAlKEVQww2NJgQLRDxq9tcOJ/view?usp=sharing

  • custom_doorplate (provided by Haofan): https://drive.google.com/file/d/1jITWceHYYFXjUyaJ1bp4Wdb_tKhtylQJ/view?usp=sharing

Facial Recognition:

  • Abhishek: https://drive.google.com/file/d/1Z3ICrLEVt50ia1C07ZCxE_Na105aRjsE/view?usp=sharing

  • Haofan: https://drive.google.com/file/d/1nDcGb0QGSzLJaQL1ewMWVXtvXXjSwWC0/view?usp=sharing

  • Yuchen: https://drive.google.com/file/d/1PomjuCvcJ25_d_EaQwqE9l1wuZ5z5Zz3/view?usp=sharing

  • Huaigu: https://drive.google.com/file/d/1QNKtvanc58PoQZCg6htQpcImd00toYby/view?usp=sharing

  • Eli: https://drive.google.com/file/d/14qII9t4tyDsYqj_bxxCdxSyIip0CRwQT/view?usp=sharing

  • Nate: https://drive.google.com/file/d/1KE0UVu7dalip4mDVoVGpBgr1uyyhoHQB/view?usp=sharing

  • Cody: https://drive.google.com/file/d/1Yb4RmYWXWCBO3nb_Di--3tRh0LdiRZBn/view?usp=sharing

  • Pito: https://drive.google.com/file/d/1NZ4SBfv1Y5zuGpRQLebOlK-duG_p_0pg/view?usp=sharing

  • Sibo: https://drive.google.com/file/d/1c7ZcMN-LcMAjmO62oS_C3y2hpA6IUgvP/view?usp=sharing

  • Arjun: https://drive.google.com/file/d/10NnfTU150Pis5ugOWLzxVvwcesi873LY/view?usp=sharing

  • Charlie: https://drive.google.com/file/d/1UmCUl-uLPwwOub2ZsTNpQHK9Q_rdVScI/view?usp=sharing

CV Subscriber & Publisher

All of the CV subscribers and publishers are located in the ./utils/ directory:

  • ./utils/image_subscriber.py : The Python script that subscribes to images from the raspicam_node/image rostopic.

  • ./utils/string_publisher.py : The Python script that publishes, on the /mutant/face_detection rostopic, a string generated from detect.py; the format is explained below:

CV Publisher example: "['sibo', -4.34, 1.63]"

[ <class name>, <angle of target from front in degrees (negative -> left, positive -> right)>, <rough distance in meters> ]
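
For reference, here is a minimal rospy sketch that publishes a detection string in this format (an illustration only, not the actual ./utils/string_publisher.py; it assumes a working ROS environment):

    # Publish "[<class name>, <angle in degrees>, <distance in meters>]" as a string.
    import rospy
    from std_msgs.msg import String

    rospy.init_node('face_detection_publisher')
    pub = rospy.Publisher('/mutant/face_detection', String, queue_size=1)
    rate = rospy.Rate(10)
    while not rospy.is_shutdown():
        # class name, angle to front in degrees (negative -> left), rough distance in meters
        detection = ['sibo', -4.34, 1.63]
        pub.publish(str(detection))      # sends "['sibo', -4.34, 1.63]"
        rate.sleep()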

Cheat Sheet For Raspberry Pi Camera

Detailed official user guide here: http://emanual.robotis.com/docs/en/platform/turtlebot3/appendix_raspi_cam/

Some useful commands:

  • raspivid -vf -hf -t 30000 -w 640 -h 480 -fps 25 -b 1200000 -p 0,0,640,480 -o pivideo.h264 : records a 30-second video at 25 fps.

  • MP4Box -add pivideo.h264 pivideo.mp4 : converts the .h264 video to .mp4.

  • scp donatello@129.64.243.61:~/pivideo.mp4 ~/Downloads/ : downloads the video from the robot over SSH to your local machine.

  • rqt_image_view : views the camera feed; requires the camera bringup, which conflicts with the video recording function.

  • rosrun rqt_reconfigure rqt_reconfigure : edits the camera configuration.

Pipeline of recording video on DONATELLO:

  • ssh donatello@129.64.243.61

  • If you want to see preview images, roslaunch turtlebot3_bringup turtlebot3_rpicamera.launch, then run rqt_image_view on the remote computer.

  • When recording video, shut down the rpicamera bringup in advance.

  • Run raspivid -vf -hf -t 30000 -w 640 -h 480 -fps 25 -b 1200000 -p 0,0,640,480 -o pivideo.h264 on DONATELLO to record the video.
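
If you would rather script the recording step, here is a small sketch that wraps the raspivid and MP4Box commands above with subprocess (illustrative only; run it on the Pi itself):

    # Record a short clip with raspivid, then wrap the raw .h264 stream into an .mp4.
    import subprocess

    def record_and_convert(seconds=30, name="pivideo"):
        subprocess.run(
            ["raspivid", "-vf", "-hf", "-t", str(seconds * 1000), "-w", "640", "-h", "480",
             "-fps", "25", "-b", "1200000", "-p", "0,0,640,480", "-o", name + ".h264"],
            check=True)
        subprocess.run(["MP4Box", "-add", name + ".h264", name + ".mp4"], check=True)

    # record_and_convert(30)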

Cheat Sheet For USB Web-Camera

Get image_view

  • ssh <Robot_name_space>@<Robot IP address>

  • Plug in the USB camera.

  • On slave, run lsusb and ls /dev | grep video to check that the camera was recognized by the system.

  • On slave, install the usb_cam ROS node: sudo apt install ros-kinetic-usb-cam

  • On slave, check the usb camera launch file: cat /opt/ros/kinetic/share/usb_cam/launch/usb_cam-test.launch

  • (Optional) On the local client machine (master machine), run roscore (it is usually already running on the desktop in the Robotics Lab, so you won't need to do this).

  • On slave, start the usb_cam node: roslaunch usb_cam usb_cam-test.launch

  • (Optional) On slave, bring the running process to the background with CTRL+Z and run bg to continue executing it in the background.

  • (Optional) On slave, check the usb camera's topic: rostopic list

  • (Optional) On master, check the topics in a GUI: rqt_graph

  • On master, read the camera data with image_view: rosrun image_view image_view image:=/<name_space>/usb_cam/image_raw

  • On slave, bring the background task back to the foreground: fg

Web Streaming

  • On slave, install the web-video-server ROS node: sudo apt install ros-kinetic-web-video-server

  • On slave, create a catkin workspace for our custom launch file: mkdir -p ~/rosvid_ws/src

  • On slave, cd ~/rosvid_ws

  • On slave, build the empty workspace: catkin_make

  • On slave, source devel/setup.bash

  • On slave, create a ROS package: cd src then catkin_create_pkg vidsrv std_msgs rospy roscpp

  • On slave, create a launch file using nano, vim, etc.: mkdir -p vidsrv/launch then nano vidsrv/launch/vidsrv.launch, and copy in the contents of https://github.com/campusrover/Perception_CV/blob/master/utils/vidsrv.launch

  • On slave, build the package: cd .. then catkin_make

  • On master, make sure roscore is running.

  • On slave, run the created launch file: roslaunch vidsrv vidsrv.launch

  • On your client machine, open a web browser and go to <Robot IP address>:8080. Under the /usb_cam/ category, click image_raw.

  • Enjoy the web streaming.
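
If you prefer to consume the stream in code instead of image_view or the browser, here is a minimal rospy subscriber sketch (illustrative only; it assumes cv_bridge is installed and uses the usb_cam topic shown above, possibly prefixed by your robot's namespace):

    # Display frames from the usb_cam image topic with OpenCV.
    import rospy
    import cv2
    from cv_bridge import CvBridge
    from sensor_msgs.msg import Image

    bridge = CvBridge()

    def on_image(msg):
        frame = bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
        cv2.imshow('usb_cam', frame)
        cv2.waitKey(1)

    rospy.init_node('usb_cam_viewer')
    rospy.Subscriber('/usb_cam/image_raw', Image, on_image, queue_size=1)
    rospy.spin()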

Description

The https://github.com/ultralytics/yolov3 repo contains inference and training code for YOLOv3 in PyTorch. The code works on Linux, MacOS and Windows. Training is done on the COCO dataset by default: https://cocodataset.org/#home. Credit to Joseph Redmon for YOLO: https://pjreddie.com/darknet/yolo/.

Requirements

Python 3.7 or later with the following packages (install them with pip3 install -U -r requirements.txt):

  • numpy

  • torch >= 1.0.0

  • opencv-python

  • tqdm

Tutorials

  • GCP Quickstart

  • Transfer Learning

  • Train Single Image

  • Train Single Class

  • Train Custom Data

Training

Start Training: Run train.py to begin training after downloading COCO data with data/get_coco_dataset.sh.

Resume Training: Run train.py --resume to resume training from the latest checkpoint, weights/latest.pt.

Each epoch trains on 117,263 images from the COCO train and validation sets, and tests on 5,000 images from the COCO validation set. Default training settings produce the loss plots below, with a training speed of 0.6 s/batch on a 1080 Ti (18 epochs/day) or 0.45 s/batch on a 2080 Ti.
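
As a rough sanity check on those numbers (assuming the default batch size of 16, which you should verify in train.py): 117,263 images / 16 ≈ 7,330 batches per epoch, and at 0.6 s/batch that is about 4,400 s ≈ 73 minutes per epoch, which is consistent with the quoted 18 epochs/day once the 5,000-image test pass is included.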

Here we see training results from coco_1img.data, coco_10img.data and coco_100img.data, 3 example files available in the data/ folder, which train and test on the first 1, 10 and 100 images of the coco2014 trainval dataset.

from utils import utils; utils.plot_results()

Image Augmentation

datasets.py applies random OpenCV-powered (https://opencv.org/) augmentation to the input images in accordance with the following specifications. Augmentation is applied only during training, not during inference. Bounding boxes are automatically tracked and updated with the images. 416 x 416 examples are pictured below.

Augmentation      Description
Translation       +/- 10% (vertical and horizontal)
Rotation          +/- 5 degrees
Shear             +/- 2 degrees (vertical and horizontal)
Scale             +/- 10%
Reflection        50% probability (horizontal-only)
HSV Saturation    +/- 50%
HSV Intensity     +/- 50%
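
As a rough illustration of what one augmentation step looks like, here is a sketch covering only the Reflection and HSV rows of the table (this is not the repo's datasets.py implementation; it assumes boxes are given in pixel [x1, y1, x2, y2] format):

    # Randomly flip an image horizontally (mirroring its boxes) and jitter HSV.
    import random
    import cv2
    import numpy as np

    def augment(img, boxes):
        h, w = img.shape[:2]
        # Reflection: 50% probability, horizontal only; mirror the box x-coordinates.
        if random.random() < 0.5:
            img = img[:, ::-1].copy()
            boxes = boxes.copy()
            boxes[:, [0, 2]] = w - boxes[:, [2, 0]]
        # HSV Saturation / Intensity: scale both channels by +/- 50%.
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.float32)
        hsv[..., 1] *= random.uniform(0.5, 1.5)
        hsv[..., 2] *= random.uniform(0.5, 1.5)
        img = cv2.cvtColor(np.clip(hsv, 0, 255).astype(np.uint8), cv2.COLOR_HSV2BGR)
        return img, boxes

    # Example with a dummy 416 x 416 image and one box:
    img = np.zeros((416, 416, 3), dtype=np.uint8)
    boxes = np.array([[100.0, 120.0, 200.0, 240.0]])
    aug_img, aug_boxes = augment(img, boxes)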

Speed

https://cloud.google.com/deep-learning-vm/

Machine type: n1-standard-8 (8 vCPUs, 30 GB memory)
CPU platform: Intel Skylake
GPUs: K80 ($0.198/hr), P4 ($0.279/hr), T4 ($0.353/hr), P100 ($0.493/hr), V100 ($0.803/hr)
HDD: 100 GB SSD
Dataset: COCO train 2014

GPUs     batch_size (images)   batch time (s/batch)   epoch time   epoch cost
1 K80    16                    1.43s                  175min       $0.58
1 P4     8                     0.51s                  125min       $0.58
1 T4     16                    0.78s                  94min        $0.55
1 P100   16                    0.39s                  48min        $0.39
2 P100   32                    0.48s                  29min        $0.47
4 P100   64                    0.65s                  20min        $0.65
1 V100   16                    0.25s                  31min        $0.41
2 V100   32                    0.29s                  18min        $0.48
4 V100   64                    0.41s                  13min        $0.70
8 V100   128                   0.49s                  7min         $0.80
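
The epoch cost column appears to be just the epoch time multiplied by the total hourly GPU price; for example, the 1 V100 row gives 31 min ≈ 0.52 hr, and 0.52 hr x $0.803/hr ≈ $0.41.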

Inference

Run detect.py to apply trained weights to an image, such as zidane.jpg from the data/samples folder:

YOLOv3: python3 detect.py --cfg cfg/yolov3.cfg --weights weights/yolov3.weights

YOLOv3-tiny: python3 detect.py --cfg cfg/yolov3-tiny.cfg --weights weights/yolov3-tiny.weights

YOLOv3-SPP: python3 detect.py --cfg cfg/yolov3-spp.cfg --weights weights/yolov3-spp.weights

Webcam

Run detect.py with webcam=True to show a live webcam feed.

Pretrained Weights

  • Darknet *.weights format: https://pjreddie.com/media/files/yolov3.weights

  • PyTorch *.pt format: https://drive.google.com/drive/folders/1uxgUBemJVw9wZsdpboYbzUN4bcRhsuAI

mAP

  • Use test.py --weights weights/yolov3.weights to test the official YOLOv3 weights.

  • Use test.py --weights weights/latest.pt to test the latest training results.

  • Compare to darknet published results: https://arxiv.org/abs/1804.02767

                 ultralytics/yolov3   darknet
YOLOv3 320       51.8                 51.5
YOLOv3 416       55.4                 55.3
YOLOv3 608       58.2                 57.9
YOLOv3-spp 320   52.4                 -
YOLOv3-spp 416   56.5                 -
YOLOv3-spp 608   60.7                 60.6

    git clone https://github.com/ultralytics/yolov3
    # bash yolov3/data/get_coco_dataset.sh
    git clone https://github.com/cocodataset/cocoapi && cd cocoapi/PythonAPI && make && cd ../.. && cp -r cocoapi/PythonAPI/pycocotools yolov3
    cd yolov3

    python3 test.py --save-json --img-size 416
    Namespace(batch_size=32, cfg='cfg/yolov3-spp.cfg', conf_thres=0.001, data_cfg='data/coco.data', img_size=416, iou_thres=0.5, nms_thres=0.5, save_json=True, weights='weights/yolov3-spp.weights')
    Using CUDA device0 _CudaDeviceProperties(name='Tesla V100-SXM2-16GB', total_memory=16130MB)
                   Class    Images   Targets         P         R       mAP        F1
    Calculating mAP: 100%|████████████████████████████████████████| 157/157 [05:59<00:00,  1.71s/it]
                     all     5e+03  3.58e+04     0.109     0.773      0.57     0.186
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.335
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.565
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.349
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.151
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.360
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.493
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.280
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.432
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.458
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.255
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.494
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.620

    python3 test.py --save-json --img-size 608 --batch-size 16
    Namespace(batch_size=16, cfg='cfg/yolov3-spp.cfg', conf_thres=0.001, data_cfg='data/coco.data', img_size=608, iou_thres=0.5, nms_thres=0.5, save_json=True, weights='weights/yolov3-spp.weights')
    Using CUDA device0 _CudaDeviceProperties(name='Tesla V100-SXM2-16GB', total_memory=16130MB)
                   Class    Images   Targets         P         R       mAP        F1
    Computing mAP: 100%|████████████████████████████████████████| 313/313 [06:11<00:00,  1.01it/s]
                     all     5e+03  3.58e+04      0.12      0.81     0.611     0.203
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.366
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.607
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.386
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.207
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.391
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.485
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.296
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.464
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.494
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.331
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.517
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.618

Citation

Contact

Issues should be raised directly in the repository. For additional questions or comments please contact your CV Team Leader or Sibo Zhu at siboz1995@gmail.com
