Coding and Engineering Experiments
Last update: February 22, 2015
I am a junior researcher in computer vision. Probably not everything here is correct...we learn over time :)
General
The Big Picture
- Never forget the big picture of your research: technical and non-technical
- If you set up disciplines (e.g. every day has 30 min doing X), follow them.
- When experiments give bad performance, we pour all our time and effort into the code
- One downside of that: you forget the non-technical goals and objectives
- You typically narrow your focus to small coding sub-components while the issue may lie in another area. The error may even be a trivial missing line somewhere.
- As juniors, we need more time for engineering. When you implement your idea and it gives bad results, there are 2 cases:
- The idea is wrong; it will never produce good work
- The idea is fine; there are just some bugs to fix and some parameters to tune
- Ask experienced colleagues or your advisor for a little help.
- Try forums; many communities exist on Quora, LinkedIn, and ResearchGate.
Approach Implementation
- Try to start with a simplified version of the approach. Code, test, and evaluate.
- Try simple learning models first, e.g. a linear SVM before a non-linear SVM
- Think about memory caching points and data storage points (see the caching sketch after this list).
- Sometimes we redo the same processing in every trial.
- Save as much as you can, when feasible in time/memory/disk space.
- If your code is a pipeline of multiple stages, try to start from the tail if someone provides the processing for the first pipeline phases
- The point: a mistake in one phase strongly affects the whole performance, and problems are hard to detect.
- Say you implement BoW: you need a dictionary, use it to quantize images, and finally learn a classifier
- If someone provides good quantized images, write the code that learns the classifier first.
- If someone provides a dictionary, start from quantization, and so on; this means less new code per step
- This lets you have a correct base and build on it
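A minimal caching sketch in C++, assuming OpenCV's cv::FileStorage for serialization; the file name, the "data" key, and buildDictionary are hypothetical. Each stage loads a cached artifact if it exists and computes it only once otherwise:

    #include <opencv2/core/core.hpp>
    #include <fstream>
    #include <string>

    // Load a cached matrix from disk, or compute and cache it.
    cv::Mat loadOrCompute(const std::string& path, cv::Mat (*compute)())
    {
        std::ifstream probe(path.c_str());
        if (probe.good()) {                        // cached artifact found
            cv::Mat m;
            cv::FileStorage fs(path, cv::FileStorage::READ);
            fs["data"] >> m;
            return m;
        }
        cv::Mat m = (*compute)();                  // expensive step runs only once
        cv::FileStorage fs(path, cv::FileStorage::WRITE);
        fs << "data" << m;
        return m;
    }

    // Usage (hypothetical): cv::Mat dict = loadOrCompute("dict.yml", &buildDictionary);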
Coding
- Don’t reinvent the wheel.
- Think ahead about code components and interfaces.
- Separation of concerns: divide the code as much as possible. Shorter code per class is highly preferred
- Organize the code into sub-folders
- It is fine to have code chaos at some points; in the near term (a few days), bring everything back into order. Chaos is also fine during prototyping.
- When the code becomes larger, a loose collection of functions is not a good idea...think OOP
- Write modular code; keep methods short.
- Upload the code to the web under a license that lets people share it
- E.g. accept pull requests on GitHub
- Use as many logical asserts as you can, even if you think they are not needed (see the sketch below)
- We change the code all the time, and bugs get introduced
- Asserts save you debugging time.
- E.g. an object is null because we forgot to initialize it in recent changes!
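A minimal sketch of defensive asserts; the function and the 128-D size are hypothetical, assert comes from <cassert>:

    #include <cassert>
    #include <vector>

    void quantizeImage(const std::vector<float>* dictionary,
                       const std::vector<float>& descriptors)
    {
        assert(dictionary != NULL);              // catches a forgotten initialization
        assert(!dictionary->empty());            // catches an empty dictionary
        assert(descriptors.size() % 128 == 0);   // e.g. SIFT descriptors are 128-D
        // ... quantization logic ...
    }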
- Code Refactoring
- Set up a weekly check on the code and how to make it better
- How to remove redundancy
- How to make it more modular
- How to minimize function length and choose reasonable method names
- Whenever you finish coding some functions/classes:
- Do code review
- Once a function/class is finished, think about its cases and trace it by eye
- Trace the method lines in a debugger
- The next morning, review the code again
- Always keep a copy of each day's code; then every day you have the previous day's version, and the comparison reveals one day's changes
- Maybe use a version-control system such as SVN to do this
- Do unit testing if possible
- If possible, print/visualize outputs and check that they make sense
- Nothing is worse than digging through the whole code for a trivial bug! The more careful you are, the less time you waste
- Always have a mini-dataset to run, to make sure the overall pipeline is fine. Don't discover problems from a crash after X hours of running!
- Keep a couple of handy sizes to use later as a guide for estimations, e.g. size = 10, 100
- Documentation...Documentation...Documentation...Documentation...Documentation…
- Initially, write header notes about the function's purpose and which variables are outputs
- Later, use a standard format to describe each function
- The code body should have some comments, and variable names should be expressive.
Logging
- Always log in every important function
- File logging is good in case of machine restarts/crashes
- In C++, it is easy to redirect all output to a file using freopen (see the sketch at the end of this section)
- Print a gentle log at program start describing the selected configuration
- E.g. which pipeline to run
- E.g. the important parameter values
- You may force a confirmation (input) before proceeding with the data.
- Avoid logging too much; make it smart
- E.g. if we process X images, report status every 10%
- Let some algorithms log once, then stop logging
- Visualizing in Videos
- Videos have many frames; it is a good idea to write the important info on each frame
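A minimal sketch of freopen redirection plus every-10% status reports; the file name and the image count are hypothetical:

    #include <cstdio>

    int main()
    {
        // All subsequent printf output goes to the log file instead of the console.
        freopen("run_log.txt", "w", stdout);

        const int total = 2000;                 // e.g. number of images
        for (int i = 0; i < total; ++i) {
            // ... process image i ...
            if (i % (total / 10) == 0)          // status every 10%
                printf("Progress: %d%% (%d/%d images)\n", 100 * i / total, i, total);
        }
        return 0;
    }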
Floating Point Precision
- Whenever saving floating-point numbers to a file, write them with as many digits as possible (e.g. 3.5024614207824543e-05)
- When the application is sensitive to the tiny fractional part, lost digits will affect performance
- For example, say you trained a linear SVM, then compressed all support vectors into 1 vector so prediction runs in O(k). All the values will be tiny; be careful (see the sketch below).
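A minimal C++ sketch: max_digits10 (C++11) gives the digit count needed to round-trip a double exactly (17 for double):

    #include <iostream>
    #include <iomanip>
    #include <limits>

    int main()
    {
        double w = 3.5024614207824543e-05;      // e.g. a tiny SVM weight
        std::cout << std::setprecision(std::numeric_limits<double>::max_digits10)
                  << std::scientific << w << std::endl;  // prints all needed digits
        return 0;
    }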
Time Complexity
- Don't run an experiment without knowing its expected overall time!
- Let every main block estimate its expected time to finish
- E.g. sum the first 5 times per step (e.g. a loop extracting SIFT per image) and use the average to extrapolate. A simple class/struct will do nicely (see the sketch after this list).
- Print the average per step and the overall expectation
- Use a mini-dataset to measure the whole pipeline's processing times by hand
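A minimal sketch of such a struct; the names and the clock-based timing are my own choices:

    #include <cstdio>
    #include <ctime>

    // Averages the step times seen so far and extrapolates to the full run.
    struct EtaEstimator {
        int total, done;
        std::clock_t start;

        EtaEstimator(int totalSteps) : total(totalSteps), done(0), start(std::clock()) {}

        void step() {
            ++done;
            double elapsed = double(std::clock() - start) / CLOCKS_PER_SEC;
            double avg = elapsed / done;
            std::printf("avg %.2fs/step, expected total %.1fs, remaining %.1fs\n",
                        avg, avg * total, avg * (total - done));
        }
    };

    // Usage (hypothetical): EtaEstimator eta(nImages);
    //                       for each image { extractSift(image); eta.step(); }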
Memory Complexity
- Never write code whose memory you haven't estimated
- Don't wait for the program to crash over a memory problem
- Balance between using the hard disk and the available memory
- Matlab is excellent when it comes to saving many matrices
- Avoid text formats; save binary
- When saving data that could be post-processed in several ways, save it unprocessed so that you don't lose information
- E.g. you have training data that could be normalized/scaled in several ways.
- If the language/platform allows, write a function that estimates the memory used by the important functions (see the sketch after this list)
- Check the memory used by the process in a system monitor; a process that keeps growing unexpectedly is an ALARM for you
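A minimal sketch of a memory estimate for a dense feature matrix; the sizes are illustrative:

    #include <cstddef>
    #include <cstdio>

    // Rough size of a rows x cols matrix with elemBytes bytes per element, in MB.
    double matrixMB(long rows, long cols, size_t elemBytes)
    {
        return double(rows) * cols * elemBytes / (1024.0 * 1024.0);
    }

    int main()
    {
        // e.g. 200,000 SIFT descriptors, 128-D, stored as float
        std::printf("~%.1f MB\n", matrixMB(200000, 128, sizeof(float))); // ~97.7 MB
        return 0;
    }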
Using New Stuff
- Whenever using new stuff, especially learning components (SVM training, k-means, GMM...):
- Read the documentation carefully
- Make sure to test it well independently (see the k-means sketch at the end of this section)
- Try it with different parameters
- Visualize it
- Internal data
- Sometimes you want to use internal data, not just the direct interface
- E.g. the internal support vectors, not just the predict method
- Be very careful; docs typically don't go into such details
- Don't make assumptions
- E.g. OpenCV multiplies its internal data by -1, in contrast to libsvm (the -1 and 1 classes are reversed)
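A minimal independent-test sketch using OpenCV's cv::kmeans on two obvious synthetic clusters; if the recovered centers are not near the true means, we misunderstood the API or the data layout:

    #include <opencv2/core/core.hpp>
    #include <cstdio>

    int main()
    {
        // Two well-separated 1-D clusters: around 0 and around 10.
        cv::Mat data(20, 1, CV_32F);
        for (int i = 0; i < 10; ++i) data.at<float>(i)      = 0.0f  + 0.1f * i;
        for (int i = 0; i < 10; ++i) data.at<float>(10 + i) = 10.0f + 0.1f * i;

        cv::Mat labels, centers;
        cv::kmeans(data, 2, labels,
                   cv::TermCriteria(cv::TermCriteria::COUNT + cv::TermCriteria::EPS, 100, 1e-4),
                   5, cv::KMEANS_PP_CENTERS, centers);

        // Expect centers near 0.45 and 10.45 (in either order).
        std::printf("centers: %f %f\n", centers.at<float>(0), centers.at<float>(1));
        return 0;
    }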
Generated Artifacts
- Always check your generated data; sometimes it is weird and something wrong has happened.
- Visualization is a good key to finding bugs/improvements
- Have a separate folder for each artifact.
- If you take a copy of some outputs, write a note about what they are and how they were generated
Parameters Tuning and Automation
- Parameters document
- Whenever you think something might be a parameter that affects performance:
- List all these variables and their possible value ranges; later, we tune them. E.g.:
- -ve examples generation: boxes, olis, hard negatives. Ratio: 1 => 5
- Dict size = 500, 1000, 2000, 3000, 4000, 5000.
- Points to cluster: 50,000 -> 200,000
- Image keypoints
- SURF + dense (step: 6, 10, 20...) + randpoints (1000 -> 10000) + point scale (1, 6...)
- SVM C / gamma: 1e-4, 1e-3, ..., 1, 10, 100, 1000
- Classifier: LinearSVM, RBF, ANN
- Estimate good initial values
- This is very challenging.
- Any bad initial combination could result in severely low performance
- When reading papers, keep an eye on their parameter values
- VOC dict size: 400, 1000, 3000, 4096
- 1000 appears a lot
- In some problems: C = 1, Gamma: 1 / FeaturesCount
- E.g. one paper filtered nearby boxes at 70
- E.g. one paper removed overlapping -ve boxes at 20 IoU
- Consistency
- If 2 blocks need the same processing, it is better to give them the same parameters
- E.g. in BoVW, both dictionary generation and image quantization extract keypoints from images
- If you select random points for the dictionary but SIFT points for quantization, this may behave badly
- Don't change parameters in the code just to try things; have all needed parameters as variables.
- Manually changing algorithm parameters is a big waste of time; tune by code wherever possible.
- Once you have determined your application's set of parameters, write a greedy algorithm to automate the search (see the sketch after this group)
- This way you don't lose time
- And you get a better idea of the algorithm's performance
- We can't search every possible combination...think greedily
- E.g. for C, search over the range 1e-5, 1e-3, ..., 100
- Then, around the best one, search deeper (e.g. 2e-3, 3e-3, ...)
- Look up grid search for the standard approach
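A minimal coarse-to-fine sketch for SVM's C; trainAndScore is a hypothetical hook that should train on the mini-dataset and return validation accuracy (the stub body here is a stand-in):

    #include <cmath>
    #include <cstdio>

    // Hypothetical: train with the given C and return validation accuracy.
    double trainAndScore(double C)
    {
        return -(C - 0.01) * (C - 0.01);  // stand-in; replace with real train + eval
    }

    int main()
    {
        // Coarse pass: powers of 10.
        double bestC = 1e-5, bestScore = -1e9;
        for (int e = -5; e <= 2; ++e) {
            double C = std::pow(10.0, e);
            double s = trainAndScore(C);
            if (s > bestScore) { bestScore = s; bestC = C; }
        }
        // Fine pass: multiples of the coarse winner (e.g. 2e-3, 3e-3, ...).
        const double coarseC = bestC;
        for (int m = 2; m <= 9; ++m) {
            double s = trainAndScore(m * coarseC);
            if (s > bestScore) { bestScore = s; bestC = m * coarseC; }
        }
        std::printf("best C = %g (score %.3f)\n", bestC, bestScore);
        return 0;
    }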
- Tuning and accuracy variance
- Sometimes you are tuning for recall and you have 2 choices that both give the same recall.
- However, on the full set, one will give good precision too and the other will not. E.g.:
- The -ve generation method: one method could teach the classifier more about the -ve classes
- How to handle this? Better not to build your tuning set from recall images only; have some other data for the -ve side (e.g. as much as the +ve data)
- Randomization...be careful
- Sometimes your data is based on randomization (e.g. generating -ve training examples)
- Even with fixed, tuned parameters, results range from good to bad under this randomization.
- Force any function that depends on randomization to generate the SAME results under any sequence of calls to your pipeline (see the sketch after this group)
- It is best to use a seeded generator, e.g. RNG rng(value) from Boost
- Let each function have its RNG initialized from the passed-in parameters.
- If it calls another function, allow passing the seed down, so that you force the call to generate the same sequence
- Watch out for algorithms with randomized behavior (many approximation algorithms):
- OpenCV - FlannBasedMatcher
- STL - random_shuffle
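A minimal determinism sketch; the notes suggest Boost's RNG, and std::mt19937 from C++11 <random> behaves the same way here. The function and counts are hypothetical:

    #include <cstdio>
    #include <random>
    #include <vector>

    // Each call with the same seed yields the same negative-sample indices,
    // regardless of what the rest of the pipeline did before it.
    std::vector<int> sampleNegatives(int poolSize, int count, unsigned seed)
    {
        std::mt19937 rng(seed);   // local, explicitly seeded generator
        std::uniform_int_distribution<int> pick(0, poolSize - 1);
        std::vector<int> ids;
        for (int i = 0; i < count; ++i) ids.push_back(pick(rng));
        return ids;
    }

    int main()
    {
        std::vector<int> a = sampleNegatives(1000, 5, 42);
        std::vector<int> b = sampleNegatives(1000, 5, 42);
        std::printf("identical: %s\n", a == b ? "yes" : "no"); // prints "yes"
        return 0;
    }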
Efficient Experimenting
Minor Datasets
- It is not good to run every experiment on the whole data
- Build 1 or more minor datasets for your work. Typically, use randomization to extract the subset, e.g. 10% of the dataset (see the sketch below).
- Always use this dataset to inform you about the performance of your system
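A minimal sketch that carves out a reproducible 10% subset of image indices; the counts and the fixed seed are illustrative:

    #include <algorithm>
    #include <cstdio>
    #include <random>
    #include <vector>

    int main()
    {
        const int total = 2500;                  // e.g. all validation images
        std::vector<int> ids(total);
        for (int i = 0; i < total; ++i) ids[i] = i;

        std::mt19937 rng(42);                    // fixed seed => same subset every run
        std::shuffle(ids.begin(), ids.end(), rng);
        ids.resize(total / 10);                  // keep 10% as the mini-dataset

        std::printf("mini-dataset size: %d\n", (int)ids.size());
        return 0;
    }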
Recall before Precision
- Don't run all your code on the whole dataset to judge the 2 measures together
- Typically, recall is measured on the subset of the data where the items exist
- E.g. in object detection: say the validation set is 2500 images and the class aeroplane exists in only 200 of them
- Recall cares mostly about these 200 images; precision cares about all of them.
- If you can't get good recall on the 200 images initially, you don't have an algorithm
- Work on the 200 until you have good performance
- Use the recall images for initial tuning of the parameters
- Now move to precision...use the F-score as a united measure, plus ROC curves (see the sketch below).
- Again, don't run on the whole set. Pick only 2 classes and test on them.
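A minimal sketch of the F-score from TP/FP/FN counts, F1 = 2PR / (P + R); the counts are made up:

    #include <cstdio>

    double f1(int tp, int fp, int fn)
    {
        double p = tp / double(tp + fp);   // precision
        double r = tp / double(tp + fn);   // recall
        return 2.0 * p * r / (p + r);
    }

    int main()
    {
        std::printf("F1 = %.3f\n", f1(150, 50, 50)); // P = R = 0.75 => F1 = 0.750
        return 0;
    }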
Accuracy on the Whole Training Set before Cross-Validation
- We might have a model and want to apply CV to find the best parameters based on average accuracy
- However, this doesn't make sense if training and then predicting on the whole training set already gives low performance.
Experiment Results
- Be as organized as possible.
- When running an experiment, document its results. Point to the locations of its generated artifacts.
- Keep a copy of the program arguments with the program output to record your choices.