Knitro

"Solving convex optimization problems, whether generated directly or as sub-problems when solving non-convex problems, is the principal requirement in computational optimization. Ever-increasing problem size, particularly as a result of the growing importance and scale of data analysis, is driving the development of novel techniques for solving convex optimization problems using high performance computing systems." Convex Optimization and Beyond workshop, 27 June 2014, ICMS Edinburgh.
 
Parallel Optimization Services --- Parallel Computation in Mathematical Algorithmic Optimization (PCMAO)
 
The Research Area :

Title : Parallel Optimization in Engineering
Author : Henry Kar Ming Chan

Parallel optimization in computational science and engineering requires solving large-scale non-linear optimization problems in parallel. OpenMP supports shared-memory multiprocessing across platforms and is an industry-standard Application Programming Interface (API) for forking and joining program and sub-program tasks. Among available optimizers, Knitro is an advanced solver for non-linear optimization problems, handling bound constraints, non-linear equalities and inequalities (both convex and non-convex), and complementarity constraints. Knitro offers state-of-the-art algorithmic options, such as interior-point methods and an active-set method, which can be switched between automatically or manually by changing the parameter settings. Knitro can handle inequality constraints with an interior-point algorithm and direct solution of the barrier sub-problems; for large-scale non-linear problems, it can instead solve the barrier sub-problems by conjugate-gradient iterations. For detecting infeasibility, Knitro can handle inequality constraints with an active-set algorithm, which is especially beneficial when a good initial point is available and when solving a sequence of related problems. If the initial guess provided by the user and all subsequent iterates also satisfy the inequality constraints, a feasible mode can be set in the parameter list.

A simple test scenario is used as the test case. Different active-set methods are tested in parallel using OpenMP section or task parallelism. Within each active-set method, multi-start features randomly select the initial points so as to find global optima where possible. The parallel Knitro non-linear programming solver and the Intel MKL library for vector and matrix processing are used together with OpenMP directives. Gradient and Hessian vectors for directional and partial derivatives are generated through automatic differentiation with OpenMP features. Finally, vectorization of data, prefetching of data and inter-procedural optimization of the parallel OpenMP programs need to be considered during compilation to achieve the best performance.
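As a rough illustration of the multi-start idea described above, the sketch below launches one OpenMP task per starting point and keeps the best local solution found. It is a minimal sketch only: local_solve(), the test objective f and the fixed grid of starting points are all made up for illustration; a real multi-start run would call a local solver such as Knitro from randomly selected starting points instead.

#include <math.h>
#include <stdio.h>

/* Non-convex one-dimensional test objective and its derivative. */
static double f(double x)      { return x * x + 3.0 * sin(5.0 * x); }
static double fprime(double x) { return 2.0 * x + 15.0 * cos(5.0 * x); }

/* Hypothetical local solver: plain gradient descent from x0.
   A real multi-start run would call a local NLP solver such as Knitro here. */
static double local_solve(double x0)
{
    double x = x0;
    for (int k = 0; k < 1000; ++k)
        x -= 0.01 * fprime(x);
    return x;
}

int main(void)
{
    const int nstarts = 8;
    double best_f = 1.0e30, best_x = 0.0;

    #pragma omp parallel num_threads(4)
    #pragma omp single
    for (int i = 0; i < nstarts; ++i) {
        /* One task per starting point; a real multi-start would draw the
           starting points at random. */
        #pragma omp task firstprivate(i) shared(best_f, best_x)
        {
            double x0 = -5.0 + 10.0 * (double)i / (nstarts - 1);
            double x  = local_solve(x0);
            double fx = f(x);
            #pragma omp critical
            if (fx < best_f) { best_f = fx; best_x = x; }
        }
    }
    printf("best x = %g   f(x) = %g\n", best_x, best_f);
    return 0;
}

(Compile with, for example, gcc -fopenmp multistart.c -lm.)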

In Parallel Branch and Bound : strong branching and pseudo-costs; depth-first search and best-first search. Potential problems with OpenMP : load balancing, thread cancellation, the error model, data affinity, data sharing, data dependency, task dependency. Potential parallelism in strong branching includes generating lists of candidates with job cancellation, memory-aware load balancing for parallelizing the (most likely nested) while-loop sub-program within a maximum iteration limit, and working near the search-tree nodes within the sub-program by controlling thread placement on processors (spreading the threads onto idle processors, instead of keeping them close, for strong branching and depth-first searching at the root node and at nodes near the root, especially for large-scale integer problems). Please also note the "free" processors during the ramp-up at the beginning of the search and the "free" processors at the end.
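A minimal sketch of the candidate-list-with-cancellation idea, under the assumption of a hypothetical score_candidate() routine that evaluates one strong-branching candidate and returns a negative value when the candidate proves the node infeasible: each candidate is evaluated in its own task, and the taskgroup is cancelled as soon as the node can be pruned. Cancellation takes effect only when OMP_CANCELLATION=true at run time; the candidate count and the hard-wired "infeasible" candidate are illustrative.

#include <stdio.h>

#define NCAND 16

/* Hypothetical score for strong-branching candidate i; a negative value
   means the candidate proves the node infeasible, so the node can be pruned.
   Candidate 11 is hard-wired as the "infeasible" one for this demo. */
static double score_candidate(int i)
{
    return (i == 11) ? -1.0 : (double)(i % 5);
}

int main(void)
{
    double score[NCAND];
    int pruned = 0;

    #pragma omp parallel proc_bind(spread) num_threads(8)
    #pragma omp single
    {
        #pragma omp taskgroup
        for (int i = 0; i < NCAND; ++i) {
            /* One task per branching candidate. */
            #pragma omp task firstprivate(i) shared(score, pruned)
            {
                score[i] = score_candidate(i);
                if (score[i] < 0.0) {
                    #pragma omp atomic write
                    pruned = 1;
                    /* Abandon the remaining candidate evaluations. */
                    #pragma omp cancel taskgroup
                }
            }
        }
    }
    printf(pruned ? "node pruned\n" : "node kept, scores computed\n");
    return 0;
}

(Run with OMP_CANCELLATION=true for the cancellation to be honoured.)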

Desirable Features for Parallel Branch and Bound

  • Parallel Multi-start (Lower/Upper bounds obtained from Multi-starts)
  • Priority for variables with Strong Branching features (Automatic Tuner)
  • Fork-Join hierarchy (Workshare with Transactional Memory and Speculative Execution)
  • Vectorization/SIMD features (Common Array Shaping)
  • Loop, Section and Task parallelism (Looping constructs and Task Model refinements)
  • Memory-aware load balancing (CPU Affinity)
  • Cancellation features (Full Error Model)
Footnote : Some developers find that OpenMP in gfortran, with the Intel MKL BLAS and sparse linear solver, reaches linear speedup in the number of cores. If an application does not exploit the CPU's SIMD capabilities, it can easily lose a factor of 16x or 8x compared with the peak performance of the CPU.
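A minimal SIMD sketch, assuming a simple daxpy-style loop: the 16x and 8x figures correspond to 512-bit vector registers holding 16 single-precision or 8 double-precision lanes, and the omp simd directive asks the compiler to vectorize the loop.

#include <stdio.h>

#define N 1024

int main(void)
{
    double x[N], y[N], a = 2.0;

    for (int i = 0; i < N; ++i) { x[i] = (double)i; y[i] = 1.0; }

    /* Ask the compiler to vectorize the daxpy loop; aligned or safelen
       clauses could be added once the data layout is known. */
    #pragma omp simd
    for (int i = 0; i < N; ++i)
        y[i] = a * x[i] + y[i];

    printf("y[N-1] = %g\n", y[N - 1]);
    return 0;
}

(Compile with -fopenmp or -fopenmp-simd and an optimization level that enables vectorization, e.g. -O2 or -O3.)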
 
Current and Future Year External Links :
 

21-25 July 2014.

23 May 2014.

 

Benchmarks :
 

 

NEW CASE for the F-8 aircraft (in parallel). A solution obtained by Knitro version 9.0 under the NEOS server environment was tested by Henry Kar Ming Chan for the reduced model. The reduced model is defined by using a smaller number of time periods (i.e. ntperu := 10). The standard defaults were used except mainly for the strong-branching strategy, the maximum number of iterations and the maximum number of nodes, with the parallel feature in MKL BLAS only and without external optimizers. The final objective value is 0.020929. The number of nodes processed is 519 and the number of sub-problems solved is 526. The total program CPU time is 30.6946 seconds, with 12.14195 seconds spent in evaluations. The intervals on the equidistant grid on which w(t) = 1 holds, counting from 0 to 59, are 0, 1, 31, 32, 44, 52, and 53.

A future release of Knitro may include features such as a tuner, a new pre-solver, new parallel features, and improvements in mixed-integer programming and the active-set algorithm.

 

Benchmark for "While Loop" on OpenMP 4.0 (Proposed Pseudo-code for testing only)

 

int* searchnodes = (int*)malloc...   /* list of search-tree nodes          */
cancelpoint = NULL;                  /* node at which the search stopped   */
...

#pragma omp parallel proc_bind(spread) num_threads(8)
#pragma omp single nowait        /* single task producer; nowait lets the  */
                                 /* other threads execute the tasks        */
#pragma omp taskgroup            /* example : cancelpoint = process(searchnodes) */
{
    #pragma omp task firstprivate(searchnodes) shared(cancelpoint) untied
    /* default(private) on a task is Fortran-only in OpenMP 4.0, so it is omitted here */
    {
        while (...) {
            preprocess(searchnodes);
            /* omp simd (e.g. #pragma omp simd safelen(4)) requires a canonical */
            /* for loop and has no firstprivate clause, so it cannot be applied */
            /* to this while loop itself, only to inner loops.                  */
            process(searchnodes);
            if (...) {
                #pragma omp critical
                cancelpoint = ...;           /* record the stop point first,  */
                #pragma omp cancel taskgroup /* then abandon the remaining work */
            }
        }
    }
}

 

A sample gfortran program, using compiler version 4.10 (development stage) with OpenMP 4.0 features, for binary searching is attached below for reference. [Note : the original serial program comes from the internet.]

 

program parbsearch

    use OMP_LIB

    implicit none
    integer, allocatable, dimension(:) ::  x
    integer :: range, start, finish, mid
    integer :: i, n, val

    print *, 'Number of values ?'
    read *, N
    allocate( x(N) )

    ! Binary search assumes the values are entered in ascending (sorted) order.
    do i = 1, n
        read *, x(i)
    end do

    print *, 'Enter Value'
    read *, VAL

    start =  1
    finish = N
    range = finish - start
    mid = (start + finish) /2

 

!$omp parallel proc_bind(spread) num_threads(2)
!$omp single
!$omp taskgroup
!   With default(firstprivate), start, finish and range are copied into the
!   task; mid is shared so the result is visible after the parallel region.
!$omp task default(firstprivate) depend(inout:x) shared(mid) untied

 

    do while( x(mid) /= val .and. range >  0)
      if (val > x(mid)) then
        start = mid + 1
      else
        finish = mid - 1
      end if
      range = finish - start

      ! Keep mid within the array bounds in case val lies outside the data range.
      mid = max(1, min(n, (start + finish)/2))

 

!     Cancellation takes effect only when OMP_CANCELLATION=true at run time.
!$omp cancel taskgroup if ( .not. ( x(mid) /= val .and. range > 0 ) )

 

    end do

 

!$omp end task
!$omp end taskgroup
!$omp end single nowait
!$omp end parallel

 

    if( x(mid) /= val) then
      print *, val, 'NOT FOUND'
    else
      print *, 'VALUE AT' , mid
    end if

    deallocate( x )

end program parbsearch

 

Past Year External Link :
 

24 May 2013.