ContestCommittee
USA Toll-Free: 888-426-6840
USA Caller Paid: 215-861-6239
Passcode: 57181504
Action Items:
1) TAU 2015 Contest paper to be included in TAU 2015 and potentially ISPD 2015 proceedings
-- Lead: Jin
-- Timeline: camera-copy ready by Jan. 31
-- I will start the paper, and ask for input as the paper shapes up.
-- Sections to add (new content): challenges of incremental timing (Greg/Debjit), challenges of incremental CPPR (Vibhor), performance impact of incremental timing vs. standalone (older slides from KK/AJS?)
2) TAU 2015 Contest evaluation machine and g++ version
-- Lead: Greg/Vibhor
-- Question: does IBM have g++ 4.9 or higher? One contestant asked about using OpenMP 4.0 for his binary, which apparently is not supported by g++ 4.4, which is what we have on the fshlnx machines. Do we have access to anything better? Does Cadence have higher versions of g++?
>> We will support 4.8.2 and below
3) TAU 2015 Contest alpha binary submission testing
-- Lead: Jin (see item 2: whichever machine has the most recent version of g++)
-- Timeline: Feb 1 - Feb 15
-- Make sure students' binaries are compatible with the running environment. Some output should be generated. No evaluation for accuracy will be done (some implicit sanity checks will be run for runtime/memory).
>> Ask for student-generated output; match the given output against the generated output.
4) TAU 2015 Contest benchmarks for final evaluation
-- Lead: Vibhor (Greg to generate goldens)
-- Timeline: start evaluations on Feb. 15 (final binaries submission)
-- Set up existing 3 benchmarks from Vibhor (need FF mapping for libraries) [Greg/Vibhor]
-- Combine existing benchmarks into larger ones [Jin] << no need, just generate larger ops files >>
-- Create more complex .ops files [Greg] << half # benchmarks from existing, half # benchmarks from Vibhor and generate beefier ops files >>
>> Vibhor to provide new lib (to be released) with no extraneous keywords and more flop definitions added
5) TAU 2015 Contest student binary evaluation process
-- Lead: ??
-- Timeline: Realistically done by Mar. 1 so we have time to order plaques, awards, etc.
-- (1) Measure performance from students' binaries and generate output
-- (2) Compare accuracy of generated output to golden output
-- (3) Compile accuracy, runtime, and memory (see #6)
6) TAU 2015 Contest evaluation metrics (in conjunction with (3))
-- Lead: ?? (Can add in Debjit/Igor for input)
-- Timeline: Mar. 1 or sooner preferred
-- When comparing student output to golden, what is considered "good"? Do we give partial credit for correct slacks but incorrect paths, for example? We need metrics for both breadth and depth (testing for a large # of worst paths). How do we combine memory and runtime as factors alongside accuracy?
This is what I did last year for accuracy of paths and slacks:
# When generating contestants, allow -numTests to be #
# greater than what we're actually evaluating, e.g., #
# by 10% or so. #
# #
# ===== Accuracy Evaluation Protocol ===== #
# For each TEST: #
# match pre-CPPR slack with $preTWeight #
# match post-CPPR slack with $postTWeight #
# $preTWeight + $postTWeight = 1 #
# abs(diff) ~ [0,0.1] ps => 100 #
# abs(diff) ~ (0.1,0.5] ps => 80 #
# abs(diff) ~ (0.5,1.0] ps => 50 #
# abs(diff) ~ (1,\inf] ps => 0 #
# discard all tests where post-CPPR slack is positive #
#-----------------------------------------------------#
# For each PATH: direct path number #
# match slack #
# match full path --> 75%, then check slack (if slack is correct) 25% #
# discard all tests where post-CPPR slack is positive #
#------------------------------------------------#
# Overall metrics: #
# a) average all tests #
# b) average all paths #
# c) most-critical test #
# d) most-critical path for each test #
# e) worst test accuracy score (discard one 0?) #
#------------------------------------------------#
Runtime factor was Contestant Runtime / Avg. (all contestant runtime)
To add in runtime, the formula was Accuracy * (0.6 + 0.2*RF + 0.2*M)
(reference: https://sites.google.com/site/taucontest2014/home/resources --> TAU2014Slides.pdf)
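Taken literally from the two lines above (the TAU 2014 formula), the combination could be sketched like this; `final_score` is a hypothetical name, and the 0.6/0.2/0.2 split and the relative-to-average factors are exactly as stated in the notes:

```python
def final_score(accuracy, runtime, avg_runtime, memory, avg_memory):
    """TAU 2014-style combined score: Accuracy * (0.6 + 0.2*RF + 0.2*M),
    where RF and M are the contestant's runtime and memory divided by
    the average over all contestants."""
    rf = runtime / avg_runtime   # runtime factor
    m = memory / avg_memory      # memory factor
    return accuracy * (0.6 + 0.2 * rf + 0.2 * m)
```

A contestant exactly at the average runtime and memory (RF = M = 1) keeps their raw accuracy score unchanged.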
7) TAU registration deadline
-- Timeline: early is $325, late is $375. Don't know what people's travel approval situation is, but would be really great if everyone is there.
8) TAU Contest Presentation
-- Timeline: March 12
-- Lead: Jin
-- Current request: will be asking for pictures! Find a reasonable headshot you're willing to use (see the TAU 2014 slides as reference)
report_at / report_slack / report_rat for all pins RFEL
-- potential pruning for only PI/PO/D pins of FFs if design size is potentially too big
-- every merge point and test point
report_worst_paths -pin <every test point> -numPaths 10 (?)
report_worst_paths -numPaths 100 (?)
TODO: Confirm that we're giving full credit when slacks are both positive.
TODO: Generate benchmarks to reflect (1) correct path tracing, (2) cutoff +9999, (3) worst_slew propagation.
TODO: finalize accuracy metrics (have working script).
Preliminary Data from IITrace:
s400 (fast)
======================================================================
VALUE ACCURACY: Average of 73656 data points is 99.76105 (raw sum = 73480)
======================================================================
===================================================================
High accuracy: 73480 of 73656 (99.76105)
Mid accuracy: 0 of 73656 (0.00000)
Low accuracy: 0 of 73656 (0.00000)
No accuracy: 176 of 73656 (0.23895)
===================================================================
No-accuracy values: rat / at / slack that are between (1, \infinity) picoseconds
Ops [43396]:: report_rat -pin inst_148:A1 :: Golden [42153] = 36.320 vs. Contestant [42432] = 55.479
Ops [43397]:: report_rat -pin inst_148:A1 -fall :: Golden [42154] = 62.982 vs. Contestant [42433] = 82.140
Ops [43400]:: report_slack -pin inst_148:A1 :: Golden [42157] = 363.693 vs. Contestant [42436] = 344.533
Ops [43401]:: report_slack -pin inst_148:A1 -fall :: Golden [42158] = 334.298 vs. Contestant [42437] = 315.139
edit_dist_ispd2 (~1 hour and 13G for ~150K gates but verbose queries)
User time (seconds): 5424.95
System time (seconds): 182.33
Percent of CPU this job got: 164%
Elapsed (wall clock) time (h:mm:ss or m:ss): 56:46.84
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 13643712
======================================================================
VALUE ACCURACY: Average of 40348224 data points is 87.36504 (raw sum = 35250240)
======================================================================
===================================================================
High accuracy: 35242492 of 40348224 (87.34583)
Mid accuracy: 4832 of 40348224 (0.01198)
Low accuracy: 7766 of 40348224 (0.01925)
No accuracy: 5093134 of 40348224 (12.62294)
===================================================================
Mid-accuracy values: rat / at / slack that are between (0.1, 0.5] picoseconds
Ops [10541727]:: report_rat -pin FE_OFC75511_n_67216:a -late :: Golden [10542463] = -3271.638 vs. Contestant [10542771] = -3271.337
Ops [10541731]:: report_slack -pin FE_OFC75511_n_67216:a -late :: Golden [10542467] = -8713.131 vs. Contestant [10542775] = -8712.831
Ops [10541740]:: report_rat -pin FE_OFC75511_n_67216:o -late -fall :: Golden [10542476] = -3145.623 vs. Contestant [10542784] = -3145.322
Ops [10541744]:: report_slack -pin FE_OFC75511_n_67216:o -late -fall :: Golden [10542480] = -8713.131 vs. Contestant [10542788] = -8712.831
Ops [10549909]:: report_rat -pin g1881672:a :: Golden [10550645] = 4778.676 vs. Contestant [10550953] = 4778.573
Ops [10549913]:: report_slack -pin g1881672:a :: Golden [10550649] = -566.190 vs. Contestant [10550957] = -566.087
Ops [10549921]:: report_rat -pin g1881672:b :: Golden [10550657] = 4736.973 vs. Contestant [10550965] = 4736.870
Ops [10549925]:: report_slack -pin g1881672:b :: Golden [10550661] = -543.617 vs. Contestant [10550969] = -543.514
Ops [10549934]:: report_rat -pin g1881672:o -fall :: Golden [10550670] = 4795.740 vs. Contestant [10550978] = 4795.637
Ops [10549938]:: report_slack -pin g1881672:o -fall :: Golden [10550674] = -566.190 vs. Contestant [10550982] = -566.087
Ops [11350982]:: report_rat -pin g1881123:a -fall :: Golden [11351718] = 4797.446 vs. Contestant [11352026] = 4797.343
Ops [11350986]:: report_slack -pin g1881123:a -fall :: Golden [11351722] = -785.242 vs. Contestant [11352030] = -785.140
Ops [11350994]:: report_rat -pin g1881123:b -fall :: Golden [11351730] = 4801.354 vs. Contestant [11352038] = 4801.251
Ops [11350998]:: report_slack -pin g1881123:b -fall :: Golden [11351734] = -776.945 vs. Contestant [11352042] = -776.841
Ops [11351006]:: report_rat -pin g1881123:c -fall :: Golden [11351742] = 4801.504 vs. Contestant [11352050] = 4801.402
Ops [11351010]:: report_slack -pin g1881123:c -fall :: Golden [11351746] = -566.190 vs. Contestant [11352054] = -566.087
....
-----------------------------------------------------------------
Low-accuracy values: rat / at / slack that are between (0.5, 1] picoseconds
Ops [10843455]:: report_rat -pin FE_OFC79044_n_3498:a -late :: Golden [10844191] = 8700.820 vs. Contestant [10844499] = 8701.769
Ops [10843456]:: report_rat -pin FE_OFC79044_n_3498:a -late -fall :: Golden [10844192] = 8713.203 vs. Contestant [10844500] = 8714.073
Ops [10843459]:: report_slack -pin FE_OFC79044_n_3498:a -late :: Golden [10844195] = -3864.567 vs. Contestant [10844503] = -3863.619
....
-----------------------------------------------------------------
No-accuracy values: rat / at / slack that are between (1, \infinity) picoseconds
Ops [1000003]:: report_rat -pin g2125110:o -late :: Golden [1000003] = 7885.923 vs. Contestant [1000003] = 7935.030
Ops [1000004]:: report_rat -pin g2125110:o -late -fall :: Golden [1000004] = 7887.097 vs. Contestant [1000004] = 7936.204
Ops [10000059]:: report_rat -pin b_in_105_1 -late :: Golden [10000795] = 3327.996 vs. Contestant [10001103] = 3336.396
Ops [10000060]:: report_rat -pin b_in_105_1 -late -fall :: Golden [10000796] = 3326.289 vs. Contestant [10001104] = 3334.689
....
netcard_iccad (forgot to measure, sorry :(, but hours' worth of time for ~1.5M gates and non-verbose output)
======================================================================
VALUE ACCURACY: Average of 44304 data points is 99.33731 (raw sum = 44010)
======================================================================
===================================================================
High accuracy: 42836 of 44304 (96.68653)
Mid accuracy: 1468 of 44304 (3.31347)
Low accuracy: 0 of 44304 (0.00000)
No accuracy: 0 of 44304 (0.00000)
===================================================================
Mid-accuracy values: rat / at / slack that are between (0.1, 0.5] picoseconds
Ops [100032]:: report_rat -pin g2309483_u0_a -late -fall :: Golden [43670] = 41632.539 vs. Contestant [42102] = 41632.433
Ops [100091]:: report_rat -pin g2303697_u2_a -late :: Golden [43729] = 46829.801 vs. Contestant [42161] = 46829.685
Ops [100103]:: report_rat -pin g2303738_u2_a -late :: Golden [43741] = 44624.215 vs. Contestant [42173] = 44624.108
edit_dist_ispd2.UItimer (~1 hour and 9.64G for ~150K gates but verbose queries and no report_worst_paths)
User time (seconds): 4831.13
System time (seconds): 472.77
Percent of CPU this job got: 288%
Elapsed (wall clock) time (h:mm:ss or m:ss): 30:41.20
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 9647648
Ops [10000146]:: report_slack -pin b_in_100_1 -fall :: Golden [10000882] = -8090.479 vs. Contestant [10001190] = -8067.769
Friday, 2/13/15:
1) Evaluation:
-- Greg says he has the ops generator ready (confirm that for paths, contestants are allowed more paths than the golden) - TODO: discuss benchmarks used for evaluation
-- Preliminary list: (16 with 7 released and 9 hidden)
Released:
tau2015_crc32d16N
netcard_iccad
fft_ispd
des_perf_ispd
vga_lcd
cordic_ispd
edit_dist_ispd
Hidden:
tau2015_cordic_core
tau2015_softusb_navre
tau2015_tip_master
b19_iccad
mgc_edit_dist_iccad
mgc_matrix_mult_iccad
vga_lcd_iccad
leon2_iccad
leon3mp_iccad
-- Evaluation script (goldenFileChecker.pl) for accuracy is ready (Based on 50% from value accuracy + 50% from path accuracy)
======
VALUE ACCURACY: Average of 44304 data points is 83.61322 (raw sum = 37044)
PATH ACCURACY: Average of 21 paths is 66.66667 (raw sum of 14)
FINAL ACCURACY: Average of path and value is 75.13994
======
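The 50/50 combination reported in the sample run above reduces to a simple average; sketched here just to pin down the arithmetic (the internals of goldenFileChecker.pl are not reproduced, and the function name is hypothetical):

```python
def final_accuracy(value_acc, path_acc):
    """Final accuracy = 50% value accuracy + 50% path accuracy,
    per the evaluation-script note above."""
    return 0.5 * value_acc + 0.5 * path_acc

# Sample run above: 0.5 * 83.61322 + 0.5 * 66.66667 ~= 75.13994
```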
-- TODO: script to gather runtime and memory usage from output (/usr/bin/time -v)
-- 60% * Accuracy + 20% * relative_runtime + 20% * relative_memory
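A sketch of the TODO above: pulling wall-clock time and peak memory out of `/usr/bin/time -v` output. The field names match the sample logs earlier in these notes; the parser itself is an assumption, not the actual script:

```python
import re

def parse_time_v(text):
    """Extract (elapsed wall-clock seconds, max RSS in kbytes)
    from /usr/bin/time -v output."""
    elapsed = re.search(
        r"Elapsed \(wall clock\) time \(h:mm:ss or m:ss\):\s*(\S+)",
        text).group(1)
    rss = int(re.search(
        r"Maximum resident set size \(kbytes\):\s*(\d+)", text).group(1))
    # Elapsed is h:mm:ss or m:ss; fold colon-separated fields into seconds.
    seconds = 0.0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds, rss
```

On the edit_dist_ispd2 log above ("56:46.84", max RSS 13643712 kbytes) this yields 3406.84 seconds and 13643712.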
2) Breakdown of prize money (~1500? to play with?)
Preliminary : 750 / 500 / 250