Mark Leighton Fisher Curriculum Vitae

Purdue University Digital Library Software Developer 6/2012-present

Added the creation of an OAIS AIP (Archival Information Package) as part of the document publication process.

Added the creation of an OAIS SIP (Submission Information Package) as part of the document publication process.

Created PHP O-O modules for METS (Metadata Encoding and Transmission Standard) metadata creation.

Created PHP O-O modules for PREMIS (PREservation Metadata: Implementation Strategies) metadata creation.

Created PHP O-O modules for Dublin Core metadata creation.

Perl Developer, Plain Black Corporation 4/2012-present

WebGUI

Worked towards enabling UTF-8 characters in WebGUI 7.10 passwords.

Added a "--delete userId [userId ...]" option to the WGDev 'user' command.

Found and fixed a bug in WebGUI for the Department of State Alumni customizations where the non-existent WebGUI::Storage->new() was called.

Found and fixed a bug in WebGUI core (7.10 branch) so unmatched character pairs (a '(' without a corresponding ')') do not break asset search.

Added a '-d' option to the WGDev 'ls' command so that 'ls' lists either the children of the specified asset or (using '-d') the asset itself.

Fixed WebGUI Spectre so startup and shutdown are logged at the INFO level even when everything is logged at the WARN level (Log::Log4perl config trick).
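
The Log::Log4perl trick is per-category log levels: leave the root logger at WARN and raise only the Spectre category to INFO. A minimal sketch of that kind of configuration (category and appender names here are illustrative, not WebGUI's actual ones):

    use Log::Log4perl;

    # Root logger stays at WARN; only the Spectre category is raised to INFO.
    my $conf = q(
        log4perl.logger                    = WARN, Screen
        log4perl.logger.WebGUI.Spectre     = INFO, Screen
        log4perl.additivity.WebGUI.Spectre = 0

        log4perl.appender.Screen           = Log::Log4perl::Appender::Screen
        log4perl.appender.Screen.layout    = Log::Log4perl::Layout::SimpleLayout
    );
    Log::Log4perl->init(\$conf);

    Log::Log4perl->get_logger('WebGUI::Spectre')->info('Spectre started');   # logged
    Log::Log4perl->get_logger('WebGUI::Other')->info('routine chatter');     # suppressed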

Forked WGDev on GitHub (https://github.com/pbmarklf/wgdev) to add the Batchedit command, which lets you edit an asset or URL directly from a shell script or batch file, e.g. wgd Batchedit --pattern=http...yui.yahooapis.com. --string=http://ajax.googleapis.com/ ChcG6WcA6jLwSTbWFez2Qg, where --pattern is the pattern to match, --string is the replacement string, and ChcG6WcA6jLwSTbWFez2Qg is the assetId or URL to edit.

Found and devised a workaround for a bug in the Twitter JavaScript search/list/faves/profile widget where https:// pages still contain some http:// Twitter links.

Found and fixed a MySQL deletion ordering issue with WebGUI.

Debugged various hosted WebGUI content issues.

Owner and Chief Consultant, Fisher's Creek Consulting LLC 2003-present

pmtools Perl Module Tools

Released pmtools-1.20 (Perl Module Tools), which includes fixes for pminst (display only unique package files; ignore non-existent directories in @INC) and the new tool pmcheck, which checks that Perl is set up for Perl modules (currently, that @INC contains only existing, readable directories).
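
The @INC check that pmcheck performs amounts to the following; this is a minimal sketch of the idea, not the shipped pmcheck source:

    #!/usr/bin/perl
    # Sanity-check @INC: every entry should be an existing, readable directory
    # (illustrative sketch, not the actual pmcheck code).
    use strict;
    use warnings;

    my $ok = 1;
    for my $dir (@INC) {
        next if ref $dir;                                    # skip @INC hooks (coderefs etc.)
        if    (!-e $dir) { warn "$dir does not exist\n";     $ok = 0; }
        elsif (!-d $dir) { warn "$dir is not a directory\n"; $ok = 0; }
        elsif (!-r $dir) { warn "$dir is not readable\n";    $ok = 0; }
    }
    print $ok ? "\@INC looks sane\n" : "\@INC has problems\n";
    exit($ok ? 0 : 1);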

Wrote and submitted a grant proposal to the Perl Foundation for porting pmtools (Perl Module Tools) to Perl 6.

Took over maintenance of pmtools, Tom Christiansen's Perl module tools suite – 3 enhanced releases so far. pmtools are a standard part of Fedora Linux.

pmtools-perl6 Perl Module Tools for Perl 6

Started pmtools-perl6, a port of pmtools to Perl 6. A Config.pm generator for Pugs has already been released.

DeepData, Inc. (2003-2004)

Sr. Perl Designer/Implementer, PostgreSQL Optimization Expert, and Automated Testing Expert for DeepData, a Yellow Pages strategic information solutions provider.

Wrote Test::MockDBI, a Perl module for testing database programs by mocking up the Perl database interface (DBI) API, then allowing developers to write rules for the mockup DBI's behavior.

Modified a Yellow Pages processing program to handle processing all U.S. medical Yellow Pages headings (3.7M records).

Enhanced DeepData's address standardization system by assigning ZIP codes to all possible addresses, adding latitude/longitude to all possible addresses, and standardizing all address components.

Java

Learned Java; passed the Sun Certified Java Programmer exam on 2005-05-12 with 70% (minimum passing score was 52%). Writing a YAML parser and a weblogging framework in Java as an aid to learning Java thoroughly.

TUTOS

Installed, configured, and fixed TUTOS (The Ultimate Team Organization Software) for project management. Fixes were for proper Gantt chart display when a task starts and ends on the same day, and for proper milestone task entry.

Nagios

Installed and configured Nagios Open Source network monitoring software on DeepData and Thyatia.net servers.

Web Scraper

Created a Boston Building Permits web scraper for PropertyShark.

Writing

Published weblog essays picked up by Slashdot, Scripting News, and other news sites.

Systems Engineer II, Regenstrief Institute Inc. 8/2005-3/2012

RELMA (Regenstrief LOINC Mapping Assistant) programming team -- VB.NET, C#, VB6, SQL, Microsoft Access/Jet/VBA, Perl, Java

Ported RELMA from VB6 to VB.NET 2.0. Later ported RELMA from VB.NET 2.0 to VB.NET 4.0.

Added foreign-language search to RELMA's SQL database-based search. This included writing a Perl self-extracting Windows GUI program for loading the foreign language indexes.

Wrote a program that generates databases of LOINC name Parts for the LOINC translators to load with their translations.

Switched RELMA's LOINC search and indexing to Lucene.NET. This required developing standalone programs to index LOINCs, LOINC Part names, and LOINC answer lists. Several languages were indexed, including English, French, Greek, and Chinese. This also required some reading of the Lucene Java and Lucene.NET C# code to answer detailed questions about Lucene's internal workings (including whether Lucene generates parse trees -- which it does not).

Developed IgnoreUnknownWordsQueryParser, a Lucene query parser that ignores unknown words (inherits from Lucene.Net.QueryParsers.QueryParser).

Under the direction of Dr. Clement J. McDonald, developed software (with both batch and interactive modes) to automatically find mappings from local terms to LOINCs (the RELMA AutoMapper) using a longest-name-first matching algorithm (which required developing an n-gram class for .NET). This AutoMapper standardizes timings and part names, finds specimens/systems, methods, and other parts of LOINC names in the HL7 OBR and OBX descriptions, guesses specimens/systems if none are present in the HL7 OBR and OBX descriptions, then searches the Lucene index of LOINCs for matching LOINCs. If units are available for the local term, the search is restricted to LOINCs that match the property corresponding to the units. Results are scored such that the best match in the fewest words wins. If the AutoMapper is run against laboratory local terms, scoring ties are settled by the LOINC that is more commonly used as a lab test.
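
The longest-name-first strategy rests on generating word n-grams of a term and trying the longest ones first. The actual n-gram class was written for .NET; the Perl fragment below is only an illustration of that ordering:

    # Word n-grams of a term, longest first -- the candidate order a
    # longest-name-first matcher works through (illustrative, not the .NET class).
    use strict;
    use warnings;

    sub ngrams_longest_first {
        my @words = split ' ', lc shift;
        my $n     = @words;
        my @grams;
        for my $len (reverse 1 .. $n) {                 # longest n-grams first
            for my $start (0 .. $n - $len) {
                push @grams, join ' ', @words[$start .. $start + $len - 1];
            }
        }
        return @grams;
    }

    # "SERUM GLUCOSE LEVEL" -> "serum glucose level", "serum glucose",
    # "glucose level", "serum", "glucose", "level"
    print "$_\n" for ngrams_longest_first('SERUM GLUCOSE LEVEL');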

Some experimentation was done using a Lucene index that contained only the unique words in the default Lucene field. The results were that using a non-unique-words index (i.e. allowing phrase searching) proved more effective in the current LOINC Lucene indexing scheme.

Added LOINC search restrictions for laboratory LOINCs, the Top 2000+ Lab Observations, and the Top 300+ Laboratory Order Codes. These changes were made in the Lucene LOINC index and the RELMA program.

Added dynamic (on-demand) loading of tree data to the Regenstrief .NET tree/grid control. This included separate loading/showing/hiding of LOINC leaf nodes.

Added a search mode to the .NET tree data code such that only the matching nodes and their ancestors are displayed.
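
That search mode reduces to a recursive filter: keep a node if it matches or if any descendant matches, which automatically preserves the ancestors of every match. A sketch of the rule (illustrative Perl; the real control was .NET):

    # Keep a node if it matches the query or any descendant does, so the
    # ancestors of every match stay visible (illustrative sketch only).
    use strict;
    use warnings;

    sub filter_tree {
        my ($node, $matches) = @_;          # $node = { name => ..., children => [...] }
        my @kept = grep { defined }
                   map  { filter_tree($_, $matches) } @{ $node->{children} || [] };
        return undef unless @kept || $matches->($node);
        return { name => $node->{name}, children => \@kept };
    }

    # Usage sketch: show only nodes whose name (or a descendant's name) matches.
    # my $visible = filter_tree($root, sub { $_[0]{name} =~ /glucose/i });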

Added a configuration dialog to the Regenstrief .NET tree/grid control that allowed for hidden columns and always-visible columns as well as controlling column placement and width.

Added a custom export dialog to the Regenstrief .NET tree/grid control that allowed for the export (with or without header rows) of all rows or only the selected rows. Custom export actions include copying to the Clipboard, printing, exporting as a CSV file, exporting to Excel, and sending as an email (tested with Outlook, Outlook Express, and Thunderbird).

Added multi-column prioritized reversible sorting to the Regenstrief .NET tree/grid control.

Created HTML output formats for LOINCs, LOINC name parts, and LOINC answer lists. This required handling multi-megabyte HTML files that allow persistent configuring of whether each section of the HTML should be shown or hidden using JavaScript manipulation of CSS class styles. These HTML files could be loaded from the Internet, or built locally using an LRU file cache.

Updated the HTML formats for faster apparent rendering of tables by incrementally rendering the tables after applying the CSS style 'table-layout: fixed'.

Added a simple, Google-like search screen to RELMA.

Enhanced RELMA's HL7 importing mechanism to include importing code systems and importing OBR and OBX NTE segments (message note segments).

Added a command-line interface to the program for extracting the releasable LOINC database from the master LOINC database.

Added deletion of unreleasable reference information (descriptions etc.) to the program for extracting the releasable LOINC database from the master LOINC database.

Added verification of whether the specified releasable tables and fields actually exist in the master LOINC database to the program for extracting the releasable LOINC database from the master LOINC database.

Wrote and enhanced various .NET/VB6/Perl/SQL routines for dealing with hierarchical Access database tables in both linked parent-child and path enumeration ('Dewey Decimal') formats.
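
A quick illustration of the two storage formats and why they are queried differently; the table and column names here are hypothetical, not the actual schema:

    use strict;
    use warnings;
    use DBI;

    # Hypothetical connection and node -- for illustration only.
    my $dbh = DBI->connect('dbi:ODBC:relma', '', '', { RaiseError => 1 });
    my ($node_id, $node_path) = (42, '1.7.3');

    # 1. Linked parent-child (adjacency list): fetch immediate children, then
    #    recurse level by level to reach the whole subtree.
    my $children = $dbh->selectcol_arrayref(
        'SELECT id FROM hierarchy WHERE parent_id = ?', undef, $node_id);

    # 2. Path enumeration ("Dewey Decimal"): the path column encodes ancestry
    #    (e.g. "1.7.3"), so a single LIKE pulls the entire subtree at once.
    my $subtree = $dbh->selectall_arrayref(
        'SELECT id, path FROM hierarchy WHERE path LIKE ?', undef, "$node_path.%");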

Maintained external and internal RELMA Citrix applications.

Wrote and upgraded Microsoft Access reports.

Developed internal VB.NET-compatible unit testing framework in VB6.

Started using the Microsoft Moles isolation/mocking framework for unit testing with MSTest.

Communicable Diseases reporting for the Indiana Network for Patient Care (INPC)

Developed communicable-disease county and state reporting for the INPC (Indiana Network for Patient Care) using Access, SQL, and VBA.

Shared Pathology Information Network (SPIN)

Created first port of the SPIN client from VB6 to .NET 1.0.

Reworked the XSLT transform for the SPIN client report; tested it using Java and JUnit. Note that i2b2 now handles most of what the SPIN client initially handled for Regenstrief.

Java Class

Implemented part of an X12-270 reader in Java.

Thomson, Inc. (1991-2003)

I was a member of the IT group Electronic Design Automation, which provided computer software and hardware assistance to the engineering community. Unless otherwise mentioned, all projects below were my sole responsibility.

Corporate Technical Memory

The Thomson Corporate Technical Memory (CTM) was a knowledge management application before the phrase "knowledge management" was even coined (1994 — which made it hard to explain initially). CTM enabled users to write information down into a document once (rather than repeatedly explain the information to each questioner), which then enabled other users to find the information from their browser by full-text document searches, searches on document title or author, or by browsing the main subject area of a document, browsing by author name, or browsing by document title. CTM V2.0's code consisted of a main Perl module along with associated CGI programs and an external search engine (V1.0 used a Perl4 daemon program and simple Perl4 CGI programs with an external search engine). Document metadata was stored in an Oracle database while the documents were stored on the webserver's filesystem. I published a paper on CTM, "The TCE Corporate Technical Memory: groupware on the cheap", in the International Journal of Human-Computer Studies, Volume 46, Issue 6 (June 1997) (Special issue: innovative applications of the World Wide Web). Writing CTM saved Thomson $180,000, as the only alternative was a Lotus Notes-based system.

Status Reporter

Status Reporter V6.0 enabled automatic assembling of text-based status reports by allowing users to enter status for each task of each project they had worked on recently. The status report assembler could then be invoked over a specified time period for that group of people, producing one unified status report for the whole group. Using Status Reporter saved around half a day of work for each group, as manual assembly took around 4 person-hours per group at Thomson. V6.0 was web-based, storing the reports in an Oracle database (previous versions were in 16-bit Microsoft Word's WordBasic with reports in Word .DOC files). Status Reporter's code consisted of a main Perl module along with associated CGI programs and a smattering of helper Perl utility programs. Status Reporter saved Thomson over 2000 person-hours by automating status report assembly.

Thomson Application Status (UpApp)

Thomson Application Status (UpApp) provided a concise view of the up/down/in-trouble status of major Thomson applications/systems for end users in their browsers. UpApp was co-developed with Mark Bender: I developed the main Perl module and mod_perl CGI-like programs while he developed the HTML user interface. UpApp status information was held in a MySQL database to provide easy multi-admin updating. UpApp also provided both XML-RPC and SOAP Web Service interfaces, including an all-application status — red if any applications were "red" (crashed), yellow if no applications were red but any applications were "yellow" (in-trouble), and "green" otherwise.
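
The all-application roll-up is a simple precedence rule: any red application makes the overall status red, otherwise any yellow makes it yellow, otherwise it is green. A minimal sketch of that rule (not UpApp's actual code):

    # All-application status roll-up: red beats yellow beats green
    # (illustrative sketch, not the UpApp source).
    sub overall_status {
        my @statuses = @_;                              # e.g. qw(green yellow green)
        return 'red'    if grep { $_ eq 'red' }    @statuses;
        return 'yellow' if grep { $_ eq 'yellow' } @statuses;
        return 'green';
    }

    print overall_status(qw(green yellow green)), "\n";  # prints "yellow"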

Overview and Overdrive

Overview was a search engine I set up that covered the ten webservers most used by Thomson Americas at a time when there was no overall Thomson Americas search engine. Overview used the Verity SEARCH'97 search engine software. Overdrive was started as the successor to Overview, which would use an Open Source search engine on all 30+ Thomson Americas webservers. Overdrive was in beta-test using ht://Dig's December 2001 beta of V3.2.0 with very nice results — often the first search result was the term searched for when the search result page had Dublin Core metadata (this was on a 200MHz RedHat V6.2 Linux box). I had started setting up ASPSeek V1.2.10 to compare against ht://Dig.

Nagios

Nagios is an Open Source network monitoring tool using plugins for monitoring hosts and services, where the plugins cover monitoring everything from basic PING functionality to Oracle and MySQL databases and FlexLM license servers. I set up Nagios to monitor the 27 services on 19 servers that our group was either responsible for or affected by. Nagios provided the back-end functionality for tracking applications/servers/hosts whose status would be displayed to end users by UpApp. Setting up Nagios likely saved Thomson more than $20,000 compared to commercial alternatives.

EDA/Weblog

EDA/Weblog was an internal weblog for the engineering and IT communities at Thomson, powered by Slashcode (Slashcode powers Slashdot). Articles/stories were stored in a MySQL database. EDA/Weblog featured announcements of internal interest (new systems in production, policy changes, etc.) along with links to particularly interesting technical news articles. EDA/Weblog was Thomson's first weblog (March 2002).

EDA (Electronic Design Automation) Home Page

I started and maintained Thomson's first web server in 1993, which provided a home for EDA services to the engineering and IT communities. Except as noted, all web pages and applications mentioned were on that server. When I left, the server was a Sun SPARC Ultra 30 running Solaris 2.6. The webserver programs were Stronghold V2.4 (an early Apache V1.3.x with a licensed SSL implementation) and Apache V1.3.23 with mod_perl V1.26. (CERN httpd was used before we switched to Apache.) The server had ~3000 pages on it when I left Thomson.

ISO 9001 Web

Thomson Americas TV/Audio/Video Product Development received its ISO 9001 quality certification around 1995. I maintained the ISO 9001 Web as well as the Perl web gateway to the Thomson ISO 9001-compliant document management system, DMS9000. The DMS9000 web gateway provided URL-based document retrieval (with URLs almost identical to the native DMS9000 document paths) along with simple and advanced DMS9000 document searches and a CGI program that converted native DMS9000 document paths to their equivalent URLs.

PACO: Password-Checking Objects

PACO (Password-Checking Objects) was a Perl system that helped ensure the use of good passwords by running the password through Crack V5.0 by Alec Muffett. PACO provided a web form interface, although the intention was to provide other interfaces suitable for programs (like XML-RPC, SOAP, etc.) so that all Thomson programs could use PACO rather than their own password strength-checking code. PACO remembered a hash of the password so it could tell you whether you were reusing an old password without compromising the actual password.
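
Detecting reuse without storing passwords only requires keeping a one-way hash of each past password and comparing hashes. The sketch below shows the general idea with a salted SHA-256 digest; the specific digest and salting details are assumptions, not PACO's actual implementation:

    # Reuse check over stored hashes: the passwords themselves are never kept
    # (illustrative sketch; digest choice and salting are assumptions).
    use strict;
    use warnings;
    use Digest::SHA qw(sha256_hex);

    sub password_was_used_before {
        my ($candidate, $salt, $old_hashes) = @_;   # $old_hashes = arrayref of hex digests
        my $digest = sha256_hex($salt . $candidate);
        return scalar grep { $_ eq $digest } @$old_hashes;
    }

    my $salt       = 'per-user-salt';               # hypothetical
    my @old_hashes = ( sha256_hex($salt . 'hunter2') );
    print password_was_used_before('hunter2', $salt, \@old_hashes)
        ? "reused an old password\n"
        : "new password\n";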

CADWeb

CADWeb was a separate webserver (Perl CGIs with Apache on HP-UX V10.20) that drove the additional CAD utility programs written by Thomson employees for use with the Visula and P870 PCB CAD packages. These CAD utility programs originally ran on a text GUI under VAX/VMS, so this was a non-trivial migration task. My major accomplishment for CADWeb was writing the authentication code, which accepted a user's Oracle username and password, producing an HTTP cookie used for later re-authentication using methods based on HTTP Digest Authentication. This authentication code is similar in functionality to the Apache module mod_cookie, with the actual authentication deferred to a program specified in a configuration file — this code was later used for domain (SMB/NT/LanMAN/etc.) authentication by switching authentication programs. CADWeb users would upload CAD files to the server, request a run of a program, then download the resulting output files from the requested program.
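
The re-authentication cookie amounts to a token the server can verify later without keeping session state. The sketch below uses an HMAC-signed "user:expiry" value to illustrate the general shape; the original scheme was based on HTTP Digest Authentication, so treat every detail here as an assumption rather than the CADWeb code:

    # Signed-cookie sketch: issue "user:expiry:signature", verify it later
    # (illustrative only; not the original Digest-based CADWeb scheme).
    use strict;
    use warnings;
    use Digest::SHA qw(hmac_sha256_hex);

    my $secret = 'server-side secret';              # hypothetical

    sub issue_cookie {
        my ($user, $ttl) = @_;
        my $expires = time + $ttl;
        my $sig     = hmac_sha256_hex("$user:$expires", $secret);
        return "$user:$expires:$sig";
    }

    sub verify_cookie {
        my ($cookie) = @_;
        my ($user, $expires, $sig) = split /:/, $cookie;
        return q{} if !defined $sig || !defined $expires || $expires < time;
        return hmac_sha256_hex("$user:$expires", $secret) eq $sig ? $user : q{};
    }

    my $cookie = issue_cookie('oracle_user', 3600);
    print verify_cookie($cookie) ? "re-authenticated\n" : "rejected\n";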

In-Circuit Test

In-circuit testing is (basically) the process of testing that components are correctly placed and soldered into a printed-circuit board (PCB). Thomson's program for In-Circuit Test (ICT) testpoint generation/selection was created in 1987, and by the time I received it had grown to ~20 000 lines of C. I made numerous improvements to it, most notably the ability to test topside components from the bottom side of the board and the addition of boundary-scan testing outputs. During the boundary-scan improvement I added a hash table implementation based on Perl's hash table implementation — before then all name searches were done as sequential scans of linked lists, as in the original 1987 ICT code. I also used the Perl-Compatible Regular Expressions library. ICT ran as part of the CADWeb suite of tools. ICT migrated from HP-UX V9.05 to HP-UX V10.20 while I was maintaining it.

Incircuit Testpoint Analysis and Placement

Incircuit Testpoint Analysis and Placement (ITAP) was slated to replace ICT. Initial design and some code was done by Rick Lyndon, then I was brought on board to speed development and create a detailed project management plan (I ended up doing around 30% of the coding). ITAP ended up as ~20 000 lines of Microsoft Visual C++, running on the user's Windows PC with vastly greater capabilities, including avoiding testpoint placement under component bodies, built-in HP TestJet processing, and much more table-driven operation so that ICT parameters could be tweaked without recompiling ITAP. ITAP included object-oriented wrappers around the Perl-Compatible Regular Expressions library and the Perl-like hash table used in ICT.

Full Circle Compare

Full Circle Compare (FCC) was a several-thousand-line C++ program that verified that identical components were present in the Bill of Materials (BOM), schematic CAD file, and PCB CAD file. FCC was obsoleted by a more-integrated design process that drove the BOM and the PCB CAD file from the schematic CAD file. I took over FCC from the original outside developer (David Sarnoff Research Center) in 1991, got it into production shape, and maintained it until it was obsolete around 1999-2000. FCC went into production on VAX/VMS, then migrated to HP-UX V9.05, then finally to HP-UX V10.20.

Software Engineering and Project Management

I was the group resource on software engineering and project management. I was the project manager on three projects:

    • ITAP (when I joined the project)

    • ICT: Adding the testing of topside components from the bottom side of the PCB

    • ICT: Boundary-scan testing outputs

which were all on-time and on-budget as far as they went (ITAP development was placed on hiatus before it was completed — we delivered two production versions before hiatus).

Perl (See also: Links -> Perl)

I started using Perl with V4.0.19 in 1992, which has greatly increased my productivity due to both the built-in features of Perl (so much of programming is parsing text, at which Perl excels) and the advent of the Comprehensive Perl Archive Network (CPAN), which provides Open Source Perl modules for about any task you can think of. The Links -> Perl section below details my contributions to Perl.

World Wide Web (See also: Links -> World Wide Web)

I started using Web technology in 1993, two years before Thomson had an Internet connection (see Links -> Humor for my comments on FTP-by-mail). (For a long time I kept around my Sept. 21, 1993 email from Tim Berners-Lee about how commercial enterprises could also use CERN's httpd webserver without paying licensing fees.) I was a member of several Internet Engineering Task Force working groups, leading to enough participation in the WebDAV group to merit an acknowledgment in the initial WebDAV RFC. Much of my work at Thomson leveraged Web technology, including the Corporate Technical Memory, Status Reporter, UpApp, CADWeb, ISO 9001 Web, Overview and Overdrive, PACO, EDA Home Page, EDA/Weblog, and Nagios. The Links -> World Wide Web section shows some of my involvement with Web technology.

Technical Library Applications

I installed and maintained the library automation software, EOLAS (initially named DataTrek). I set up and maintained WISE install programs for the Technical Library Applications, which were applications supplied by the Technical Library. Among these applications were the EOLAS online catalog search, Books in Print, Microsoft Bookshelf, and the CAPSXpert component library.

IBM Lotus Team Workplace (QuickPlace)

As part of the Thomson Global Knowledge Management initiative, I constructed QuickPlaces for ICT and ITAP. These QuickPlaces contained discussion areas, design documents, project management plans, and status reports for ICT and ITAP.

SlashWiki - a Wiki for Slashcode

As part of EDA/Weblog I set up SlashWiki, a WikiWiki for Slashcode-driven servers that adds the possibility of requiring authentication to edit certain pages. WikiWikis, Status Reporter, and Slashcode all use simplified text markup schemes to enable users to create HTML content without learning HTML.

Cypherpunks

To help track security issues I have lurked on (and occasionally contributed to) the cypherpunks mailing list. Cypherpunks explores the social implications of security/privacy/cryptography technology, occasionally delving deep into technical details when appropriate. It is unfortunately also the home to a lot of spam and some amount of knee-jerk anti-authoritarianism. It is my opinion that some amount of crypto anarchy is coming, so it is better to be prepared.

PGP and GNU Privacy Guard

Before Thomson switched to Entrust, I was the expert on PGP (Pretty Good Privacy), an encryption/decryption tool. I am now using GNU Privacy Guard, a mostly-compatible free software replacement for PGP. (Entrust is a better solution for large corporations due to Entrust's centralized automatic key handling.)

HCL-eXceed X Window System server

When Windows 3.0 came out, I spent months evaluating Windows-based X Window System servers, settling on the HCL-eXceed X Window System server. HCL-eXceed exemplifies what I like in commercial, closed-source software:

    1. It just works; and

    2. If it doesn't work, the defect is either minor or fixed quickly.

I don't have a problem working with closed-source software, but I don't like seeing many defects in it either. Using HCL-eXceed probably saved Thomson over $100,000 by allowing access to Unix workstation resources from Windows PCs.

Wintek, Inc. (1984-1991)

I worked as part of a six-member software development team, mainly on electronic CAD products for MS-DOS.

smARTWORK Editor

smARTWORK is a 50-mil rule PCB CAD package for which I became the primary developer of the Editor program. smARTWORK was the first PCB CAD package available for DOS.

smARTWORK simulated annealing Automatic Component Placement

smARTWORK included an Automatic Component Placement program that used a simulated annealing algorithm, developed when the main information on simulated annealing was one book and a handful of research papers. I took over as developer late in the initial code-writing stage, bringing it to production status. This was probably the first commercial use of the simulated-annealing algorithm.

smARTWORK Video Drivers

I took over maintenance and enhancement of smARTWORK's video drivers, which included the development of smARTWORK's first 800x600 video driver.

smARTWORK Copy Protection

I took over maintenance of smARTWORK's copy-protection scheme, which involved manipulation of the DOS floppy filesystem.

Installation Program Generator

I wrote what was probably one of the first installation program generators, using it to generate the batch files necessary for installing smARTWORK and HiWIRE from either 5.25" or 3.5" floppies. Rick Strehlow's extremely clearly-written installation batch files for smARTWORK inspired me to write it.

HiWIRE II Main Menu Program

HiWIRE II uses a text GUI for its main menu. I wrote both the text GUI menu program (my first sizable C++ program) and the program overlay manager (my last sizable assembly program).

HiWIRE II Editor Memory Manager

I wrote the HiWIRE II Editor's memory manager, which allows objects to be stored in conventional memory, in extended/expanded memory, or on disk.

smARTWORK and HiWIRE II Netlist Converter

On my own initiative I wrote a netlist converter for smARTWORK and HiWIRE II for inputting foreign netlist formats (including EDIF).

Serial Port Interface for Remote Debugger

I wrote the serial port interface for remote debugging of microcontroller boards, along with the user documentation for the remote debugger.

MDBS (Micro Data Base Systems, 1981-1984)

I worked as part of a team of 20+ developers, although my projects (except as noted) were my sole responsibility. MDBS IV is a network CODASYL database management system (pre-relational database).

MDBS IV Page-Level Memory Manager

The MDBS IV page-level memory manager used a best-fit algorithm from Knuth's The Art of Computer Programming.
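
Best fit means scanning the free list for the smallest block that still satisfies the request. A sketch of that selection step (illustrative Perl; the original memory manager was C inside MDBS IV):

    # Best-fit selection over a free list: choose the smallest block that is
    # still large enough (illustrative; the real allocator was written in C).
    use strict;
    use warnings;

    sub best_fit {
        my ($free_list, $request) = @_;     # $free_list = [ { offset => ..., size => ... }, ... ]
        my $best;
        for my $block (@$free_list) {
            next if $block->{size} < $request;
            $best = $block if !defined $best || $block->{size} < $best->{size};
        }
        return $best;                       # undef when nothing fits
    }

    my @free = ( { offset => 0,   size => 64 },
                 { offset => 100, size => 24 },
                 { offset => 200, size => 40 } );
    my $hit = best_fit(\@free, 30);
    printf "allocate 30 bytes at offset %d\n", $hit->{offset};   # offset 200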

MDBS IV Porting (CP/M, WICAT, Unix)

For 18 months I was responsible for part of the MDBS IV C code porting effort between CP/M, VAX Unix, and WICAT (a Unix look-alike for workstations).

Regular Expression Search

On my own I wrote both globbing and regular expression matching code in C. The regular expression code is still part of MDBS's GURU expert-system-enhanced database product.

Hash Tables (Associative Arrays)

I wrote hash table (associative array) code in C.

Fixed Hash Table Generator

I wrote a fixed hash table generator program, similar to GNU's gperf.

MDBS IV User Query External Sort

The MDBS IV user query program allows users to arbitrarily sort their output queries (like ORDER BY in SQL, but for a CODASYL database) using an external sorting function (sorting chunks into multiple files, then merging the files), co-written by Tim Stockman and me.
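
The external sort follows the classic two-pass pattern: sort fixed-size chunks into temporary files, then merge the files. A compact line-oriented sketch of the pattern (illustrative Perl; the MDBS IV version was C and sorted query records, not text lines):

    # Two-pass external sort: spill sorted chunks to temp files, then k-way merge.
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    sub external_sort {
        my ($in_fh, $out_fh, $chunk_size) = @_;
        my @chunk_fhs;

        # Pass 1: read up to $chunk_size lines, sort them in memory, spill to disk.
        while (!eof $in_fh) {
            my @chunk;
            while (@chunk < $chunk_size && defined(my $line = <$in_fh>)) {
                push @chunk, $line;
            }
            my ($fh) = tempfile(UNLINK => 1);
            print {$fh} sort @chunk;
            seek $fh, 0, 0;
            push @chunk_fhs, $fh;
        }

        # Pass 2: k-way merge -- repeatedly emit the smallest head line.
        my @heads = map { scalar <$_> } @chunk_fhs;
        while (grep { defined } @heads) {
            my ($min) = sort { $heads[$a] cmp $heads[$b] }
                        grep { defined $heads[$_] } 0 .. $#heads;
            print {$out_fh} $heads[$min];
            $heads[$min] = readline($chunk_fhs[$min]);
        }
    }

    external_sort(\*STDIN, \*STDOUT, 100_000);    # e.g. sort STDIN to STDOUT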

Unix Device Driver for Named Pipes

To provide client-server IPC on early Unix boxes, I wrote a device driver that provided a named pipe facility on Unix before "named pipes" were a standard part of Unix.

CP/M Filesystem Emulator for Unix Floppies

Tim Stockman and I wrote a CP/M filesystem emulator for Unix floppies to enable easy copying of files between CP/M and Unix systems.

C Compiler Selection

(A "how times have changed" entry.) I was part of the team that finally selected the BDS software C compiler (by Leor Zolman) for CP/M software development in C. This required a compiler that could have a new back-end written for it that generated a pseudo-assembly that assembled down into either 8080, 8086, or Z-80 instructions.

Unix Systems Administration

Part of my duties included system administration of their PDP-11, VAX 11/780, and WICAT computer systems.

Northrop Defense Systems Division (1979-1981)

My duties included developing radar countermeasures (ECM) software. I held a SECRET clearance. I can't say more than that, except that the project name was unclassified while my participation in the project was classified.

Purdue (as B.S.E.E. student 1975-1979)

IEEE Dictionary Computerization (1978-1979)

Professor Benjamin Leon hired me as a research assistant to work on the first attempt to computerize the IEEE Dictionary of Electrical and Electronics Terms.

Unix Systems Administration (1978-1979)

I was one of two Unix system administrators (V6 and V7 PDP-11 Unix) for what was still called the ARPA Lab, although by that point the Lab's focus was (as I remember) on computer vision (this was in the pre-TCP/IP days).

8080 Cluster Schematic (1978)

I did an independent study project that designed an 8080 computer board to be used as part of an 8080 cluster computer.

APL Editor (1978)

I did an independent study project to create a standard APL editor.

B.S.E.E.

I graduated with a Bachelor of Science in Electrical Engineering with a concentration in computer engineering.

John Adams High School, South Bend, IN (1971-1975)

National Merit Scholarship

I was one of the National Merit Scholarship winners for 1975. This was due to my SAT scores of 730 verbal and 700 math (1430 total).

Graduated Cum Laude

I graduated Cum Laude (top 10% of my class).

Purdue President's Freshman Engineering Scholarship

I received a Purdue President's Freshman Engineering Scholarship for my freshman year at Purdue.

Experience

* I worked in assembly language from 1979 to 1990 (80x86 assembly from 1981 to 1990). The only assembly language I've used since 1990 has been in a debugger.

Other Influences

APL

Purdue's computer architecture class used APL as the design language, so I got quite familiar with APL. I even wrote an editor for APL as an independent research project for the computer architecture professor. APL opened my eyes to just how differently you can approach programming a computer. I have not used APL to any degree since Purdue, however.

FORTH

I used FORTH on a Commodore 64 in the early 1980s, before C++ came out, partially in an effort to help develop a saleable videotape library system for video rental stores with a group of friends. FORTH showed me how powerful language extensions could be, thereby preparing me for object-oriented programming with C++ and Perl. <namedrop>I even answered Brian Kernighan's question about what FORTH is during his second visit to Purdue ("a very-high-level extensible assembly for a stack-based virtual machine").</namedrop>

LISP

Although I've written fewer than a hundred or so lines of LISP in my life, I have followed LISP developments for many years, because LISP is a favorite of academics due to its flexible nature (apart from the parentheses). Among other nuggets of knowledge, I can thank LISP for what I know about garbage collection.

Haskell

Haskell is my first foray into functional programming. I am just beginning to learn Haskell, but it has already turned my attention to the occasions where a no-side-effects programming style is appropriate.

Expert Systems

Around 1992-1993 I took a class in Expert Systems while at Thomson. This inspired me to get a copy of Translate Rules To C, a simple forward-chaining expert system that compiled the rules down to a C program. I was only able to use my Expert Systems knowledge indirectly, however, as part of my ICT and ITAP work at Thomson. IMHO, an Expert System is just part of what is required for intelligence — there are lots of other components.

Lucene and Lucy

Lucene is Apache's powerful (and popular) search engine technology. Products ranging from Netflix to RELMA use Lucene's technology. Apache Lucy is Lucene's lesser-known cousin, a "loose C" port (their words) of Lucene. I learned a lot about designing search indexes and search strategies by learning Lucene, Lucene.NET, and Lucy. Lucy is exciting to me because, since Lucy is in C, languages like Perl, PHP, and Python will eventually be able to make use of Lucene/Lucy's power (Lucy already has a Perl binding).

CYC and OpenCYC

OpenCYC is the Open Source version of CYC, an attempt to write down common sense in a form that computers can use (my paraphrase), basically repeating the learning process every child goes through. The end result should be a usable artificial intelligence that could be plugged into other programs.

Honors and Publications

Education and Training

Advanced Perl Programming

Taught by Nathan Torkington of the Tom Christiansen Perl Consultancy.

Real-World Requirements

Taught by Steve Tockey of Construx.

Real-World Software Testing

Taught by Steve Tockey of Construx.

Information Systems Project Management

Taught by Mark A. Reed for the American Management Association International.

Information Engineering - Analysis

Taught by the staff of Thomson Consumer Electronics Information Systems.

Expert Systems

Taught by the staff of Thomson Consumer Electronics Information Systems.

Micro-based Automated Library Systems

Taught by the staff of the Indiana Cooperative Library Services Authority (INCOLSA).

Object-Oriented Concepts & Design: Advanced C++ Workshop

Taught by the staff of the Technology Exchange Company (the teacher was a colleague of Jim Coplien's whose name I can't find in my paperwork).

Links

World Wide Web

Perl

Perl6 RFC349: Perl modules should be built with a Perl make program

My RFC proposing that only Perl itself should be built with make(1) -- everything else should be built with a Perl make program. This appears to have been implicitly accepted by the Parrot development team, at least.

CGI

Version 2.90: Documentation patch describing how the import_names() method transforms parameter names into valid Perl names.

Version 2.36: Patch so that cookie and HTTP date formats now meet spec.

CORE

perlhack: External Tools Rational Purify documentation patch (perl5-porters email message)

hv.h: Documentation patch (perl5-porters email message)

hv.c: Patch to use Bob Jenkins' One-at-a-Time key hashing function (perl5-porters email message)
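
For reference, the One-at-a-Time function mixes each byte of the key into a running 32-bit hash and then finalizes it. A Perl transcription for illustration (the patch itself was against the C code in hv.c):

    # Bob Jenkins' One-at-a-Time hash, transcribed to Perl for illustration;
    # the actual patch changed the C implementation in hv.c.
    use strict;
    use warnings;

    sub one_at_a_time {
        my ($key) = @_;
        my $hash  = 0;
        for my $byte (unpack 'C*', $key) {
            $hash = ($hash + $byte)         & 0xFFFFFFFF;
            $hash = ($hash + ($hash << 10)) & 0xFFFFFFFF;
            $hash = ($hash ^ ($hash >> 6))  & 0xFFFFFFFF;
        }
        $hash = ($hash + ($hash << 3))  & 0xFFFFFFFF;
        $hash = ($hash ^ ($hash >> 11)) & 0xFFFFFFFF;
        $hash = ($hash + ($hash << 15)) & 0xFFFFFFFF;
        return $hash;
    }

    printf "%08x\n", one_at_a_time('Perl');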

My CPAN Modules

Apache::Authen::Program: (Apache authentication using an external program)

Date::LastModified: (Extract last-modified date from zero or more files, directories, and DBI database tables)

SQL::AnchoredWildcards: (Transparently use substrings and anchored wildcards in SQL)

Test::MockDBI: (Enables testing of DBI programs by mocking up the entire DBI API, then allowing developers to create rules for the mockup's behavior)

GitHub for WebGUI

Fork of the WebGUI WGDev command line tool that adds Batchedit, a command for directly editing an asset or URL from a shell script / batch file rather than bringing up an editor program. Batchedit will eventually be integrated into the main WGDev repository.

DBIx::TableHash

Patch to turn off warnings when used under -w.

Perl Power Tools

ls(1) Implementation: Complete V7 Unix ls(1), with some GNU options.

fancy grep(1) patch: I added the -E (extract data) option.

perl5-porters

My contributions to the Perl5 development list.

Humor

Last updated December 7, 2012.

Mark Leighton Fisher Curriculum Vitae

Purdue University Digital Library Software Developer 6/2012-present

Added the creation of an OAIS AIP (Archival Information Package) as part of the document publication process.

Added the creation of an OAIS SIP (Submission Information Package) as part of the document publication process.

Created PHP O-O modules for METS (Metadata Encoding and Transmission Standard) metadata creation.

Created PHP O-O modules for PREMIS (PREservation Metadata: Implementation Strategies) metadata creation.

Created PHP O-O modules for Dublin Core metadata creation.

Perl Developer, Plain Black Corporation 4/2012-present

WebGUI

Worked towards enabling UTF-8 characters in WebGUI 7.10 passwords.

Added a "--delete userId [userId ...]" option to the WGDev 'user' command.

Found and fixed a bug in WebGUI for the Department of State Alumni customizations where the non-existent WebGUI::Storage->new() was called.

Found and fixed a bug in WebGUI core (7.10 branch) so unmatched character pairs (a '(' without a corresponding ')') do not break asset search.

Added a '-d' option to the WGDev 'ls' command so that 'ls' lists either the children of the specified asset or (using '-d') the asset itself.

Fixed WebGUI Spectre so startup and shutdown are logged at the INFO level even when everything is logged at the WARN level (Log::Log4perl config trick).

Forked WGDev on GitHub (https://github.com/pbmarklf/wgdev) to add the Batchedit command, which lets you edit an asset or URL right in the shell script / batch file, like wgd Batchedit --pattern=http...yui.yahooapis.com. --string=http://ajax.googleapis.com/ ChcG6WcA6jLwSTbWFez2Qg where pattern is the pattern to match, string is the replacement string, and ChcG6WcA6jLwSTbWFez2Qg is the assetId or URL to edit.

Found and devised a workaround for a bug in the Twitter JavaScript search/list/faves/profile widget where https:// pages still contain some http:// Twitter links.

Found and fixed a MySQL deletion ordering issue with WebGUI.

Debugged various hosted WebGUI content issues.

Owner and Chief Consultant, Fisher's Creek Consulting LLC 2003-present

pmtools Perl Module Tools

Release pmtools-1.20 (Perl Module Tools), which includes fixes for pminst (display only unique package files and ignore non-existent directories in @INC and the new tool pmcheck to check that Perl is set up for Perl modules (currently checks that @INC contains only existing, readable directories).

Wrote and submitted a grant proposal to the Perl Foundation for porting pmtools (Perl Module Tools) to Perl 6.

Took over maintenance of pmtools, Tom Christiansen's Perl module tools suite – 3 enhanced releases so far. pmtools are a standard part of Fedora Linux.

pmtools-perl6 Perl Module Tools for Perl 6

Started pmtools-perl6, a port of pmtools to Perl 6. A Config.pm generator for Pugs has already been released

DeepData, Inc. (2003-2004)

Sr. Perl Designer/Implementer, PostgreSQL Optimization Expert, and Automated Testing Expert for DeepData, a Yellow Pages strategic information solutions provider.

Wrote Test::MockDBI, a Perl module for testing database program by mocking-up the Perl database interface (DBI) API, then allowing developers to write rules for the mockup DBI's behavior.

Modified a Yellow Pages processing program to handle processing all U.S. medical Yellow Pages headings (3.7M records).

Enhanced DeepData's address standardization system by assigning ZIP codes to all possible addresses, adding latitude/longitude to all possible addresses, and standardizing all address components.

Java

Learned Java - Sun Certified Java Programmer exam passed on 2005-05-12 with 70% (minimum passing score was 52%). Writing a YAML parser and a weblogging framework in Java as an aid to learning Java thoroughly.

TUTOS

Installed, configured, and fixed TUTOS (The Ultimate Team Organization Software) for project management. Fixes were for proper Gantt chart display when a task starts and ends on the same day, and for proper milestone task entry.

Nagios

Installed and configured Nagios Open Source network monitoring software on DeepData and Thyatia.net servers.

Web Scraper

Created Boston Building Permits web scraper for PropertyShark.

Writing

Published weblog essays picked up by Slashdot, Scripting News, and other news sites.

Systems Engineer II, Regenstrief Institute Inc. 8/2005-3/2012

RELMA (Regenstrief LOINC Mapping Assistant) programming team -- VB.NET, C#, VB6, SQL, Microsoft Access/Jet/VBA, Perl, Java

Ported RELMA from VB6 to VB.NET 2.0. Later ported RELMA from VB.NET 2.0 to VB.NET 4.0

Added foreign-language search to RELMA's SQL database-based search. This included writing a Perl self-extracting Windows GUI program for loading the foreign language indexes.

Wrote a program that generates databases of LOINC name Parts for the LOINC translators to load with their translations.

Switched RELMA's LOINC search and indexing to Lucene.NET. This required developing standalone programs to index LOINCs, LOINC Part names, and LOINC answer lists. Several languages were indexed, including English, French, Greek, and Chinese. This also required some reading of the Lucene Java and Lucene.NET C# code to answer detailed questions about Lucene's internal workings (including whether Lucene generates parse trees -- which it does not).

Developed IgnoreUnknownWordsQueryParser, a Lucene query parser that ignores unknown words (inherits from Lucene.Net.QueryParsers.QueryParser).

Under the direction of Dr. Clement J. McDonald, developed software (with both batch and interactive modes) to automatically find mappings from local terms to LOINCs (the RELMA AutoMapper) using a longest-name-first matching algorithm (which required developing an n-gram class for .NET). This AutoMapper standardizes timings and part names, finds specimens/systems, methods, and other parts of LOINC names in the HL7 OBR and OBX descriptions, guesses specimens/systems if none are present in the HL7 OBR and OBX descriptions, then searches the Lucene index of LOINCs for matching LOINCs. If units are available for the local term, the search is restricted to LOINCs that match the property corresponding to the units. Results are scored such that the best match in the fewest words wins. If the AutoMapper is run against laboratory local terms, scoring ties are settled by the LOINC that is more commonly used as a lab test.

Some experimentation was done using a Lucene index that contained only the unique words in the default Lucene field. The results were that using a non-unique-words index (i.e. allowing phrase searching) proved more effective in the current LOINC Lucene indexing scheme.

Added LOINC search restrictions for laboratory LOINCs, the Top 2000+ Lab Observations, and the Top 300+ Laboratory Order Codes. These changes were made in the Lucene LOINC index and the RELMA program.

Added dynamic (on-demand) loading of tree data to the Regenstrief .NET tree/grid control. This included separate loading/showing/hiding of LOINC leaf nodes.

Added a search mode to the .NET tree data code such that only the matching nodes and their ancestors are displayed.

Added a configuration dialog to the Regenstrief .NET tree/grid control that allowed for hidden columns and always-visible columns as well controlling column placement and width.

Added a custom export dialog to the Regenstrief .NET tree/grid control that allowed for the export (with or without header rows) of all rows or only the selected rows. Custom export actions include copying to the Clipboard, printing, exporting as a CSV file, exporting to Excel, and sending as an email (tested with Outlook, Outlook Express, and Thunderbird).

Added multi-column prioritized reversible sorting to the Regenstrief .NET tree/grid control.

Created HTML output formats for LOINCs, LOINC name parts, and LOINC answer lists. This required handling multi-megabyte HTML files that allow persistent configuring of whether each section of the HTML should be shown or hidden using JavaScript manipulation of CSS class styles. These HTML files could be loaded from the Internet, or built locally using an LRU file cache.

Updated the HTML formats for faster apparent rendering of tables by incrementally rendering the tables after applying the CSS style 'table-layout: fixed'.

Added a simple, Google-like search screen to RELMA.

Enhanced RELMA's HL7 importing mechanism to include importing code systems and importing OBR and OBX NTE segments (message note segments).

Added a command-line interface to the program for extracting the releasable LOINC database from the master LOINC database.

Added deletion of unreleasable reference information (descriptions etc.) to the program for extracting the releasable LOINC database from the master LOINC database.

Added verification of whether the specified releasable tables and fields actually exist in the master LOINC database to the program for extracting the releasable LOINC database from the master LOINC database.

Wrote and enhanced various .NET/VB6/Perl/SQL routines for dealing with hierarchical Access database tables in both linked parent-child and path enumeration ('Dewey Decimal') formats.

Maintained external and internal RELMA Citrix applications.

Wrote and upgraded Microsoft Access reports.

Developed internal VB.NET-compatible unit testing framework in VB6.

Started using the Microsoft Moles isolation/mocking framework for unit testing with MSTest.

Communicable Diseases reporting for the Indiana Network for Patient Care (INPC)

Communicable diseases county and state reporting for the INPC (Indiana Network For Patient Care) - Access & SQL & VBA.

Shared Pathology Information Network (SPIN)

Created first port of the SPIN client from VB6 to .NET 1.0.

XSLT transform reworked for SPIN client report -- tested using Java & JUnit. Note that i2B2 now handles most of what the SPIN client initially handled for Regenstrief.

Java Class

Implemented part of an X12-270 reader in Java.

Thomson, Inc. (1991-2003)

I was a member of the IT group Electronic Design Automation, which provided computer software and hardware assistance to the engineering community. Unless otherwise mentioned, all projects below were my sole responsibility.

Corporate Technical Memory

The Thomson Corporate Technical Memory (CTM) was a knowledge management application before the phrase "knowledge management" was even coined (1994 — which made it hard to explain initially). CTM enabled users to write information down into a document once (rather than repeatedly explain the information to each questioner), which then enabled other users to find the information from their browser by full-text document searches, searches on document title or author, or by browsing the main subject area of a document, browsing by author name, or browsing by document title. CTM V2.0's code consisted of a main Perl module along with associated CGI programs and an external search engine (V1.0 used a Perl4 daemon program and simple Perl4 CGI programs with an external search engine). Document metadata was stored in an Oracle database while the documents were stored on the webserver's filesystem. I published a paper on CTM, "The TCE Corporate Technical Memory: groupware on the cheap", in the International Journal of Human-Computer Studies, Volume 46 , Issue 6 (June 1997) (Special issue: innovative applications of the World Wide Web). Writing CTM saved Thomson $180,000, as the only alternative was a Lotus Notes-based system.

Status Reporter

Status Reporter V6.0 enabled automatic assembling of text-based status reports by allowing users to enter status for each task of each project they had worked on recently. The status report assembler could then be invoked over a specified time period for that group of people, producing one unified status report for the whole group. Using Status Reporter saved around half a day of work for each group, as manual assembly took around 4 person-hours per group at Thomson. V6.0 was web-based, storing the reports in an Oracle database (previous versions were in 16-bit Microsoft Word's WordBasic with reports in Word .DOC files). Status Reporter's code consisted of a main Perl module along with associated CGI programs and a smattering of helper Perl utility programs. Status Reporter saved Thomson over 2000 person-hours by automating status report assembly.

Thomson Application Status (UpApp)

Thomson Application Status (UpApp) provided a concise view of the up/down/in-trouble status of major Thomson applications/systems for end users in their browsers. UpApp was co-developed with Mark Bender, where I developed the main Perl module and mod_perl CGI-like programs while Mark Bender developed the HTML user interface. UpApp status information was held in a MySQL database to provide easy multi-admin updating. UpApp also provided both XML-RPC and SOAP Web Service interfaces, including an all-application status — red if any applications were "red" (crashed), yellow if no applications were red but any applications were "yellow" (in-trouble), or otherwise "green".

Overview and Overdrive

Overview was a search engine I set up that covered the ten most-used webservers by Thomson Americas when there was no overall Thomson Americas search engine. Overview used the Verity SEARCH'97 search engine software. Overdrive was started as the successor to Overview, which would use an Open Source search engine on all 30+ Thomson Americas webservers. Overdrive was in beta-test using ht://Dig's December 2001 beta of V3.2.0 with very nice results — often the first search result was the term searched-for when the search result page had Dublin Core metadata (this was on a 200MHz RedHat V6.2 Linux box). I had started setting up ASPSeek V1.2.10 to compare against ht://Dig.

Nagios

Nagios is an Open Source network monitoring tool using plugins for monitoring hosts and services, where the plugins cover monitoring everything from basic PING functionality to Oracle+MySQL databases and FlexLM license servers. I set up Nagios to monitor the 27 services on 19 servers that our group was either responsible for or were affected by. Nagios provided the back-end functionality for tracking applications/servers/hosts whose status would be displayed to end-users by UpApp. Setting up Nagios like saved Thomson more than $20,000 compared to commercial alternatives.

EDA/Weblog

EDA/Weblog was an internal weblog for the engineering and IT communities at Thomson, powered by Slashcode (Slashcode powers Slashdot). Articles/stories were stored in a MySQL database. EDA/Weblog featured announcements of internal interest (new systems in production, policy changes, etc.) along with links to particularly interesting technical news articles. EDA/Weblog was Thomson's first weblog (March 2002).

EDA (Electronic Design Automation) Home Page

I started and maintained Thomson's first web server in 1993, which provided a home for EDA services to the engineering and IT communities. Except as noted, all web pages and applications mentioned were on that server. When I left, the server was a Sun SPARC Ultra 30 running Solaris 2.6. The webserver programs were Stronghold V2.4 (an early Apache V1.3.x with a licensed SSL implementation) and Apache V1.3.23 with mod_perl V1.26. (CERN httpd was used before we switched to Apache.) The server had ~3000 pages on it when I left Thomson.

ISO 9001 Web

Thomson Americas TV/Audio/Video Product Development received its ISO 9001 quality certification around 1995. I maintained the ISO 9001 Web as well as the Perl web gateway to the Thomson ISO 9001-compliant document management system, DMS9000. The DMS9000 web gateway provided URL-based document retrieval (with URLs almost identical to the native DMS9000 document paths) along with simple and advanced DMS9000 document searches and a CGI program that converted native DMS9000 document paths to their equivalent URLs.

PACO: Password-Checking Objects

PACO (Password-Checking Objects) was a Perl system that helped ensure the use of good passwords by running the password through Crack V5.0 by Alec Muffett. PACO provided a web form interface, although the intention was to provide other interfaces suitable for programs (like XML-RPC, SOAP, etc.) so that all Thomson programs could use PACO rather than their own password strength-checking code. PACO remembered a hash of the password so it could tell you whether you were reusing an old password without compromising the actual password.

CADWeb

CADWeb was a separate webserver (Perl CGIs with Apache on HP-UX V10.20) that drove the additional CAD utility programs written by Thomson employees for use with the Visula and P870 PCB CAD packages. These CAD utility programs originally ran on a text GUI under VAX/VMS, so this was a non-trivial migration task. My major accomplishment for CADWeb was writing the authentication code, which accepted a user's Oracle username and password, producing an HTTP cookie used for later re-authentication using methods based on HTTP Digest Authentication. This authentication code is similar in functionality to the Apache module mod_cookie, with the actual authentication deferred to a program specified in a configuration file — this code was later used for domain (SMB/NT/LanMAN/etc.) authentication by switching authentication programs. CADWeb users would upload CAD files to the server, request a run of a program, then download the resulting output files from the requested program.

In-Circuit Test

In-circuit testing is (basically) the process of testing that components are correctly placed and soldered into a printed-circuit board (PCB). Thomson's program for In-Circuit Test (ICT) testpoint generation/selection was created in 1987, which by the time I received it has grown to ~20 000 lines of C. I made numerous improvements to it, including most notably the ability to test topside components from the bottom side of the board and the addition of boundary-scan testing outputs. During the boundary-scan improvement I added a hash table implementation based on Perl's hash table implementation — before then all name searches were done as sequential scans of linked lists as in the original 1987 ICT code. I also used the Perl-Compatible Regular Expressions library. ICT ran as part of the CADWeb suite of tools. ICT migrated from HP-UX V9.05 to HP-UX V10.20 while I was maintaining it.

Incircuit Testpoint Analysis and Placement

Incircuit Testpoint Analysis and Placement (ITAP) was slated to replace ICT. Initial design and some code was done by Rick Lyndon, then I was brought on-board to speed development and create a detailed project management plan (I ended up doing around 30% of the coding). ITAP ended up as ~20 000 lines of Microsoft Visual C++, running on the user's Windows PC with vastly greater capabilities including avoiding testpoint placement under component bodies and built-in HP TestJet processing as well as much more table-driven operation so that ICT parameters could be tweaked without recompiling ITAP. ITAP included object-oriented wrappers around the Perl-Compatible Regular Expressions library and the Perl-like hash table used in ICT.

Full Circle Compare

Full Circle Compare (FCC) was a several-thousand-line C++ program that verified that identical components were present in the Bill of Materials (BOM), schematic CAD file, and PCB CAD file. FCC was obsoleted by a more-integrated design process that drove the BOM and the PCB CAD file from the schematic CAD file. I took over FCC from the original outside developer (David Sarnoff Research Center) in 1991, got it into production shape, and maintained it until was obsolete around 1999-2000. FCC went into production on VAX/VMS, then migrated to HP-UX V9.05, then finally to HP-UX V10.20.

Software Engineering and Project Management

I was the group resource on software engineering and project management. I was the project manager on three projects:

    • ITAP (when I joined the project)

    • ICT: Adding the testing of topside components from the bottom side of the PCB

    • ICT: Boundary-scan testing outputs

which were all on-time and on-budget as far as they went (ITAP development was placed on hiatus before it was completed — we delivered two production versions before hiatus).

Perl (See also: Links -> Perl)

I started using Perl with V4.0.19 in 1992, which has greatly increased my productivity due to both the built-in features of Perl (so much of programming is parsing text which Perl excels at) and the advent of the Comprehensive Perl Archive Network (CPAN) which provides Open Source Perl modules for about any task you can think of. The Links -> Perl section below details my contributions to Perl.

World Wide Web (See also: Links -> World Wide Web)

I started using Web technology in 1993, two years before Thomson had an Internet connection (see Links -> Humor for my comments on FTP-by-mail). (For a long time I kept around my Sept. 21, 1993 email from Tim Berners-Lee about how commercial enterprises could also use CERN's httpd webserver without paying licensing fees.) I was a member of several Internet Engineering Task Force working groups, leading to enough participation in the WebDAV group to merit an acknowledge in the initial WebDAV RFC. Much of my work at Thomson leveraged Web technology, including theCorporate Technical Memory, Status Reporter, UpApp, CADWeb, ISO 9001 Web, Overview and Overdrive, PACO, EDA Home Page, EDA/Weblog, and Nagios. The Links -> World Wide Web section shows some of my involvement with Web technology.

Technical Library Applications

I installed and maintained the library automation software, EOLAS (initially named DataTrek). I set up and maintained WISE install programs for the Technical Library Applications, which were applications supplied by the Technical Library. Among these applications were the EOLAS online catalog search, Books in Print, Microsoft Bookshelf, and the CAPSXpert component library.

IBM Lotus Team Workplace (QuickPlace)

As part of the Thomson Global Knowledge Management initiative, I constructed QuickPlaces for ICT and ITAP. These QuickPlaces contained discussion areas, design documents, project management plans, and status reports for ICT and ITAP.

SlashWiki - a Wiki for Slashcode

As part of EDA/Weblog I set up SlashWiki, a WikiWiki for Slashcode-driven servers that adds the option of requiring authentication before certain pages can be edited. WikiWikis, Status Reporter, and Slashcode all use simplified text markup schemes to enable users to create HTML content without learning HTML.

Cypherpunks

To help track security issues I have lurked on (and occasionally contributed to) the cypherpunks mailing list. Cypherpunks explores the social implications of security/privacy/cryptography technology, occasionally delving deep into technical details when appropriate. It is unfortunately also home to a lot of spam and some amount of knee-jerk anti-authoritarianism. It is my opinion that some amount of crypto anarchy is coming, so it is better to be prepared.

PGP and GNU Privacy Guard

Before Thomson switched to Entrust, I was the expert on PGP (Pretty Good Privacy), an encryption/decryption tool. I am now using GNU Privacy Guard, a mostly-compatible free software replacement for PGP. (Entrust is a better solution for large corporations due to Entrust's centralized automatic key handling.)

HCL-eXceed X Window System server

When Windows 3.0 came out, I spent months evaluating Windows-based X Window System servers, settling on the HCL-eXceed X Window System server. HCL-eXceed exemplifies what I like in commercial, closed-source software:

    1. It just works; and

    2. If it doesn't work, the defect is either minor or fixed quickly.

I don't have a problem working with closed-source software, but I don't like seeing many defects in it either. Using HCL-eXceed probably saved Thomson over $100,000 by allowing access to Unix workstation resources from Windows PCs.

Wintek, Inc. (1984-1991)

I worked as part of a six-member software development team, mainly on electronic CAD products for MS-DOS.

smARTWORK Editor

smARTWORK is a 50-mil rule PCB CAD package for which I became the primary developer of the Editor program. smARTWORK was the first PCB CAD package available for DOS.

smARTWORK simulated annealing Automatic Component Placement

smARTWORK included an Automatic Component Placement program that used a simulated annealing algorithm, developed when the main information on simulated annealing was one book and a handful of research papers. I took over as developer late in the initial code-writing stage, bringing it to production status. This was probably the first commercial use of the simulated-annealing algorithm.
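
A minimal sketch of the simulated-annealing loop itself (illustrative only; the cost function below is a stand-in for the real placement objective, not the smARTWORK code):

    /* Minimal sketch of a simulated-annealing placement loop.  Illustrative
     * only: the "cost" here is a stand-in (sum of Manhattan distances between
     * consecutively numbered components), not the real smARTWORK objective. */
    #include <math.h>
    #include <stdlib.h>

    #define NCOMP 40

    static int xpos[NCOMP], ypos[NCOMP];      /* current placement grid cells */

    static double cost(void)                  /* placeholder wire-length estimate */
    {
        double total = 0.0;
        for (int i = 1; i < NCOMP; i++)
            total += abs(xpos[i] - xpos[i - 1]) + abs(ypos[i] - ypos[i - 1]);
        return total;
    }

    static void swap_components(int a, int b)
    {
        int tx = xpos[a], ty = ypos[a];
        xpos[a] = xpos[b]; ypos[a] = ypos[b];
        xpos[b] = tx;      ypos[b] = ty;
    }

    void anneal(double t, double t_min, double alpha, int moves_per_temp)
    {
        double current = cost();

        while (t > t_min) {
            for (int m = 0; m < moves_per_temp; m++) {
                int a = rand() % NCOMP;
                int b = rand() % NCOMP;

                swap_components(a, b);             /* propose a move         */
                double trial = cost();
                double delta = trial - current;

                if (delta <= 0.0 ||
                    (double)rand() / RAND_MAX < exp(-delta / t))
                    current = trial;               /* accept (maybe uphill)  */
                else
                    swap_components(a, b);         /* reject: undo the swap  */
            }
            t *= alpha;                            /* cool; e.g. alpha = 0.95 */
        }
    }

Called as, for example, anneal(100.0, 0.1, 0.95, 1000) after seeding the component positions; accepting some uphill moves early on is what lets the placement escape local minima.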

smARTWORK Video Drivers

I took over maintenance and enhancement of smARTWORK's video drivers, which included the development of smARTWORK's first 800x600 video driver.

smARTWORK Copy Protection

I took over maintenance of smARTWORK's copy-protection scheme, which involved manipulation of the DOS floppy filesystem.

Installation Program Generator

I wrote what was probably one of the first installation-program generators, using it to generate the batch files needed to install smARTWORK and HiWIRE from either 5.25" or 3.5" floppies. Rick Strehlow's extremely clearly written installation batch files for smARTWORK inspired me to write the generator.

HiWIRE II Main Menu Program

HiWIRE II uses a text GUI for its main menu. I wrote both the text GUI menu program (my first sizable C++ program) and the program overlay manager (my last sizable assembly program).

HiWIRE II Editor Memory Manager

The HiWIRE II Editor uses a memory manager that allows objects to be stored in conventional memory, in extended/expanded memory, or on disk.
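
Since the original code is not reproduced here, the following is only a rough C sketch of the general handle-based pattern such a memory manager uses (callers hold handles; the manager keeps each object resident or parks it in a swap file). The real manager's extended/expanded-memory tiers, allocation, and error handling are omitted:

    /* Hypothetical sketch of a handle-based object store in the spirit of a
     * DOS-era editor memory manager: callers hold small handles, and the
     * manager keeps each object either resident in memory or parked in a
     * swap file.                                                          */
    #include <stdio.h>
    #include <stdlib.h>

    #define MAX_OBJS 1024

    struct slot {
        void  *mem;        /* non-NULL when the object is resident     */
        long   file_off;   /* offset in the swap file when swapped out */
        size_t size;
    };

    static struct slot table[MAX_OBJS];
    static FILE *swapfp;

    int mm_init(const char *swap_path)
    {
        swapfp = fopen(swap_path, "w+b");
        return swapfp ? 0 : -1;
    }

    /* Evict one object to the swap file to free conventional memory. */
    void mm_swap_out(int handle)
    {
        struct slot *s = &table[handle];
        if (s->mem == NULL)
            return;                          /* already on disk */
        fseek(swapfp, 0L, SEEK_END);
        s->file_off = ftell(swapfp);
        fwrite(s->mem, 1, s->size, swapfp);
        free(s->mem);
        s->mem = NULL;
    }

    /* Return a usable pointer, reloading from disk if necessary. */
    void *mm_lock(int handle)
    {
        struct slot *s = &table[handle];
        if (s->mem == NULL) {
            s->mem = malloc(s->size);
            fseek(swapfp, s->file_off, SEEK_SET);
            fread(s->mem, 1, s->size, swapfp);
        }
        return s->mem;
    }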

smARTWORK and HiWIRE II Netlist Converter

On my own initiative I wrote a netlist converter for smARTWORK and HiWIRE II for inputting foreign netlist formats (including EDIF).

Serial Port Interface for Remote Debugger

I wrote the serial port interface for remote debugging of microcontroller boards, along with the user documentation for the remote debugger.

MDBS (Micro Data Base Systems, 1981-1984)

I worked as part of a team of 20+ developers, although my projects (except as noted) were my sole responsibility. MDBS IV is a network-model (CODASYL) database management system, from before relational databases.

MDBS IV Page-Level Memory Manager

The MDBS IV page-level memory manager used a best-fit algorithm from Knuth's The Art of Computer Programming.
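
A small illustrative C sketch of best-fit selection over a free list (the block layout and names are hypothetical, not the MDBS IV code):

    /* Illustrative sketch of best-fit allocation over a singly linked free
     * list, in the style of the algorithm Knuth describes.               */
    #include <stddef.h>

    struct freeblk {
        struct freeblk *next;
        size_t          size;          /* usable bytes in this block */
    };

    static struct freeblk *freelist;

    /* Find the smallest free block that still satisfies the request. */
    void *best_fit_alloc(size_t want)
    {
        struct freeblk *best = NULL, *best_prev = NULL;
        struct freeblk *prev = NULL, *p;

        for (p = freelist; p != NULL; prev = p, p = p->next) {
            if (p->size >= want && (best == NULL || p->size < best->size)) {
                best = p;
                best_prev = prev;
            }
        }
        if (best == NULL)
            return NULL;               /* no block big enough */

        /* Unlink the chosen block (a real allocator would split off any
         * unused remainder back onto the free list).                     */
        if (best_prev)
            best_prev->next = best->next;
        else
            freelist = best->next;

        return (void *)(best + 1);     /* payload follows the header */
    }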

MDBS IV Porting (CP/M, WICAT, Unix)

For 18 months I was responsible for part of the MDBS IV C code porting effort among CP/M, VAX Unix, and WICAT (a Unix look-alike for workstations).

Regular Expression Search

On my own initiative I wrote both globbing and regular-expression matching code in C. The regular expression code is still part of MDBS's GURU expert-system-enhanced database product.
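
To give a flavor of the globbing side (illustrative only, not the original MDBS code), a minimal recursive matcher for '*' and '?':

    /* Minimal recursive glob matcher: '?' matches any single character,
     * '*' matches any run of characters, everything else is literal.   */
    int glob_match(const char *pat, const char *str)
    {
        if (*pat == '\0')
            return *str == '\0';

        if (*pat == '*') {
            /* '*' can match the empty string or swallow one more character. */
            return glob_match(pat + 1, str) ||
                   (*str != '\0' && glob_match(pat, str + 1));
        }

        if (*str != '\0' && (*pat == '?' || *pat == *str))
            return glob_match(pat + 1, str + 1);

        return 0;
    }

    /* e.g. glob_match("*.c", "query.c") returns 1;
     *      glob_match("?.c", "ab.c")    returns 0. */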

Hash Tables (Associative Arrays)

I wrote hash table (associative array) code in C.

Fixed Hash Table Generator

I wrote a fixed hash table generator program, similar to GNU's gperf.

MDBS IV User Query External Sort

The MDBS IV user query program allows users to sort their query output arbitrarily (like ORDER BY in SQL, but for a CODASYL database) using an external sort (sorting chunks into multiple files, then merging those files), which Tim Stockman and I co-wrote.
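
A compact illustration of the technique (not the MDBS IV code; this sketch sorts raw 32-bit integers, whereas the real sort handled query records): sort fixed-size chunks into temporary run files, then merge the runs by repeatedly emitting the smallest current head:

    /* Illustrative two-phase external sort for integers. */
    #include <stdio.h>
    #include <stdlib.h>

    #define CHUNK    4096     /* ints sorted in memory at a time */
    #define MAX_RUNS 64

    static int cmp_int(const void *a, const void *b)
    {
        int x = *(const int *)a, y = *(const int *)b;
        return (x > y) - (x < y);
    }

    /* Phase 1: read chunks from `in`, qsort each, write each to a temp file. */
    static int make_runs(FILE *in, FILE *runs[])
    {
        static int buf[CHUNK];
        int nruns = 0;
        size_t n;

        while ((n = fread(buf, sizeof(int), CHUNK, in)) > 0 && nruns < MAX_RUNS) {
            qsort(buf, n, sizeof(int), cmp_int);
            runs[nruns] = tmpfile();
            fwrite(buf, sizeof(int), n, runs[nruns]);
            rewind(runs[nruns]);
            nruns++;
        }
        return nruns;
    }

    /* Phase 2: k-way merge by scanning the current head of every run. */
    static void merge_runs(FILE *runs[], int nruns, FILE *out)
    {
        int head[MAX_RUNS], alive[MAX_RUNS];

        for (int i = 0; i < nruns; i++)
            alive[i] = (fread(&head[i], sizeof(int), 1, runs[i]) == 1);

        for (;;) {
            int best = -1;
            for (int i = 0; i < nruns; i++)
                if (alive[i] && (best < 0 || head[i] < head[best]))
                    best = i;
            if (best < 0)
                break;                               /* all runs exhausted */
            fwrite(&head[best], sizeof(int), 1, out);
            alive[best] = (fread(&head[best], sizeof(int), 1, runs[best]) == 1);
        }
    }

Only one chunk needs to fit in memory at a time, which is what made arbitrary ORDER BY-style sorting workable on the small machines of the day.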

Unix Device Driver for Named Pipes

To provide client-server IPC on early Unix boxes, I wrote a device driver that supplied a named-pipe facility before named pipes were a standard part of Unix.
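
For readers unfamiliar with the pattern, here is a sketch of the client-server use a named pipe enables, written against the modern POSIX mkfifo() interface rather than the original custom driver (the pipe name is hypothetical):

    /* A server reads requests from a well-known pipe name; any client can
     * open that name and write to it.                                    */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        const char *pipe_name = "/tmp/dbms_requests";   /* hypothetical name */
        char buf[256];
        ssize_t n;

        mkfifo(pipe_name, 0666);                        /* create if missing */

        int fd = open(pipe_name, O_RDONLY);             /* blocks for a writer */
        if (fd < 0) {
            perror("open");
            return 1;
        }

        while ((n = read(fd, buf, sizeof buf - 1)) > 0) {
            buf[n] = '\0';
            printf("request: %s\n", buf);               /* handle one request */
        }

        close(fd);
        unlink(pipe_name);
        return 0;
    }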

CP/M Filesystem Emulator for Unix Floppies

Tim Stockman and I wrote a CP/M filesystem emulator for Unix floppies to enable easy copying of files between CP/M and Unix systems.

C Compiler Selection

(A "how times have changed" entry.) I was part of the team that finally selected the BDS C compiler (by Leor Zolman) for CP/M software development in C. We needed a compiler whose back end could be rewritten to generate a pseudo-assembly that assembled down to either 8080, 8086, or Z-80 instructions.

Unix Systems Administration

Part of my duties included system administration of their PDP-11, VAX 11/780, and WICAT computer systems.

Northrop Defense Systems Division (1979-1981)

My duties included radar countermeasures (ECM) software. I held a SECRET clearance. I can't say more than that, except that the project's name was unclassified while my participation in the project was classified.

Purdue (as B.S.E.E. student 1975-1979)

IEEE Dictionary Computerization (1978-1979)

Professor Benjamin Leon hired me as a research assistant to work on the first attempt to computerize the IEEE Dictionary of Electrical and Electronics Terms.

Unix Systems Administration (1978-1979)

I was one of two Unix system administrators (V6 and V7 PDP-11 Unix) for what was still called the ARPA Lab, although by that point the focus of the Lab was (as I remember) on computer vision (this was in the pre-TCP/IP days).

8080 Cluster Schematic (1978)

I did an independent study project that designed an 8080 computer board to be used as part of an 8080 cluster computer.

APL Editor (1978)

I did an independent study project to create a standard APL editor.

B.S.E.E.

I graduated with a Bachelor of Science in Electrical Engineering, with a concentration in computer engineering.

John Adams High School, South Bend, IN (1971-1975)

National Merit Scholarship

I was one of the National Merit Scholarship winners for 1975, thanks to my SAT scores of 730 verbal and 700 math (1430 total).

Graduated Cum Laude

I graduated Cum Laude (top 10% of my class).

Purdue President's Freshman Engineering Scholarship

I received a Purdue President's Freshman Engineering Scholarship for my freshman year at Purdue.

Experience

* I worked in assembly language from 1979 to 1990, using 80x86 assembly language from 1981 to 1990. The only assembly language I've used since 1990 has been in a debugger.

Other Influences

APL

Purdue's computer architecture class used APL as the design language, so I got quite familiar with APL. I even wrote an editor for APL as an independent research project for the computer architecture professor. APL opened my eyes to just how differently you can approach programming a computer. I have not used APL to any degree since Purdue, however.

FORTH

I used FORTH on a Commodore 64 in the early 1980s, before C++ came out, partly in an effort to help develop a saleable videotape library system for video rental stores with a group of friends. FORTH showed me how powerful language extensions could be, thereby preparing me for object-oriented programming with C++ and Perl. <namedrop>I even answered Brian Kernighan's question about what FORTH is during his second visit to Purdue ("a very-high-level extensible assembly for a stack-based virtual machine").</namedrop>

LISP

Although I've written less than a hundred lines or so of LISP in my life, I have followed LISP developments for many years now, because LISP is a favorite of academics due to its flexible nature (apart from the parentheses). Among other nuggets of knowledge, I can thank LISP for what I know about garbage collection.

Haskell

Haskell is my first foray into functional programming. I am just beginning to learn Haskell, but it has already turned my attention to the occasions where a no-side-effects programming style is appropriate.

Expert Systems

Around 1992-1993 I took a class in Expert Systems while at Thomson. This inspired me to get a copy of Translate Rules To C, a simple forward-chaining expert system that compiled its rules down to a C program. I was only able to use my Expert Systems knowledge indirectly, however, as part of my ICT and ITAP work at Thomson. IMHO, an Expert System is just one part of what is required for intelligence; there are lots of other components.
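
For illustration, a minimal forward-chaining loop in C, written as an interpreter rather than the rule-to-C compilation Translate Rules To C performed (the facts and rules below are made up):

    /* Keep firing rules whose antecedents are all known facts until no new
     * facts can be derived (a fixed point).                              */
    #include <stdio.h>
    #include <string.h>

    #define MAX_FACTS 32

    static const char *facts[MAX_FACTS] = { "board_has_bga", "no_topside_probes" };
    static int nfacts = 2;

    struct rule { const char *if1, *if2, *then; };

    static const struct rule rules[] = {
        { "board_has_bga", "no_topside_probes", "needs_boundary_scan" },
        { "needs_boundary_scan", NULL,          "add_jtag_header"     },
    };

    static int known(const char *f)
    {
        for (int i = 0; i < nfacts; i++)
            if (strcmp(facts[i], f) == 0)
                return 1;
        return 0;
    }

    int main(void)
    {
        int changed = 1;
        while (changed) {
            changed = 0;
            for (size_t r = 0; r < sizeof rules / sizeof rules[0]; r++) {
                if (known(rules[r].if1) &&
                    (rules[r].if2 == NULL || known(rules[r].if2)) &&
                    !known(rules[r].then) && nfacts < MAX_FACTS) {
                    facts[nfacts++] = rules[r].then;   /* fire the rule */
                    printf("derived: %s\n", rules[r].then);
                    changed = 1;
                }
            }
        }
        return 0;
    }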

Lucene and Lucy

Lucene is Apache's powerful (and popular) search engine technology. Products ranging from Netflix to RELMA use Lucene's technology. Apache Lucy is Lucene's lesser-known cousin, a "loose C" port (their words) of Lucene. I learned a lot about designing search indexes and search strategies by learning Lucene, Lucene.NET, and Lucy. Lucy is exciting to me because, since Lucy is written in C, languages like Perl, PHP, and Python will eventually be able to make use of Lucene/Lucy's power (Lucy already has a Perl binding).

CYC and OpenCYC

OpenCYC is the Open Source version of CYC, an attempt to write down common sense in a form that computers can use (my paraphrase), basically repeating the learning process every child goes through. The end result should be a usable artificial intelligence that could be plugged into other programs.

Honors and Publications

Education and Training

Advanced Perl Programming

Taught by Nathan Torkington of the Tom Christiansen Perl Consultancy.

Real-World Requirements

Taught by Steve Tockey of Construx.

Real-World Software Testing

Taught by Steve Tockey of Construx.

Information Systems Project Management

Taught by Mark A. Reed for the American Management Association International.

Information Engineering - Analysis

Taught by the staff of Thomson Consumer Electronics Information Systems.

Expert Systems

Taught by the staff of Thomson Consumer Electronics Information Systems.

Micro-based Automated Library Systems

Taught by the staff of the Indiana Cooperative Library Services Authority (INCOLSA).

Object-Oriented Concepts & Design: Advanced C++ Workshop

Taught by the staff of the Technology Exchange Company (the teacher was a colleague of Jim Coplien's whose name I can't find in my paperwork).

Links

World Wide Web

Perl

Perl6 RFC349: Perl modules should be built with a Perl make program

My RFC proposing that only Perl itself should be built with make(1); everything else should be built with a Perl make program. This appears to be implicitly accepted by the Parrot development team, at least.

CGI

Version 2.90: Documentation patch for the import_names() method transforming the parameter names into valid Perl names.

Version 2.36: Patch so that cookie and HTTP date formats now meet spec.

CORE

perlhack: External Tools Rational Purify documentation patch (perl5-porters email message)

hv.h: Documentation patch (perl5-porters email message)

hv.c: Patch to use Bob Jenkins' One-at-a-Time key hashing function (perl5-porters email message)

My CPAN Modules

Apache::Authen::Program: (Apache authentication using an external program)

Date::LastModified: (Extract last-modified date from zero or more files, directories, and DBI database tables)

SQL::AnchoredWildcards: (Transparently use substrings and anchored wildcards in SQL)

Test::MockDBI: (Testing of DBI programs by mocking up the entire DBI API, then allowing developers to create rules for the mockup's behavior)

GitHub for WebGUI

Fork of the WebGUI WGDev command-line tool that adds Batchedit, a command for directly editing an asset or URL from a shell script / batch file rather than bringing up an editor program. Batchedit will eventually be integrated into the main WGDev repository.

DBIx::TableHash

Patch to turn off warnings when used under -w.

Perl Power Tools

ls(1) Implementation: Complete V7 Unix ls(1), with some GNU options.

fancy grep(1) patch: I added the -E (extract data) option.

perl5-porters

My contributions to the Perl5 development list.

Humor

Last updated December 7, 2012.