Python scripting code began it's life circa 2004, when a guy named Steve Dunham posted it for GNUCash development. I found the code during an initial search for a fix, when I ran across a discussion by TheFinanceBuff. The code was fairly straight-forward, but unless you're comfortable with Python, it looks like gibberish. Until seeing this code, I had never considered Python. I had, however, spent countless hours coding in a variety of languages and scripting interfaces, so I saw this as a new "challenge". Frankly, after using Python for only a few hours, I developed a great appreciation for it's power and simplicity. It presented a solid framework within which we could get some work done, without worrying about pretty or user-friendly interfaces.
My first step was to
install the ActivePython interpreter, download the initial
scripts, and edit for a couple of the institutions and accounts that I
used. After a few attempts, it became obvious that the scripts did
indeed work, but that they were primitive. For example, the original
scripts required you to define your institution (site) information
inside the body of the main code and then provide a password for each
account, every time you performed a download. The code also provided no
validation of the resulting OFX files prior to sending them to Money, often resulting with Money bombing after something went squirrely (as I can attest).
The first issue wasn't really a problem, since institution data resided in a data structure at the head of the primary code module. Still, it seemed better that this information should be stored separate from the executing routines. Doing so allowed the user to edit their institution data, without worrying about accidentally damaging the working portions. TheFinanceBuff got around the second of these issues by calling the main routine from a batch file and putting his account information there. I personally didn't care for the idea of having my account information (user-names and passwords) stored inside a plain text file. The better option was to encrypt the account data, similar to the way that Money did it internally. Encryption could occur outside the Python environment (e.g., using Windows encryption), but encryption modules exist for Python, so why not use them?
Fully validating an OFX file
isn't trivial, and wasn't something that I intended to do myself.
What I felt could be done (and better than nothing), was to look
inside the data sent from the financial institution and see if it
"looked" right. By this, I simply mean that data is well-formed, has the correct OFX message structure, and doesn't contain any error conditions. During initial testing,
for example, I learned that Vanguard may send an OFX file that fails
the Money sniff test when the download interval is too short. I didn't
bother figuring that out, but it seemed like a good idea to get a
handle on what was happening and either automate a fix, or just skip the
statement and not send it to Money for import. My immediate fix for the issue was to implement a "minimum download interval" as a user-defined field in the OFX description for the site. After the code had been in use for a while, I implemented a better way to handle this specific condition, along with many others, but the minimum interval user parameter still exists.
consideration that TheFinanceBuff included in his discussion was the
topic of updating prices for stock and fund quotes (ticker symbols) in Money. For symbols that
are already linked to an active account, this occurs automatically
whenever a statement is imported. For accounts that don't get updated
automatically via an investment account, stock and fund prices would no longer update after the
built-in Money updates expired. TheFinanceBuff wrote a Python module for updating security prices via a "dummy"
statement import. I initially tested the module "as-was", and found that it worked well. Using the module required you to configure a special account in
Money to receive the updates, but it was easy to do. I modified it to work with the overall structure and logic I adopted for the scripts, and have since incorporated many updates/changes for improved operation, including much more robust condition handling, and support for both Yahoo and Google Finance.
What were the odds of running into issues right out of the starting gate? Apparently, they were pretty high. We use Discover as our primary charge card, and this just happens to be one of the institutions that didn't implement the OFX standard very well. When I first started testing the code, I found that duplicate entries were being inserted into my Money file. Not one or two duplicates, but every transaction that I had retrieved during the time interval. If I re-ran the code on that same day, there would be no more duplicates. If I waited a day and ran it again, the duplicates would come back. All of them. This, obviously, didn't cut it. Fixing Discover data transmissions, so that Money would properly match transactions, became a priority.
I decided that a mechanism was
needed for "scrubbing" OFX data files sent from an institution to correct
known issues. Based on my limited account set, none of the institutions
format their transaction identifiers the same. It therefore seemed
reasonable to include the ability to call tailor-made routines
prior to sending OFX files to Money. At that time, I was only concerned
about Discover, but felt that the "fix" shouldn't be hard-coded into
the main routine. This way, any site could have a special routine that would be called prior to transferring the data to Money.
Simple enough, except that we first needed to figure out what was
actually wrong with the Discover data. You can read more about the "fix" if you're interested, but suffice it to say that Discover Card transaction downloads have worked great since.
I started off with the notion that I would just fix a few issues that I was experiencing, and get things working. I quickly ran into the issues with Discover, and soon found myself rewriting sections of the code and also putting a "wrapper" around it in the form of a Setup routine. The result was an integrated package that doesn't require the user to edit or understand Python scripting, data structures, or syntax.
The following functionality was implemented:
Site Data: Site and stock information is now defined in a text file named sites.dat. The original versions defined this data inside of Python code, which is fine for a programmer, but not so hot for other folks. The parser that reads this file is much more forgiving of entry format such as punctuation, capitalization, etc.
Setup: There is now a Setup.py routine for configuring account information. This was necessary to implement encryption of locally-stored account information (including user-names and passwords). If I had remembered how bad it is to write a command-line menu structure... I probably wouldn't have done it :)
Security: Passwords and user-names can be encrypted. The encryption that I settled on is standard DES, which uses a 64 bit password key (for local storage). This isn't a crack-proof algorithm, but unless you're running money for "da man", it's more than satisfactory for keeping even a techno-savvy thief from getting your goodies. As for the security of the code itself, the full source code is "open", meaning that anyone (yes... you) can look at the inner workings and see that nothing spooky is going on. The code makes a fully secure (SSL) connection directly with each institution and transfers data over the encrypted link (at the full strength provided by the bank server). There are no other connections made and your information is not being (silently) sent to somewhere in western Siberia.
Scrubbing the Statements: As discussed here, I first ran into issues when downloading Discover Card transactions. To fix the issue, I implemented a "scrubber" routine that re-formats transaction statements before passing to Money. The interface is "open", meaning that it's easy to add a scrubber for any other institution or statement format, and numerous additional routines have been implemented since beginning the project, to address issues as they arose.
Code Structure: The code structure is similar to the original, but far from the same. While implementing changes, the code was restructured and made more modular (to my taste, of course). The PocketSense version is not compatible with any other OFX Python code, and cannot be intermixed.
Continued Enhancements: Since originally posting this work in early 2010, I have continued to fix user-reported issues and add enhancements. The revisions page shows a summary of the change history, but suffice it to say that there's been a bunch.