Web Death by Strings
Posted by Uncle Bob on 01/04/2007
Communication between web clients and servers is dominated by strings. This leads to complex and horrific problems of coupling and fragility. Where are the rules?
I am in the enviable position of working on two web systems at the same time. One is a Ruby on Rails system for tracking substitute teachers. The other is a JEE system for managing the contents of a library. The point-counterpoint of this happy coincidence has illuminated something that has tickled my subconscious for years. The world of Web programming is a world of pathological string manipulation.
Take, for instance, the library system I am working on. One of the pages in this system manages the books in the library by their ISBN, and by their copy ids. Let’s say we had 3 copies of ISBN 0131857258. The page would have a table row for the ISBN that contained a checkbox for each of the three copies. If the user checks a copy’s checkbox, that copy will be deleted from the library. Another checkbox in that row is named “Delete all”. When the user clicks that checkbox, all the other checkboxes in that row are automatically checked, and all copies of that book are eliminated.
Now, think about this from an HTML point of view. How does the server know which copies should be deleted? That’s easy: the server builds the HTML for the page, so it simply gives a special name to each checkbox. When the form is submitted, the names of the checked checkboxes are sent back to the server. So all the server has to do is to give each checkbox a name that identifies the copy it represents. We chose a syntax similar to: “delete_432”, which would be the name of the checkbox that represents the deletion of the copy whose id is 432.
Notice the string manipulation? We have encoded server side information in a string that is sent to the client, and we expect that information to come back to the server unchanged. While this makes perfect sense, any good software designer should feel a bit queasy about it. Depending on strings to encode information like this feels just a little bit reckless. It’s manageable, but it’s icky.
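To make the round trip concrete, here is a minimal Ruby sketch of the naming scheme. It is for illustration only: the library system is a JEE application, and the helper names `checkbox_name` and `copy_ids_to_delete` are assumptions, not the project's code.

```ruby
# Hypothetical helpers; the names and the exact pattern are assumptions.

# Build the checkbox name the server writes into the page for one copy.
def checkbox_name(copy_id)
  "delete_#{copy_id}"
end

# Recover the copy ids from the names of the submitted form fields.
# Names that do not match "delete_<digits>" exactly are ignored.
def copy_ids_to_delete(param_names)
  param_names.map do |name|
    match = /\Adelete_(\d+)\z/.match(name)
    match && match[1].to_i
  end.compact
end

puts checkbox_name(432)
p copy_ids_to_delete(["delete_432", "delete_433", "submit"])
```

Any change to the name format has to be matched by a change to the pattern in the parser, and nothing but a test will tell you when the two drift apart.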
Today that ickiness got a lot worse for me. Dean Wampler is working with me on the library project. He was working on the JavaScript to make the “delete all” checkbox work. Now copy ids are globally unique. No two copies, regardless of ISBN, share the same copy id. So when the ‘delete_nnn’ comes back to the server, the server does not need to know which ISBN the book belongs to. It just happily deletes copy ‘nnn’. However, Dean needed to get his client-side JavaScript to set only those checkboxes that correspond to the ISBN of the ‘delete all’ checkbox. The client does not know which copies correspond to which ISBNs. To solve this problem he changed the format of the checkbox name to ‘delete_ssss_nnnn’ where ssss is the ISBN, and nnnn is the copy id. This allowed him to write the JavaScript to look for all the delete checkboxes that corresponded to the appropriate ISBN.
Of course when he made that change, he broke my server code which was looking for ‘delete_nnnn’. Fortunately I had unit tests that detected the problem instantly. (I truly pity those poor programmers whose only means to stumble across errors like this is to deploy the system to test and work through the pages manually!) This would have been easy for me to repair on the server side, and I was tempted to do so, simply in the name of efficiency, but my conscience wouldn’t let me.
Why should a client-side JavaScript issue have any impact on the server code? Answer: It shouldn’t! This is software design 101. Don’t couple different domains!
So I talked it over with Dean and we quickly realized that he could change the JavaScript to use the ‘id’ attribute of the checkbox tag. The server would construct the page with the ids set correctly, and the checkboxes would retain their normal name of ‘delete_nnn’.
There is a general rule here somewhere. It’s something like: use names to communicate with the server, and use ‘id’ attributes to communicate with the client. Or, rather, don’t break server code to make client-side JavaScript work.
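A small Ruby sketch of that split, under stated assumptions: the helper `copy_checkbox` and the id format `isbn_<isbn>_copy_<id>` are hypothetical and chosen only to show that the name and the id can carry different information.

```ruby
require "cgi"

# Hypothetical helper: render one "delete" checkbox for a copy of a book.
# The name stays "delete_<copy id>" for the server; the id attribute
# carries the ISBN so that client-side script can group the checkboxes.
def copy_checkbox(isbn, copy_id)
  name = "delete_#{copy_id}"
  dom_id = "isbn_#{isbn}_copy_#{copy_id}"
  %(<input type="checkbox" name="#{CGI.escapeHTML(name)}" id="#{CGI.escapeHTML(dom_id)}">)
end

puts copy_checkbox("0131857258", 432)
```

With markup like this, the "delete all" script can look for every checkbox whose id starts with the row's ISBN prefix, and the server never sees the id at all because only names and values are submitted.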
I’ve had similar string issues with the ‘Substitute’ system I’ve been working on in Rails. In this case I am using Ajax to allow users to type the names of substitute teachers and quickly pop up a list of possible teachers. So if you type “B” into the “Substitute” field, you quickly see a menu of all substitutes whose names begin with “B”. As you type more letters the list gets smaller. You can pick a name from the list when it’s convenient for you.
This works great, but has one gaping flaw. The server is looking these names up using SQL statements and is then populating the list in a convenient format. So, for example, it will put “Bob Martin” into the popup list, constructing the name from the first_name and last_name fields of the Substitute record. It is this constructed name that comes back to the server in the form when the submit button is pressed. But the constructed name is not the key of the Substitute record! So how does the server know which substitute has been selected? It could break apart the string “Bob Martin” into “Bob” and “Martin” and then do a query against first_name and last_name, but I hope you share my disgust with that solution! Not only is it inefficient, there are just loads of opportunities for error and fragility. (Just think of honorifics, suffixes, prefixes, middle names, etc.)
My solution, which I dislike almost as much, is to encode the id of the substitute along with the name. So the string that actually pops up in the menu is “(384) Bob Martin”. OK, OK, I know this is bad, and I intend to fix it once I learn how to get the JavaScript that pops up the menu to load a hidden field. But I don’t know how to do that yet, and I am aghast that I need to learn it! It seems to me that being able to couple a pretty name to an unambiguous ID is such a common thing to do that I should not have to resort to the deep mysticism of JavaScript to achieve it.
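The workaround amounts to something like the following Ruby sketch. The helper names `menu_label` and `substitute_id_from` are assumptions made for illustration, not code from the actual application.

```ruby
# Hypothetical helpers for the "(384) Bob Martin" encoding.

# Build the string shown in the popup menu for one substitute.
def menu_label(id, first_name, last_name)
  "(#{id}) #{first_name} #{last_name}"
end

# Pull the record id back out of the submitted string.
# Returns nil when the string does not start with "(<digits>) ".
def substitute_id_from(label)
  match = /\A\((\d+)\)\s/.match(label)
  match ? match[1].to_i : nil
end

label = menu_label(384, "Bob", "Martin")
puts label
p substitute_id_from(label)
```

The lookup becomes a simple find by id, but the user now sees the key, and a hand-typed name without the prefix yields no id at all.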
Ah well, the web is hell. That’s all I can really say about this. Web programming is probably the worst programming environment I have ever worked in, and I’ve worked in a lot of programming environments. Not only is it flogged by commercial hype that tries to make it seem much more complicated than it is, but it’s so poorly conceived and so sloppily put together that it is, frankly, embarrassing.
Posted in Uncle Bob's Blatherings
Comments
Michael Schuerig 1 day later:
Bob, simply put, the Rails auto_complete_field just isn’t up to the task you’re trying to make it do. There is no way to associate a unique id with each of the display strings. It’s all well when this is used to choose a value, but it breaks down when the task is choosing an individual object.
There are two different approaches I can think of to get what you need.
(1) Replace your use of auto_complete_field with an observed search field and a select element. On changes to the search field, update the select element’s options.
(2) Extend auto_complete_field (and its support methods in Ruby and JavaScript) so that it associates an id with each of the choices and stores the chosen id in a hidden field.
Kjetil Ødegaard 2 days later:
Not to detract from your main point, but you can give your checkboxes the same name yet with different values. Thus you can handle them on the server without string manipulation.
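A rough Ruby sketch of this suggestion: every checkbox is named `delete` and carries the copy id as its value, so the server reads values instead of parsing names. The field name `delete` and the helper `copy_ids_from` are assumptions; `URI.decode_www_form` is part of Ruby's standard library.

```ruby
require "uri"

# Hypothetical helper: collect the copy ids from a submitted form body
# in which each checked checkbox appears as "delete=<copy id>".
def copy_ids_from(form_body)
  URI.decode_www_form(form_body)
     .select { |key, _value| key == "delete" }
     .map { |_key, value| Integer(value, 10) }
end

# Unchecked checkboxes are not submitted, so only checked copies appear.
p copy_ids_from("delete=432&delete=433&commit=Delete")
```

In a Java servlet the same idea would use `request.getParameterValues("delete")`, which returns all values submitted under that one name.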
Michael Schuerig 3 days later:
Looking at the autocompletion code (JavaScript and Ruby), my suggestion (2) above actually is pretty easy to implement.
View
<%= hidden_field_tag 'something_id' %>
<%= text_field_with_auto_complete :something, :name, {},
      :after_update_element => "function(targetField, selectedElement) { $('something_id').value = selectedElement.id }" %>
Controller
class SomethingsController < ApplicationController
  auto_complete_for :something, :name
end
Unfortunately, some monkey patching is necessary, too, as auto_complete_result doesn’t have any options for generating the result list.
module ActionView::Helpers::JavaScriptMacrosHelper
  def auto_complete_result(entries, field, phrase = nil)
    return unless entries
    items = entries.map do |entry|
      content_tag("li",
                  phrase ? highlight(entry[field], phrase) : h(entry[field]),
                  :id => entry.id)
    end
    content_tag("ul", items.uniq)
  end
end
Unfortunately, again, and this proves the title of this blog entry, simply using entry.id in most cases will not work. It has to be prefixed with the class name, say, and this prefix has to be stripped again in the JavaScript snippet further above.
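The prefix-and-strip step might look like the sketch below. It is written in Ruby for consistency with the rest of the code here, although the comment does the stripping in the JavaScript callback; the helper names and the `substitute` prefix are assumptions.

```ruby
# Hypothetical helpers; "substitute" stands in for the record's class name.

# Build an element id such as "substitute_384" for the <li> element.
def dom_id_for(prefix, record_id)
  "#{prefix}_#{record_id}"
end

# Strip the prefix again to recover the record id.
# Returns nil when the element id does not carry the expected prefix.
def record_id_from(prefix, dom_id)
  match = /\A#{Regexp.escape(prefix)}_(\d+)\z/.match(dom_id)
  match && match[1].to_i
end

element_id = dom_id_for("substitute", 384)
puts element_id
p record_id_from("substitute", element_id)
```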
Karri-Pekka Laakso 4 days later:
The UI design of the Ruby autocomplete that uses a textfield can actually never work if its intended use is to select a pre-existing value. In short, the problem is that mixing search criteria with search results almost never works from a UI design point of view.
The textfield component is meant for arbitrary input, and the dropdown menu for selecting pre-existing values. The current autocomplete design is copied from the address bar of web browsers without realizing that in the context of web addresses, it is reasonable that the user can type in anything. However, in your case (and we have bumped into similar cases), it is not.
I don’t know any other solution than to fix the UI design: what you need is a dropdown that has a search field inside its popup menu. Building one is probably as frustrating as all development for the web.
IsmetKahraman 4 days later:
Google provides a very good solution for Web programming. GWT enables developers to code in terms of abstractions without worrying about those JavaScript and string manipulation errors. Communication between client and server is also handled by means of remote object calls and serialized object requests. A subset of Java is supported, and it is transparently compiled to JavaScript.
Ravi 8 days later:
Regarding the issue of “Delete” and “Delete All”
I assume that the second is new functionality that was not there originally. In that case, there is a subtle difference between the two tasks: the first command requires one specific copy (with its own unique ID) to be deleted; while the second asks to delete all copies for the specified ISBN.
If we wanted to reuse the existing functionality for the new task “Delete All”, on the web server, we could loop through the rows and generate a series of delete_nnn calls whenever the ISBN matches the specified ISBN. This series of calls, bundled appropriately, would be sent to the application/database server to execute.
Assuming that the actual data is stored in a relational database, another approach is possible. Recognizing that “Delete” and “Delete All” are two different tasks, we could code our web page so that when “Delete” is required, then we call a stored procedure/SQL statement and pass it the copy ID.
If “Delete All” is called then we would call a different stored procedure/SQL statement and pass it the ISBN.
This takes advantage of the features of the relational database that allows “set” operations, as in deleting the set of copies that correspond to the specified ISBN.
Assuming further that the SQL statements and stored procedure calls are never hard-coded in the code or JSP page, and that these are set in some sort of Configuration or Property files, we get another benefit. The web page developer can simply say: using the configuration information, get the SQL to delete copy_nnn, and pass nnn as a parameter. This way the web or application developer is abstracted away from the structure of the database. And the database developers can optimize the query without ever breaking the code. They could even redesign the database without the application developer being affected!
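The two-statement approach with externalized SQL could be sketched as follows. Everything named here is an assumption: the table `copies`, its columns `id` and `isbn`, the `SQL_STATEMENTS` hash standing in for a configuration file, and the helper `delete_statement`. No database is touched; the sketch only selects a statement and its parameter.

```ruby
# Hypothetical configuration: in practice this would be loaded from a
# properties or configuration file rather than defined in code.
SQL_STATEMENTS = {
  delete_copy: "DELETE FROM copies WHERE id = ?",
  delete_all_copies: "DELETE FROM copies WHERE isbn = ?"
}.freeze

# Pick the statement and bind parameter for the requested task.
# "Delete" passes a copy id; "Delete All" passes an ISBN and lets the
# database remove the whole set of matching copies in one operation.
def delete_statement(copy_id: nil, isbn: nil)
  if copy_id
    [SQL_STATEMENTS.fetch(:delete_copy), [copy_id]]
  elsif isbn
    [SQL_STATEMENTS.fetch(:delete_all_copies), [isbn]]
  else
    raise ArgumentError, "need a copy_id or an isbn"
  end
end

p delete_statement(copy_id: 432)
p delete_statement(isbn: "0131857258")
```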
I would never even contemplate doing it the way your team approached the problem, using delete_ssss_nnnn. That approach, if suggested, would be discarded almost immediately. The solution suggested in the original article merely reinforces my belief that the vast majority of programmers have a lot to learn in terms of designing applications.