Tales from the trenches.
Two companies develop a communication protocol to exchange information between their respective systems. SOAP is chosen, presumably because of all its enterprise-trendiness.
One company builds a J2EE application, the other a single win32 exe built with C++ Builder. Both sides can send and receive messages.
The applications get tested by the developers at their respective offices, and declared ready to go….
…but fail to exchange a single message when deployed in the production environment, which includes quite a few extra devices:
- https encryption on the internet leg of the path
- your run-of-the-mill layer 3 firewall
- an application-layer firewall terminating the https connections and applying layer 7 http filtering
- another application-layer firewall applying antivirus scanning
The developers have no clue as to what might be causing the problem, and are prone to point fingers at the “bad firewalls”.
- the java guys are helpless in locating errors, because they never look at the http layer which should supposedly be carrying the xml messages. All they know are the java objects that the framework builds for them. If any error occurs in handling the http connection, the xml (de)serialization or the https handshake, they never find out: all they get is a null reference (note: presumably the informative, verbose exception that would help in debugging is being caught too early inside the framework itself, and the coders do not dare split it open)
- the c++ guys are helpless in locating errors, because their app in fact knows nothing about xml and much less about http. It reads strings from sockets and writes strings to sockets. The received strings are parsed, presumably by regexps, while the outgoing messages are built from .txt template files with very basic token substitution
There is something really interesting in comparing the two approaches to development. On one side the communication protocol is completely abstracted away, so much so that it cannot be examined at all. On the other side the protocol is not handled at all, and old-school quick-n-dirty pragmatic coding has produced the bare minimum necessary to fake understanding the language, while in reality not being able to speak it.
Luckily, Wireshark comes to the rescue of the zealous sysadmin. Traffic captures are taken on every network segment between one firewall and the next along the path.
- it quickly becomes evident that one of the firewalls is removing the http body of the soap responses. The soap caller sees empty content and correctly returns an error
- the http body is in fact present in the response as sent by the server, but the firewall removes it because the http content-length header declares a length of zero bytes. Debatable behavior, but the culprit in this case has to be found closer to the soap server
- when asked to fix his application so that it emits a correct content length, the coder protests that it would be too hard, since the application has not been designed for that
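For the record, this kind of mismatch is easy to check mechanically once a response has been pulled out of a capture. What follows is only a minimal Python sketch, assuming the raw bytes of a single http response are already at hand. The function name and the sample message are made up for illustration; the real responses are not reproduced here.

```python
from typing import Optional, Tuple


def check_content_length(raw: bytes) -> Tuple[Optional[int], int]:
    """Return (declared Content-Length, actual body length) for one raw HTTP message."""
    # Headers and body are separated by the first empty line (CRLF CRLF).
    head, _, body = raw.partition(b"\r\n\r\n")
    declared = None
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            declared = int(value.strip())
    return declared, len(body)


# Hypothetical response: a zero Content-Length followed by a non-empty body.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/xml; charset=UTF-8\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
    b"<soap:Envelope>...</soap:Envelope>"
)

declared, actual = check_content_length(sample)
if declared != actual:
    print(f"mismatch: header declares {declared} bytes, body has {actual}")
```

A strict intermediary is entitled to trust the header over the bytes that follow it, which is why the body never reached the caller.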
A quick inspection of the templates used by the (c++) almost-soap server for building the responses reveals the following gem (single .txt file):
HTTP/1.1 200 OK
Content-type: text/xml; charset=UTF-8
<soap tags junkard...>
GOTCHA! The content-length header is de facto hardwired into the templates themselves. The application would “really” have a hard time calculating the content length of the payload, since the payload is built from “part” of the template file plus some string variables supplied at runtime. And the templates are many, one for every soap message exchanged…
The fix is easily found: the template-fixed content-length header is set to 40000, large enough to accommodate any valid soap response.
It is an ugly hack, but it works. Every soap response will generate a warning in the firewall logs, since from the firewall's point of view the http message body arrives incomplete, but who cares…
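For contrast, here is roughly what computing the header would involve if the template held only the soap body and the http head were written after substitution. This is a sketch in Python rather than the server's C++, and everything in it is an assumption: the `%NAME%` token syntax, the function name and the sample template are invented for illustration, not taken from the actual application.

```python
def render_response(body_template: str, values: dict) -> bytes:
    """Fill in the body template, then build the HTTP head around the result."""
    body = body_template
    for token, value in values.items():
        # Very basic token substitution; the %TOKEN% syntax is hypothetical.
        body = body.replace("%" + token + "%", value)

    # Content-Length counts bytes, not characters, so encode first.
    body_bytes = body.encode("utf-8")
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/xml; charset=UTF-8\r\n"
        f"Content-Length: {len(body_bytes)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body_bytes


response = render_response("<msg>%NAME%</msg>", {"NAME": "café"})
print(response.decode("utf-8"))
```

The only real design change is the ordering: the body has to be complete before the head is written, which is exactly what a single template file containing both head and body makes awkward.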
The moral of this story is: why on earth are software developers forced to use layers upon layers upon layers of technology they do not understand? Where is the smart project leader who forced soap to be adopted for this particular project? Will he be fired for that? Will he – at least – ever acknowledge his error?
All I know is we have a saying around here: “do not use a cannon to shoot a fly”. Maybe if we can somehow convey a “web 3.0” aura to it, somebody will notice.