For a project I’m working on I have to provide web services via SOAP in Rails. The consuming client is SOAPpy (Python).
I hit two different problems, and came up with a solution that I wouldn’t mind feedback on.
The first one is that when you post via SOAPpy you get this header
Content-type: text/xml; charset="utf-8"
That generates an error along the lines of
XSD::Charset::CharsetConversionError (Converter not found: X_UNKNOWN -> UTF8):
/usr/lib/ruby/1.8/xsd/charset.rb:112:in `encoding_conv'
/usr/lib/ruby/1.8/xsd/charset.rb:102:in `encoding_from_xml'
The source of the problem is that SOAPpy is putting quotes around the charset. I won’t tell you how long it took me to figure that out. I don’t know if this is a bug in SOAPpy or in Rails because I don’t know if it violates the standard to put it in quotes. (If you know let me know - it will help me figure out who to submit the bug to) In the interim I figured I’d need to fix this.
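Whichever side turns out to be at fault, the workaround boils down to stripping the quotes before the charset reaches the converter. Here is a minimal standalone sketch of that idea. The helper name normalized_charset is made up for illustration and is not part of Rails or ActionWebService:

```ruby
# Hypothetical helper (not part of Rails): pull the charset parameter out of
# a Content-Type header value and strip optional surrounding quotes.
def normalized_charset(content_type)
  return nil if content_type.nil?
  # Accept a double-quoted, single-quoted, or bare parameter value.
  match = content_type.match(/charset\s*=\s*("[^"]*"|'[^']*'|[^;\s]+)/i)
  return nil unless match
  # Remove one leading and one trailing quote character, if present.
  match[1].gsub(/\A["']|["']\z/, "")
end

quoted = normalized_charset('text/xml; charset="utf-8"') # the form SOAPpy sends
bare   = normalized_charset('text/xml; charset=utf-8')
```

Both calls should yield the same charset string, which is the behaviour the XML decoder needs.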
The other issue is related to Mongrel. For whatever reason when Mongrel talks to SOAPpy it ends up putting the Content-Length header twice. SOAPpy stores it as a list - which then causes it to blow up. I don’t know if that is really a bug on Mongrel’s part because again I don’t know the standard. I submitted a bug anyway.
As you can see, these are problems specifically in interacting with SOAPpy. I don’t have control of the other side so I needed to fix it on the Rails side. So if you have feedback - please keep in mind that I CANNOT modify the Python side unless the SOAPpy project itself releases the update - which, since the current version is from 2005, seems unlikely.
So my solution involved opening a class (aka monkey patching).
Before we go further let me just say that this strategy is very powerful. Like all powerful things it can be very dangerous. First, you are modifying the behavior of something standard to do something different. That means that when someone else tries to use your code - or when updates come out - if you are not careful you may earn yourself a lot of time in the penalty box when you can’t figure out why something doesn’t work for you. The goal of the process as described here is to make it at least possible to figure out where to look.
First of all, most of the structure is actually from another developer I’m working with (Thanks Ryan!) - but I figured I’d spend some time documenting it with my solution to the Soap problem.
In the Rails project we now have a directory app/core_ext
This is where all modifications go.
At the very bottom of config/environment.rb the following line is added
Dir["#{File.expand_path(RAILS_ROOT)}/app/core_ext/*.rb"].each { |file| load file }
If an object acts weird you get to look here first.
Second - the modifications are added in a module named in a way that shows that modifications are being made.
This allows me to issue a command like this in the console
ActionWebService::Protocol::Soap::SoapProtocol.ancestors
Which returns
[ActionWebService::Protocol::Soap::SoapProtocol, SoapProtocolExtension, ActionWebService::Protocol::AbstractProtocol, Object, ObjectExtension, Base64::Deprecated, Base64, Kernel]
You can see SoapProtocolExtension in the list. I’m sure as we deal with this more we will standardize more to make it easier to figure out if something has been opened by something else.
Now on to the code.
I create a file in core_ext called soap_protocol.rb
module SoapProtocolExtension
  # Required to handle a CONTENT-TYPE that wraps the charset in quotes
  def custom_decode_action_pack_request(action_pack_request)
    return nil unless soap_action = has_valid_soap_action?(action_pack_request)
    service_name = action_pack_request.parameters['action']
    # Strip single or double quotes from the parsed charset
    input_encoding = parse_charset(action_pack_request.env['HTTP_CONTENT_TYPE']).gsub(/["']/, "")
    protocol_options = {
      :soap_action => soap_action,
      :charset => input_encoding
    }
    decode_request(action_pack_request.raw_post, service_name, protocol_options)
  end
end

ActionWebService::Protocol::Soap::SoapProtocol.class_eval {
  include SoapProtocolExtension
  # Keep the original reachable under a new name, then swap in the custom one
  alias old_decode_action_pack_request decode_action_pack_request
  alias decode_action_pack_request custom_decode_action_pack_request
}
The top part defines the function that actually handles the quoting.
The bottom part opens the class and includes the new method. Then it uses alias to move the old method out of the way, and again to move the custom method into its place. I could probably have reduced the code by just modifying HTTP_CONTENT_TYPE and passing the request on to the old method. (Assuming I stick to this path, that’s probably what will happen.)
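For what it’s worth, here is a rough sketch of that shorter variant: clean the header, then delegate to the aliased original. It runs against stand-in classes (FakeSoapProtocol and FakeRequest are made up for this example), so treat it as an illustration of the alias pattern rather than a drop-in patch. Whether the real request’s env hash can be modified in place like this is an assumption I haven’t verified.

```ruby
# Stand-in for the real protocol class (hypothetical, for illustration only).
class FakeSoapProtocol
  def decode_action_pack_request(request)
    # Pretend decoder: just report the content type it was handed.
    request.env['HTTP_CONTENT_TYPE']
  end
end

# Stand-in for the request object; only the env hash matters here.
FakeRequest = Struct.new(:env)

module FakeSoapProtocolExtension
  # Strip quotes from the header, then hand off to the original method.
  def custom_decode_action_pack_request(request)
    content_type = request.env['HTTP_CONTENT_TYPE']
    request.env['HTTP_CONTENT_TYPE'] = content_type.gsub(/["']/, "") if content_type
    old_decode_action_pack_request(request)
  end
end

FakeSoapProtocol.class_eval do
  include FakeSoapProtocolExtension
  # Same two-step alias swap as in the patch above.
  alias old_decode_action_pack_request decode_action_pack_request
  alias decode_action_pack_request custom_decode_action_pack_request
end

request = FakeRequest.new({ 'HTTP_CONTENT_TYPE' => 'text/xml; charset="utf-8"' })
result = FakeSoapProtocol.new.decode_action_pack_request(request)
```

The appeal of this version is that the body of the original method is not copied, so an upstream change to the decoder is picked up automatically.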
The second patch is in a file called mongrel.rb. I had to add a little bit more logic because Mongrel isn’t always present at startup (hello, WEBrick). I also ended up putting it inside the Mongrel module to give it access to constants and other info related to Mongrel.
if Module.constants.include?("Mongrel")
  module Mongrel
    module Mongrel::HttpResponseContentLengthBug
      def custom_send_status(content_length=@body.length)
        if not @status_sent
          # Disabled so the Content-Length header is not written a second time
          #@header['Content-Length'] = content_length unless @status == 304
          write(Const::STATUS_FORMAT % [@status, HTTP_STATUS_CODES[@status]])
          @status_sent = true
        end
      end
    end
  end

  Mongrel::HttpResponse.class_eval {
    include Mongrel::HttpResponseContentLengthBug
    alias old_send_status send_status
    alias send_status custom_send_status
  }
end
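One caveat on the guard at the top of that file: Module.constants returns strings on Ruby 1.8 but symbols on later versions, so the include?("Mongrel") check depends on the interpreter. A sketch of a guard that does not care either way, using Object.const_defined? (the helper name top_level_constant_loaded? is made up for this example):

```ruby
# Hypothetical helper: true when a top-level constant with this name exists.
def top_level_constant_loaded?(name)
  Object.const_defined?(name)
end

if top_level_constant_loaded?("Mongrel")
  # The Mongrel patch would go here; it is skipped when another server
  # (WEBrick, for instance) is running and Mongrel was never loaded.
end
```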
Once the bug in Mongrel gets fixed I’ll remove this modification. But until then I can continue working on the system. Anybody else done anything like this? Am I going down a dark path?
October 19th, 2006 at 2:57 pm
Yes. You are going down a path that’s going to cause anger and pain in the future.
The problem is you’re basically adding software to work around defects in other software instead of just fixing the defect. Monkeypatching is ok to test out a theory and prove a concept, but after that it should be yanked and what you just learned applied to correcting the bug ASAP.
My advice is to crack open the upstream libs, make a patch, and then go submit it along with an explanation of why you had to make it.
As for the content-type thing: yes, quotes are legit around the value. The spec says you can use them to group delimited tokens into one argument… kind of like how you can use double quotes to wrap a filename with a space in it. While content-type doesn’t really have a need for that (there’s only one thing it can have as an arg), you should still account for them.
The mongrel stuff I’m not sure about, but my guess is that even if your fix isn’t the right one, it’ll illustrate exactly what the problem is to the upstream maintainer enough so that he’ll do the needful.