Saturday, September 17, 2005

REST and SOAP and document-oriented services

There's been an interesting discussion of REST and SOAP and document-oriented web services starting with Mark Baker's post called Towards truly document oriented Web services. I think this is one of Mark's best and most succinct efforts to explain the subtle, yet important differences between REST and SOAP, which continue to befuddle so many people.

It all comes down to application semantics.

One of the primary tenets of REST is the use of a uniform interface between components. In his dissertation on REST, Roy Fielding describes the uniform interface as "the central feature that distinguishes the REST architectural style from other network-based styles". REST is closely associated with the HTTP application protocol because HTTP provides this type of uniform interface -- all components communicate using simple, generic methods: GET, POST, PUT, and DELETE.

In Mark's post he explains the different application semantics expected when using POST versus PUT to send a SOAP message. POST implies "process this message", while PUT implies "store this message".
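To make the distinction concrete, here is a minimal sketch in Python. It assumes a toy in-memory resource rather than a real HTTP server; the class name, attribute names, and status codes chosen are illustrative only and do not come from Mark's post.

```python
class MessageResource:
    """Hypothetical resource living at a single URL (illustration only)."""

    def __init__(self):
        self.stored = None    # the representation currently held at this URL
        self.processed = []   # messages that were handed to the application

    def handle(self, method, body):
        if method == "PUT":
            # PUT means "store this message": the body becomes the resource state.
            self.stored = body
            return 204
        if method == "POST":
            # POST means "process this message": the stored state is left alone.
            self.processed.append(body)
            return 200
        # Any other method is rejected in this sketch.
        return 405


resource = MessageResource()
resource.handle("PUT", "<Envelope>order 1</Envelope>")
resource.handle("POST", "<Envelope>order 2</Envelope>")
```

After these two calls the resource holds the first envelope as its state, while the second envelope was passed along for processing and never stored.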

SOAP is also an application protocol, with its own set of application protocol semantics. And these semantics are different from HTTP application protocol semantics. Even though SOAP messages are typically transferred using HTTP, for the most part SOAP hides/obscures the HTTP semantics from SOAP applications. The underlying SOAP runtime systems communicate using HTTP semantics, but SOAP applications communicate using SOAP semantics.

SOAP uses (some would say abuses) the HTTP application protocol. Even though HTTP is an application protocol, SOAP treats HTTP as a lower-level communication protocol. But SOAP is an equal-opportunity application protocol abuser. SOAP treats other application protocols, such as WebSphere MQ, SMTP, and Jabber, the same way. From the SOAP perspective, these application protocols are all low-level communication protocols.

The SOAP specifications define bindings of the SOAP application protocol semantics to the semantics of other application protocols, such as HTTP, SMTP, and WebSphere MQ, which SOAP treats as communication protocols. Simply put, SOAP uses HTTP and other application protocols to exchange messages between SOAP nodes. The SOAP nodes act as mediators between SOAP applications and the underlying communication protocols. Although there are some subtle differences between the various protocol bindings, SOAP does a fairly decent job of isolating the applications from the different semantics of the various underlying communication protocols. (SOAP 1.2 does a better job than SOAP 1.1 of abstracting away the underlying protocol semantics.) In any case, the goal is that SOAP exposes the same application protocol semantics regardless of the underlying communication protocol.

SOAP does not fully exploit the power of the HTTP application protocol. In fact, SOAP 1.1 explicitly constrains its use of HTTP to the POST operation: SOAP 1.1 over HTTP requires that SOAP requests be sent using POST. The specification doesn't indicate how to process requests using PUT, so use of PUT is not supported by the SOAP protocol. SOAP 1.2 adds a description of using SOAP with HTTP GET, but it still doesn't describe how to use SOAP with HTTP PUT. Mark's example of using HTTP PUT to store a SOAP message is therefore outside the scope of the SOAP protocol. In this example, HTTP PUT semantics apply, not SOAP semantics.
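As a rough illustration, a SOAP 1.1 request over HTTP can be sketched in Python as follows. The endpoint URL and the body element are made up for this example; the POST method, the text/xml content type, and the SOAPAction header follow the SOAP 1.1 HTTP binding. The request is only constructed here, never sent.

```python
import urllib.request

# Namespace of the SOAP 1.1 envelope.
SOAP11_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap11_request(url, body_xml, soap_action=""):
    """Wrap body_xml in a SOAP 1.1 envelope and prepare an HTTP POST for it."""
    envelope = (
        f'<soap:Envelope xmlns:soap="{SOAP11_ENV}">'
        f"<soap:Body>{body_xml}</soap:Body>"
        "</soap:Envelope>"
    )
    return urllib.request.Request(
        url,
        data=envelope.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            # SOAP 1.1 expects this header on HTTP requests; its value is a quoted string.
            "SOAPAction": f'"{soap_action}"',
        },
        method="POST",
    )


# Hypothetical endpoint and payload, for illustration only.
request = build_soap11_request(
    "http://example.com/purchases",
    '<purchaseOrder xmlns="urn:example:orders"/>',
)
print(request.get_method())  # prints "POST"
```

Nothing in this sketch offers a PUT variant, which mirrors the point above: once the method is PUT, the meaning of the exchange is defined by HTTP, not by SOAP.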

Now -- getting back to the topic of REST and SOAP and document-oriented web services ...

In his post, Mark references one of my responses to an Ask the Expert question on TechTarget regarding the difference between Document and RPC style services. In my response I indicated that the difference between Document and RPC is fairly minor, but the more important consideration is the encoding style. Document style messages are encoded literally according to a schema. Back in 2002, when I wrote this response, RPC style messages were typically encoded using a data model called SOAP encoding. (Many SOAP engines now also support literal encoding of RPC style messages.)

Mark has a different perspective, though. He claims that the more important difference between Document and RPC style services is whether or not the message contains a method name. When using RPC style, the message always contains the method name. When using document style, the message may or may not contain the method name. When using the "wrapped" programming convention, the message does contain the method name. When using the "unwrapped" programming style, the message typically doesn't contain the method name.

To quote Mark's post:
"While the encodings used were certainly different, each with its own not-insignificant pros and cons, what Anne failed to point out is that the RPC example included an operation name ('placeOrder') while the document oriented example did not. This constitutes an extremely significant architectural difference, as it tells us that Anne's document example uses a state transfer style, while the RPC example does not."

But I have to disagree with Mark's assertion. And here I have to go back to the differences between HTTP and SOAP application protocol semantics. Mark makes the assumption that if the message does not include a method name, then it must therefore be using a uniform interface -- an implicit "processMessage" operation. But that's not how SOAP works. SOAP doesn't require that you specify a method name in order to map the request to a specific method. Per the SOAP specification, the SOAP node determines what method to invoke based on the qualified name (QName) of the child element of the SOAP Body element. It really doesn't matter if the name of that element is a noun ("purchaseOrder") or a verb ("placeOrder"). The QName determines the action to be performed. These SOAP semantics are fundamentally different from HTTP/REST semantics.
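A small Python sketch shows what "the QName of the child element of the SOAP Body" means in practice. It assumes a SOAP 1.1 envelope; the order namespace urn:example:orders and the sample message are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP11_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def body_child_qname(envelope_xml):
    """Return (namespace, local name) of the first child element of the SOAP Body."""
    root = ET.fromstring(envelope_xml)
    body = root.find(f"{{{SOAP11_ENV}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("SOAP Body is missing or empty")
    tag = body[0].tag  # ElementTree writes a qualified name as "{namespace}local"
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
    else:
        namespace, local = "", tag
    return namespace, local


# Hypothetical document style message: the Body's child is a noun, not a verb.
message = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <po:purchaseOrder xmlns:po="urn:example:orders">
      <po:item>widget</po:item>
    </po:purchaseOrder>
  </soap:Body>
</soap:Envelope>"""

print(body_child_qname(message))  # ('urn:example:orders', 'purchaseOrder')
```

The function returns the same kind of value whether the element is called purchaseOrder or placeOrder, so a SOAP node can route on it either way.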

In REST, the "method" that will be invoked is always determined by the URL to which the request is sent, not by the contents of the request message. REST has a uniform interface, therefore a POST to a URL will always invoke the implicit "processMessage" operation.

In SOAP, the "method" that will be processed is determined by both the URL to which the request is sent and the contents of the request -- the QName of the SOAP Body. So, for example, a single service might support multiple methods: placeOrder, cancelOrder, getOrderStatus. Each method is invoked by sending a different document:
  • purchaseOrder invokes placeOrder
  • canceledOrderID invokes cancelOrder
  • pendingOrder invokes getOrderStatus
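
The mapping in the list above can be sketched as a pair of dispatch functions in Python. The service URL is an assumption made for the example, and the sketch keys on the element's local name only, where a real SOAP node would use the full QName.

```python
# Hypothetical service location; the element and operation names come from the list above.
ORDER_SERVICE_URL = "/orders"

SOAP_OPERATIONS = {
    (ORDER_SERVICE_URL, "purchaseOrder"): "placeOrder",
    (ORDER_SERVICE_URL, "canceledOrderID"): "cancelOrder",
    (ORDER_SERVICE_URL, "pendingOrder"): "getOrderStatus",
}


def soap_dispatch(url, body_child_name):
    """SOAP style: the URL plus the name of the Body's child element select the operation."""
    return SOAP_OPERATIONS[(url, body_child_name)]


def rest_dispatch(http_method, url, body_child_name):
    """REST style: only the HTTP method and the URL matter; the element name is ignored."""
    return f"{http_method} {url}"


for name in ("purchaseOrder", "canceledOrderID", "pendingOrder"):
    print(name, "->", soap_dispatch(ORDER_SERVICE_URL, name),
          "|", rest_dispatch("POST", ORDER_SERVICE_URL, name))
```

Under SOAP dispatch the three documents reach three different operations at one URL; under REST dispatch all three are the same POST to the same resource.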

I guess my point is that document-oriented doesn't mean that it's RESTful. SOAP semantics are just different from REST/HTTP semantics.

7 comments:

Stefan Tilkov said...

Why doesn't Blogger do trackbacks? Anyway, some thoughts here.

Mark said...

Sorry for the delay in responding, but thanks so much for the kind words and reasoned response, Anne.

I was obviously successful at communicating some points, as some of your responses make evident. But it seems I wasn't so lucky with some other points I was trying to make. For example, you say:

"REST has a uniform interface, therefore a POST to a URL will always invoke the implicit "processMessage" operation."

What I'm really saying is that POST == processMessage; that they mean the same thing. So it's just "POST to a URL", where POST is the method and the URL identifies the service/resource. Is that clearer? I brought up PUT only to say - since, of course, no existing binding uses it - that if one did create such a binding, this is what the message would mean, and that POST != PUT and PUT != processMessage.

Regarding WS-I vs. SOAP and standardized dispatch, I reckon that the WS-I decision was made to improve interoperability by choosing a single (?) dispatch mechanism. Interestingly, a RESTful use of SOAP could also be said to restrict choice in the same manner, though instead of using the GED à la WS-I, it opts to restrict "services" (resources) to doing one thing on a POST (more or less) so that there's no ambiguity from a message-on-the-wire POV. That is, if you needed a service to do more than one thing, you'd have to mint another service, and give it its own URI.

Anyhow, some stream-of-consciousness thoughts about your interesting post. There's still a disconnect, but from my POV it's not nearly as serious as it was, say, 5 years ago.

Cheers.

Mark Hansen said...

What is the big deal about whether the method/operation being invoked is specified as a sub-string of a URL or as the QName of a single SOAP body element? Isn't that just syntax?

To place a purchase order, in REST, you POST to something like:

http://mysite.com/purchases/newOrder

In SOAP, you POST the following message to the URL

http://mysite.com/purchases

<envelope>
<header/>
<body>
<newOrder> ...
</newOrder>
</body></envelope>

In the REST case, the "new order" operation is embedded in the URL. In SOAP, it is embedded in the message syntax.

It seems to me that any SOAP system can be easily transformed to a REST system by pulling the method and schema type QNAME of the parameter (single parameter since we are talking doc/lit) out of the SOAP message and mapping them to a URL representation:

http://mysite.com/operation-qname/parameter-type-qname

I must be missing something. Is the entire SOAP vs. REST debate really about syntax?

Chui Tey said...

SOAP's RPC heritage can be seen in the version 1 spec, where encoding can lead to different document schemas for the same semantics. In particular, see Tim Ewald's article on MSDN - Argument against SOAP encoding.

I believe that REST and SOAP proponents have different value systems.

REST proponents value document creation and transmission over network transparency. Programmers have to be aware of the distributed nature of computing, and marshalling issues with respect to interoperating with systems running on different languages. A simpler document schema is favoured since humans have to work with it.

SOAP proponents value network transparency, transport transparency and transparent marshalling over simple schemas. Programmers can make a simpler conceptual jump from function calls to RPC. SOAP encoding even took care that if one sent getDistance(Point1 => point1, Point2 => point1), the server will get the same reference for Point1 and Point2.

Who is right?

If the customer is right, then SOAP will prevail in the long run, since marshalling is hard work, and REST intentionally makes no claims on how marshalling should be done.

However, in the short run, in the absence of well baked SOAP stacks across the board in the major open source languages, simple document schemas win, as can be seen in the so-called "web 2.0" mashups.

There is a happy medium though. Microsoft SQL Server has a feature to marshal result sets as XML. This has several advantages:

1) it is document based
2) REST-like operations
3) built-in marshalling
4) handles null types well
5) most developers already know SQL

In the end, web services are no more than one database trying to talk to another database. SOAP is this:

my database -> my ORM -> your ORM -> your database

Why not make life easier and just cut out the ORM?
