
July 13, 2005

Designing Remote Service APIs

I've been meaning to bring an ongoing discussion about remote API design to this forum. A recent post to the Caucho hessian-interest mailing list prompted me to write a response that I am essentially reposting here.

The question at hand is what is the best approach for designing, publishing and versioning remote service APIs. Through my experiences designing and programming against remote service APIs I've come to develop the following opinion.

If you are developing remote APIs for a known and manageable set of clients, exposing your fine grained object model through the API over a mechanism like SOAP/Hessian makes for rapid development and easy integration. Being able to rev your object model and have the changes show up in the API is a benefit in this case since you do not have to manage a translation layer to DTOs, XML or something else.

If you are developing remote APIs for an unknown and unmanageable set of clients, you need a layer between your internal object model and the model exposed to the clients. This allows you to rev your internal model while keeping the external interface consistent. Of course, you then have to maintain the translation.
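As a rough illustration of that layer, here is a minimal Java sketch. The names Customer, CustomerV1 and CustomerTranslator are hypothetical and exist only for this example. The internal class can change shape freely, while the public class only changes when a new API version is published.

```java
import java.util.ArrayList;
import java.util.List;

// Internal model (hypothetical): free to change between releases.
class Customer {
    final long id;
    final String firstName;
    final String lastName;
    // Internal-only detail that should never leak to remote clients.
    final List<String> internalNotes = new ArrayList<>();

    Customer(long id, String firstName, String lastName) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

// Public model (hypothetical): designed for remote use and kept stable.
final class CustomerV1 {
    final long id;
    final String displayName;

    CustomerV1(long id, String displayName) {
        this.id = id;
        this.displayName = displayName;
    }
}

// The translation layer: the one place that knows about both models.
final class CustomerTranslator {
    private CustomerTranslator() {
    }

    static CustomerV1 toPublic(Customer customer) {
        return new CustomerV1(customer.id, customer.firstName + " " + customer.lastName);
    }
}
```

If Customer later splits its name fields differently or gains new internal state, only CustomerTranslator has to change. That is the maintenance cost mentioned above.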

I have seen this second approach addressed well using SOAP by Salesforce.com in their remote APIs. They expose an object model in their APIs that is explicitly designed for remote use.

To manage versioning Salesforce makes different WSDL available for different versions of the API. Using different WSDL gives you access to different versions of the service. I'm not sure of the versioning mechanism but it could be either contained in the SOAP session that you establish with Salesforce or simply be a different end point for accessing the service.

Different end points for different versions seems like one good way to manage versioning your services, if you want to support multiple versions. Clients opt in to a version by choosing an end point.
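A minimal sketch of the end-point-per-version idea, in plain Java with no particular remoting toolkit. The paths, the AccountService interface and the VersionedEndpoints registry are all assumptions made up for illustration; this is not how Salesforce implements it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical remote interface shared by all published versions.
interface AccountService {
    String describeAccount(long accountId);
}

// Version 1 of the public contract.
class AccountServiceV1 implements AccountService {
    public String describeAccount(long accountId) {
        return "account:" + accountId;
    }
}

// Version 2 returns more data; version 1 clients never see it.
class AccountServiceV2 implements AccountService {
    public String describeAccount(long accountId) {
        return "account:" + accountId + ";status=active";
    }
}

// Hypothetical registry mapping an end point path to one service version.
class VersionedEndpoints {
    private final Map<String, AccountService> servicesByPath = new LinkedHashMap<>();

    void publish(String path, AccountService service) {
        servicesByPath.put(path, service);
    }

    AccountService resolve(String path) {
        AccountService service = servicesByPath.get(path);
        if (service == null) {
            throw new IllegalArgumentException("No service published at " + path);
        }
        return service;
    }
}
```

A client that was written against /services/v1/account keeps working after /services/v2/account is published, and retiring a version is a matter of removing its entry.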

I have also seen this second approach addressed well by Flickr in their RESTful XML APIs. Flickr does not seem to have addressed versioning and simply has kept their APIs 'in beta' to allow them to rev the APIs without supporting old versions.

The success of Flickr in getting the general developer community to use their APIs has led me to the conclusion that if you want people to use your APIs, as opposed to providing them only because users are demanding them, then a custom, well-documented XML-over-HTTP API will be most successful. Clients have to write more code, but the code they have to write is obvious from looking at the APIs.

If you use a SOAP/Hessian/Burlap implementation that will convert your object model to a wire format for communication between server and client, you then need to maintain an object model for your API separately from your internal object model. I'm starting to think that if you are maintaining a translation between an internal and public model, you may as well have that public model be XML based. Remember, I'm talking about the case where you are supporting an unknown and unmanageable set of heterogeneous clients.

I'd prefer that developers of remote clients all wanted to use SOAP (or Hessian or Burlap), but it just doesn't seem to be that way. In the enterprise developer community, sure, but what about people writing Flash and JavaScript clients, of which there are increasing numbers every day?

I'm very interested to hear what others have to say on this topic.

Posted by Alon Salant at July 13, 2005 8:54 AM

Comments

Your discussion sounds plausible. Personally, my anecdotal experience would make me shy away from approach one regardless of the project type.

I've seen approach number one - fine grained remote APIs - implode on several projects now. Whereas approach two - coarse grained remote APIs - has been a pretty safe bet up to now.

The problems with the fine grained API projects have usually been variants of one of the two:
1) poor performance under load
2) overly tight coupling between server and client

Problem variant 1 would happen because in several cases the object graph and/or the number of method calls crossing over the wire was not considered.

Problem variant 2 would happen because in several cases a facade interface would have been a natural fit anyhow for the particular problem regardless of the REMOTING aspect.

I do have a counter example of where I saw some fine grained APIs succeed. It was developing network management applications for Telecom companies. In that case the REMOTE objects exactly matched equipment in the field, the object model and the operations matched very nicely with what the System Administrator was trying to accomplish.

I am not sure what to conclude, but that's been my experience. I would tend to think the approach depends more on the problem domain than on the project type. Not sure.


Posted by: Tony at July 13, 2005 11:13 AM

On a slightly different topic... I've been wondering whether something like the Hibernate model (lazy loading and/or pre-fetching behaviour configured outside of the code) could be applied to remoting, i.e. when a client requests some data, the remoting layer determines (based on configuration) how much of the object graph gets shipped back to the client. The references in the object graph that don't get shipped would be replaced with some kind of proxy, which (if the client traverses that reference) would make a subsequent remote call to fetch ("page fault") more of the object graph into the client. All this would be transparent to the client itself: it would just be querying, traversing and manipulating the object graph as if it were all locally in memory.

It seems to me that something like this could have some serious benefits over hardcoding the subsets of the object graph that get shipped with each remote request:

* you can watch how clients interact with the server over time, and tweak your object graph boundaries accordingly (and the clients wouldn't even be aware that the object graph subsets had changed)

* you could safely make remoting transparent to the client, because by being able to control how much or how little of the object graph gets sent back with each request, you can help to avoid both the "chatty client" problem and the "send the entire object graph on every request" problem

* I think the actual transport layer could be pluggable - I can't see any reason why the server infrastructure (responsible for defining and serving appropriate subsets of the object graph) or the client proxy (responsible for "page faulting" in parts of the object graph that are missing on the client) would need to be tied to a particular transport technology
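To make the "page faulting" proxy idea concrete, here is a minimal sketch using the JDK's java.lang.reflect.Proxy. The Order interface, LoadedOrder class and LazyRemoteRef handler are hypothetical names, and the Supplier stands in for whatever remote call the pluggable transport would make.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

// Hypothetical interface for a node in the object graph.
interface Order {
    long getId();

    String getCustomerName();
}

// Fully loaded implementation, as it would arrive from the server.
class LoadedOrder implements Order {
    private final long id;
    private final String customerName;

    LoadedOrder(long id, String customerName) {
        this.id = id;
        this.customerName = customerName;
    }

    public long getId() {
        return id;
    }

    public String getCustomerName() {
        return customerName;
    }
}

// Stands in for a reference that was not shipped with the first response.
// The first method call triggers the fetch; later calls reuse the result.
class LazyRemoteRef implements InvocationHandler {
    private final Supplier<?> remoteFetch; // assumption: wraps the real remote call
    private Object target;

    LazyRemoteRef(Supplier<?> remoteFetch) {
        this.remoteFetch = remoteFetch;
    }

    public synchronized Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if (target == null) {
            target = remoteFetch.get(); // the "page fault"
        }
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // surface the real exception to the caller
        }
    }

    @SuppressWarnings("unchecked")
    static <T> T proxyFor(Class<T> type, Supplier<? extends T> remoteFetch) {
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, new LazyRemoteRef(remoteFetch));
    }
}
```

A client holding the result of LazyRemoteRef.proxyFor(Order.class, ...) cannot tell it apart from a LoadedOrder, which is both the appeal of the idea and the risk: every traversal might hide a network round trip.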

Am I smoking dodgy crack? ;-)

Posted by: Peter at July 13, 2005 5:27 PM

I believe the kiretsu list discussed a couple options along these lines: SDO (http://java.sys-con.com/read/46652.htm?CFID=59730) and CarrierWave were two approaches/solutions mentioned.

We've actually looked in to doing something like this with Hibernate-managed POJOs made available through remote APIs. Before I go any further down this path, remember that this is approach #1 in my original post - exposing your fine grained model through remote APIs - with all the potential gotchas involved.

A couple fundamentals:

* you don't want anything to be 'transparent' over the network, remote entity EJBs taught us that
* you need some mechanism for describing how much of an object graph you want

In our first pass at remoted Hibernate POJOs, we decided that whatever had been initialized by Hibernate we would send, and everything else we would not. This lets you use your Hibernate mapping files and explicit code in service implementations to determine how much of an object graph to send. It also lets you use your cascade semantics for Hibernate to determine how much of an object graph you persist if you send the remoted objects back to the server to be saved.

We wrote an AOP MethodInterceptor that handles replacing Hibernate CGLib proxy objects with their initialized POJO if initialized and with a placeholder POJO (id only) if not. This allows x-1 relationships to round trip.

We replace initialized Hibernate collections with their java.util equivalent. For our initial implementation, we decided that we would replace uninitialized Hibernate collections with a 'null' instance. This at least indicates to the client that the collection was uninitialized rather than empty.
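The collection rule can be sketched without any Hibernate dependency. In the example below, LazyList is a hypothetical stand-in for a Hibernate-managed collection and CollectionDetacher is a made-up helper; this is not the interceptor code itself, only an illustration of the rule it applies.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a persistence-managed lazy collection.
class LazyList<E> extends AbstractList<E> {
    private final List<E> loaded; // null means "never initialized"

    LazyList(List<E> loaded) {
        this.loaded = loaded;
    }

    boolean isInitialized() {
        return loaded != null;
    }

    @Override
    public E get(int index) {
        requireInitialized();
        return loaded.get(index);
    }

    @Override
    public int size() {
        requireInitialized();
        return loaded.size();
    }

    private void requireInitialized() {
        if (loaded == null) {
            throw new IllegalStateException("collection was not initialized");
        }
    }
}

// Applies the rule described above before a result crosses the wire.
final class CollectionDetacher {
    private CollectionDetacher() {
    }

    static <E> List<E> detach(List<E> value) {
        if (value instanceof LazyList) {
            LazyList<E> lazy = (LazyList<E>) value;
            // Initialized: copy into a plain java.util list.
            // Uninitialized: send null so the client can tell it apart from empty.
            return lazy.isInitialized() ? new ArrayList<>(lazy) : null;
        }
        return value; // already a plain collection, send as-is
    }
}
```

The client-side consequence is that a null collection means "not loaded" while an empty list means "loaded and empty", so callers need to treat the two differently.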

I'd like to take this one step further and replace the uninitialized collections with an object that could be initialized remotely and explicitly from the remote client - a remote equivalent of Hibernate.initialize().

I'm planning to put these thoughts together in a more organized fashion and share the code we wrote for remoting Hibernate POJOs. Coming soon...

Alon

Posted by: Alon Salant at July 20, 2005 5:24 PM

Alon,

Have you taken this work any further? I am extremely interested in it. I have done some work of my own using CGLIB to generate proxies for a POJO so that its fields can be lazy-loaded over RMI if they are suitably annotated. Next step is to try and get Hibernate's session/transaction semantics working efficiently over RMI...

Regards,

Ben Teese

Posted by: Ben Teese at November 6, 2005 1:44 AM

Alon,

Yeah, I also hate DTOs etc. and want to use my POJOs on both the server side and the client side.

When you say "We replace initialized Hibernate collections with their java.util equivalent", how do you do this? In some clever way with AOP?

Posted by: william at March 14, 2006 7:52 AM

Exactly. We wrote an AOP MethodInterceptor that massages the results from a service method that returns Hibernate-managed POJOs. My comments from July 20, 2005 say a bit more.

Alon

Posted by: Alon Salant at April 3, 2006 2:52 PM
