Using JavaSpaces

February 2007


Understanding JavaSpaces

JavaSpaces has been a bit of an unknown technology for a long time. It's one of those technologies that programmers know is out there, but haven't actually used enough to say they understand what it's for or what it can do for them.

JavaSpaces is, in very simple terms, a kind of client/server map: a grid in which data lives. The map has no distinct key; instead, data is retrieved with a form of query-by-example. It also has notification capabilities.

In concept, that's really it. If you can wrap your head around the idea that it's a map in which an entry's data determines how that entry is accessed, you've mastered most of JavaSpaces already – the rest is simple implementation.

JavaSpaces has a number of uses, especially in massively parallel applications. One example is a job producer, e.g. “calculate this” applications where the calculations vary widely in complexity; JavaSpaces allows a high-powered CPU to grab certain computations while letting lower-powered CPUs select others. If more computing power is needed, adding it is only a matter of running more clients to select tasks from the JavaSpace. Another example is queueing updates to a datastore: storing data into a JavaSpace is not only very fast (subject to network throughput, of course, which would also affect direct data storage), but also provides easy audit capabilities (through notification events) and means that the persistence engine's speed can't impact the application. (This is a common requirement for financial applications, where milliseconds count.)

Using JavaSpaces

JavaSpaces requires an implementation. It's built on JINI, so a JavaSpaces exploration should start with a download of the JINI starter kit, which includes Outrigger, Sun's implementation of JavaSpaces.

The JINI starter kit is not documented very well, and Outrigger isn't especially good either, so an easy and convenient addition to the JINI download is Blitz [1]. Blitz replaces Outrigger and provides some easy startup scripts and some diagnostic tools. A commercial implementation with the ability to distribute the Space is GigaSpaces [2]. (These are merely some candidates, not meant to be representative of the entire JINI or JavaSpaces communities.)

Running a Blitz instance is as simple as installing Blitz, going to its directory, and typing “.\blitz.bat”. This will show a decent amount of information in the log as various services start, including registration services and an HTTP server (defaulting to port 8080). When initialization is complete, you have a working JavaSpace in action. The diagnostics tool (“dashboard.bat”) shows information about memory usage, transactions, and entry counts in the space.

Entries in a JavaSpace are basically simple Java Objects that follow a few simple rules:

  • All data persisted in the space must be exposed in public fields.
  • The Entry interface must be implemented. This is a marker interface, requiring no methods to conform to the interface contract.
  • Objects must be used for the properties (i.e., no primitive fields). This makes sense especially in light of query-by-example, which uses nulls to indicate wildcards.
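
The rules above are easy to check mechanically. What follows is a minimal sketch, not part of any JavaSpaces implementation: the Entry interface here is a local stand-in for net.jini.core.entry.Entry (so the example compiles without the JINI jars), and GreetingEntry, BrokenEntry, and EntryRules are hypothetical names invented for illustration. It also checks for the public no-arg constructor that entries are expected to have.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Local stand-in for net.jini.core.entry.Entry (an assumption for this sketch).
interface Entry extends java.io.Serializable {}

// Follows the rules: public object-typed fields and a public no-arg constructor.
class GreetingEntry implements Entry {
    public String recipient;
    public Integer priority;

    public GreetingEntry() {}
}

// Breaks the rules on purpose: one primitive field and one private field.
class BrokenEntry implements Entry {
    public int priority;
    private String recipient;

    public BrokenEntry() {}
}

final class EntryRules {
    private EntryRules() {}

    // Returns a list of rule violations; an empty list means the class looks usable.
    static List<String> violations(Class<?> type) {
        List<String> problems = new ArrayList<>();
        if (!Entry.class.isAssignableFrom(type)) {
            problems.add("does not implement Entry");
        }
        for (Field f : type.getDeclaredFields()) {
            int mods = f.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods) || f.isSynthetic()) {
                continue; // not meant to be part of the stored state
            }
            if (!Modifier.isPublic(mods)) {
                problems.add(f.getName() + " is not public, so it would not be stored");
            } else if (f.getType().isPrimitive()) {
                problems.add(f.getName() + " is a primitive");
            }
        }
        try {
            if (!Modifier.isPublic(type.getDeclaredConstructor().getModifiers())) {
                problems.add("no-arg constructor is not public");
            }
        } catch (NoSuchMethodException e) {
            problems.add("missing no-arg constructor");
        }
        return problems;
    }
}
```

Calling EntryRules.violations(GreetingEntry.class) returns an empty list, while BrokenEntry is reported for both its primitive field and its private field.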

An entry can contain data and/or functionality, and can implement any interface or extend any class that conforms to the requirements for JavaSpaces. Thus, a fairly common pattern is:

  • Implement a Command interface in an Entry, along with any data required to perform the command
  • Store the Command into the JavaSpace
  • Have a computing resource retrieve the Command and execute it, storing any results back into the JavaSpace for consumption if necessary
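
The pattern above can be sketched without a space at all. In this minimal example, two BlockingQueues stand in for the JavaSpace (one for pending commands, one for finished ones); Command, SquareCommand, and CommandWorker are hypothetical names, and a real version would implement the Entry interface and use the space's write and take operations instead.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical command contract for this sketch.
interface Command {
    void execute();
}

// Carries both the input data and the logic; the result lands in a public field.
class SquareCommand implements Command, java.io.Serializable {
    public Integer input;
    public Integer result;

    public SquareCommand() {}

    SquareCommand(int input) {
        this.input = input;
    }

    @Override
    public void execute() {
        result = input * input;
    }
}

final class CommandWorker {
    private CommandWorker() {}

    // Takes one pending command, runs it, and stores it with its result.
    // Returns false if nothing arrived within the timeout.
    static boolean processOne(BlockingQueue<Command> pending,
                              BlockingQueue<Command> finished,
                              long timeoutMillis) {
        try {
            Command command = pending.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (command == null) {
                return false;
            }
            command.execute();
            finished.add(command);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The producer never needs to know which worker ran the command; it only looks for the finished object afterwards.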

Connecting to a JavaSpace is fairly simple and flexible. The Lookup.java class provided in the Blitz examples connects to every service registrar it can find, which is acceptable development behavior but not likely to be good in production; however, changing this is fairly easy.

Using the Lookup class and searching for an implementation of JavaSpace05.class (suggested, if it's available) will return a reference to the first JavaSpace it finds, provided one is available [3].

There are five basic operations associated with JavaSpaces. They are:

  1. Read an entry matching a template, leaving it in the JavaSpace
  2. Take an entry matching a template from the space, removing it from the JavaSpace
  3. Write an entry into the JavaSpace
  4. Register a callback for events in the JavaSpace
  5. Issue an event in the JavaSpace

Each of these can be associated with a Transaction. The read and take operations can block while waiting for a matching entry; the “readIfExists” variant blocks only if there's an entry that might become available to satisfy the operation, and otherwise returns immediately [4]. (“takeIfExists” is also in the API.)
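
To make the difference between read, take, and write concrete, here is a minimal in-memory sketch. TinySpace is a hypothetical class, not the JavaSpace API: it has no transactions, leases, or notifications, and it uses a Predicate as a stand-in for a template. It only illustrates that read leaves the entry in place, take removes it, and both wait up to a timeout for a match.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical in-memory stand-in for a space; not the JavaSpace interface.
final class TinySpace {
    private final List<Object> entries = new ArrayList<>();

    // Stores an entry and wakes up any waiting readers or takers.
    synchronized void write(Object entry) {
        entries.add(entry);
        notifyAll();
    }

    // Returns a matching entry, leaving it in the space; null on timeout.
    synchronized Object read(Predicate<Object> template, long timeoutMillis) {
        return find(template, timeoutMillis, false);
    }

    // Returns a matching entry and removes it from the space; null on timeout.
    synchronized Object take(Predicate<Object> template, long timeoutMillis) {
        return find(template, timeoutMillis, true);
    }

    synchronized int size() {
        return entries.size();
    }

    // Callers hold the lock on 'this', so wait() is legal here.
    private Object find(Predicate<Object> template, long timeoutMillis, boolean remove) {
        long now = System.currentTimeMillis();
        long deadline = timeoutMillis > Long.MAX_VALUE - now ? Long.MAX_VALUE : now + timeoutMillis;
        while (true) {
            for (Iterator<Object> it = entries.iterator(); it.hasNext();) {
                Object candidate = it.next();
                if (template.test(candidate)) {
                    if (remove) {
                        it.remove();
                    }
                    return candidate;
                }
            }
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return null; // nothing matched in time
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
    }
}
```

A consumer that calls take with a long timeout simply sleeps until some producer calls write, which is the behavior the compute server later in this article relies on.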

In addition, if JavaSpace05 is used, the “take” operation can populate a list of entries matching the template, for bulk processing of entries in the JavaSpace, as can the “write” operation (for writing sets of data). JavaSpace05 also includes a way to get references to sets of information (the “contents” method.)

Templates in JavaSpaces are instances of classes implementing the Entry interface (i.e., they're entries) that use “null” as a wildcard. Thus, if an Entry implementation has string properties A, B, and C, a template might populate A with “data” while leaving B and C null; a take operation would retrieve a single entry of that class that had “data” in the “A” property. (The list form of take() would return all entries that had “data” in the “A” property [5].)
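
The null-as-wildcard rule can be illustrated with a few lines of reflection. This is a minimal sketch under stated assumptions, not how any particular implementation does it: Abc and TemplateMatch are hypothetical names, and fields are compared with equals(), whereas a real space, as far as I understand the specification, compares the serialized form of each field.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Example entry with three String properties, as in the text above.
class Abc {
    public String a;
    public String b;
    public String c;

    public Abc() {}

    Abc(String a, String b, String c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }
}

final class TemplateMatch {
    private TemplateMatch() {}

    // True when the candidate is of the template's class (or a subclass) and every
    // non-null public field of the template equals the same field of the candidate.
    static boolean matches(Object template, Object candidate) {
        if (!template.getClass().isInstance(candidate)) {
            return false;
        }
        try {
            for (Field f : template.getClass().getFields()) {
                if (Modifier.isStatic(f.getModifiers())) {
                    continue;
                }
                Object wanted = f.get(template);
                if (wanted != null && !wanted.equals(f.get(candidate))) {
                    return false; // a populated template field did not match
                }
            }
            return true; // null template fields act as wildcards
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sketch of the "list form": every stored entry that matches the template.
    static <T> List<T> allMatching(T template, List<? extends T> entries) {
        List<T> found = new ArrayList<>();
        for (T entry : entries) {
            if (matches(template, entry)) {
                found.add(entry);
            }
        }
        return found;
    }
}
```

A template built with new Abc("data", null, null) matches any Abc whose a field is “data”, whatever b and c hold; a template with all fields null matches every Abc.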

Believe it or not, this summarizes JavaSpaces. JavaSpaces can legitimately be used as a datastore (i.e., sets of heterogeneous or homogeneous Entries), as a queue (i.e., lists of Entries containing data, consumed by external processing agents), and as a distributed processing facility (i.e., Entries containing data and processing capabilities, offloaded to external computing devices).

An Actual Application

Discussing JavaSpaces is all well and good, but that leaves it still in the realm of theory and not practice. Let's change that, by implementing a compute server. Our compute server will be functional, but not complete – it won't implement security, computational limits, or account tracking, but will allow us to signal processes and gather results.

We'll use GigaSpaces for the implementation, although any JavaSpaces implementation should work.

Our requirements are that the external clients will provide a subclass of our ComputeTask class. Our ComputeTask class will contain a status and a UUID for identification. The internal clients – which will be executing the processes – will call a method in the ComputeTask.

The key value proposition of JavaSpaces in this situation is that if the internal client process becomes overloaded for any reason, it will be very, very simple to connect another “internal client” to the JavaSpace to double computational power; each client would add nearly linear scalability to the grid. It's possible to get the same kind of linear capabilities from other technologies, but the nearly transparent nature of the JavaSpace model along with the simplicity of the clients serves as an advantage.
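
The sketch below illustrates why adding workers is so cheap in this model. It is a deliberately simplified, single-process stand-in: a BlockingQueue plays the role of the JavaSpace, threads play the role of internal clients, and WorkerPoolSketch is a hypothetical name. It says nothing about actual speedup; it only shows that the producer side does not change when the number of consumers does.

```java
import java.util.concurrent.BlockingQueue;

final class WorkerPoolSketch {
    private WorkerPoolSketch() {}

    // Starts 'workers' identical consumers against one shared queue and
    // returns how many tasks each of them handled once the queue is empty.
    static int[] drain(BlockingQueue<Runnable> space, int workers) {
        int[] handled = new int[workers];
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                Runnable task;
                // poll() stands in for a take with no waiting: null means nothing is left.
                while ((task = space.poll()) != null) {
                    task.run();
                    handled[id]++; // each worker only touches its own slot
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return handled;
    }
}
```

Going from one worker to four is a change to a single argument; the code that fills the queue is untouched, which mirrors connecting another internal client to the space.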

The first step is to download GigaSpaces' Community Edition from http://gigaspaces.com/. This requires registration; if this is a concern for you, feel free to substitute Blitz, Outrigger from the JSTK, or any other compliant JavaSpace implementation, any of which provide the JavaSpaces API features we'll be using. The compute server shown here has no dependency on any specific JavaSpaces implementation.

The next step is to consider our basic data model. The entities are fairly simple: a ComputeTask class (which does nothing in and of itself, but provides UUID and status for descendants), a SubmitTask class (and a parent) which provide some handy utility methods for adding ComputeTasks to the JavaSpace, and a multithreaded ComputeClient, which actually does the work of executing the ComputeTask.

It's worth reiterating that the practice followed in this engine is very much insecure. One of the issues not being discussed in this article is the security manager and policies; if you'd like, please read “Discovering a Java Application's Security Requirements” [6] for one method of determining required security policy entries.

In addition, the “multithreaded client” is very primitive, as is the ComputeTask itself. The codebase monitors nothing, doesn't mark a task as “running” (instead it removes the task from the space entirely), and offers no status information. Only one hundred tasks will be run (although it'd be trivial to change this to an infinite number of tasks). That said, this task server does work, and is a workable starting point for creating a more capable task server. (Incidentally, while this code is entirely written from scratch, except for Dan Creswell's service locator classes, there are other similar implementations of the same kind of structure elsewhere.)

Here's the run() method of a ComputeClient worker thread. It initializes a counter (to limit the number of tasks per thread to 20, the TASKS_PER_THREAD constant), then creates a template to use to look for available tasks (i.e., tasks whose status is STATUS_NEWTASK). Then it waits up to 5000 ms (SCAN_TIME_MILLIS) for a matching task; if it finds one, it removes it from the space, executes the task (which is expected to populate a “result” field in the ComputeTask) and stores it back into the JavaSpace for retrieval by any external clients. The code is fairly straightforward, if not very robust. (Adding transaction support wouldn't involve much, actually.)

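public void run() {
    int tasksRun = 0;
    // Template: a null uuid matches any task; the status limits matches to new tasks.
    ComputeTaskImpl template = new ComputeTaskImpl();
    template.setUuid(null);
    template.setStatus(ComputeTask.STATUS_NEWTASK);

    while (tasksRun < TASKS_PER_THREAD) {
        try {
            // Remove a matching task from the space, waiting up to SCAN_TIME_MILLIS.
            // The null argument means no transaction is used; take() returns null on timeout.
            ComputeTask task = (ComputeTask) getSpace().take(template,
                                                             null,
                                                             SCAN_TIME_MILLIS);
            if (task != null) {
                task.execute();
                task.setStatus(ComputeTask.STATUS_FINISHEDTASK);
                // Write the finished task back for the submitter to collect.
                submit(task, null, Lease.FOREVER);
                tasksRun++;
            }
        } catch (Exception e) {
            // Log and keep polling; a failed task is not retried.
            log.log(Level.SEVERE, e.getMessage(), e);
        }
    }
}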

The actual task itself is very, very simple. Here's an addition task, for example. It's vast overkill to use a JavaSpace for simple addition, but it's a proof of concept – more complex tasks would follow the same pattern.

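import java.util.logging.Logger;

public class SimpleAdditionTask extends ComputeTaskImpl {
    // Private and transient: not a public field, so it is not stored in the space.
    private transient Logger log=Logger.getLogger(this.getClass().getName());
    // Public object-typed fields are the data persisted with the entry.
    public Integer firstNumber=null;
    public Integer secondNumber=null;

    // fulfills Entry requirements to have public no-arg constructor
    public SimpleAdditionTask() {
    }

    public SimpleAdditionTask(int f, int s) {
        firstNumber=f;
        secondNumber=s;
    }

    // Called by the ComputeClient; stores the sum in the inherited "result" field.
    public void execute() {
        result=firstNumber+secondNumber;
        log.info("SimpleAddition has run! (result="+result+")");
    }
}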

The execute() method is the one of most interest – note that it simply executes the task (the addition) and stores the result into the “result” property which is contained in ComputeTaskImpl. (Incidentally, the result here is an Integer, not an int; the JavaSpace API mandates that persisted fields in a JavaSpace are public, Serializable objects.)

Source code for the entire server is available on TheServerSide.com.

The last step is to run the actual application. Start GigaSpaces, then set your classpath to include jsk-platform.jar and jsk-lib.jar. Execute “java -Djava.security.policy=policy.all com.tss.javaspaces.compute.internal.ComputeClient”; it will block the command shell until it completes. The security policy in “policy.all” removes all security restrictions, which is acceptable only for experimentation.

With the same classpath and JVM option, execute the com.tss.javaspaces.compute.util.SubmitTask class; it has a main() method that creates a single SimpleAddition task, submits it, then watches for the result for the task it submitted (which it looks for by UUID and status).

You should see the ComputeClient class show the SimpleAddition's log statement, and then the SubmitTask should get the result itself. You now have a simple, but working, Compute Server.

Hopefully this has explained JavaSpaces clearly enough that you can see some potential applications for it, and feel confident that it can be used efficiently and properly.

Footnotes

[1] Blitz can be found at http://www.dancres.org/blitz . Blitz was reviewed on TheServerSide.com at http://www.theserverside.com/reviews/thread.tss?thread_id=42164

[2] GigaSpaces is a complete JavaSpaces implementation, and has Enterprise, Caching, and Community Editions, with the Community being free; see http://gigaspaces.com/. GigaSpaces has a number of convenient and excellent extensions and tools for working with JavaSpaces, and their documentation should be consulted for specifics. A Tech Talk with Nati Shalom, CTO of GigaSpaces, is available at http://www.theserverside.com/tt/talks/library.tss#shalom.

[3] If not, it will block indefinitely, yet another testament to its lack of suitability for production environments. A far better set of lookup methods can be found on Dan Creswell's page on “JINI Service Lookup”: http://www.dancres.org/cottage/service_lookup.html . We'll be using these in our example code.

[4] An entry might exist but not be available if it's involved in a Transaction, for example. Thus, readIfExists would block while waiting for the other Transaction to complete, or return immediately otherwise.

[5] This is a feature of the JavaSpace05 class, which is why its use is recommended. List management in the older JavaSpace class is far more difficult. Note that specific implementations of JavaSpaces may have their own extensions of the JavaSpace class that provide other features.

