Discussion:
[jruby-dev] Serialization and Persistence
Alan McKean
2007-07-02 22:43:20 UTC
Since the lack of Java serialization of JRuby objects stops us dead
in our tracks when trying to hook up our persistence engine, I am
interested in either getting someone on this end to work on it or
jumping in myself. In either case, I need some background on the
JRuby runtime architecture and some guidance on particular issues.
The issues are about how to detach an object from its runtime
elements and how to restore them when the object gets reloaded into
memory:

1) When we first tried saving a JRuby object to our database, we saw
it drag along a gaggle of runtime objects. Given that it might be
loaded into a different VM when it is brought in from the database,
is reconnecting the object to a particular runtime important? If so,
is there a way of determining which of the available runtimes would
be best to connect it to?

2) Detaching an object from its 'runtime' variable and making the
'metaclass' variable transient lets us store the object in our
database without dragging much else along. But we need to reconnect
things when the object is reloaded into memory. Is there a canonical
name for the metaclass that we could store in the database along with
the instance? If not, what information is available for reconnecting?
We persist type information in our Java product by storing the
fully-qualified name of the class with the object, then lazily loading and
initializing the connection (using the name) when we reload the
object to memory. Will this work in JRuby?

If someone has thought through a strategy for deserializing a JRuby
object and restoring its connections to its runtime, I would love to
hear about it.

Thanks,
Alan McKean


---------------------------------------------------------------------
To unsubscribe from this list please visit:

http://xircles.codehaus.org/manage_email
Charles Oliver Nutter
2007-07-03 07:33:25 UTC
I've been looking into this a bit tonight. This email represents me
rambling.

Marking metaclass and finalizer as transient are no-brainers. I'm going
to go ahead and commit that.

I'm going to see what would be needed to remove getRuntime everywhere
it's needed. It would be a big job...be back in a few minutes...

...ok I'm back. I think it's doable. Here's more rambling thoughts.

The runtime connection is used for a few things:

1. to construct other objects

This is mostly a self-fulfilling prophecy. Objects require a runtime
when they're created, so all objects need runtime available to create
objects. If we break that chain, a number of places that depend on
runtime disappear.

2. to locate classes in order to construct objects

This is a little harder to eliminate. In order to construct a Ruby
"String" object, you need to have access to the "String" metaclass. That
means having access to the place where the "String" metaclass is stored,
currently in the runtime. Again, this is largely self-fulfilling; you
need access to a metaclass to construct an object, so you need to locate
the metaclass, and since the metaclasses are currently rooted in the
runtime, you need the runtime. But the runtime dependency is largely
peripheral to the use case.

3. to access runtime-global and thread-local data at execution time

This is probably the hardest to eliminate. Every thread Ruby code
creates or encounters is associated with a ThreadContext, which contains
extra thread-local state needed for executing Ruby code. Every external
thread that touches a given runtime is "adopted" and given a
ThreadContext and a Ruby "Thread" avatar to represent it. So a given
Java thread may have many Ruby "Thread" and "ThreadContext" objects
associated with it, one per runtime it has touched. This allows us to
share threads across runtimes, rather than having a given thread bound to
a given runtime exclusively, as in many other JVM languages. But it also
requires that we locate the runtime, and therefore the ThreadContext, in
a different way. Therefore, we have the runtime dependency.
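
To make the "one ThreadContext per runtime per Java thread" arrangement concrete, here is a minimal sketch in plain Java. SketchRuntime and SketchThreadContext are hypothetical stand-ins, not JRuby's actual Ruby and ThreadContext classes, and keeping one ThreadLocal inside each runtime is an assumption made for illustration only.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Illustration only: hypothetical stand-ins, not JRuby's real classes. */
final class PerRuntimeThreadContextSketch {

    /** Thread-local execution state that belongs to exactly one runtime. */
    public static final class SketchThreadContext {
        public final SketchRuntime runtime;
        public final Thread javaThread;

        SketchThreadContext(SketchRuntime runtime, Thread javaThread) {
            this.runtime = runtime;
            this.javaThread = javaThread;
        }
    }

    public static final class SketchRuntime {
        // One ThreadLocal per runtime: a Java thread is "adopted" the first
        // time it asks this runtime for its context.
        private final ThreadLocal<SketchThreadContext> contexts =
            ThreadLocal.withInitial(
                () -> new SketchThreadContext(this, Thread.currentThread()));

        public SketchThreadContext getCurrentContext() {
            return contexts.get();
        }
    }

    /** Returns the context that a freshly started thread gets from the runtime. */
    public static SketchThreadContext contextSeenByNewThread(SketchRuntime runtime) {
        AtomicReference<SketchThreadContext> seen = new AtomicReference<>();
        Thread worker = new Thread(() -> seen.set(runtime.getCurrentContext()));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }
}
```

Note that the lookup starts from a runtime reference in hand. That is the dependency described above: without the runtime, the code has no way to pick which of a thread's several contexts applies.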

This essentially sums up all the major reasons why we have so many
dependencies in code on access to a runtime object. And ultimately,
requiring access to a runtime object obliterates the possibility of
third-party manipulation and transport of Ruby objects.

So to summarize, the three actual reasons we depend on runtime being
present are as follows:

1. to access and maintain types associated with a specific ruby worldspace
2. to access and maintain state associated with a specific ruby worldspace
3. to provide execution state and primitives for code running in a
specific ruby worldspace

Now let's rewrite the list by substituting in a different concept for
our top-level ruby worldspace:

1. to access and maintain types associated with a specific ClassLoader
2. to access and maintain state associated with a specific ClassLoader
3. to provide execution state and primitives for code running in a
specific ClassLoader

So let's examine how we'd solve these issues.

First off, IRubyObject.getRuntime(). Let's assume that the classloader
that loads the Ruby class is our chosen, ultimate worldspace:

public Ruby getRuntime() {
    JRubyClassLoader cl = (JRubyClassLoader) Ruby.class.getClassLoader();
    // Return the runtime associated with the classloader that loaded Ruby.
    return cl.getRuntime();
}

Everything else largely falls out of this. Starting up a new instance of
JRuby largely becomes the act of constructing the top-level classloader
in which it will live and telling it to "go".

JRubyClassLoader cl = new JRubyClassLoader(..., properties);
cl.evalScript("puts 'hello'", "(eval)");

Everything lives underneath the classloader, and since all classes have
access to that classloader, all code can retrieve the runtime associated
with it.

Would something like this work? The view from inside the classloader
seems pretty reasonable...we already have this root context and
partitioning as part of Java's classloader support, and it seems fairly
natural to use it. But I'm not well-enough versed in Java serialization
to know if this will solve our deserialization issues. It may require
you to have more control over the object stream...but of course if you
have control over the object stream, you could also just have it ask a
specific runtime to unmarshal objects, avoiding the issue completely.

Thoughts? More ideas?

- Charlie


Charles Oliver Nutter
2007-07-03 07:42:20 UTC
Post by Charles Oliver Nutter
Thoughts? More ideas?
More ideas!

A different approach, less brute-force, but requires control over the
deserialization process (perhaps that's not too much to ask?)...

RubyObject.java:

public void readExternal(ObjectInput in) throws IOException,
        ClassNotFoundException {
    RubyObjectInputStream rois = (RubyObjectInputStream) in;

    this.flags = in.readInt();
    this.instanceVariables = (Map) in.readObject();
    // FIXME: is it safe to assume this must always be a RubyClass?
    this.metaClass =
        (RubyClass) rois.getRuntime().getClassFromPath(in.readUTF());
}

RubyObjectInputStream.java:

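import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;

public class RubyObjectInputStream extends ObjectInputStream {
    private Ruby runtime;

    // Wrap a real InputStream: ObjectInputStream's no-arg constructor is
    // only for subclasses that reimplement readObject() themselves.
    public RubyObjectInputStream(InputStream in, Ruby runtime)
            throws IOException {
        super(in);
        this.runtime = runtime;
    }

    public Ruby getRuntime() {
        return runtime;
    }
}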

You could also make runtime settable, and use a single inputstream to
read objects into multiple runtimes in sequence.

- Charlie

Alan McKean
2007-07-04 01:50:34 UTC
Thanks for taking the time to think this through. Let me make sure I
understand. What you are proposing would decouple the runtime from
the objects but still make it accessible via the ClassLoader. That
seems to be a major part of the problem, but doesn't it still leave
the metaclass hierarchy coupled to the object? If we persist (or
serialize) the object, we would want to replace the metaclass
reference with enough information to be able to restore the metaclass
reference when the object is paged back in from disk (or
deserialized). I was thinking that we had to decouple the runtime so
that we wouldn't be persisting things like threads and decouple the
metaclass so that we wouldn't be storing metaclass hierarchy and all
of the method dictionaries. Of course, storing the classes and
methods would open the door to having a distributed, transactional
IDE, so maybe we would want to flush that to the persistent store,
too. But it seems like that would not be what we want for serialization.

Comments?
Alan McKean
2007-07-04 01:55:12 UTC
Thanks for taking the time to think this through. Let me make sure I
understand. What you are proposing would decouple the runtime from
the objects but still make it accessible via the ClassLoader. That
seems to solve a major part of the problem, but doesn't it still
leave the metaclass hierarchy coupled to the object? If we persist
(or serialize) the object, we would want to replace the metaclass
reference with enough information to be able to restore the metaclass
reference when the object is paged back in from disk (or
deserialized). I was thinking that we had to decouple the runtime so
that we wouldn't be persisting things like threads AND decouple the
metaclass so that we wouldn't be storing metaclass hierarchy and all
of the method dictionaries. Of course, storing the classes and
methods would open the door for a shared development environment, so
maybe we would want to flush that to the persistent store, too. But
it seems like that would not be what we want for serialization.


Comments?
Charles Oliver Nutter
2007-07-04 04:21:26 UTC
I'm leaning toward the second example now, where you control the
deserialization of the objects and can provide a runtime at that point.
In general I'm not sure it's going to be practically possible to
decouple the object from runtime completely, because they still need a
place to get at runtime-global state, like classes, global variables,
and thread contexts. If we forced a JVM-global or classloader-global
place to find those things, then we could eliminate the runtime
dependency, but we'd also make it far more difficult or even impossible
to have multiple separate runtime worldspaces in the same JVM...

- Charlie

Jochen Theodorou
2007-07-04 08:05:34 UTC
sorry for commenting here, I know it is not my project and I don't know
the architecture enough, but ;)

I mean given that the object you deserialize has a transient field
giving access to the JRuby runtime, then all that has to be solved is
setting this field correctly, right? If yes, then who is deserializing?
Isn't a runtime doing this? And wouldn't that mean I have a runtime I
can attach the object to? That would be independent of classloaders and
globals. I am sure I am missing something important here.

bye blackdrag

Charles Oliver Nutter
2007-07-04 09:25:52 UTC
Post by Jochen Theodorou
sorry for commenting here, I know it is not my project and I don't know
the architecture enough, but ;)
Not at all, feel free to comment at will!
Post by Jochen Theodorou
I mean given that the object you deserialize has a transient field
giving access to the JRuby runtime, then all that has to be solved is
setting this field correctly, right? If yes, then who is deserializing?
Isn't a runtime doing this? And wouldn't that mean I have a runtime I
can attach the object to? That would be independent of classloaders and
globals. I am sure I am missing something important here.
In some cases, yes, there will be a runtime "pulling" objects off the
wire. But generally if you've got a runtime on both sides you'd just use
Ruby's marshaling, which fits a bit better into Ruby-land.

The problems come up when you're not in control of serialization. One
example is Rails apps running in JRuby but storing their session data in
a Java webapp session. In order to support clustering, sessions would be
serialized to other servers without the runtimes knowing about it...so
there would be no way on the other side to reconstitute the objects as
they're pulled off the wire.

- Charlie

Jochen Theodorou
2007-07-04 10:25:15 UTC
Post by Charles Oliver Nutter
Post by Jochen Theodorou
sorry for commenting here, I know it is not my project and I don't
know the architecture enough, but ;)
Not at all, feel free to comment at will!
ok ;)
Post by Charles Oliver Nutter
Post by Jochen Theodorou
I mean given that the object you deserialize has a transient field
giving access to the JRuby runtime, then all that has to be solved is
setting this field correctly, right? If yes, then who is
deserializing? Isn't a runtime doing this? And wouldn't that mean I
have a runtime I can attach the object to? That would be independent
of classloaders and globals. I am sure I am missing something
important here.
In some cases, yes, there will be a runtime "pulling" objects off the
wire. But generally if you've got a runtime on both sides you'd just use
Ruby's marshaling, which fits a bit better into Ruby-land.
that's no problem then, ok
Post by Charles Oliver Nutter
The problems come up when you're not in control of serialization. One
example is Rails apps running in JRuby but storing their session data in
a Java webapp session. In order to support clustering, sessions would be
serialized to other servers without the runtimes knowing about it.. so
there would be no way on the other side to reconstitute the objects as
they're pulled off the wire.
you mean the new object won't be connected to the runtime, because it
was deserialized by Java? I see.. hmm... my next thought is a bit
complicated, but well... Let us assume all Runtimes register themselves
at a global map using a unique id. Then, when deserializing the object,
the runtime is created using this id and registered at the new computer
or attached to an already existing id. java.util.UUID might help here,
but I don't know if it will work. I never worked with UUID.
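
A rough sketch of this registry idea, using only the standard library, might look like the following. RuntimeRegistrySketch, SketchRuntime, SketchObject and lookupOrCreate are hypothetical names made up for illustration; nothing here is existing JRuby API. The sketch also assumes that a runtime created on demand from a bare id would be usable, which a real runtime may not be.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Illustration only: hypothetical stand-ins, not JRuby's real classes. */
final class RuntimeRegistrySketch {

    /** JVM-global map from runtime id to runtime. */
    private static final Map<UUID, SketchRuntime> RUNTIMES = new ConcurrentHashMap<>();

    public static final class SketchRuntime {
        public final UUID id;

        private SketchRuntime(UUID id) { this.id = id; }
    }

    /** Attaches to the runtime registered under the id, creating it if absent. */
    public static SketchRuntime lookupOrCreate(UUID id) {
        return RUNTIMES.computeIfAbsent(id, SketchRuntime::new);
    }

    public static final class SketchObject implements Serializable {
        private static final long serialVersionUID = 1L;

        private final UUID runtimeId;            // travels with the object
        private transient SketchRuntime runtime; // never serialized
        public final String payload;

        public SketchObject(SketchRuntime runtime, String payload) {
            this.runtime = runtime;
            this.runtimeId = runtime.id;
            this.payload = payload;
        }

        public SketchRuntime getRuntime() { return runtime; }

        // Called by plain Java deserialization, so no custom stream is needed.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            runtime = lookupOrCreate(runtimeId);
        }
    }

    /** Serializes and deserializes with ordinary object streams. */
    public static SketchObject roundTrip(SketchObject source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(source);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SketchObject) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The appeal is that the reattachment happens inside readObject, so it would also run when some container deserializes the object behind our backs.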

bye blackdrag


Charles Oliver Nutter
2007-07-05 05:19:27 UTC
Post by Jochen Theodorou
you mean the new object won't be connected to the runtime, because it
was deserialized by Java? I see.. hmm... my next thought is a bit
complicated, but well... Let us assume all Runtimes register themselves
at a global map using a unique id. Then, when deserializing the object,
the runtime is created using this id and registered at the new computer
or attached to an already existing id. java.util.UUID might help here,
but I don't know if it will work. I never worked with UUID.
Usually the case is that JVM 1 and JVM 2 start up independently and
create their own runtimes before any of the persistence magic begins. So
when you're interested in deserializing objects on the other side, you
need some way to reattach them to a given JVM.

I hate to steer away from this specific example, but I'd like to hear
more from the Gemstone guys about the various ideas thus far. Since it
sounds like you control the persistence endpoints, is it such a big deal
to make a runtime available on the other side?

If we decided to continue down the path of detaching objects completely
from the runtime, do you have any proposals for resolving the
requirements I set forth earlier?

- Charlie

Alan McKean
2007-07-06 20:59:54 UTC
Permalink
After talking with Charlie about an approach, we decided that this is
what we would try first. It would require an additional 4-byte field
in every object. Not the optimal approach, but good enough for a
proof-of-concept.

1) Mark 'metaclass' and 'finalizer' transient. This should allow us
to persist an object without dragging the runtime classes along.
2) Add a String 'metaClassName' field to RubyObject. Initialize it in
setMetaClass() when it is first called.
3) Modify getMetaClass() to lazily initialize the metaclass variables,
using 'metaClassName', when the getter is first called.
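
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: SketchClass, SketchRuntime and SketchObject are hypothetical stand-ins, not the real RubyClass, Ruby or RubyObject; the class table is a plain static map; and the 'finalizer' field is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for RubyClass; deliberately NOT Serializable.
final class SketchClass {
    final String name;
    SketchClass(String name) { this.name = name; }
}

// Hypothetical stand-in for the runtime's table of named classes.
final class SketchRuntime {
    private static final Map<String, SketchClass> CLASSES = new HashMap<>();

    static synchronized SketchClass define(String name) {
        return CLASSES.computeIfAbsent(name, SketchClass::new);
    }

    static synchronized SketchClass lookup(String name) {
        SketchClass c = CLASSES.get(name);
        if (c == null) throw new IllegalStateException("unknown class: " + name);
        return c;
    }
}

// Hypothetical stand-in for RubyObject.
class SketchObject implements Serializable {
    private static final long serialVersionUID = 1L;

    private transient SketchClass metaClass; // step 1: never written to the stream
    private String metaClassName;            // step 2: the extra per-object field

    void setMetaClass(SketchClass mc) {
        metaClass = mc;
        if (metaClassName == null) {
            metaClassName = mc.name; // recorded on the first call only
        }
    }

    SketchClass getMetaClass() {
        if (metaClass == null && metaClassName != null) {
            metaClass = SketchRuntime.lookup(metaClassName); // step 3: lazy reconnect
        }
        return metaClass;
    }

    // Serialize and deserialize in memory, standing in for the database.
    static SketchObject roundTrip(SketchObject o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SketchObject) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because SketchClass is not Serializable, the round trip would fail if the metaclass field were not transient, so the sketch also demonstrates that nothing from the runtime is dragged along. It assumes the name alone is enough to find the class again, which may not hold for singleton or anonymous classes.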


Here's an alternative approach that I would like some feedback on.
It's more complicated but would not require any additional instance
fields. It would add a field to the metaclass instead.

1) Mark every field in the RubyClass transient except for one: a new
'metaClassName' field. Initialize it when the metaclass is created.
When saving an instance, save the metaclass in the database, too.
2) When the object is reloaded from the database, fault the persisted
metaclass into memory on the first instance method invocation.
3) Use lazy initializers on the metaclass fields to reconnect to the
original metaclass's field references. This would mean that there
would be two identical classes: the new one hooked up to the
persisted objects (it would be the start of method lookup for
persisted objects that had been reloaded) and the original class that
would continue to be used to create new instances. Not-yet-persisted
objects would use the original.

Ideas? Comments?
Charles Oliver Nutter
2007-07-06 22:13:42 UTC
Permalink
Post by Alan McKean
After talking with Charlie about an approach, we decided that this is
what we would try first. It would require an additional 4-byte field in
every object. Not the optimal approach, but good enough for a
proof-of-concept.
1) Mark 'metaclass' and 'finalizer' transient. This should allow us to
persist an object without dragging the runtime classes along.
2) Add a String 'metaClassName' field to RubyObject. Initialize it in
setMetaClass() when it is first called.
3) Modify getMetaClass() to lazily initialize the metaclass variables,
using 'metaClassName', when the getter is first called.
Here's an alternative approach that I would like some feedback on. It's
more complicated but would not require any additional instance fields.
It would add a field to the metaclass instead.
1) Mark every field in the RubyClass transient except for one: a new
'metaClassName' field. Initialize it when the metaclass is created. When
saving an instance, save the metaclass in the database, too.
2) When the object is reloaded from the database, fault the persisted
metaclass into memory on the first instance method invocation.
3) Use lazy initializers on the metaclass fields to reconnect to the
original metaclass's field references. This would mean that there would
be two identical classes: the new one hooked up to the persisted objects
(it would be the start of method lookup for persisted objects that had
been reloaded) and the original class that would continue to be used to
create new instances. Not-yet-persisted objects would use the original.
Ideas? Comments?
My only concern about the second idea is metaclass object identity. If we
have two physical objects that represent the same class, but aren't the
same object in Java, there could be problems somewhere. I just don't
know where, but it makes my spider-sense tingle.
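
To make the identity concern concrete, here is a small Java sketch. All class names are hypothetical and none of this is JRuby code. PlainMetaClass shows the problem: a deserialized copy is a different Java object, so any == comparison against the original fails. CanonicalMetaClass shows one standard Java serialization technique, readResolve, that can map a copy back to a single instance per name. Nobody in this thread has proposed readResolve; it is included only as a possible mitigation, assuming a name-keyed table is available on the reading side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical persisted metaclass that keeps only its name, with no canonicalization.
final class PlainMetaClass implements Serializable {
    private static final long serialVersionUID = 1L;
    final String metaClassName;
    PlainMetaClass(String metaClassName) { this.metaClassName = metaClassName; }
}

// Same idea, but deserialization is redirected to one canonical instance per name.
final class CanonicalMetaClass implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final Map<String, CanonicalMetaClass> BY_NAME = new ConcurrentHashMap<>();
    final String metaClassName;

    private CanonicalMetaClass(String metaClassName) { this.metaClassName = metaClassName; }

    static CanonicalMetaClass of(String name) {
        return BY_NAME.computeIfAbsent(name, CanonicalMetaClass::new);
    }

    // Standard Java serialization hook: the returned object replaces the fresh copy.
    private Object readResolve() {
        return of(metaClassName);
    }
}

// In-memory serialize/deserialize helper standing in for the database.
final class Wire {
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With these definitions, Wire.roundTrip(p) == p is false for a PlainMetaClass p, while Wire.roundTrip(c) == c is true for a CanonicalMetaClass c obtained from CanonicalMetaClass.of(...).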

- Charlie
