Merge ruby-atomic #102

jdantonio · 2014-05-30T11:27:05Z

FOR REVIEW -- DO NOT MERGE

This is an attempt to merge the ruby-atomic gem into concurrent-ruby. I used git subtree to merge the gem with complete history. I then began moving files and making updates. The C extensions work and all tests pass under MRI. The Rubinius version works and the tests pass under Rubinius . There is something wrong with the Java extensions and the native class isn't loading properly under JRuby. The tests for Concurrent::Atomic do not pass under JRuby.

This update moves the Atomic class into the Concurrent module.

Before we merge we should:

Decide if this is is the direction we want to go with this merge
Ask @headius if he is OK with us moving the class into the Concurrent module
Fix the JRuby bug

Thoughts?

….9 (Thread.exclusive is 1.9-only)

…o 0.0.2.

…rence to avoid problems with JI re-coercing the object and breaking identity.

…nalReference#compare_and_set.

…ly atomic

Native CAS for C Rubies

Just added some benchmarks

it may need a little bit of tweaking but the base is there

Parallel benchmark added

* Add compare_and_swap (plus aliases get, set, get_and_set, and compare_and_set) for an exception-free CAS * Customize update to use @ref.compare_and_set rather than try_update, to avoid exception overhead * Make try_update use a pre-allocated backtrace for its ConcurrentUpdateError to reduce exception overhead Also tweaks to benchmark: * Use a global boolean to start all threads; remaining indeterminism is up to OS thread scheduler. This is better than starting each thread in turn, since first thread may run for some time in isolation. * Fix block variable scoping bug

… build native ext it will never use.

* bench_atomic_1 is used to measure 1 combination * graph_atomic is used to run series, using bench_atomic_1 The idea is to calculate series by varying one parameter which gives much more meaninful results.

pitr-ch · 2014-06-15T17:50:24Z

Thanks for the clarification, now I see more clearly why atomic_attr is needed. But it raises questions for me: How to safely create an instance of an object which than can be safely used?

This example is wrong I believe:

class C
  def initialize
    @mutex = Mutex.new # it's not safe to store a Mutex in ivar on JRuby
    @mutex.lock
    @a = 'a'
  ensure
    @a.unlock  
  end

  def a
    @mutex.lock
    @a
  ensure
    @mutex.unlock
  end
end

c = C.new
Thread.new do
  c.a # this may read @mutex with value nil, correct?
end

This example should probably work on JRuby:

class C
  # taken from https://github.com/ruby-concurrency/thread_safe/blob/21ea387dce966f3a6a38752ab6e199bb9a2d710b/lib/thread_safe/util/cheap_lockable.rb#L65-L67
  require 'jruby'
  def synchronize
    JRuby.reference0(self).synchronized { yield }
  end

  def initialize
    synchronize do
      @a = 'a'
    end
  end

  def a
    synchronize do
      @a
    end
  end
end

c = C.new
Thread.new do
  c.a # reads correct value
end

with fallback for CRuby

  def synchronize
    # relying here for simplicity on #synchronize method is always called firstly in constructor, 
    # which means no raise condition on `||=` operator
    _@mutex ||= Mutex.new 
    _@mutex.synchronize { yield }
  end

This should be safe, right?

mighe · 2014-06-15T18:22:37Z

I agree with @chrisseaton, that's exactly why I want to see a defined memory model in Ruby 👍

@pitr-ch, your first example is right assuming ( --> the leap of faith!) Ruby follows an happens-before memory model (HBMM), like almost every other modern language.
Since c is fully built before the thread, an HBMM ensures visibility and ordering.
The following code is trickier

@x = nil

Thread.new { @x = C.new } #a
Thread.new { do_something(@x) } #b

Under Java ( @chrisseaton correct me if I'm wrong ) we have some guarantees for that case, for example that @x is either nil or assigned to an instance of C and every its final field is visible.
Under Ruby we cannot say anything without focusing on the interpreter.

I suppose that:

under the MRI @x is just nil or assigned, no visibility or ordering issues
under JRuby we have to think hard to its MM, I believe that @x is atomically assigned but we can miss the mutex inside it
under Rubinius... no idea!

With atomic_attr or constructor guarantee we would have some basis to reason about concurrency (and it would not be easy anyway!), but at the moment it is just too hard and implementation dependant (--> nearly impossible)

chrisseaton · 2014-06-15T18:44:59Z

@x = nil

Thread.new { @x = C.new } #a
Thread.new { do_something(@x) } #b

I don't think this is safe - @x is already allocated, so the write is not volatile due to expanding the instance variable table so we have no ordering guarantees at all here.

pitr-ch · 2014-06-15T19:34:56Z

What I am trying to figure out is how to do this now safely without attr_reader :x, atomic: true and without leaps of faith. That's why I've added the second example using JRuby.reference0(self).synchronized { yield } to avoid writing and reading mutex form a non-volatile @mutex variable on JRuby. For MRI the fallback should be correct provided that @thedarkone is right in this comment https://bugs.ruby-lang.org/issues/8259#note-23 and ivars in MRI are volatile/atomic already.

I think this is rather important for this gem, we are trying to be a toolset for others to build larger concurrent libraries and we should have reasonable guaranties that it'll work. I've quickly scanned the code and I think we have too many leaps of faith there 😰.

pitr-ch · 2014-06-16T13:44:27Z

Sorry it was not obvious, I would like to ask you if you could confirm or disprove the second example with #synchronize method. You guys know way more than I do right now about this stuff.

mighe · 2014-06-16T13:59:51Z

Assuming JRuby.reference0(self) returns the underlying Java Object and the synchronize method is invoked on it and that JRuby fully instantiates that Java instance before the ruby constructor... yes, I think your code is correct because it should behave like its Java counterpart where a similar code is ok.

As usual I'm assuming something I cannot prove in this moment (once I tried to look inside JRuby source code but it is really big!)

chrisseaton · 2014-06-16T14:02:33Z

It would hesitantly that one bit does look ok for CRuby - the object shouldn't be visible to anyone else until the constructor returns. But is 'looks ok' the standard we're aiming for?

If I was going to be difficult, I'd point out that you could get access to the object before synchronize is called by the constructor via ObjectSpace, and call synchronize externally and cause two Mutex objects to be allocated. But that might be a case of 'don't do that'.

pitr-ch · 2014-06-17T15:31:10Z

@mighe following links are confirming what you said:

mighe · 2014-06-18T06:11:27Z

Perfect, this could be the building block for atomic_attr

I imagine our code to become:

class XXX
  atomic_att :mutex

  def initialize
    mutex = Mutex.new
  end

  def xxx
    mutex.lock
    ...
    mutex.unlock
  end
end

pitr-ch · 2014-06-18T09:19:18Z

I've looked a little bit into JRuby's code, I did not find anything suggesting that there would be a any visibility guarantee in the constructor, see https://github.com/jruby/jruby/blob/master/core/src/main/java/org/jruby/RubyClass.java#L785-L818.
It allocates the object (.allocate method), it calls #initialize method then it returns the new object without any synchronization. Therefore I think we should fix in next concurrent-ruby using the Java's #synchronized method, JRuby.reference0(self).synchronized { yield }.

dre-hh · 2014-07-16T08:05:00Z

So allright, the JVM can reorder the alloc and initialize calls. But it shouldn't be a leap of faith, that it's reordering still guarantees, that it does this in a way that an instantiated object has all it's properties set. I just can't imagine any usecase, where an initialize method would be called concurrently. You first initialize the Actors, then you pass them as reference each other.

Let's look at the code from the Celluloid Dining Philosophes example

names = %w{Heraclitus Aristotle Epictetus Schopenhauer Popper}

philosophers = names.map { |name| Philosopher.new(name) }

waiter = Waiter.new # no longer needs a "capacity" argument
table = Table.new(philosophers.size)

philosophers.each_with_index do |philosopher, i| 
  # No longer manually create a thread, rely on async() to do that for us.
  philosopher.async.dine(table, i, waiter) 
end

Are you saying that if i pass a Waiter actor to the Philosopher actor, i am doing a high leap of faith in trusting that the jvm has already instantiated it's mutex? This would mean i can't even trust on this in a single threaded app.

thedarkone · 2014-07-16T11:50:48Z

So allright, the JVM can reorder the alloc and initialize calls.

@dre-hh I don't think anybody claimed that.

I am doing a high leap of faith in trusting that the jvm has already instantiated it's mutex? This would mean i can't even trust on this in a single threaded app.

Lets start with a simple class:

class Foo
  def initialize
    @mutex = Mutex.new
  end

  def synchronize(&block)
    @mutex.synchronize(&block)
  end
end

And a simple single threaded script:

foo = Foo.new
foo.synchronize {}

What actually happens under-the-covers in machine code:

foo = Foo.allocate
foo.initialize
foo.synchronize {}

Let's add another thread:

foo = nil

Thread.new do # secondary thread
  true until foo # wait for `foo` to be created by the main thread
  foo.synchronize {}
end

foo = Foo.allocate
foo.initialize

Here's a possible legal execution:

Main thread	Secondary thread
	busy loop: `true until foo`
`foo = Foo.allocate`
	`true until foo` busy loop ends, because `foo` is no longer `nil`
	`foo.synchronize {}` results in `nil.synchronize {}` (`@mutex` i-var hasn't had a value assigned to it yet, so it is `nil`), a `NoMethodError` is raised
`foo.initialize`

Note that at no point was any code reordered by JVM.

dre-hh · 2014-07-16T13:25:33Z

foo = nil

Thread.new do # secondary thread
  true until foo # wait for `foo` to be created by the main thread
  foo.synchronize {}
end

bar = Foo.new
foo = bar

Problem solved, right? :)

idk, i think it is parnaoid to add this specific JVM synchronized blocks as @pitr-ch seem to be implementing in #119

Nobody would or should ever do this with an Actor framework. You do not communicate by sharing state, you share state by communicating.

actor  = Actor.new
actor.start_acting_on(foo)

Unfortionatly i had no chance to look deep into the source of concurent-ruby in detail. Guess this problem has to do with something, how the state is passed within the lib.

pitr-ch · 2014-07-16T13:37:08Z

@thedarkone: would modification of your example to:

require 'jruby'
java_import org.jruby.util.unsafe.UnsafeHolder

module EnsureNewVisibility
  def new(*)
    super
  ensure
    UnsafeHolder.store_fence # assuming JAVA 8
  end
end

class Foo
  extend EnsureNewVisibility
  def initialize
    @mutex = Mutex.new
  end

  def synchronize(&block)
    @mutex.synchronize(&block)
  end
end

fix the issue? I believe it would. I am trying to add the EnsureNewVisibility os something similar to concurrent-ruby.

dre-hh · 2014-07-16T13:47:33Z

So to get to @pitr-ch first sync example

c = C.new
Thread.new do
  c.a # this may read @mutex with value nil, correct?
end

According to this info, the answer is no. This may never read @mutex with value nil, because no reordering ever happens.

It is only a problem if one would modify some state, after having already passed this to a thread, which is a side effect and should be avoided or at least protected by the already initialized mutex

jdantonio · 2014-07-16T13:53:56Z

@dre-hh Thank you for your feedback. I really appreciate you taking the time to look at what we're working on and share your thoughts.

I'm not going to comment on the specifics of the synchronization issue as the other team members are more knowledgeable than I am. But I'd like to talk briefly about this comment:

Nobody would or should ever do this with an Actor framework.

This isn't the first time we've had a discussion regarding what a user should and should not do and it's always an interesting trade-off. One way to reframe the issue is with the question: What guarantees should we (the gem authors) provide? Since the Ruby language doesn't have a memory model there are few guarantees given to us by the language itself. We've taken the general approach with this gem that we would like to overcome this limitation in Ruby as much as reasonably possible by providing guarantees within the gem. In many cases we've had to jump through a number of metaphorical hoops to provide those guarantees. But so far we feel the gem is better off for it. In this particular case we may decide that the effort required to provide the guarantee isn't warranted, but I think it's important that we at least explore the possibility.

pitr-ch · 2014-07-16T14:03:04Z

@dre-hh bar = Foo.new, foo = bar basically equals foo = Foo.allocate; foo.initialize there is no synchronization added it suffers the same problem.

The actor framework is concurrency abstraction to allow users not to think about problems discussed here in this thread and in #119. But as such the actor framework needs to solve internally the synchronization and visibility issues correctly.

dre-hh · 2014-07-16T14:15:25Z

@pitr-ch Dammit! JVM is too smart. But your first examle should be ok

As long as you are first initializing the mutxes and then passing them all Threads should see the mutex . If you are somehow injecting the mutex into shared state, this is a problem

pitr-ch · 2014-07-16T14:36:18Z

c = C.new
Thread.new do
  c.a # this may read @mutex with value nil, correct?
end

is not the same as the @thedarkone's example, because Thread spawning adds some memory fences. So lets consider:

foo = nil

looper = Thread.new do # secondary thread
  true until foo # wait for `foo` to be created by the main thread
  foo.synchronize {}
end

foo = Foo.new
# foo = bar # omitting, has no effect

There are two things which need to be distributed through processors caches from one core to another core: the value in foo and the value in @mutex. If foo value is distributed first the looper may stop looping and read old value in @mutex before the second value also distributes.

There may be some detail in JRuby or in JavaMemoryModel which I am missing that would allow me to consider Foo.new safe so far I found nothing while digging through JRuby source code and be reading JMM. So I am inclined to use the synchronization or memory fences to get some guarantee.

pitr-ch · 2014-07-16T14:39:37Z

@dre-hh could you link the comment including the 'first example'? not sure what you are referring too.

dre-hh · 2014-07-16T15:20:37Z

Some misunderstanding here. I see how @thedarkone's example is conserning. And actually this is because it writes on a state after having shared it.

I was referring to your original Synchronisation pattern

c = C.new
Thread.new do
  c.a # this may read @mutex with value nil, correct?
end

This is different, as you exactly pointed out.

You first define a state and then share, which is totally fine.

Of course we should rely that no reordering happens between initializing C with it's dependencies and reading it within a new thread.

How otherwise could even a singlethreaded JRuby Programm could guarantee correctnes?

So,

GIVEN: no reordering between c.alloc;c.init;Thread.spawn happens

Then: spawned thread can See all initialized i-vars of c

pitr-ch · 2014-07-16T16:29:48Z

@dre-hh Agree.

Actor implementation in concurrent-ruby works differently then celluloid or some other implementations. All actors are running on a thread-pool. The actor objects can be created on any thread from the thread pool and each message can be also processed on any thread. This allows to create and discard actors fast and in thousands that would not be feasible e.g. in Celluloid.

On the other hand that creates a real example where the issue discussed may occur. What do you thing (@dre-hh, @thedarkone, @mighe) about the #102 (comment) example? That combined with two other fences: load_fence when starting to execute an actor context on a thread, and store_fence after the execution is done (for given message) should remove any race conditions, because any changes are being done only within these executions.

thedarkone · 2014-07-16T20:52:56Z

bar = Foo.new
foo = bar

Problem solved, right? :)

Right, 😜 (sarcasm detection over Internet is an unsolved problem 😁).

@pitr-ch ad EnsureNewVisibility... well, technically that is not enough, since barriers have to be paired and so there also must be a corresponding read barrier somewhere... I'm not knowledgeable enough to authoritatively a assert that your code is broken or not.

c = C.new
Thread.new do
  c.a # this may read @mutex with value nil, correct?
end

this may read @mutex with value nil, correct?

Incorrect since spawning a thread (in JVM semantics) has special guarantees (spawned thread sees all preceding memory writes, same applies for thread joining etc). In other words c.a will not see a partially initialized object.

That combined with two other fences: load_fence when starting to execute an actor context on a thread, and store_fence after the execution is done (for given message) should remove any race conditions, because any changes are being done only within these executions.

Yes, with properly paired fences we're fine.

@pitr-ch on a practical side, given current state of JRuby's JIT/other internals (ie: JRuby i-vars code is full of volatile if I recall correctly), I don't think it is possible to see un-initialized objects or out-of-order execution of Ruby-level code in general (this doubly applies when running JRuby on x86). "Hopefully" this is going to change in the future, for example once @chrisseaton's Truffle/Graal backend is complete enough, I'm writing "hopefully" because that would mean JRuby's JIT would be at completely different level to what it is today (in other words I don't think out-of-order execution effects of Ruby-level code will appear until JRuby gets at least 10-20x times faster at executing Ruby code).

pitr-ch · 2014-07-18T11:47:34Z

Ah sorry, but I know what we need! A sarcasm sign :)

AFAIK java also uses just store-fence after final field as mentioned without a pair load-fence.

Qualified final fields of a class have a store barrier inserted after their initialisation to ensure these fields are visible once the constructor completes when a reference to the object is available.
_ taken from http://java.dzone.com/articles/memory-barriersfences_

Any other thread will be reading the value for the first time so it will load the value always from memory, at least that's my understanding.

Yeah c = C.new; Thread.new { c.a } was a bad example. I've forgot about the synchronization on Thread spawning there.

I've dig through JRuby ivar implementations few days back. The ivars are stored in public field https://github.com/jruby/jruby/blob/master/core/src/main/java/org/jruby/RubyObjectVar0.java#L47 and I could not find synchronization anywhere in the readers or writers. Just the changes in ivar-table are synchronized or handled by CAS (using a volatile version counter). I was not able to produce any Ruby code that would blow up on this race conditions, nevertheless I would be hesitant not to synchronize the Actor implementation. I would like it to be robust and I cannot ignore this race even the chance may be slim.

I guess It would be helpful to see the JITed machine code for the example to be able to check if there are or not any fences, but I do not know how to do it (yet, trying to find out).

@thedarkone what did you mean by '(this doubly applies when running JRuby on x86)'?

pitr-ch · 2014-07-19T08:56:57Z

@thedarkone ah, I just read this article it probably answers my question about x86. x86 has stronger memory model then other architecture, did not know that.

chrisseaton · 2014-07-20T14:23:17Z

I'll restate that I think this entire conversation is unfortunately a bit of a stab in the dark. I use the Java memory model in my PhD research and I work for Oracle on VM research, and I am still not confident to say pretty much anything about how it interacts with JRuby.

I would suggest that a better approach would be to stop trying to guess the memory model, ignore fences, and give the user the rule 'do not share objects between threads without some kind of synchronisation'. Then we can be happy that as long as that is followed we're safe. Then after a 1.0 we could very slowly and very careful start to remove some of the locks we need when we find they're limiting performance.

However if we do really want fences, instead of foo.synchronize {}, why we don't we think about creating some memory fence primitives? They're available in JDK 8 for JRuby. I'm sure Rubinius has some kind of memory fence primitives (surely it needs them for internal use?). In MRI we can resort to foo.synchronize {}.

@pitr-ch Contact me offline if you want to know how to look at generated machine code and I'll point you in the right direction.

pitr-ch · 2014-07-21T06:21:32Z

@chrisseaton I've probably got carried away sometimes in this discussion exploring what is possible, sorry about that, but I had in mind that I have to be very conservative when it comest to the final solution for concurrent-ruby. In the end I've already pushed this commit b17b70a for Actors.

I've chosen the Java's synchronization mechanism JRuby.reference0(self).synchronized { yield } to synchronize whole #initialize method and each Actors execution (since it is changing threads between message processing). Main reason is that there is no guarantee that ivars are visible after #initialize is called. Based on my knowledge this should be the safest option right now.

Fences can be made accessible on JRuby quite easily:

      require 'jruby'
      java_import org.jruby.util.unsafe.UnsafeHolder
      UnsafeHolder::SUPPORTS_FENCES or raise NotImplementedError
      UnsafeHolder.store_fence
      UnsafeHolder.load_fence
      UnsafeHolder.full_fence

but then I was not sure about Rubinius and MRI, or how to get similar behavior on Java 7 (probably volatile write/read into/from some variable?). So classic synchronize was safer.

Could you elaborate on "In MRI we can resort to foo.synchronize {}."? I am not sure what you had in mind?

def store_fence
  @mutex.synchronize { } # forces both fences
end

or something else?

@chrisseaton I'll do that. Thanks for the offer.

chrisseaton · 2014-07-21T13:38:09Z

Yeah I meant like your store_fence, but now I've realised of course that's redundant as in MRI the GIL will cause a full fence anyway - see I have no idea what I'm going on about either. store_fence could in fact be a noop in MRI.

headius and others added 30 commits June 7, 2010 22:09

First rev of Atomic for MRI and JRuby.

76f8137

Yak-shaving for release, plus impl fixes to work properly on 1.8 or 1…

d94363f

….9 (Thread.exclusive is 1.9-only)

simplify (yay for late binding)

87091a4

expand tests a bit

0e439ec

move thread require to top

ac08f56

return the new value from update, rather than the old one

5de52ed

add Atomic#swap

0f1b4b6

Additional minor tweak: require 'thread' regardless of impl. Update t…

6446bf0

…o 0.0.2.

New version for JRuby that uses a real extension for its InternalRefe…

186fb3c

…rence to avoid problems with JI re-coercing the object and breaking identity.

full portability

ef59b6c

C extension for MRI/1.9

44df3ce

Add ext dir for Java stuff.

f53b928

Make try_update nonblocking using Mutex#try_lock in the default Inter…

ecb44fe

…nalReference#compare_and_set.

initialize cInternalReference

a9e1a03

nodoc InternalReference

617c378

Bump to 0.0.4 for release. Fixes #2.

93975d8

Corrections to gemspec

6cb3222

Fix rb_mark => rb_gc_mark

2198442

Experiment with system CAS operations to make ir_compare_and_set real…

606cacd

…ly atomic

Merge pull request #4 from kim/master

2c5b731

Native CAS for C Rubies

Bump version.

0c7896b

Added simple benchmarks

18bb1ba

Merge pull request #5 from zimbatm/master

bcee6ec

Just added some benchmarks

Parallel benchmark added

755065a

it may need a little bit of tweaking but the base is there

Merge pull request #6 from zimbatm/master

f35615b

Parallel benchmark added

Move benchmark to examples/

c609f35

Update to version 0.0.6.

4d76b32

Split gem into -java and normal versions, so JRuby doesn't attempt to…

8171865

… build native ext it will never use.

More benchmark tools

3477f07

* bench_atomic_1 is used to measure 1 combination * graph_atomic is used to run series, using bench_atomic_1 The idea is to calculate series by varying one parameter which gives much more meaninful results.

This was referenced Jun 25, 2014

Remove 'leaps of faith' from our constructors #119

Closed

Safe final immutable objects #120

Closed

Merge ruby-atomic #102

Merge ruby-atomic #102

Uh oh!

Conversation

jdantonio commented May 30, 2014

FOR REVIEW -- DO NOT MERGE

Uh oh!

pitr-ch commented Jun 15, 2014

Uh oh!

mighe commented Jun 15, 2014

Uh oh!

chrisseaton commented Jun 15, 2014

Uh oh!

pitr-ch commented Jun 15, 2014

Uh oh!

pitr-ch commented Jun 16, 2014

Uh oh!

mighe commented Jun 16, 2014

Uh oh!

chrisseaton commented Jun 16, 2014

Uh oh!

pitr-ch commented Jun 17, 2014

Uh oh!

mighe commented Jun 18, 2014

Uh oh!

pitr-ch commented Jun 18, 2014

Uh oh!

dre-hh commented Jul 16, 2014

Uh oh!

thedarkone commented Jul 16, 2014

Uh oh!

dre-hh commented Jul 16, 2014

Uh oh!

pitr-ch commented Jul 16, 2014

Uh oh!

dre-hh commented Jul 16, 2014

Uh oh!

jdantonio commented Jul 16, 2014

Uh oh!

pitr-ch commented Jul 16, 2014

Uh oh!

dre-hh commented Jul 16, 2014

Uh oh!

pitr-ch commented Jul 16, 2014

Uh oh!

pitr-ch commented Jul 16, 2014

Uh oh!

dre-hh commented Jul 16, 2014

Uh oh!

pitr-ch commented Jul 16, 2014

Uh oh!

thedarkone commented Jul 16, 2014

Uh oh!

pitr-ch commented Jul 18, 2014

Uh oh!

pitr-ch commented Jul 19, 2014

Uh oh!

chrisseaton commented Jul 20, 2014

Uh oh!

pitr-ch commented Jul 21, 2014

Uh oh!

chrisseaton commented Jul 21, 2014

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

19 participants