Hi,
I have written a new prototype of a Ruby binding for Spark:
https://github.com/chyh1990/jruby-spark
Although this implementation only works on JRuby, I think this approach is more promising:
- Real closure/lambda serialization, with elegant syntax:
https://github.com/chyh1990/jruby-spark/blob/master/examples/pagerank.rb
- Uses the JVM infrastructure and runs on YARN with the standard job submission workflow
- Reuses the Java/Scala API, so we get Streaming/SQL/GraphX support nearly for free:
https://github.com/chyh1990/jruby-spark/blob/master/examples/sqltest.rb
- Easier to maintain, even without merging into mainline Spark
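
To illustrate why closure serialization is the interesting point, here is a minimal plain-Ruby sketch. It does not use Spark or the jruby-spark API at all; the names `damping`, `rank_update`, and `marshalable?` are made up for this illustration. It only shows that a lambda capturing a local variable cannot be dumped by Ruby's standard `Marshal`, which is why a binding that wants to ship blocks to workers needs some other mechanism.

```ruby
# Plain Ruby only: no Spark, no jruby-spark API.
# All names below are hypothetical and exist only for this illustration.

damping = 0.85 # local variable captured by the closure below

# A closure in the style of a PageRank rank update.
rank_update = ->(contrib_sum) { (1 - damping) + damping * contrib_sum }

# Returns true if Ruby's built-in Marshal can serialize the object.
def marshalable?(obj)
  Marshal.dump(obj)
  true
rescue TypeError
  # Marshal raises TypeError for Proc/lambda objects.
  false
end

puts "rank for contribution sum 1.0: #{rank_update.call(1.0)}"
puts "lambda marshalable? #{marshalable?(rank_update)}"
puts "float marshalable?  #{marshalable?(damping)}"
```

The lambda works fine locally, but only the plain value survives `Marshal`; the closure does not.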
The prototype is preliminary, but it proves the concept. I think Ruby would be a
more elegant binding language for Spark than Python. I'm looking forward to more
participants!