[jruby-dev] GSoC 2014

Discussion:

Andrea Francesco Iuorio

2014-02-12 12:56:59 UTC

Good morning, I'm a student in Computer Science and I am interested in one of your projects for the Google Summer of Code. I have some experience with compilers and virtual machines, since my thesis project involved writing a compiler for the JVM, and I want to improve my skills by participating in the Truffle project or the IR project. How can I contact these projects' mentors for more details?

Andrea Francesco Iuorio
Student in Computer Science, Università degli Studi di Milano
andreafrancesco.iuorio-1ViLX0X+***@public.gmane.org

Chris Seaton

2014-02-12 14:21:33 UTC

Hello Andrea,

If you'd like to chat about the Truffle project, chat to me on Skype
(chrisgrahamseaton), Google Chat (chrisgseaton-***@public.gmane.org) or JRuby IRC and
I'll give you some more details.

Chris

Charles Oliver Nutter

2014-02-12 20:59:37 UTC

It's largely up to you which of the two projects sounds more
interesting, but Chris (already replied) or Subbu (probably will reply
soon) are your contacts for Truffle and IR, respectively. I would say
that from a compiler perspective, IR probably has more work and
subprojects for you to help with, but Truffle is a very different and
interesting approach to language optimization too.

I'm also happy to answer questions here, on IRC, on Twitter (headius),
or via IM (headius-***@public.gmane.org on google talk).

- Charlie

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

Subramanya Sastry

2014-02-13 05:29:21 UTC

Hi Andrea,

A quick overview of the Intermediate Representation for JRuby before
talking about possible projects.

This IR has been designed with the following goals in mind:

1. Capture Ruby semantics as accurately as possible without losing
information.
2. Expose primitive operations (e.g., a constant lookup involves
search-of-lexical-scope + search-of-class-inheritance-hierarchy).
3. Be suitable for interpretation and replace the current AST-based
interpreter.
4. Perform optimizations that the JVM itself will not be able to do
directly (e.g., lowering Ruby Floats to Java primitive floats, inlining
blocks along with their caller).
5. Generate readable serialized output (kind of like Ruby assembly) that
could be useful outside JRuby itself (something that we've been talking
about more recently).
6. Ability to do safe offline optimizations and persist IR that can be
directly interpreted or JIT-ted without going through Ruby source.
7. Be JITtable to other targets besides JVM bytecode: Dalvik for Ruboto and,
more recently, Chris brought up the idea of possibly targeting Graal
directly without going through Truffle.
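
The constant-lookup example in goal 2 can be sketched in a few lines. The following Python is only an illustration of splitting one operation into two primitives, not JRuby's actual code: `Klass`, `Scope`, and the three function names are hypothetical, and real Ruby constant lookup has further cases that are left out here.

```python
from dataclasses import dataclass, field
from typing import Optional

_MISSING = object()  # sentinel, so that a constant may legitimately hold None


@dataclass
class Klass:
    """Hypothetical stand-in for a Ruby class or module."""
    name: str
    superclass: Optional["Klass"] = None
    constants: dict = field(default_factory=dict)


@dataclass
class Scope:
    """Hypothetical lexical scope: a class body plus its enclosing scope."""
    klass: Klass
    parent: Optional["Scope"] = None


def search_lexical_scope(scope, name):
    # Primitive 1: walk outward through the lexically enclosing class bodies.
    while scope is not None:
        if name in scope.klass.constants:
            return scope.klass.constants[name]
        scope = scope.parent
    return _MISSING


def search_inheritance_hierarchy(klass, name):
    # Primitive 2: walk up the superclass chain.
    while klass is not None:
        if name in klass.constants:
            return klass.constants[name]
        klass = klass.superclass
    return _MISSING


def lookup_constant(scope, name):
    # The composite operation: the two primitives run in sequence.
    value = search_lexical_scope(scope, name)
    if value is _MISSING:
        value = search_inheritance_hierarchy(scope.klass, name)
    if value is _MISSING:
        raise NameError(f"uninitialized constant {name}")
    return value
```

Once the two searches are separate steps, an optimizer can reason about each one on its own, for example by caching the result of one of them.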

In this IR-based approach, all analyses and optimizations are done at the
level of individual scopes (mostly methods and blocks), and the goal is not
to do all the standard compiler optimizations but only those that will
reduce the semantic gap between Ruby and Java and make the generated code
look as Java-like as possible so that the JVM (or other targets) can
then take it the rest of the way.
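
As a rough illustration of what "reducing the semantic gap" could mean, the Python sketch below contrasts a generic loop over boxed values with a specialized version that checks its type assumption once and then works on raw floats. `BoxedFloat`, `sum_generic`, and `sum_specialized` are hypothetical names for this sketch and do not correspond to JRuby classes or IR instructions.

```python
class BoxedFloat:
    """Hypothetical boxed number, standing in for a Ruby Float object."""

    def __init__(self, value):
        self.value = value

    def plus(self, other):
        # A method call plus a fresh allocation on every addition.
        return BoxedFloat(self.value + other.value)


def sum_generic(items):
    # Unoptimized shape: every step is a method call on a boxed object.
    total = BoxedFloat(0.0)
    for item in items:
        total = total.plus(item)
    return total


def sum_specialized(items):
    # Guard: verify the speculative type assumption once, up front.
    if not all(type(item) is BoxedFloat for item in items):
        return sum_generic(items)  # assumption violated: take the slow path
    # Fast path: plain float arithmetic with no boxing inside the loop.
    total = 0.0
    for item in items:
        total += item.value
    return BoxedFloat(total)
```

The fast path is the kind of code a JVM-style JIT already handles well, which is the point of doing only this sort of lowering in the IR and leaving the rest to the target.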

We are doing fairly well with goals 1. and 2. and are still continuing to
tweak our IR. We are trying to capture more of the JRuby runtime work into
IR primitives which can then be exposed for additional analysis and
optimizations either as part of 4. or something that the JVM itself can do.

Given that background, here are some possible specific project ideas
depending on what area you want to focus on. Some of these are more
experimental / open-ended and others are more concrete without any
surprises. Ideas 2, 4, 7 below are fairly well-defined. Ideas 3, 5 are
somewhat well-defined, but have some open-ended unresolved bits. We have
talked about idea 6 in various forms over time but never sat down to work
through the details, though it might not be too hard. Idea 1 may not fit in well with
the timeline, but Tom may have some sub-projects here. These are just some
initial project ideas as I tried to collate some of the many things we've
talked about over the last couple years.

1. Interpreter: Over the last 6 months, we've improved the performance of
the IR-based interpreter quite a bit, but it still lags the performance of
the AST-based interpreter (because there is a lot more state twiddling
happening with temporary variables and the like). Understanding this better
and plugging the holes would be one project. But, this is fairly open-ended
and this may not necessarily fit in with the GSoC timeline since we want to
get most of the gap narrowed in the next 3-4 months.

2. Compile IR to Dalvik: Right now, a JIT is in progress to compile the IR
to JDK 7. Testing this on Dalvik and compiling to it is an obvious
self-contained project. Lower priority (compared to Dalvik) is to compile
to other targets like Graal IR.

3. Profiling: JIT-ting and optimizations only make sense on hot code, and
some require additional information to be gathered (types). Designing
profiles and collecting them with low-overhead is the goal of this project.
In addition, some applications mutate code heavily (especially Rails). If
aggressive optimizations are done too early, they can be wasteful as
classes mutate. Profiling can also help with this by monitoring code
mutations and their rate of change, and by using some metrics to figure
out when it is safe to do additional optimizations.

4. Method and closure inlining: Some basic code for inlining methods and
closures already exists in JRuby. But, this is just the inlining
transformation. There is no strategy yet as to when to inline, what to
inline, how much to inline, etc. This is somewhat tied to profiling (3.
above).

5. Exposing JRuby-native implementations of core classes for optimization:
For example, attr_reader, attr_writer, attr_accessor methods are
implemented as native Java classes. By exposing them as Ruby or IR methods,
JRuby can then potentially inline them (and expose them as native Java
object field loads/stores). The same applies to looping, iterator, and
enumeration methods implemented as Java code.

6. Optimizing placement of guards: JRuby optimizations that reduce the
semantic gap between Ruby and Java and make Ruby look Java-like (which the
JVM can optimize fairly well) will involve speculative optimizations
(unboxing Ruby objects to Java primitives, inlining of closures) based on
assumptions about types and immutability of classes. JRuby will have to
insert guards in the code to protect against violations. Inserting these
guards willy-nilly everywhere is not the best way to handle this. Coarsening
guards (on method-entry) and combining guards (2 methods from same class get
inlined) and exploring other techniques would be the goal of this project.

7. SSA: So far, I have not implemented SSA since I figured it was not
important to do all the standard compiler opts. on the IR since the JVM (or
whatever target) will do a fairly good job of it as long as there isn't
anything that gets in the way (ex: objects instead of floats or fixnums,
calls to closures instead of method calls). But an SSA form could
potentially simplify some analyses that are currently implemented or might be
implemented later. So, in this project, you will build an SSA form and
port some of our analyses to work on that.
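
To make idea 7 a little more concrete, here is a minimal Python sketch of SSA renaming for straight-line code only. It is not JRuby's IR: the `(dest, op, args)` tuple format and the `to_ssa` name are assumptions made for this example, and control flow is left out entirely, so no phi functions are needed.

```python
def to_ssa(instructions):
    """Rename variables so that each name is assigned exactly once.

    `instructions` is a list of (dest, op, args) tuples, a made-up format
    for this sketch. Only straight-line code is handled.
    """
    version = {}  # variable name -> latest version number
    result = []

    def use(arg):
        # Non-string operands are constants. A variable with no earlier
        # definition is treated as an input and gets version 0.
        if not isinstance(arg, str):
            return arg
        return f"{arg}_{version.get(arg, 0)}"

    for dest, op, args in instructions:
        # Rename the uses first, so that `x = x + 1` reads the old x.
        new_args = tuple(use(a) for a in args)
        version[dest] = version.get(dest, 0) + 1
        result.append((f"{dest}_{version[dest]}", op, new_args))
    return result


code = [
    ("x", "copy", (1,)),
    ("x", "add", ("x", 2)),
    ("y", "mul", ("x", "x")),
]
ssa = to_ssa(code)
```

After renaming, every use names exactly one definition, which is what lets analyses follow values without tracking reassignment. Handling branches and loops would require placing phi functions at merge points, which this sketch does not attempt.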

Subbu.

Andrea Francesco Iuorio

2014-02-14 18:10:08 UTC

I took a quick look at your repository to understand how the IR generates bytecode. In practice, you created an object that represents the JVM and, using ASM, implemented every single IR instruction in it, along with some other implementation details: the compiler calls the function representing each instruction, and that function creates the real bytecode. So, for a Dalvik implementation, one should create a similar object and implement every IR instruction using dex bytecode.
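
The structure described here, one object per target with one method per IR instruction, could be sketched as follows. This Python outline only illustrates the pattern: the instruction names, the `Emitter` classes, and the emitted strings are all invented for this example and are neither JRuby's classes nor real JVM or dex bytecode.

```python
from abc import ABC, abstractmethod


class Emitter(ABC):
    """One subclass per target; one method per (made-up) IR instruction."""

    def __init__(self):
        self.output = []

    @abstractmethod
    def load_const(self, dest, value): ...

    @abstractmethod
    def add(self, dest, left, right): ...


class StackEmitter(Emitter):
    # Stand-in for a stack-machine target such as the JVM.
    def load_const(self, dest, value):
        self.output += [f"push {value}", f"store {dest}"]

    def add(self, dest, left, right):
        self.output += [f"load {left}", f"load {right}", "add", f"store {dest}"]


class RegisterEmitter(Emitter):
    # Stand-in for a register-machine target such as Dalvik.
    def load_const(self, dest, value):
        self.output.append(f"const {dest}, {value}")

    def add(self, dest, left, right):
        self.output.append(f"add {dest}, {left}, {right}")


def compile_ir(instructions, emitter):
    # The compiler only dispatches; all target knowledge lives in the emitter.
    for name, *operands in instructions:
        getattr(emitter, name)(*operands)
    return emitter.output


ir = [("load_const", "a", 1), ("load_const", "b", 2), ("add", "c", "a", "b")]
register_code = compile_ir(ir, RegisterEmitter())
stack_code = compile_ir(ir, StackEmitter())
```

Under this layout, adding a new target means writing one more `Emitter` subclass while `compile_ir` stays unchanged.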

I think I'll try to apply for the ir-dalvik project. I worked on a compiler for the JVM, so I'm not completely new to this kind of project. Just one last question: do you have a template for the proposal, or can I write it as I see fit?

Andrea Francesco Iuorio
Student in Computer Science, Università degli Studi di Milano
andreafrancesco.iuorio-1ViLX0X+***@public.gmane.org

Charles Oliver Nutter

2014-02-14 19:08:07 UTC

When JRuby is accepted as a GSoC organization, you will be able to see us
there with a template. In general, we just want to see a clear path through
the summer, with periodic milestones, schedule, midterm and end goals, and
overall implementation plan.

IR for Dalvik would be a great project. We had a student attempt it some
years ago, but that work was never completed and never functioned. It focused
on emitting JVM bytecode which was then translated to Dalvik IR on
device...which ended up not being very effective (slow, required large
additional libs). The IR has also changed a lot since then. I would recommend
emitting Dalvik IR directly.

Looking forward to having you on the team!

- Charlie (mobile)

Uwe Kubosch

2014-02-14 21:56:00 UTC

Post by Andrea Francesco Iuorio
I think I'll try to apply for the ir-dalvik project. I worked on a compiler for the JVM, so I'm not completely new to this kind of project. Just one last question: do you have a template for the proposal, or can I write it as I see fit?

I’d be happy to mentor such an application, although I think my help would mainly be in packaging and testing. Proper help with the IR and compiler would have to come from another mentor.

--
Uwe Kubosch
http://ruboto.org/

Andrea Francesco Iuorio

2014-02-19 19:28:14 UTC

I wrote this proposal. How can I improve it, in your opinion? What should I change or add?

Name: Andrea Francesco Iuorio

Location: Milan, Italy (UTC+1)

Contacts: ***@outlook.com, panzone (IRC), panzone91 (Skype, Twitter).

Background: I'm a student in Computer Science at the University of Milan. For my thesis project, I worked on a runtime layer for portable and modular exception handling mechanisms, and I developed a compiler for an object-oriented language targeting the Java Virtual Machine. For an assignment I also developed an AST-based compiler for a simple C-like language. My main interest is low-level programming, and I'm not scared by some assembly or bytecode.

Summer plans: I'll have my last exam in the last week of June. It shouldn't take too much time to prepare for, but I'll probably take the exam day off. I should also receive my degree in the first or second week of July, so I'll take one day off (probably two, depending on how the project is going) in that period. Besides that, I don't have any other plans, and I can guarantee at least 40 hours per week, probably more.

Project name: A Dalvik backend for the JRuby Intermediate Representation

Project description: JRuby developers are working on an Intermediate Representation for the Ruby programming language. This approach has several advantages, such as the ability to capture Ruby semantics precisely and to optimize code in ways that a bytecode representation would make difficult or even impossible. Another big advantage is the ability to use different backends, allowing different execution targets without changing the whole JRuby compiler. My idea is to develop a backend for the IR compiler that generates bytecode for the Dalvik virtual machine.

Deliverables: The ability to compile and execute JRuby code wherever the Dalvik virtual machine can run, such as on Android.

Implementation plan: I think I can divide this project into 3 subprojects:
1) A compiler that can at least compile the basic operations in a single method, like arithmetic on built-in types, loops, conditional branches and so on.
2) Extend subproject 1 with method and object support.
3) Extend subproject 2 with "special" features like multithreading and exception handling.
As one can see, I propose a bottom-up approach. This way we can have a sufficient subset of JRuby working quickly.

Project schedule: Because I have 3 subprojects and 3 months of work, this could be a good general schedule. I'll keep in direct contact with my mentor to decide weekly goals, depending on the difficulties I find during my work.

March -- 05-18
I'll use this period to learn the Dalvik bytecode and structure, the JRuby IR and how it is implemented (so that I know where I must work), and the code guidelines. During this phase I'll also discuss with my mentor the best way to generate Dalvik bytecode. (Should I generate a binary representation using an external library like ASMDEX? Should I generate raw binaries directly? Should I generate Dalvik assembly and then use an assembler like smali? There are several possibilities, with advantages and disadvantages that should be clear before starting the project.)

05-19 -- 06-15
This period will be used to build the base of the backend and subproject 1: a compiler that can compile some single-method programs. This means that I should implement at least the basic operations (arithmetic on built-in types, loops, conditional branches), collections and types.
N.B. 06-23: Midterm evaluation. With this schedule, I'll have an extra week if there are any problems during development, and my mentor will have some time to evaluate my work. My main goal is to have a working subset of JRuby before the first evaluation.

06-16 -- 07-06
This period will be used for subproject 2 (adding method and object support).

07-07 -- 07-27
This period will be used for subproject 3 (generating some "special" operations like multithreading support and exception handling).

I have one extra week that could flow into subproject 2 or 3 (probably 3, since implementing multithreading could give some serious headaches and I don't have much experience in this field).

I am also keeping a two-week buffer for any unexpected problems. If I don't need it, I can use this time to optimize my code, write more tests or documentation, and so on.
