    • Type: Improvement
    • Status: Closed
    • Priority: Medium
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 1000
    • Component/s: Compiler: Baseline
    • Labels:


      We currently have two different ways of performing baseline compilation; what both have in common is that they use memory for the operand stack. The PowerPC baseline compiler tracks values on the stack so that it can determine when a slot is an object reference (and handle 64-bit object references appropriately). The IA32 baseline compiler is written around the IA32 push and pop instructions, which are very compact and have some special decoding properties. The IA32 baseline compiler doesn't track the size of object references, which is a problem for the migration to x86_64. A problem for both compilers is the in-memory operand stack: IA32's clever forwarding can handle it reasonably well, but on the most recent Power processor it is a disaster (or so I hear).

      We should really be doing expression folding in the baseline compiler. We already have the overhead of maintaining a stack to track where object references are, so it's not a great leap to extend this to hold information on constant values or registers containing values. The basic idea of expression folding is that the compiler should only emit code when it has to. For the following code:

      0: iload 0
      1: iconst 10
      2: iadd
      3: istore 0

      on Intel we would generate something like:

      0: push [SP+offset]
      1: push 10
      2: pop eax; pop edx; add eax, edx; push eax
      3: pop [SP+offset]

      and on PowerPC something like:

      0: store l0, SP+offset
      1: store 10, SP+offset
      2: load r1, SP+offset; load r2, SP+offset; add r1, r1, r2; store r1, SP+offset
      3: load r1, SP+offset; move r1, l0

      If we tracked the values on the stack, then we should generate something like (on both PPC and IA32):

      0: // place l0 in expression stack
      1: // place 10 in expression stack
      2: add t0, l0, 10 // acquire a temporary t0, add 10 to l0 (assuming l0 is in a register) and push t0 onto the expression stack
      3: move l0, t0
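
      The folded sequence above can be reproduced with a small model of the idea. This is only a sketch under stated assumptions, not the existing baseline compiler: the class FoldingSketch, its Slot type and the textual "instructions" it emits are all hypothetical names invented for illustration. Loads and constants are only recorded on an abstract expression stack; code is emitted when an operation actually needs a result.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch: an abstract expression stack that defers code emission. */
public class FoldingSketch {
    /** What an abstract stack slot may describe (illustrative names only). */
    enum Kind { LOCAL, CONST, TEMP }

    static final class Slot {
        final Kind kind;
        final int value; // local index, constant value, or temporary number
        Slot(Kind kind, int value) { this.kind = kind; this.value = value; }
        String show() {
            switch (kind) {
                case LOCAL: return "l" + value;
                case TEMP:  return "t" + value;
                default:    return Integer.toString(value);
            }
        }
    }

    private final Deque<Slot> stack = new ArrayDeque<>();
    private final List<String> code = new ArrayList<>();
    private int nextTemp = 0;

    // Loads and constants only update the abstract stack; nothing is emitted.
    public void iload(int local)  { stack.push(new Slot(Kind.LOCAL, local)); }
    public void iconst(int value) { stack.push(new Slot(Kind.CONST, value)); }

    public void iadd() {
        Slot rhs = stack.pop();
        Slot lhs = stack.pop();
        if (lhs.kind == Kind.CONST && rhs.kind == Kind.CONST) {
            // Both operands are known at compile time: fold, emit nothing.
            stack.push(new Slot(Kind.CONST, lhs.value + rhs.value));
            return;
        }
        Slot temp = new Slot(Kind.TEMP, nextTemp++);
        code.add("add " + temp.show() + ", " + lhs.show() + ", " + rhs.show());
        stack.push(temp);
    }

    public void istore(int local) {
        // Simplification: a real compiler would first have to copy out any
        // pending abstract slot that still refers to this local.
        Slot src = stack.pop();
        code.add("move l" + local + ", " + src.show());
    }

    public List<String> code() { return code; }

    public static void main(String[] args) {
        FoldingSketch c = new FoldingSketch();
        c.iload(0);
        c.iconst(10);
        c.iadd();
        c.istore(0);
        c.code().forEach(System.out::println);
    }
}
```

      Running main on the four bytecodes of the example emits only the add and the move; the iload and iconst produce no code of their own.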

      The folded version performs no memory accesses and uses far fewer instructions (and on Intel the instructions are also smaller). For this change to work we need to know basic block (BB) boundaries, which we already compute for GC map creation. To support this change we need to reorganize the baseline compiler: we should have methods such as emitADD_reg_local_local, emitADD_reg_local_imm, etc. rather than just emitADD. As the expression stack tracking would be identical on PowerPC and IA32, we should share it between the two. We should also think about x86 register usage and x86_64 support when implementing the x86 version.
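
      One possible shape for the specialised emit methods is sketched below. Only the names emitADD_reg_local_local and emitADD_reg_local_imm come from this issue; their signatures, the Operand type, the dispatching emitADD and the textual output are assumptions made for illustration. A real assembler back end would emit machine code rather than strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: choose an emit variant from the kinds of the operands. */
public class EmitDispatchSketch {
    enum Kind { LOCAL, IMM }

    static final class Operand {
        final Kind kind;
        final int value; // local index or immediate value
        Operand(Kind kind, int value) { this.kind = kind; this.value = value; }
    }

    public static Operand local(int index) { return new Operand(Kind.LOCAL, index); }
    public static Operand imm(int value)   { return new Operand(Kind.IMM, value); }

    private final List<String> code = new ArrayList<>();

    // Architecture-specific variants; here they only record a textual form.
    public void emitADD_reg_local_local(int dst, int a, int b) {
        code.add("add t" + dst + ", l" + a + ", l" + b);
    }

    public void emitADD_reg_local_imm(int dst, int a, int imm) {
        code.add("add t" + dst + ", l" + a + ", " + imm);
    }

    /** Shared front end: inspects the abstract operands and picks a variant. */
    public void emitADD(int dst, Operand lhs, Operand rhs) {
        if (lhs.kind == Kind.LOCAL && rhs.kind == Kind.LOCAL) {
            emitADD_reg_local_local(dst, lhs.value, rhs.value);
        } else if (lhs.kind == Kind.LOCAL && rhs.kind == Kind.IMM) {
            emitADD_reg_local_imm(dst, lhs.value, rhs.value);
        } else if (lhs.kind == Kind.IMM && rhs.kind == Kind.LOCAL) {
            // Addition commutes, so the operands can be swapped.
            emitADD_reg_local_imm(dst, rhs.value, lhs.value);
        } else {
            // Two immediates should have been folded before reaching the emitter.
            throw new IllegalArgumentException("constant operands should be folded");
        }
    }

    public List<String> code() { return code; }

    public static void main(String[] args) {
        EmitDispatchSketch e = new EmitDispatchSketch();
        e.emitADD(0, local(0), imm(10));
        e.emitADD(1, local(1), local(2));
        e.code().forEach(System.out::println);
    }
}
```

      In this arrangement the dispatching emitADD and the operand tracking could live in code shared by PowerPC and IA32, with only the emitADD_reg_* variants differing per architecture.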





              • Assignee:
                ianrogers Ian Rogers


                • Created: