  1. JikesRVM
  2. RVM-953

Immix fails with unclear error message in builds without assertions when hardcoded thread limit is exceeded

    Details

    • Type: Bug
    • Status: Closed
    • Priority: High
    • Resolution: Fixed
    • Affects Version/s: hg tip, 3.1.2
    • Fix Version/s: 3.1.4
    • Component/s: MMTk
    • Labels:
      None
    • Environment:

      x86_64;production configuration;ubuntu 3.0.0-15-server-offcore; Intel(R) Xeon(R) CPU E7- 4830 @ 2.13GHz

      Description

      The attached test case is executed with the following parameters:

      -Xms1000m -Xmx2000m -X:gc:verbose=2

      The same test works fine with a 32-bit build of Jikes RVM.

      [Full heap][GC 1 Start 582.88 ms 774056KB
      Fatal error: ArrayIndexOutOfBoundsException within uninterruptible region (index was 27).

      Fatal error: ArrayIndexOutOfBoundsException within uninterruptible region (index was 26).
      Exception in GC thread

      Fatal error: ArrayIndexOutOfBoundsException within uninterruptible region (index was 21).
      Thread 30: VM.sysFail(): We're in a (likely) recursive call to VM.sysFail(), 2 deep
      sysFail was called with the message: Exiting virtual machine due to uninterruptibility violation.
      Exception in GC thread

      Fatal error: ArrayIndexOutOfBoundsException within uninterruptible region (index was 18).
      Exception in GC thread
      Died in GC:
      Exception in GC thread
      Thread 25: VM.sysFail(): We're in a (likely) recursive call to VM.sysFail(), 3 deep
      sysFail was called with the message: Exiting virtual machine due to uninterruptibility violation.
      Exception in GC thread
      Died in GC:
      Thread 22: VM.sysFail(): We're in a (likely) recursive call to VM.sysFail(), 4 deep
      sysFail was called with the message: Exiting virtual machine due to uninterruptibility violation.
      Exception in GC thread

      Fatal error: ArrayIndexOutOfBoundsException within uninterruptible region (index was 28).
      Exception in GC thread
      Exception in GC thread
      Thread 32: VM.sysFail(): We're in a (likely) recursive call to VM.sysFail(), 5 deep
      sysFail was called with the message: Exiting virtual machine due to uninterruptibility violation.
      Exiting virtual machine due to uninterruptibility violattion.

            Activity

            dgrove David Grove added a comment -

            Discussion on the researchers list (https://sourceforge.net/mailarchive/forum.php?thread_name=CAAqwin7ohB_oZ1yGw%3Dfj8ER1i0Zhg1to%2BqqpwjcBtJD%2BtW02rA%40mail.gmail.com&forum_name=jikesrvm-researchers) suggests that this may be caused by exceeding a hard-coded limit in Immix on the number of GC threads (the violation is not caught because the production configuration disables assertions).
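
The failure mode described in this comment can be illustrated with a small, self-contained sketch. It is not the Immix code: the class `GuardedTable`, the limit value of 16, and the `verifyAssertions` flag are hypothetical stand-ins for a fixed-size per-collector table whose bounds check only runs in assertion-enabled builds.

```java
// Sketch only: GuardedTable, MAX_COLLECTORS = 16 and verifyAssertions are
// hypothetical stand-ins, not names or values taken from Jikes RVM.
final class GuardedTable {
  static final int MAX_COLLECTORS = 16; // assumed limit, for illustration only

  private final int[] perCollector = new int[MAX_COLLECTORS];
  private final boolean verifyAssertions; // models assertion-enabled vs. production builds

  GuardedTable(boolean verifyAssertions) {
    this.verifyAssertions = verifyAssertions;
  }

  /** Stores a value for collector {@code id} and describes what happened. */
  String record(int id, int value) {
    try {
      // The guard only runs when assertions are enabled.
      if (verifyAssertions && id >= MAX_COLLECTORS) {
        throw new IllegalStateException(
            "collector id " + id + " exceeds limit " + MAX_COLLECTORS);
      }
      perCollector[id] = value; // unguarded when assertions are disabled
      return "ok";
    } catch (IllegalStateException e) {
      return "assertion: " + e.getMessage();
    } catch (ArrayIndexOutOfBoundsException e) {
      return "ArrayIndexOutOfBoundsException";
    }
  }

  public static void main(String[] args) {
    System.out.println(new GuardedTable(true).record(27, 1));
    System.out.println(new GuardedTable(false).record(27, 1));
  }
}
```

With the guard enabled, the caller gets a message that names the limit. With it disabled, the only symptom is the raw array bounds failure, which matches the kind of error shown in the log above.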

            jsinger Jeremy Singer added a comment -

            The attached patch rvm953.diff checks for a hard-coded limit on the number of GC threads and reports a warning to the user if the limit is exceeded.

            Not sure how elegant this solution is. Thoughts?

            dgrove David Grove added a comment -

            Bulk defer to 3.1.4.

            ebrangs Erik Brangs added a comment -

            I ran into this problem on a POWER7 machine (which appears to the OS as having 64 cores).

            As somebody not familiar with garbage collection, I'd expect the garbage collector to scale gracefully to a higher number of cores (or at least not fail if more cores are available than expected). If that's not possible, I'd suggest that we do something similar to Jeremy Singer's patch. However, I'd prefer it if the code failed fast and hard (e.g. using Assert.fail(..)).
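
The two options weighed in this comment (warn and continue, or fail fast) could look roughly like the sketch below. This is an illustration under stated assumptions, not the contents of either attached patch: the class `CollectorLimitCheck`, its method names, and the limit value of 16 are all hypothetical.

```java
// Sketch only: all names and the limit value are hypothetical and are not
// taken from rvm953.diff or from the committed fix.
final class CollectorLimitCheck {
  static final int MAX_COLLECTORS = 16; // assumed limit, for illustration only

  /** Fail-fast variant: refuse to continue and say exactly why. */
  static int requireWithinLimit(int requested) {
    if (requested > MAX_COLLECTORS) {
      throw new IllegalStateException(
          "Requested " + requested + " GC threads, but at most "
              + MAX_COLLECTORS + " are supported");
    }
    return requested;
  }

  /** Lenient variant: warn and continue with the largest supported count. */
  static int clampToLimit(int requested) {
    if (requested > MAX_COLLECTORS) {
      System.err.println(
          "Warning: reducing GC threads from " + requested + " to " + MAX_COLLECTORS);
      return MAX_COLLECTORS;
    }
    return requested;
  }

  public static void main(String[] args) {
    System.out.println(clampToLimit(64));
    try {
      requireWithinLimit(64);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Either variant runs once at startup, before any collector thread indexes a fixed-size table, so the user sees a message naming the limit instead of a bounds error inside the GC.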

            ebrangs Erik Brangs added a comment -

            In light of the renewed discussion of this issue on the mailing lists, I'm attaching a new patch (based on Jeremy Singer's patch) that sketches a fix for the thread limit in Immix collectors. I'm not sure if it's the right approach but it seemed to work during my (very limited) testing.

            ebrangs Erik Brangs added a comment -

            I want to have this issue fixed for 3.1.4. If nobody else wants to work on it, I'll commit a fix (probably next weekend).

            ebrangs Erik Brangs added a comment -

            Fixes in 98a92919ec5ae39c8b4299a9fa017ab2a7fa123e (10758) and 3bcaa000db6c0f279f7d69c9d53376d67f923cfa (10759).


              People

              • Assignee: ebrangs Erik Brangs
              • Reporter: albert.noll@inf.ethz.ch Albert Noll