  1. JikesRVM
  2. RVM-884

MMTk stats can be hard for humans to read

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Low
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.4
    • Component/s: MMTk
    • Labels:
      None

      Description

      MMTk attempts to print its statistics in a single row like so:

      {{
      ============================ MMTk Statistics Totals ============================
      GC time.mu time.gc perf.mu perf.gc refType scan finalize prepare precopy stacks root forward release init finish L1I_MISSES.mu L1I_MISSES.gc
      7 6751.76 5881.21 0 0 99.35 5038.20 22.58 0.28 0.92 4.99 713.75 0.00 0.65 0.12 0.14 2330019756995 1443109010784
      Total time: 12632.98 ms
      ------------------------------ End MMTk Statistics -----------------------------
      }}

      Hopefully the problems with this approach are clear from a human reader's perspective: i) headers do not always line up with values, and ii) the output is even harder to read with many counters, as it becomes wider than your terminal.

      Attached is a trivial patch that instead prints one statistic per line, like so:

      {{
      ============================ MMTk Statistics Totals ============================
      GC: 7
      time.mu: 6872.53
      time.gc: 5854.95
      perf.mu: 0
      perf.gc: 0
      refType: 101.65
      scan: 5008.27
      finalize: 22.64
      prepare: 0.30
      precopy: 0.89
      stacks: 4.79
      root: 715.20
      forward: 0.00
      release: 0.66
      init: 0.16
      finish: 0.18
      L1I_MISSES.mu: 1569810545957(SCALED)
      L1I_MISSES.gc: 951335255621(SCALED)
      Total time: 12727.48 ms
      ------------------------------ End MMTk Statistics -----------------------------
      }}

      As a human I certainly prefer the second output; however, I have no idea how many scripts this change would break. If a consensus can be reached and more human readability is desired, then perhaps this patch can be applied.

      This work is motivated by another patch that I am about to submit, which increases the number of statistics MMTk reports (and so worsens the problem of wide output).

      Kind regards
      Laurence

            Activity

            steveblackburn Steve Blackburn added a comment -

            Hi Laurence,

            Yeah, I understand how the perf counters work.

            I'm just talking about the formatting. We currently have a descriptive field. I was suggesting that you use that descriptive field to convey this information rather than introduce a new field to the format.

            --Steve

            rejones Richard Jones added a comment -

            I agree that machine-readability is the most important factor here. But I'd hope that we can come up with some format that is also easy for people to read. I think Laurence has a good point about the current stats being hard to read sometimes. Maybe they are also slightly tricky for a program to read (e.g. it must read the heading row and then use that to interpret subsequent rows). But as Steve says, cut works as well.
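To make the "read the heading row, then use it to interpret the value row" step concrete, here is a minimal Python sketch. It is only an illustration, not code from MMTk or from the attached patch: the function name `parse_totals` is hypothetical, and it assumes fields are separated by whitespace and that no header name contains a space.

```python
def parse_totals(header_line, value_line):
    """Pair each header with the value at the same position.

    Assumes whitespace-separated fields and no spaces inside a header name
    (an assumption for this sketch, not a documented MMTk guarantee).
    """
    names = header_line.split()
    values = value_line.split()
    if len(names) != len(values):
        raise ValueError(f"{len(names)} headers but {len(values)} values")
    return dict(zip(names, values))


# Shortened sample taken from the totals shown in the description.
header = "GC time.mu time.gc perf.mu perf.gc refType scan"
row = "7 6751.76 5881.21 0 0 99.35 5038.20"
stats = parse_totals(header, row)

# Print one statistic per line, similar to the proposed human-readable layout.
for name, value in stats.items():
    print(f"{name}: {value}")
```

The length check is the part a positional tool such as cut does not give you: if a counter is added or dropped, the mismatch is reported instead of values silently shifting under the wrong header.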

            The problem with Andreas's solution is that it overloads tabs, using them to separate records and fields. I'd argue for a format that distinguished these, e.g. by using =s, commas and tabs.

            Steve asked why we should introduce a new field, e.g. "SCALED", rather than place it in the name of the value. I'd argue against putting it in the name, as SCALED is just another attribute, like the value.

            For example (putting spaces around \t here just to make it a little easier to read),

            GC=7 \t time.mu=6872.53 \t time.gc=5854.95 \t L1I_MISSES.gc=951335255621,SCALED \t Total time=12727.48

            This format is easily parsable, e.g. with 3 lines of perl. It's also amenable to cut (assuming that the order is fixed). It's pretty human readable (although output may flow over several lines on the screen), and names and other attributes are tightly tied together.
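As a rough illustration of how little code the name=value format needs, here is a Python sketch (Python rather than perl, purely for illustration). The function name `parse_record` is hypothetical, and the sketch assumes real tab characters between fields, `=` between name and value, and commas before any extra attributes such as SCALED.

```python
def parse_record(line):
    """Parse one tab-separated record of name=value[,ATTR...] fields.

    Returns a dict mapping each name to (value, [attributes]).
    """
    record = {}
    for field in line.rstrip("\n").split("\t"):
        if not field.strip():
            continue  # ignore empty fields, e.g. from a trailing tab
        name, _, rest = field.partition("=")
        value, *attrs = rest.split(",")
        record[name.strip()] = (value.strip(), [a.strip() for a in attrs])
    return record


line = "GC=7\ttime.mu=6872.53\tL1I_MISSES.gc=951335255621,SCALED\tTotal time=12727.48"
stats = parse_record(line)

for name, (value, attrs) in stats.items():
    suffix = f" ({', '.join(attrs)})" if attrs else ""
    print(f"{name}: {value}{suffix}")
```

Because each value carries its own name, the parser does not depend on field order, and a name containing a space (such as "Total time") is handled without special cases.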

            Richard

            dgrove David Grove added a comment -

            bulk defer open issues to 3.1.2

            dgrove David Grove added a comment -

            Bulk defer to 3.1.3; not essential to address for 3.1.2.

            dgrove David Grove added a comment -

            bulk defer issues to 3.1.4


              People

              • Assignee:
                Unassigned
                Reporter:
                l.hellyer@kent.ac.uk Laurence Hellyer
