<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <br>
    <div class="gmail_extra">
      <div class="gmail_quote">
        <blockquote class="quote" style="margin:0 0 0
          .8ex;border-left:1px #ccc solid;padding-left:1ex">
          <div bgcolor="#FFFFFF" text="#000000">
            <p>In the case of vector registers, if it doesn't satisfy
              the tall-register assumption, I feel that you can possibly
              leave the input vector registers partially empty to solve
              the problem?</p>
          </div>
        </blockquote>
      </div>
    </div>
    <blockquote
cite="mid:CAGAvDYiVF0_=0V9sr_yzV=d87BSj_954i3ff9aWhhSqhVciqhQ@mail.gmail.com"
      type="cite">
      <div dir="auto">
        <div dir="auto">You can, though I'm not sure of the costs.  If
          you use just half of the input vector registers (which might
          mean you're just using the 128-bit SSE registers) you're
          increasing the total vector reads you need to do, and you're
          quadrupling the number of base case instances you have to
          solve.  Furthermore, I think you might need more vector
          operations in each base case to manage the layout of the C
          submatrix, making the base case even less efficient.</div>
      </div>
    </blockquote>
    May not be that inefficient! Basically, you are doing redundant
    computation. Say, 1st iteration, 1-4 elements in one AVX register,
    2nd iteration, 3-7 elements in AVX, then 6-9, and so on.<br>
    <blockquote
cite="mid:CAGAvDYiVF0_=0V9sr_yzV=d87BSj_954i3ff9aWhhSqhVciqhQ@mail.gmail.com"
      type="cite">
      <div dir="auto">
        <div dir="auto">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <blockquote class="quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                <div bgcolor="#FFFFFF" text="#000000">
                  <p>Best!</p>
                  <font color="#888888">
                    <p>-Yuan<br>
                    </p>
                  </font>
                  <div class="elided-text"> <br>
                    <div class="m_-248574004931951689moz-cite-prefix">On
                      2017年04月11日 07:56, TB Schardl wrote:<br>
                    </div>
                    <blockquote type="cite">
                      <div dir="ltr">
                        <div dir="auto">
                          <div dir="ltr">Hey Charles,
                            <div><br>
                            </div>
                            <div>Still working though the paper, but it
                              got me thinking, and I realized that the
                              outer product base case we use for the
                              matrix-multiplication case study
                              essentially doesn't scale for their
                              problem.</div>
                            <div dir="auto"><br>
                            </div>
                            <div dir="auto">Our code multiplies
                              subcolumns of matrix A by subrows of
                              matrix B to compute a submatrix of the
                              product matrix C.  Basically, each
                              subcolumn or subrow is a full AVX
                              register, which is 256 bits.  Because
                              we're multiplying matrices of doubles,
                              each of which are 64 bits, each subcolumn
                              or subrow stores 4 elements.  The
                              resulting submatrix of C contains 16
                              elements, which can be (and needs to be)
                              completely stored in AVX registers.</div>
                            <div dir="auto"><br>
                            </div>
                            <div dir="auto">Their matrix multiplication
                              problem, however, deals with 8-bit
                              integers.  That means one AVX register
                              load stores 32 elements.  So if we keep
                              using full AVX registers to store
                              subcolumns and subrow, one product between
                              a subcolumn of A and a subrow of B
                              produces elements in a 32x32 submatrix of
                              C.  Storing that submatrix completely in
                              AVX registers, however, requires 32
                              registers, which is more than the number
                              of named AVX registers on the chip.</div>
                            <div dir="auto"><br>
                            </div>
                            <div dir="auto">Of course there are other
                              base cases one could use, but the
                              outer-product base case and some
                              generalizations of it have problems when
                              the elements are so small.  Cutting one of
                              the subrows or subcolumns in half (using
                              just half of the vector register) cuts the
                              size of the submatrix in half, but the
                              resulting submatrix still requires too
                              many AVX registers to store, particularly
                              when one accounts for the registers needed
                              to store the subcolumn, the subrow, and
                              their permutations.  Cutting both the
                              subrow and subcolumn in half reduces the
                              submatrix size to a quarter its original
                              size, which can be fit in the available
                              registers.  I'm not sure the resulting
                              code is particularly efficient, however,
                              given the bookkeeping it seems one needs
                              on the vector registers.</div>
                            <div dir="auto"><br>
                            </div>
                            <div>This issue reminds me of the reasoning
                              behind the tall-cache assumption.  In
                              other words, it seems like we need the
                              number of vector registers to grow
                              quadratically with the number of elements
                              in each register in order to handle
                              problems like matrix multiplication.</div>
                            <div><br>
                            </div>
                            <div>Does all that make sense?  Thoughts on
                              these thoughts?</div>
                            <div><br>
                            </div>
                            <div>Best,</div>
                            <div>TB</div>
                          </div>
                        </div>
                        <div class="gmail_extra"><br>
                          <div class="gmail_quote">On Apr 10, 2017 2:24
                            AM, "Charles E. Leiserson" <<a
                              moz-do-not-send="true"
                              href="mailto:cel@mit.edu" target="_blank">cel@mit.edu</a>>
                            wrote:<br type="attribution">
                            <blockquote class="gmail_quote"
                              style="margin:0 0 0 .8ex;border-left:1px
                              #ccc solid;padding-left:1ex">
                              <div dir="ltr">
                                <div class="gmail_default"
                                  style="font-family:trebuchet
                                  ms,sans-serif">Supertechies,<br>
                                  <br>
                                </div>
                                <div class="gmail_default"
                                  style="font-family:trebuchet
                                  ms,sans-serif">Please see the attached
                                  paper.  Thoughts?<br>
                                  <br>
                                </div>
                                <div class="gmail_default"
                                  style="font-family:trebuchet
                                  ms,sans-serif">Cheers,<br>
                                </div>
                                <div class="gmail_default"
                                  style="font-family:trebuchet
                                  ms,sans-serif">Charles<br clear="all">
                                </div>
                                <div>
                                  <div
class="m_-248574004931951689m_2554045588306076500m_-6031687898788923739gmail_signature"
                                    data-smartmail="gmail_signature">
                                    <div dir="ltr">
                                      <div>
                                        <div dir="ltr">
                                          <div>
                                            <div dir="ltr">
                                              <div>
                                                <div dir="ltr">
                                                  <div>
                                                    <div dir="ltr">
                                                      <div>
                                                        <div dir="ltr">
                                                          <div>
                                                          <div dir="ltr"><span
style="font-family:trebuchet ms,sans-serif"><span
                                                          style="font-family:arial,helvetica,sans-serif"><br>
                                                          —</span><br>
                                                          Charles E.
                                                          Leiserson<br>
                                                          Edwin Sibley
                                                          Webster
                                                          Professor <br>
                                                          Margaret
                                                          MacVicar
                                                          Faculty Fellow<br>
                                                          </span>Professor
                                                          of <span
                                                          style="font-family:trebuchet
                                                          ms,sans-serif"><span
style="font-family:trebuchet ms,sans-serif">Computer Science and
                                                          Engineering<br>
                                                          <br>
                                                          Associate
                                                          Director and
                                                          Chief
                                                          Operating
                                                          Officer<br>
                                                          Computer
                                                          Science and
                                                          Artificial
                                                          Intelligence
                                                          Laboratory<br>
                                                          </span></span><span
style="font-family:trebuchet ms,sans-serif"><span
                                                          style="font-family:trebuchet
                                                          ms,sans-serif"><span
style="font-family:trebuchet ms,sans-serif">Massachusetts Institute of
                                                          Technology<br>
                                                          <br>
                                                          </span></span></span></div>
                                                          <div><span
                                                          style="font-family:trebuchet
                                                          ms,sans-serif"><span
style="font-family:trebuchet ms,sans-serif"><span
                                                          style="font-family:trebuchet
                                                          ms,sans-serif">Member
                                                          of the
                                                          National
                                                          Academy of
                                                          Engineering<br>
                                                          Fellow of
                                                          AAAS, ACM,
                                                          IEEE, and
                                                          SIAM.<br>
                                                          </span></span></span></div>
                                                          <div dir="ltr"><span
style="font-family:trebuchet ms,sans-serif"><br>
                                                          MIT CSAIL<br>
                                                          32 Vassar
                                                          Street,
                                                          #32-G768<br>
                                                          Cambridge, MA
                                                          02139<br>
                                                          USA<br>
                                                          <br>
                                                          MIT Phone: <a
moz-do-not-send="true" href="tel:%28617%29%20253-5833"
                                                          value="+16172535833"
target="_blank">+1-617-253-5833</a><br>
                                                          Web: <span
                                                          style="font-family:courier
                                                          new,monospace"><a
moz-do-not-send="true" href="http://people.csail.mit.edu/cel/"
                                                          target="_blank">http://people.csail.mit.edu/ce<wbr>l/</a></span><br>
                                                          Email: <span
style="font-family:courier new,monospace"><a moz-do-not-send="true"
                                                          href="mailto:cel@mit.edu"
target="_blank">cel@mit.edu</a></span></span><br>
                                                          —<br>
                                                          </div>
                                                          </div>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </div>
                                                </div>
                                              </div>
                                            </div>
                                          </div>
                                        </div>
                                      </div>
                                    </div>
                                  </div>
                                </div>
                              </div>
                              <br>
                              ______________________________<wbr>_________________<br>
                              Supertech mailing list<br>
                              <a moz-do-not-send="true"
                                href="mailto:Supertech@lists.csail.mit.edu"
                                target="_blank">Supertech@lists.csail.mit.edu</a><br>
                              <a moz-do-not-send="true"
                                href="https://lists.csail.mit.edu/mailman/listinfo/supertech"
                                rel="noreferrer" target="_blank">https://lists.csail.mit.edu/ma<wbr>ilman/listinfo/supertech</a><br>
                              <br>
                            </blockquote>
                          </div>
                        </div>
                      </div>
                      <br>
                      <fieldset
                        class="m_-248574004931951689mimeAttachmentHeader"></fieldset>
                      <br>
                      <pre>______________________________<wbr>_________________
Supertech mailing list
<a moz-do-not-send="true" class="m_-248574004931951689moz-txt-link-abbreviated" href="mailto:Supertech@lists.csail.mit.edu" target="_blank">Supertech@lists.csail.mit.edu</a>
<a moz-do-not-send="true" class="m_-248574004931951689moz-txt-link-freetext" href="https://lists.csail.mit.edu/mailman/listinfo/supertech" target="_blank">https://lists.csail.mit.edu/<wbr>mailman/listinfo/supertech</a>
</pre>
    </blockquote>
    

  </div></div>


______________________________<wbr>_________________

Supertech mailing list

<a moz-do-not-send="true" href="mailto:Supertech@lists.csail.mit.edu">Supertech@lists.csail.mit.edu</a>

<a moz-do-not-send="true" href="https://lists.csail.mit.edu/mailman/listinfo/supertech" rel="noreferrer" target="_blank">https://lists.csail.mit.edu/<wbr>mailman/listinfo/supertech</a>


</blockquote></div>
</div></div></div>



</blockquote>
</body></html>