
Sat Sep 12 12:13:43 EDT 2015
numactl --interleave=all ../testing/testing_cheevd -JN -N 123 -N 1234 --range 10:90:10 --range 100:900:100 --range 1000:10000:1000 --lapack
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 12:13:49 2015
% Usage: ../testing/testing_cheevd [options] [-h|--help]

% jobz = No vectors, uplo = Lower
%   N   CPU Time (sec)   GPU Time (sec)
%======================================
  123      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

 1234      0.17             0.17
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   10      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   20      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   30      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   40      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   50      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   60      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   70      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   80      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   90      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

  100      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

  200      0.00             0.01
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

  300      0.01             0.01
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

  400      0.02             0.02
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

  500      0.03             0.03
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

  600      0.04             0.04
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

  700      0.05             0.05
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

  800      0.06             0.07
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

  900      0.07             0.07
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

 1000      0.09             0.09
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

 2000      0.42             0.43
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

 3000      1.16             1.27
    | S_magma - S_lapack | / |S| = 1.48e-10   ok

 4000      2.75             2.30
    | S_magma - S_lapack | / |S| = 8.15e-10   ok

 5000      5.49             3.76
    | S_magma - S_lapack | / |S| = 4.60e-10   ok

 6000      9.62             5.75
    | S_magma - S_lapack | / |S| = 5.03e-11   ok

 7000     12.33             8.10
    | S_magma - S_lapack | / |S| = 6.26e-11   ok

 8000     17.10            11.05
    | S_magma - S_lapack | / |S| = 4.79e-11   ok

 9000     24.08            14.83
    | S_magma - S_lapack | / |S| = 7.57e-11   ok

10000     32.77            19.19
    | S_magma - S_lapack | / |S| = 3.48e-11   ok

Sat Sep 12 12:16:59 EDT 2015

Sat Sep 12 12:16:59 EDT 2015
numactl --interleave=all ../testing/testing_cheevd -JV -N 123 -N 1234 --range 10:90:10 --range 100:900:100 --range 1000:10000:1000 --lapack
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 12:17:06 2015
% Usage: ../testing/testing_cheevd [options] [-h|--help]

% jobz = Vectors needed, uplo = Lower
%   N   CPU Time (sec)   GPU Time (sec)
%======================================
  123      0.00             0.01
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

 1234      0.27             0.21
    | S_magma - S_lapack | / |S| = 8.18e-10   ok

   10      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   20      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   30      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   40      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   50      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   60      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   70      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   80      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

   90      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

  100      0.00             0.00
    | S_magma - S_lapack | / |S| = 0.00e+00   ok

  200      0.01             0.01
    | S_magma - S_lapack | / |S| = 2.38e-09   ok

  300      0.01             0.01
    | S_magma - S_lapack | / |S| = 5.31e-10   ok

  400      0.02             0.02
    | S_magma - S_lapack | / |S| = 7.50e-10   ok

  500      0.03             0.03
    | S_magma - S_lapack | / |S| = 3.84e-10   ok

  600      0.05             0.04
    | S_magma - S_lapack | / |S| = 1.33e-09   ok

  700      0.07             0.06
    | S_magma - S_lapack | / |S| = 8.80e-10   ok

  800      0.09             0.07
    | S_magma - S_lapack | / |S| = 8.98e-10   ok

  900      0.11             0.09
    | S_magma - S_lapack | / |S| = 2.96e-10   ok

 1000      0.15             0.12
    | S_magma - S_lapack | / |S| = 1.05e-09   ok

 2000      0.77             0.52
    | S_magma - S_lapack | / |S| = 1.44e-10   ok

 3000      2.25             1.43
    | S_magma - S_lapack | / |S| = 1.92e-10   ok

 4000      5.12             2.72
    | S_magma - S_lapack | / |S| = 4.79e-11   ok

 5000      8.45             4.22
    | S_magma - S_lapack | / |S| = 3.07e-11   ok

 6000     12.67             6.40
    | S_magma - S_lapack | / |S| = 1.07e-10   ok

 7000     20.12             9.39
    | S_magma - S_lapack | / |S| = 7.83e-11   ok

 8000     28.20            12.76
    | S_magma - S_lapack | / |S| = 4.79e-11   ok

 9000     41.37            17.21
    | S_magma - S_lapack | / |S| = 9.47e-11   ok

10000     52.65            22.46
    | S_magma - S_lapack | / |S| = 3.83e-11   ok

Sat Sep 12 12:21:35 EDT 2015

Sat Sep 12 12:21:35 EDT 2015
numactl --interleave=all ../testing/testing_cheevd_gpu -JN -N 123 -N 1234 --range 10:90:10 --range 100:900:100 --range 1000:10000:1000
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 12:21:41 2015
% Usage: ../testing/testing_cheevd_gpu [options] [-h|--help]

% jobz = No vectors, uplo = Lower
%   N   CPU Time (sec)   GPU Time (sec)
%======================================
  123       ---              0.00
 1234       ---              0.15
   10       ---              0.00
   20       ---              0.00
   30       ---              0.00
   40       ---              0.00
   50       ---              0.00
   60       ---              0.00
   70       ---              0.00
   80       ---              0.00
   90       ---              0.00
  100       ---              0.00
  200       ---              0.00
  300       ---              0.01
  400       ---              0.02
  500       ---              0.02
  600       ---              0.03
  700       ---              0.05
  800       ---              0.06
  900       ---              0.08
 1000       ---              0.09
 2000       ---              0.44
 3000       ---              1.29
 4000       ---              2.32
 5000       ---              3.78
 6000       ---              5.64
 7000       ---              8.10
 8000       ---             11.09
 9000       ---             14.81
10000       ---             19.18
Sat Sep 12 12:23:06 EDT 2015

Sat Sep 12 12:23:06 EDT 2015
numactl --interleave=all ../testing/testing_cheevd_gpu -JV -N 123 -N 1234 --range 10:90:10 --range 100:900:100 --range 1000:10000:1000
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 12:23:12 2015
% Usage: ../testing/testing_cheevd_gpu [options] [-h|--help]

% jobz = Vectors needed, uplo = Lower
%   N   CPU Time (sec)   GPU Time (sec)
%======================================
  123       ---              0.01
 1234       ---              0.18
   10       ---              0.00
   20       ---              0.00
   30       ---              0.00
   40       ---              0.00
   50       ---              0.00
   60       ---              0.00
   70       ---              0.00
   80       ---              0.00
   90       ---              0.00
  100       ---              0.00
  200       ---              0.01
  300       ---              0.01
  400       ---              0.02
  500       ---              0.04
  600       ---              0.04
  700       ---              0.06
  800       ---              0.07
  900       ---              0.09
 1000       ---              0.12
 2000       ---              0.50
 3000       ---              1.42
 4000       ---              2.58
 5000       ---              4.23
 6000       ---              6.49
 7000       ---              9.44
 8000       ---             13.01
 9000       ---             17.35
10000       ---             23.11
Sat Sep 12 12:24:51 EDT 2015

Sat Sep 12 13:31:27 EDT 2015
numactl --interleave=all ../testing/testing_cheevd -JN -N 123 -N 1234 --range 12000:20000:2000
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 13:31:33 2015
% Usage: ../testing/testing_cheevd [options] [-h|--help]

% jobz = No vectors, uplo = Lower
%   N   CPU Time (sec)   GPU Time (sec)
%======================================
  123     ---               0.00
 1234     ---               0.17
12000     ---              31.05
14000     ---              46.13
16000     ---              65.71
18000     ---              90.81
20000     ---             119.46
Sat Sep 12 13:38:20 EDT 2015

Sat Sep 12 13:38:20 EDT 2015
numactl --interleave=all ../testing/testing_cheevd -JV -N 123 -N 1234 --range 12000:20000:2000
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 13:38:26 2015
% Usage: ../testing/testing_cheevd [options] [-h|--help]

% jobz = Vectors needed, uplo = Lower
%   N   CPU Time (sec)   GPU Time (sec)
%======================================
  123     ---               0.00
 1234     ---               0.18
12000     ---              36.62
14000     ---              54.42
16000     ---              78.03
18000     ---             108.75
20000     ---             143.87
Sat Sep 12 13:46:29 EDT 2015

Sat Sep 12 13:46:29 EDT 2015
numactl --interleave=all ../testing/testing_cheevd_gpu -JN -N 123 -N 1234 --range 12000:20000:2000
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 13:46:36 2015
% Usage: ../testing/testing_cheevd_gpu [options] [-h|--help]

% jobz = No vectors, uplo = Lower
%   N   CPU Time (sec)   GPU Time (sec)
%======================================
  123       ---              0.00
 1234       ---              0.15
12000       ---             31.06
14000       ---             45.65
16000       ---             65.37
18000       ---             90.12
20000       ---            119.44
Sat Sep 12 13:53:26 EDT 2015

Sat Sep 12 13:53:26 EDT 2015
numactl --interleave=all ../testing/testing_cheevd_gpu -JV -N 123 -N 1234 --range 12000:20000:2000
% MAGMA 1.7.0  compiled for CUDA capability >= 3.5, 32-bit magma_int_t, 64-bit pointer.
% CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.2, MKL threads 16. 
% device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
% Sat Sep 12 13:53:32 2015
% Usage: ../testing/testing_cheevd_gpu [options] [-h|--help]

% jobz = Vectors needed, uplo = Lower
%   N   CPU Time (sec)   GPU Time (sec)
%======================================
  123       ---              0.01
 1234       ---              0.21
12000       ---             37.28
14000       ---             55.41
16000       ---             79.81
18000       ---            110.56
20000       ---            148.26
Sat Sep 12 14:01:49 EDT 2015
