Platform: Portable Computing Language
  Device: Xavier
    Driver version  : 3.0-rc2 (Linux ARM64)
    Compute units   : 8
    Clock frequency : 1377 MHz

    Global memory bandwidth (GBPS)
      float   : 109.54
      float2  : 108.59
      float4  : 108.73
      float8  : 91.07
      float16 : 105.74

    Single-precision compute (GFLOPS)
      float   : 1400.08
      float2  : 1403.62
      float4  : 1400.64
      float8  : 1395.61
      float16 : 1385.01

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 44.01
      double2  : 43.95
      double4  : 43.84
      double8  : 43.53
      double16 : 43.06

    Integer compute (GIOPS)
      int   : 1397.38
      int2  : 1401.00
      int4  : 1394.18
      int8  : 1400.12
      int16 : 1400.80

    Integer compute Fast 24bit (GIOPS)
      int   : 1397.19
      int2  : 1400.98
      int4  : 1393.72
      int8  : 1400.03
      int16 : 1398.92

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 10.50
      enqueueReadBuffer               : 10.77
      enqueueWriteBuffer non-blocking : 10.53
      enqueueReadBuffer non-blocking  : 10.75
      enqueueMapBuffer(for read)      : 72523.52
        memcpy from mapped ptr        : 10.72
      enqueueUnmap(after write)       : 14.82
        memcpy to mapped ptr          : 10.55

    Kernel launch latency : -44.15 us