Platform: Portable Computing Language
  Device: Orin
    Driver version  : 5.0 (Linux ARM64)
    Compute units   : 16
    Clock frequency : 1300 MHz

    Global memory bandwidth (GBPS)
      float   : 174.18
      float2  : 178.27
      float4  : 178.72
      float8  : 179.86
      float16 : 114.04

    Single-precision compute (GFLOPS)
      float   : 5173.41
      float2  : 5141.03
      float4  : 5052.36
      float8  : 5210.57
      float16 : 5170.51

    Half-precision compute (GFLOPS)
      half   : 2638.87
      half2  : 9957.23
      half4  : 7488.28
      half8  : 8151.71
      half16 : 7179.72

    Double-precision compute (GFLOPS)
      double   : 83.12
      double2  : 83.04
      double4  : 82.84
      double8  : 82.44
      double16 : 81.60

    Integer compute (GIOPS)
      int   : 1767.72
      int2  : 1777.34
      int4  : 1816.11
      int8  : 1772.77
      int16 : 1781.42

    Integer compute Fast 24bit (GIOPS)
      int   : 1767.65
      int2  : 1777.16
      int4  : 1816.12
      int8  : 1772.98
      int16 : 1781.72

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 8.52
      enqueueReadBuffer               : 7.30
      enqueueWriteBuffer non-blocking : 7.30
      enqueueReadBuffer non-blocking  : 7.29
      enqueueMapBuffer(for read)      : 129757.33
        memcpy from mapped ptr        : 7.32
      enqueueUnmap(after write)       : 8.98
        memcpy to mapped ptr          : 7.32

    Kernel launch latency : -410.81 us