Initializing...
GPU Device 0: "Hopper" with compute capability 9.0
M: 8192 (8 x 1024)
N: 8192 (8 x 1024)
K: 4096 (4 x 1024)
Preparing data for GPU...
Required shared memory size: 68 Kb
Computing using high performance kernel = 0 - compute_dgemm_async_copy
Time: 30.856800 ms
FP64 TFLOPS: 17.82
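
The reported throughput can be cross-checked against the logged dimensions and time. The sketch below is not taken from the program that produced this log. It assumes the conventional GEMM operation count of 2 * M * N * K floating-point operations (one multiply and one add per inner-product term), and the helper name `gemm_tflops` is hypothetical.

```python
def gemm_tflops(m: int, n: int, k: int, elapsed_ms: float) -> float:
    """Throughput in TFLOPS for an M x N x K GEMM.

    Assumption: the kernel is credited with 2 * M * N * K flops,
    the usual convention for matrix multiplication.
    """
    flops = 2.0 * m * n * k          # total floating-point operations
    seconds = elapsed_ms / 1000.0    # log reports milliseconds
    return flops / seconds / 1e12    # flops per second, scaled to tera


# Values copied from the log: M, N, K and the measured time.
tflops = gemm_tflops(8192, 8192, 4096, 30.856800)
print(f"FP64 TFLOPS: {tflops:.2f}")
```

Under that assumption the result rounds to the 17.82 TFLOPS shown in the log, which suggests the logged figure was computed with the same operation count.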