Intel VTune Profiler

Intel VTune Profiler helps you optimise application performance, system performance, and system configuration

To improve Python performance on Intel systems, install and use the Intel Distribution for Python and the Data Parallel Extensions for Python

Optimise the performance of the entire application, not just the accelerated portion, across the CPU, GPU, and FPGA

Profile SYCL, C, C++, C#, Fortran, OpenCL code, Python, Google Go, Java, .NET, Assembly, or any combination of these languages

Improve data transfers and GPU offload schema for SYCL, OpenCL, Microsoft DirectX, or OpenMP offload code

Examine compute-intensive or throughput-oriented HPC programs to determine how efficiently they use the CPU, memory, and vectorisation

Intel has shown how to quickly discover compute and memory bottlenecks in a Python application using Intel VTune Profiler
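
As an illustration of the kind of Python application this applies to, here is a minimal sketch of a toy workload with one compute-heavy and one memory-heavy function. The function names and problem sizes are hypothetical and chosen only for this example. Assuming the script is saved as workload.py, a hotspots collection could be started with a command along the lines of `vtune -collect hotspots -- python workload.py`

```python
"""Toy workload: one compute-heavy and one memory-heavy function (illustrative only)."""
import math


def compute_bound(n: int) -> float:
    """Spend time in arithmetic: sum the square roots of 0..n-1."""
    total = 0.0
    for i in range(n):
        total += math.sqrt(i)
    return total


def memory_bound(n: int, stride: int = 4099) -> int:
    """Spend time in memory traffic: walk a large list with a large stride.

    When stride and n are coprime, every element is visited exactly once,
    but in an order that defeats sequential access.
    """
    data = list(range(n))
    idx = 0
    total = 0
    for _ in range(n):
        total += data[idx]
        idx = (idx + stride) % n
    return total


def main() -> None:
    # Sizes are arbitrary; increase them to make the hotspots more visible.
    print("compute:", compute_bound(1_000_000))
    print("memory:", memory_bound(1_000_000))


if __name__ == "__main__":
    main()
```

In a profile of a script like this, the two functions would be expected to show up as separate hotspots, which makes it easier to see which part of the run is limited by arithmetic and which by memory access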

Intel VTune Profiler helps identify the root causes of bottlenecks and ways to improve application performance

It can map the main bottleneck tasks to the source code or assembly level and display the CPU/GPU time spent in each

More detailed, developer-friendly profiling results can be obtained by using the Instrumentation and Tracing Technology (ITT) APIs
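
A minimal sketch of what such instrumentation might look like in Python follows. It assumes the `ittapi` Python package is installed and that `ittapi.task` can be used as a context manager to mark a named region; if the package is missing, a no-op stand-in is used so the script still runs. The functions `preprocess`, `reduce_sum`, and `run` are hypothetical and exist only for this example

```python
import contextlib

try:
    # Assumption: Intel's ITT API Python bindings are installed (pip install ittapi)
    import ittapi
    task = ittapi.task
except ImportError:
    @contextlib.contextmanager
    def task(name):
        """No-op stand-in so the script still runs without ittapi."""
        yield


def preprocess(values):
    """Hypothetical first stage: double every value."""
    return [v * 2 for v in values]


def reduce_sum(values):
    """Hypothetical second stage: add everything up."""
    return sum(values)


def run(values):
    # Each named region is intended to appear as a separate task in the profile.
    with task("preprocess"):
        doubled = preprocess(values)
    with task("reduce"):
        return reduce_sum(doubled)


if __name__ == "__main__":
    print(run(range(10)))
```

Naming the regions after the logical stages of the program, rather than after individual functions, tends to make the resulting timeline easier to read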