Graphics processing unit (GPU)
In recent years, the performance of the GPU (graphics processing unit) has improved greatly. A GPU has far more cores than a CPU and is far more effective at parallel processing.
Because of this strength in parallel processing, the GPU is now used as an arithmetic device in supercomputers as well as for conventional image processing. The GPU has attracted a lot of attention in the CAE field, and GPGPU (general-purpose computing on graphics processing units), which uses the GPU for general purposes such as numerical calculation, has been gaining popularity. We recognized the potential of GPGPU early and have continued development since we first provided a GPU solver in 2012.
Calculation speed evaluation
This section describes case studies evaluating the JMAG GPU solver using NVIDIA's Tesla K40, the latest GPU for numerical calculation.
In numerical calculations, most of the calculation time goes to iteratively solving the linear equations obtained with the finite element method; in other words, it is spent in the solver. Especially for a large-scale mesh model with millions of elements, the solve accounts for a large proportion of the processing time. The JMAG GPU solver uses GPUs to accelerate this step. This section shows how much the analysis time is reduced when using a GPU compared with a shared-memory CPU parallel solver. Hardware specifications of the GPU and CPU used are shown below.
Hardware                   CPU: Intel® Xeon® X5670    GPU: NVIDIA® Tesla® K40
Clock frequency (GHz)      2.93                       0.745
Number of cores            12 (2 CPUs)                2880 (1 GPU)
Memory (GB)                24                         12
Memory bandwidth (GB/s)    32                         288
Hardware Specifications
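To make the iterative linear solve mentioned above concrete, the following is a minimal sketch of the conjugate gradient method in plain Python. It is an illustration only: the source does not state which iterative algorithm the JMAG GPU solver uses, and the names conjugate_gradient and laplacian_1d_matvec, as well as the small tridiagonal test matrix, are hypothetical stand-ins for a real finite element system.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A.

    Assumption: this is a generic textbook method, not JMAG's solver.
    matvec(v) must return the product A v as a list.
    Returns (x, number_of_iterations).
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x, with x = 0
    p = list(r)                      # initial search direction
    rs_old = sum(ri * ri for ri in r)
    for k in range(max_iter):
        if rs_old ** 0.5 <= tol:     # converged (also covers b = 0)
            return x, k
        Ap = matvec(p)               # matrix-vector product
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter


def laplacian_1d_matvec(v):
    """Hypothetical test operator: tridiagonal matrix with 2 on the
    diagonal and -1 on the off-diagonals (a 1D Laplacian)."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


b = [1.0] * 50
x, iterations = conjugate_gradient(laplacian_1d_matvec, b)
residual = max(abs(bi - axi) for bi, axi in zip(b, laplacian_1d_matvec(x)))
print(iterations, residual)
```

Each iteration consists of one matrix-vector product plus a few dot products and vector updates. These operations apply the same arithmetic independently across many entries, which is the kind of work that maps well onto the many cores of a GPU.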
Case Studies:
Transient Response Magnetic Field Analysis of Embedded Type PM Motors
The following figure shows analysis times for two steps of a transient response magnetic field analysis on a 4-pole, 24-slot embedded type PM motor model. This model has approx. two million elements. Compared with the calculation time of a single-core CPU, the anticipated calculation speed increase is approx. 10x when using one GPU and approx. 14x when using two GPUs.
Analysis Time (Embedded Type PM Motors)
Transient Response Magnetic Field Analysis of Linear Motors
The following figure shows analysis times for two steps of a transient response magnetic field analysis on a linear motor model. This model has approx. 7.5 million elements. Compared with the calculation time of a single-core CPU, the anticipated calculation speed increase is approx. 4.2x when using one GPU and approx. 4.6x when using two GPUs.
Analysis Time (Linear Motors)
Induction Motor Transient Response Magnetic Field Analysis
Finally, this section shows analysis times for two steps of a transient response magnetic field analysis on an induction motor model with rotor skew. This model has approx. 9 million elements. The GPU memory of the Tesla K40 has been increased to 12 GB, which makes such large-scale computation possible on a single GPU. Compared with the calculation time of a single-core CPU, the anticipated calculation speed increase is approx. 6.8x when using one GPU and approx. 7.5x when using two GPUs.
Analysis Time (Induction Motors)
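The speedup figures quoted in the three case studies can be compared directly. The short sketch below computes, for each model, how much a second GPU adds over one GPU and the resulting two-GPU parallel efficiency. The speedup values are taken from the text above; the function names and the efficiency definition (two-GPU speedup divided by twice the one-GPU speedup) are assumptions introduced here for illustration.

```python
# Speedups relative to a single CPU core, as quoted in the case studies:
# (one GPU, two GPUs)
speedups = {
    "Embedded type PM motor (approx. 2 million elements)": (10.0, 14.0),
    "Linear motor (approx. 7.5 million elements)": (4.2, 4.6),
    "Induction motor (approx. 9 million elements)": (6.8, 7.5),
}


def second_gpu_gain(one_gpu, two_gpu):
    """Factor by which two GPUs are faster than one GPU."""
    return two_gpu / one_gpu


def two_gpu_efficiency(one_gpu, two_gpu):
    """Assumed definition: achieved two-GPU speedup divided by the
    ideal value of twice the one-GPU speedup."""
    return two_gpu / (2.0 * one_gpu)


for model, (one_gpu, two_gpu) in speedups.items():
    gain = second_gpu_gain(one_gpu, two_gpu)
    eff = two_gpu_efficiency(one_gpu, two_gpu)
    print(f"{model}: gain {gain:.2f}x, efficiency {eff:.0%}")
```

With the quoted figures, the second GPU adds a factor of 1.4 for the PM motor model and roughly 1.1 for the two larger models.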
[ System requirements ]
