Debian/Ubuntu™ 14.04 GNU/Linux is becoming GPU accelerated!
Behind the scenes, a number of highly skilled programmers are working on GPU acceleration for Linux. Originally intended for graphics acceleration through APIs such as OpenGL (DirectX is a separate API, but it drives the same hardware), the GPU's subprocessors have now become general-purpose processors. Using them this way is called GPGPU (general-purpose computing on GPUs). Being Turing complete, they can run a subset of the ANSI/ISO C99 standard. Subset means: no recursion, no pointers to functions, no variable-length arrays (array sizes must be fixed at compile time), and restricted casts (vector types, for example, are converted with dedicated conversion functions instead of the cast operator).
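To get a feeling for what "no recursion, fixed array size" means in practice, here is a minimal sketch of how a recursive tree walk is usually rewritten as a loop with a preallocated stack. Python only stands in for the C99 subset here; the names `MAX_DEPTH` and `tree_sum_iterative` and the flat-array tree layout are made up for this illustration.

```python
# Illustration only: Python stands in for the restricted C99 dialect.
# MAX_DEPTH and the flat-array tree layout are hypothetical choices.
MAX_DEPTH = 32  # the explicit stack has a fixed size, chosen up front


def tree_sum_iterative(values, left, right, root):
    """Sum all node values of a binary tree without using recursion.

    values[i] is the value of node i; left[i] / right[i] are the child
    indices of node i, or -1 if that child does not exist.
    """
    stack = [0] * MAX_DEPTH  # preallocated, never grows
    top = 0
    stack[top] = root
    top += 1
    total = 0
    while top > 0:
        top -= 1
        node = stack[top]
        total += values[node]
        for child in (left[node], right[node]):
            if child != -1:
                if top >= MAX_DEPTH:
                    raise OverflowError("explicit stack too small")
                stack[top] = child
                top += 1
    return total


# A small tree: node 0 has children 1 and 2, node 1 has child 3.
print(tree_sum_iterative([1, 2, 3, 4], [1, 3, -1, -1], [2, -1, -1, -1], 0))
```

The price of the rewrite is that you must pick the stack size in advance, which is exactly the kind of planning the restricted dialect forces on you.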
This has to do with GPU design. The GPU software is split into two parts. The first part is the GPU driver, loaded as a Linux kernel module and running x86 or x86-64 machine code on the CPU. It exchanges data with the GPU through a 'defined window', a reserved memory address segment. Another memory address segment is used for shuffling program code and data into the GPU RAM for execution.
The second part, the compute kernels, runs on the GPU itself. (Despite the name, these GPU 'kernels' are small programs and not part of the Linux kernel.) GPUs are typically implemented as ASICs that contain many small, freely programmable processors. The GPU is also able to shuffle calculated data back into CPU memory by itself, without help from the CPU.
The subprocessors in the GPU are divided into groups (AMD calls such a group a compute unit, CU), each with 256 KByte of fast on-chip RAM. Within a group, the subprocessors may exchange data directly. Their number varies from GPU to GPU. Between groups, by hardware design, direct exchange of data is not possible while a program is running. The only way is to write the data back to the much slower memory outside the group (GPU RAM or CPU memory) and read it from there into another group.
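The group restriction described above shapes how algorithms are written. A minimal sketch, simulated on the CPU in plain Python: summing a long list in two stages, where each group only ever touches its own slice, and the per-group results are combined in a second pass through shared memory. The group size of 4 and the function names are hypothetical, chosen only to keep the example small.

```python
GROUP_SIZE = 4  # hypothetical group size, far smaller than on real hardware


def group_partial_sums(data, group_size=GROUP_SIZE):
    """Stage 1: every group sums its own slice, using only its local copy."""
    partials = []
    for start in range(0, len(data), group_size):
        local = data[start:start + group_size]  # stands in for the group's local RAM
        partials.append(sum(local))             # cooperation happens inside the group only
    return partials


def reduce_across_groups(data, group_size=GROUP_SIZE):
    """Stage 2: the partial results leave the groups and are combined in a second pass."""
    partials = group_partial_sums(data, group_size)  # written out to shared memory
    return sum(partials)


print(group_partial_sums(list(range(10))))    # one partial sum per group
print(reduce_across_groups(list(range(10))))  # same result as sum(range(10))
```

The point of the sketch is the shape, not the speed: no group ever reads another group's local data, and the only meeting point is the list of partial sums.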
Since shuffling data between memories is expensive, you have to plan carefully if you want the GPU to support the CPU in the most efficient way.
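How careful is "carefully"? A rough back-of-the-envelope model helps: offloading pays off only if transfer time plus GPU compute time is less than the CPU time. The sketch below is a deliberately simplified model with made-up numbers (bandwidth, speedup factor); real hardware behaves in more complicated ways.

```python
def offload_time(bytes_moved, bandwidth_bytes_per_s, cpu_time_s, gpu_speedup):
    """Estimated time when offloading: copy the data, then compute faster."""
    transfer = bytes_moved / bandwidth_bytes_per_s
    compute = cpu_time_s / gpu_speedup
    return transfer + compute


def worth_offloading(bytes_moved, bandwidth_bytes_per_s, cpu_time_s, gpu_speedup):
    """True if the GPU path is estimated to beat staying on the CPU."""
    return offload_time(bytes_moved, bandwidth_bytes_per_s,
                        cpu_time_s, gpu_speedup) < cpu_time_s


# Hypothetical numbers: 1 GB moved at 10 GB/s, GPU assumed 20 times faster.
print(worth_offloading(1_000_000_000, 10_000_000_000, 1.0, 20))   # long job
print(worth_offloading(1_000_000_000, 10_000_000_000, 0.05, 20))  # short job
```

With these assumed numbers, the long job wins on the GPU while the short job loses: the copy alone takes longer than doing the work on the CPU.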
The extremely cheap AMD Kaveri processor has up to 512 subprocessors (grouped into 8 compute units) on chip, which programs running on Linux can now use directly to accelerate certain operations. These need not be graphical, though graphic transformations (blur, despeckle, ...) are well-suited tasks for GPU acceleration.
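Why are blur and friends such a good fit? Because every output pixel can be computed independently of all the others, so hundreds of subprocessors can each take one pixel at a time. A minimal sketch in plain Python, assuming a simple 3x3 box blur on a grayscale image stored as a flat list. This is not ImageMagick's algorithm, and the function names are made up; `blur_pixel` plays the role of the small program each subprocessor would run.

```python
def blur_pixel(src, width, height, x, y):
    """Work for ONE output pixel: average of the 3x3 neighbourhood.

    Pixels outside the image are simply skipped. The function only reads
    src and returns one value, so all pixels could be computed in parallel.
    """
    total = 0
    count = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                total += src[ny * width + nx]
                count += 1
    return total // count


def box_blur(src, width, height):
    """On the CPU we loop over all pixels; a GPU would run them side by side."""
    return [blur_pixel(src, width, height, x, y)
            for y in range(height) for x in range(width)]


# A 3x3 image with a single bright pixel in the middle.
image = [0, 0, 0,
         0, 9, 0,
         0, 0, 0]
print(box_blur(image, 3, 3))
```

Note that `blur_pixel` never writes to `src` and never needs the result of another pixel, which is what makes the loop in `box_blur` trivially parallel.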
The very old and well-known ImageMagick command-line tool is among the first to profit directly from the GPU subprocessors. The following functions of the ImageMagick libraries, which so far came unaccelerated with every Linux distribution, run with GPU acceleration once the libraries are recompiled with OpenCL support:
blur, charcoal, contrast, contrast-stretch, convolve, despeckle, edge, equalize, emboss, function, gaussian-blur, grayscale, modulate, motion-blur, negate, noise, radial-blur, resize, sketch, unsharp
These operations are now executed as compute kernels on the GPU, no longer on the CPU!
Programs that use the ImageMagick libraries now use the GPU indirectly. All you will notice is a dramatic increase in speed! As a Debian/Ubuntu™ GNU/Linux user, you don't have to care about the implementation.
AMD Kaveri processors, in contrast to Intel Core i3/i5/i7 and Xeon processors, have so far not been known for speed, though closely related AMD chips sit in both the Xbox One and the PS4. For a good reason: only when programmed correctly does the AMD APU (AMD's name for a chip that combines CPU and GPU) deliver 800-900 GFLOPS, which means almost a trillion floating-point operations per second, leaving comparably priced Intel processors in the dust. And this at prices below $100! (Note: the 3D graphics "card" is included on chip!)
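Where does a figure like this come from? Peak numbers are usually plain arithmetic: subprocessors times operations per clock cycle times clock rate. The sketch below assumes that each of the 512 subprocessors finishes one fused multiply-add (counted as 2 floating-point operations) per cycle, and a GPU clock of 720 MHz. The clock value is my assumption and not taken from the text above.

```python
def peak_gflops(stream_processors, clock_ghz, flops_per_cycle=2):
    """Theoretical peak: processors x operations per cycle x clock (in GHz)."""
    return stream_processors * flops_per_cycle * clock_ghz


# 512 subprocessors, assumed 0.72 GHz GPU clock, 2 operations per cycle (FMA).
print(round(peak_gflops(512, 0.72), 2))  # about 737 GFLOPS for the GPU part alone
```

Under these assumptions the GPU part lands somewhat below the quoted range, so the rest would have to come from the CPU cores on the same chip. Either way, this is a theoretical peak in single precision; real programs reach only a fraction of it.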
Now to the toolkits: the C99 subset compiler suite follows the IEEE 754 floating-point standard. LibreOffice Calc is now not only able to handle extremely large Excel sheets, it has also become 7-8 times faster than on a quad-core CPU alone. PostgreSQL 'native' bindings are included; just recompile LibreOffice 4.2. The performance gain for graphics and video varies; expect accelerated operations to run up to 20-30 times faster than with a quad-core CPU only.
NVIDIA's own GPU compiler suite is named CUDA. Both architectures, AMD/ATI and NVIDIA™, have now become very similar. NVIDIA's drivers accept OpenCL C as well, so the same C99-subset source code can be used for NVIDIA's Fermi/Tesla/Maxwell high-end processors, and NVIDIA is bringing GPGPU to its Tegra™/ARM mobile processors too. The core differences between the two architectures are: first, they have different machine code instruction sets, and second, double-precision floating-point throughput differs widely. Both vendors support double precision in hardware, but on many consumer models it runs at only a fraction of the single-precision rate.
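Why does the single versus double precision question matter at all? Because single precision runs out of digits surprisingly early. A small sketch in plain Python, using the standard `struct` module to round a value to IEEE 754 single precision; the helper name `to_float32` is made up for this example.

```python
import struct


def to_float32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


big = 16777216.0  # 2**24, the point where single precision stops counting by ones

print(big + 1.0 == big)              # double precision still sees the +1
print(to_float32(big + 1.0) == big)  # single precision loses it
print(to_float32(0.1) == 0.1)        # 0.1 is stored slightly differently in each format
```

For image filters this rarely hurts, since pixel values are small. For large spreadsheets or scientific sums it can, which is why the double-precision speed of a GPU is worth checking before you buy.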
Since a new, revolutionary technology must have a name: the vendor-independent standard is simply called "OpenCL" (Open Computing Language). For more about ImageMagick OpenCL, read: http://www.imagemagick.org/script/opencl.php
or watch these videos:
OpenCL optimizations on ImageMagick Convert, Edit…: http://youtu.be/0IVB3KZWfFY
Review | Techdemos | HEVC, Dirt Showdo…: http://youtu.be/bZzcPWH4C8A
(Sorry, German only, just watch! By the way, AMD Kaveri is made in Germany...) Or a series of ATI (now AMD) videos:
ATI Stream OpenCL™ Technical Overview [Part 1] - …: http://youtu.be/ecYIsu83c0I