Getting Python to use a GPU

So apparently you need CUDA and Anaconda to use the GPU with Python. As I explore more I'll add notes here. (For those reading: the GPU is just faster at this kind of number crunching.) Unfortunately, you need an Nvidia card that supports CUDA.

0. Background steps:

a. Become Root:

sudo su

b. Install wget: 

apt-get install wget


1. Install Cuda: https://developer.nvidia.com/cuda-downloads

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin

sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600

wget https://developer.download.nvidia.com/compute/cuda/12.2.2/local_installers/cuda-repo-ubuntu2204-12-2-local_12.2.2-535.104.05-1_amd64.deb

sudo dpkg -i cuda-repo-ubuntu2204-12-2-local_12.2.2-535.104.05-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
sudo apt install nvidia-cuda-toolkit


If that doesn't work try:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda
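
A quick way to check that the install left the usual tools on your PATH is to look for them from Python. This is only a sketch: the function name find_cuda_tools is made up for this note, and it assumes the driver provides nvidia-smi and the toolkit provides nvcc, which is the usual layout but may differ on your machine. Finding the binaries doesn't prove the GPU works, only that the packages landed.

```python
import shutil


def find_cuda_tools(tools=("nvidia-smi", "nvcc")):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in find_cuda_tools().items():
        print(tool, "->", path or "not found on PATH")
```

If nvcc shows up as not found, it may just be sitting in /usr/local/cuda/bin, which isn't always on the PATH.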

2. Install Anaconda:


Warning: the Anaconda installer is about one gigabyte, and it downloads more stuff when it runs.
sh Anaconda3-2023.07-2-Linux-x86_64.sh  -u
conda install numba ; conda install cudatoolkit
mkdir /usr/local/anaconda3/
export PATH="$PATH:/usr/local/anaconda3/bin"
echo $PATH
echo 'export PATH="$PATH:/usr/local/anaconda3/bin"' >> ~/.bashrc
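
To confirm the PATH change took (open a new shell first so ~/.bashrc is re-read), something like the following can help. It is a rough sketch: dir_on_path is a name I made up, and the directory is simply the one used in the export line above.

```python
import os


def dir_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries in a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    wanted = os.path.normpath(directory)
    # Skip empty entries, which appear when PATH has a stray leading/trailing separator.
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


if __name__ == "__main__":
    print(dir_on_path("/usr/local/anaconda3/bin"))
```

Entries are normalised before comparing, so a trailing slash on either side doesn't matter.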

3. Run Hitesh Kumar's cool speedtest script:

vi speedtest.py

[i]

[ctrl V]

from numba import jit, cuda
import numpy as np
# to measure exec time
from timeit import default_timer as timer

# normal function to run on cpu
def func(a):
    for i in range(10000000):
        a[i] += 1

# function optimized to run on gpu
@jit(target_backend='cuda')
def func2(a):
    for i in range(10000000):
        a[i] += 1

if __name__=="__main__":
    n = 10000000
    a = np.ones(n, dtype = np.float64)
    start = timer()
    func(a)
    print("without GPU:", timer()-start)
    start = timer()
    func2(a)
    print("with GPU:", timer()-start)


[esc][Z][Z]

python3 speedtest.py

You'll notice the GPU run takes about 1/10th of the time of the non-GPU run. You'll also see that the CPU version doesn't scale: as you make the number larger, the CPU time goes up proportionally, whereas the GPU time goes up negligibly (like 1% instead of 10%). Sample output, with 40 000 000 iterations (roughly 21 times faster):

without GPU: 4.595112634000543
with GPU: 0.2144785749997027

Credit to this guy for getting me going nicely:

https://www.geeksforgeeks.org/running-python-script-on-gpu/

Explanation of the code:

a = np.ones(n, dtype = np.float64)

np.ones: This is a function from the NumPy library. NumPy is a powerful library in Python for numerical and array operations. np.ones creates an array in which every element is one.

n: n is the size or shape of the array you want to create. You have to define n somewhere in your code before this line.

dtype=np.float64: This specifies the data type of the elements in the array. Here, np.float64 means each element is a 64-bit floating-point number. Floating-point numbers are used to represent real numbers with decimal points.

So, when you execute this line of code, you create a NumPy array a with n elements, all set to the value 1.0 and all of data type float64. You can then manipulate the array and run numerical operations on it using NumPy functions.

if __name__=="__main__":

__name__: __name__ is a built-in Python variable. When a Python script is executed, Python sets the __name__ variable to "__main__" if the script is the main program that is being run. If the script is being imported as a module into another script, then __name__ is set to the name of the module.

"__main__": This is a string literal that represents the name of the main module or script.
So, the line if __name__ == "__main__": is essentially checking whether the current script is the main program being executed or if it's being imported as a module into another script. If it's the main program, the code block indented under this condition will be executed. This is often used to include code that should only run when the script is run directly and not when it's imported as a module. For example, you might have some initialization code or functions that you only want to run when the script is the main program but not when it's used as a module in other scripts. By using if __name__ == "__main__":, you can achieve this separation of code behavior.
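
Here is a tiny sketch of that separation. The function names and the greeting are made up for illustration; imagine the file saved as greeter.py (also a made-up name).

```python
def greet(name):
    """Build the greeting; safe to import and reuse from another script."""
    return "Hello, " + name + "!"


def main():
    # Only meant to run when this file is executed directly.
    print(greet("world"))


if __name__ == "__main__":
    main()
```

Running python3 greeter.py prints the greeting, while import greeter from another script only defines greet and main and prints nothing.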
What is this JIT thing?
In a Python AI script, @jit is often used as a decorator that stands for "Just-In-Time" compilation, and it's typically associated with a library called Numba. Numba is a Just-In-Time (JIT) compiler for Python that allows you to speed up Python code, particularly numerical and scientific code, by compiling it into machine code at runtime. This can significantly improve the performance of certain Python functions and loops. Here's how @jit is used:
from numba import jit

@jit
def my_function(arg1, arg2):
    pass  # Python code here

In this example:

from numba import jit: This line imports the jit decorator from the Numba library.

@jit: This decorator is placed before a Python function definition, such as my_function. When you decorate a function with @jit, Numba will attempt to compile that function to machine code when it is first called. Subsequent calls execute the compiled machine code, which can give significant speed improvements for certain types of computations.

Keep in mind that not all Python code can be easily compiled with Numba, and the effectiveness of JIT compilation depends on the specific code and the types of operations being performed. It's particularly useful for numerical and array computations, making it a common choice in AI and scientific computing where performance is crucial.

To use Numba and the @jit decorator, you'll need to install the Numba library, which can be done using pip:

pip install numba
After installing Numba, you can use @jit to optimize specific functions within your Python AI script, potentially speeding up their execution significantly.
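Note: the @jit(target_backend='cuda') part is for when you are using CUDA. Numba's documented route to GPU code is the separate @cuda.jit decorator for writing kernels, so it's worth watching nvidia-smi while the script runs to confirm the loop really executes on the GPU rather than just being compiled for the CPU. If you just want compiled code without specifically using the GPU, use: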
@jit(nopython=True)
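
A minimal sketch of the CPU-only variant, for comparison. The function sum_below is made up for this note, and the try/except fallback is my own assumption: it lets the script still run (uncompiled) on a machine without Numba, and is not something Numba provides.

```python
try:
    from numba import jit
except ImportError:
    # Assumption for this sketch only: without Numba, use a do-nothing
    # stand-in so the script still runs (just without compilation).
    def jit(*args, **kwargs):
        def wrap(fn):
            return fn
        return wrap


@jit(nopython=True)
def sum_below(n):
    """Add up the integers 0, 1, ..., n - 1 with a plain loop."""
    total = 0
    for i in range(n):
        total += i
    return total


if __name__ == "__main__":
    print(sum_below(1_000_000))
```

With nopython=True, Numba should raise an error if it cannot compile the function, instead of quietly falling back to slower code.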

Disclaimer
In many cases, it's possible for the plain Python function, which does not use JIT (Just-In-Time) compilation, to run faster than the JIT-compiled one. This might seem counterintuitive, but there are several reasons why this can happen:
Overhead of JIT Compilation: When you use JIT compilation with Numba or similar tools, there's an initial overhead associated with compiling the code into machine code. For simple operations like counting from 1 to 1 million, this compilation overhead can outweigh any potential performance benefits.
Type Inference: JIT compilation works best when it can infer the types of variables and optimize the code accordingly. When the loop is simple and doesn't involve complex computations or data types, the overhead of type inference and compilation might not provide significant performance gains.
Python Global Interpreter Lock (GIL): In the CPython interpreter (the default Python interpreter), there's a Global Interpreter Lock (GIL) that prevents multiple native threads from executing Python code in parallel. This means that even with JIT compilation, the Python code inside the loop can't take full advantage of multi-core processors for parallel execution.
Code Complexity: JIT compilation tends to shine when you have computationally intensive operations or complex mathematical calculations. For such tasks, the performance gain from compiling the code to machine code can be substantial. Counting from 1 to 1 million is a very simple task with minimal computational load, so the gains from JIT may not be significant.
Warm-Up Time: JIT compilers like Numba often have a warm-up time during which they analyze and compile the code. For very short-running scripts, this warm-up time can dominate the execution time.
In summary, JIT compilation is a powerful tool for optimizing certain types of Python code, particularly code with complex computations and data manipulations. However, for extremely simple and short tasks like counting from 1 to 1 million, the overhead introduced by JIT compilation can sometimes make the plain Python version run faster.
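
One way to see the warm-up effect is to time each call separately rather than timing a single run. The helper below is a sketch with made-up names (time_calls, count_to). It uses a plain Python function as a stand-in, so on its own it only shows the measuring pattern. With a @jit function in its place, the first timing would be expected to include the compile step.

```python
from timeit import default_timer as timer


def time_calls(fn, arg, repeats=3):
    """Call fn(arg) `repeats` times and return the seconds each call took."""
    timings = []
    for _ in range(repeats):
        start = timer()
        fn(arg)
        timings.append(timer() - start)
    return timings


def count_to(n):
    # Plain-Python stand-in; swap in a @jit function to see the warm-up cost.
    total = 0
    for _ in range(n):
        total += 1
    return total


if __name__ == "__main__":
    first, *rest = time_calls(count_to, 1_000_000)
    print("first call:", first)
    print("later calls:", rest)
```

Comparing the first entry with the later ones separates the one-off compile cost from the steady-state speed, which the single timing in speedtest.py can't do.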
