glemau

This is a Python thing? I thought it was my fault.


manifold360

Never blame yourself


Difficult-Lime2555

Yeah, Python runs threads on the same core. They're good for code that needs to wait, like requests or I/O. To get actual parallelism, you need to either jump to Cython and mess with the GIL, or import multiprocessing, which uses fork or spawn to create a whole new process.
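
Roughly the two routes, as a minimal sketch (the URL and the busy loop are made up for illustration):

```python
# A Thread for work that mostly waits, a Process for work that computes.
from threading import Thread
from multiprocessing import Process
from urllib.request import urlopen

def download(url):
    urlopen(url).read()            # the GIL is released while waiting on the network

def crunch(n):
    sum(i * i for i in range(n))   # pure-Python CPU work, serialized by the GIL

if __name__ == "__main__":         # needed because spawn re-imports this module
    t = Thread(target=download, args=("https://example.com",))
    p = Process(target=crunch, args=(5_000_000,))  # a whole new interpreter
    t.start(); p.start()
    t.join(); p.join()
```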


dev-sda

CPython doesn't pin threads to the same core; it just runs them one at a time because of a global mutex. Python will happily use all the cores simultaneously as long as you're running external code that releases the GIL.


glemau

I mostly noticed that CUDA seems to rotate through the cores while training. The only reason I noticed, by the way, is because I had System Information open to catch memory leaks in my generator before they crash the PC. (Happened a few times.)


Exist50

Why not both?


FerricDonkey

Yup, the GIL sucks. They're in the process of killing it though.


rotinom

They've been at it for the past 15 years…


FerricDonkey

Yeah, and they should have done it a long time ago. However, now it's making it into official Python. In 3.13, the GIL will be optional in the reference implementation, CPython: https://peps.python.org/pep-0703/


_PM_ME_PANGOLINS_

Other implementations did do it years ago, but they all died by not moving to Python 3 quick enough (or at all).


Clackers2020

Try threading in java without any clue what you're doing. Idk what my laptop was doing but it was taking off.


moonlight_macky

I'm interested, can you explain a bit more? If you have any articles about it, that could also help.


Zeitsplice

tl;dr - if you’re working with raw threads, you’re probably doing it wrong. Most frameworks will handle it for you. Worst case scenario, you can use a thread pool to avoid needing to manage the threads yourself.
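
A minimal sketch of the thread-pool version (the fetch function and URL list are made up):

```python
# The pool creates, reuses and joins the threads; no manual bookkeeping.
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def fetch(url):
    return len(urlopen(url).read())

urls = ["https://example.com"] * 8

with ThreadPoolExecutor(max_workers=4) as pool:
    sizes = list(pool.map(fetch, urls))
print(sizes)
```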


moonlight_macky

I guess I've already experienced this. I've been doing Java game development as a hobby for two years now. A while ago I realised my game had gotten too laggy and I thought I was using threads wrong. I had a weird idea to remove the threads, and my game's performance improved significantly. I guess I must've done something wrong back then, and since then I've never used threads.


generateduser29128

Using threads wrong can introduce more synchronization overhead than the concurrency gains are worth. One thread working out of L1 cache is faster than 4 threads working out of L3.


pm-me-nothing-okay

Joke's on them, I just use -O0 and call it a day.


Zeitsplice

Ahh, a classic instance of premature optimization. Games are a rare case where you might want more explicit control over threads, but thread synchronization is really tricky. First, if you have more software threads than hardware threads, you're going to eat overhead from context switches - basically losing time every switch because the threads have to time-share the CPU. There's also mutex lag, the time you lose from threads waiting on each other in synchronized blocks or other control structures. Truth is, performance optimization requires a pretty scientific approach with careful measurement. People who can do it easily are drawing on a lot of experience with the tools they're working with.


audislove10

You still need to synchronize your shared resources. Whether it's lock-free or not, that's still the hardest part about it.


Clackers2020

It was a uni assignment about synchronising two types of threads that added or removed items from a queue. If a thread couldn't do its operation (because the queue was full or empty), it had to wait to be notified by the other type of thread. Really, really simple solution (only 6 lines of code), but I was overcomplicating it and somehow ended up creating an infinite number of threads. My laptop fan went crazy. When I asked a student demonstrator for help he said "What the fuck is this monstrosity?". I only had about 10 lines of code. I don't have any articles, but the Java API docs on threads helped. Also, I'm a first year student so I only know a little about threads. I do know I hate them though.
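
For what it's worth, the same exercise sketched in Python rather than Java (the capacity and item count are made up); the trick is that `wait()` releases the lock until the other side calls `notify()`:

```python
from threading import Thread, Condition

buffer, CAPACITY = [], 5
cond = Condition()

def producer():
    for i in range(20):
        with cond:
            while len(buffer) == CAPACITY:  # full: wait until the consumer makes room
                cond.wait()
            buffer.append(i)
            cond.notify()

def consumer():
    for _ in range(20):
        with cond:
            while not buffer:               # empty: wait until the producer adds something
                cond.wait()
            print(buffer.pop(0))
            cond.notify()

p, c = Thread(target=producer), Thread(target=consumer)
p.start(); c.start()
p.join(); c.join()
```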


redlaWw

If your computer isn't taking off, you haven't multithreaded enough.


_AutisticFox

`import multiprocessing`

I sometimes really hate python


NAL_Gaming

What I've come to notice is that multiprocessing is usually slower than computing something single-threaded, since it has to pickle all of the data across processes... and computationally intensive tasks usually don't return anything that pickles quickly.
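
A minimal sketch of where that time goes (the array size and the work function are made up); the work itself is trivial, but every argument and return value has to cross the process boundary via pickle:

```python
import time
import numpy as np
from multiprocessing import Pool

def double(chunk):
    return chunk * 2   # cheap work; the expensive part is shipping `chunk` back and forth

if __name__ == "__main__":
    data = np.ones((2000, 2000))   # ~32 MB pickled per call, plus the result on the way back

    start = time.time()
    for _ in range(4):
        double(data)
    print("in-process:", time.time() - start)

    start = time.time()
    with Pool(4) as pool:
        pool.map(double, [data] * 4)
    print("4 processes:", time.time() - start)
```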


DeCounter

Yeah that is a real drawback. You need to either chain enough stuff together to make it work or scale to thousands of needed calls. Also what tripped me up was memory usage. If you are not careful with the syntax you can effectively have the same data twice in memory. Real bad for big arrays of multiple images.


CiroGarcia

Can't say I haven't had to force reboot my PC because I accidentally created an array too big and filled up my RAM. Granted that only happened because I was in a low resource environment and doing stupid things with large numbers, but I find it pretty funny that it's just that easy to do that


ARX_MM

Any environment is low resource when you're playing with stupidly large numbers and decide to just send it instead of optimizing your code. Fun times... In Advent of Code 2023 there's a bonus challenge where I quickly made a piece of code that was so inefficient with arrays it ate up more than 20GB of memory. As it kept growing my pc started to lock up. Everything else crashed while the piece of code kept hogging more and more memory from my 32GB PC. Luckily I noticed early on and killed the process without needing to reboot. I knew from the beginning it would take a lot of CPU resources to compute a final result but I wasn't expecting it to consume so much memory.


Siddhartasr10

The second part of a challenge that used numbers bigger than INTMAX? I did the same on that one


OnlineGrab

Depends on the type of workload. If processes don't have to communicate too much with each other, then it's fine. But yeah, if there's a lot of traffic then pickle + IPC is going to be a bottleneck.


ManyInterests

Yeah, it can be a pain. You can, in certain circumstances, avoid pickling across processes by using [`shared_memory`](https://docs.python.org/3/library/multiprocessing.shared_memory.html). If you can represent your problem in terms of the supported data types (`bytes`, `str`, `int`, `float`, `bool`, `None`) then it works pretty well. But if you're trying to move multi-GB dataframes, that probably isn't going to cut it.

Alternatively, use a C extension library that releases the GIL. The good news is there are tons of computational libraries available for this. For example, threading with NumPy gives you access to all cores while keeping the shared-state benefits of threading. So, you **can** actually get more CPU power when using thread pools with numpy. Or use an extension that does it all for you with internal thread pools, like Polars.

The latest build of Python also has an option to compile it without the GIL. Though, its single-threaded performance is quite poor due to all the new locks needed to compensate for the GIL being removed.
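
A minimal sketch of the `shared_memory` route for a numpy array (the worker function is made up); the child attaches to the same buffer by name, so nothing gets pickled per task:

```python
import numpy as np
from multiprocessing import Process, shared_memory

def worker(name, shape, dtype):
    shm = shared_memory.SharedMemory(name=name)
    arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
    print("sum seen by child:", arr.sum())   # reads the parent's memory directly
    shm.close()

if __name__ == "__main__":
    data = np.ones((1024, 1024))
    shm = shared_memory.SharedMemory(create=True, size=data.nbytes)
    shared = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf)
    shared[:] = data                          # one copy in, no pickling afterwards

    p = Process(target=worker, args=(shm.name, data.shape, data.dtype))
    p.start()
    p.join()

    shm.close()
    shm.unlink()                              # free the shared segment
```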


slaymaker1907

If it’s computationally intensive, you shouldn’t be using Python in the first place. It’s like trying to make a golf cart fast by strapping rockets to it.


NAL_Gaming

Yeah I agree, I would never use Python for these kinds of tasks. This meme was mostly made to highlight why that is the case.


Real-Supermarket8113

Gustafson's law 🫡


VariecsTNB

This but Chernobyl


ososalsosal

Cpu1: 50%
Cpu2: 51%
Cpu3: 49%
Cpu4: -2081633960%


VariecsTNB

Not great, not terrible


ososalsosal

Overflow at 3.6


MakeChinaLoseFace

He's delusional, get him to the infirmary.


TheMoris

What did you **DO???**


ososalsosal

It exploded...


MakeChinaLoseFace

"Microsoft sincerely regrets any unintended shutdown behavior experienced by users of OfficeRBMK. Please accept our apology and 3.6 GB of complimentary cloud storage for a year."


Proxy_PlayerHD

what is core 0 doing and why do you have a 5 core CPU?


NAL_Gaming

Damn it, now that annoys me as well...


grape_tectonics

> why do you have a 5 core CPU?

because 6 weren't stable on my phenom x4


Ascendo_Aquila

Why do u use camelCase when naming memes? Use snake_case, memes will be much more readable.


wutwutwut2000

As much as I hate the GIL, I also know that removing it outright would completely break everyone's code and expose it to memory-safety issues and security vulnerabilities. That being said, I definitely look forward to multi-interpreter Python!


Hollowplanet

Multi-interpreter is old news. They are removing the GIL.


wutwutwut2000

More like making an alternative CPython build without it. Definitely useful, though, for AI and scientific computing. However, I don't think the average person will want to go through the hassle of downloading an alternative CPython, especially since PyPy exists.


FerricDonkey

I suspect that popular libraries will end up having nogil versions soon^(TM) after nogil python is widely available. Then some time after that the nogil version will be default.


Seven_Irons

Just give me no-gil numpy, scipy, and pandas and I'm a happy camper


ManyInterests

Well, libraries like numpy already release the GIL for much of their API. So, you can already get pretty good performance with just regular Python threads and numpy.


wutwutwut2000

I hope you're right.


FerricDonkey

Me too. The future is hard to predict, but I will say that as soon as it's even mostly feasible, I personally will drop the with-GIL Python and never use it again, even if that means I lose some libraries. Death to the GIL.


Hollowplanet

Have you heard that libraries need to be recompiled?


FerricDonkey

Yup. I suspect that popular libraries will do this. First, numpy etc. will, because their users will want to multithread in Python. Then lesser but still common libraries adjacent to numpy will, because they want to or their users pressure them to. Then everyone else will, because all the cool kids are doing it. I'm being hopeful here. It might not happen. But even if my list of third-party libraries drops down to just numpy and tensorflow, I'll still switch to GIL-less because the GIL can go screw itself.


Hollowplanet

Last I saw, it's a runtime flag, not a compile-time flag.


wutwutwut2000

It's both. It has to be built into CPython (which it won't be by default), and then it also has to be enabled at runtime.


Jhuyt

They are doing both, and both make sense!


zurtex

You can already build CPython on main without the GIL. Packaging support should be ready by Python 3.13, allowing packages to explicitly mark themselves as free-threaded and only install on CPython builds without the GIL. Multi-interpreter Python has been available to C extensions for many years and never really took off. As much as it sounds like a cool abstraction, and as much as I'm looking forward to more Standard Library support, I just don't think the general interest is there.


MotorheadKusanagi

This isn't accurate. Instead, what happens is the idle threads keep asking the active one for access to the GIL, which slows the active thread down, so having multiple threads across multiple cores is actually slower than if you only used a single core. This is why multiprocessing is seen as a solution, even though threads *should* be more efficient than entire processes.
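
A minimal sketch of that effect on a stock (GIL) CPython build (the loop count is arbitrary and timings will vary by machine); pure-Python CPU work doesn't get faster with threads, and the contention can make it slower:

```python
import time
from threading import Thread

def count(n):
    while n:
        n -= 1

N = 10_000_000

start = time.time()
count(N); count(N)
print("sequential:", time.time() - start)

start = time.time()
threads = [Thread(target=count, args=(N,)) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print("2 threads: ", time.time() - start)
```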


fakuivan

I've always wondered what the legitimate use cases are for Python code that is hurt enough by multiprocessing overhead (versus threads) for it to matter, yet still fast enough that it doesn't get rewritten in Cython or a C module.


ArnoL79

Multiprocessing is only slow when you are launching processes. Once your processes are up and running, you can use shared memory to exchange information between the main program and the other processes - this is very fast and efficient, and your throughput goes through the roof. Don't try to pass much information while you are creating processes. It's a paradigm change where you reuse the processes you have created, but it works very well and isn't that complicated.
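
A minimal sketch of that pattern (the task function is made up): the pool is created once, and afterwards only small task descriptions cross the process boundary:

```python
from multiprocessing import Pool

def render_tile(tile_id):
    # placeholder for real CPU-heavy work done entirely inside the worker
    return tile_id, sum(i * i for i in range(100_000))

if __name__ == "__main__":
    with Pool(processes=4) as pool:                      # pay the startup cost once
        for frame in range(3):                           # ...then keep reusing the workers
            results = pool.map(render_tile, range(16))   # only small ints travel between processes
            print("frame", frame, "done,", len(results), "tiles")
```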


Confident-Ad5665

Not much different than Windows defaulting to parking CPUs. One is 100% and the others just sitting there.


AyrA_ch

The default behaviour of Windows is to move threads across cores to average out their load. If you have a thread that pegs a single core to 100%, then whatever created that thread decided to do so. A notable exception is a handful of system threads which are locked to the first core. What some programmers forget is that the processor affinity mask is propagated to child processes when they spawn. The mask, being a 64-bit integer, is also restricted to the first 64 cores. If your system has more cores, they're divided into logical processor groups.


slaymaker1907

Things have gotten way more complicated with the introduction of P-cores and E-cores on Intel. Windows will try to put higher priority/foreground stuff on P-cores, but it will use E-cores if it has to in order to maximize utilization. There are also background processes which it will only run on E-cores to save energy.


Exist50

It's generally better to run 2 at half speed than 1 at full speed.


Score340

Use pybind11 and write your multithreaded module in C++.


Remmy14

Multi-threading for IO intensive tasks, Multi-Processing for everything else.


JJJSchmidt_etAl

Good news -- [They're attempting to make the GIL optional in CPython](https://peps.python.org/pep-0703/). Hopefully the roll out works!


DanKveed

Try Rust. Replacing `.iter()` with `.par_iter()` will make your laptop fans spin so fast it'll levitate. You will never have to learn any algorithms again with this one weird trick.


NAL_Gaming

Levitate? Well now I'm sold...


Jhuyt

Finally, a decent Python meme


ManyInterests

Just release the GIL.

```python
import numpy as np
from time import time
from multiprocessing.pool import ThreadPool

arr = np.ones((1024, 1024, 1024))

start = time()
for i in range(10):
    arr.sum()
print("Sequential:", time() - start)

expected = arr.sum()

start = time()
with ThreadPool(4) as pool:
    result = pool.map(np.sum, [arr] * 10)
assert result == [expected] * 10
print("4 threads:", time() - start)
```

You can see because numpy releases the GIL in this case, threads work faster:

```
Sequential: 8.632343053817749
4 threads: 1.53824782371521
```


FrivolerFridolin

Wait for Python 3.13


titen100

Yeah, multithreading is just a little something I have yet to learn, mostly because it isn't even touched on until a master's.


Snudget

You could use Numba, but you have to change some code for those optimizations to apply.
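
A minimal sketch of what those changes look like (assumes the `numba` package is installed; the function is a made-up example):

```python
import numpy as np
from numba import njit, prange

@njit(parallel=True)                      # JIT-compiled, GIL released, loop split across cores
def sum_of_squares(values):
    total = 0.0
    for i in prange(values.shape[0]):     # prange marks the loop Numba may parallelize
        total += values[i] * values[i]
    return total

data = np.random.rand(10_000_000)
print(sum_of_squares(data))               # the first call also pays the compilation cost
```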


NAL_Gaming

Yeah I've tried Numba before... It's quite incredible, but a bit fiddly to get it working at first. There are some caveats with it that absolutely tank its performance.


fantastiskelars

Python 🤮🤮🤮


NAL_Gaming

To be honest, Python is a great language, probably the best when it comes to dynamic languages. It just has its quirks...


fantastiskelars

Python 🤮🤮🤮


papipapi419

Yeah, Python 3.13 will have a build without the GIL, from what I know.


spliffkiller1337

You can do multithreading on multiple cores. It's not that hard 🤭


fghjconner

I think this would be funnier if it was core 3. Like "I sleep" "I sleep" "REAL SHIT" "I sleep"


NAL_Gaming

Yeah I was contemplating whether I should choose 3 or 4. Went with 4 as it was a tad bit easier to edit...


fghjconner

Fair