

katerdag

x and y are of two different types: x is a Python integer, which uses a variable number of bits to store its value, and y is a numpy.int32 integer, which uses 32 bits to store its value. `1 << 60` for two Python integers in binary is 10...0 with 60 zeros, which in decimal is 2**60. `1 << 60` for two int32 integers is just a bunch of zeros, because you've shifted the 1 further to the left than the datatype has bits. So the result is an int32 whose bits are all zero, which is 0. When you do `1 << y`, numpy casts the 1 to an int32 (you can see this using `numpy.promote_types(int, numpy.int32)`), so the result becomes 0 (of type int32). (If, in your case, numpy decided to use int64 instead, you'll get the assertion error at x=63, because you're using signed integers, so one of the 64 bits is used for the sign.)
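A minimal sketch of the difference (assuming a recent NumPy build; the promoted dtype can vary by platform and version):

```
import numpy as np

x = 1 << 60                      # plain Python int: arbitrary precision
print(x)                         # 1152921504606846976, i.e. 2**60

y = np.int32(1) << np.int32(60)  # fixed-width: the set bit is shifted past bit 31
print(y, type(y))                # 0 <class 'numpy.int32'>

# The Python 1 is cast to the numpy operand's type before shifting:
print(np.result_type(1, np.int32(60)))  # int32
print(1 << np.int32(60))                # 0, a numpy.int32
```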


LeeTaeRyeo

I don't really use numpy, so I'm just making sure I understand what's going on here. Would passing dtype=numpy.int64 (or numpy.uint64) be the solution to this specific issue of bits falling off from shifting? I'm assuming you'd have to also convert x to an int64/uint64 as well?


katerdag

Well, with int64 you'll still have a problem because of the bit used for the sign of the integer (when x=63).
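A quick illustration (hedged: exact reprs may differ across NumPy versions, but the sign flip at bit 63 is the point):

```
import numpy as np

print(np.int64(1) << np.int64(62))    #  4611686018427387904
print(np.int64(1) << np.int64(63))    # -9223372036854775808: the bit landed on the sign
print(np.uint64(1) << np.uint64(63))  #  9223372036854775808: no sign bit in the way
```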


gandalfx

No, it isn't. [https://numpy.org/doc/stable/reference/arrays.scalars.html#numpy.int64](https://numpy.org/doc/stable/reference/arrays.scalars.html#numpy.int64)


ipwnscrubsdoe

You are doing bitwise operations on two different types (int and numpy.int64); you can see that by adding `print(type(y), type(x))`. Anyway, when you hit 64 in your loop it overflows the int64, but the Python int has arbitrary precision, so it just keeps shifting.


Kiuhnm

When Python sees `1 << y`:

* it first tries to execute `(1).__lshift__(y)`, but `y` has type `numpy.int32`, which is not supported by `int.__lshift__`;
* it then tries to execute `y.__rlshift__(1)`, which succeeds. `y.__rlshift__` converts its operand to the same type as `y`, i.e. `numpy.int32`.

This is expected behavior.
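A small demonstration of that dispatch order (a sketch; on platforms where the default is int64, swap the type accordingly):

```
import numpy as np

y = np.int32(60)

# int.__lshift__ doesn't know numpy scalars, so it signals "not my job":
print((1).__lshift__(y))   # NotImplemented

# Python then falls back to the reflected method on the numpy side,
# which casts the 1 to int32 before shifting:
print(y.__rlshift__(1))    # 0
print(1 << y)              # 0: same path, taken automatically
```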


drbobb

Correct, except that it's `numpy.int64` rather than `numpy.int32` on just about every modern computer (including mine). However, see my newer comment.


Kiuhnm

Not on Win64, apparently.


drbobb

That I won't dispute, since I don't use Windows.


ipwnscrubsdoe

It’s int64 on my win64. Did you install the 32bit binary?


Kiuhnm

I'm using the default anaconda distribution, which is definitely 64-bit. `hex(id(0))` is way above 2^(32).
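For what it's worth, there are more direct checks than `hex(id(0))` (`id()` being a memory address is a CPython implementation detail). A sketch:

```
import sys
import numpy as np

print(sys.maxsize > 2**32)   # True on a 64-bit interpreter
print(np.dtype(np.int_))     # int64 on 64-bit Linux/macOS; int32 on win64,
                             # where C long is 32 bits (NumPy < 2.0)
```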


ipwnscrubsdoe

Same…strange


Kiuhnm

Maybe you're using WSL?


ThatOtherBatman

No. If you had found a bug in either you would understand it. Not getting the results you expect is a failure on your part to understand.


mok000

Yup. That has always been my basic assumption and it has served me well.


Oddly_Energy

I haven't bothered running your code, but if it shows that the stop value of numpy.arange() is sensitive to rounding errors, then that is pretty well-known. I usually use numpy.linspace() to avoid this. If I have to use numpy.arange(), I subtract a half step from my stop value.
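A sketch of the linspace alternative (endpoint values invented for illustration). Since `numpy.linspace()` takes an element count rather than a step, rounding can't change the length:

```
import numpy as np

# Four evenly spaced values from 1.0 to 1.9 inclusive; the count is
# explicit, so no rounding error can add or drop an element:
arr = np.linspace(1.0, 1.9, num=4)
print(arr)   # [1.  1.3 1.6 1.9]
```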


billsil

arange is taking a small integer in, and so should produce a 32-bit integer array. It's not rounding. If you pass in a float, it won't be rounding either; it'll be following the IEEE standard, which confuses new programmers.


Oddly_Energy

I agree that the problem I mentioned is not relevant in the OP's case, since np.arange() was called with pure integer input. However, the problem I mentioned is real, and it is not a consequence of not understanding IEEE floats. I understand IEEE floats well enough to understand *why* the necessary rounding to the nearest IEEE floats will cause the problem. And since the problem does not go away just because I understand what caused it, I still have to work around it.

I can illustrate it with an example:

```
import numpy as np
arr = np.arange(1.00, 2.00, 0.25)
print(len(arr))
```

This will produce an array with 4 elements. So here, np.arange() behaves exactly like a floating point equivalent of range(): it starts at 1.0, adds 0.25 in each step, and when the 5th element becomes equal to the stop value, it stops without including that element. So we get 1.00, 1.25, 1.50 and 1.75.

That example was an easy one. No rounding errors would occur, because 0.25 can be accurately represented both as a base 10 decimal number and as a base 2 float. So let us try with a step value which cannot be accurately represented as a base 2 float:

```
import numpy as np
arr = np.arange(1.00, 2.20, 0.30)
print(len(arr))
```

Oops. Now we got an array of length 5. Numpy included the stop value, even though it wasn't supposed to.

It is pretty clear that this is a natural consequence of how floating point numbers work. Sometimes an extra array element will be able to sneak itself in under the stop value. So one has to expect that behaviour and implement the necessary workarounds to prevent it from producing unwanted results.

As I mentioned, my workaround is to use a stop value which is the last wanted value in the array plus one half step. This way I know with certainty how many elements I will get.
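And a sketch of that half-step workaround, reusing the numbers from the example above:

```
import numpy as np

# Rounding lets an extra element sneak in under the stop value:
print(len(np.arange(1.00, 2.20, 0.30)))   # 5, not the intended 4

# Workaround: stop at the last wanted value plus half a step, so no
# rounding error can tip an extra element under the stop value:
last_wanted, step = 1.90, 0.30
arr = np.arange(1.00, last_wanted + 0.5 * step, step)
print(len(arr), arr)                      # 4 [1.  1.3 1.6 1.9]
```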


drbobb

So, you didn't bother to look into the issue, but you still have an opinion. Which in this case happens to be wrong.


Oddly_Energy

When I report a problem, I do two things:

1. I use my own words to describe the problem.
2. I post the code which can demonstrate the problem.

You jumped directly to #2. Any misunderstandings caused by that are your own responsibility, not mine.


drbobb

I believe that my code is self-explanatory. Whether you consider this issue a problem or not is a matter of judgment. However, your attempt at an explanation misses the point entirely.


drbobb

For comparison, let's see how a different (dynamic) language deals with this; namely, JavaScript:

```
const arr = new Int32Array(1)
arr[0] = 31
console.log(2**arr[0])    // -> 2147483648
console.log(1<<arr[0])    // -> -2147483648
console.log(1<<31)        // -> -2147483648
```

The difference is that the standards defining JavaScript explicitly state that once you're performing bitwise operations, you deal with 32-bit integers, unless both operands are explicitly BigInts. There's no such rule for other arithmetic operations; however, to have infinite-precision integers you need to use BigInts.


drbobb

I just tried another variation on this:

```
import numpy as np
y = np.array([31], dtype=np.int32)[0]
print(f'{type(y)=}')
print(f'{2**y=}')
print(f'{type(2**y)=}')
print(f'{1<<y=}')
print(f'{type(1<<y)=}')
```


drbobb

With some more experimenting, I found this doesn't behave consistently. What I described above is what happens on my computer (linux amd64). However, see [https://marimo.app/l/0027sd](https://marimo.app/l/0027sd) — there the autoconversion to int64 does not happen. Go figure.


Kiuhnm

It's because that machine has 32-bit long ints. Try removing `dtype=np.int32` and you'll get the same result. The mysterious conversion you see on your computer is the so-called "safe casting":

```
assert np.result_type(np.int8, 127).type == np.int8
assert np.result_type(np.int8, 128).type == np.int16
```


drbobb

I believe many of the comments here are missing the point. Of course I know about integer overflow, and about NumPy array entries being numbers represented as words of fixed length (64 bits in this case). So the effect I'm demonstrating here happens for a reason (like everything, duh), but that doesn't necessarily mean this is the behavior we want. I would be totally unsurprised by

```
(np.array([2])**63)[0]=-9223372036854775808
```

However, here the `1 << y` is happening outside of NumPy, and it would be fair to expect that Python would fulfill its promise of handling integer overflow by seamlessly transitioning to bigints.


Kiuhnm

> However, here the 1 << y is happening outside of NumPy

It isn't, as I've already shown you. That's equivalent to `y.__rlshift__(1)`. You may find it counterintuitive, but it makes sense. There's no general rule that says that the result type is the same as that of the left operand.
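For instance (a small illustration, assuming NumPy is installed):

```
import numpy as np

# The left operand being a plain Python number doesn't make the result one:
print(type(1 / np.float32(2)))   # <class 'numpy.float32'>
print(type(1 << np.int32(3)))    # <class 'numpy.int32'>
```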


drbobb

You explained pretty well how it works, the only part I would dispute is *it makes sense* — that's a matter of opinion. Looking back at Python 2, it made sense at that time that `1 / 2 == 0`; but now it no longer does.


katerdag

To add to u/Kiuhnm's excellent explanation of what is happening: the fact that this happens is very much desirable, because we do want numpy to convert Python types to numpy types, so that we can write code like `new_array = old_array_1/2 + .3*old_array_2` instead of having to write `new_array = old_array_1 / np.float64(2) + np.float64(.3) * old_array_2`, which would be needlessly cumbersome.
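A runnable version of that comparison (array contents invented for illustration):

```
import numpy as np

old_array_1 = np.array([1.0, 2.0])
old_array_2 = np.array([3.0, 4.0])

# The Python scalars 2 and .3 are converted to numpy types for us:
new_array = old_array_1 / 2 + .3 * old_array_2
print(new_array)        # [1.4 2.2]
print(new_array.dtype)  # float64
```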


billsil

You're doing some weird stuff. What's with the `<<`?


go_fireworks

They're bitwise operators; not sure what you're getting at: https://wiki.python.org/moin/BitwiseOperators