15,000 lines of verified cryptography now in Python

(jonathan.protzenko.fr)

483 points | by todsacerdoti 193 days ago

16 comments

chrisrodrigue 192 days ago
There's no mention of what Python version this is actually in.
After some digging, it looks like the answer is 3.14 [0], so we won't be seeing this until October [1].
One could argue that this is a security fix (just read the first sentence of the blog post) and should be included in all the currently supported versions (>=3.9) of Python [2].
[0] https://github.com/python/cpython/blob/main/Doc/whatsnew/3.1...
[1] https://peps.python.org/pep-0745/
[2] https://devguide.python.org/versions/
[-]
- ashishb 192 days ago
  Many famous libraries like Spacy do not support Python 3.13 and are stuck on Python 3.12 (Oct 2023).
  So, even if this comes out in Python 3.14, any non-trivial project will have to wait till Oct 2026 (or Oct 2027) to be able to use it.
  1 - https://github.com/explosion/spaCy/issues/13658
  [-]
  - arp242 192 days ago
    Good grief that issue is a clusterfuck of bozos.
    Sometimes I wish there was a GitHub with entry exam. "A library you use has a bug, you find a find a 3 month old bug report for your exact issue. Do you 1) add a comment with "me too", 2) express your frustration this issue hasn't been fixed yet, 3) demand the maintainers fix it as soon as possible as it's causing issues for you, or 4) do nothing".
    Only half joking.
    [-]
    - Hasnep 192 days ago
      The Refined GitHub extension [1] automatically hides comments that add nothing to the discussion. [2]
      [1] https://github.com/refined-github/refined-github [2] https://github-production-user-asset-6210df.s3.amazonaws.com...
      [-]
      - bityard 192 days ago
        Wow, that extension is a gold mine of much needed GitHub functionality
    - rowanG077 192 days ago
      I actually find that hugely helpful that so many people are actually actively expressing they are hitting this. It's not easy to be able to get an idea what issues people are actually hitting with anything you have made. An issue being bumped with essentially "me too" is a highly valuable signal.
      [-]
      - barotalomey 192 days ago
        That's why github added emoji reactions many years ago, so you can express "me too" without spamming the notification systems.
        https://github.blog/news-insights/product-news/add-reactions...
        [-]
        paganel 192 days ago
        [flagged]
        [-]
        padjo 192 days ago
        Emoji being associated with infancy is a thing that only exists in your brain and only because young people were early adopters of emoji about 15 years ago. Emoji are just as valid a means of communication as any other.
        [-]
        anonym29 192 days ago
        It's definitey not only in their brain, it's in my brain, too.
        On a related note: Emoticon > Emoji.
        Bring back B~), get rid of those silly newfangled unicode monstrosities.
        [-]
        genewitch 192 days ago
        Element and discord and the Samsung default keyboard try to :coloncode: any emoticons you type.
        This is on desktop and mobile, since the two apps are electron.
        \>. <
        [-]
        anonym29 192 days ago
        Good thing I don't use Element as my Matrix preferred client. Discord and all Samsung software are more or less indistinguishable from spyware the way I see it, as most closed-source proprietary software is. That's a major if not primary reason why it's closed source: it's doing naughty stuff against the interests of the user that the developers want to conceal from their victims.
        It doesn't matter how many other people prefer to throw their own privacy off a cliff, irrational and self-harming group conformity concerns don't afflict me the way they seem to afflict neurotypical people. One of the blessings of being on the spectrum.
        idiotsecant 192 days ago
        The point is that for the typical older HN denizen, emojis were legitimately a sort of counterculture element associated with trivial conversations in the counterculture of our youth.
        That's not what they are anymore. They have a rich meaning - ask a young person what :) means and you'll have your example.
        Emojis are, like it or not, part of the lexicon now.
        [-]
        dandellion 192 days ago
        You can use whatever dialect of language that you want. I'll keep using the same as the people commenting above, which doesn't include emojis.
        anonym29 192 days ago
        >typical older HN denizen, emojis were legitimately a sort of counterculture element associated with trivial conversations in the counterculture of our youth.
        Emojis didn't exist in the youth of the typical older HN denizen. They are the exclusive purview of Gen Alpha, Gen Z, and younger millennials.
        [-]
        idiotsecant 189 days ago
        Huh? Emojis have existed since BBSs
        kevincox 192 days ago
        It is a button that let's people say "me too". It really doesn't matter if it is labeled with a thumbs-up emoji, the text "me too" or Rick Astley dancing. Don't think you are better than others because you are too "proud" to click a button with a thumbs up image on it.
        [-]
        vrighter 191 days ago
        but in a way that's easily ignorable, for convenience.
        florbo 192 days ago
        I think emojis are easily overused, but I certainly don't mind when they're used to convey simple, universally understood meaning (such as reacting with a "me too" in a bug report).
        Stuff like gitmoji, though, drive me nuts. Talk about ambiguous and easily misused. Faster for everyone if you just say what you mean.
        qwertox 192 days ago
        Maybe then GitHub should add a text-based expression for people like you.
        Like Google does in Gmail, where you can turn off the icons for the "archive", "report spam" actions so that the text is shown instead.
        I'm sure this would add a lot of value to GitHub /s
        Or you ask your OS vendor (or each website operator) to also ship an optional set of emoticons with less childish images. Start a petition and I'll sign it.
        genewitch 192 days ago
        I very rarely use emojiis on github. If I can't add anything to the discussion / issue, I post nothing. So if there's enough people like me, an issue can languish.
        However the barrier to entry for a lot of issues reporting is pretty tall - requiring triplicate documentation of everything. It's like an overreaction to the sort of people that need to be told to unplug something for 30 seconds to continue the support call.
        And that's if they even "allow" issues.
        I was posting in issues note frequently 1-3 years ago. I'm sure I'd be sheepish about some posts.
      - arp242 192 days ago
        Do you really think the maintainers don't understand that "doesn't work with Python 3.13" isn't going to affect tons of people?
        There's some bozo asking "any news? I cant downgrade because another lib requirement" just two days after the maintainer wrote several paragraphs explaining how difficult it is to make it work with Python 3.13. This adds no value for anyone and is just noise. Anyone interested in actual useful information (workarounds, pointers on how to help) has to wade though a firehose of useless nonsense to get at anything remotely useful. Any seriously discussions of maintainers wanting to discuss things is constantly interrupted by the seagulls from Finding Nemo: "fix? fix? fix? fix? fix?""
        Never mind the demanding nature of some people in that thread.
        Just upvote. That's why this entire feature was added.
        [-]
        madars 192 days ago
        After seeing "Jigar Kumar" cognitive exploits on xz mailing list a maintainer would be excused (and I'd even say, encouraged) to just ignore pressure tactics altogether. It's an open source project - if it works great, if it breaks, you get to keep both pieces.
        throwaway519 192 days ago
        To the non-technical founder, this is doing something.
        They will not move to ~~mocking up~~ sketching a wireframe of something.
        rowanG077 191 days ago
        Considering the python ecosystem is not keen on staying updated anyway, I wouldn't be surprised a maintainer doesn't realize a new python version not working is that important.
        aftbit 192 days ago
        [flagged]
      - hkt 192 days ago
        Reactions, though.
      - stingraycharles 192 days ago
        It’s not just that, it’s people commenting “this is unacceptable” and “I hate this library” that add very little value.
        Also, you can upvote issues as well / leave reactions without leaving comments. It ensures a better signal:noise ratio in the actual discussion.
    - raverbashing 192 days ago
      Geez honestly
      This seems to be the issue https://github.com/explosion/spaCy/issues/13658#issuecomment...
      And you depend on opinionated libraries that break with newer versions. Why? Well because f you that's why! Because our library is not just a tool, it's a lifestyle
      Though it seems that Pydantic 1x does support 3.13 https://docs.pydantic.dev/1.10/changelog/#v11020-2025-01-07
    - __mharrison__ 191 days ago
      What is the passing answer?
    - bagels 192 days ago
      You're missing
      5) Do 1, 2, and 3
      6) Submit a a tested pull request with a fix
      [-]
      - efitz 192 days ago
        Except when you do 6 the maintainer rejects the request and says that it wasn’t a bug at all, and everyone is stuck with the bug.
      - matheusmoreira 192 days ago
        Pull requests should not be normalized. The guy in charge of the code base has complete knowledge of it and can always do a better job than randoms making ad hoc changes to quick fix their own problems. Complaining about bugs and missing features is much easier. It's less of a headache and requires less effort and time investment.
        Truth is developers don't want to end up maintaining other people's code anyways. It's gotten to the point I roll my eyes whenever I see people saying "patches welcome". Chances are they're not as welcome as you would think. "Show us the code", and when you do... Nothing. Submitting a pull request is no guarantee that they'll accept it. Hell it's no guarantee they'll even give you the time of day. You can put in a whole lot of effort and still end up getting refused or even straight up ignored.
        Even if it's an unambiguous improvement on the status quo. They'll commit code that literally returns "not implemented" directly to master and actually ship it out to users. Your pull request will certainly be held to much higher standards though. And that's if they engage with you in an attempt to get your patches merged. They might just decide to ignore your patch and redo all the work themselves, which is exactly what all the complainers wanted them to do in the first place! If they're really nice they'll even credit you in the commit message!
        If you're gonna put in effort into someone else's project, just fork it straight up and make it your own project. Don't waste time with upstream. Don't wait for their blessing. Just start making your own version better for you. Don't even submit it back, they can pull it themselves if they want to.
        [-]
        bagels 192 days ago
        Boo to you. I've up-streamed patches to scipy and a few other python packages. Nobody ever said no to genuine high quality bug reports that were coupled with good fixes. Not saying that will happen every time, but if you're unsure, ask how receptive they are.
        [-]
        matheusmoreira 192 days ago
        I've upstreamed patches too and I increasingly feel like it's a waste of time and that it requires far more effort than it's worth. Good maintainers who work with you are rare. I remember the ones who do.
        > Nobody ever said no to genuine high quality bug reports that were coupled with good fixes.
        Uhuh.
        > ask how receptive they are
        Doesn't really help much. It's entirely possible for people to tell you it's fine then change their minds after you actually do it and send the code in. Seriously hope you never have to experience anything of the sort.
        LtWorf 192 days ago
        We should normalise GOOD pull requests. And possibly avoid making garbage ones that fail 100% of the testsuite.
        [-]
        matheusmoreira 192 days ago
        You can put effort into all that, observe all the best practices, follow all the rules, try your very best and still end up with literally nothing to show for it simply because the other guy doesn't want to.
        Always fork the project and customize it for yourself. Don't argue with others, don't try to convince them, make your version work better for you. It doesn't matter if they want it or not. You don't even have to publish it.
        [-]
        eptcyka 192 days ago
        I wish more people attempted to actually grasp their dependencies by the scruff of their neck and got to know them deeply.
        kamray23 192 days ago
        Eh, most of the time you do have to publish it. MIT/BSD/Apache allows you not to, but most big projects are using LGPL or another share-alike license. With those, if you're making a project that you do publish (e.g. a commercial product that uses that library/language/etc.) you do kinda legally have to share that modified source code. It's of course different if you're scripting for your own sake, but the moment you publish anything using them you also have to publish your modified versions of libraries and other stuff.
        I do agree with the "fork and make it better for you", but I also think it's common courtesy to throw up issues for the bugs or missing features you encounter, as well as pull requests. If for no other reason then as "prototypes" or "quick fixes" that others can cherry-pick into their own forks. They may get rejected, they may get closed, but you still don't have to sit there arguing about it since you have your own version. From a slightly Kantian angle you have your working version and you've now fulfilled your social duty by offering solutions to problems. You've got no need to campaign for the solutions if they get rejected.
        It's virtuous and you get to ignore the github flame wars. There's really no downside beyond taking 5 minutes to be nice and at least put your solution out there. Also fulfills your reciprocal legal duty under LGPL and such.
        [-]
        MyPasswordSucks 192 days ago
        I think the best way to go (if you're the "Good Samaritan Who Wants to Do His Civic Duty but Also Doesn't Want to End Up Spending the Week in Nerd Court") is to fork it, fix it, and leave a brief comment in the issue thread that says "hey, I fixed this in my fork".
        That way, people with the same issue who come in to the issue thread through a search aren't reduced to combing through all the forks to see if one happens to fix the issue. Then instead of spamming up the thread with "me too, this is soo frustrating, fix asap pls", they can spam it up with "merge this fork into the main project asap pls" :-)
        LtWorf 192 days ago
        If you don't distribute to others you don't need to distribute your own changes even for copyleft licenses. That's why the AGPL license was written, because if you're doing something as a service you won't need to distribute any sources even with GPL license.
        matheusmoreira 192 days ago
        > you do kinda legally have to share that modified source code
        I'm using the license exactly as intended. Upstream developers literally don't matter. Free software is not about developers, it's about users.
        Free software licenses say the user gets the source code. They're about empowering the user, in this case me, with the ability use and modify the software as I see fit. If I customize something for my personal use, then I'm the only user of the fork and license terms are fulfilled. People would only get to demand source code from me if I started distributing compiled executables to other users.
        > You've got no need to campaign for the solutions if they get rejected.
        I used to feel that need. Caused me significant anxiety. I thought upstreaming patches was the Right Thing to do. Mercifully I've been cured of this misconception.
        > There's really no downside beyond taking 5 minutes to be nice and at least put your solution out there.
        I have no problem with that. My point is getting involved with upstream is often more trouble than it's worth. Let them do their thing. Just pull from them and rebase your changes. Enjoy life.
        People should think twice before trying to cram their unsolicited code down other people's throats. Even when they ask for the code, chances are deep down they don't actually want to deal with it.
        [-]
        kamray23 187 days ago
        That is exactly why I said that that only comes in once you publish something. The legal argument isn't as strong as the moral one is for me anyway, I never publish anything. I think making a PR isn't forcing code on someone, they're not obligated by anything to read and consider it, but if they do want to they can. I'll make one, and then I'll stop responding, because that's where my personal moral obligation ends. Whoever wants the code can now easily discover it, whoever doesn't can throw it away as they like. Including the upstream maintainer.
        kant2002 191 days ago
        Usually better start with small scale fixing typos and improving docs. That’s great canary for me. If it’s accepted within day or so, that’s project which I willing to learn. I would say you can select projects for your value and you would be fine contributing there. Eventually people will learn you and will trust you more
        [-]
        matheusmoreira 191 days ago
        I understand your point and it's good advice. Typos and documentation though? That's boring... I want to do interesting things. Ever wondered why some program can't do something? Surely someone much smarter would have thought of it, right? And then you code it up and it actually works? That's the kind of thing I like to do.
        I once tried to retrofit a library system into GNU bash of all things. Let's just say it didn't go well.
        [-]
        LtWorf 190 days ago
        Your patch sounds horrible, no wonder people reject them.
        [-]
        matheusmoreira 190 days ago
        Works for me. I can easily organize and reuse functions now. Made scripting much more pleasant. Even made a package out of it so I could install it alongside the upstream version.
        LtWorf 192 days ago
        Those are great guidelines for never achieving anything relevant.
        Also, the same ideas applied to dating rather than to code are what we call "incels/redpillers" and so on, so you might want to calm down a bit I think.
        [-]
        matheusmoreira 192 days ago
        > Those are great guidelines for never achieving anything relevant.
        Relevant to whom? Everything I've done is extremely relevant to me. I wouldn't have done those things if I didn't care about them.
        > Also, the same ideas applied to dating rather than to code are what we call "incels/redpillers" and so on
        I'd say it's more like MGTOW.
        [-]
        AlexeyBelov 191 days ago
        MGTOW are incels/redpillers though.
        [-]
        matheusmoreira 191 days ago
        They're related but not the same. In this context, an "incel" would be someone who's incapable of contributing to projects despite wanting to. That's not my case.
        [-]
        LtWorf 190 days ago
        If your contributions have all been rejected, it could be that they weren't as good as you think they were?
        I've certainly made people angry when I rejected their buggy "improvements".
        [-]
        matheusmoreira 190 days ago
        > If your contributions have all been rejected
        They haven't.
        I see your point though:
        > it could be that they weren't as good as you think they were
        Who knows. Maybe. Asked myself that question many times. Asked others.
        It doesn't matter anymore. Now I only wonder if they are good enough for me.
        throwaway519 192 days ago
        Truth.
  - sitkack 192 days ago
    I have been getting paid to write Python since the late 90s and it amazes me how it consistently has these needless own goals, yet still keeps on going in spite of itself. Way to go Python!
    spaCy should make Cython optional
    hard fork Cython to not used stringitized annotations
    stay on Python 3.12 forever and then skip to 3.15
    It is like have a crowd of people trying to outdo each other on how much self harm they can induce.
    [-]
    - ChrisMarshallNY 192 days ago
      > It is like have a crowd of people trying to outdo each other on how much self harm they can induce.
      See: Lemmings[0]
      [0] https://en.wikipedia.org/wiki/Lemmings_(National_Lampoon)
      [-]
      - sitkack 192 days ago
        I am picturing a whole dis track at pycon where we creatively mock everything and everyone.
    - martinky24 192 days ago
      What exactly is the own goal here…? They’re making the language better in the upcoming release. This is how normal software works.
      In no world is this an “own goal”. God forbid they take on a big task for the betterment of the future language.
      [-]
      - yjftsjthsd-h 192 days ago
        > What exactly is the own goal here…? They’re making the language better in the upcoming release. This is how normal software works.
        Backward incompatible changes are an own goal because they either (depending on your view) make the software worse, or make it better and then make those improvements unavailable to users.
      - sitkack 192 days ago
        I don’t think you understand the whole issue of why spaCy can’t move to Python 3.13
        God forbid.
  - rhdunn 192 days ago
    I hit this when upgrading to Ubuntu 25.04 as that upgraded to Python 3.13. I'm running the various python projects I want in a venv. For the projects stuck on 3.12 I ended up building and installing it from source to my local directory (removing the unprefixed links for compatibility) as the ppa for Python ports doesn't (didn't?) support the latest Ubuntu.
    I dislike using something like docker or conda as I don't want to install/use a separate distro just to be able to use a different version of Python. My setup is working well so far.
    [-]
    - turbocon 192 days ago
      Everyone has there own preferences, but I'd look into uv if I were you. It allows you to specify the python version, and for scripts you can even specify the python version as part of the shebang
      [-]
      - Imustaskforhelp 191 days ago
        uv is literally the goat except I haven't able to make vllm work in uv for some reason. Though aside from that, I think I need to use shebang more because I don't use it as often right now.
  - rtpg 192 days ago
    I was going to write something glib about getting things fixed but that thread looks gnarly!
    To be honest I know so many people who use Pydantic and so many people who seem to get stuck because of Pydantic 2. I’m glad I have minimal exposure to that lib, personally.
    I suppose the biggest issue really is type annotation usage by libs being intractable
  - bsoles 192 days ago
    Does that mean that a point release of Python has breaking changes? If true, that sounds crazy.
    [-]
    - aftbit 192 days ago
      Yes, every Python 3 release for me (at least since 3.6) has had a breaking change or two that affects a library I use. Most of the time, the fix is pretty trivial, but not always.
    - dymk 192 days ago
      Yes, things like format string syntax change between 3.10 and 3.11 and it’s incredibly frustrating
  - pbronez 192 days ago
    Just ran into something similar with Great Expectations. Python 3.12 is the newest I can run.
    [-]
    - 0cf8612b2e1e 192 days ago
      uv seems to have reverted to defaulting to 3.12 instead of 3.13. Which I fully endorse owing to how many packages are not yet compatible.
    - ashishb 192 days ago
      Exactly, many compiled languages like Java and Go do not suffer from this issue.
      [-]
      - LtWorf 192 days ago
        It's not about being compiled or not compiled. Python is now making breaking changes on every release instead of piling up a bunch of them and making python 4.
        So what we get is a a mini python2/3 situation on every single release instead.
        [-]
        ashishb 192 days ago
        Yeah.
        Even patch version upgrade from 3.12.3 to 3.12.4 broke a lot of packages.
        https://github.com/langchain-ai/langchain/issues/22692
      - fiddlerwoaroof 192 days ago
        Tell that to all the companies stuck on Java 8 with old versions of Spring and Hibernate. This is the cost of an ecosystem where major libraries make breaking changes.
        [-]
        lmm 192 days ago
        Spring and Hibernate broke the rules of the language and paid the price, and nevertheless all the companies I'm aware of managed to migrate the best part of a decade ago.
- devrandoom 192 days ago
  > One could argue
  How?
  [-]
  - chrisrodrigue 192 days ago
    From https://github.com/python/cpython/issues/99108#issue-1436673...:
    > As evidenced by the recent SHA3 buffer overflow, cryptographic primitives are tricky to implement correctly. There might be issues with memory management, exceeding lengths, incorrect buffer management, or worse, incorrect implementations in corner cases.
    This is a proactive fix for zero days that may be lurking in the wild.
bgwalter 192 days ago
Reading the article, they took a verified C library generated from F* from Microsoft, vendored the code in CPython and wrote a C extension.
And during the process they discovered that the original library did not handle allocation failures?
What is the big deal about Python here? It's just another wrapped C library.
[-]
- quantumgarbage 192 days ago
  A seamless, drop-in replacement, for all python users is actually a pretty good feat tbh
- anon6362 192 days ago
  In general, it's good practice to use best components available, regardless of source. Cryptographic components especially shouldn't be DIY.
  It would be cool if Ruby did something similar for stdlib/openssl, but it could be also be done in a gem too.
- nine_k 192 days ago
  > just another wrapped C library.
  I suppose you mean "just another top-notch library available in the Python ecosystem"? :)
- bolognafairy 192 days ago
  What part of how this is being conveyed do you disagree with? Specifically.
  This isn’t a language war.
  [-]
  - neuroelectron 192 days ago
    The headline makes it sound like it was written in Python which is a lot more readable than 15,000 lines of C
    [-]
    - frumplestlatz 192 days ago
      The 15k lines of C are generated. The implementations are written in (and formally verified using) F*, and then the C is generated from the F* code using KaRaMeL.
      One should be able to trust the proofs of correctness and never look at (or maintain) the generated C directly.
      - https://fstar-lang.org
      - https://github.com/FStarLang/karamel
      - https://github.com/hacl-star/hacl-star
      [-]
      - wepple 192 days ago
        I’m not a verification expert - do we know or assume that the F* to C compiler preserves the verified code completely? And doesn’t introduce issues?
        [-]
        aw1621107 192 days ago
        This paper [0] linked in the KaRaMeL repo claims to provide the proof you're looking for:
        > Verified Low-Level Programming Embedded in F∗
        > We present Low∗, a language for low-level programming and verification, and its application to high-assurance optimized cryptographic libraries. Low∗ is a shallow embedding of a small, sequential, well-behaved subset of C in F∗, a dependently-typed variant of ML aimed at program verification. Departing from ML, Low∗ does not involve any garbage collection or implicit heap allocation; instead, it has a structured memory model à la CompCert, and it provides the control required for writing efficient low-level security-critical code.
        > By virtue of typing, any Low∗ program is memory safe. In addition, the programmer can make full use of the verification power of F∗ to write high-level specifications and verify the functional correctness of Low∗ code using a combination of SMT automation and sophisticated manual proofs. At extraction time, specifications and proofs are erased, and the remaining code enjoys a predictable translation to C. We prove that this translation preserves semantics and side-channel resistance.
        [0]: https://arxiv.org/pdf/1703.00053
        [-]
        wepple 191 days ago
        Thanks! Looks hefty but I’ll try give it a crack
      - myko 192 days ago
        This is fantastic. Somewhere my CS prof is smiling
- nxpnsv 192 days ago
  How is that bad in any way?
Retr0id 192 days ago
Will this bring support for "streaming" output from SHAKE?
Here's the (recently closed!) issue for pyca/cryptography regarding this feature, although I can't find an equivalent issue for the python stdlib: https://github.com/pyca/cryptography/issues/9185
Edit: Found a relevant issue, "closed as not planned": https://github.com/python/cpython/issues/82198
IlikeKitties 192 days ago
Modern, ubiquitous cryptography is now practically unbreakable (even for the NSA) and widely used. The crypto wars of the 90s seem rather quaint. Any thoughts on the effects on society this now has?
[-]
- xyzzy123 192 days ago
  It's cool that we can largely "strike out" link level wiretapping from our threat models but it just means attackers move on to the endpoints. I have a wonderfully private link to google, my bank and my crypto exchange but all of those are coerced to report everything I do.
  [-]
  - TacticalCoder 192 days ago
    > ... but it just means attackers move on to the endpoints.
    Yup but this doesn't scale anywhere near as well for the attackers.
    [-]
    - FabHK 192 days ago
      Sorry, if there are N clients, and M servers, then there are N+M endpoints, but N*M links, which is a lot more.
      [-]
      - PhilipRoman 192 days ago
        The link is only as strong as it's weakest... link? Anyway, given the structure of internet, I wouldn't say it is O(N*M), more like O(the number of major ISPs).
      - Ey7NFZ3P0nzAe 191 days ago
        Well all endpoints might differ but the links might rely on the exact same algorithm so you need to count distinct links
    - artursapek 192 days ago
      Most internet traffic is cosolidated through a small number of providers like Cloudflare and AWS.
      [-]
      - hkt 192 days ago
        Cloudflare being, if it wants to be, an epic MITM given its control of DNS and its role as WAF. Line level surveillance barely matters now.
    - pyfon 192 days ago
      Except for countries who hist companies who hold all the endpoints.
- Analemma_ 192 days ago
  I think modern cryptography is basically unbreakable if used correctly, but there is still a lot of work to do re: developer ergonomics and avoiding footguns in implementations. This is much better than it used to be, thanks to things like libsodium, moving away from RSA, and newer protocols that de-emphasize cipher negotiation, but there is still more to do, even with “good” crypto primitives. For example, AES used perfectly is probably unbreakable but AES-GCM has ways to bite the unwary; ideally we should think about an even newer symmetric block cipher or increasing awareness of better AES modes without GCM’s drawbacks.
- jakeogh 192 days ago
  I just got bit by device attestation. Tried to install the latest ebay app version via the Aurora Store (on GrapheneOS), and instead of presenting me with the option of using my ebay login, it wanted a google account at a play store login with no way to bypass. I was able to downgrade to the previous version which does not require the Integrity API, but the walls are closing in. Only took 7 months: https://news.ycombinator.com/item?id=41517159 (I did get ebay on the phone and filed an issue, hopefully others do the same)
- matheusmoreira 192 days ago
  > Any thoughts on the effects on society this now has?
  I have observed two effects.
  They are constantly trying to make it illegal for the common man to use cryptography. Constantly. Cryptography is subversive. It has the power to defeat judges, armies, nations. Authorities don't want normal people to have access to this. It will increasingly become a tool of corporations and governments.
  Authorities are bypassing cryptography instead by attacking the end points. It's easier to crack other parts of the system and exfiltrate data after it's been decrypted than to break the cryptography. Easier to hack a phone and get the stored messages than to intercept end-to-end encrypted ciphertext and decrypt it.
- gpcz 192 days ago
  For now. If someone makes a practical quantum computer, pretty much every asymmetric primitive we use at the start of a cryptographic protocol to make a shared secret for symmetric encryption will be breakable.
  [-]
  - dist-epoch 192 days ago
    The switch to post-quantum encryption already started - Signal, Chrome, iMessage
- anon6362 192 days ago
  This is so vague as to be meaningless because body of research (attacks and hw-accelerated implementations), details, implementations, uses, and adversary/ies budget(s) matter. Furthermore, the obsession with optimizing for "fast" cryptographic algorithms/implementations undermine themselves when it comes to the cost of CPU-bound brute force attack which become cheaper and more attainable over time.
- imiric 192 days ago
  Why are you so certain of this? The NSA has a long history of attempting to insert backdoors in cryptographic systems. Most recently they bribed RSA to make their compromised PRNG the default, which shipped in software as late as 2014, possibly even later.
  And these are just the attempts we know about. What makes you think that they haven't succeeded and we just don't know about it?
  [-]
  - IlikeKitties 192 days ago
    We know from the Snowden Leaks that they couldn't crack PGP at the time. There's been some talks about cracking "export grade" cryptography and how that is done. I'm pretty confident that the newer hardened crypto libraries are pretty secure and since even the NSA recommends signal encryption now because the infrastructure in the US has so many holes the Chinese are in the entire mobile internet infrastructure, that's a pretty strong lead that it's very hard if not impossible to crack signal, at least for the Chinese.
    It's also very likely that even if they can attack crypto in ways we don't know about, they can at best reduce the required time it takes to crack a given key. Chosing extra long keys and changing them frequently should protect you from lots of attacks.
    [-]
    - mmooss 192 days ago
      > It's also very likely that even if they can attack crypto in ways we don't know about, they can at best reduce the required time it takes to crack a given key.
      Why do you say that? Very often the vulnerabilities are not in the mathematics but in the implementation.
      If Signal works, as was pointed out during the recent scandal, it only protects messages in transit. The application on your phone is not especially secure and anyone who can access your phone can access your Signal messages (and thus any of your correspondents' messages that they share with you).
      > that's a pretty strong lead that it's very hard if not impossible to crack signal, at least for the Chinese.
      The NSA does not recommend Signal for classified communication.
      The NSA thinks use of Signal is the best interest of the US government, as the NSA perceives those interests (every institution will have its own bias). It could be that Signal is the least insecure of the easily available options or that that the NSA believes that only they can crack Signal.
      [-]
      - commandersaki 192 days ago
        > If Signal works, as was pointed out during the recent scandal, it only protects messages in transit. The application on your phone is not especially secure and anyone who can access your phone can access your Signal messages (and thus any of your correspondents' messages that they share with you).
        Device compromise is outside the threat model for Signal or any software for that matter.
        > Why do you say that? Very often the vulnerabilities are not in the mathematics but in the implementation.
        This is why we use primitives that are well understood and have straightforward implementations. I'm no expert but you can look at design of say Salsa20 or Curve25519 -- it was designed for straightforward constant time implementation. Yes NIST suite has implementation subtleties such as constant time implementations, but those ward off issues that are of pretty low concern / infeasible to attack (i've yet to see a timing channel or cache side channel be exploited in the wild except for the XBox 360 MAC verification). Also CFRG has been publishing guidance so we know how to properly implement KEMs, (hybrid) encryption, PAKE, hash to curve, signatures, ECDH, ECDSA, etc. Compound this with a lot of footgun free crypto libraries such as Go crypto, libsodium, monocypher, BoringSSL, Tink, etc. and these days you'd be hard pressed to make a misstep cryptography wise.
        In my opinion, NSA advantage is not that it has a virtually unlimited budget, it's that they have better vantage point to carry out multi-user attacks. So does the likes of FAANG, Fastly, Cloudflare, etc.
        [-]
        mmooss 192 days ago
        > Device compromise is outside the threat model for Signal or any software for that matter.
        I agree about Signal - that's what they say iirc. Some software does take it into account. The point here is about security depending on much more than cryptography mathematics; Signal is just an example.
      - IlikeKitties 192 days ago
        > Why do you say that? Very often the vulnerabilities are not in the mathematics but in the implementation.
        I recommend this talk: https://www.youtube.com/watch?v=v8Pma5Bdvoo This gives you a good example how practical attacks and intentional weakening of crypto works.
        And especially for common cryptos like AES and RAS you can easily compare the outputs of different implementations. If one is different, that one is suspect. And especially for open source crypto like OP, the implementation can easily be verified.
        [-]
        mmooss 192 days ago
        You describe implementation as easy, but in practice it takes hard-to-find expertise and lots of resources. In the case of the OP, look what it took to get a secure implementation into Python, as late as the year 2025.
      - upofadown 192 days ago
        My take is that the Signalgate thing was mostly about usability[1]. Which I suppose could be considered an implementation thing but is something that should be explicitly addressed. So a possible NSA based conspiracy theory here is that NSA is working in the background impairing usability.
        Cryptographic libraries are not always documented as well as they should be. In the case of something like Python, it is not always easy to map library documentation to the resultant Python.
        [1] https://articles.59.ca/doku.php?id=em:sg
  - mmooss 192 days ago
    > What makes you think that they haven't succeeded and we just don't know about it?
    Yes, afaik they also have a history of not revealing exploits they discover. With a budget in the tens of billions and tens of thousands of employees, they have the resources to discover quite a bit.
jart 192 days ago
I think it was pretty obvious to anyone who looked at the SHA3 code Python had that it was insane and not to be trusted. Everyone smart moved on to blake2b. How they managed to frameworkify a hash function and push it through standardization, I'll never know.
andai 192 days ago
My first thought was "in the Python language", but it's a C extension. Is it even possible to write crypto code in memory managed languages? Due to the need for everything to run in constant time.
[-]
- martinky24 192 days ago
  IIRC Go has pretty decent crypto implemented in Go: https://pkg.go.dev/crypto
- wolf550e 192 days ago
  Plenty of cryptography in Java, C# and Go.
- EasyMark 192 days ago
  rust is a type of memory managed language and I don't see any issue with writing crypto there?
  [-]
  - dymk 192 days ago
    c++ is a (poorly) memory managed language by that logic (std::unique_ptr, etc)
    [-]
    - EasyMark 185 days ago
      Actually c++ is also excellent if you use modern c++20 and greater. Crypto algos tend to be fairly small compared to a lot of other software and if you can't keep your coders disciplined enough to use memory safe code there then the company or project should probably go under because of skills issues.
      [-]
      - dymk 183 days ago
        Alright, enjoy your RCEs. I’ll be over here being productive, despite my skill issue.
hall0ween 193 days ago
I'm cryptographically ignorant. What does this mean for python?
[-]
- otterley 192 days ago
  https://en.wikipedia.org/wiki/Formal_verification
  Provable correctness is important in many different application contexts, especially cryptography. The benefit for Python (and for any other user of the library in question) is fewer bugs and thus greater security assurance.
  If you look at the original bug that spawned this work (https://github.com/python/cpython/issues/99108), the poster reported:
  """As evidenced by the recent SHA3 buffer overflow, cryptographic primitives are tricky to implement correctly. There might be issues with memory management, exceeding lengths, incorrect buffer management, or worse, incorrect implementations in corner cases.
  The HACL* project https://github.com/hacl-star/hacl-star provides verified implementations of cryptographic primitives. These implementations are mathematically shown to be:
```
    memory safe (no buffer overflows, no use-after-free)
    functionally correct (they always compute the right result)
    side-channel resistant (the most egregious variants of side-channels, such as memory and timing leaks, are ruled out by construction)."""
```
- hathawsh 192 days ago
  Here is how I read it: CPython has a hashlib module that has long been a wrapper around certain functions exported from OpenSSL, but since a SHA3 buffer overflow was discovered in OpenSSL, the CPython project has now decided to wrap a library called HACL*, which uses formal verification to ensure there are no buffer overflows and other similar bugs.
- aseipp 192 days ago
  Extreme ELI5 TL;DR: Your Python programs using the cpython interpreter and its built in cryptographic modules will now be using safer and faster, with no need for you to do anything.
  [-]
  - kccqzy 192 days ago
    Who uses its built-in cryptographic modules though? I wrote a fair bit of cryptographic code in Python more than five years ago and most people recommended the cryptography package https://pypi.org/project/cryptography/ over whatever that's built-in.
    [-]
    - Retr0id 192 days ago
      I'm a big fan of pyca/cryptography and I use it for any serious project, but if I just need hashing I tend to use the standard library hashlib - it saves a somewhat heavy dependency, and the API is less verbose.
      Also, pyca/cryptography uses OpenSSL. OpenSSL is fine, but has the same "problem" as the previous python stdlib implementation. (Personally I think it's an acceptable risk. If anything, swapping in 15,000 lines of new code is the greater risk, even if it's "verified")
      [-]
      - frumplestlatz 192 days ago
        I'm curious why you put "verified" in scare-quotes, and why you think adopting formally verified code is a greater risk.
        [-]
        Retr0id 192 days ago
        I don't think formal verification is bad, I just don't think it's a silver bullet. i.e. we should not get complacent and assume that formally verified code is 100% safe.
  - odo1242 192 days ago
    Is it faster? I’m pretty sure the main goal of this effort is just the “safer” part.
    [-]
    - protz 192 days ago
      the performance was pretty much identical, however, as an added benefit, Blake2 got quite a bit faster due a combination of 1) our code being slightly more optimized, and 2) python's blake2 integration not actually doing runtime cpu detection, meaning that unless you compiled with -march=native (like, Gentoo), at the time, you were not getting the AVX2 version of Blake2 within Python -- my PR fixed that and added code for CPU autodetection
      bear in mind that performance is tricky -- see my comment about NEON https://github.com/python/cpython/pull/119316#issuecomment-2...
    - aseipp 192 days ago
      The goal is to make things safer, yes, but speed is absolutely a major priority for the project and a requirement for production deployment, because the difference in speed for optimized designs vs naive ones might be an order of magnitude or more. It's quite speedy IME. To be fair to your point, though, it's also a moving target; "which is faster" can change as improvements trickle in. Maybe "shouldn't be much slower" is a better statement, but I was speaking generally :)
      (You could also make the argument that if previous implementations that were replaced were insecure that their relative performance is ultimately irrelevant, but I'm ignoring that since it hinges on a known unknown.)
    - drewcoo 192 days ago
      And safer is often slower to avoid timing attacks.
      [-]
      - aseipp 192 days ago
        I mean, most if not all of the code they're replacing (e.g. the vendored and vectorized Blake2 code) is also going to be designed and optimized with timing attacks in mind and implemented to avoid them. CVE-2022-37454 was literally found in a component that was optimized and had timing attack resistance in mind (XKCP from the SHA-3 authors). "Code that is already known to be wrong" is not really a meaningful baseline to compare against.
- jessekv 192 days ago
  More context here.
  https://github.com/python/cpython/issues/99108#issuecomment-...
tsecurity 192 days ago
How much of the development of this was verified, and what does that consist of?
I worry a little when I read that things are verified and were hard.
[-]
- aseipp 192 days ago
  https://eprint.iacr.org/2017/536.pdf is the relevant paper that introduces the project and its broad design. Figure 1 on page 3 is a good place to look.
- rtkwe 192 days ago
  The first two shouldn't matter because the entirety of the code is verified and anyone can check the verification. The last is an issue with any cryptography but verification doesn't try to address that only that the code does precisely and only what it's supposed to; ie it should guarantee that there are no exploits possible against that library.
badmonster 192 days ago
how reusable is the generic streaming verification framework beyond cryptographic hashes?
pluto_modadic 192 days ago
does this mean anything that imports the cryptography modules needs to include G++ or something weird, or is it compiled into cpython itself?
[-]
- __s 192 days ago
  The latter. Most of CPython is C you don't need compiler to run
AndyMcConachie 192 days ago
Thank you!
thebeardisred 193 days ago
Lines of code is a pretty poor measurement. Doubly so when you're boasting a large number in the context of cryptographic code.
[-]
- bbeonx 192 days ago
  This isn't really a metric. It's a description to help the reader understand the magnitude of effort that went into this project. SLoC is a bad metric for plenty of things, but I think it's fine for quickly conveying the scope of a project to a blog reader.
  [-]
  - odyssey7 192 days ago
    Lines of code is also a poor indicator for “magnitude of effort.”
    Tangent: generally I’m more inclined to believe quality is improved when someone claims 1000s of lines reduced rather than 1000s of lines written.
    [-]
    - tgv 192 days ago
      Do you remember writing the proof for quicksort, which is say 0.1k lines? 15k lines of verified code is a pretty good indication of effort.
      But the problem may come from the headline, which is somewhat clickbaity. HN forbids changing it, and then part of the discussion focuses on the literal content of the headline, which is, as you rightly hint, not the best summary of what's interesting here.
      [-]
      - odyssey7 192 days ago
        Programs are already proofs. A 15,000-line proof is going to have a mistake somewhere.
        In mathematics, the proofs tend to be resilient to minor errors and omissions, because the important part is that the ideas are correct.
        In applied cryptography, errors and omissions are foundations for exploits.
        Verifying that those 15,000 lines do what they do doesn't give me much more confidence than thorough unit testing would.
        [-]
        the-lazy-guy 192 days ago
        > A 15,000-line proof is going to have a mistake somewhere.
        If this proof is formal, than it is not going to. That is why writing formal proofs is such a PITA, you actually have to specify every little detail, or it doesn't work at all.
        > Verifying that those 15,000 lines do what they do doesn't give me much more confidence than thorough unit testing would.
        It actually does. Programs written in statically typed languages (with reasonably strong type systems) empirically have less errors than the ones written in dynamically typed languages. Formal verification as done by F* is like static typing on (a lot of) steroids. And HACL has unit tests among other things.
    - chaitimes 192 days ago
      From my experience, pre LLMs, it was a valid proxy metric for effort
    - xeromal 192 days ago
      See: AI generating 1000s of lines
  - lionkor 192 days ago
    It's a metric of complexity
- aseipp 192 days ago
  It isn't boasting about anything, it's a straightforward description of the 2.5 years of effort they went through for this project, and some of the more annoying "production integration" bits.
- bbstats 192 days ago
  the 2nd sentence:
  "As of last week, this issue is now closed, and every single hash and HMAC algorithm exposed by default in Python is now provided by HACL*, the verified cryptographic library."
  it's describing coverage.
- titaphraz 192 days ago
  > Lines of code is a pretty poor measurement
  But let's say at least that it's abused, exploited, perverted and a molested measurement before we call it a poor one. It not all that bad when you have context.
- daquisu 192 days ago
  Which better measurement do you propose?
evertyhingisfin 192 days ago
[flagged]
Surac 192 days ago
This all reads like if any new version of python breaks old stuff? Never thought python is unstable
[-]
- procaryote 192 days ago
  Minor versions of python knowingly break things all the time, often to remove something that was earlier deprecated because a tidy namespace is more important than backwards compatibility, apparently.
  The fact that you often won't find out until runtime makes it a real pain
  It's a mess.
sylware 192 days ago
And who in going to "verify" the machine code produced by the compiler used for the python interpreter itself?
(then the hardware would have to be validated as well if complex ISA).
The point is, cryptography should be mostly if not all assembly. Look at dav1d AV1 decoder to have an idea.
Tarucho 192 days ago
Not saying it isn´t useful but with this change Python's crypto depends on Microsoft (https://fstar-lang.org/ and https://project-everest.github.io/)
[-]
- NavinF 192 days ago
  > HACL* is a formally verified library of modern cryptographic algorithms, where each primitive is verified for memory safety, functional correctness, and secret independence. HACL* provides efficient, readable, standalone C code for each algorithm that can be easily integrated into any C project.
  > All the code in this repository is released under an Apache 2.0 license. The generated C code from HACL* is also released under an MIT license.
  You have an unusual definition of "depends on Microsoft". Anyone worried about depending on Microsoft should be able to maintain 15k lines of C that are already formally verified. Python already vendored the code so who cares who wrote that code?
  [-]
  - Tarucho 192 days ago
    [flagged]