A “frozen” dictionary for Python

Posted by jwilk 1 day ago

Comments

Comment by pansa2 1 day ago

I wonder whether Raymond Hettinger has an opinion on this PEP. A long time ago, he wrote: "freezing dicts is a can of worms and not especially useful".

https://mail.python.org/pipermail/python-dev/2006-February/0...

Comment by pkulak 18 hours ago

This is why I love how Rust approached this; almost by accident to make borrow checking work. Every reference is either mutable or not, and (with safe code), you can't use an immutable reference to get a mutable reference anywhere down the chain. So you can slowly construct a map through a mutable reference, but then return it out of a function as immutable, and that's the end of it. It's no longer ever mutable, and no key or value is either. There's no need to make a whole new object called FrozenHashMap, and then FrozenList, and FrozenSet, etc. You don't need a StringBuilder because String is mutable, unless you don't want it to be. It's all just part of the language.

Kotlin _kinda_ does this as well, but if you have a reference to an immutable map in Kotlin, you are still free to mutate the values (and even keys!) as much as you like.

Comment by rcxdude 16 hours ago

Only if you're returning a reference or wrapping it in something that will only ever return a reference. If you return an object by value ('owned'), then you can do what you like with it and 'mut' is just an light guardrail on that particular name for it.

Comment by vlovich123 16 hours ago

You cannot return an immutable version. You can return it owned (in which case you can assign/reassign it to a mut variable at any point) or you can take a mut reference and return an immutable reference - but whoever is the owner can almost always access it mutably.

Comment by pkulak 16 hours ago

Arg, you’re right. Not sure what I was thinking there. I still think my point stands, because you get the benefits of immutability, but yeah, I didn’t explain it well.

Comment by tekne 15 hours ago

I mean, if you return an immutable reference, the owner in fact cannot mutate it until that reference is dropped.

If you in fact return e.g. an Rc::new(thing) or Arc::new(thing), that's forever (though of course you can unwrap the last reference!)

Comment by aw1621107 2 hours ago

> I mean, if you return an immutable reference, the owner in fact cannot mutate it until that reference is dropped.

Might be worth noting that "dropped" in this context doesn't necessarily correspond to the reference going out of scope:

    fn get_first(v: &Vec<i32>) -> &i32 {
        &v[0]
    }

    fn main() {
        let mut v = vec![0, 1, 2];
        let first = get_first(&v);
        print!("{}", first});
        v.push(3); // Works!
        // print!("{}", first); // Doesn't work
    }

Comment by the__alchemist 17 hours ago

My favorite part of rust is its explicit control over mutability, in the manner you describe.

Comment by jonathaneunice 1 day ago

That's a great link and recommended reading.

It explains a lot about the design of Python container classes, and the boundaries of polymorphism / duck typing with them, and mutation between them.

I don't always agree with the choices made in Python's container APIs...but I always want to understand them as well as possible.

Also worth noting that understanding changes over time. Remember when GvR and the rest of the core developers argued adamantly against ordered dictionaries? Haha! Good times! Thank goodness their first wave of understanding wasn't their last. Concurrency and parallelism in Python was a TINY issue in 2006, but at the forefront of Python evolution these days. And immutability has come a long way as a design theme, even for languages that fully embrace stateful change.

Comment by zahlman 23 hours ago

> Also worth noting that understanding changes over time. Remember when GvR and the rest of the core developers argued adamantly against ordered dictionaries? Haha! Good times!

The new implementation has saved space, but there are opportunities to save more space (specifically after deleting keys) that they've now denied themselves by offering the ordering guarantee.

Comment by jonathaneunice 23 hours ago

Ordering, like stability in sorting, is an incredibly useful property. If it costs a little, then so be it.

This is optimizing for the common case, where memory is generally plentiful and dicts grow more than they shrink. Python has so many memory inefficiencies that occasional tombstones in the dict internal structure is unlikely to be a major effect. If you're really concerned, do `d = dict(d)` after aggressive deletion.

Comment by zahlman 23 hours ago

> Ordering, like stability in sorting, is an incredibly useful property.

I can't say I've noticed any good reasons to rely on it. Didn't reach for `OrderedDict` often back in the day either. I've had more use for actual sorting than for preserving the insertion order.

Comment by mcherm 19 hours ago

Personally, I find lots of reasons to prefer an orders Dict to an unordered one. Even small effects like "the debugging output will appear in a consistent order making it easier to compare" can be motivation enough in many use cases.

Comment by xen0 18 hours ago

It's sometimes nice to be deterministic.

I don't often care about a specific order, only that I get the same order every time.

Comment by no_wizard 18 hours ago

Thinking about this upfront for me, I am actually wondering why this is useful outside of equality comparisons.

Granted, I live and work in TypeScript, where I can't `===` two objects but I could see this deterministic behavior making it easier for a language to compare two objects, especially if equality comparison is dependent on a generated hash.

The other is guaranteed iteration order, if you are reliant on the index-contents relationship of an iterable, but we're talking about Dicts which are keyed, but extending this idea to List, I see this usefulness in some scenarios.

Beyond that, I'm not sure it matters, but I also realize I could simply not have enough imagination at the moment to think of other benefits

Comment by xen0 18 hours ago

I work on a build system (Bazel), so perhaps I care more than most.

But maybe it does all just come down to equality comparisons. Just not always within your own code.

Comment by dontlaugh 16 hours ago

Being able to parse something into a dict and then serialise it back to the same thing is a bit easier. Not a huge advantage, though.

Comment by kzrdude 17 hours ago

It seems like opinions really differ on this item then. I love insertion sort ordering in mappings, and python with it was a big revelation. The main reason is that keys need some order, and insertion order -> iteration order is a lot better than pseudorandom order (hash based orders).

For me, it creates more reproducible programs and scripts, even simple ones.

Comment by vanviegen 21 hours ago

Indeed! I don't understand why it isn't more common for stdlibs to include key-ordered maps and sets. Way more useful than insertion ordering.

Comment by zahlman 21 hours ago

Presumably because it involves different performance characteristics.

Comment by morshu9001 19 hours ago

Same. Recently I saw interview feedback where someone complained that the candidate used OrderedDict instead of the built-in dict that is now ordered, but they'll let it slide... As if writing code that will silently do different things depending on minor Python version is a good idea.

Comment by tehnub 19 hours ago

Well it's been guaranteed since 3.7 which came out in 2018, and 3.6 reached end-of-life in 2021, so it's been a while. I could see the advantage if you're writing code for the public (libraries, applications), but for example I know at my job my code is never going to be run with Python 3.6 or older.

Comment by morshu9001 16 hours ago

Yeah, if you have that guarantee then I wouldn't fault anyone for using dict, but also wouldn't complain about OrderedDict.

Comment by sam_bristow 4 hours ago

Honestly, if I was writing some code that depended on dicts being ordered I think I'd still use OrderedDict in modern Python. I gives the reader more information that I'm doing something slightly unusual.

Comment by BiteCode_dev 19 hours ago

Ordering is very useful for testing.

This morning for example, I tested an object serialized through a JSON API. My test data seems to never match the next run.

After a while, I realized one of the objects was using a set of objects, which in the API was turned into a JSON array, but the order of said array would change depending of the initial Python VM state.

3 days ago, I used itertools.group by to group a bunch of things. But itertools.group by only works on iterable that are sorted by the grouping key.

Now granted, none of those recent example are related to dicts, but dict is not a special case. And it's iterated over regularly.

Comment by seanhunter 21 hours ago

Ordering is specifically a property (useful or not) that a set doesn't have. You need a poset for it to be ordered.

I would expect to use a different data structure if I needed an ordered set.

Comment by LtWorf 21 hours ago

Does your code actually rely on that? I've never once needed it.

Comment by mvanbaak 1 day ago

This was 19 (almost) 20 years ago. As stated in the lwn.net article, a lot of concurrency has been added to python, and it might now be time for something like a frozendict.

Things that were not useful in 2006 might be totally useful in 2026 ;P

Still, like you, I'm curious wether he has anything to say about it.

Comment by aewens 23 hours ago

I think Raymond Hettinger is called out specially here because he did a well known talk called [Modern Dictionaries](https://youtu.be/p33CVV29OG8) where around 32:00 to 35:00 in he makes the quip about how younger developers think they need new data structures to handle new problems, but eventually just end up recreating / rediscovering solutions from the 1960s.

“What has been is what will be, and what has been done is what will be done; there is nothing new under the sun.”

Comment by sesm 23 hours ago

Since that time HAMT was invented and successfully used in Scala and Clojure, so this talk didn't age well.

Comment by Someone 22 hours ago

Wikipedia (https://en.wikipedia.org/wiki/Hash_array_mapped_trie) links to the paper describing HAMT (https://infoscience.epfl.ch/server/api/core/bitstreams/f66a3...) and claims that is from 2000. That talk is from 2016.

Comment by zelphirkalt 21 hours ago

Do you know of any implementation, that is well annotated/commented, so that it is easy to understand?

Comment by ndr 22 hours ago

HAMT weren't immutable/persistent until Clojure though: https://en.wikipedia.org/wiki/Persistent_data_structure#Pers...

Still well before the talk.

Comment by dkarl 23 hours ago

It's interesting that he concludes that freezing dicts is "not especially useful" after addressing only a single motivation: the use of a dictionary as a key.

He doesn't address the reason that most of us in 2025 immediately think of, which is that it's easier to reason about code if you know that certain values can't change after they're created.

What a change in culture over the last 20 years!

Comment by morshu9001 19 hours ago

You can't really tell though. Maybe the dict is frozen but the values inside aren't. C++ tried to handle this with constness, but that has its own caveats that make some people argue against using it.

Comment by krick 17 hours ago

Indeed. So I don't really understand what this proposal tries to achieve. It even explicitly says that dict → frozendict will be O(n) shallow-copy, and the contention is only about O(n) part. So… yeah, I'm sure they are useful for some cases, but as Raymond has said — it doesn't seem to be especially useful, and I don't understand what people ITT are getting excited about.

Comment by 16 hours ago

Comment by morshu9001 15 hours ago

Maybe treating Python like a systems language, so applying the same reasoning for const in C++ and Rust to it

Comment by zahlman 23 hours ago

> Another PEP 351 world view is that tuples can serve as frozenlists; however, that view represents a Liskov violation (tuples don't support the same methods). This idea resurfaces and has be shot down again every few months.

... Well, yes; it doesn't support the methods for mutation. Thinking of ImmutableFoo as a subclass of Foo is never going to work. And, indeed, `set` and `frozenset` don't have an inheritance relationship.

I normally find Hettinger very insightful so this one is disappointing. But nobody's perfect, and we change over time (and so do the underlying conditions). I've felt like frozendict was missing for a long time, though. And really I think the language would have been better with a more formal concept of immutability (e.g. linking it more explicitly to hashability; having explicit recognition of "cache" attributes, ...), even if it didn't go the immutable-by-default route.

Comment by kccqzy 22 hours ago

Apple (or perhaps NeXT) has solved this problem already in Objective-C. Look at NSArray and NSMutableArray, or NSData and NSMutableData. It’s intuitive and Liskov-correct to make the mutable version a subclass of the immutable version. And it’s clearly wrong to have the subclass relationship the other way around.

Given how dynamic Python is, such a subclass relationship need not be evident at the C level. You can totally make one class whose implementation is independent of another class a subclass of the other, using PEP 3119. This gives implementations complete flexibility in how to implement the class while retaining the ontological subclass relationship.

Comment by pansa2 23 hours ago

> ImmutableFoo as a subclass of Foo is never going to work. And, indeed, `set` and `frozenset` don't have an inheritance relationship.

Theoretically, could `set` be a subclass of `frozenset` (and `dict` of `frozendict`)? Do other languages take that approach?

> linking [immutability] more explicitly to hashability

AFAIK immutability and hashability are equivalent for the language's "core" types. Would it be possible to enforce that equivalence for user-defined types, given that mutability and the implementation of `__hash__` are entirely controlled by the programmer?

Comment by kccqzy 22 hours ago

Yes you could. Other languages do. See NSMutableSet and NSSet in Objective-C.

Comment by chriswarbo 23 hours ago

> Theoretically, could `set` be a subclass of `frozenset` (and `dict` of `frozendict`)?

At one extreme: sure, anything can be made a subclass of anything else, if we wanted to.

At the other extreme: no, since Liskov substitution is an impossibly-high bar to reach; especially in a language that's as dynamic/loose as Python. For example, consider an expression like '"pop" in dir(mySet)'

Comment by tremon 23 hours ago

> consider an expression like '"pop" in dir(mySet)'

  class frozenset:
    pass
  
  class set(frozenset):
    def pop(self, key):
      pass

I don't see why hasattr(mySet, 'pop') should be a problem here?

Comment by chriswarbo 21 hours ago

> I don't see why hasattr(mySet, 'pop') should be a problem here?

I never said it's a problem (and I never said it's not!). I was specifically addressing two things:

- The "theoretical" nature of the question I quoted (i.e. ignoring other aspects like subjectivity, practicality, convention, etc.)

- The reasoning about "Liskov violation", which was quoted further up this thread.

For context, here's Liskov's definition of their principle (from https://en.wikipedia.org/wiki/Liskov_substitution_principle ):

> Barbara Liskov and Jeannette Wing described the principle succinctly in a 1994 paper as follows:[1]

> > Subtype Requirement: Let ϕ(x) be a property provable about objects x of type T. Then ϕ(y) should be true for objects y of type S where S is a subtype of T.

My expression `"pop" in dir(mySet)` gives an explicit example of how `set` and `frozenset` are not subtypes of each other (regardless of how they're encoded in the language, with "subclasses" or whatever). In this case `ϕ(x)` would be a property like `'"pop" in dir(x)' = 'False'`, which holds for objects x of type frozenset. Yet it does not hold for objects y of type set.

Your example of `hasattr(mySet, 'pop')` gives another property that would be violated.

My point is that avoiding "Liskov violations" is ("theoretically") impossible, especially in Python (which allows programs to introspect/reflect on values, using facilities like 'dir', 'hasattr', etc.).

(FYI I became rather jaded on the Liskov substitution principle after reading https://okmij.org/ftp/Computation/Subtyping )

Comment by kccqzy 17 hours ago

> I became rather jaded on the Liskov substitution principle after reading https://okmij.org/ftp/Computation/Subtyping

The root of the issue here is that Liskov substitution principle simply references ϕ(x) to be some property satisfied by objects of a class. It does not distinguish between properties that are designed by the author of the class to be satisfied or properties that happen to be satisfied in this particular implementation. But the Hyrum’s Law also states that properties that are accidentally true can become relied upon and as time passes become an intrinsic property. This to me suggests that the crux of the problem is that people don’t communicate sufficiently about invariants and non-invariants of their code.

Comment by tremon 20 hours ago

> > Subtype Requirement: Let ϕ(x) be a property provable about objects x of type T. Then ϕ(y) should be true for objects y of type S where S is a subtype of T.

This says "if hasattr(parent, 'pop') == True then hasattr(child, 'pop') must be True". This is not violated in this case, since hasattr(parent, 'pop') is False. If you want to extend the above definition so that negative proofs concerning the parent should also hold true for the child, then subtyping becomes impossible since all parent and child types must be identical, by definition.

Comment by minitech 16 hours ago

The property in question is `hasattr(x, "pop") is False`.

> If you want to extend the above definition so that negative proofs concerning the parent should also hold true for the child, then subtyping becomes impossible since all parent and child types must be identical, by definition.

The distinction isn’t “negative proofs”, but yes, that’s their point. In Python, you have to draw a line as to which observable properties are eligible.

Comment by FreakLegion 7 hours ago

> I've felt like frozendict was missing for a long time, though.

Type the dict as a mapping when you want immutability:

  x: Mapping[int, int] = {1: 1}

  x[1] = 2  # Unsupported target for indexed assignment ("Mapping[int, int]").

The only problem I've seen with this is:

  y = {}
  y[x] = 0  # Mypy thinks this is fine. Mapping is hashable, after all!

The issue here is less that dict isn't hashable than that Mapping is, though.

Comment by zahlman 17 minutes ago

This is because the ABC system is defined such that MutableMapping is a subtype of Mapping. Which mostly makes sense, except that if we suppose there exist Mappings that aren't MutableMappings (such that it makes sense to recognize two separate concepts in the first place), then Mapping should be hashable, because immutable things generally should be hashable. Conceptually, making something mutable adds a bunch of mutation methods, but it also ought to take away hashing. So Liskov frowns regardless.

Comment by tmp10423288442 18 hours ago

And, to the point of this proposal, `dict` and `frozendict` don't have an inheritance relationship either.

Comment by immibis 17 hours ago

ImmutableFoo can't be a subclass of Foo, since it loses the mutator methods. But nor can Foo be a subclass of ImmutableFoo, since it loses the axiom of immutability (e.g. thread-safety) that ImmutableFoo has.

When you interpret Liskov substitution properly, it's very rare that anything Liskov-substitutes anything, making the entire property meaningless. So just do things based on what works best in the real world and aim for as much Liskov-substitution as is reasonable. Python is duck-typed anyway.

It's a decent guiding principle - Set and ImmutableSet are more substitutable than Set and Map, so Set deriving from ImmutableSet makes more sense than Set deriving from Map. It's just not something you can ever actually achieve.

Comment by morshu9001 19 hours ago

I agree, same with frozenset. If you really want to use one of those as a key, convert to a tuple. There might be niche use cases for all this, but it's not something that the language or even the standard lib need to support.

Comment by boothby 18 hours ago

Problem being that sets aren't consistently ordered and conversion to a tuple can result in an exponential (specifically, factorial) explosion in the number of possible keys associated with a single set. Nor can you sort all objects. Safe conversion of sets to tuples for use as keys is possible but the only technique I know requires an auxiliary store of objects (mapping objects to the order in which they were first observed), which doesn't parallelize well.

Comment by morshu9001 18 hours ago

tuple(sorted(s)) and if you can't even sort the values, they're probably not hashable. I get that this involves a copy, but so does frozenset, and you can cross that bridge in various ways if it's ever a problem.

Comment by boothby 17 hours ago

Here are some types that support hashing:

  str
  bytes
  int, float
  complex
  tuple
  frozenset

Aside from int and float, you cannot perform comparisons between objects of different types. Moreover, you cannot sort complex numbers at all.

I have crossed that bridge, and I'm telling you (again) that a sorted tuple is not a generic solution.

Comment by morshu9001 17 hours ago

I'm not saying the problem with tuple doesn't exist, but that there doesn't need to be a built-in way to deal with it. If for some unfortunate reason you've got a mixed-type set that you also want to use as a dict key, you can write a helper.

Comment by ndr 22 hours ago

Immutability it's a joy to work with. Ask anyone who's worked with Clojure's dicts.

Comment by perrygeo 22 hours ago

Python discovers immutable data, and gets it wrong. Frozendict is a blunt instrument - instead of a mutable free-for-all, it just locks the data for the entire lifecycle. Brace for the wave of code littered with deep copies, proudly proclaiming how functional it all is.

If you want real immutable data structures, not a cheap imitation, check out pyrsistent.

Comment by cylemons 22 hours ago

How else would you "modify" immutable data other than by copying it?

Comment by ndr 21 hours ago

You don't have to copy everything.

Look up persistent data structures [0]

The main trick is to store everything in a tree with a large branching factor.

The elements are on your leaves. When you want to make an edit you need to create only the nodes from the root to the actual leaf. Then you return the new root. Everything else you can share.

This is a log operation, normally implemented with branch 32.

It works with vectors as well as dicts. Creating a copy is constant, it's immutable can be shared freely (especially on GC'd languages).

Insert/Delete are O(log32(n)) instead of O(n) they'd be with a full copy.

[0] https://en.wikipedia.org/wiki/Persistent_data_structure#Pers...

Comment by thechao 21 hours ago

And, yet, somehow, functional languages have been doing this since forever? I jest, I jest! There's an entire field of study focused on this! I'm no expert, but imagine you've got a singly-linked list, and we're going to allow only push_back() and pull_back() (no insert()). Let's go ahead and let multiple owners have at this list. If A does "pull_back()" what really happens is A has a local "tail" pointer that moves back along the end of the list and has a new belief about the end of the list. When A does "push_back()" it begins attaching links at its new tail. Since the links just "point backwards" without modifying the old links, this doesn't mutate the list "as is", it only mutates A's version of the list. So, if B is also looking at the list, it could just start doing "push_back()" and add its own "tail". The result is a data-structure that's a "bunch of shared singly linked lists" in a bush structure that's more akin to a tree than a singly linked list.

You can do the same thing with binary trees and other structures. It's more fiddly, and there's definitely places where single-ownership allows mutability "under the hood" for real performance gains, but that's the basic idea.

Comment by cylemons 17 hours ago

Interesting, I didn't think about that, so it is copying on write but on a more granular level

Comment by tylerhou 13 hours ago

Basically, a proxy. You don't need to deep copy; you just need to return a proxy object that falls back to the original dict if the key you are requesting has not been found or modified.

Functional data structures essentially create a proxy on every write. This can be inefficient if you make writes in batches, and you only need immutability between batches.

Comment by shadowgovt 22 hours ago

If your dicts are frozen, you shouldn't need to deep-copy. The point of immutability is that if you want a new frozendict based on another one, you just rebuild the indirection data structure up top and leave the values it references alone.

Comment by perrygeo 20 hours ago

You're absolutely right: an "indirection data structure" is necessary. Freezing the data is the least interesting part - it doesn't give you any of the benefits typically associated with immutable data structures in functional languages. That's my point - Python is shipping a half solution that's being mistaken for a proper one.

You think Python developers are going to roll their own HAMT on top of frozendicts? Or are they just gonna make copies? Personally, I'd just use pyrsistent which seems to get it right.

Comment by rented_mule 18 hours ago

It's a little early to say that "Python is shipping" anything related to this. The PEP is still in draft status and it may be modified (as it was three hours ago) or even rejected. The article says it is likely to be accepted, but that has not happened yet. That also means there is time to comment on the PEP in the discussion it links to if you'd like.

https://peps.python.org/pep-0814/

Comment by BiteCode_dev 19 hours ago

pyrsistent is super slow, though. Just ran a quick benchmark:

  - Creation - 8-12x slower  
  - Lookup - 22-27x slower  
  - Contains check - 30-34x slower  
  - Iteration - 5-14x slower  
  - Merge - 32-158x slower

Except at 10k+ items, batchup dates on 100K+ items or inserting 100 keys.

This is rarely the case in practice, most dictionaries and dict operations are small, if you have a huge dict, you probably should be chunking your load or delegating that to infra.

Not to mention pyrsistent's API is incompatible with dicts, so you can't pass it to external code without conversion.

You'd better have an incredible ROI to justify that.

Comment by instig007 17 hours ago

> pyrsistent is super slow, though

Since when is Python about speed?

> Just ran a quick benchmark

Where's the code? Have you observed the bottleneck call?

> Except at 10k+ items, batchup dates on 100K+ items or inserting 100 keys.

> This is rarely the case in practice

Where's the stats on the actual practice?

> You'd better have an incredible ROI to justify that.

The ROI being: fearless API design where 1) multiple instances of high level components are truly independent and could easily parallelize, 2) calling sites know that they keep the original data intact and that callees behave within the immutability constraints, 3) default func inputs and global scope objects are immutable without having to implement another PEP, 4) collections are hashable in general.

Comment by BiteCode_dev 15 hours ago

Clearly the ROI is perfect for you.

I won't waste more of your time.

Comment by instig007 13 hours ago

It's perfect for most Python developers actually, not just for myself, contrary to your "in practice" claim.

Comment by polyrand 1 day ago

A frozen dictionary would be very welcome. You can already do something similar using MappingProxyType [0]

  from types import MappingProxyType
  
  d = {}
  
  d["a"] = 1
  d["b"] = 2
  
  print(d)
  
  frozen = MappingProxyType(d)
  
  print(frozen["a"])
  
  # Error:
  frozen["b"] = "new"

[0]: https://docs.python.org/3/library/types.html#types.MappingPr...

Comment by zahlman 23 hours ago

> You can already do something similar

Only if you deny access to the underlying real dict.

Comment by ali_m 23 hours ago

Yes, this only prevents the callee from mutating it, it can't provide a strong guarantee that the underlying mapping won't be changed upstream (and hence MappingProxyType can't be washable).

Comment by code_biologist 1 day ago

Shoutout to pyrsistent, a personal favorite library for frozen/immutable/hashable data structures whenever I need to use lists, sets, or dicts as dict keys, or make sets of such items: https://github.com/tobgu/pyrsistent

Comment by sevensor 23 hours ago

Concurrency is a good motivation, but this is super useful even in straight line code. There’s a huge difference between functions that might mutate a dictionary you pass in to them and functions that definitely won’t. Using Mapping is great, but it’s a shallow guarantee because you can violate it at run time.

Comment by sundarurfriend 1 day ago

Can someone ELI5 the core difference between this and named tuples, for someone who is not deep into Python? ChatGPT's answer boiled down to: unordered (this) vs ordered (NTs), "arbitrary keys, decided at runtime" vs "fixed set of fields decided at definition time" (can't an NT's keys also be interpolated from runtime values?), and a different API (`.keys()`, `.items()`), etc (I'm just giving this as context btw, no idea if there's inaccuracies in these).

So could this also have been approached from the other side, as in making unordered NamedTuples with support for the Mapping API? The line between dictionaries and named tuples and structs (across various languages) has always seemed a bit blurry to me, so I'm trying to get a better picture of it all through this.

Comment by quietbritishjim 1 day ago

> arbitrary keys, decided at runtime" vs "fixed set of fields decided at definition time" (can't an NT's keys also be interpolated from runtime values?)

If you want to create a named tuple with arbitrary field names at runtime then you need to create a new named tuple type before you create the instance. That is possible, since Python is a very dynamic language, but it's not particularly efficient and, more importantly, just feels a bit wrong. And are you going to somehow cache the types and reuse them if the field names match? It's all a bit of a mess compared to just passing the entries to the frozendict type.

Comment by _flux 1 day ago

The key difference is what the GPT outlined: tuples have order for the fields and named tuples are not indexed by the name, but instead a field accessor is used, i.e. foo.bar vs foo["bar"]. In addition namedtuples can be indexed using that order like tuples can (foo[0]), which clearly isn't possible with dicts and would be quite confusing if dict had integer key.

So I think the differences aren't great, but they are sufficient. A frozendict is not going to be indexable by an integer. Python already has an abstract type for this, for mostly the use of type checking I imagine: https://docs.python.org/3/glossary.html#term-mapping

Documentation for namedtuple: https://docs.python.org/3/library/collections.html#collectio...

Comment by minitech 16 hours ago

On top of generating types dynamically being slow and bad as quietbritishjim said, “str values that also happen to be valid identifiers” is a very limiting dict key requirement.

Comment by quietbritishjim 3 hours ago

Great point, I can't believe I missed that.

Comment by willvarfar 1 day ago

The values in tuples cannot change. The values that keys point to in a frozen dict can?

But yeah I'd be in favour of something that looked a lot like a named tuple but with mutable values and supporting [name] access too. And of course some nice syntactic sugar rather like dicts and sets have with curly brackets today.

Comment by pansa2 1 day ago

> The values in tuples cannot change. The values that keys point to in a frozen dict can?

The entries of a tuple cannot be assigned to, but the values can be mutated. The same is true for a `frozendict` (according to the PEP they don't support `__setitem__`, but "values can be mutable").

Comment by vscode-rest 23 hours ago

Tuple entries must be hashable, which (as far as standard library is concerned) means immutable.

Comment by pansa2 23 hours ago

  >>> hash([1, 2])
  TypeError: unhashable type: 'list'

  >>> t = ([1, 2], [3, 4])
  >>> print(t)
  ([1, 2], [3, 4])

Comment by vscode-rest 23 hours ago

Ah. Of course… that’s how the workaround to use tuples as frozen dicts can work in the first place. Slow morning for me!

Comment by grimgrin 1 day ago

I think you could have asked this same comment w/o mentioning ChatGPT and you wouldn't have been downvoted to oblivion in 3 minutes

I don't see anything wrong with your asking to understand

Comment by chistev 1 day ago

This place hates ChatGPT and AI. Lol.

Edit: Of course, I get down voted as I predicted I would. Lol.

Comment by acdha 1 day ago

This place hates laziness and imprecision. Using ChatGPT for editing or inspiration is okay as long as you personally review the results for accuracy and completeness, at which point people care about it as much as you announcing that you used a spell checker.

Comment by delaminator 1 day ago

Pasting chat GPT responses is against the site rules.

always has been even before GPT

https://news.ycombinator.com/item?id=46206457

Comment by acdha 1 day ago

Fair, because they’re not your words. I’ll edit my comment for what I had in mind: that it can be helpful for that like a spell checker - for example, I know non-native English speakers who find them useful for editing but they completely understand the topic intellectually.

Comment by grimgrin 22 hours ago

you made a non-comment, that's why you're downvoted

comments don't have to be substantial, but they should be adding something that might merit more responses, or could. yours does not. what kind of reply could you even give to "this place [and so, its users] hates [thing]"

I'd ask you to qualify it, but you can't really qualify such a statement

Comment by rurban 20 hours ago

> so having a safe way to share dictionaries between threads will be a boon

Since only the keys are const, the values not, frozendict is not thread-safe per se. There needs to be a small lock around the value getter and setter.

Comment by varelaz 20 hours ago

it's thread safe on operations on the dict but not on the values. Same relates to other immutable structures like tuples. Lock will not help here cause unsafety comes from operation on value after value is obtained.

Comment by bilsbie 1 day ago

Wow weird Mandela effect for me. I really remember this being a built and actually using it.

Comment by aewens 23 hours ago

You may be thinking of the `frozenset()` built in or the third party Python module [frozendict](https://pypi.org/project/frozendict/)?

Personally, I’ve been using a wrapper around `collections.namedtuple` as an underlying data structure to create frozen dictionaries when I’ve needed something like that for a project.

Comment by guidopallemans 23 hours ago

When you are making str -> Any dictionaries it's quite likely you're better off with dataclasses or namedtuples anyway.

Comment by TheFlyingFish 22 hours ago

That works if you're dealing with a known set of keys (i.e. what most statically-typed languages would call a struct). It falls down if you need something where the keys are unknowable until runtime, like a lookup table.

I do like dataclasses, though. I find them sneaking into my code more and more as time goes on. Having a declared set of properties is really useful, and it doesn't hurt either that they're syntactically nicer to use.

Comment by pansa2 1 day ago

There was a previous PEP (in 2012) with the exact same title:

https://peps.python.org/pep-0416/

Also one in 2019 for a "frozenmap":

https://peps.python.org/pep-0603/

Comment by Qem 1 day ago

Perhaps you used the frozen dict implementation from the pip installable boltons library: https://boltons.readthedocs.io/en/latest/dictutils.html#bolt...

Comment by hiddencost 23 hours ago

Perhaps immutabledict?

https://pypi.org/project/immutabledict/

Comment by kleiba 4 hours ago

One nice side effect of a dict that's immutable is that under the hood, you could employ perfect hashing so that the your dict uses a hash function that is collision-free.

Comment by Strilanc 22 hours ago

This is a type that I would use a lot.

For example, I often write classes that do cacheable analysis that results in a dict (e.g. the class stores a list of tiles defined by points and users want a point-to-tiles mapping for convenience). It's worth caching those transformations, e.g. using @functools.cached_property, but this introduces a risk where any caller could ruin the returned cached value by editing it. Currently, I take the safety hit (cache a dict) instead of the performance hit (make a new copy for each caller). Caching a frozendict would be a better tradeoff.

Comment by clickety_clack 22 hours ago

Maybe you should take a look at pyrsistent. That allows you to make frozen “maps”. You can “evolve” them into new versions of the objects and it keeps the references to the unchanged keys and values under the hood so it’s fast and memory efficient.

Comment by codethief 1 day ago

Next step: Auto-inferring the correct (most narrow) TypedDict type from a frozendict. (Think `const foo = { … } as const` in TypeScript.) I miss this TS feature in Python on the daily.

Comment by mrweasel 1 day ago

What would that do exactly? Auto-generate a TypedDict class?

Bringing up TypedDict is sort of interesting, because it seems like a frozen dictionary could have been implemented by PEP 705, if typing.ReadOnly was enforced at runtime, and not just a hint to a static type checker.

Comment by dmurray 1 day ago

Auto-generate a TypedDict type while type checking, while doing nothing at runtime.

I expect this is not a very big ask and the various typecheckers will have versions of this soon after release.

Comment by codethief 21 hours ago

After what release? Type checkers already support the various TypedDict PEPs, even those that haven't been accepted yet. Yet, none of them infers TypedDict types.

Comment by mrweasel 1 day ago

Okay so basically just "inline":

  class MyType(TypedDict):
    a: str
    b: int

or infer the type hint in:

  my_typed_dict: dict[str, int] = {"a": 5}

It should probably be something like:

  auto_typed: dict[typing.Const, typing.Const] = {"a": 5}

where typing.Const defaults to Any for an empty dict.

Comment by codethief 21 hours ago

Not sure we're talking about the same thing. Inline TypedDicts are already in the process of being formalized, see https://peps.python.org/pep-0764/ , and have experimental support in e.g. Pyright.

What I meant was that

  foo = {"a": 5}

should be inferred as

  foo: TypedDict[{ "a": Literal[5] }] = {"a": 5}

Comment by mrweasel 16 hours ago

Ah, I didn't know about PEP 795. I don't really know if I like it though. It looks like something someone invented because they don't want to write classes or specifically want to be able to access a data structure like a dict, for whatever reason.

It basically provides data types, but also ensures that you have no easy way to reuse them, and it will miss cases where two data structures happen to have the same signature.

It's getting a little messy when we have class, namedtuple and TypedDict, which all can do much the same thing.

Comment by adzm 1 day ago

Curious what this means for typescript as well; honestly I've only reached for as const rarely.

Comment by anentropic 1 day ago

Agreed... but it shouldn't need a frozendict for this

IMHO TypedDict in Python are essentially broken/useless as is

What is needed is TS style structural matching, like a Protocol for dicts

Comment by virtue3 1 day ago

I also SUPER LOVE this feature. Especially when you make a type union of the keys for easier indexing.

Comment by tweakimp 1 day ago

Can you give a Python example of a use case for this?

Comment by mwsherman 23 hours ago

I’ve found C#’s frozen dictionary to be useful: https://learn.microsoft.com/en-us/dotnet/api/system.collecti...

It’s optimized for fast reads in exchange for expensive creation.

Comment by drhagen 1 day ago

If this gets wide enough use, they could add an optimization for code like this:

    n = 1000
    a = {}
    for i in range(n):
        a[i] = str(i)
    a = frozendict(a)  # O(n) operation can be turned to O(1)

It is relatively easy for the JIT to detect the `frozendict` constructor, the `dict` input, and the single reference immediately overwritten. Not sure if this would ever be worth it.

Comment by _flux 1 day ago

Wouldn't ref-counting CPython already know that a has a single reference, allowing this optimization without needing any particular smart JIT?

Comment by zbentley 23 hours ago

I think GP was talking about optimizing away the O(N) call on the last line. The GC will take care of removing the reference to the old (mutable) dict, but constructing a new frozendict from a mutable dict would, in the current proposal, be an O(N) shallow copy.

There are also potentially other optimizations that could be applied (not specific to dict/frozendict) to reduce the memory overhead on operations like "a = f(a)" for selected values of "f".

Comment by tracnar 17 hours ago

Indeed, the Tcl implementation does this so e.g. `set d [dict] ; dict set d key value` can modify d in place instead of creating a copy (since everything is immutable).

Comment by zahlman 23 hours ago

First thought: I would very much expect it to be able to do this optimization given the similar things it does for string concatenation.

But actually, I suspect it can't do this optimization simply because the name `frozendict` could be shadowed.

Comment by vlovich123 16 hours ago

I’m curious why the dict->frozendict operation is not an O(1) operation when there’s a refcnt of 1 on the dict - that resolves the spooky action at a distance problem raised and solves the performance for the most common usage pattern (build up the dict progressively and convert to a frozendict for concurrent reads).

Comment by emil-lp 5 hours ago

Because optimizations are not added to the PEP, but added later.

Comment by mont_tag 11 hours ago

Note that Python already offers MappingProxy that wraps a dictionary to hide mutating methods.

People hardly ever use this tool. That suggests that there isn't much of a need for a frozendict.

Comment by the__alchemist 21 hours ago

I wish Python could move away from types-as-mutability-determiners. Or paper over them; all could have mutable and non-mutable variants, with "to_mut()" and "to_immut()" methods. And in the same vein, explicit control over whether functions mutate a variable in place, or make a local copy. Python is my first lang, and I still am unable to consistently reason about these.

Comment by Maledictus 21 hours ago

I love Ruby's .freeze

Comment by steadyelk 21 hours ago

Agree it's about time to make this built in. Other functional packages like JAX [0] are already using the concept but they build it into their library from scratch.

[0] https://flax.readthedocs.io/en/latest/api_reference/flax.cor...

Comment by bvrmn 17 hours ago

It would be great to have **kwargs as frozendict by default. It could help with caching decorators to speedup key construction. For example natural key is simply `(args, kwargs)`.

Comment by ok123456 22 hours ago

Ruby has had frozen dictionaries since version 1.0, and they are generally considered a good thing to use where possible.

Comment by zzzeek 1 day ago

SQLAlchemy has its own frozendict which we've had in use for many years, we have it as a pretty well performing cython implementation these days, and I use it ubiquitously. Would be a very welcome addition to the stdlib.

This proposal is important enough that I chimed in on the thread with a detailed example of how SQLAlchemy uses this pattern:

https://discuss.python.org/t/pep-814-add-frozendict-built-in...

Comment by xamuel 23 hours ago

This subject always seems to get bogged down in discussions about ordered vs. unordered keys, which to me seems totally irrelevant. No-one seems to mention the glaring shortcoming which is that, since dictionary keys are required to be hashable, Python has the bizarre situation where dicts cannot be dict keys, as in...

{{'foo': 'bar'}: 1, {3:4, 5:6}: 7}

...and there is no reasonable builtin way to get around this!

You may ask: "Why on earth would you ever want a dictionary with dictionaries for its keys?"

More generally, sometimes you have an array, and for whatever reason, it is convenient to use its members as keys. Sometimes, the array in question happens to be an array of dicts. Bang, suddenly it's impossible to use said array's elements as keys! I'm not sure what infuriates me more: said impossibility, or the python community's collective attitude that "that never happens or is needed, therefore no frozendict for you"

Comment by kzrdude 23 hours ago

Turning a dictionary into a tuple of tuples `((k1, v1), (k2, v2), ...)`; isn't that a reasonable way?

If you want to have hash map keys, you need to think about how to hash them and how to compare for equality, it's just that. There will be complications to that such as floats, which have a tricky notion of equality, or in Python mutable collections which don't want to be hashable.

Comment by xamuel 22 hours ago

That argument would apply to sets too, and yet frozenset is builtin.

Comment by shadowgovt 22 hours ago

Regarding the spooky-action-at-a-distance concerns of a `.freeze()` method on dict:

`.freeze()` should probably just return a frozendict instead of in-place mutating the dict, and they should be separate types. Under the hood, you'll have to build the hashtable anyway to make the frozendict; as long as you're doing that work, you may as well build an object to contain the hashtable and just have that object be separate from the dict that birthed it.

The values referenced by both the original dict and the frozendict can be the same values; no need to clone them.

Comment by emil-lp 5 hours ago

The point is that freeze could work in constant time, whereas the copying takes linear time.

Another alternative mentioned was `move`, which would create a frozen version in constant time and clear the original dict.

Comment by whimsicalism 20 hours ago

yes! no more `MappingProxyType` hacks

Comment by westurner 18 hours ago

PMap and PVector, functional Python libraries,

"PEP 351 – The freeze protocol" (2005, rejected) https://peps.python.org/pep-0351/ ; IIUC the freeze protocol proposed basically:

  def freeze(obj)
    return cache.setsefault(hash(obj),
      obj.__freeze__())

/? "Existing threads re: consts and applications thereof"

I wasn't able to find a URL to this post (2021) from the python-ideas mailing list archives using a double quoted search term today; I had to use the python mailing list search engine? Did something break crawling of mailing lists? Old mailman HTML archives were very simple to crawl.. ENH: pypa: add a sitemap.xml for/of mailing list archives, forums; @pypa: ask for search engine indexing advice: "How do we make sure that the python mailing list archives will be search indexed?" (as they traditionally were)

How to find the .txt of mailing list archives posts these days?

From "[Python-ideas] Re: Introduce constant variables in Python" (2021) https://mail.python.org/archives/list/python-ideas@python.or... :

- pyrsistent: PMap, PVector

I'm out of time for this; (reformatting this for HN so that URLs will be auto-linkified but newlines won't be eliminated) here's the full email as .txt, the mailing list archive has a hyperlinkified version with newlines preserved. GH Markdown and CommonMark Markdown also preserve newlines and auto-linkify:

  From: [@westurner]
  Date: Thu, Jun 17, 2021, 10:43 AM
  Subject: Re: [Python-ideas] Re: Introduce constant variables in Python
  Cc: python-ideas <python-ideas@python.org>
  
  
  On Mon, May 24, 2021 at 5:43 PM Chris Angelico <rosuav@gmail.com> wrote:
  
  Requiring that a name not be rebound is well-defined and testable.
  Requiring that an object not change is either trivial (in the case of,
  say, an integer) or virtually impossible (in the case of most
  objects).
  
  What would be the advantage of such a declaration?
  
  ChrisA
  
  ## Existing threads re: consts and applications thereof
  
   So, `/? from:me pyrsistent` I found a few results:
  - "[Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)" 2016-04
    https://mail.python.org/pipermail/python-dev/2016-April/143958.html
    - ~Sandboxing python within python is nontrivial to impossible; consts might help a bit
      - https://mail.python.org/pipermail/python-dev/2016-April/143958.html
  
  - "Proposal to add const-ness type hints to PEP-484"
    https://mail.python.org/archives/list/python-ideas@python.org/thread/OVPF5I6IOVF6GOJQRH5UGCCU3R7PQHUF/
    - https://github.com/python/typing/issues/242
      - "Final names and attributes" https://github.com/python/mypy/pull/5522
         This is where `typing.Final` comes from.
  
  - "[Python-ideas] "Immutable Builder" Pattern and Operator"
    https://mail.python.org/pipermail/python-ideas/2017-January/044374.html
    - [pyrsistent] and "fn.py [do] immutables:
      https://github.com/kachayev/fn.py/blob/master/README.rst#persistent-data-structures "
  
  - "[Python-ideas] Add recordlcass to collections module"
    https://groups.google.com/g/python-ideas/c/9crHfcCBgYs/m/6_EEaWJAAgAJ
    - ORMs (e.g. Django, SQLAlchemy) require "dirty state" checking to know which object attributes have changed and need an SQL statement to be executed to synchronize the state; this is relevant because when we're asking for mutable namedtuple we're often trying to do exactly this pattern.
  
  - "[Python-ideas] Suggestions: dict.flow_update and dict.__add__"
    https://www.google.com/search?q=%22%5BPython-ideas%5D+Suggestions%3A+dict.flow_update+and+dict.__add__%22
  
    > dicttoolz has functions for working with these objects; including dicttoolz.merge (which returns a reference to the merged dicts but does not mutate the arguments passed).
    > 
    > https://toolz.readthedocs.io/en/latest/api.html#dicttoolz
    > https://toolz.readthedocs.io/en/latest/api.html#toolz.dicttoolz.merge
    > 
    > pyrsistent has a PRecord class with invariants and type checking that precedes dataclasses. pyrsistent also has 'freeze' and 'thaw' functions for immutability. PRecord extends PMap, which implements __add__ as self.update(arg) (which does not mutate self)
  https://github.com/tobgu/pyrsistent/blob/master/README.rst#precord
    > 
    > https://github.com/tobgu/pyrsistent/blob/master/pyrsistent/_pmap.py
  
  - "[Python-ideas] How to prevent shared memory from being corrupted ?"
    https://www.google.com/search?q=%22How+to+prevent+shared+memory+from+being+corrupted+%3F%22
    > PyArrow Plasma object ids, "sealing" makes an object immutable, pyristent
    > 
    > https://arrow.apache.org/docs/python/plasma.html#object-ids
    > https://arrow.apache.org/docs/python/plasma.html#creating-an-object-buffer
  
    > > Objects are created in Plasma in two stages. First, they are created, which allocates a buffer for the object. At this point, the client can write to the buffer and construct the object within the allocated buffer. [...]
  
  - [Python-ideas] Experimenting with dict performance, and an immutable dict
    https://mail.python.org/archives/list/python-ideas@python.org/message/DNBGUJHDH4UTPSETMFFWMJHNXQXIWX4I/
  
    > https://pyrsistent.readthedocs.io/en/latest/intro.html#pyrsistent :
    > 
    >> Pyrsistent is a number of persistent collections (by some referred to as functional data structures). Persistent in the sense that they are immutable.
    >>
    >> All methods on a data structure that would normally mutate it instead return a new copy of the structure containing the requested updates. The original structure is left untouched.
    >>
    >> This will simplify the reasoning about what a program does since no hidden side effects ever can take place to these data structures. You can rest assured that the object you hold a reference to will remain the same throughout its lifetime and need not worry that somewhere five stack levels below you in the darkest corner of your application someone has decided to remove that element that you expected to be there.
    >>
    >> Pyrsistent is influenced by persistent data structures such as those found in the standard library of Clojure. The data structures are designed to share common elements through path copying. It aims at taking these concepts and make them as pythonic as possible so that they can be easily integrated into any python program without hassle.
  
    
  
  > What would be the advantage of such a declaration?
  
  Constants don't need to be locked or unlocked; which is advantageous for parallelism and reasoning about program correctness.
  True consts (wherein everything referred to in that object is 'frozen' and immutable or at least only modifiable with e.g. copy-on-write)
  wouldn't require locks,
  which would be post-GIL advantageous.
  
  You could do consts by never releasing a threading.Lock (or similar):
  - https://docs.python.org/3/library/asyncio-sync.html#locks
  - https://docs.python.org/3/library/threading.html#lock-objects
  - This from 
  https://docs.python.org/2/library/sets.html?highlight=immutable#immutable-transforms re ImmutableSet/FrozenSet
  is not present in the python 3 docs: 
  https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset 
  
  Though - even if Python enforced normal consts in the language - all of the other code objects would still be mutable, so you still have the impossibility of sandboxing python.
  
  Functional and contracts coding styles rely upon invariance; 
  which can be accomplished with various third-party packages that enforce const-ness throughout what may be an object tree behind that reference that would otherwise need to be copy.deepcopy()'d.
  
  ## pyrsistent
  Src: https://github.com/tobgu/pyrsistent
  
  > - PVector, similar to a python list
  > - PMap, similar to dict
  > - PSet, similar to set
  > - PRecord, a PMap on steroids with fixed fields, optional type and invariant checking and much more
  > - PClass, a Python class fixed fields, optional type and invariant checking and much more
  > - Checked collections, PVector, PMap and PSet with optional type and invariance checks and more
  > - PBag, similar to collections.Counter
  > - PList, a classic singly linked list
  > - PDeque, similar to collections.deque
  > - Immutable object type (immutable) built on the named tuple
  > -   freeze and thaw functions to convert between python standard collections and pyrsistent collections.
  > - Flexible transformations of arbitrarily complex structures built from PMaps and PVectors.
  
  
  ## icontract
  Src: https://github.com/Parquery/icontract
  
  > icontract provides design-by-contract to Python3 with informative violation messages and inheritance.
  >
  > It also gives a base for a flourishing of a wider ecosystem:
  > 
  > - A linter pyicontract-lint,
  > - A sphinx plug-in sphinx-icontract,
  > - A tool icontract-hypothesis for automated testing and ghostwriting test files which infers Hypothesis strategies based on the contracts,
  together with IDE integrations such as icontract-hypothesis-vim, icontract-hypothesis-pycharm, and icontract-hypothesis-vscode,
  > - Directly integrated into CrossHair, a tool for automatic verification of Python programs,
  together with IDE integrations such as crosshair-pycharm and crosshair-vscode, and
  > - An integration with FastAPI through fastapi-icontract to enforce contracts on your HTTP API and display them in OpenAPI 3 schema and Swagger UI.

https://en.wikipedia.org/wiki/Design_by_contract

https://en.wikipedia.org/wiki/Invariant_(mathematics)#Invari... [ https://en.wikipedia.org/wiki/Class_invariant ]

> What is the difference between "invariant" and "constant" and "final"?

Comment by 1 day ago

Comment by DeathArrow 18 hours ago

>But dictionaries are mutable, which makes them problematic for sharing data in concurrent code.

Not really, C# has ConcurrentDictionary.

Comment by semiinfinitely 18 hours ago

its called dataclass

Comment by drhagen 1 day ago

Great! Now make `set` have a stable order and we're done here.

Comment by cr125rider 1 day ago

Aren’t sets unsorted by definition? Or do repeated accesses without modification yield different results?

Comment by ledauphin 1 day ago

this is likely in reference to the fact that dicts have maintained insertion order since Python ~3.6 as property of the language. Mathematically there's no defined order to a set, and a dict is really just a set in disguise, but it's very convenient for determinism to "add" this invariant to the language.

Comment by zahlman 23 hours ago

Sets use a different implementation intentionally (i.e. they are not "a dict without values") exactly because it's expected that they have different use cases (e.g. union/intersection operations).

Comment by jonathaneunice 1 day ago

Debugging is a completely different and better animal when collections have a predictable ordering. Else, every dict needs ordering before printing, studying, or comparing. Needlessly onerous, even if philosophically justifiable.

Comment by adrian_b 23 hours ago

Not related to Python, but one of the possible implementations of a set, i.e. of an equivalence class on sequences, is as a sorted array (with duplicates eliminated, unless it is a multiset, where non-unique elements are allowed in the sorted array), as opposed to the unsorted array that can store an arbitrary sequence.

So sets can be viewed as implicitly sorted, which is why the order of the elements cannot be used to differentiate two sets.

Being sorted internally to enforce the equivalence between sets with elements provided in different orders does not imply anything about the existence of an operation that would retrieve elements in a desired order or which would return subsets less or greater than a threshold. When such operations are desired, an order relation must be externally defined on the set.

So a possible definition of sets and multisets is as sorted arrays with or without unicity of elements, while sequences are unsorted arrays (which may also have the constraint of unique elements). However the standard set operations do not provide external access to the internal order, which is an order between arbitrary identifiers attached to the elements of the set, which have no meaning externally.

Comment by Retr0id 1 day ago

A stable order does not imply sorted order. If you convert the same set to a list twice you'll probably get the same order both times but it isn't guaranteed anywhere, and that order may change between python implementations and versions. The order may also change as the set grows or shrinks.

Comment by sltkr 1 day ago

So are dictionary keys, but Python decided to make them insertion ordered (after having them be unordered just like set elements for decades). There is no fundamental reason sets couldn't have a defined order. That's what languages like JavaScript have done too.

Comment by cpburns2009 23 hours ago

Python's decision to make dict keys ordered in the spec was a mistake. It may be the best implementation so far, but it eliminates potential improvements in the future.

Comment by mrweasel 22 hours ago

Agreed. The only reason to make them sorted is because people would wrongly assume that they where. You can argue that a programming language should not have unexpected behaviors, and apparently unsorted dictionary keys where a surprise to many, on the other hand I feel like it's a failure of education.

The problem was that assuming that keys would be sorted was frequently true, but not guaranteed. An alternative solution would have been to randomize them more, but that would probably break a lot of old code. Sorting the keys makes no difference if you don't expect them to be, but it will now be a greater surprise if you switch language.

Comment by lou1306 23 hours ago

Can someone give a strong rationale for a separate built-in class? Because "it prevents any unintended modifications" is a bit weak.

If you have fixed keys, a frozen dataclass will do. If you don't, you can always start with a normal dict d, then store tuple(sorted(d.items())) to have immutability and efficient lookups (binary search), then throw away d.

Comment by malkia 21 hours ago

Welcome to starlark :)

Comment by neonsunset 22 hours ago

[dead]