"""Dear BDFL,

I'm writing my talk for a local PyCon (it is on Saturday - and I'm late as ever), and one of the questions I'm trying to answer is why Python doesn't have real information hiding in the way of the C++ (or Ruby) ideas of public, protected and private.

Everything I've found online just mentions the state of what it is now, with leading underscore being considered private, and double underscore getting mangled with the classname, but these are still public (and we are consenting adults). Was there a particular driving force behind the access methods in Python? Or was it a collection of smaller things?

--Procrastinating in the Southern Hemisphere

Dear Procrastinating,

There is actually some information hiding possible -- but only by writing C extensions. :-)

The main reason for making (nearly) everything discoverable was debugging: when debugging you often need to break through the abstractions (since bugs don't confine themselves to the nice abstractions you've created for your program :-) so I thought it would be handy to be able to see anything from the debugger. And since the debugger is written in Python itself (for flexibility and a number of other reasons) I figured the same would apply to other forms of programming -- after all, sometimes debugging doesn't imply using a debugger, it may just imply printing a certain value. Again, too much data hiding would make things more complicated here.

The other observation was that even in C++, there are usually ways around the data hiding (e.g. questionable casts). Which made me realize that apparently other languages could live just fine with less-than-perfect hiding, and that hiding was an advisory mechanism, not an enforcement mechanism. So Python could probably be just fine with even-less-than-perfect hiding. :-)

Good luck with your talk, and enjoy the event!
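
As a rough illustration of the conventions discussed in the letter above (a minimal sketch; the `Account` class and its attribute names are made up for this example), the following shows that both the single-underscore and the double-underscore forms stay fully visible to anyone who inspects the object, which is the property that keeps debugging simple:

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner        # public by convention
        self._balance = balance   # single underscore: "internal", purely advisory
        self.__pin = "0000"       # double underscore: stored as _Account__pin


acct = Account("alice", 100)

# Nothing is hidden from inspection, e.g. from a debugger or a print call.
print(vars(acct))

# The underscore does not prevent access.
print(acct._balance)

# The mangled name is reachable from outside the class.
print(acct._Account__pin)

# The unmangled name simply does not exist on the instance.
print(hasattr(acct, "__pin"))
```

Name mangling only rewrites identifiers that appear inside the class body, so the string `"__pin"` passed to `hasattr` is looked up literally and is not found.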
I always described this as "we're all adults here" information hiding.
Data hiding in most languages (C#, Java, C++) is an illusion anyway - there are always techniques like reflection to bypass it as there are always times when you need to.
Exactly. I've been slammed several times for saying this on stackexchange. The data hiding part of OO sounds good in theory, but turns out to be hurtful in practice.

One argument did come up: by making data private, compilers can make certain types of optimizations on it. But then I think a keyword that explicitly allows those optimizations would be better.
Information Hiding is a programming palliative.
The way I've always thought about it, even in "strict" languages where you're writing small-I interfaces for client coders, you should be documenting what the public API is for your code. Defining the contract in words or via API documentation.

Somebody interfacing with your code is going to need to know about that contract anyway, in order to know how to use your code at all. After that, having the code enforce the contract (via public, private etc) feels, to me, pointless/redundant and (as stated previously) just gets in the way during real-world debugging/hackery.
I don't feel like typing out the reasons right now, but "data hiding" can be useful in many scenarios. As for +Michael Foord 's assertion that "there are always times when you need to." this is not true. If you write your code correctly, with encapsulation in mind, you will never need to breach the encapsulation. That being said, I agree that it is very useful in debugging. One reason I love python (and use it almost exclusively right now) is because of the excellent debugging tools. Edit: Plus, you always have the practical benefits of data hiding in python, as you can just write your code in such a way that it follows your own internal idea on private/public/protected. Having the "data hiding" just explicitly documents it, which makes it easier for new programmers/new members of the team to see what they should be doing. You can get a similar effect with comments, minus the compiler enforcement.
+Michael Foord we've run into a situation with mock, in fact, where the leading-underscore is confusing -- a method that can be overridden but shouldn't be called. I think having stricter visibility rules forces everyone to play by those rules, and thereby reduces surprise.
I see information hiding as forced discipline. I prefer to be disciplined out of my own will.
Guido, thanks for this. I often make the same argument in JavaScript because Crockford and others think it's a good idea to hide all your objects in anonymous functions. Yes, the namespacing is good but hiding them is not good for debugging!
Hiding isn't good for testing either! Be proud of your state, don't keep it hidden in a closet! ;)
The real solution is to create super-private attributes prefixed with three underscores, and have __getattribute__() terminate the program and format the user's hard drive if they try to access them directly. That'll show 'em.
It's not necessarily about hiding, it can also be assignment guarding. Java, for instance, allows you to see and modify private members through reflection, but I think you first have to pass a security check.
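
For comparison, assignment guarding without hiding can be approximated in Python with a read-only property. This is only a sketch under assumptions: the `Temperature` class is hypothetical, and the guard is advisory in the same sense as the underscore convention, since the backing attribute remains writable.

```python
class Temperature:
    def __init__(self, celsius):
        self._celsius = celsius   # backing attribute, internal by convention

    @property
    def celsius(self):
        # No setter is defined, so plain assignment to .celsius is rejected.
        return self._celsius


t = Temperature(21.5)

guarded = False
try:
    t.celsius = 30            # blocked: the property has no setter
except AttributeError:
    guarded = True

t._celsius = 30               # still possible for anyone who insists
print(guarded, t.celsius)
```

The value stays readable by everyone, accidental assignment fails loudly, and deliberate modification is still available when it is really needed.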
I understand the reason for not having language-enforced data hiding, but on the other side, I think +Chris Carpenter hit on a good point about the hiding keywords serving as documentation. Yes, everyone SHOULD properly document their APIs, but I also tend to believe the code should document itself as much as possible. I don't personally care if the language enforces the hiding, and as pointed out, there's always ways around it, but I do care if the language (or a well-agreed upon style standard) allows me to express it to other programmers.
Visibility rules are very useful in one case: poor developers using well designed libraries[*]. While this is the most common case it is practically never considered: poor developers tend not to spend their spare time in forums and discussions like this one.

[*] Example: the JDK. Except for a couple of really horrible hacks I never had to circumvent any JDK visibility rules.
This is one of my favorite language features of Python. Great design decision Guido.
I'm happy that Python doesn't have private properties (for testing, debugging). But I'm also happy that Python has __properties. When refactoring code, say of a library, I can assume that __properties aren't used outside the class.
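
A small sketch of why double-underscore names are fairly safe to refactor (the `Base` and `Child` classes are invented for illustration): each class gets its own mangled attribute, so a subclass that happens to pick the same name does not clobber the parent's state.

```python
class Base:
    def __init__(self):
        self.__cache = "base"     # stored as _Base__cache

    def base_cache(self):
        return self.__cache       # always reads _Base__cache


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__cache = "child"    # stored as _Child__cache, so no clash

    def child_cache(self):
        return self.__cache       # reads _Child__cache


c = Child()
print(c.base_cache(), c.child_cache())
print(sorted(vars(c)))
```

This guards against accidental collisions only. Code outside the class can still reach `_Base__cache` deliberately, so the assumption that nobody uses it remains a convention rather than a guarantee.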
Data hiding in itself isn't that useful. But when combined with other language properties it is. It would be great to have a language that implements an ocap subset of Python.
The thing is not only test code can access this private-but-accessible data and behaviour... That is just plain wrong IMHO.

Even with a convention stating that underscored fields are private, nothing prevents your API users from fiddling with them, making their code dependent on any internal changes you might want to make and putting your object in some unhandled/incoherent state.

"We are all adults", sure, but we are also all lazy and looking for the fastest way to achieve our goal, even if that means breaking encapsulation by using this very handy __do_exactly_what_i_need method. And humans, even fully mature and grown up, are error-prone.

Consequence: you are forced to keep this private API stable and/or your clients must update their code each refactor you do.
Hardly a comfortable situation for anyone.

The testing code argument isn't really convincing either. Testing public methods is enough to ensure the class respects its contract.
It will trigger the private methods codepath too, as if they were inlined.
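
The point about public tests covering the private code path can be sketched as follows. The `Slugifier` class and its `helper_calls` counter are hypothetical and exist only to make the effect observable.

```python
class Slugifier:
    def __init__(self):
        # Hypothetical counter, present only to show the helper was executed.
        self.helper_calls = 0

    def slugify(self, title):
        # Public method: the only entry point a test needs to call.
        return "-".join(self._clean_words(title))

    def _clean_words(self, title):
        # Internal helper, exercised indirectly through slugify().
        self.helper_calls += 1
        return [w.lower() for w in title.split() if w.isalnum()]


s = Slugifier()
assert s.slugify("Hello Big World") == "hello-big-world"
assert s.helper_calls == 1    # the private path ran via the public call
```

Whether this is enough in practice is exactly what the thread disputes: it checks the contract, but gives no direct view of intermediate state when a test fails.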

The fact that private methods are also accessible in other languages is moot too. If that's the case, why reproduce such a bad behaviour?
At least, introspecting/reflecting private members in Java or Ruby is really involved and explicit. Not something you do by mistake.

All in all, this seems like a terrible and fragile decision, where the rest of the language is a pleasure to work with. I hope this might be reconsidered :)
+Alexandre Mazari: If the people using your module depend on your private API not changing, they really deserve to have their software break when it does. It's not like they weren't warned.
+Geoffrey Spear The real world doesn't work that way. Their software breaks, and you affect completely unrelated people you don't want to (end users, intermediate distributors, etc), which is precisely the effect a well defined API intends to avoid. It's the same effect of having everything public in the first place, so the convention is moot in that regard. The intention is stability, rather than teaching people to behave.
+Geoffrey Spear on the other hand if the only way developers can get your module to "do the right thing" is by using private apis then it is your fault.

Of course if the language you use has data hiding and you don't have access to the private apis then you're just stuffed (or you use reflection).
+Michael Foord No, you're not stuffed... you just agree with the maintainer to get it fixed, or you fork the original project in the good ol' open source style. If you are accessing private details the author specifically tagged as private, you're forking either way. At least with a real fork you get the burden of doing so, rather than screwing up the original author.
+Gustavo Niemeyer but that's just reality. In practise there's no difference at all between forking a project and making a private api public, and just using the private api.
+Michael Foord The first time you release a security fix that breaks millions of users that you didn't know were relying on your private API we can talk again.
+Gustavo Niemeyer so you're saying the millions of users are the problem because they trusted the private api? (And not the fact that they apparently needed to use a private api.) Are you saying these issues only matter at that scale? That seems an odd position to take. You're also implying that security fixes can't and don't break public apis.
Sorry, you're making stuff up as you go, and editing the post in flight.. I'll step out of the debate.
If another developer is using your private API which changes, and their unit tests don't catch your changes breaking their application before a user ever sees the problem, they have no business developing software at all, period. But yes, +Michael Foord, I absolutely agree with your point that if someone needs to use the private API (as opposed to not needing it, but using it anyway), then the public API is probably a bad design.
Guido did the right thing here, and Python has the most practical data-hiding solution I've seen.

C++ is terrible in this respect. A few years ago the following hack worked with gcc to get around data-hiding:
#define private public
#define protected public
#include "whateverHeaderYouWantPublicMembersFrom.h"

Was always curious if this was a C++ feature or a compiler bug...
+Michael Foord I think you agree that as developers, we are obviously trying to match the requirements of the next guy in line, be it another dev or a user, and must also strive to make the usage of our work as easy and safe as possible. Member hiding enforces part of this safety, avoiding a lot of side effects on the state of our instances.
Going further, we might better follow the steps of the functional guys by making vars immutable by default.

Anyway, going back to API stability, you are right that private member usage is a failure on both the provider and user side. But as +Gustavo Niemeyer wrote, pointing fingers is useless; we'd be better off avoiding such a situation altogether, right?

Dunno why I had to reignite this thread, this debate is maybe 20 yo, and already won looking at the languages landscape.
You can't trust people too much, python, or you'll get abused ;)
Hmmm, +Michael Foord +Gustavo Niemeyer, this implies that tools like the Cheese Shop need better ways to express dependencies. Also, and maybe +Tarek Ziadé will weigh in, but it seems like the package descriptions would be usefully extended to represent that dependency as well.

One of the difficulties with saying "just fork it" is that as a consumer, sometimes it's hard to determine where the convoluted mess really goes.
I would add that making everything visible also makes testing much easier. Having written tests for both Java and Python, having to work around private to write unit tests is very very painful, ugly and error prone. Also, I have found that when people want to use a method or function, they use it. If it was private, they make it public. Very rarely do people look at the private method and say: "Oh well... I guess I'll just have to find another way."
+Marc Herbert well, only security fixes should break public apis. Are you saying security flaws should be left unfixed if the only solution is to break an api?

Obviously there are always concerns to balance (size of breakage, severity of issue), but security usually trumps other concerns.
+Michael Foord You do not seem to have ever performed any system administration. Security fixes are always the most conservative "upgrades" and never break any API in practice. Simply because no administrator wants to be faced with a "security versus breakage" dilemma.
+Marc Herbert no, I'm not a system administrator. I do work on apis used by large numbers of people (in terms of my day job used by millions). Sure you try never to break a public API, but a security fix is about the only reason why you ever would (outside of a normal release and deprecation cycle). Although we've done security fixes, I haven't had to break a public API.
If only I did not have to rename all occurrences of a member to add/remove underscores when I change its intended visibility...
Where can we see this talk? I'm interested in the public/protected/private thing.