on the idiocy of OOP dot notation
xah's rumination extempore❕ episode №20131128221110

the syntax of object oriented programing languages is commonly like this:

「‹object›.‹method name›(‹option›)」

where ‹object› is the meat you want to work on, and the part in round bracket ‹option› is secondary.

For example, in Python you have:

「‹string›.split(‹chars›)」

however, sometimes there's a categorical pigeonhole problem, and the OOP lang designers have a dilemma: should the data be the object, or should the class (or library) be the object? This happens particularly for {string, regex, math, number} things.

For example, in python regex, you have:

「re.sub(‹pattern›, ‹replacement›, ‹string›)」

where the part you really want to act on becomes the last parameter ‹string›, and the prefix “re” is the regex module. 〔Python: regex Example http://xahlee.info/perl-python/regex.html〕
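A minimal Python sketch of the contrast described above. The sample string and variable names are only illustrative; `str.split`, `re.sub`, and `re.compile` are standard library calls.

```python
import re

text = "2013-11-28"

# Data as the object: the string comes first, the separator is the argument.
parts = text.split("-")

# Module as the prefix: the string being worked on is the last argument.
slashed = re.sub(r"-", "/", text)

# Compiled pattern as the object: the string is still only an argument.
pattern = re.compile(r"-")
slashed_again = pattern.sub("/", text)

print(parts)
print(slashed)
print(slashed_again)
```

In all three calls the string is the thing being worked on, but it sits in a different position each time.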

this idiocy happens in JavaScript too. Here are some normal examples, where the meat is the first thing:

「‹string›.split(‹char›)」

「‹string›.replace(/‹pattern›/, ‹replacement›)」

and now witness this:

「‹regex pattern›.test(‹string›)」

above, the meat is the ‹string›, while the regex pattern is the object. 〔JavaScript Regex Tutorial http://xahlee.info/js/js_regex.html〕

and, now, LOOK:

「Math.min.apply(‹context›, ‹list›)」

above, the meat is the ‹list›. The “Math” is a global object, “min” is one of its methods, and “apply” is a method that every function inherits from “Function.prototype”.

the bottom line, is complexity multiplied by confoundedness.

solution? ban it. Instead, everything should be a function:

math.f(x1, x2, x3, …)

The meat is the first thing in the function's parameter spec, while the “math.f” uniquely identifies its {module, namespace, purpose}.
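One way to picture this proposal in Python is a set of plain functions that always take the data first. The function names `split`, `replace`, `matches`, and `minimum` are hypothetical wrappers invented for this sketch, not an existing library.

```python
import re


def split(string, sep):
    """Hypothetical wrapper: the string being split is the first parameter."""
    return string.split(sep)


def replace(string, pattern, replacement):
    """Hypothetical wrapper: string first, then the regex and its replacement."""
    return re.sub(pattern, replacement, string)


def matches(string, pattern):
    """Hypothetical wrapper: True if the regex is found anywhere in the string."""
    return re.search(pattern, string) is not None


def minimum(numbers):
    """Hypothetical wrapper: the list is the only parameter."""
    return min(numbers)


print(split("a,b,c", ","))
print(replace("a1b2", r"\d", "#"))
print(matches("hello", r"l+"))
print(minimum([3, 1, 2]))
```

Every call now has the same shape, at the cost of hiding which built-in type or module does the work.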

Complementary essay:
What are OOP's Jargons & Complexities (OOP as Functional Programing)
http://xahlee.info/comp/oop.html

#python   #javascript
It's not idiocy; these are two very different scenarios: a function defined on an instance object is bound when the object is constructed and follows the usual inheritance chain whereas a function defined by a lib is statically bound. Or to use your terminology, in the first usage, the "meat" defines the function and may override all or parts of it (or restrict access to only itself or others of its kind), whereas in the second usage, the function definition is beyond the meat's control. Your proposal would seem to remove the ability to make this very useful distinction.
Xah Lee
 
+Craig Lennox right. But that's a technical excuse.

the bottom line, with oop dot notation, is that, there's a decision to make. Should the lib be the object, or should the data be the object?

this dilemma comes in string, regex, math.
In js for example, regex is an object, string is also an object. JS decided to support both, kind of. With string as object, you have limited regex ability. With regex as object, you have fuller regex ability. ( my JavaScript Regex Tutorial http://xahlee.info/js/js_regex.html )

math is also a good example. e.g. complex numbers. Should complex number be the object, or the lib Math be the object? or should Number class be the object??

my post is a bit raving, but what i proposed is simply to use functional programing and eliminate OOP. :D But for real, functional programing does not have this issue.

but also, we should note that the dot notation isn't the only possible notation for OOP. The functional notation as in f(x,y,z) could also work... here OOP is defined as having class/objects system. It does not necessitate the usual obj.method(param) notation.

lisp CLOS system is probably a good example (though i know nothing about it)
 
The idiocy is that this notation is a by-product of single dispatch. When you write a.f(x) it is only syntactic sugar for f(a,x) (the first parameter is always 'this').
This notation has caused more harm than relief.
It is not extensible to multi dispatch and it is not mandatory at all as +Craig Lennox said.
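In Python this sugar is visible directly, since a method can be called through its class with the instance passed as the first argument. The `Greeter` class below is a made-up example.

```python
class Greeter:
    """Made-up class used only to show the two call forms."""

    def __init__(self, name):
        self.name = name

    def greet(self, other):
        return f"{self.name} greets {other}"


a = Greeter("a")

dotted = a.greet("x")               # a.f(x)
functional = Greeter.greet(a, "x")  # f(a, x), with the instance as first argument

print(dotted)
print(functional)
print("abc".upper(), str.upper("abc"))
```

The two forms agree here because `greet` is looked up on the class; an attribute set on the instance itself would change what the dotted form finds.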

And btw, totally unrelated, but about syntax, in C you can address the i-th element of array a by a[i] or i[a]. I.e. you can write
"Hello"[3]
or
3["Hello"]
The underlying grammar (aka syntax) of the language is not always what one can think of at first sight. It is especially true of JS.
 
+Xah Lee, array notation decays to pointer addition: a[b] = *(a + b). Addition is commutative, so you can always say b[a] as well. (Nobody actually does this outside of the International Obfuscated C Code Contest, I bloody well hope. But you can.)
 
What you both are missing is that in Python and JS, functions and closures are first-class objects which are assigned to variables in the same way as integers, strings and any other object.  The dot operator is merely the general syntax by which a value defined on an object is accessed.  Any construct of the form:

     x.y.z.some_function(arg1,arg2,arg3);

can be rewritten equivalently as:

    temp_variable = x.y.z.some_function;
    temp_variable(arg1,arg2,arg3);

and it must perform the function equivalently in both cases, otherwise you have broken the syntax.
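A small Python sketch of this equivalence, using a made-up `Counter` class. In Python the extracted attribute is a bound method that remembers its instance.

```python
class Counter:
    """Made-up class: keeps a running total."""

    def __init__(self):
        self.total = 0

    def add(self, amount):
        self.total += amount
        return self.total


counter = Counter()

direct = counter.add(5)    # call through the dot

bound_add = counter.add    # bound method: still refers to counter
indirect = bound_add(5)    # same function, same object

print(direct, indirect, counter.total)
```

This holds for Python. In JavaScript a method pulled out this way does not keep its `this` unless it is bound (for example with `bind`), so the equivalence is less direct there.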

To address the assertion from +Fabrice Popineau about multiple dispatch, that is bollocks.  All that is required for multiple dispatch is to define a dispatch object which supports the function call operator.
http://www.artima.com/weblogs/viewpost.jsp?thread=101605
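A rough sketch of such a dispatch object in Python. It is not the code from the linked article; the `MultiMethod` class and the `combine` example are invented here, and the lookup matches exact argument types only, ignoring inheritance.

```python
class MultiMethod:
    """Invented dispatch object: picks an implementation by argument types."""

    def __init__(self, name):
        self.name = name
        self.implementations = {}

    def register(self, *types):
        """Decorator that stores a function under a tuple of argument types."""
        def decorator(func):
            self.implementations[types] = func
            return func
        return decorator

    def __call__(self, *args):
        # The function call operator: look up by the exact types of all arguments.
        key = tuple(type(arg) for arg in args)
        try:
            func = self.implementations[key]
        except KeyError:
            raise TypeError(f"no implementation of {self.name} for {key}") from None
        return func(*args)


combine = MultiMethod("combine")


@combine.register(int, str)
def _(number, text):
    return text * number


@combine.register(str, int)
def _(text, number):
    return f"{text}{number}"


print(combine(2, "ab"))
print(combine("ab", 2))
```

The call site is plain `combine(x, y)`, with the choice of implementation depending on both arguments.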
 
+Craig Lennox, Python doesn't have closures. But yes, you can assign function objects like that. (Note another terminological difference here: in Python, functions are a sort of object. In C, they are not, so e.g. where the standard says that all pointers to objects are convertible to 'char *', this does *not* mean that pointers to functions are. This is one area where POSIX requires more of the core language than the C Standard does, because of dlsym()'s nonportable use of void * to mean 'a pointer to any function'.)
 
+Craig Lennox And what is the purpose of writing a.f(x, y, z) to implement multiple dispatch?
Isn't it much simpler to write f(a, x, y, z)? At least the way Java does it, it is just stupid.
Java has declared that multiple {inheritance, dispatch} aren't needed. If you want them, you
have to implement them. And the syntax of Java doesn't help.
Xah Lee
 
+Craig Lennox good point that i did miss showing the example you mentioned, but that isn't much relevant to the essay.

take any actual source code of python or javascript. Replace all var, function, object names by random string. you'll have this form:

x.y(z)

my claim, is that, there is no way to know which is the function/method, which is the object, which is the subject, without semantics coming in.

i'm focusing on syntax of language design.

this situation happens in practice too, especially in js. One often has to pause and think, wait, which is the object, which is the subject.
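A well-known Python case of the same shape, as a short sketch. Both calls below have the form x.y(z), yet the data sits in a different slot each time.

```python
words = ["alpha", "beta", "gamma"]

# Here the receiver is the separator, and the data (the list) is the argument.
joined = ", ".join(words)

# Here the receiver is the data (the string), and the separator is the argument.
pieces = joined.split(", ")

print(joined)
print(pieces)
```

Nothing in the syntax says which of the two operands is the thing being worked on.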

in sharp contrast, lisp syntax does not have this problem, nor shell, perl (if it's not using some package), Mathematica, etc.

sure, one can explain away why it's that way, or use a workaround to show that reversing object/subject/verb is also possible, but the point remains: the dot notation syntax does not resolve the order of object/subject/verb.

One can have a syntax that allows different orders, yet with the property that the order of object/subject/verb can be distinguished by syntax alone.

All it takes is a generalized prefix and postfix notation. e.g. in Mathematica:

x[y]
is syntactically equivalent to
y // x
...
(will have to think about it and take more time to write a full example, to get going on the analogy between the object/subject/verb order thing, OOP dot notation, and Mathematica's prefix/postfix/infix/match-fix ...)
 
+Xah Lee True about any operator, [] included. That's what I wanted to show
with my C example.
And yes you need semantics to understand your x.y(z) .
However, I still don't see the point in introducing this dot notation. The
3 caballeros wanted to sell their
UML/Rational Rose stuff, that's the main point.
Xah Lee
 
+Fabrice Popineau yes. I wasn't trying to defend dot notation. And i truly dislike the UML/Rational stuff. I think they are charlatans. 
 
+Xah Lee, it has halfwitted closures. You can read, but not modify, values in enclosing scopes. To me, that's not a closure. You can't use it to implement a generator, for example, which is why Python had to implement generators as a first-class language construct. (Lua never needed to, nor did JS: both could just use perfectly ordinary closures and perfectly ordinary assignment to do it.)
 
+Xah Lee, I'm not sure if they're actually charlatans: it is possible they just got carried away and thought that this thing they invented was the greatest thing since sliced bread and would solve every problem programming had ever had, instead of, as it transpired, being nearly useless. It's not like wild over-optimism isn't a trait of computing people...
Xah Lee
 
+Nick Alcock right. My use of the word charlatan is a bit colorful. Though, perhaps this quote by EWD will make me look more credible:

… what society overwhelmingly asks for is snake oil. Of course, the snake oil has the most impressive names — otherwise you would be selling nothing — like “Structured Analysis and Design”, “Software Engineering”, “Maturity Models”, “Management Information Systems”, “Integrated Project Support Environments” “Object Orientation” and “Business Process Re-engineering” (the latter three being known as IPSE, OO and BPR, respectively). — Edsger W Dijkstra (1930 〜 2002), in EWD 1175: The strengths of the academic enterprise.
 
Ah yes, Dijkstra really did have a way with words. :)
 
take any actual source code of python or javascript. Replace all var, function, object names by random string. you'll have this form:
x.y(z)
my claim, is that, there is no way to know which is the function/method, which is the object, which is the subject, without semantics coming in.

And you can never know this.  Welcome to the world of late-binding, dynamically-typed languages.  All that power comes at a price.

Now tell me this:  assuming you have the following two constructs,
  y(x,z)
  y(z,x)

without "semantics coming in", is y() one function or two?  In Python, they have to be the same function, there is no way for them to be different.  This is useful information which gets totally pissed away under the bastardized system you're proposing.  Unless you want to live in the syntax-free drug-induced psychotic haze of CLOS along with the 8½ other people in the world who use it, dot notation just makes sense.
Xah Lee
 
+Craig Lennox i don't quite see your argument. The example
y(x,z)
y(z,x)

whether the y is one or 2 functions... how can it be 2 different functions if they have the same identifier? do you mean 2 separate function definitions for it? In mathematica (as you know, since you do use M) you can have more than one definition for the same function, where the defs differ in number of params or types of params (because it's just using pattern matching, where one def body matches say the 2 args form, another matches the 3 args form, another 2 args but with an arg of another form, etc.). And similarly, this can be done in other langs too that distinguish args received...

but still, even supposing i'm in favor of dot notation, i'm just not seeing what the above signifies?

bastardized system you're proposing
lol. normal syntax f(x) is bastardized? i didn't realize you are in favor of dot notation, but nor do i see how you find f(x,y,z,...) to be bad.
 
Well, as long as you have these things called objects, and those objects can have members which can have values, dot notation just seems to me to be a very efficient and compact way of accessing those properties.

So if you have an object x, which defines a property y, then x.y accesses that property.  If y happens to be an integer, then you might have something like 

    x.y = 42

That seems like a very natural syntax which has been around in some form since the days of Algol fucking W.

So if x.y can be a simple value like an integer, why not a closure?  If functions are first class objects, then it must be possible.  So, if the value of x.y is a function or a closure, then how do we invoke that function?  What syntax makes most sense for that?  

If your answer is that x.y cannot be a function, then functions are not first-class and your language sucks.  If your answer is that you have to say y(x), then yes, that is a bastardized syntax, end of story.
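The argument above can be tried out in Python. This sketch uses `types.SimpleNamespace` from the standard library as a stand-in for "an object x"; the stored function is arbitrary.

```python
from types import SimpleNamespace

x = SimpleNamespace()

# x.y holding a simple value.
x.y = 42
value = x.y

# x.y holding a function: the same dot finds it, and the call syntax follows.
x.y = lambda z: z + 1
result = x.y(41)

print(value, result)
```

Note that a function stored on an instance like this is a plain attribute: it is not handed `x` as a hidden first argument, unlike a method defined on a class.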

The question is not whether a thing should be done, but rather, can it be done.  Computer languages are for defining what can be done, as soon as you go to trying to tell people what should be done, you have left computer science and entered the religion business.
Xah Lee
 
+Craig Lennox yes, i do understand that. But, that wasn't what I was exploring anyway. I wasn't exploring what are possible syntaxes for OOP languages. I was pointing out an issue of OOP dot notation, namely the ambiguity of x.y(z), and pointing out that it is a practical problem that confuses people. In your first message, you mentioned the distinction of dynamic bound vs static bound. But look at javascript, where the regex can be the object with the string as arg, or the string can be the object with the regex as arg. So, both are dynamic bound, yet you have reversal of arg vs object. You have to admit this one point, then we can move on to what would be a better alternative, given OOP.

i haven't thought about what are possible notation for OOP object/method/args... ( +Fabrice Popineau mentioned lisp CLOS but i am not familiar with it at all, so Fabrice's point about multi-dispatch is just something i have yet to look at)
Xah Lee
 
+Craig Lennox but, back to your point about possible syntax of OOP... off hand i don't think i agree. I think f(...) still works fine.

i have this article
What are OOP's Jargons & Complexities (OOP as Functional Programing)
http://xahlee.info/comp/oop.html

which showed these FP syntax for OOP:

mySurface = a_surface;
mySurface(rotate(angle));

corresponding to dot notation:

mySurface = new a_surface();
mySurface.rotate(angle);

I think the normal functional notation works just fine.
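One possible reading of that functional notation, sketched in Python with closures. This is an interpretation, not code from the linked article: `a_surface` is called here to create the surface (the original writes `mySurface = a_surface;` without a call), and the dict-based state is an assumption made for the sketch.

```python
def rotate(angle):
    """Build a 'message': a function from an old state to a new state."""
    def apply_rotation(state):
        return {**state, "angle": (state["angle"] + angle) % 360}
    return apply_rotation


def a_surface():
    """Return a surface represented as a closure over its own state."""
    state = {"angle": 0}

    def surface(message):
        nonlocal state
        state = message(state)  # apply the message to the current state
        return state

    return surface


my_surface = a_surface()
after_first = my_surface(rotate(90))    # plays the role of mySurface.rotate(90)
after_second = my_surface(rotate(300))

print(after_first, after_second)
```

Only function application appears at the call site; the object is the function and the method is its argument.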
 
+Nick Alcock Would you say Haskell doesn't have closures either? Because Haskell doesn't allow you to modify the variables in the enclosing scope either.

In any event, Python allows you to do this with the "nonlocal" keyword, which has been in there since Python 3.0.
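A short sketch of `nonlocal` in Python 3, using a made-up counter. The inner function rebinds a variable that lives in the enclosing scope.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count  # rebind the variable in the enclosing scope
        count += 1
        return count

    return increment


counter = make_counter()
first = counter()
second = counter()
other_first = make_counter()()  # each closure gets its own count

print(first, second, other_first)
```

Without the `nonlocal` line, `count += 1` would raise `UnboundLocalError`, because the assignment would make `count` local to `increment`.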

Unrelated: That Java example is a ridiculous strawman -- deliberately made more verbose than necessary. You can do it with:

    String a = "a string";
    String b = "another one";
    String c = a + b;
    System.out.println(c);

I think the main point still stands (at least with respect to Java), but this is just a bad example, since Java provides string manipulation syntax that hides the OOP stuff, making the code almost identical to the Lisp version.
 
Also, some of the statements about Java are plainly false -- Java is not nearly as purely OOP as you claim.

"Standard functions like ... “for” loop constructs, “if … else” branching statements, or simple arithmetic operations… must now somehow become a method of some class."
For, if, while, and other control structures are just simple syntax in Java, nothing to do with OO. This is more a product of C than anything else.

Similarly, arithmetic is incompatible with OO in Java. "Now suppose the plus operation +, where does it go? Should it become a method of the various classes under the Number heading, or should it be a method of the Math class set?"
In Java, it isn't a method, it's just built-in syntax that only works on primitive types (which are NOT objects and don't have methods) -- with a special exception for Strings.

It's kind of ironic that you picked these examples, because I mostly agree with your core message (at least w.r.t. Java) in that it takes OOP way too far (as shown in your FileReader/FileWriter example), but control structures and arithmetic are not good examples of this. If anything, they are a violation of Java's own OOP purity because they are completely incompatible with the object system.
Xah Lee
 
+Matt Giuca you are right. The essay was written long ago as a rant, and could use some polishing.
 
I haven't really coded java for real, but Java sports iterator objects in for...loop. I learned about that concept first there. I find it really strange. Also, i don't recall the details, but Number is also an object, a wrapper object over the primitive ones.
 
+Xah Lee Right, so the fact that for loops work on iterators is one part where they interact directly with the object system. (I find this extremely useful, though; it's one of the key benefits of OOP that you can iterate over anything that supports that interface; it could just as easily be implemented with type classes in say Haskell.)

The number thing in Java is a mess. You've got primitives like int (not a class) and boxed wrapper classes like Integer, which provide methods for primitives, as well as placing them in the class hierarchy under Number, and also allowing them to be used in polymorphic data structures like Vector. You actually can do arithmetic on the boxed classes, because the compiler automatically unboxes them. It's certainly a hack, and not good design.

If I could say one thing, it would be don't let Java cloud your opinion of OOP. OOP (like all tools) is useful in some situations -- I would argue a great deal -- and inappropriate in others. Java often tries way too hard to use OOP when it isn't warranted. I think that C++ code is often a good deal more practical. Unlike Java, C++ doesn't force everything to be in a class, so if you just need a function or a struct, you just use one. There isn't the Java-like mentality of "everything should be encapsulated with getters and setters (even if it's a simple struct)" or "everything should be a method (even if it's a simple function)". But on the other hand, you can get good mileage out of classes, and because of operator overloading, you can make your classes feel like a much more natural part of the language.
 
btw, +Xah Lee, re your example

y(x,z)
y(z,x)

in C++ they could trivially be different functions, because functions can be overloaded based on the type of their arguments in C++ (and a number of other languages): a function's identifier is not its name but its name and the types of all its arguments (its type signature). This has advantages, but like so many features in C++ can also be used to make code epically unclear.

Oh, btw, if you want real fun with language grammars, is

A b(c);

a function b returning a variable of type A and taking a single parameter of type c, or a declaration of a variable named b of type A and initialized to c? In C++, you can't tell unless you know whether c is the name of a type or not. Welcome to context-sensitive languages! :/
 
The infamous "most vexing parse" problem. CUUURSE YOU!!!!
 
+Matt Giuca, I don't need to curse you. If you need to deal with that problem, C++ has already done that! :)
 
+Craig Lennox "To address the assertion from +Fabrice Popineau about multiple dispatch, that is bollocks.  All that is required for multiple dispatch to define a dispatch object which supports the function call operator."

I certainly don't want to define a dispatch object. I want the syntax of the language to handle it for me. Else sooner or later you end up with design patterns, which are usually a clue that your language sucks (the human compiler at work and so on).

+Craig Lennox "If your answer is that x.y cannot be a function, then functions are not first-class and your language sucks.  If your answer is that you have to say y(x), then yes, that is a bastardized syntax, end of story. "

The problem of dispatch and generic functions is not solved by your notation. Ok, x.y is a function, and then what? How do you dispatch on both the first and second arguments? And I mean, how do you write it?

+Craig Lennox "So if x.y can be a simple value like an integer, why not a closure?  If functions are first class objects, then it should be possible.
<snap snap>
The question is not whether a thing should be done, but rather, can it be done.  Computer languages are for defining what can be done, as soon as you go to trying to tell people what should be done, you have left computer science and entered the religion business."

If you read your own use of the word "should" carefully, you are yourself entering the religion business by defending this notation on faith.