Accessing a database - ownCloud continued

( EDIT: continued from https://plus.google.com/+KristianK%C3%B6hntopp/posts/GwTLk1kk5CG )

private static function processQuery( $query ) {
...
}

No need to look at the code. Really, don't.

Here is the docblock.

@brief does minor changes to query
@param string $query Query string
@return string corrected query string
This function replaces PREFIX with the value of $CONFIG_DBTABLEPREFIX and replaces the ` with ' or " according to the database driver.

Yes. Fucking with quoting, using str_replace against SQL command strings. What could possibly go wrong?

Hm, maybe I should be looking at the code, too.

Wait.

                }elseif( $type == 'mssql' ) {
                        $query = preg_replace( "/\`(.*?)`/", "[$1]", $query );
                        $query = str_ireplace( 'NOW()', 'CURRENT_TIMESTAMP', $query );
                        $query = str_replace( 'LENGTH(', 'LEN(', $query );
                        $query = str_replace( 'SUBSTR(', 'SUBSTRING(', $query );
                        $query = str_ireplace( 'UNIX_TIMESTAMP()', 'DATEDIFF(second,{d \'1970-01-01\'},GETDATE())', $query );

                        $query = self::fixLimitClauseForMSSQL($query);
                }

That's not exactly what the docblock says.

We are actually looking at a much more fundamental problem here.

So ownCloud probably used MDB2 as a database access layer and a database modeling layer at first, and later went PDO, because MDB sucks a lot, conceptually, functionally and performance wise.

What MDB2 doesn't do, and PDO doesn't, either, is query building. There are very few frameworks that can pull this off and don't suck, hard.

Nextstep had DBKit as a full scale ORM, and that worked to a good degree for static queries, at least when I tested it in the 90ies against Oracle and Sybase. Ruby has RoR and ActiveRecord, and that does work to some degree, except that Ruby does not really scale. It has at least a solid exit strategy, and that's worth a lot in database abstractions.

MDB did one thing (data modeling), and not very well, and didn't do the other properly (query building). I know because at the time MDB was being discussed and written, I was still active in the PHP community and did try to do the DBKit ORM thing in PHP (Version 3 at that time) and it simply did not pay off in a request-based framework.

Basically, you have to parse a lot of crap, instantiate a lot of objects and shovel a lot of strings in order to build a string, which you send to the database for parsing. At the end of the request you demolish everything and at the next request you start from scratch. The PHP execution model and this kind of abstraction did not go very well together, at all.

In an application server that would change, you'd instantiate all the modelling and query building crap and then just ask for the query (probably cached and prebuilt from a previous request). But app servers scale differently than per-request models (they cost you per session, not per concurrent request, basically their n is factor 100 or more higher for the same number of active users, and their memory footprint probably is larger as well).

In a query compiler you'd model all the crap and generate an SQL dictionary for static queries for the target database as part of the deploy and schema generation.

Anyway, what ownCloud does here:

a. They are using MDB2 for database access: to build a connection to a database, abstract away the actual PHP commands to do that, which are database specific.

b. They are using MDB2 for schema management: To compile their supposedly database agnostic XML into SQL DDL for the target database. Only that it is not: Their str_replace() processing tries to fix stuff that MDB2 does not model, or they try to work around MDB2 not knowing intricacies of certain database server versions.

That is, MDB2 is simply not up to that task, at all.

c. They are pushing literal SQL into their application and push this out to the database, which may be any of MySQL, PostgreSQL, sqlite, Oracle or MS SQL Server.

MDB2 does not help here at all, and PDO does not, either, so the SQL has to be as portable as possible, and then needs to be massaged in a very extremely bad way in order to be able to run on anything that is not MySQL (and probably an old version of MySQL, assuming MyISAM and 4.x TIMESTAMP semantics and other crap).

That's of course doomed. Evil. And a shitload of potential security problems.

The point of using a bindParam interface is that you do not str_replace quote signs, mysql_real_escape exotic unicode characters and other stuff. Yet, ownCloud trying to honor the pwn in its name does this all over the fucking place.

The proper solution would be

a. a query building Fullsize ORM. See my remarks above. This is hard to get right, and a projects in its own right, I but several are available for PHP.

b. application specific, target database specific literal SQL, as static as possible, hidden away in manually written bookkeeper functions that persist your stuff, and a thin layer of support functions making this less painful.

Solution a is not in any of the tools used. Solution b has not been chosen.

Instead this. pwning this is harder than normal, b/c they at least use bindParam in their DML, but the approach would be to go and try to find combinations of dynamic query building that can be tricked into turning their numerous, distributed and uncontrolled str_replaces into a doomsday machine.

When the NSA shit hit the fan, a lot of people stated that they would now host their own stuff.

And installed this.

This is an improvement?  R U sure? What's next? TikiWiki advocacy?

I think I'd rather trust Google and the NSA with my data right now. Seriously.

( *EDIT: continued in https://plus.google.com/+KristianK%C3%B6hntopp/posts/CP6oxFZt5pM )
Shared publiclyView activity
Related Collections