Discussion:
DB support as contribute: is it a good idea?
(too old to reply)
Edison Wong
2008-02-13 08:52:32 UTC
Permalink
Hi folk,

Drupal 6.0 is revamped with Schema API, so what's next for Drupal 7.x?
PDO for sure! With this powerful data-access abstraction layer, workload
will much reduced for DB abstraction layer designers and developers, and
finally benefit our contribute developers and end users.

By the way, together with the decision of Drupal 7.x + PDO, there is
also some voice about moving PostgreSQL (and so other potential
databases support, e.g. Oracle, DB2, MSSQL, etc) support away from core,
but contribute; on the other hand, add official SQLite support into
Drupal core, together with MySQL.

Is this really a good idea? Or even if it is possible? As an existing
Drupal + PostgreSQL users, what will this affect your daily work? As a
potential customer of Drupal + Oracle/DB2/MSSQL/etc, is this a good new
for you, or just an evil? I would like to provide some brief idea for
you within this article.

Please refer to http://groups.drupal.org/node/8855 for more detail
information.

- --
Edison Wong
***@gmail.com
http://edin.no-ip.com/html/
Ivan Sergio Borgonovo
2008-02-13 12:52:58 UTC
Permalink
On Wed, 13 Feb 2008 16:52:32 +0800
Post by Edison Wong
By the way, together with the decision of Drupal 7.x + PDO, there
is also some voice about moving PostgreSQL (and so other potential
databases support, e.g. Oracle, DB2, MSSQL, etc) support away from
core, but contribute; on the other hand, add official SQLite
support into Drupal core, together with MySQL.
Perfect... put support for any other DB other than MySQL in
contrib *but* provide a serious abstraction layer that let people
write contrib to support *more than 2* DB ;)

That just move the problem from providing an API to providing an SPI,
making it worse.

Once you decide to go the SPI way... you've to design it in a way
that anyone can add support for his beloved DB.
You're not in charge of what DB will be plugged nor how.
Surely you won't have to develop the plug-in for each DB... but
you'll have to provide support without knowing what will get in those
plug-ins.

Everyone that is serious about DB AL should know that a) it's not
easy b) it is a matter of compromises.
If you don't compromise (performance, # of supported DB, delegation,
level of abstraction, extensibility, schedule...), you don't get the
DB AL.

Writing an SPI has its pros and cons.

Pros:
MySQL fans won't encumber core with MySQLism that sometimes make it
hard to support other DB.
PG fans then will have to write their own plug-in, and if they don't
it is up to their business, and if they do, they will do it in an
independent way from MySQL people *and* with a *serious* SPI most of
the time they will have to cut&paste code from the MySQL plug-in and
rewrite few lines.

Cons:
It is a harder work. It will have to be even more abstract and
radical.
We will have to extrude current support for the DBs and turn it into
plug-ins.
Unless we want to see a lot of code duplication we will still have to
write a basic API and the plug-ins will override just few methods.
So... anyway before we get to an SPI that is worth to have we will
have to pass through a better API.

So... let's work for putting pg support in contrib for drupal 9.
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Karoly Negyesi
2008-02-13 16:54:49 UTC
Permalink
I recommend my blog post http://drupal4hu.com/node/123 and the linked sandbox http://cvs.drupal.org/viewvc.py/drupal/contributions/sandbox/crell/pdo/ instead of FUD and confusion. And yes, it is a very good idea to move db drivers into contrib -- provided that core is capable of proper support them. This is what Crell and I are working on.
Edison Wong
2008-02-14 04:38:40 UTC
Permalink
It is very simple to comment others idea as FUD without responsibility,
but that is totally another story if works is supported by indeed
research progress.

I have upload my research progress to sandbox
(http://cvs.drupal.org/viewvc.py/drupal/contributions/sandbox/hswong3i/siren/),
a patch for Official Drupal 6.0. By applying the patch, we are able to
support totally 9 PHP drivers (including mysql, mysqli, pgsql, oci8,
pdo_mysql, pdo_pgsql, pdo_oci, pdo_sqlite and pdo_ibm) within single
core code base. Progresses are also split into number of small patches
(http://groups.drupal.org/node/8663), and ready for review (or even
commit). I dig into this field for more than year, and I am also willing
to keep it on going, for both MySQL/PostgreSQL/Oracle/SQLite/DB2/etc.

Please provide some solid research progress (but not code snippet in
brain storming level) to support your FUD comment, or else it is totally
not fair.

Regards

Karoly Negyesi wrote:
| I recommend my blog post http://drupal4hu.com/node/123 and the linked
sandbox
http://cvs.drupal.org/viewvc.py/drupal/contributions/sandbox/crell/pdo/
instead of FUD and confusion. And yes, it is a very good idea to move db
drivers into contrib -- provided that core is capable of proper support
them. This is what Crell and I are working on.

- --
Edison Wong
***@gmail.com
http://edin.no-ip.com/html/
Edison Wong
2008-02-14 07:27:25 UTC
Permalink
Moreover, I would like recommend Moodle XMLDB implementation as a solid
example (http://docs.moodle.org/en/Development:XMLDB_introduction).

The idea of my proposal is quite similar as that,
1. database.*.inc: similar as ADOdb
2. common.*.inc: similar as Moodle DML Library
(http://docs.moodle.org/en/DML_functions)
3. schema.*.inc: similar as Moodle DDL Library
(http://docs.moodle.org/en/DDL_functions)

That should be no conflict with chx and crell's work: Data API should be
something belongs to Drupal core API (similar idea as Moodle SQL neutral
statements) which is much higher level than that of my research (similar
as ADOdb). They should be split but not mix together :-)

- --
Edison Wong
***@gmail.com
http://edin.no-ip.com/html/
Ivan Sergio Borgonovo
2008-02-14 15:11:12 UTC
Permalink
On Thu, 14 Feb 2008 15:27:25 +0800
Post by Edison Wong
Moreover, I would like recommend Moodle XMLDB implementation as a
solid example
(http://docs.moodle.org/en/Development:XMLDB_introduction).
The idea of my proposal is quite similar as that,
1. database.*.inc: similar as ADOdb
2. common.*.inc: similar as Moodle DML Library
(http://docs.moodle.org/en/DML_functions)
3. schema.*.inc: similar as Moodle DDL Library
(http://docs.moodle.org/en/DDL_functions)
That should be no conflict with chx and crell's work: Data API
should be something belongs to Drupal core API (similar idea as
Moodle SQL neutral statements) which is much higher level than that
of my research (similar as ADOdb). They should be split but not mix
together :-)
I've been thinking about which level of abstraction is suited for
drupal a bit.
I came out with something like what Larry wrote (not as elegant
because queries are structured array and to "compose" you add
elements to array, you don't queue methods, but I was in a
hurry...) just for MS SQL (the scope was letting a door open to
transfer the stuff currently in MS SQL to pg that is already running
drupal).
It's something that can be carried quite far but I'm
wondering if it reach the sweet spot in abstraction/performance.

Django uses a 3rd party full ORM, but Django has never been a CMS, it
always has been a framework.
Still Drupal has been a CMS but it is going in the framework
direction.

Pushing in anything like SQLAlchemy will solve the problem of DB
abstraction at the apparent cost of bloat (apparent since it has to
be proved that adopting an ORM has such a big impact on performance
once you go the multi-db support path...).
Anyway hiding the DB too much and putting both feet in the OO realm
may have its drawback too.

Anyway... Drupal as a framework doesn't have the same wide area of
application as Django and core has very well defined objects (I think
there are discussions about redefining core objects too, cck, node,
field...). It has to be seen if pushing custom DB support into core
objects (providing some methods tailored on DB capabilities to
access/modify core drupal object) is worth.

I faced situations where doing something really exploiting the
capabilities of a DB do make the difference... but still if all the
things core have to offer to interact with core objects can be
achieved efficiently with Larry's approach, we could delay the
problem of an even more abstract DB layer some more version.

I think that at this moment having a system that let you interact
with core object efficiently and let you write DB agnostic code 90%
of the time (or avoid you to write SQL code at all) will kick out of
core and contrib *nearly* all the mysqlism that are a pain.
90% of the time people shouldn't feel the need to write SQL
directly and access functions similar to what Larry wrote or just use
core object methods.

The structure of Larry's query object should be examined carefully to
see if it makes hard to implement stuff like updates with join or
other things (subquery) that may not be so popular but still popular
enough to convince people to skip the whole DB AL and again write SQL
directly.
If a module uses 90% of supported queries and 10% of unsupported
queries... it will still be a hard module to port.
If 90% of the modules use just supported queries, it be a better
situation.
This will determine the success of a new AL.

For the remaining 10% of modules that see a real advantage
of exploiting DB specific queries core should provide a path to put
SQL specific queries in "module-contrib" sort of... so that if the
original author wrote it for a specific DB and someone is interested
in porting it to other DB it won't be a pain to do so.
If every module contrib invent its method to isolate DB specific
code, no one will be so brave to port modules to other DB.

Another thing to consider is that Drupal is a web framework... you'll
have to deal with form, menu, json, xml etc... there are a lot of
repetitive tasks that could be avoided exploiting a bit more metadata
put into the schema api (or elsewhere).
People already wrote about it... but eg. if we are going to support
newer version of DB... checks comes into play, we have constraint
etc... and even avoiding a copy of RoR, you can still automate a lot
of stuff starting from validation if you use metadata into schema.

Abstraction and delegation were separated at birth... and everyone
know that you can solve all CS problems with recursive use of
delegation (including asking Karoly to write pg drivers <g>)... but
still Drupal can't fulfil all needs of all developers in all fields
and even as a framework I'd keep the "web site" use case on top of
priorities.
That means dealing with forms and all the above... if we're not going
to have a full ORM it is worth to offer some helpers that will avoid
to write over and over the same code for common tasks taking
advantage of schema metadata or even renouncing to some higher level
abstraction to put in the schema data some info that won't be
strictly related to the DB but will make much easier to build up web
apps (auto-completion?, nested select (pk/fk) with JS/JQuery?, you
name them...).

BTW Larry, I've found that adding group by, order by info directly in
the fields "property" avoid a lot of typing and keeps info under
sight, my scenario didn't push this to much so there may be
limitations to this approach... but still... it worked for me.

At a first sight moodle functions just look inspirational but the
feeling is they give too much freedom to the coder to put non
cross-compatible SQL inside functions.
Other inspirational sources for DB AL could be:
http://propel.phpdb.org/trac/ propel
http://www.phpdoctrine.org/ doctrine
http://creole.phpdb.org/trac/ creole
ADOdb now has its own Active Record too...
I think there are others...

I'd be very interested in which problems you had to face to succeed
in porting drupal to so many DB.
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Karoly Negyesi
2008-02-17 19:43:31 UTC
Permalink
Post by Ivan Sergio Borgonovo
delegation (including asking Karoly to write pg drivers <g>)... but
$db = new PDO('pgsql:dbname=drupal', 'drupal', 'drupal');
$binarydata = "abcdefg\x00a\x00\x01\x02";
$db->exec('CREATE TABLE test (data bytea, comment varchar(64), len integer)');
$db->beginTransaction();
$stmt = $db->prepare("INSERT INTO test (data, comment, len) VALUES (:data, :comment, :len)");
$stmt->bindParam(':len', $len);
$stmt->bindParam(':data', $blob, PDO::PARAM_LOB);
$blob = fopen('php://memory', 'a');
$len = fwrite($blob, $binarydata);
rewind($blob);
$comment = 'lob';
$stmt->bindParam(':comment', $comment);
$stmt->execute();
$db->commit();

Like that? I *did* write the skeleton of LOB handling for postgresql, mind ya.
Edison Wong
2008-02-18 01:54:03 UTC
Permalink
Hmm... From my point of view, it is just similar as
http://netevil.org/uuid/4365876a-cee9-3009-7726-365876a51802 (although
it is target for pdo_oci, it share the same idea as pdo_pgsql):
<?php
$db = new PDO("oci:", "scott", "tiger");
$db->beginTransaction(); // Essential!
$stmt = $db->prepare(
~ "INSERT INTO blobtest (id, contenttype, blob) ".
~ "VALUES (:id, :type, EMPTY_BLOB()) ".
~ "RETURNING blob INTO :blob");
$stmt->bindParam(':id', $id);
$stmt->bindParam(':type', $type);
$stmt->bindParam(':blob', $blob, PDO::PARAM_LOB);
$type = 'image/gif';
$id = 1; // generate your own unique id here
$blob = fopen('/path/to/a/graphic.gif', 'rb');
$stmt->execute();
$stmt->commit();
?>

http://php.net/pdo#pdo.lobs also come with some example (BTW, this level
of code snippet is TOTALLY NOT ENOUGH for a Drupal-style db_query()
implementation...):
<?php
$db = new PDO('odbc:SAMPLE', 'db2inst1', 'ibmdb2');
$stmt = $db->prepare("insert into images (id, contenttype, imagedata)
values (?, ?, ?)");
$id = get_new_id(); // some function to allocate a new ID

// assume that we are running as part of a file upload form
// You can find more information in the PHP documentation

$fp = fopen($_FILES['file']['tmp_name'], 'rb');

$stmt->bindParam(1, $id);
$stmt->bindParam(2, $_FILES['file']['type']);
$stmt->bindParam(3, $fp, PDO::PARAM_LOB);

$db->beginTransaction();
$stmt->execute();
$db->commit();
?>

On the other hand, I get a functional implementation of pdo_pgsql *done*
with correct PDO + PostgreSQL BLOB handling, based on our existing D6 DB
API with minimal amount of logic change:
http://edin.no-ip.com/viewvc/siren/includes/database.pdo_pgsql.inc?view=markup
http://edin.no-ip.com/viewvc/siren/includes/common.postgresql.inc?view=markup
http://edin.no-ip.com/viewvc/siren/includes/schema.postgresql.inc?view=markup
http://edin.no-ip.com/viewvc/siren/includes/install.postgresql.inc?view=markup

Like this one? Don't forget that they are parallel developed with
pdo_mysql, pdo_oci, pdo_sqlite, and pdo_ibm, where all of the other are
both function correctly. I am not trying to trigger a war of
implementation; BTW, a tiny amount brain-storming-level code snippet
(which similar implementation is all around the world...) is not enough
for supporting a positive technical-based discussion ;-(

Karoly Negyesi wrote:
|> delegation (including asking Karoly to write pg drivers <g>)... but
|
| $db = new PDO('pgsql:dbname=drupal', 'drupal', 'drupal');
| $binarydata = "abcdefg\x00a\x00\x01\x02";
| $db->exec('CREATE TABLE test (data bytea, comment varchar(64), len
integer)');
| $db->beginTransaction();
| $stmt = $db->prepare("INSERT INTO test (data, comment, len) VALUES
(:data, :comment, :len)");
| $stmt->bindParam(':len', $len);
| $stmt->bindParam(':data', $blob, PDO::PARAM_LOB);
| $blob = fopen('php://memory', 'a');
| $len = fwrite($blob, $binarydata);
| rewind($blob);
| $comment = 'lob';
| $stmt->bindParam(':comment', $comment);
| $stmt->execute();
| $db->commit();
|
| Like that? I *did* write the skeleton of LOB handling for postgresql,
mind ya.

- --
Edison Wong
***@gmail.com
http://edin.no-ip.com/html/

Continue reading on narkive:
Loading...