Discussion:
Do not let postgresql hold back great patches
Karoly Negyesi
2007-11-11 22:17:26 UTC
Permalink
In my opinion, any patch should be committable to core once it works
on mysql and has a decent hope (use common sense) of working on
postgresql. Once it's committed, those who care about postgresql can,
if they want, test it and fix it if needed.

Prime example: http://drupal.org/node/146466. This is the most
important patch we currently have, as it makes Drupal search speedy
and nice. And it is more or less on hold just because no one is sure
whether the postgresql update works or not.

Regards,

Karoly Negyesi
Ken Rickard
2007-11-11 22:31:25 UTC
Permalink
Is there a reliable way to search for patches requiring pgSQL testing?

Perhaps these sorts of things should be committed and then separate
issues opened for pgSQL testing?

In the case of this issue, ALTER IGNORE will not work on pgSQL. [Will
follow up there.]

It is also possible (and quick) to run EXPLAIN against pgSQL if we
have a list of queries that I could test.

- Ken Rickard
agentrickard

Note: Not a pgSQL guru, but have it installed on the Mac.
Greg Knaddison
2007-11-12 00:07:10 UTC
Permalink
Post by Ken Rickard
Is there a reliable way to search for patches requiring pgSQL testing?
Not really. In the Postgresql group[1] I created a post[2] to track
things like this, and others have proposed a taxonomy for the same
purpose. I think the problem with that is that project_issue doesn't
support editing taxonomies during followups (I assume there's an issue
- I didn't look).

Greg

[1] http://groups.drupal.org/postgresql
[2] http://groups.drupal.org/node/6980
--
Greg Knaddison
Denver, CO | http://knaddison.com
World Spanish Tour | http://wanderlusting.org/user/greg
Derek Wright
2007-11-12 00:17:07 UTC
Permalink
I think the problem with that is that project_issue doesn't support
editing taxonomies during followups
Right.
(I assume there's an issue - I didn't look).
There's always an issue. ;)

http://drupal.org/node/187480

-Derek (dww)
Khalid Baheyeldin
2007-11-12 00:30:41 UTC
Permalink
I know many will not like what I will say, but I have to say it.

A prime example of where MySQL works fine to solve an issue with a few
lines is this issue http://drupal.org/node/83738. All the huge changes
and jumping through hoops are because of accommodating PostgreSQL.

We introduced schema changes just because PostgreSQL cannot do
case-insensitive matching by default, while MySQL works fine. For the
sake of 5% (or 1%) of the sites, we are increasing complexity.
--
Khalid M. Baheyeldin
2bits.com
http://2bits.com
Drupal optimization, development, customization and consulting.
Ken Rickard
2007-11-12 01:20:59 UTC
Permalink
I would agree with Khalid, actually. Having rock-solid code on a
popular platform seems ideal, at least until something like PDO is
ready to handle database abstraction for us.

I only installed pgSQL to be a "responsible" module maintainer after
my requests for pgSQL testers went unanswered.

At work, we're entirely MySQL.

- Ken Rickard
Larry Garfield
2007-11-12 01:58:28 UTC
Permalink
PDO will only abstract the API for us. It will not help us abstract query
syntax or solve the fact that there is no such thing as "SQL", just vaguely
similar languages called MySQL, PostgreSQL, SQLite, Oracle, etc.
--
Larry Garfield AIM: LOLG42
***@garfieldtech.com ICQ: 6817012

"If nature has made any one thing less susceptible than all others of
exclusive property, it is the action of the thinking power called an idea,
which an individual may exclusively possess as long as he keeps it to
himself; but the moment it is divulged, it forces itself into the possession
of every one, and the receiver cannot dispossess himself of it." -- Thomas
Jefferson
George Kappel
2007-11-12 02:14:30 UTC
Permalink
I would think this is a slippery slope; making sure a patch works
against at least two DB backends in a reasonable way is an important
indication of quality.
Ivan Sergio Borgonovo
2007-11-12 11:34:33 UTC
Permalink
On Sun, 11 Nov 2007 20:14:30 -0600
Post by George Kappel
I would think this is a slippery slope; making sure a patch works
against at least two DB backends in a reasonable way is an important
indication of quality
Agree.

I think supporting one DB is not just a matter of losing support; it
is a matter of losing freedom. Once Drupal loses the infrastructure
for DB independence, it will be hard to put it back.

Even two DBs is really too few to keep the code sane: the second just
becomes a special case of the first rather than an abstraction.

I am *ing scared of mono-cultures.

No one said DB abstraction had to be easy; it isn't, and there are
plenty of DB abstraction layers out there to prove it.

I was actually looking at http://www.sqlalchemy.org/ (python) to
learn something. Any chance we could adopt, get inspired by, or rely
on an existing DB abstraction layer?

OK, I'll stop pretending to be a software architect; I'm going to
install D6 on pgsql now to see if I can help in any way. Installation
completed...

BTW, I have some older Drupal sites on pgsql. The reasons I chose
pgsql were historical (more mature transactions, stored
procedures...). If I had to choose for the future I *may* reconsider
and use mysql. All my pgsql sites are running smoothly, even if I had
to patch some modules. I could live with a mysql-only D6; I'd just
consider it risky for the future of Drupal itself.

Wasn't Microsoft interested in supporting drupal running on IIS/MSSQL?

http://hojtsy.hu/blog/2007-nov-04/adventures-redmond-microsoft-open-source-and-drupal

http://buytaert.net/microsoft-and-drupal


BTW as far as I know there is no way to introduce a unique constraint
in pgsql without eliminating the duplicates first.

For the case exposed here http://drupal.org/node/146466

mysql
ALTER IGNORE TABLE {search_index} ADD UNIQUE KEY sid_word_type (sid,
word, type);
ALTER IGNORE TABLE {search_dataset} ADD UNIQUE KEY sid_type (sid,
type);

would become

pgsql
delete from {search_index}
where exists ( select 'x'
from {search_index} i
where
i.sid = {search_index}.sid
and i.word = {search_index}.word
and i.type = {search_index}.type
and i.oid < {search_index}.oid
);

alter table {search_index}
add constraint sid_word_type
unique (sid, word, type);


delete from {search_dataset}
where exists ( select 'x'
from {search_dataset} i
where
i.sid={search_dataset}.sid
and i.type = {search_dataset}.type
and i.oid < {search_dataset}.oid
);

alter table {search_dataset}
add constraint sid_type
unique (sid, type);

Untested on drupal... tested on a *test* table; going to test further
shortly.

(oid is a bit of pg magic: it is the row's object id, so the delete
cannot kill every copy of a row; the instance with the lowest oid
always survives.)
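The delete-duplicates-then-add-the-constraint dance above can be sketched in miniature with Python's bundled sqlite3, where rowid plays the role oid plays in the queries above (an illustration of the pattern only, not the Drupal update itself):

```python
# Illustration only (not the Drupal update itself): sqlite3's rowid stands
# in for PostgreSQL's oid to show the "delete duplicates, then add the
# unique constraint" pattern from the queries above.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE search_index (sid INTEGER, word TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO search_index VALUES (?, ?, ?)",
    [(1, "drupal", "node"), (1, "drupal", "node"), (2, "pgsql", "node")],
)

# Keep only the copy with the lowest rowid for each (sid, word, type).
conn.execute("""
    DELETE FROM search_index
    WHERE EXISTS (SELECT 1 FROM search_index AS i
                  WHERE i.sid  = search_index.sid
                    AND i.word = search_index.word
                    AND i.type = search_index.type
                    AND i.rowid < search_index.rowid)
""")

# Now the unique constraint can be added without violations.
conn.execute("CREATE UNIQUE INDEX sid_word_type ON search_index (sid, word, type)")
print(conn.execute("SELECT COUNT(*) FROM search_index").fetchone()[0])  # 2
```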
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Ivan Sergio Borgonovo
2007-11-12 12:15:17 UTC
Permalink
On Mon, 12 Nov 2007 12:34:33 +0100
Post by Ivan Sergio Borgonovo
(oid is a bit of pgmagic, oid is the object id, so you're not
risking to kill all the row and leave at least one instance).
It seems you can't access oid directly on newer postgresql: it was
working in 7.4, but it doesn't work in 8.1.

Investigating...
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Ivan Sergio Borgonovo
2007-11-12 13:07:20 UTC
Permalink
On Mon, 12 Nov 2007 13:15:17 +0100
Post by Ivan Sergio Borgonovo
On Mon, 12 Nov 2007 12:34:33 +0100
Post by Ivan Sergio Borgonovo
(oid is a bit of pgmagic, oid is the object id, so you're not
risking to kill all the row and leave at least one instance).
It seems you can't access oid directly on newer postgresql.
It was working in 7.4, it doesn't in 8.1.
Investigating...
Default oid assignment has been dropped in releases after 8.0.

I can't judge how reliable this solution could be:

http://archives.postgresql.org/pgsql-sql/2003-01/msg00513.php
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Brenda Wallace
2007-11-16 08:50:29 UTC
Permalink
I'm here, and willing to test on postgresql. I'm not full time, but
rather a weekend hacker. Please send me stuff.
(Shiny on #drupal)
Trevor Mckeown
2007-11-16 17:28:55 UTC
Permalink
Alright, let me throw my 2 cents in here... Again, many may not like
what I have to say.

I have experience with both mysql and postgresql (along with SQL
Server and Oracle). They are both pretty cool databases, though I
tend to lean towards postgres: it's much more mature, the command-line
client is a lot better, and it has had a PL/SQL-style language for
much longer than mysql.

Everything I read in the previous messages about keeping DB
independence I think is dead on. Any application that only supports
one database shuts the door on potential users and more than likely is
using some goofed-up, non-standard SQL. Neither MySQL nor Postgres is
evil; they are just tools, and it's all about how you use those tools.
Same as the argument: guns don't kill people, people kill people.
From my limited experience with Drupal, it's bad queries that are
breaking Postgres support. '1' is not an int, it's a string. Don't
use '%s' for every parameter you pass to db_query. Realize there is a
difference between storing a string, storing an int, storing a
timestamp, etc., and please stay away from functions or constructs
that only work on one db platform.
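The placeholder point can be sketched with a toy expander (hypothetical code, not Drupal's real db_query implementation): %d coerces its argument to an integer while %s produces a quoted, escaped string, so picking the right one keeps types honest across databases:

```python
# Toy stand-in for Drupal 5/6-style db_query() placeholders (illustration
# only, not the real implementation): %d coerces its argument to an int,
# while %s produces a quoted, escaped string.
import re

def expand(template, args):
    it = iter(args)
    def sub(m):
        a = next(it)
        if m.group(0) == "%d":
            return str(int(a))  # the string '1' becomes the integer 1
        return "'%s'" % str(a).replace("'", "''")  # quote and escape
    return re.sub(r"%[ds]", sub, template)

print(expand("SELECT * FROM {node} WHERE nid = %d AND type = %s", ("1", "page")))
# SELECT * FROM {node} WHERE nid = 1 AND type = 'page'
```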

On a side note, I've been away from the community for a little while
(close to a year), and I really would like to jump back in. My jedi
programming skills have gotten a lot better (lots of jquery and lots
of php), and I would like to try my hand at designing architecture,
opening up og2list to qmail, or maybe even making a second attempt at
porting mantis to drupal. Thanks for your time.

-Trevor (tmck)
I'm here, and willing to test on postgresql.. I'm not full time but
rather a weekend hacker. Please send me stuff.
(Shiny on #drupal)
Karoly Negyesi
2007-11-12 11:57:38 UTC
Permalink
Post by George Kappel
I would think this is a slippery slope; making sure a patch works
against at least two DB backends in a reasonable way is an important
indication of quality
If pigs could fly they would be better animals. Maybe you want to
support your statement with something.

Though I did say "has a decent hope (use common sense) to work on
postgresql". Once it's committed, those who care about postgresql can
test and fix it.
Post by Khalid Baheyeldin
We introduced schema changes just because PostgreSQL cannot do case insensitive
matching by default, while MySQL works fine. For the sake of 5% (or 1%) of the sites,
we are increasing complexity.
Amen, brother. I am happy with a patch which simply removes LOWER from those queries and kicks the ball into the postgresql half of the playground. As I stated on my blog, I have nothing against postgresql; I just do not want to deal with it / be held back by it. Any support for this solution?
Earnie Boyd
2007-11-12 13:29:34 UTC
Permalink
IMO, we need to focus on Drupal SQL[1] and not MySQL SQL or PostgreSQL
SQL or Oracle SQL for coding purposes. Another layer of abstraction
handles the nitty-gritty of each different DB. AFAIR, the code prior
to 6 was both MySQL- and PgSQL-friendly, so making a change in a
different direction for Drupal 6 doesn't seem TRT to me. However, if
Drupal 5 wasn't as PgSQL-friendly as I remember, then I agree with
Karoly. For Drupal 7 I would like to see Drupal SQL handle all of the
syntax changes needed for the various DB engines; that means no more
testing of $db_type outside of the Drupal SQL abstraction. Then a
company with Oracle, Sybase and DB2 could comfortably display data
from all three database types with Drupal.

[1] http://drupal.org/node/191486

Earnie -- http://for-my-kids.com/
-- http://give-me-an-offer.com/
Konstantin Käfer
2007-11-12 15:04:42 UTC
Permalink
Post by Earnie Boyd
IMO, we need to focus on Drupal SQL[1] and not MySql SQL or
Postgresql SQL or Oracle SQL for coding purposes. Another layer of
abstraction handles the nitty gritty of each different DB.
http://adodb.sourceforge.net/. We don't want to go that route.

~Konstantin
Earnest Berry III
2007-11-12 23:34:47 UTC
Permalink
I'm just catching up on this thread, so this may have already been
answered: regarding case insensitivity, why can't one use ILIKE in the
Postgres version of the SQL statement to force a case-insensitive
match?
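A minimal sketch of the portability point (not from the thread), using Python's bundled sqlite3: plain = compares case-sensitively in PostgreSQL and SQLite, ILIKE exists only in PostgreSQL, and LOWER() on both sides is the spelling that works across MySQL, PostgreSQL, and SQLite alike:

```python
# Sketch of the portability point (illustration only): '=' compares
# case-sensitively in PostgreSQL and SQLite, ILIKE exists only in
# PostgreSQL, and LOWER() on both sides works everywhere.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('Dries')")

exact = conn.execute(
    "SELECT COUNT(*) FROM users WHERE name = 'dries'").fetchone()[0]
folded = conn.execute(
    "SELECT COUNT(*) FROM users WHERE LOWER(name) = LOWER('DRIES')").fetchone()[0]
print(exact, folded)  # 0 1
```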



Larry Garfield
2007-11-12 15:55:55 UTC
Permalink
You have just proposed that we create our own serialized language called DSQL that we then regex for *every single query* and then rewrite a query for each database backend according to some abstract set of rules.

No.

That is way too complex and way too slow, especially when it would involve slowing down MySQL (95%) users purely for the sake of everyone else (5% at most). It is also a sign that the architecture is bad: If you serialize a data structure and then string-parse the serialized version, you are doing something wrong. You should have parsed it first, then serialized.

The only alternative I see is to use data-structure-based query builders instead in some cases. We do that now for DDL statements (Schema API). I'm planning to do that for all insert/update/delete statements in Drupal 7, and I'm pondering how feasible it would be to require certain types of SELECT queries to use one. I'm still undecided.
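As a toy sketch of the query-builder idea (hypothetical API, not Drupal's actual Schema API): if the query exists as a data structure first, each driver serializes it, and nothing ever needs to be regexed back out of a string:

```python
# Toy sketch of a data-structure-based query builder (hypothetical API,
# not Drupal's actual Schema API): the query is a structure first, and a
# driver serializes it, so nothing is ever regexed back out of a string.
def build_insert(table, fields, placeholder="%s"):
    cols = ", ".join(fields)
    marks = ", ".join(placeholder for _ in fields)
    return "INSERT INTO {%s} (%s) VALUES (%s)" % (table, cols, marks)

print(build_insert("search_dataset", {"sid": 1, "type": "node"}))
# INSERT INTO {search_dataset} (sid, type) VALUES (%s, %s)
```

A pgsql driver could pass a different placeholder; the structure, not the string, stays the source of truth.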

That is, however, beside the point for right now. Right now, the point is that we're holding back a performance improvement under MySQL and a way to increase the number of shared hosts Drupal runs on (those that don't allow temp tables) for the sake of PostgreSQL, which few shared hosts support anyway.

While I agree that forcing us to think about multiple databases is good architecturally, we just don't have the resources to support non-MySQL as well as we do PostgreSQL. I would not object to committing the MySQL portions and fixing PostgreSQL later.

--Larry Garfield
Chris Johnson
2007-11-12 16:46:11 UTC
Permalink
<devils-advocate>
A smattering of thoughts:

1. Nice Catch-22 or Chicken-and-Egg situation we have here. It's
sort of like the push for PHP5, only that kind of effort falls more to
the Postgres community camp.

2. Not many Drupal sites use Postgres, so why support them? Maybe
there would be more, if we supported Postgres.

3. About the time MySQL reaches version 6, it will have roughly a
comparable feature set to Postgres 7 (more than 3 years old, now).

4. For every feature we use that is easier or better in MySQL than in
Postgres, there are items that are easier or better in Postgres. We
just haven't taken advantage of them.

5. Not wanting to abstract away the differences between databases is
partially built upon the conceit that all Drupal developers are good
SQL developers as well. In reality, the vast majority of us should
never write a line of SQL; instead we should be asking a Drupal API to
fetch/replace an object for us. In practice, that will never happen,
as it has adverse effects beyond code quality (e.g. the volume of
code, novel new applications, etc.).

6. Regarding previous remarks about the slippery slope and chx's
demand for some evidence: you should know better, chx. The evidence
is simple and obvious: if we code specifically for one database (one
browser, one screen size, one RSS reader, one whatever), the effort to
then open up and use others at a later date is much larger than if we
always coded with that thought in mind.

My stomach hurts; I'm feeling grumpy.
</devils-advocate>

Wasn't there once an effort to provide an sqlite version of Drupal? :-)
Richard Morse
2007-11-12 17:23:37 UTC
Permalink
Post by Chris Johnson
2. Not many Drupal sites use Postgres, so why support them? Maybe
there would be more, if we supported Postgres.
Once all of the modules I use support PostgreSQL, I intend to switch
over. It bugs me having to run both databases, and everything else we
do uses PostgreSQL (or Oracle, but that's not on my servers...).
Post by Chris Johnson
Wasn't there once an effort to provide an sqlite version of
Drupal? :-)
And could this be brought back? It would make setting up a home
development version much easier, as I don't really like having to
install a complete database server on my home machine.

Ricky


Ivan Sergio Borgonovo
2007-11-12 18:15:01 UTC
Permalink
On Mon, 12 Nov 2007 12:23:37 -0500
Post by Richard Morse
Post by Chris Johnson
2. Not many Drupal sites use Postgres, so why support them?
Maybe there would be more, if we supported Postgres.
Once all of the modules I use support PostgreSQL, I intend to
switch over. It bugs me having to run both databases, and
everything else we do uses PostgreSQL (or Oracle, but that's not on
my servers...).
In my experience modules need very minor tweaks to run under pgsql.
Furthermore, most modules are simpler and smaller than core... I know
that if an upgrade breaks something, most of the time it will take 5
minutes to restore pgsql compatibility. I wouldn't dare say that of
core.

Again a Chicken-and-Egg situation.

If only it were easier to implement modules that support both DBs...
One way is to actually provide an API that hides the DB; another is a
standard guideline for supporting different DBs in modules.

For the reasons expressed, I'd consider it a big loss if Drupal were to
support only one DB... even if, by a miracle, it were pgsql.

BTW, as someone pointed out, if Drupal had been designed with pgsql in
mind first... MySQL would seem dumb, missing features, etc...
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Earnie Boyd
2007-11-12 20:10:33 UTC
Permalink
Post by Larry Garfield
You have just proposed that we create our own serialized language
called DSQL that we then regex for *every single query* and then
rewrite a query for each database backend according to some abstract
set of rules.
No, that isn't what I proposed exactly. It is what you read into it.
The query need not be rewritten if the structure is sane to begin with.
If precious MySQL is the favored one, we say Drupal SQL uses its
syntax structure. Everything else is then slightly slower because the
string needs to be parsed so its syntax can be manipulated. And if done
correctly, the parsed string is cached so that it doesn't have to be
reparsed.

Earnie -- http://for-my-kids.com/
-- http://give-me-an-offer.com/
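For what it's worth, the parse-once-and-cache idea Earnie describes can be sketched in a few lines (a toy illustration, not Drupal's actual abstraction layer; the rewrite rules, function name, and dialect choices are invented for the example, and a real implementation would need a proper SQL parser rather than regexes):

```python
import re
from functools import lru_cache

# Toy rewrite rules: queries are written in the favored MySQL-ish
# dialect; other backends pay a small parsing cost to translate.
PGSQL_RULES = [
    # MySQL's two-argument CONCAT(a, b) -> standard a || b
    (re.compile(r"CONCAT\(([^,]+),\s*([^)]+)\)", re.I), r"(\1 || \2)"),
    # MySQL backtick identifier quoting -> standard double quotes
    (re.compile(r"`([A-Za-z_]+)`"), r'"\1"'),
]

@lru_cache(maxsize=1024)  # cache: each distinct query string is rewritten once
def rewrite_for_pgsql(query: str) -> str:
    for pattern, replacement in PGSQL_RULES:
        query = pattern.sub(replacement, query)
    return query

print(rewrite_for_pgsql("SELECT CONCAT(first, last) FROM `users`"))
# -> SELECT (first || last) FROM "users"
```

The `lru_cache` line is Earnie's point about caching: the translation cost is paid once per query string, not once per execution.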
Bèr Kessels
2007-11-14 23:04:54 UTC
Permalink
Post by Larry Garfield
The only alternative I see is to use data-structure-based query builders
instead in some cases.
Post by Larry Garfield
While I agree that forcing us to think about multiple databases is good
architecturally, we just don't have the resources to support non-MySQL as
well as we do PostgreSQL.  I would not object to committing the MySQL
portions and fixing PostgreSQL later.
This whole thread is the main reason why many projects choose to build a
really DB-independent system, like AdoDB or Active Record. Those two, IMO, are
OTT for Drupal, but still, it's worth noting it down somewhere: "Nov 2007: big
issues, long threads, frustrated developers, delayed release: caused by our
DB abstraction not being really abstract enough".
Bèr
--
Drupal, Ruby on Rails and Joomla! development http://webschuur.com
Dries Buytaert
2007-11-12 18:08:42 UTC
Permalink
Post by Karoly Negyesi
In my opinion, any patch should be committable to core once it works
on mysql and has a decent hope (use common sense) to work on
postgresql. Once it's committed those who care about postgresql, if
they so want to, can test and if there is a need, fix it.
Prime example: http://drupal.org/node/146466 this is the most
important patch we have currently as it makes Drupal search speedy and
nice. And it is more or less on hold just because noone is sure
whether the postgresql update works or not.
No. Supporting multiple databases is a good thing and something we
want to get better at.

While MySQL is popular in our developer world, it /only/ has a 30%
market share in the real world. Making a MySQL-only CMS is like
making a Firefox-only website. It's OK for techies, but not for mom
and dad.

I'm not willing to drop PostgreSQL support, certainly not after we put
time and effort in the schema API and now we started to scratch the
surface of object relational mapping.

PostgreSQL is not the ugly sister. Admittedly though, we keep waiting
for her prince on a white horse. ;)

--
Dries Buytaert :: http://www.buytaert.net/
Earnest Berry III
2007-11-12 23:51:00 UTC
Permalink
I agree with the earlier statement. MySQL seems "smarter/easier" because
most developers in the Drupal world use it all the time. I've been on other
projects where pgSQL is the "smarter/easier" option.

Multiple-DB support is just good practice for a project that is to be widely
adopted and touts itself as a "flexible" CMS. There's no fine print that
says "Flexible... only on Apache 2.2 running CentOS 5 with kernel X; furthermore
MySQL 5.x and PHP 5.1x and only with the following PHP libs installed...".
We're not doing data warehousing or complex SQL data cubes using
multi-dimensional series slices... thus, in my very humble opinion, we SHOULD
be able to support multiple databases with just good clean SQL code and
logic. We, as a community, should also be able to handle the "situations"
that arise between database types, such as this case-insensitivity issue. We
all have friends across the pond (and by pond I mean a Linux guy has a
Windows-expert friend, a MySQL guy has a Postgres friend, etc.), and
should be able to cross that pond for information to help form a solution.

Anyway, that's my 2 bits...well, 1.5 bits.
Karoly Negyesi
2007-11-16 22:49:51 UTC
Permalink
Post by Dries Buytaert
Post by Karoly Negyesi
In my opinion, any patch should be committable to core once it works
on mysql and has a decent hope (use common sense) to work on
postgresql.
No. Supporting multiple databases is a good thing and something we
want to get better at.
We agree.
Post by Dries Buytaert
I'm not willing to drop PostgreSQL support
Same as I answered to Barry. Would somebody care to explain how my words can mean "drop postgresql support"? How? I said "a decent hope (use common sense) to work on postgresql"; by "(use common sense)" I meant "please do not debate what counts as a decent hope". All I asked for is to commit something, move on, and let the postgresql people test/fix once it's in, thus not letting the very limited postgresql testing community hold back the whole of Drupal.

In the past we committed patches that were incomplete one way or another so we could build on them. That's all I am asking here.

I am working on the LOWER patch. I carefully worded the original letter I am following up on now. I am truly baffled that despite all these efforts, people think I want to drop PostgreSQL. If you can point that out in my letters or my work, I would be grateful, because surely all these great people take that notion from somewhere, but where...? Maybe my blog post? But some people did not like it, so I yielded to fate and am trying to find solutions that are good for everyone.
Bèr Kessels
2007-11-17 08:25:33 UTC
Permalink
No. Supporting multiple databases is a good thing and something we  
want to get better at.
I would like everyone who commented in this thread to head over to
http://groups.drupal.org/node/6772 "Active Records, a possible approach for
consistent Data APIs" and to read the excellent proposal in PDF [1] there.

It is a very Drupalish approach to an "industry standard" way of doing DB
abstraction. I believe that - with your input - we can make it into
something we all will love.

Bèr

[1] http://groups.drupal.org/files/drupal_active_records.pdf
--
Drupal, Ruby on Rails and Joomla! development http://webschuur.com
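For those who haven't met the pattern, the active-record idea in that proposal is roughly the following (a minimal Python-over-SQLite sketch; the `Record`/`Node` names and schema are invented for illustration and imply nothing about the proposal's actual API):

```python
import sqlite3

class Record:
    """Toy active record: one instance wraps one table row and knows
    how to persist itself; subclasses only declare table and columns."""
    table = None
    columns = ()

    def __init__(self, conn, **values):
        self.conn = conn
        self.id = None  # not yet saved
        for col in self.columns:
            setattr(self, col, values.get(col))

    def save(self):
        vals = [getattr(self, c) for c in self.columns]
        if self.id is None:  # first save() inserts...
            marks = ", ".join("?" for _ in self.columns)
            cur = self.conn.execute(
                f"INSERT INTO {self.table} ({', '.join(self.columns)}) "
                f"VALUES ({marks})", vals)
            self.id = cur.lastrowid
        else:                # ...later save() calls update the same row
            sets = ", ".join(f"{c} = ?" for c in self.columns)
            self.conn.execute(
                f"UPDATE {self.table} SET {sets} WHERE id = ?",
                vals + [self.id])

class Node(Record):
    table = "node"
    columns = ("title", "body")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
n = Node(conn, title="Hello", body="world")
n.save()          # INSERT
n.title = "Hi"
n.save()          # UPDATE; no hand-written SQL in the calling code
```

The SQL generation lives in one place, which is exactly where per-database differences could be handled.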
Barry Jaspan
2007-11-16 17:27:06 UTC
Permalink
Post by Karoly Negyesi
In my opinion, any patch should be committable to core once it works
on mysql and has a decent hope (use common sense) to work on
postgresql.
Kind of like saying: "Software really only needs to support Windows.
The other platforms don't matter." Except in this case our "Windows"
is only 30% of the market.

You need to understand that MySQL is broken in a lot of ways. A lot
of the "missing support for pgsql" bugs are actually "we are depending
on extremely broken and non-portable mysql behavior" bugs. If we go
down the route of accepting MySQL-only code, it will be very difficult
to get out.
Post by Karoly Negyesi
Prime example: http://drupal.org/node/146466 this is the most
important patch we have currently as it makes Drupal search speedy and
nice. And it is more or less on hold just because noone is sure
whether the postgresql update works or not.
Actually, that patch was broken for MySQL too (see
http://drupal.org/node/192348) and got committed anyway because no one
tested its upgrade path. Separate rant.

Thanks,

Barry
Karoly Negyesi
2007-11-16 22:37:58 UTC
Permalink
If we go down the route of accepting MySQL-only code, it will be very difficult
to get out.
http://lists.drupal.org/pipermail/development/2007-November/027299.html

"In my opinion, any patch should be committable to core once it works on mysql and has a decent hope (use common sense) to work on postgresql."

How can people read

"Let's drop postgresql!"

into my words? I am nonplussed.
Bill Moran
2007-11-16 22:40:07 UTC
Permalink
Post by Karoly Negyesi
If we go down the route of accepting MySQL-only code, it will be very difficult
to get out.
http://lists.drupal.org/pipermail/development/2007-November/027299.html
"In my opinion, any patch should be committable to core once it works on mysql and has a decent hope (use common sense) to work on postgresql."
How can people read
"Let's drop postgresql!"
into my words? I am nonplussed.
In my opinion, the reason for that is that it's sooo damn easy
to install PostgreSQL and test a patch against it. Why would
you request that patches get committed when it's merely "likely" that
they will work with a supported backend, when it's so easy to
go the extra few feet and verify?
--
Bill Moran
http://www.potentialtech.com
Karoly Negyesi
2007-11-16 22:55:14 UTC
Permalink
Post by Bill Moran
In my opinion, the reason for that is that it's sooo damn easy
to install PostgreSQL and test a patch against it.
You are kidding me. First of all, we do not even have enough patch testers. Second, damn easy to install PostgreSQL? Really? Where is the handbook page? The video?
Bill Moran
2007-11-16 23:13:55 UTC
Permalink
Post by Karoly Negyesi
Post by Bill Moran
In my opinion, the reason for that is that it's sooo damn easy
to install PostgreSQL and test a patch against it.
You are kidding me. First of all, we do not even have enough patch testers. Second, damn easy to install PostgreSQL? Really? Where is the handbook page? The video?
I'm not kidding you. Of course, if your distro sucks, it might
not be easy, but on FreeBSD, I do
make install; /usr/local/etc/rc.d/postgresql initdb; /usr/local/etc/rc.d/postgresql start

If you need a video to get you through that, you're not going to
be much help as a patch tester anyway ...

Have you even tried installing it? You're criticizing quite a
bit, but I'm not seeing any first-hand description, just a
baseless dismissal.
--
Bill Moran
http://www.potentialtech.com
Bill Moran
2007-11-16 23:20:44 UTC
Permalink
Post by Karoly Negyesi
Post by Bill Moran
In my opinion, the reason for that is that it's sooo damn easy
to install PostgreSQL and test a patch against it.
You are kidding me. First of all, we do not even have enough patch testers. Second, damn easy to install PostgreSQL? Really? Where is the handbook page? The video?
Wait a minute ... allow me to retract my previous statement and
get out of the trap you tricked me into.

Why are you being so confrontational? You _asked_ _why_ and I
gave you my opinion. Now you're arguing that my opinion is no
good, but it doesn't change the fact that it's still my opinion.

Stop turning this into an argument and perhaps you'll get more
support. I have no desire to argue the relative difficulty of
installing PostgreSQL on this list. If you want to discuss
that, take it to the PostgreSQL mailing lists.
--
Bill Moran
http://www.potentialtech.com
Khalid Baheyeldin
2007-11-17 00:19:57 UTC
Permalink
Post by Karoly Negyesi
Post by Karoly Negyesi
Post by Bill Moran
In my opinion, the reason for that is that it's sooo damn easy
to install PostgreSQL and test a patch against it.
You are kidding me. First of all, we do not even have enough patch
testers. Second, damn easy to install PostgreSQL? Really? Where is the
handbook page? The video?
Wait a minute ... allow me to retract my previous statement and
get out of the trap you tricked me into.
Why are you being so confrontational? You _asked_ _why_ and I
gave you my opinion. Now you're arguing that my opinion is no
good, but it doesn't change the fact that it's still my opinion.
Stop turning this into an argument and perhaps you'll get more
support. I have no desire to argue the relative difficulty of
installing PostgreSQL on this list. If you want to discuss
that, take it to the PostgreSQL mailing lists.
Bill,

Karoly is not being confrontational. Why do you say so?

His point is that we do not get enough testers IN GENERAL, let alone
PostgreSQL.

The user base of PostgreSQL vs. MySQL on Drupal is far in favor of MySQL,
and hence testing and installed base is vastly in favor of MySQL.

Yes, PostgreSQL may be a better engine, and may be damn easy to install, but
that is not the point. The point is having ENOUGH of a USER BASE WHO care to
spend the time to test patches, compatibility, upgrade paths, etc.

The LOWER issue that is being referred to is this:
http://drupal.org/node/83738.

I have voiced my concern, together with others, about adding so many columns
just to get around the case sensitivity of a database that is not much in use.

So, Karoly is proposing that we test on MySQL, then throw the issue over the
fence to the folk who use (and care about) PostgreSQL for them to handle,
because of the INSTALLED BASE issue, and the TESTING RESOURCES issue. This
is NOT a technology issue.
--
Khalid M. Baheyeldin
2bits.com
http://2bits.com
Drupal optimization, development, customization and consulting.
Khalid Baheyeldin
2007-11-17 00:22:58 UTC
Permalink
Post by Khalid Baheyeldin
So, Karoly is proposing that we test on MySQL, then throw the issue over
the fence to the folk who use (and care about) PostgreSQL for them to
handle, because of the INSTALLED BASE issue, and the TESTING RESOURCES
issue. This is NOT a technology issue.
The previous analogy of Windows is incorrect.

A more proper analogy is : "I test on LAMP, because that is what I use. If
others want to make Drupal work on Windows and IIS, then let them test
there, and propose patches that do not put Drupal through hoops to support
that. We are not specifically excluding Windows, and in fact some people use
Drupal just like that".

Same for PostgreSQL, MS-SQL, SQLite, ...etc.
--
Khalid M. Baheyeldin
2bits.com
http://2bits.com
Drupal optimization, development, customization and consulting.
Bill Moran
2007-11-17 02:06:59 UTC
Permalink
Post by Khalid Baheyeldin
The LOWER issue that is being referred to is this
http://drupal.org/node/83738.
I have voiced my concern, together with others, about adding so many columns
just to get around the case sensitivity of a database that is not much in use.
I agree with your concern 100%.

In PostgreSQL, the correct way to solve this problem would be to
add an index:
create index lower_name on users (LOWER(name));

Said index will then be used any time a WHERE clause references
LOWER(name). This avoids the problems of adding a column.
Does MySQL not have this capability?
--
Bill Moran
http://www.potentialtech.com
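Bill's suggestion is easy to try even without a PostgreSQL install, since SQLite (3.9+) supports the same kind of expression index; here is a small self-check using Python's sqlite3 module (the table and index names are made up for the demo, and Drupal's real {users} schema differs):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [("Alice",), ("BOB",), ("carol",)])

# The functional index Bill describes: index the LOWER() of the column
conn.execute("CREATE INDEX lower_name ON users (LOWER(name))")

# A case-insensitive lookup written against the same expression
row = conn.execute(
    "SELECT name FROM users WHERE LOWER(name) = LOWER(?)", ("bob",)).fetchone()
print(row)  # ('BOB',)

# Ask the planner whether the index is actually used; the plan detail
# should mention the lower_name index rather than a full table scan
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM users WHERE LOWER(name) = 'bob'"
).fetchall()
print(plan)
```

In PostgreSQL the equivalent check is `EXPLAIN SELECT ... WHERE LOWER(name) = 'bob'` after creating the same index.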
Derek Wright
2007-11-17 02:16:05 UTC
Permalink
In PostgreSQL, the correct way to solve this problem would be to...
a) Please read the issue. This approach was proposed months ago and
discussed at length.

b) Please don't reply to the issue on this mailing list. It
splinters the conversation and spams hundreds of developers who
might not care about LOWER() performance problems. Reply in the
issue so that all the replies are in one place, so that people who
care can find it, and people who don't won't be bothered. ;)

Thanks,
-Derek (dww)
Bill Moran
2007-11-17 02:11:15 UTC
Permalink
Post by Khalid Baheyeldin
Post by Karoly Negyesi
Post by Karoly Negyesi
Post by Bill Moran
In my opinion, the reason for that is that it's sooo damn easy
to install PostgreSQL and test a patch against it.
You are kidding me. First of all, we do not even have enough patch
testers. Second, damn easy to install PostgreSQL? Really? Where is the
handbook page? The video?
Wait a minute ... allow me to retract my previous statement and
get out of the trap you tricked me into.
Why are you being so confrontational? You _asked_ _why_ and I
gave you my opinion. Now you're arguing that my opinion is no
good, but it doesn't change the fact that it's still my opinion.
Stop turning this into an argument and perhaps you'll get more
support. I have no desire to argue the relative difficulty of
installing PostgreSQL on this list. If you want to discuss
that, take it to the PostgreSQL mailing lists.
Bill,
Karoly is not being confrontational. Why do you say so?
When these kinds of things happen, I'm torn between defending
my viewpoint and letting it drop for the sake of peace and
quiet. In this case, I will state my reason then let it drop.

The conversation went as follows:
Karoly: Why is everyone against my viewpoint.
Me: I think it's because ...
Karoly: You're wrong.

If he didn't want my opinion, why did he ask? It felt
confrontational and pointlessly argumentative to me, and it also
felt like it was about to degrade into "MySQL vs. PostgreSQL".

That's how I saw it. If others saw it differently, then that's
their viewpoint.
--
Bill Moran
http://www.potentialtech.com
Khalid Baheyeldin
2007-11-17 03:36:41 UTC
Permalink
Post by Bill Moran
Post by Khalid Baheyeldin
Post by Karoly Negyesi
Post by Karoly Negyesi
Post by Bill Moran
In my opinion, the reason for that is that it's sooo damn easy
to install PostgreSQL and test a patch against it.
You are kidding me. First of all, we do not even have enough patch
testers. Second, damn easy to install PostgreSQL? Really? Where is the
handbook page? The video?
Wait a minute ... allow me to retract my previous statement and
get out of the trap you tricked me into.
Why are you being so confrontational? You _asked_ _why_ and I
gave you my opinion. Now you're arguing that my opinion is no
good, but it doesn't change the fact that it's still my opinion.
Stop turning this into an argument and perhaps you'll get more
support. I have no desire to argue the relative difficulty of
installing PostgreSQL on this list. If you want to discuss
that, take it to the PostgreSQL mailing lists.
Bill,
Karoly is not being confrontational. Why do you say so?
When these kinds of things happen, I'm torn between defending
my viewpoint and letting it drop for the sake of peace and
quiet. In this case, I will state my reason then let it drop.
Karoly: Why is everyone against my viewpoint.
Me: I think it's because ...
Karoly: You're wrong.
If he didn't want my opinion, why did he ask? It felt
confrontational and pointlessly argumentative to me, and it also
felt like it was about to degrade into "MySQL vs. PostgreSQL".
That's how I saw it. If others saw it differently, then that's
their viewpoint.
I think we are going in circles here.

His very point is that HE IS NOT ADVOCATING that Drupal become a MySQL-only
monoculture. Some thought it to be so, but it is not. Nor do I believe
anyone supports that, for that matter.

His point is that the lack of resources behind PostgreSQL is delaying the
acceptance of patches that work in MySQL, thus holding Drupal back.

I believe I covered this point in my previous two emails. Let those who use
PostgreSQL put in the effort of testing the patches, not hold everything back.
--
Khalid M. Baheyeldin
2bits.com
http://2bits.com
Drupal optimization, development, customization and consulting.
Larry Garfield
2007-11-17 05:04:09 UTC
Permalink
Post by Khalid Baheyeldin
I think we are going in circles here.
His very point is that HE IS NOT ADVOCATING that Drupal become a MySQL only
monoculture. Some thought it to be so, but it is not. Nor do I believe that
anyone supports that for that matter.
His point, that the lack of resources behind PostgreSQL is delaying the
acceptance of patches that work in MySQL, thus holding Drupal back.
I believe I covered this point in my previous two emails. Let those who use
PostgreSQL put the effort of testing the patches, not hold everything back.
I expect this issue to become much bigger come Drupal 7. There's talk of
getting MS SQL support in core (courtesy Microsoft), Oracle in core, SQLite
in core... Even if we can support all of those different databases at the
same time, maintaining support for them is going to be a huge pain. 99% of
people reviewing patches are using MySQL. Will we say that no patch with a
query can be committed until one of the three people who uses Oracle with
Drupal has tested it? Will we not be able to commit a patch until Microsoft
confirms it doesn't break MS SQL?

Perhaps that's a bit hyperbolic, but since, for example, I am completely
incapable of testing against MS SQL (I don't have a Windows computer capable
of running it, even if I was willing to pay for it) and I suspect many other
people are, this issue is going to crop up more and more, and we are going to
need to figure out how we're going to address it.
--
Larry Garfield AIM: LOLG42
***@garfieldtech.com ICQ: 6817012

"If nature has made any one thing less susceptible than all others of
exclusive property, it is the action of the thinking power called an idea,
which an individual may exclusively possess as long as he keeps it to
himself; but the moment it is divulged, it forces itself into the possession
of every one, and the receiver cannot dispossess himself of it." -- Thomas
Jefferson
Ivan Sergio Borgonovo
2007-11-17 10:30:02 UTC
Permalink
On Fri, 16 Nov 2007 19:19:57 -0500
Post by Khalid Baheyeldin
So, Karoly is proposing that we test on MySQL, then throw the issue
over the fence to the folk who use (and care about) PostgreSQL for
them to handle, because of the INSTALLED BASE issue, and the
TESTING RESOURCES issue. This is NOT a technology issue.
That doesn't make it a good decision.
Again... putting aside anyone's ignorance about Postgres, and
forgetting completely the question of which DB is better, *splitting*
support for the DB is not healthy for the project at this stage.

The path should be to make the DB abstraction layer more DB-agnostic
and to make developers more aware that there is not just MySQL.
Everything will look terrible if you design for MySQL and then try to
port it to any other DB. This increases the inertia against moving to a
more agnostic DB layer and supporting anything other than MySQL.

So "commit something, move on and let the postgresql people test/fix"
is not a solution. It is a dead end.
If the few pg people stop bothering you at *every step* (until we
have a DB abstraction layer and some way to deal with DB-sensitive
patches), mysql people will become even less aware that there are
other DBs. As a side note, pg devs generally tend to know more about
mysql than mysql devs tend to know about pg.

When the pg people come out and say "what kind of garbage did you
write? It can't be ported to pg unless we....", what you'll end up
with is the mysql people saying "we can't mess with the code for
a DB that has just [put ridiculously low unsupported percentage
here] market share". And things will get worse.
It is NOT the way you build the premises for a decent SPI!

Larry Garfield made a good point too. But until you have the tools
(an abstraction layer, which should mitigate the problem quite
substantially, and a patch system tuned to make devs aware of
DB-sensitive patches), committing in the hope things will work on pg
is not going to make things better.

If you lower the support level for pgsql now you'll never have a sane
DB abstraction layer and sooner or later you'll support just one DB.

pgsql is just what we have... it could be anything else *reasonably
different* from mysql. Supporting two of "anything" is generally not
enough to build up a sane SPI. What you end up with is: case A is
supported and case B is a special case of A; when you need to support
case C, everything is so entangled that you realise trying to build an
SPI that way was wasted time.

Once drupal becomes a one-DB application it will be extremely hard to
go back.

This will "hold drupal back" much more than supporting pgsql upfront.

MS seemed to be interested in supporting drupal (is it?). They should
be interested in supporting MS SQL as well, and that should help us get
a more agnostic db layer. (Larry... maybe they can give you a couple of
licenses of SQL 200X ;) but they won't provide the exorcist).

As for me, I installed D6 on pgsql. I'll try to learn how to
contribute the "drupalish" way (cvs, issues, etc...).
I have posted some patches for postgres, mainly for modules, in the
past. I'll try to help with the betas and RCs too.

As for the lower() problem I can't see why a "lowered" index can't
get in.

Again... it is a matter of good design and *freedom*.
Maybe mysql will be where pg is now, maybe it won't; maybe you can't
turn php into python in 1 or 2 years, just as you can't turn mysql
into pg. I don't care.
Fortunately we have mysql, and postgres and sqlite and firebird
and ... to chose from.
*************
It is not a matter of pg vs mysql.
*************
I don't want to bet on others projects blindly, I don't like
mono-cultures.

DB abstraction is a known problem, and everybody knows adding one more
layer has an impact on performance. We have to live with it.
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Karoly Negyesi
2007-11-17 13:38:11 UTC
Permalink
Post by Ivan Sergio Borgonovo
When pg people will come out and say: "what kind of garbage did you
wrote? It can't be ported to pg unless we...."
Let's get back to the original post:

In my opinion, any patch should be committable to core once it works on mysql and has a decent hope (use common sense) to work on postgresql.

So, we are just skipping the final fine-tune / testing steps. We do not write garbage, nor do we commit MySQL-only code. I am willing (and always have been) to peek into the postgresql manual to see whether it has a solution to the problem at hand. This is not the same as actually writing the thing and testing it, but it gives, you know, a decent hope that it will work on postgresql.
Jeff Eaton
2007-11-17 15:14:58 UTC
Permalink
Question: Should this approach be taken with Javascript-related code
on any non-Firefox, non-IE browsers?

--Jeff
Post by Karoly Negyesi
In my opinion, any patch should be committable to core once it works
on mysql and has a decent hope (use common sense) to work on
postgresql.
So, we are just skipping the final fine-tune / testing steps. We do
not write garbage, nor do we commit MySQL-only code. I am willing (and
always have been) to peek into the postgresql manual to see whether it
has a solution to the problem at hand. This is not the same as actually
writing the thing and testing it, but it gives, you know, a decent
hope that it will work on postgresql.
Jakob Petsovits
2007-11-17 18:05:24 UTC
Permalink
Post by Jeff Eaton
Post by Karoly Negyesi
In my opinion, any patch should be committable to core once it works
on mysql and has a decent hope (use common sense) to work on
postgresql.
Question: Should this approach be taken with Javascript-related code
on any non-Firefox, non-IE browsers?
I think the main point to consider here is:
will such issues hold back a release (i.e. Postgres support is still a
critical issue - "deferred, but not dropped"), or will it be released with
Postgres left in a non-functional state (or, for that matter,
Opera/Safari/Konqueror for the JavaScript stuff)?

The latter is where "dropping support" starts.
Personally, I see no problem with the former.
Karoly Negyesi
2007-11-18 14:09:23 UTC
Permalink
Post by Jeff Eaton
Question: Should this approach be taken with Javascript-related code
on any non-Firefox, non-IE browsers?
I would exercise caution here. While I can easily see how a given patch might get an untested postgresql version written just from the postgresql manual, this gets very tricky with browsers. The "decent hope" there is greatly deceiving...
Ivan Sergio Borgonovo
2007-11-28 12:45:01 UTC
Permalink
On Wed, 28 Nov 2007 11:48:23 +0100
It's easier to find work-arounds and translations from standard SQL
to MySQL syntax, and do the same thing to translate to another
reasonable database (Posgres, Oracle, MSSQL), than it is to reverse
translate MySQL (or other database) specific features to other
databases.
Agree.
Standard SQL is sort of the greatest common factor (in the
mathematical sense). Even more significantly, hundreds of MySQL
people have spent significant effort coming up with standard-SQL
equivalents in MySQL, so most of that work is already done for us.
Maybe there is an open-source tool which can validate standard SQL
which could be used. Then our DB abstraction could require a
verifiable standard SQL syntax which would then be translated (where
necessary) into the MySQL, Oracle, Postgres, etc. equivalents.
This might only be for CRUD operations (data manipulation
statements) and not for DDL, for which we already have a reasonable
API and which are much less standardized.
In the long run this doesn't leave space for optimisations.
People should write SQL with the standards in mind, not MySQL in mind,
but the DB abstraction layer should leave a door open for
optimisation.
There should be a default path that doesn't hurt performance
(including both mysql and pgsql code) but lets module writers write
custom code for a particular DB *if* they are willing to do so.
There was a thread that shares some aspects of this problem, about
"number of loaded modules and performance" and guidelines for
module developers.
I myself always use MySQL for my Drupal installs -- because it will
work. Thus, I never test them with Postgres, even though I run
Postgres on several systems, am very comfortable with using it, etc.
I just don't have the time to hassle with discovering some obscure
SQL bug resulting from database incompatibilities when I'm just
trying to make a site work.
I use pgsql for drupal, and till now I have installed stable versions of
drupal where most core stuff worked out of the box (there was
a minor issue with the length of watchdog messages, where MySQL silently
trims stuff while pg complains), and fixes for the modules I use were
trivial.

I have to learn how to use the drupal "patch/issues" infrastructure to
give the patches I make a better chance of getting into modules/core
smoothly. I still have to read carefully all the coding standards.
So unless we actually push the issue of supporting multiple
databases as a community, the vast majority of developers will
never make the effort. And with each passing release spent in that
mode, the harder it will become to reverse it. We will end up
supporting only one database, MySQL, de facto.
Absolutely agree. But it is a chicken-and-egg problem. Devs don't
care because there isn't enough abstraction in the DB layer and the code
is too mysqlish; the code continues to be that way because most devs are
overwhelmed by, or ignorant of, how to really deal with multiple DBs.

I sympathise with Chris Johnson. I understand the frustration of core
devs when they have to deal with the latency/scarcity of pg devs
trying to improve things, give suggestions, etc. about performance on pg.
But again this is a chicken-and-egg problem. Right now everything that
is going to change or add work on the MySQL side is seen as an annoyance,
because it is actually complicated to improve stuff for pg without
degrading performance for MySQL.

In light of the fact that MySQL is going to get the same
features pg has had for years, I think it would be a good bet to
improve the abstraction layer for BOTH, so as to be able to use the most
advanced features that BOTH will have. Anyone who has dealt with
more than just MyISAM knows there are places where some of these
features can improve performance and code writing *a lot*.

I think most people here know there is no silver bullet when you've
to deal with flexibility and performance and DB abstraction layers
are not an exception.
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Chris Johnson
2007-11-30 14:37:16 UTC
Permalink
Post by Ivan Sergio Borgonovo
On Wed, 28 Nov 2007 11:48:23 +0100
Maybe there is an open-source tool which can validate standard SQL
which could be used. Then our DB abstraction could require a
verifiable standard SQL syntax which would then be translated (where
necessary) into the MySQL, Oracle, Postgres, etc. equivalents.
This might only be for CRUD operations (data manipulation
statements) and not for DDL, for which we already have a reasonable
API and which are much less standardized.
In the long run this doesn't leave space for optimisations.
People should write SQL with the standards in mind, not MySQL in mind,
but the DB abstraction layer should leave a door open for
optimisation.
There should be a default path that doesn't hurt performance
(including both mysql and pgsql code) but lets module writers write
custom code for a particular DB *if* they are willing to do so.
Well, actually I'm very much in favor of optimized SQL, because I'm
very much in favor of optimized Drupal. I hate software bloat.

So what I really think -- and this is a truly unpopular position -- is
that most of the SQL should be written by "core" SQL coders, and the
rest of us should be accessing data through object abstractions, i.e.
get me a node, get me a user. Underneath is the highly optimized,
shiny, sparkly, incredibly brilliant PHP and SQL code written by the
'database access guru core team'.

Ok, I'll put the crack pipe down now.

My goal with the above suggestion was just to make it easier for
people to write the standard SQL to MySQL || PGSQL || MS SQL || Oracle
abstraction-layer translation stuff by helping authors avoid difficult
to translate non-standard-isms. If developers can avoid creating
modules which depend on unique special extension X in MySQL or special
extension Y in Postgres (unless they really meant to), then we'd be
better off in our quest to support more than one database.

I suppose I have not made things much clearer. For some reason, I'm
plagued with complicated phraseology today. :-/
Bill Moran
2007-11-30 14:47:01 UTC
Permalink
Post by Chris Johnson
Post by Ivan Sergio Borgonovo
On Wed, 28 Nov 2007 11:48:23 +0100
Maybe there is an open-source tool which can validate standard SQL
which could be used. Then our DB abstraction could require a
verifiable standard SQL syntax which would then be translated (where
necessary) into the MySQL, Oracle, Postgres, etc. equivalents.
This might only be for CRUD operations (data manipulation
statements) and not for DDL, for which we already have a reasonable
API and which are much less standardized.
In the long run this doesn't leave room for optimisation.
People should write SQL with standards in mind and not MySQL in mind,
but the DB abstraction layer should leave a door open for
optimisation.
There should be a default path that doesn't hurt performance
(including both mysql and pgsql code) but lets module writers write
custom code for a particular DB *if* they are willing to do so.
Well, actually I'm very much in favor of optimized SQL, because I'm
very much in favor of optimized Drupal. I hate software bloat.
So what I really think -- and this is a truly unpopular position -- is
that most of the SQL should be written by "core" SQL coders, and the
rest of us should be accessing data through object abstractions, i.e.
get me a node, get me a user. Underneath is the highly optimized,
shiny, sparkly, incredibly brilliant PHP and SQL code written by the
'database access guru core team'.
Ok, I'll put the crack pipe down now.
My goal with the above suggestion was just to make it easier for
people to write the standard SQL to MySQL || PGSQL || MS SQL || Oracle
abstraction-layer translation stuff by helping authors avoid difficult
to translate non-standard-isms. If developers can avoid creating
modules which depend on unique special extension X in MySQL or special
extension Y in Postgres (unless they really meant to), then we'd be
better off in our quest to support more than one database.
I suppose I have not made things much clearer. For some reason, I'm
plagued with complicated phraseology today. :-/
I think you should smoke some more of that crack pipe.

If you look at an OS (Linux or BSD or whatever) you'll see the kernel
divided into machine dependent and machine independent parts. For
example, you can't boot an ARM processor the same way you boot an i386
CPU, but most of the other things the kernel does can come from the
same code.

I expect that the right way to make Drupal _truly_ database neutral is
to have database-dependent and database-independent code. You'd only
see this kind of thing at the lower levels, and higher logic code
shouldn't even care about it.

Take, for example, that COUNT(*) is _really_ fast on MySQL + MyISAM,
but is slow on just about every other DB. So, replace it with
$db->count(), which does an SQL COUNT(*) on MySQL + MyISAM, but uses
some other technique on other databases. The programmer no longer
cares _how_ such a thing gets done, he only cares that he wants to
know how many rows are in a particular query.
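A rough sketch of what such a driver-dependent count() could look like. All class names are hypothetical (this is not real Drupal API), and the queryScalar() stub stands in for the real driver execution path:

```php
<?php
// Hypothetical sketch, not real Drupal API: each driver answers
// count() with whatever technique is cheap on its engine.

abstract class DatabaseConnection {
  abstract public function count($table);
  // Stub standing in for the real driver execution path;
  // always returns 42 so the sketch stays self-contained.
  protected function queryScalar($sql) { return 42; }
}

class MyisamConnection extends DatabaseConnection {
  public function count($table) {
    // MyISAM keeps an exact row count in the table header,
    // so COUNT(*) is effectively free here.
    return $this->queryScalar("SELECT COUNT(*) FROM {" . $table . "}");
  }
}

class PgsqlConnection extends DatabaseConnection {
  public function count($table) {
    // On PostgreSQL COUNT(*) scans the table; the planner's estimate
    // in pg_class.reltuples is often good enough, e.g. for pagers.
    return $this->queryScalar(
      "SELECT reltuples::bigint FROM pg_class WHERE relname = '$table'");
  }
}

$db = new MyisamConnection();
echo $db->count('node');
// 42
```

The calling code just writes $db->count('node') and never learns which SQL, if any, was issued underneath.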
--
Bill Moran
http://www.potentialtech.com
Earnie Boyd
2007-11-30 23:53:33 UTC
Permalink
Post by Bill Moran
If you look at an OS (Linux or BSD or whatever) you'll see the kernel
divided into machine dependent and machine independent parts. For
example, you can't boot an ARM processor the same way you boot an i386
CPU, but most of the other things the kernel does can come from the
same code.
I expect that the right way to make Drupal _truly_ database neutral is
to have database-dependent and database-independent code. You'd only
see this kind of thing at the lower levels, and higher logic code
shouldn't even care about it.
Drupal SQL http://drupal.org/node/191486
Post by Bill Moran
Take, for example, that COUNT(*) is _really_ fast on MySQL + MyISAM,
but is slow on just about every other DB. So, replace it with
$db->count(), which does an SQL COUNT(*) on MySQL + MyISAM, but uses
some other technique on other databases. The programmer no longer
cares _how_ such a thing gets done, he only cares that he wants to
know how many rows are in a particular query.
And we can optimize in the abstraction. We need to trim the number of
places where we have to worry about optimization.

Earnie -- http://for-my-kids.com/
-- http://give-me-an-offer.com/
Khalid Baheyeldin
2007-12-03 04:17:36 UTC
Permalink
Post by Bill Moran
I expect that the right way to make Drupal _truly_ database neutral is
to have database-dependent and database-independent code. You'd only
see this kind of thing at the lower levels, and higher logic code
shouldn't even care about it.
The alternatives are:

1. Calling modules would do 'case' statements for each type of database. No
one wants that, unless it is a highly optimized site with lots of
customization ( i.e. not the normal Drupal core).

2. Use the lowest common denominator for SQL and live with performance and
scalability issues.

3. Do database specific stuff (including optimization) in the abstraction
layer (this is what Bill Moran advocates above).

I am for #3 as the way forward.
--
Khalid M. Baheyeldin
2bits.com
http://2bits.com
Drupal optimization, development, customization and consulting.
Larry Garfield
2007-12-03 04:38:48 UTC
Permalink
Post by Khalid Baheyeldin
Post by Bill Moran
I expect that the right way to make Drupal _truly_ database neutral is
to have database-dependent and database-independent code. You'd only
see this kind of thing at the lower levels, and higher logic code
shouldn't even care about it.
1. Calling modules would do 'case' statements for each type of database. No
one wants that, unless it is a highly optimized site with lots of
customization ( i.e. not the normal Drupal core).
2. Use the lowest common denominator for SQL and live with performance and
scalability issues.
3. Do database specific stuff (including optimization) in the abstraction
layer (this is what Bill Moran advocates above).
I am for #3 as the way forward.
#3 is what I'm pushing for in the D7 database overhaul, within reason. How to
abstract everything without making the syntax insane or the performance
terrible is what I'm currently trying to wrap my head around. The biggest
problem is field type handling, which I want to keep away from module
developers but varies greatly depending on the database.
--
Larry Garfield AIM: LOLG42
***@garfieldtech.com ICQ: 6817012

"If nature has made any one thing less susceptible than all others of
exclusive property, it is the action of the thinking power called an idea,
which an individual may exclusively possess as long as he keeps it to
himself; but the moment it is divulged, it forces itself into the possession
of every one, and the receiver cannot dispossess himself of it." -- Thomas
Jefferson
adrian rossouw
2007-12-03 06:24:25 UTC
Permalink
Post by Larry Garfield
#3 is what I'm pushing for in the D7 database overhaul, within
reason. How to
abstract everything without making the syntax insane or the
performance
terrible is what I'm currently trying to wrap my head around. The biggest
problem is field type handling, which I want to keep away from module
developers but varies greatly depending on the database.
and the biggest issue here is date / time types.

it's fscking crazy how different these are between databases.
Larry Garfield
2007-12-03 06:49:01 UTC
Permalink
Post by adrian rossouw
Post by Larry Garfield
#3 is what I'm pushing for in the D7 database overhaul, within reason. How to
abstract everything without making the syntax insane or the
performance
terrible is what I'm currently trying to wrap my head around. The biggest
problem is field type handling, which I want to keep away from module
developers but varies greatly depending on the database.
and the biggest issue here is date / time types.
it's fscking crazy how different these are between databases.
Actually, that one I think I have a solution for, or the beginnings of one.
In PHP 5, there is the DateTime class to represent a date/time in some
timezone. It's great, but not a proper value object. Writing a value object
around DateTime is easy, however, and I've already done one.

So, each database driver defines internally its own format string to convert a
DateTime/DateValue class into the string format it expects for a date/time
stamp. The database driver then detects the presence of an object as an
argument to a query/prepared statement and formats it to its desired format,
which is then inserted into the query. That works for putting a date/time
value into a query (be it insert, update, delete, or select). For reading, I
believe we can use schema API to check if a field is a timestamp field and
then cast it back to a DateValue object. What the performance implication of
that is, though, I do not know. It may be something that we just have to
swallow if we want real database independence, along with always using an
object to represent a date/time (which I think we should be doing anyway).
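A minimal sketch of that per-driver format-string idea. The driver classes and the expand_argument() helper are made up for illustration, not the eventual D7 API:

```php
<?php
// Illustrative only: each driver declares the date string format its
// engine expects; the layer converts DateTime arguments on the way in.

class MysqlDriver {
  const DATE_FORMAT = 'Y-m-d H:i:s';    // MySQL DATETIME literal
}

class PgsqlDriver {
  const DATE_FORMAT = 'Y-m-d H:i:sO';   // timestamp-with-time-zone friendly
}

function expand_argument($driver, $value) {
  // Detect a DateTime argument and serialise it to the driver's format;
  // everything else passes through untouched.
  if ($value instanceof DateTime) {
    return $value->format(constant(get_class($driver) . '::DATE_FORMAT'));
  }
  return $value;
}

$stamp = new DateTime('2007-12-03 00:49:01', new DateTimeZone('UTC'));
echo expand_argument(new MysqlDriver(), $stamp);
// 2007-12-03 00:49:01
```

The point is that the format string lives in exactly one place per driver, so higher-level code only ever sees DateTime objects.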

As for in-SQL operations on the date value, like MONTH() or YEAR() in MySQL?
If someone has an idea for those that doesn't involve regexing every query I
would dearly love to hear it. :-)
--
Larry Garfield AIM: LOLG42
***@garfieldtech.com ICQ: 6817012

"If nature has made any one thing less susceptible than all others of
exclusive property, it is the action of the thinking power called an idea,
which an individual may exclusively possess as long as he keeps it to
himself; but the moment it is divulged, it forces itself into the possession
of every one, and the receiver cannot dispossess himself of it." -- Thomas
Jefferson
Ivan Sergio Borgonovo
2007-12-03 09:18:48 UTC
Permalink
On Mon, 3 Dec 2007 00:49:01 -0600
Post by Larry Garfield
As for in-SQL operations on the date value, like MONTH() or YEAR()
in MySQL? If someone has an idea for those that doesn't involve
regexing every query I would dearly love to hear it. :-)
EXTRACT is in both the mysql and pg manuals.
There is no such beast in MS SQL[*]. Something to get around it in
MS SQL could be DATEPART (there is a DATE_PART in mysql and pg, if I
remember right).
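This is exactly the sort of per-engine mapping an abstraction layer could hide behind one helper. A hypothetical sketch (the function name and db_type strings are assumptions, not existing Drupal API):

```php
<?php
// Hypothetical helper: map a portable date-part request to each
// engine's syntax, so callers never write MONTH()/YEAR() directly.

function db_date_part($db_type, $part, $field) {
  switch ($db_type) {
    case 'mysql':
    case 'pgsql':
      // Both understand the standard EXTRACT form.
      return "EXTRACT($part FROM $field)";
    case 'mssql':
      // MS SQL has no EXTRACT; DATEPART is the closest equivalent.
      return "DATEPART($part, $field)";
  }
}

echo db_date_part('pgsql', 'MONTH', 'n.created');
// EXTRACT(MONTH FROM n.created)
```

A switch per function scales badly, of course; in a real layer each driver would own its mapping table, but the principle is the same.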

Anyway, you can't handle all the functions while avoiding
search&replace.
a) we can choose some popular functions to optimise well
b) if queries are built from objects they should be more regular,
so search&replace could be done without regexps

Anyway, I really would involve in such decisions the people who
already did the MS/Oracle/DB2 ports.

If we were doing a "general" abstraction layer we could just borrow
a more mature one.

Unfortunately PHP is not compiled (sort of).

[*] as should be clear... DB abstraction should span more than 2 DBs
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Khalid Baheyeldin
2007-12-03 23:42:56 UTC
Permalink
Post by Larry Garfield
As for in-SQL operations on the date value, like MONTH() or YEAR() in MySQL?
If someone has an idea for those that doesn't involve regexing every query I
would dearly love to hear it. :-)
The current use of UNIX timestamp is a compromise, and has its issues, but
still allows in-SQL operations.

In-SQL operations is very very important. Going to ActiveRecord type of
abstraction without the ability to do aggregate operations on the data is
just not an option.

Without in-SQL, doing things like:

select n.nid, n.type, n.title, count(*) as num_votes, avg(value) as avg_vote
from votingapi_vote v inner join node n on content_id = n.nid where n.type =
'image' and n.created between unix_timestamp('2007-11-01 00:00:00') and
unix_timestamp('2007-11-30 23:59:59') group by v.content_id having num_votes
> 1 order by avg_vote desc, num_votes desc limit 20;

Would be impossible, or would entail very expensive full table scans.
Larry Garfield
2007-12-04 00:40:12 UTC
Permalink
Post by Khalid Baheyeldin
Post by Larry Garfield
As for in-SQL operations on the date value, like MONTH() or YEAR() in MySQL?
If someone has an idea for those that doesn't involve regexing every query I
would dearly love to hear it. :-)
The current use of UNIX timestamp is a compromise, and has its issues, but
still allows in-SQL operations.
In-SQL operations is very very important. Going to ActiveRecord type of
abstraction without the ability to do aggregate operations on the data is
just not an option.
select n.nid, n.type, n.title, count(*) as num_votes, avg(value) as avg_vote
from votingapi_vote v inner join node n on content_id = n.nid where n.type =
'image' and n.created between unix_timestamp('2007-11-01 00:00:00') and
unix_timestamp('2007-11-30 23:59:59') group by v.content_id having num_votes
> 1 order by avg_vote desc, num_votes desc limit 20;
Would be impossible, or would entail very expensive full table scans.
Except that unix_timestamp() is MySQL-specific, isn't it?

--Larry Garfield
Khalid Baheyeldin
2007-12-04 01:19:54 UTC
Permalink
Yes, but it is one level above native date/time fields.

All databases have equivalent functions for date math and the like (EXTRACT,
INTERVAL, ...etc).

My point was the need for in-SQL operations for queries, which no matter how
you abstract the individual CRUD (we already have user_load() and
node_load() for that), you still need in-SQL for reporting and aggregation.
Post by Larry Garfield
Post by Khalid Baheyeldin
Post by Larry Garfield
As for in-SQL operations on the date value, like MONTH() or YEAR() in MySQL?
If someone has an idea for those that doesn't involve regexing every query I
would dearly love to hear it. :-)
The current use of UNIX timestamp is a compromise, and has its issues, but
still allows in-SQL operations.
In-SQL operations is very very important. Going to ActiveRecord type of
abstraction without the ability to do aggregate operations on the data is
just not an option.
select n.nid, n.type, n.title, count(*) as num_votes, avg(value) as avg_vote
from votingapi_vote v inner join node n on content_id = n.nid where n.type =
'image' and n.created between unix_timestamp('2007-11-01 00:00:00') and
unix_timestamp('2007-11-30 23:59:59') group by v.content_id having num_votes
> 1 order by avg_vote desc, num_votes desc limit 20;
Would be impossible, or would entail very expensive full table scans.
Except that unix_timestamp() is MySQL-specific, isn't it?
--Larry Garfield
--
Khalid M. Baheyeldin
2bits.com
http://2bits.com
Drupal optimization, development, customization and consulting.
Larry Garfield
2007-12-04 01:44:31 UTC
Permalink
Post by Khalid Baheyeldin
Yes, but it is one level above native date/time fields.
All databases have equivalent functions for date math and the like
(EXTRACT, INTERVAL, ...etc).
My point was the need for in-SQL operations for queries, which no matter
how you abstract the individual CRUD (we already have user_load() and
node_load() for that), you still need in-SQL for reporting and
aggregation.
Agreed. Doing that in a database independent fashion is the $64,000 question.
--
Larry Garfield AIM: LOLG42
***@garfieldtech.com ICQ: 6817012

"If nature has made any one thing less susceptible than all others of
exclusive property, it is the action of the thinking power called an idea,
which an individual may exclusively possess as long as he keeps it to
himself; but the moment it is divulged, it forces itself into the possession
of every one, and the receiver cannot dispossess himself of it." -- Thomas
Jefferson
Earnie Boyd
2007-12-04 14:52:09 UTC
Permalink
Post by Larry Garfield
Except that unix_timestamp() is MySQL-specific, isn't it?
Isn't it preferred to use the ansi-sql92 TIMESTAMP column and allow the
DB engine to do TRT with it? Then we can use PHP5 functions
variant_date_from_timestamp and variant_date_to_timestamp to convert it.

Earnie -- http://for-my-kids.com/
-- http://give-me-an-offer.com/
Darrel O'Pry
2007-12-03 06:52:13 UTC
Permalink
Post by adrian rossouw
Post by Larry Garfield
#3 is what I'm pushing for in the D7 database overhaul, within reason. How to
abstract everything without making the syntax insane or the performance
terrible is what I'm currently trying to wrap my head around. The biggest
problem is field type handling, which I want to keep away from module
developers but varies greatly depending on the database.
and the biggest issue here is date / time types.
it's fscking crazy how different these are between databases.
I could have sworn that somewhere I mentioned creating query
constructs with the syntactical properties of SQL statements and
mapping them to phrases...
Now that I have my old servers back in hand the idea might be found at
http://darrelopry.darkstyles.doesntexist.com/node/63... While that was
specifically geared toward the idea of schema... I'm sure the idea of
mapping syntactical and grammatical abstractions to phrases can be
applied to queries themselves... and with a little introspection into
the query construct individual database engines could try to output
optimized queries to their databases.

see also mentions of this concept in machine translation projects aka
food for thought:
http://www.linguistics.pitt.edu/news-events/flyers/Aranovich.pdf
http://books.google.com/books?id=SZwjb_56JzkC&pg=PA118&lpg=PA118&dq=language+syntactical+grammatical+abstraction+translation&source=web&ots=c1YffE48ZK&sig=ftDJQ7KZbz8Pr2wb3Rxp3noA-hI
http://publications.csail.mit.edu/abstracts/abstracts05/jsylee/jsylee.html

The analysis, transfer, generation approach seems to be pretty common in
machine translation projects, which is technically what we're doing.
There are of course corners that can be cut, since SQL is limited in scope
and has a strongly defined structure compared to natural language. We can
skip the analysis part, since we as developers are doing the analysis to
create the queries, the query construct is the transfer/abstraction of
the syntactical structure of the query and the generation is the
responsibility of the database specific abstraction layer.

just one approach I like... it may be way overkill...
Ivan Sergio Borgonovo
2007-12-03 09:20:53 UTC
Permalink
On Mon, 03 Dec 2007 01:52:13 -0500
Post by Darrel O'Pry
Post by adrian rossouw
Post by Larry Garfield
#3 is what I'm pushing for in the D7 database overhaul, within reason. How to
abstract everything without making the syntax insane or the
performance
terrible is what I'm currently trying to wrap my head around.
The biggest
problem is field type handling, which I want to keep away from module
developers but varies greatly depending on the database.
and the biggest issue here is date / time types.
it's fscking crazy how different these are between databases.
I could have swore that some where I mentioned creating query
constructs with the syntatical properties of SQL statements and
mapping them to phrases...
If by query construct you mean a string... as Larry noted on his
website (somewhere... my connection is not that good right now), it is
not the most efficient:
un-serialised stuff (program logic) -> serialised stuff (SQL-alike
construct) -> un-serialised stuff (SQL parser).

What has been done in Schema API is more on the spot.

You keep the different parts of the query separated and let the
DB-specific engine glue them together. That way you skip the analysis,
and the "transfer" part is a sort of compilation, but generally much
easier.

You'd still have to provide a common way to pump custom code directly
into the DB layer depending on the db_type, one that's not as ugly as
a switch and that provides a fall-back.

You have some added features too... since the "compiler" understands
a bit of SQL, you could avoid some SQL injection as well (even if
this, I suspect, is what Larry defines as "the biggest problem is
field type handling".)

But... this will require a lot of rewriting of modules.
While there is no special logic in building up tables, and you
generally don't do them on the fly, passing objects instead of strings
is a quite different thing.

As a side note objects may offer some extra caching for stuff like
record count and paging. Something that seems critical in terms of
optimisation across several DB.
Post by Darrel O'Pry
Now that I have my old servers back in hand the idea might be found
the old server looks busy ;)
Post by Darrel O'Pry
at http://darrelopry.darkstyles.doesntexist.com/node/63... While
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Earnie Boyd
2007-12-03 13:07:53 UTC
Permalink
Post by Khalid Baheyeldin
3. Do database specific stuff (including optimization) in the abstraction
layer (this is what Bill Moran advocates above).
I am for #3 as the way forward.
Yes, it is the way to move forward. Doing these things anywhere else
just causes the frustration demonstrated by this thread. We end up
with one way to code the data selections, table creations, etc.
everywhere outside the Drupal SQL [1] abstraction engine. Most of it
is already taken care of; we just need to formalize, tweak (read:
patch) and document.

[1] http://drupal.org/node/191486

Earnie -- http://for-my-kids.com/
-- http://give-me-an-offer.com/
Ivan Sergio Borgonovo
2007-12-03 14:22:09 UTC
Permalink
On Mon, 03 Dec 2007 08:07:53 -0500
Post by Earnie Boyd
Post by Khalid Baheyeldin
3. Do database specific stuff (including optimization) in the
abstraction layer (this is what Bill Moran advocates above).
I am for #3 as the way forward.
Yes, it is the way to move forward. Doing these things anywhere
else just causes the frustration demonstrated by this thread. We
end up with one way to code the data selections, table creations,
etc within everything but the Drupal SQL [1] abstraction engine.
Most of it is already taken care of, we just need to formalize,
tweak (read patch) and document.
There are plenty of DB abstraction layers and ORMs. I'd hate to redo
the work in a "not so smart way".

At the lowest level you can find things like PDO, adodb, etc.;
unsurprisingly, at the higher level you find things like SQLAlchemy,
Geniusql and Storm, all written in Python.
PHP has only recently started to resemble an OO language.

Stuff like full ORMs are too-general beasts: to be fun to use they
need a stronger OO language, and they carry an overhead most people
using Drupal won't be happy to live with.

RoR has ActiveRecord but RoR stays at a lower level in the pyramid of
framework->CMS.

To increase abstraction at the lowest level we can just pick up PDO.
At a higher abstraction layer, as usual, it is important to guess
the most suitable level of abstraction and generalisation, so as to
avoid being stuck in the future while not degrading performance.

I'd provide more than one very thin layer of abstraction.
A higher level could be an ORM specialised in the most important
objects in Drupal (nodes, users, taxonomy...).
At a lower level, an SQL rewriting engine, extensible later, that
supports the most popular SQL functions and operations across
different DBs.
At the lowest level, a way to write SQL directly, with a default
non-optimised route using the second level of abstraction, and a
specific route for each DB if the developers are willing to optimise,
but in a more elegant way than writing a switch.

As a side note, without serialising, un-serialising and
re-serialising, the second level of abstraction could provide more
protection against SQL injection, since it will actually be a sort of
SQL compiler rather than a regexp filter.

I'd rely a lot on conventions to keep the code short and less
error prone.

Eg. if you're using group by, the returned fields should be in the
select statement and in the group by statement in pg.
If you pass fields names with a certain convention, fields will be
put in the group by statement by the compiler without the need to
list them twice and so on...

Similarly for the distinct statement.
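That convention could be as small as having the builder repeat the non-aggregate fields into GROUP BY itself. A throwaway sketch (the function name is made up, not an existing API):

```php
<?php
// Sketch of the convention above: non-aggregate fields are listed once
// by the caller and the compiler repeats them into GROUP BY, as
// PostgreSQL requires. Illustrative only.

function build_grouped_select($table, array $fields, array $aggregates) {
  // The plain fields appear both in SELECT and in GROUP BY;
  // the aggregates appear only in SELECT.
  $select = array_merge($fields, $aggregates);
  return 'SELECT ' . implode(', ', $select)
       . ' FROM ' . $table
       . ' GROUP BY ' . implode(', ', $fields);
}

echo build_grouped_select('node', array('type'), array('COUNT(*) AS num'));
// SELECT type, COUNT(*) AS num FROM node GROUP BY type
```

MySQL tolerates omitting the GROUP BY fields; PostgreSQL does not, so generating them mechanically removes one whole class of portability bugs.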

Somehow we could even pre-compile SQL statements or save some cached
info (such as counts; I'm testing count caching across pages to speed
up paging on MS SQL).
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Chris Johnson
2007-12-03 15:51:18 UTC
Permalink
Beware the premature optimization.

Saying that one way to avoid these problems is to use lowest common
denominator SQL sounds awful on the surface, because -- horrors! -- we
can't use all those great, unique-to-one-database functions to
optimize performance.

But in reality, very little of the database-unique SQL syntax is
really highly optimized behavior (e.g. count(*) on a MySQL MyISAM
table). Most of it is just a "different" way of doing things than the
standard or than other databases.

concat() is not faster than standard SQL ||, for instance. ;-)

We "waste" far more database performance through poorly designed
tables, poorly designed indexes and poorly designed algorithms than we
ever gain from MySQL-unique features, I'll bet.

Let's be sure we focus our optimization effort in places it really
helps, and not let it get in the way of making the software more
portable and useable.

I now surrender the soapbox.
Larry Garfield
2007-12-03 16:15:25 UTC
Permalink
Post by Chris Johnson
Let's be sure we focus our optimization effort in places it really
helps, and not let it get in the way of making the software more
portable and useable.
I now surrender the soapbox.
One of the GHOP tasks we have active is to do a performance profile of Drupal to determine where to focus such energies in Drupal 7. :-) Let's revisit the performance issue after that's been done.

--Larry Garfield
Ivan Sergio Borgonovo
2007-12-03 16:37:54 UTC
Permalink
On Mon, 3 Dec 2007 16:51:18 +0100
Post by Chris Johnson
Beware the premature optimization.
Just leave the door open to it (that is more a design choice than an
optimisation in itself)... and it may be a political move as well, to
stop resistance from people who would say "this is going to kill
performance on my beloved DB".
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Larry Garfield
2007-12-03 16:31:22 UTC
Permalink
Post by Ivan Sergio Borgonovo
On Mon, 03 Dec 2007 08:07:53 -0500
Post by Earnie Boyd
Post by Khalid Baheyeldin
3. Do database specific stuff (including optimization) in the
abstraction layer (this is what Bill Moran advocates above).
I am for #3 as the way forward.
Yes, it is the way to move forward. Doing these things anywhere
else just causes the frustration demonstrated by this thread. We
end up with one way to code the data selections, table creations,
etc within everything but the Drupal SQL [1] abstraction engine.
Most of it is already taken care of, we just need to formalize,
tweak (read patch) and document.
There are plenty DB abstraction layers and ORM. I'd hate to redo the
work in a "not so smart way".
At the lowest level you can find things like PDO, adodb etc... with
no surprise at the higher level you can find things like SQLAlchemy,
Geniusql, storm all written in python.
PHP just recently start to resemble an OO language.
Were you at DrupalCon Barcelona? :-) I had a session on the database abstraction question, and the direction that I'm coding for in D7. Slides and video are still available online, I think:

http://drupal.org/node/180155
Post by Ivan Sergio Borgonovo
Stuff like ORM are too general beasts, to be fun to use they need a
stronger OO language and carry an overhead most people using Drupal
won't be happy to live with.
Actually, I consider node_load() an ORM. Not a particularly advanced one, but a basic ORM nonetheless. Before anyone says that it should become more full-featured, that way lies the Data API saga. :-)
Post by Ivan Sergio Borgonovo
RoR has ActiveRecord but RoR stays at a lower level in the pyramid of
framework->CMS.
To increase abstraction at the lowest level we can just pick up PDO.
At a higher abstraction layer, as usual, it is important to guess
which is the most suitable level of abstraction and generalisation to
avoid to be stuck in the future while not degrading performances.
I'd provide more than one very thin layer of abstraction.
A higher level could be an ORM specialised on the most important
objects in Drupal (nodes, users, taxonomy...).
At a lower level an SQL rewriting engine that could be extended later
that will support the most popular SQL functions and operations
across different DB.
At the lowest level a way to directly write SQL with a default not
optimised route using the second level of abstraction and a specific
DB route for each DB if the developers are willing to optimise, but
in a more elegant way than writing switch.
That's sort of what I'm looking at. Layer 1 is a revised db_query() to use PDO in a type-agnostic way. Layer 3 is node_load() and kin. Layer 2, for insert/update/delete, are new utility functions that build the query for you based on arrays. The part I'm still trying to figure out is Layer 2 for SELECT statements. So far the best I've come up with is a very-simple-api query builder (probably a fluent API) object that for complex queries (complex meaning "contains a function") you *must* use. I'm not sure if that is an acceptable solution, however.
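For what it's worth, a "very-simple-api" fluent builder in that spirit could look something like the sketch below. All names are hypothetical, not the eventual D7 API:

```php
<?php
// Illustrative fluent SELECT builder: the chained calls collect the
// query parts, and compile() is where each driver could emit its own
// dialect. None of these names are real Drupal 7 API.

class SelectQuery {
  protected $table;
  protected $fields = array();
  protected $conditions = array();

  public function __construct($table) {
    $this->table = $table;
  }

  public function fields(array $fields) {
    $this->fields = $fields;
    return $this;  // returning $this is what makes the API fluent
  }

  public function condition($field, $value) {
    $this->conditions[$field] = $value;
    return $this;
  }

  public function compile() {
    $sql = 'SELECT ' . implode(', ', $this->fields)
         . ' FROM {' . $this->table . '}';
    if ($this->conditions) {
      $where = array();
      foreach ($this->conditions as $field => $value) {
        $where[] = "$field = :$field";  // placeholders, never raw values
      }
      $sql .= ' WHERE ' . implode(' AND ', $where);
    }
    return $sql;
  }
}

$q = new SelectQuery('node');
echo $q->fields(array('nid', 'title'))->condition('type', 'image')->compile();
// SELECT nid, title FROM {node} WHERE type = :type
```

Because the parts stay structured until compile(), a driver can reorder or rewrite them without any string parsing, which is the whole point of the exercise.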
Post by Ivan Sergio Borgonovo
As a side note without serialising, un-serialising, re-serialising,
the second level of abstraction could provide more security for SQL
injection since it will actually be a sort of SQL compiler rather
than a regexp filter.
Yep. String-parsing a serialized string is a design flaw.
Post by Ivan Sergio Borgonovo
I'd rely a lot on conventions to keep the code short and less
error prone.
Eg. if you're using group by, the returned fields should be in the
select statement and in the group by statement in pg.
If you pass fields names with a certain convention, fields will be
put in the group by statement by the compiler without the need to
list them twice and so on...
Similarly for the distinct statement.
Somehow we could even pre-compile SQL statement or save some cached
info (as count, I'm testing count caching across pages to speed up
paging on MS SQL).
Well, using formal prepared statements will cache the query itself, at least on databases that benefit from that. (MySQL doesn't all that much, but many others do.) I don't know if caching a built query string is going to buy us all that much. If the query string is that expensive to build, then we need to revise our query building mechanism.

--Larry Garfield
Ivan Sergio Borgonovo
2007-12-03 17:56:41 UTC
Permalink
On Mon, 3 Dec 2007 10:31:22 -0600
Post by Larry Garfield
Were you at DrupalCon Barcelona? :-) I had a session on the
database abstraction question, and the direction that I'm coding
http://drupal.org/node/180155
No, I wasn't in Barcelona and thx for the link. I did miss this one.
Post by Larry Garfield
Post by Ivan Sergio Borgonovo
Stuff like ORM are too general beasts, to be fun to use they need
a stronger OO language and carry an overhead most people using
Drupal won't be happy to live with.
Actually, I consider node_load() an ORM. Not a particularly
advanced one, but a basic ORM nonetheless. Before anyone says that
it should become more full-featured, that way lies the Data API
saga. :-)
If you call it an OORM (One Object Relational Model) I may agree ;) That's
why I wrote that real ORMs are too general. We actually know we have to deal
with nodes, taxonomy, users... and yeah, we have to leave the doors open
for other objects, but as Drupal is a framework for CMSs we need an
"ORM" for a CMS, not for everything.
Post by Larry Garfield
Post by Ivan Sergio Borgonovo
I'd provide more than one very thin layer of abstraction.
A higher level could be an ORM specialised on the most important
objects in Drupal (nodes, users, taxonomy...).
At a lower level an SQL rewriting engine that could be extended
later that will support the most popular SQL functions and
operations across different DB.
At the lowest level a way to directly write SQL with a default not
optimised route using the second level of abstraction and a
specific DB route for each DB if the developers are willing to
optimise, but in a more elegant way than writing switch.
That's sort of what I'm looking at. Layer 1 is a revised
I'm married ;)
Post by Larry Garfield
db_query() to use PDO in a type-agnostic way. Layer 3 is
node_load() and kin. Layer 2, for insert/update/delete, are new
utility functions that build the query for you based on arrays.
I think we could a) get a broader picture looking at real ORMs (such as
SQLAlchemy and Co., and maybe something in RoR), then narrow down what
we generally (with a reasonably generous "generally") do, and later
write a "SQL cross-compiler".
I did some experiments that should be similar to yours with arrays, to
live with MS SQL and pgsql and recycle the Drupal API, but well... I'm not
that convinced. I mean... tons of people did it before me and I don't
feel smart enough to get it right without looking at real
implementations reasonably carefully.
Another important thing is the acceptance of such a method and making
it uniform with all the rest (Schema API).
So... somehow arrays (aka meta info) are the only way to go, and it
looks too late unless we're going to change the Schema API as well.
ORMs in Python are, well, Pythonic. A specialised ORM in Drupal has to
be Drupalish or there won't be acceptance and it will be hard to
"remember".
BTW I'm working on profile info and it is a pain how forms are so
similar to the profile array but still different. And it is not that
fun to write a form2profile function to transform one array into another
array. It would be very nice if you could juggle schema and query
meta info, e.g. most of the time $header for the table API could be
deduced from the field list, or the opposite...
Post by Larry Garfield
The part I'm still trying to figure out is Layer 2 for SELECT
statements. So far the best I've come up with is a very-simple-api
query builder (probably a fluent API) object that for complex
queries (complex meaning "contains a function") you *must* use.
I didn't get it.
Post by Larry Garfield
I'm not sure if that is an acceptable solution, however.
Any sample?
Post by Larry Garfield
Post by Ivan Sergio Borgonovo
I'd rely a lot on conventions to keep the code short and less
error prone.
Eg. if you're using group by, the returned fields should be in the
select statement and in the group by statement in pg.
If you pass fields names with a certain convention, fields will be
put in the group by statement by the compiler without the need to
list them twice and so on...
Similarly for the distinct statement.
Somehow we could even pre-compile SQL statement or save some
cached info (as count, I'm testing count caching across pages to
speed up paging on MS SQL).
Well, using formal prepared statements will cache the query itself,
at least on databases that benefit from that. (MySQL doesn't all
that much, but many others do.) I don't know if caching a built
A concrete example: on MS SQL paging is different. To obtain a
performance speed-up I used *_data_seek. I could do better using TOP
too (LIMIT in pg/mysql jargon), but then I wouldn't have the row
count. I could cache the row count in a query_cache object and use
TOP on the next page. Having an API that relates queries to some cached
data that every DB abstraction layer could use for its optimisation
may come in handy. Each DB abstraction layer will do it as it needs, with
different data... but with the same tool.
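The count-caching idea sketched above could look something like this. All names are hypothetical, and sqlite3 stands in for whatever backend the abstraction layer wraps:

```python
# Cache the expensive COUNT(*) per query so subsequent pages reuse it.
import sqlite3

count_cache = {}

def paged(conn, sql, page, per_page):
    """Return (rows, total) for one page, caching COUNT(*) across pages."""
    if sql not in count_cache:
        count_cache[sql] = conn.execute(
            f"SELECT count(*) FROM ({sql})").fetchone()[0]
    rows = conn.execute(
        f"{sql} LIMIT {per_page} OFFSET {page * per_page}").fetchall()
    return rows, count_cache[sql]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

rows, total = paged(conn, "SELECT n FROM t ORDER BY n", page=1, per_page=3)
print(rows, total)  # [(3,), (4,), (5,)] 10
```

On a backend without LIMIT (as with MS SQL of that era), only the second query inside `paged()` would need dialect-specific rewriting; the cached total survives unchanged, which is the "same tool, different data" point.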
Post by Larry Garfield
query string is going to buy us all that much. If the query string
is that expensive to build, then we need to revise our query
building mechanism.
Reasonable.
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Larry Garfield
2007-12-03 19:53:40 UTC
Permalink
Post by Ivan Sergio Borgonovo
Post by Larry Garfield
Post by Ivan Sergio Borgonovo
I'd provide more than one very thin layer of abstraction.
A higher level could be an ORM specialised on the most important
objects in Drupal (nodes, users, taxonomy...).
At a lower level an SQL rewriting engine that could be extended
later that will support the most popular SQL functions and
operations across different DB.
At the lowest level a way to directly write SQL with a default not
optimised route using the second level of abstraction and a
specific DB route for each DB if the developers are willing to
optimise, but in a more elegant way than writing switch.
That's sort of what I'm looking at. Layer 1 is a revised
I'm married ;)
ROFL!
Post by Ivan Sergio Borgonovo
Post by Larry Garfield
db_query() to use PDO in a type-agnostic way. Layer 3 is
node_load() and kin. Layer 2, for insert/update/delete, are new
utility functions that build the query for you based on arrays.
I think we could a) get a broader picture looking at real ORM (as
SQLAlchemy and Co. and maybe something in RoR) then narrow down what
we generally (with a reasonable abundant generally) do and later
write a "sql cross compiler".
I've started looking into code for ADODB, but dear god that thing is ginormous. I am a big big fan of small but powerful code. :-)
Post by Ivan Sergio Borgonovo
I did some experiments that should be similar to yours with arrays to
live with MS SQL and pgsql and recycle Drupal api but well... I'm not
that convinced. I mean... tons of people did it before me and I don't
feel as smart as getting it right without looking at real
implementations reasonably carefully.
Another important thing is the acceptance of such a method and making
it uniform to all the rest (schema api).
So... some how arrays (aka meta info) are the only way to go and it
looks to late unless we're going to change schema api as well.
ORM in python are well pythonic. A specialised-ORM in Drupal has to
be drupalish or there won't be acceptance and it will be hard to
"remember".
As powerful as arrays are, they're not the end-all and be-all. I have a feeling that the days of no-objects-in-core are numbered. We just have to be careful not to reimplement Java. :-)
Post by Ivan Sergio Borgonovo
BTW I'm working on profile info and it is a pain how forms are so
similar to the profile array but still different. And it is not that
fun to write a form2profile function to transform an array in another
array. It would be very nice if you could juggle schema and query
meta info. eg. most of the times $header for table api could be
deduced from the field list or the opposite...
Post by Larry Garfield
The part I'm still trying to figure out is Layer 2 for SELECT
statements. So far the best I've come up with is a very-simple-api
query builder (probably a fluent API) object that for complex
queries (complex meaning "contains a function") you *must* use.
I didn't get it.
Post by Larry Garfield
I'm not sure if that is an acceptable solution, however.
Any sample?
Well, I've nothing written yet. It's still in the "simmer in the back of my head for a while" stage. But off the cuff, I'm thinking something vaguely along the lines of:

$select = db_select();

$select->join('node', 'n')->join('user', 'u')->field('uid', 'n')->countField('nid', 'n', 'num_nids')
->where('type', 'n', 'page')->group('uid', 'n')->order('num_nids', 'DESC')->limit(0, 5);

$result = $select->execute();

Which in MySQL would translate to:

SELECT n.uid, count(n.nid) as num_nids
FROM node n INNER JOIN users u ON n.uid=u.uid
WHERE n.type='page'
GROUP BY n.uid
ORDER BY num_nids DESC
LIMIT 5

A Fluent API (mutator methods return $this, so you can chain calls like in jQuery) makes the syntax much easier and shorter, and lets you specify things in any order. It also then serves as a very easy arbitrary query builder, which (maybe maybe maybe) can then become the core of Views-partially-in-core. Of course, SELECT statements can get extremely complex, which is why I'm wary of this method as it means we will almost certainly end up making certain edge-case queries impossible. As I said, still percolating.
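A stripped-down sketch of that fluent pattern, in Python rather than PHP: each mutator returns the object itself, so calls chain. The method names loosely mirror Larry's example but the implementation is hypothetical and covers only one simple query shape:

```python
# Minimal fluent query builder: mutators return self to enable chaining.

class Select:
    def __init__(self):
        self.fields, self.wheres = [], []
        self.table = self.alias = None

    def from_(self, table, alias):        # 'from' is reserved in Python
        self.table, self.alias = table, alias
        return self                       # returning self enables chaining

    def field(self, name):
        self.fields.append(f"{self.alias}.{name}")
        return self

    def where(self, cond):
        self.wheres.append(cond)
        return self

    def __str__(self):
        sql = f"SELECT {', '.join(self.fields)} FROM {self.table} {self.alias}"
        if self.wheres:
            sql += " WHERE " + " AND ".join(self.wheres)
        return sql

q = Select().from_("node", "n").field("uid").field("nid").where("n.type = 'page'")
print(q)  # SELECT n.uid, n.nid FROM node n WHERE n.type = 'page'
```

Because the query exists as structured data until `__str__()` runs, each database driver could render the same object into its own dialect, which is what makes the pattern interesting for the portability problem in this thread.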

As Gabor said, though, let's focus on D6 for the time being. :-) Contact me off-list if you want to keep up this conversation.

--Larry Garfield
Karoly Negyesi
2007-11-28 12:41:49 UTC
Permalink
Maybe there is an open-source tool which can validate standard SQL
which could be used. Then our DB abstraction could require a
verifiable standard SQL syntax which would then be translated (where
necessary) into the MySQL, Oracle, Postgres, etc. equivalents. This
might only be for CRUD operations (data manipulation statements) and
not for DDL, for which we already have a reasonable API and which are
much less standardized.
Good. I do not know about open source but http://developer.mimer.com/validator/index.htm offers what you want.
Henrique Recidive
2007-11-28 14:54:37 UTC
Permalink
Post by Karoly Negyesi
Maybe there is an open-source tool which can validate standard SQL
which could be used. Then our DB abstraction could require a
verifiable standard SQL syntax which would then be translated (where
necessary) into the MySQL, Oracle, Postgres, etc. equivalents. This
might only be for CRUD operations (data manipulation statements) and
not for DDL, for which we already have a reasonable API and which are
much less standardized.
Good. I do not know about open source but http://developer.mimer.com/validator/index.htm offers what you want.
Mimer also provides a free web service [1] for validating SQL against
ISO standards. This web service has been adopted by two open-source
projects: phpMyAdmin [2] and SQuirreL SQL [3].

Can we implement this on devel module?

AFAIK it uses SOAP.

[1] http://sqlvalidator.mimer.com/index.html
[2] http://www.phpmyadmin.net
[3] http://squirrel-sql.sourceforge.net

Cheers,

Henrique
Chris Johnson
2007-11-28 10:48:23 UTC
Permalink
It's easier to find work-arounds and translations from standard SQL to
MySQL syntax, and do the same thing to translate to another reasonable
database (Postgres, Oracle, MSSQL), than it is to reverse-translate
MySQL-specific (or other database-specific) features to other databases.

Standard SQL is sort of the greatest common factor (in the
mathematical sense). Even more significantly, hundreds of MySQL
people have spent significant effort coming up with standard SQL
equivalents in MySQL, so most of that work is already done for us.
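A toy illustration of the "write standard-ish SQL, translate per dialect" idea. Real dialect translation is far harder than this; the sketch only rewrites a trailing LIMIT clause into the TOP form that MS SQL of that era used, and every name here is hypothetical:

```python
# Rewrite "... LIMIT n" into "SELECT TOP n ..." for an MSSQL-style dialect.
import re

def to_mssql(sql):
    m = re.search(r"\s+LIMIT\s+(\d+)\s*$", sql, re.IGNORECASE)
    if m:
        sql = sql[:m.start()]  # strip the LIMIT clause
        sql = re.sub(r"^SELECT\b", f"SELECT TOP {m.group(1)}", sql,
                     count=1, flags=re.IGNORECASE)
    return sql

print(to_mssql("SELECT nid FROM node ORDER BY created DESC LIMIT 10"))
# SELECT TOP 10 nid FROM node ORDER BY created DESC
```

Regex rewriting like this breaks down quickly on nested queries, which is exactly why the thread keeps circling back to building queries from structured data instead of parsing strings.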

Maybe there is an open-source tool which can validate standard SQL
which could be used. Then our DB abstraction could require a
verifiable standard SQL syntax which would then be translated (where
necessary) into the MySQL, Oracle, Postgres, etc. equivalents. This
might only be for CRUD operations (data manipulation statements) and
not for DDL, for which we already have a reasonable API and which are
much less standardized.

I myself always use MySQL for my Drupal installs -- because it will
work. Thus, I never test them with Postgres, even though I run
Postgres on several systems, am very comfortable with using it, etc.
I just don't have the time to hassle with discovering some obscure SQL
bug resulting from database incompatibilities when I'm just trying
to make a site work.

If I were able to devote more time to just testing Drupal for the sake
of testing it and with no further purpose, I could and would use
Postgres more of the time. But even then -- if my goal were to test
the new menu code, I'd want to eliminate any other possible error
causing sources, such as using Postgres.

So unless we actually push the issue of supporting multiple databases
as a community, the vast majority of developers will never make the
effort. And with each passing release spent in that mode, the harder
it will become to reverse it. We will end up supporting only one
database, MySQL, de facto.
Khalid Baheyeldin
2007-11-29 02:03:16 UTC
Permalink
Post by Chris Johnson
If I were able to devote more time to just testing Drupal for the sake
of testing it and with no further purpose, I could and would use
Postgres more of the time. But even then -- if my goal were to test
the new menu code, I'd want to eliminate any other possible error
causing sources, such as using Postgres.
So unless we actually push the issue of supporting multiple databases
as a community, the vast majority of developers will never make the
effort. And with each passing release spent in that mode, the harder
it will become to reverse it. We will end up supporting only one
database, MySQL, de facto.
Very good post Chris.

This is exactly the problem: those who use Drupal on PostgreSQL must
be the ones who would resolve the SQL differences, test and debug patches,
...etc.

They are the ones that have a vested interest in it more than the rest
of us. All of us want a cross platform cross database CMS, but the testing
work must be done by those who use PostgreSQL.

The rest of us have no time or resources to dedicate for something we
will not directly use.
--
Khalid M. Baheyeldin
2bits.com
http://2bits.com
Drupal optimization, development, customization and consulting.
catch
2007-11-29 09:48:51 UTC
Permalink
Post by Khalid Baheyeldin
This is exactly the problem: those who use Drupal on PostgreSQL must
be the ones who would resolve the SQL differences, test and debug patches,
...etc.
They are the ones that have a vested interest in it more than the rest
of us. All of us want a cross platform cross database CMS, but the testing
work must be done by those who use PostgreSQL.
Which brings us back to the original topic of this thread. And the fact
there's a release critical issue[1] which has been waiting for 4 days and 16
hours now for a quick test on postgresql. I don't think I need to point out
the irony of this situation.

1. http://drupal.org/node/164532 Improve table indices for common queries.
Karoly Negyesi
2007-11-29 14:07:15 UTC
Permalink
Post by Khalid Baheyeldin
This is exactly the problem: those who use Drupal on PostgreSQL must
be the ones who would resolve the SQL differences, test and debug
patches,...etc.
All of us want a cross platform cross database CMS,
We MySQL developers are willing to validate our queries against an SQL validator, but the problems are usually not on this level. Some examples: text fields without defaults work on MySQL (they can't even have a default, if I remember correctly) but not on pgsql, so we need to supply a value. DDL has different syntax -- the Schema API solved a lot, but not all.

So let me reiterate my proposal: if a patch works on MySQL and has a good chance of working on postgresql -- because the queries validate, or the specific features exist on postgresql but there can be syntax differences, etc. -- then please commit it. This is not, again NOT, going down the "one database" route -- we will validate our queries, will still look into the postgresql manual as we did before, etc., to ensure things are OK -- just please let the postgresql folks test when they have the time / resources, without holding up the other however-significant percentage of core developers.
Bill Moran
2007-11-29 14:28:28 UTC
Permalink
Post by Karoly Negyesi
Post by Khalid Baheyeldin
This is exactly the problem: those who use Drupal on PostgreSQL must
be the ones who would resolve the SQL differences, test and debug
patches,...etc.
All of us want a cross platform cross database CMS,
While we MySQL developers are willing to validate our queries against an
SQL validator but the problems are usually not on this level. Some examples
are: text fields without defaults work on MySQL (even they can't have a
default if i remember correctly) but not on pgsql so we need to supply a
value. DDL have different syntax -- schema solved a lot but not all.
Before I start, let me reiterate for anyone who may have missed it earlier,
that I am a rabid PostgreSQL advocate.

That being said, I _agree_ with Karoly _in_theory_. PostgreSQL compat
should _not_ hold Drupal back. To let it hold Drupal back would be a
horrible idea. It would be utterly destructive to the project.

Unfortunately, I don't see any _actual_ evidence that the problem you
describe even exists, and I see a LOT of false information being
put forth in this thread. Let me enumerate a few items of concern:

*) The argument has been made that PostgreSQL compatibility is holding
Drupal back, then 1 (One!) patch is referred to again and again.
How important is this one patch? If this _1_ patch is holding
Drupal back, please commit it!
*) The argument was made that PostgreSQL is too complicated to install
and test on, yet I described the setup in a single email. If anyone
found my instructions too difficult to follow, _please_ let me know
and I will go to efforts to clarify.
*) Now you make a ridiculous claim as to the SQL syntax of PostgreSQL:
"text fields without defaults [don't work in pgsql]"

test=# create table testkaroly (testfield TEXT);
CREATE TABLE
test=# \d testkaroly
Table "public.testkaroly"
Column | Type | Modifiers
-----------+------+-----------
testfield | text |

test=# create table testkaroly2 (testfield TEXT DEFAULT 'test');
CREATE TABLE
test=# \d testkaroly2
Table "public.testkaroly2"
Column | Type | Modifiers
-----------+------+----------------------
testfield | text | default 'test'::text

As you can see clearly, TEXT fields work with and without defaults.

What completely unreliable source are you using for your information?

In any event, once I have some life issues sorted out I will be more
involved in Drupal development (hopefully come January). Until then,
I simply feel obligated to debunk such blatantly incorrect assertions
as this text/default thing.
--
Bill Moran
http://www.potentialtech.com
Chris Johnson
2007-11-30 14:40:24 UTC
Permalink
Post by Khalid Baheyeldin
Post by Chris Johnson
So unless we actually push the issue of supporting multiple databases
as a community, the vast majority of developers will never make the
effort. And with each passing release spent in that mode, the harder
it will become to reverse it. We will end up supporting only one
database, MySQL, de facto.
Very good post Chris.
This is exactly the problem: those who use Drupal on PostgreSQL must
be the ones who would resolve the SQL differences, test and debug patches,
...etc.
They are the ones that have a vested interest in it more than the rest
of us. All of us want a cross platform cross database CMS, but the testing
work must be done by those who use PostgreSQL.
This is exactly NOT what I meant. If "all of us want a cross platform
cross database CMS", then we /all/ have to bear some of the burden of
making it so -- not just those second class citizens who use the less
popular database. Otherwise, we're back to square one again.
Karoly Negyesi
2007-11-30 17:42:05 UTC
Permalink
Post by Chris Johnson
This is exactly NOT what I meant. If "all of us want a cross platform
cross database CMS", then we /all/ have to bear some of the burden of
making it so -- not just those second class citizens who use the less
popular database. Otherwise, we're back to square one again.
Yes, we take some of the burden: we do not use MySQLisms, and we make sure that the queries validate -- this can be part of the new policy, something positive that came out of this talk -- and we check the postgresql manual if we do something funny. It is just the actual test we are talking about. I do not think our viewpoints are too far apart :)
Bill Moran
2007-11-30 18:24:31 UTC
Permalink
Post by Karoly Negyesi
Post by Chris Johnson
This is exactly NOT what I meant. If "all of us want a cross platform
cross database CMS", then we /all/ have to bear some of the burden of
making it so -- not just those second class citizens who use the less
popular database. Otherwise, we're back to square one again.
Yes we take some of the burden: we do not use MYSQLisms, we make sure that the queries validate --this can be part of the new policy, something positive that came out of this talk -- we check the postgresql manual if we do something funny -- there is just the actual test we are talking of. I do not think our viewpoints are too far away :)
I do. I think you're trying to make it seem like people agree with you
when they really don't.
--
Bill Moran
http://www.potentialtech.com
Earnie Boyd
2007-11-17 22:30:48 UTC
Permalink
Post by Ivan Sergio Borgonovo
The path should be to make the DB abstraction layer more DB agnostic
and make developer more aware there is not just mysql.
Everything will look terrible if you design for mysql and then try to
port it to any other DB. This increase the inertia of moving to a
more agnostic DB layer and supporting anything else than mysql.
Which is what I suggested about 4 days ago and created
http://drupal.org/node/191486 to take this on. Writing to Drupal SQL
within the code base and letting the further abstraction handle the
differences. Drupal SQL would have a format similar to MySql so that
the commands should be able to be passed directly to the MySql layer
when the db_type is mysql but other db_type would parse the string
further to make changes as needed. This can also fix quoting
differences so I can quote the columns in MySql style with the back
quote character "`" in my Drupal SQL code and when the db_type is pgsql
I change it to the single quote "'" character (assuming that is what
pgsql uses to quote columns).
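A sketch of the quote-rewriting step Earnie describes, with one correction to his guess: PostgreSQL quotes identifiers with double quotes (`"col"`), not single quotes, which it reserves for string literals. Python stands in for the PHP this would really be:

```python
# Naive backtick-to-double-quote identifier rewrite for a pgsql target.
# Assumes no backticks occur inside string literals.
import re

def backticks_to_pg(sql):
    return re.sub(r"`([^`]*)`", r'"\1"', sql)

print(backticks_to_pg("SELECT `nid`, `title` FROM `node`"))
# SELECT "nid", "title" FROM "node"
```

As with any regex-based rewrite, this is fragile (a backtick inside a string literal would be mangled), which is one argument for the compiler-style builders discussed elsewhere in this thread.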

Earnie -- http://for-my-kids.com/
-- http://give-me-an-offer.com/
adrian rossouw
2007-11-17 17:39:00 UTC
Permalink
Post by Karoly Negyesi
into my words? I am nonplussed.
good for you for using that word correctly =)

english first language speakers have issues with that.
Barry Jaspan
2007-11-18 05:12:33 UTC
Permalink
Karoly,
Post by Karoly Negyesi
"In my opinion, any patch should be committable to core once it
works on mysql and has a decent hope (use common sense) to work on
postgresql."
How can people read "Let's drop postgresql!" into my words?
Because:

1. You posted on your blog that you want to drop support for pgsql:
"I want Drupal core to work with MySQL and that's it."
(http://www.drupal4hu.com/node/64)

2. In conversations on IRC and elsewhere, you have expressed great
frustration at being bothered with pgsql.

3. You've specifically asked me why I think it is important that the
mysql and pgsql databases have consistent schemas, expressing that you
think it is not important.

So, when you say "any patch should be committable to core once it
works on mysql," it sounds an awful lot like "because I don't think
support for other databases is important at all and I'd be happy to
see it go." You are correct that the particular message I replied to
did not call for de-supporting pgsql but I think it is reasonable to
assume your previous statements apply to your current statements,
unless you disclaim them.

Honestly, I have no real opinion about pgsql support per se. I've
never used it except for D6 development and testing. The decision to
support it in Drupal was made before I arrived in the project and I
don't know why. However, I believe the discipline of supporting more
than mysql is healthy and necessary for the project.

I do think Larry has a point, though: This situation will only get
harder and we cannot hold up all patches until they work on every
conceivable platform. If we have consensus on what platforms we
support and lack of hostility to re-opening closed issues due to
portability issues that come up later, and sometimes "mysql-only
developers" even having to rewrite code that works on mysql in order
to make it portable, then a process that includes commits before
patches are tested on all supported platforms is probably sensible.

HOWEVER, I still think the mandatory works-before-commit set of
platforms should include more than just mysql; including at least two
DBMS's makes it much more likely that the code will be portable for
additional DBMS's later *without* requiring rewriting.

One more thing. I use XAMPP on Windows for development. Here is my
process for using PostgreSQL 8.3:

1. Download and run the installer.
2. Uncomment the "extension=php_pgsql.dll" line in Apache's php.ini.
3. Re-start Apache.
4. Change $db_url in settings.php.

Not very hard. If a handful of core developers did this and just used
pgsql for all core development, we'd have a better product and
process.

Barry
Karoly Negyesi
2007-11-18 14:07:04 UTC
Permalink
Post by Barry Jaspan
assume your previous statements apply to your current statements,
unless you disclaim them.
Really? So if I write a mail which makes it crystal clear I changed my view then I should add "Hey! look! I changed my opinion!". Sure.
Post by Barry Jaspan
I do think Larry has a point, though: This situation will only get
harder and we cannot hold up all patches until they work on every
conceivable platform. If we have consensus on what platforms we
support and lack of hostility to re-opening closed issues due to
portability issues that come up later, and sometimes "mysql-only
developers" even having to rewrite code that works on mysql in order
to make it portable, then a process that includes commits before
patches are tested on all supported platforms is probably sensible.
Here we agree.
Post by Barry Jaspan
HOWEVER, I still think the mandatory works-before-commit set of
platforms should include more than just mysql; including at least two
DBMS's makes it much more likely that the code will be portable for
additional DBMS's later *without* requiring rewriting.
Here we don't.
Huh, nice. OK, I shall admit that I have already installed postgresql on Tank, but I am so unsure about it... could anyone help the process with a tutorial which makes sure the database accepts connections from localhost and only from there? Also, add a "drupal" user with a "drupal" password which can do... anything, whatever that means. This process is clear on MySQL, and after two years of occasionally working with pgsql, I could not figure out its grant system. I am daft, I know.
Bill Moran
2007-11-18 17:25:54 UTC
Permalink
Post by Karoly Negyesi
Huh, nice. OK, I shall admit that I already have installed
postgresql on Tank but I am so unsure about it... anyone could
help the process with a tutorial which makes sure the database
can accept connections from localhost and only there?
Once it's installed, there's another step called "initializing
the cluster" ... which is basically creating an empty database.
Some packaging systems might do this for you. If PostgreSQL
starts, then it's done (it won't implicitly create an empty
system).

PG is designed to allow multiple versions on a single machine.
I don't know how many people actually do this, but it's a godsend
when you need it. The point is that each "cluster" (as it's called)
is a directory with all the configuration and database files for
that running instance -- usually /usr/local/pgsql or something
like that. Do a "find / -name postgresql.conf" and you'll find
it.

That's the first file you want to look at. There are a lot of
esoteric tuning options, but the obvious one near the beginning
is called "listen_addresses", which defaults to "localhost" only.
(Although different package systems may install a custom config
file, so you should check.)
Post by Karoly Negyesi
Also, add
a "drupal" user with "drupal" password which can do... anything
whatever it means. This process is clear on MySQL and after two
years with occassionally working with pgsql, I could not figure
out its grant system. I am daft, I know.
I doubt you're daft, but I'm confused as to where the problem is.
PostgreSQL's grant system is simpler than MySQL's in my
experience.

The only thing that might be complex is that PG versions greater
than 8 have unified users and groups into a single concept called
a "role". If you just think of a "user" as a role with login
permissions, you'll be OK.

If you connect with the psql command, you can issue "\du" to get
a list of users. I'm guessing by default that there's only the
default superuser on your system. To create a new one:
CREATE ROLE drupal WITH LOGIN;
Or, to create it with superuser permissions:
CREATE ROLE drupal WITH LOGIN SUPERUSER;
You can also tweak roles after creation:
ALTER ROLE drupal WITH PASSWORD 'somepassword';
Then create a database:
CREATE DATABASE drupal;
And tweak settings as you like:
GRANT all ON DATABASE drupal TO drupal;

PG has a two-pronged approach to security. The role system which
I just described is actually the second layer. The first layer
is called "host-based authentication". These two layers combined
basically equate to MySQL's role system.

In the same directory as the postgresql.conf file, you'll find
a pg_hba.conf file. In this are lines that control what users
from what hosts can connect to what databases. For your
purposes, you probably want lines like:

host all all 127.0.0.1/32 trust
local all all trust

The first means that any user connecting via the loopback is
allowed to connect without a password. The second means that
any user can connect to any database via the local unix socket
without a password.

You can get more specific if you want (the field order is
type, database, user, address, method):
# Allow anyone from the local network to connect to database myaccount
# as user drupal, as long as they know the password
host myaccount drupal 192.168.5.0/24 password
# Allow anyone to connect to database drupal as any user from a
# single remote machine without a password
host drupal all 172.16.56.21/32 trust

psql is your friend. Some useful commands for seeing what your
database looks like:
# Show all roles (The "u" is legacy from when they were users)
\du
# Show all databases
\l
# Show the table structure of table "users"
\d users
# Show the permissions on all objects in the database
\dp
# Show the built-in help
\?

If you get stuck or have questions, ask.
--
Bill Moran
http://www.potentialtech.com
John Fiala
2007-11-18 23:14:25 UTC
Permalink
Here's a question - I'm interested in picking up a book on PostgreSQL
- something that will both teach me how to set it up properly and how
to configure it, as well as telling me how its particular dialect of
SQL works. I thought I'd start with O'Reilly, but they don't have any
books for it listed in their latest catalog. I looked around, and
there are only a few books I can find that are recent.

So, how new of a book do I need for PostgreSQL? And which one would be best?
--
John Fiala
Bill Moran
2007-11-18 23:48:23 UTC
Permalink
Post by John Fiala
Here's a question - I'm interested in picking up a book on postrgeSQL
- something that will both teach me how to set it up properly and how
to configure it, as well as telling me how it's particular dialect of
SQL works. I thought I'd start with O'Reilly, but they don't have any
books for it listed in their latest catalog. I looked around, and
there's only a few books I can find that are recent.
So, how new of a book do I need for postgreSQL? And which one would be best?
Personally, I'd make sure whatever you get is written at least
for 8.X. The differences (especially in tuning and management)
between 7.X and 8.X are pretty significant.

Beyond that, I can't offer a lot of help. Personally, I've been
able to use the online docs
(http://www.postgresql.org/docs/8.2/static/index.html) for all
my reference purposes. There is a page referencing books in
print on the web site: http://www.postgresql.org/docs/books/

Hope this helps.
--
Bill Moran
http://www.potentialtech.com
Larry Garfield
2007-11-29 18:58:52 UTC
Permalink
Post by Bill Moran
*) The argument has been made that PostgreSQL compatibility is holding
Drupal back, then 1 (One!) patch is referred to again and again.
How important is this one patch? If this _1_ patch is holding
Drupal back, please commit it!
I think these issues were also stalled on cross-database compatibility issues:

http://drupal.org/node/174025
http://drupal.org/node/83738
Post by Bill Moran
*) The argument was made that PostgreSQL is too complicated to install
and test on, yet I described the setup in a single email. If anyone
found my instructions too difficult to follow, _please_ let me know
and I will go to efforts to clarify.
Please make a handbook page out of them, and put it near (probably as a child of) the "setting up a dev environment" handbook page. We should probably have one for MySQL and one for Postgres (and if we add SQLite at a later date, one for that, too). Email is far too ephemeral to be a good reference. Thanks!

--Larry Garfield
Ivan Sergio Borgonovo
2007-11-29 20:17:32 UTC
Permalink
On Thu, 29 Nov 2007 12:58:52 -0600
Post by Larry Garfield
Please make a handbook page out of them, and put it near (probably
as a child of) the "setting up a dev environment" handbook page.
We should probably have one for MySQL and one for Postgres (and if
we add SQLite at a later date, one for that, too). Email is far
too ephemeral to be a good reference. Thanks!
"Christoph Otto (Volt)" <a-***@microsoft.com> from (guess)
Microsoft is trying to port D6 to MS SQL. He posted 2 weeks ago about
some issues with serial/sequences/identity and I think he can offer
an interesting point of view about DB abstraction layers and the
issues he encountered in porting D6 to Microsoft SQL.

I recently had to write my own classes to speed up paging in MS SQL
without giving up all the lovely stuff Drupal can offer, and it was
instructive.

I think the experience people trying to port D6 to MS SQL have
accumulated may be of interest to all.
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Daniel F. Kudwien
2007-11-29 20:31:06 UTC
Permalink
Guys, this discussion has been going on for days now. Could we either

- move it into d.o forums or
- move it to g.d.o DBA or whatever group or
- create a survey about what everyone thinks about MySQL/PgSQL or general
database performance, or
- probably best, work hard on those nagging patches instead of just chatting?

sun
Edward Z. Yang
2007-11-30 17:58:37 UTC
Permalink
Post by Bill Moran
Take, for example, that COUNT(*) is _really_ fast on MySQL + MyISAM,
but is slow on just about every other DB. So, replace it with
$db->count(), which does an SQL COUNT(*) on MySQL + MyISAM, but uses
some other technique on other databases. The programmer no longer
cares _how_ such a thing gets done, he only cares that he wants to
know how many rows are in a particular query
The primary problem with this approach (at least in my experience) is
that the moment you need to create queries that are even somewhat
complicated the abstraction layer proves to be insufficient. Ad hoc
queries are often very useful for performance reasons, but you'll have
to get a PostgreSQL guru to write an equivalent.

Then again, for most people writing SQL performance/complex joins/etc
are not a primary concern...

- --
Edward Z. Yang Portable GnuPG: 0x995A2C84
HTML Purifier <http://htmlpurifier.org> Anti-XSS Filter
[[ C8D5 9E3C 15AD 1467 5561 2C0E 719A 2D9D 995A 2C84 ]]
This Message Courtesy of Thunderbird Portable
Bill Moran
2007-11-30 18:40:32 UTC
Permalink
Post by Bill Moran
Take, for example, that COUNT(*) is _really_ fast on MySQL + MyISAM,
but is slow on just about every other DB. So, replace it with
$db->count(), which does an SQL COUNT(*) on MySQL + MyISAM, but uses
some other technique on other databases. The programmer no longer
cares _how_ such a thing gets done, he only cares that he wants to
know how many rows are in a particular query
The primary problem with this approach (at least in my experience) is
that the moment you need to create queries that are even somewhat
complicated the abstraction layer proves to be insufficient. Ad hoc
queries are often very useful for performance reasons, but you'll have
to get a PostgreSQL guru to write an equivalent.
Then again, for most people writing SQL performance/complex joins/etc
are not a primary concern...
Well, if such an abstraction solves 90% of the problems, the gurus will
have more time to deal with the remaining 10%.

Assuming it can be done well enough to actually solve 90%.
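
The COUNT(*) case discussed above can be sketched roughly as follows.
This is a hypothetical illustration, not Drupal's actual API: a helper
that emits a plain COUNT(*) where MyISAM makes it O(1), but falls back
to the planner's estimate in pg_class on PostgreSQL when an exact count
isn't required. (Python is used here purely for illustration; the real
layer would be PHP.)

```python
# Hypothetical sketch of a per-backend count helper; not Drupal's real API.
def count_query(table, backend):
    """Return SQL that reports the row count of `table` on `backend`."""
    if backend == "mysql":
        # MyISAM stores an exact row count in the table header, so a
        # bare COUNT(*) with no WHERE clause is effectively free.
        return "SELECT COUNT(*) FROM %s" % table
    elif backend == "pgsql":
        # An exact COUNT(*) forces a scan on PostgreSQL; for pager-style
        # uses, the planner's estimate in pg_class is cheap and close.
        return ("SELECT reltuples::bigint FROM pg_class "
                "WHERE relname = '%s'" % table)
    raise ValueError("unsupported backend: %s" % backend)
```

The point isn't these two queries specifically; it's that the caller
asks for "a count" and never sees which technique satisfied it.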
--
Bill Moran
http://www.potentialtech.com
Chris Johnson
2007-12-05 12:37:03 UTC
Permalink
Post by Bill Moran
Post by Edward Z. Yang
Then again, for most people writing SQL performance/complex joins/etc
are not a primary concern...
Well, if such an abstraction solves 90% of the problems, the gurus will
have more time to deal with the remaining 10%.
That was the intention of my idea -- not to replace all SQL, but to
replace all the mundane SQL so effort can be focused on the hard,
complex, performance-critical queries.

FGM
2007-12-04 15:32:54 UTC
Permalink
As I understand it, variant_date_from_timestamp is a Windows-only
function, so it probably wouldn't cut it, would it?

---- Original Message ----
From: ***@users.sourceforge.net
To: ***@drupal.org
Subject: Re: [development] Do not let postgresql hold back great
patches
Date: Tue, 04 Dec 2007 09:52:09 -0500

[...]
Post by Earnie Boyd
DB engine to do TRT with it? Then we can use PHP5 functions
variant_date_from_timestamp and variant_date_to_timestamp to convert
it.
[...]
Earnie Boyd
2007-12-04 17:01:23 UTC
Permalink
Post by FGM
As I understand it, variant_date_from_timestamp is a Windows-only
function, so it probably wouldn't cut it, would it?
Sorry, yes, I missed that when I was browsing the date functions. But
there are others that can be used that aren't OS specific.

Earnie -- http://for-my-kids.com/
-- http://give-me-an-offer.com/
Karen Stevenson
2007-12-04 16:01:03 UTC
Permalink
The question of how to handle dates isn't something we're going to
'solve' in a few emails back and forth on this list, so I figure we
should either finish this thread up or move it elsewhere.


The PHP5 function variant_date_from_timestamp seems to be designed to work only on Windows; I've not seen it used in any other situation. It just converts raw dates to and from timestamps, with the timezone adjustment unknown or ambiguous, which is part of the problem we already have when storing dates as timestamps.

This whole issue is quite complex. Even if we standardize on storing
all dates in the database using native date formats, the SQL to
manipulate those dates is still completely different from one database to
another. And we need better ways to associate timezones with the stored
dates, for dates that should be tied to specific timezones, so you know how to transform them when they're displayed.


For date SQL that works across databases, a simple regex find-and-replace
won't work, because the code needed to extract date values from datetime
columns can be completely different from one database to another (like
the completely different ways you extract a week number from a datetime
value in MySQL and Postgres). My solution, for now, is to create a
function that composes SQL date snippets from specified criteria. That
has been enough for me to get Views queries correctly constructed in
both MySQL and Postgres. It's not a perfect solution, but it's much
simpler than a complete abstraction layer.
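
The snippet-composing idea might look something like this (a hypothetical
sketch in Python for illustration, not the actual Date API code; only the
week-number case from the example above is shown):

```python
# Hypothetical sketch of composing per-backend SQL date snippets.
def date_part_sql(part, column, backend):
    """Return an SQL fragment extracting `part` from datetime `column`."""
    fragments = {
        # MySQL's WEEK() needs mode 3 to match ISO-8601 week numbering.
        ("week", "mysql"): "WEEK(%s, 3)",
        # PostgreSQL's EXTRACT(WEEK ...) is ISO-8601 already.
        ("week", "pgsql"): "EXTRACT(WEEK FROM %s)",
    }
    try:
        return fragments[(part, backend)] % column
    except KeyError:
        raise ValueError("no snippet for %s on %s" % (part, backend))
```

Callers build their query around the returned fragment, so the
per-database differences stay in one table instead of being scattered
through every module's SQL.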





And even if we start storing all core date fields in native datetime
format, we still need ways to handle dates in non-native formats if we
want to be able to do things like import iCal or RSS feeds, or
otherwise use date values that originate elsewhere.


Once I get the new Date API working (hopefully pretty soon), I'm hoping it can be used as a focus for further work and discussion about how to handle dates in core going forward.

If nothing else, maybe we can set up a time at the next DrupalCon for a meeting of anyone interested in this subject to hammer out ideas.

Karen


----- Original Message ----
From: Earnie Boyd <***@users.sourceforge.net>
To: ***@drupal.org
Sent: Tuesday, December 4, 2007 8:52:09 AM
Subject: Re: [development] Do not let postgresql hold back great patches

Isn't it preferred to use the ANSI SQL-92 TIMESTAMP column and allow the
DB engine to do TRT with it? Then we can use PHP5 functions
variant_date_from_timestamp and variant_date_to_timestamp to convert
it.