Thursday, July 10, 2008

[GENERAL] A better error message for reject option in pg_hba.conf

(This is a very minor thing, but still...)

Suppose we have this line in pg_hba.conf:

local all all reject

Now:

$ psql template1 -U postgres

psql: FATAL: no pg_hba.conf entry for host "[local]", user "postgres",
database "template1", SSL off

Actually there *is* an entry for local, user postgres and database
template1 -- but it is a reject. Shouldn't we improve this message and
report the rejection explicitly?

...or am I overlooking something?
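
If the rejecting entry were reported as such, the message might look
something like this (wording is only a suggestion):

psql: FATAL: connection rejected by pg_hba.conf entry for host "[local]",
user "postgres", database "template1", SSL off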

Regards,
--
Devrim GÜNDÜZ
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr

http://www.gunduz.org

Re: [JDBC] What does this mean?

Paul Tomblin wrote:
> >> What does it mean when your process is stuck in the following:
> >
> > It's waiting for a query result.
>
> Yeah, but this is a very small query on a very small table. The query
> in question is
>
> "SELECT " + columns +
> "FROM venue " +
> "WHERE venueid = ?");
>
> where the table "venue" has 15 rows in it, and it's indexed
> on venueid.
>
> Hmmm. Looks like something isn't getting committed before
> the nightly vacuum.

I'd trace on the server to see what's going on.

What happens if you run the query in psql?

If you think there is a problem with VACUUM, try running
"VACUUM VERBOSE venue" in psql and see what it says.

Yours,
Laurenz Albe

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

Jan Urbański <j.urbanski@students.mimuw.edu.pl> writes:
> Come to think of it, the current code is in a way a variant of Lossy
> Counting, it's just doing the pruning after each and every new element,
> isn't it?

Interesting comment. In LC's terms we have w=1 therefore e=1 therefore
the maximum error is as bad as possible?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

Oleg Bartunov wrote:
> On Wed, 9 Jul 2008, Jan Urbański wrote:
>
>> Jan Urbański wrote:
>> Do you think it's worthwhile to implement the LC algorithm in C and
>> send it out, so others could try it out? Heck, maybe it's worthwhile
>> to replace the current compute_minimal_stats() algorithm with LC and
>> see how that compares?
>
> Teodor and I are using LC for phrase estimation in one application, and
> from our understanding of the original paper this algorithm might not be
> good for sampling, since all the theory behind it assumes streaming over
> the FULL data. As for the technique, we use a suffix tree, which should
> be fine for typical sample sizes.

Hm, that's a good point. I'm only reassured by the fact that it yields
roughly the same results as the original algorithm, which means that if
LC is bad for sampling, then the current implementation is just as bad.
Come to think of it, the current code is in a way a variant of Lossy
Counting, it's just doing the pruning after each and every new element,
isn't it?

Cheers,
Jan

--
Jan Urbanski
GPG key ID: E583D7D2

ouden estin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

Tom Lane wrote:
> Jan Urbański <j.urbanski@students.mimuw.edu.pl> writes:
>> Tom Lane wrote:
> Well, (1) the normal measure would be statistics_target *tsvectors*,
> and we'd have to translate that to lexemes somehow; my proposal is just
> to use a fixed constant instead of tsvector width as in your original
> patch. And (2) storing only statistics_target lexemes would be
> uselessly small and would guarantee that people *have to* set a custom
> target on tsvector columns to get useful results. Obviously broken
> defaults are not my bag.

Fair enough, I'm fine with a multiplication factor.

>> Also, the existing code decides which elements are worth storing as most
>> common ones by discarding those that are not frequent enough (that's
>> where num_mcv can get adjusted downwards). I mimicked that for lexemes
>> but maybe it just doesn't make sense?
>
> Well, that's not unreasonable either, if you can come up with a
> reasonable definition of "not frequent enough"; but that adds another
> variable to the discussion.

The current definition was "more occurrences than 0.001 of the total
row count, but no fewer than 2". It was copied right off
compute_minimal_stats(), and I have no problem with removing it. I think
its point is to guard against a situation where all elements are more or
less unique, and taking the top N would just give you some random noise.
It doesn't hurt, so I'd be for keeping the mechanism, but if people feel
differently, then I'll just drop it.
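
(As a worked example of that rule: in a sample of 30000 rows an element
would need more than 0.001 * 30000 = 30 occurrences, and never fewer
than 2, to make it into the most-common list.)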

--
Jan Urbanski
GPG key ID: E583D7D2

ouden estin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[SQL] record type

Or maybe someone knows how to work with record types? How do I insert something like ('1','2','3') into a table, or split it up into columns? Anything?
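
For concreteness, something along these lines is what I'm after (table
and column names made up):

CREATE TABLE t (a text, b text, c text);
INSERT INTO t VALUES ('1','2','3');
-- splitting a record value back into its columns:
SELECT (r).a, (r).b, (r).c FROM (SELECT ROW('1','2','3')::t AS r) AS s;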

regards
mk

[pgadmin-support] Couldn't connect to database under Vista

Hi, dear support!
I've installed PostgreSQL 8.3.3 on my PC under Windows Vista. There were some errors, such as "Failed to connect to database", when the installer was activating procedural languages.
So I can see the PostgreSQL process "postgres.exe", but I can't connect to the created database with pgAdmin or the other applications that work with that database.
UAC is disabled. DEP is disabled for "postgres.exe" and "pg_ctl.exe".
Please help me with this problem.

Nick Ilyushkin.

--
Sent via pgadmin-support mailing list (pgadmin-support@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgadmin-support

Re: [JDBC] Timestamp without Timezone and differing client / server tzs

Oliver Jowett wrote:
> Ken Johanson wrote:
>
>>
>> The servers and their schema (use-of ts w/o tz are not under my
>> control; adding with-timezone would likely break apps that transmit
>> date values with offset already normalized)
>
> This is the root of your problems, you're trying to use WITHOUT TIMEZONE
> to represent an instant in time which only makes sense when you're using
> the same timezone absolutely everywhere - and you're not. (Basically,
> you've got an implicit extra bit of data floating around - the server
> timezone - that's not correctly tied to your column data)

That's fine by me. UTC (in my case) is a well-defined and agreed-upon
TZ. It works perfectly without any conveyance of TZ, since everyone
agrees to use UTC (or any other stable TZ spec).

>
> But if you can't change the schema, you can certainly work around it in
> your application:
>

I think that depends on how many layers of middleware, if any, are opaque
vs. extensible. If I'm ostensibly only allowed to pass in a number-wrapped
object (java.util.Date or Timestamp), then I could do as you suggest, but
by offsetting the REAL / internal numeric value. Cringe. But I may have to
resort to this.

>> Just to pass-in a timestamp, and which ever database it is sent to, be
>> stored with its real atomic (integer) value (ie string conversion
>> normalized to what db uses).
>>
>> Simplified:
>>
>> Timestamp NOW = new Timestamp(System.currentTimeMillis());
>> for (int i=0; i<SERVERS.length; i++)
>> {
>> Connection con = ....//SERVERS[i] etc; each server exists in different
>> timezone, and datetime/timestamps cols do not store zone.
>> PreparedStatement ps = con.prepareStatement("INSERT INTO tbl
>> (lastModified) VALUES (?)");
> ps.setTimestamp(1, NOW);
>> ps.executeUpdate();
>> }
>
> Ok, so essentially you want to pass a timestamp-with-timezone value to
> the server, then store it as timestamp-without-timezone, using the
> server's timezone to do the cast, right?
>
> What is biting you here is the server-side inference of the datatype of
> your parameter. To support both WITH TIMEZONE and WITHOUT TIMEZONE via
> setTimestamp(), the driver has to pass the parameter with an unknown
> type and let the server infer the actual type.
>

Well-made point. But I'm fine with the driver implicitly treating ANY TS
object as convertible on a connection configured as such, even if it's
destined for a TS w/ TZ. In this case, because I know there is only
TZ-less storage.

> For example if a client sets a timestamp of "05:00 +1000" (either via an
> explicit Calendar, or because their default timezone is +1000) to insert
> into a WITHOUT TIMEZONE column, the only sensible result is to insert
> that as "05:00" regardless of the server's timezone setting. The client
> sees the value as 05:00, so the only sensible thing the driver can do is
> to insert it as 05:00.
>
> The driver takes advantage of the fact that literals interpreted as
> WITHOUT TIMEZONE completely ignore the timezone specification in this
> case (and when inserting into a WITHOUT TIMEZONE column, the type of an
> unknown-type parameter is inferred to be WITHOUT TIMEZONE).

Maybe that's part of (or all of) the answer: if the TZ is overridden on
the connection, then don't pass redundant TZ information in the protocol
-- just pass a TZ-less value normalized/offset to the server's/implicit
TZ. I'm pretty sure another driver does this too.

>
> If the driver passed that literal as a WITH TIMEZONE type, it'd first
> get converted to a seconds-since-epoch value (respecting the timezone
> specification) and then converted to WITHOUT TIMEZONE using the server's
> timezone. If the server's timezone is not +1000, you get something other
> than "05:00" inserted, which isn't what the client asked for.
>
> So the driver deliberately doesn't do that, and in fact jumps through
> some hoops to make sure it doesn't happen.
>
> However, "interpret as WITH TIMEZONE then cast" is exactly the behaviour
> you want in this case. You can get that behaviour via some explicit
> casting, something like this:
>
> CAST((CAST ? AS TIMESTAMP WITH TIMEZONE) AS TIMESTAMP WITHOUT TIMEZONE)
>
> (modulo syntax errors; not sure if the outermost CAST is needed, off the
> top of my head)
>
> -O
>

Honestly that's beyond my comprehension, though part of me thinks it may
be overcomplicating things -- couldn't the driver instead just add the
offset and not pass a zone in the string? In UTC's case anyway.
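
(If I read it right, written out in standard syntax that double cast
would be roughly this -- untested:

INSERT INTO tbl (lastModified)
VALUES (CAST(CAST(? AS timestamp with time zone)
             AS timestamp without time zone));
)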

Thank you for some good ideas!

Ken


--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: [JDBC] Timestamp without Timezone and differing client / server tzs

Pushker Chaubey wrote:
>
> Ken Johanson wrote:
>> The servers and their schema (use-of ts w/o tz are not under my
>> control; adding with-timezone would likely break apps that transmit
>> date values with offset already normalized).
> Just wondering, other apps using same database must also be facing the
> same problem as you are if they operate across various timezones.
> Since all the applications are sharing the same timestamp(without TZ)
> column they all should follow the same protocol to update and read
> values for this shared timestamp(without TZ) column.
>
> We had a similar situation where the schema (having a timestamp without
> timezone column ) was not under our control and we had clients across
> various timezone who accessed the database.
> If one client with timezone TZ1 updated the value, the other client with
> timezone TZ2 did not read the correct value.
> To get around that we agreed on a reference timezone (GMT) so that
> stored timestamp value would be as per GMT timezone.

I (and the other clients in their real timezones) am using a very
similar config/protocol. The server is set to UTC and everyone agrees to
convert the textual/ISO 8601 representation to its UTC value going out,
and to parse it as UTC coming back. Passing the TZ to SimpleDateFormat is
one means of accomplishing this. It's very easy conceptually. Well, only
for queries whose literal values are constructed the StringBuffer way, etc.

The kicker comes when using PreparedStatements or an overlying API
that relies on them. We have no control (as I understand it so far) over
how the PG driver does the conversion.

So I have to write a layer over some middleware that converts the values
before passing them down to the PS (which may not be possible, though).
The inelegant part is the server-specific config being stored not in the
URL but elsewhere... I could get creative and piggyback my own param in
the URL if I can access it.

So much to ponder. For now there's the political joy (not) of having to
inform the customer that, right now anyway, PG can't do what they're
doing with database and driver X.

Thanks for your thoughts Pushker,

Ken


--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: [BUGS] BUG #4294: XML support: name() xpath function not working

"Enrico Sirola" <enrico.sirola@gmail.com> writes:
> -- bad
> select xpath('name(/my:a/*[last()])', test, ARRAY[ARRAY['my',
> 'http://myns.com/ns']]) from test;

You could barely swat a fly with what I know about XPath ... but I'm
thinking that the culprit here is the crock at lines 3254-3288 of
xml.c, which explains itself thus:

/*
* To handle both documents and fragments, regardless of the fact whether
* the XML datum has a single root (XML well-formedness), we wrap the XML
* datum in a dummy element (<x>...</x>) and extend the XPath expression
* accordingly. To do it, throw away the XML prolog, if any.
*/

This code prefixes the given path expression with "/x", which of course
is going to break any expression that starts with a function name. It
would have to stick the /x inside the function argument to have any
chance of working. In general, I think this has zero chance of working
without implementing a pretty complete XPath parser. We need to find
another way.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

[HACKERS] Postgres 8.1 doesn't like pg_standby's -l option

I was trying to set up warm standby for an 8.1.11 instance, and was using pg_standby's -l option so that it creates links and does not actually copy files. After struggling for a few hours, I found two problems; one big, one small.

The smaller issue is that even if we do not end the restore_command's value with a ' (quote), the restore process still runs just fine!

The bigger issue: Postgres seems not to recognize links as valid files, even though pg_standby was successfully able to restore them! I did not dig into the issue any deeper, so I cannot say whether it was a misbehaving stat() or whether Postgres 8.1 simply did not know how to handle symlinks!
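
(For reference, the kind of recovery.conf line I was using looked roughly
like this -- the archive path is illustrative, and 8.1's restore_command
only knows the %f and %p substitutions:)

restore_command = 'pg_standby -l /mnt/server/archivedir %f %p'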

Just an FYI for all, and for the archives.

Best regards,
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB      http://www.enterprisedb.com

Mail sent from my BlackLaptop device

Re: [ADMIN] Importing data - possible UTF8 import bug?

"Mikel Lindsaar" <raasdnil@gmail.com> writes:
> But I am hitting a problem. When I try to import the result with COPY
> I am getting:
> ERROR: invalid byte sequence for encoding "UTF8": 0xa2
> And gives me a line reference.

> I then grabbed that line out of the file, unpacked each byte into a
> decimal representation and 162 does not appear anywhere.

Sure you're counting the lines right? I believe the number COPY
gives you is the N'th line of COPY data (ie, N lines after the
COPY command) --- not the N'th line of the whole file. Another
thing to check is whether embedded newlines &such are throwing
off your count.

regards, tom lane

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

[GENERAL] Top N within groups?

[was: [SOLVED?] Re: [GENERAL] functional index not used, looping simpler
query just faster]
Ivan Sergio Borgonovo wrote:
> I'm still curious to know if this could be done
> efficiently with just one query.
>
[thinking out loud]
Can someone familiar with the source for DISTINCT ON comment on how hard
it would be to add another parameter to return more than one row?

e.g.
To do TOP 1 within an unknown number of groups
select distinct on (groupid) groupid, identifier, count(*)
from somequery
group by groupid, identifier
order by 1,3 desc,2;

I'm thinking that, for the top 3 within each group, something like
select distinct on (groupid) FOR 3 groupid, identifier, count(*)
from somequery
group by groupid, identifier
order by 1,3 desc,2;

For Ivan's case, groupid = brand, identifier = item. The where clause
applies the date limits.
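
In the meantime, one way to get the top 3 within each group with today's
SQL is a counting subquery -- a sketch only, untested, reusing the
"somequery" placeholder from above:

select s1.groupid, s1.identifier, s1.cnt
from (select groupid, identifier, count(*) as cnt
      from somequery group by groupid, identifier) as s1
where (select count(*)
       from (select groupid, identifier, count(*) as cnt
             from somequery group by groupid, identifier) as s2
       where s2.groupid = s1.groupid
         and (s2.cnt > s1.cnt
              or (s2.cnt = s1.cnt and s2.identifier < s1.identifier))) < 3
order by s1.groupid, s1.cnt desc, s1.identifier;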

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [ADMIN] Importing data - possible UTF8 import bug?

> Anyone got any ideas how I can hunt down what is in this row, visually
> looking at it hasn't helped?

As a followup, I am getting this same sort of error at various points
on the file.

So I tried taking the line, cleaning it with Iconv, turning it into an
ASCII sequence, and inserting it back into the file.

I still get the same error with the same invalid character.... too weird.

I have a few lines that fail, all with different invalid character encodings:

0xe92042
0xe42050
0xa2
0xc220
0xdb20
0xe42050
0xe92042

It's at the rate of about one every 50,000 rows... so no real big
deal... but still...

--

http://lindsaar.net/
Rails, RSpec and Life blog....

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: [PERFORM] Altering a column type - Most efficient way

On Thu, 2008-07-10 at 09:57 -0400, Alvaro Herrera wrote:
> Ow Mun Heng wrote:
>
> > This is what I see on the table
> >
> > NEW attypmod = -1
> > OLD attypmod = 8
>
> 8 means varchar(4) which is what you said you had (4+4)
> -1 means unlimited size.
>

This is cool.

If it is this simple a change, I'm not certain why (I believe) PG checks
each and every row to see whether it will fit into the new column
definition/type.

Thus, I'm still a bit hesitant to make the change, although it is
definitely a very enticing thing to do. (I presume also that this
change would be instantaneous and would not need to check each and
every row of the table?)
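
(For context, I gather the "simple change" here would amount to flipping
the stored typmod directly -- shown only as an illustration, since editing
the catalogs by hand is unsupported and risky; names are illustrative:)

UPDATE pg_attribute
   SET atttypmod = -1        -- -1 = unlimited, per Alvaro's note above
 WHERE attrelid = 'my_table'::regclass
   AND attname  = 'my_column';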

Thanks.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[pgadmin-support] Types not exported/restored

I am trying to do a backup/transfer of a database, and possibly the whole
server, from one computer to another on a network using the pgAdmin III
included with PostgreSQL 8.2. The tables were exported, but nothing was
exported in Types. Hence, some columns of the tables are missing; I've
noticed that these are the tables having dependencies on the Types (which
are missing). On top of that, only the public schema was exported and
restored, not the other schemas. How can I completely back up the whole
database from one computer to another?

I am new to PostgreSQL and to databases as a whole and am just starting to
learn.
I have been dying for almost a week looking for the answer.
Help please... Thanks :)
These are screenshots to better explain things for a novice like me :)
The first one is a screenshot of the db to back up; the second one is the
incompletely restored db.
http://www.nabble.com/file/p18395521/db_source.jpg

http://www.nabble.com/file/p18395521/db_restored.jpg
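
Would dumping from the command line be the way to capture everything --
all schemas, types and other objects? Something like this (user name and
file names only illustrative):

pg_dumpall -U postgres > cluster.sql          # whole cluster, on the old machine
psql -U postgres -f cluster.sql postgres      # restore, on the new machine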

--
View this message in context: http://www.nabble.com/Types-not-exported-restored-tp18395521p18395521.html
Sent from the PostgreSQL - pgadmin support mailing list archive at Nabble.com.


--
Sent via pgadmin-support mailing list (pgadmin-support@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgadmin-support

[COMMITTERS] pgsql: Make sure we only try to free snapshots that have been passed

Log Message:
-----------
Make sure we only try to free snapshots that have been passed through
CopySnapshot, per Neil Conway. Also add a comment about the assumption in
GetSnapshotData that the argument is statically allocated.

Also, fix some more typos in comments in snapmgr.c.

Modified Files:
--------------
pgsql/src/backend/storage/ipc:
procarray.c (r1.44 -> r1.45)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/storage/ipc/procarray.c?r1=1.44&r2=1.45)
pgsql/src/backend/utils/time:
snapmgr.c (r1.3 -> r1.4)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/time/snapmgr.c?r1=1.3&r2=1.4)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

[ADMIN] Importing data - possible UTF8 import bug?

Hi all,

I am importing a bunch of data exported from an Oracle database.

This came out as a flat file, tab delimited.

I have parsed this through a ruby script that does the following:

For each line
  split it at the tab
  for each column
    parse it through Iconv and convert from UTF8 to UTF8
      on failure dump the line into a log for inspection and continue
    check to see if the column value ends in a single \
      if so, add another \ to escape it and not the tab that will follow
    check to see if the column value is blank
      if so, replace with \N
  join the array back together with tabs
  write it out to the import file.


This works quite well. The ruby script handles about 10,000 lines
every 15 seconds or so. Theoretically no invalid UTF-8 sequences
could end up in the output file, as everything is passed through Iconv,
and if it raises an exception the script dumps the offending line into a
separate file for manual handling.

But I am hitting a problem. When I try to import the result with COPY
I am getting:

ERROR: invalid byte sequence for encoding "UTF8": 0xa2

And gives me a line reference.

I then grabbed that line out of the file, unpacked each byte into a
decimal representation and 162 does not appear anywhere.

Additionally I did:

Iconv.iconv('UTF-8', 'UTF-8', line).join == line

Which takes the line and converts it from UTF8 to UTF8 using the Iconv
library (-f UTF-8 -t UTF-8), and this returns true.

Interestingly, the following also returns true:

Iconv.iconv('ASCII', 'UTF-8', line).join == line #=> true

So there can't be any non ASCII characters in there.

Unpacking the line and uniq'ing it and sorting it gives the following:

@result.unpack('U'*@result.length).uniq.sort
=> [9, 10, 32, 45, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 65, 66,
67, 68, 69, 70, 72, 73, 74, 76, 77, 78, 79, 80, 82, 83, 84, 85, 87,
91, 92, 93, 97, 98, 99, 101, 109, 110, 111, 114, 116, 117, 121]

Anyone got any ideas how I can hunt down what is in this row, visually
looking at it hasn't helped?

Mikel


--

http://lindsaar.net/
Rails, RSpec and Life blog....

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

[COMMITTERS] pgsql: Fix a few typos in comments in snapmgr.c, and sort header

Log Message:
-----------
Fix a few typos in comments in snapmgr.c, and sort header inclusions
alphabetically.

Modified Files:
--------------
pgsql/src/backend/utils/time:
snapmgr.c (r1.2 -> r1.3)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/time/snapmgr.c?r1=1.2&r2=1.3)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

[COMMITTERS] veil - veil: Updates for the serialisation interfaces for veil

Log Message:
-----------
Updates for the serialisation interfaces for veil

Modified Files:
--------------
veil/src:
veil_bitmap.c (r1.5 -> r1.6)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/veil/veil/src/veil_bitmap.c.diff?r1=1.5&r2=1.6)
veil_interface.c (r1.8 -> r1.9)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/veil/veil/src/veil_interface.c.diff?r1=1.8&r2=1.9)
veil_interface.sqs (r1.4 -> r1.5)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/veil/veil/src/veil_interface.sqs.diff?r1=1.4&r2=1.5)

Added Files:
-----------
veil/src:
veil_serialise.c (r1.1)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/veil/veil/src/veil_serialise.c?rev=1.1&content-type=text/x-cvsweb-markup)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

On Wed, 9 Jul 2008, Jan Urbański wrote:

> Jan Urbański wrote:


> Do you think it's worthwhile to implement the LC algorithm in C and send it
> out, so others could try it out? Heck, maybe it's worthwhile to replace the
> current compute_minimal_stats() algorithm with LC and see how that compares?

Teodor and I are using LC for phrase estimation in one application, and
from our understanding of the original paper this algorithm might not be
good for sampling, since all the theory behind it assumes streaming over
the FULL data. As for the technique, we use a suffix tree, which should
be fine for typical sample sizes.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [INTERFACES] libPQ -Canceling Blocking Calls

7/10/08 3:05 PM, also sprach tgl@sss.pgh.pa.us:

> Sure. Look at psql for an example. You probably need to call PQcancel
> from a signal handler, but there's a signal-safe variant for that.

Great. I was planning on using a threaded approach, but since libpq was
compiled thread-safe, I assume this would be fine. I was hoping I could
cancel a synchronous call without having to collapse the thread. This is
good news.

Thanks,

Keary Suska
Esoteritech, Inc.
"Demystifying technology for your home or business"

--
Sent via pgsql-interfaces mailing list (pgsql-interfaces@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-interfaces

[pgsql-fr-generale] Probleme Traffic réseau PostgreSQL

Hello,

I am forwarding a message from a subscriber whose messages are not
reaching the list... until we work out why they are not getting through
to the list...

Stéphane Schildknecht

###############

Hello,

I am running network tests comparing Oracle and PostgreSQL and I am
seeing large differences...

When I run a "select toto from table where titi='x'", the number of
bytes in the response is identical between Oracle and PostgreSQL
(sometimes better under PostgreSQL).

If I remove the "where" condition, PostgreSQL is much more talkative than
Oracle, both in the number of packets and in packet size. And it is worse
with a brute-force dump, "select * from table". Is this normal?

For example:

SELECT id_commande FROM COMMANDE (VARCHAR(28))
DBMS       | Time  | Bytes     | Packet size | Avg Mbit/sec | Packets
PostgreSQL | 0'00" | 1 896 702 | 1105 bytes  | 67.286       | 1716
Oracle     | 0'00" | 1 496 151 | 846 bytes   | 26.168       | 1768
Informix   | 0'00" | 1 858 862 | 999 bytes   | 22.724       | 1859

SELECT date_sign FROM COMMANDE (DATE)
DBMS       | Time  | Bytes   | Packet size | Avg Mbit/sec | Packets
PostgreSQL | 0'00" | 868 573 | 1117 bytes  | 18.999       | 777
Oracle     | 0'00" | 453 126 | 876 bytes   | 8.245        | 517
Informix   | 0'00" | 587 446 | 1018 bytes  | 4.882        | 577

SELECT id_commande FROM COMMANDE where ID_COMMANDE =
'200706202054510076503-276501' (VARCHAR(28))
DBMS       | Time  | Bytes | Packet size | Avg Mbit/sec | Packets
PostgreSQL | 0'00" | 567   | 81 bytes    | 0.024        | 7
Oracle     | 0'00" | 975   | 139 bytes   | 0.043        | 7
Informix   | 0'00" | 904   | 82 bytes    | 0.046        | 11

SELECT * FROM COMMANDE (46255 rows)

DBMS       | Time  | Bytes      | Packet size | Avg Mbit/sec | Packets
PostgreSQL | 0'13" | 16 801 515 | 1 008 bytes | 9.900        | 16 653
Oracle     | 0'08" | 06 889 074 | 694 bytes   | 6.752        | 09 920
Informix   | 0'13" | 11 481 870 | 967 bytes   | 7.054        | 11 867

Thank you for your reply or advice.

Antony Resbeut


--
Sent via pgsql-fr-generale mailing list (pgsql-fr-generale@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-fr-generale

Re: [GENERAL] storing latitude and longitude

Top posting because of a challenged reader ... try <http://postgis.refractions.net/> for packages and documentation.

Greg Williamson
Senior DBA
DigitalGlobe Inc


-----Original Message-----
From: pgsql-general-owner@postgresql.org on behalf of Brent Wood
Sent: Thu 7/10/2008 3:16 PM
To: markkicks@gmail.com; pgsql-general@postgresql.org
Subject: Re: [GENERAL] storing latitude and longitude

Hi Mark,

Look at Postgis, to do this properly. It adds full OGC spec support for managing spatial/querying spatial data within a Postgres database.

It is an option included with the Windows Postgres installer, but is generally extra packages under Linux.


Cheers,

   Brent Wood


>>> mark <markkicks@gmail.com> 07/11/08 5:34 AM >>>
hi..
i want to store latitude and longitude in a users table.. what is the
best data type to use for this? i want to be able to find use this
info to find users within a distance..
how do i go about doing this?
thanks

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

Jan Urbański <j.urbanski@students.mimuw.edu.pl> writes:
> Tom Lane wrote:
>> The way I think it ought to work is that the number of lexemes stored in
>> the final pg_statistic entry is statistics_target times a constant
>> (perhaps 100). I don't like having it vary depending on tsvector width

> I think the existing code puts at most statistics_target elements in a
> pg_statistic tuple. In compute_minimal_stats() num_mcv starts with
> stats->attr->attstattarget and is adjusted only downwards.
> My original thought was to keep that property for tsvectors (i.e. store
> at most statistics_target lexemes) and advise people to set it high for
> their tsvector columns (e.g. 100x their default).

Well, (1) the normal measure would be statistics_target *tsvectors*,
and we'd have to translate that to lexemes somehow; my proposal is just
to use a fixed constant instead of tsvector width as in your original
patch. And (2) storing only statistics_target lexemes would be
uselessly small and would guarantee that people *have to* set a custom
target on tsvector columns to get useful results. Obviously broken
defaults are not my bag.

> Also, the existing code decides which elements are worth storing as most
> common ones by discarding those that are not frequent enough (that's
> where num_mcv can get adjusted downwards). I mimicked that for lexemes
> but maybe it just doesn't make sense?

Well, that's not unreasonable either, if you can come up with a
reasonable definition of "not frequent enough"; but that adds another
variable to the discussion.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [GENERAL] storing latitude and longitude

Hi Mark,

Look at Postgis, to do this properly. It adds full OGC spec support for managing spatial/querying spatial data within a Postgres database.

It is an option included with the Windows Postgres installer, but is generally extra packages under Linux.
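
Very roughly, it looks like the sketch below -- the table/column names are
made up, and the exact function names vary a bit between PostGIS versions:

-- add a geometry column and populate it from existing lat/long columns:
SELECT AddGeometryColumn('users', 'location', 4326, 'POINT', 2);
UPDATE users
   SET location = GeomFromText('POINT(' || longitude || ' ' || latitude || ')', 4326);
-- users close to a point (about 0.1 degree here; use the spherical
-- distance functions for real distances in metres):
SELECT userid FROM users
 WHERE Distance(location, GeomFromText('POINT(-122.4 37.8)', 4326)) < 0.1;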


Cheers,

Brent Wood


>>> mark <markkicks@gmail.com> 07/11/08 5:34 AM >>>
hi..
i want to store latitude and longitude in a users table.. what is the
best data type to use for this? i want to be able to find use this
info to find users within a distance..
how do i go about doing this?
thanks

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [HACKERS] Adding variables for segment_size, wal_segment_size and block sizes

Abhijit Menon-Sen <ams@oryx.com> writes:
> At 2008-07-03 16:36:02 +0200, mailings@oopsware.de wrote:
>> Here's a patch for this.

> I reviewed the patch, it basically looks fine. A few quibbles with the
> provided documentation:

Applied, with ams' doc changes and some further wordsmithing.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[COMMITTERS] pgsql: Add unchangeable GUC "variables" segment_size, wal_block_size,

Log Message:
-----------
Add unchangeable GUC "variables" segment_size, wal_block_size, and
wal_segment_size to make those configuration parameters available to clients,
in the same way that block_size was previously exposed. Bernd Helmle, with
comments from Abhijit Menon-Sen and some further tweaking by me.

Modified Files:
--------------
pgsql/doc/src/sgml:
config.sgml (r1.182 -> r1.183)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/config.sgml?r1=1.182&r2=1.183)
pgsql/src/backend/utils/misc:
guc.c (r1.463 -> r1.464)
(http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/misc/guc.c?r1=1.463&r2=1.464)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

Tom Lane wrote:
> The way I think it ought to work is that the number of lexemes stored in
> the final pg_statistic entry is statistics_target times a constant
> (perhaps 100). I don't like having it vary depending on tsvector width

I think the existing code puts at most statistics_target elements in a
pg_statistic tuple. In compute_minimal_stats() num_mcv starts with
stats->attr->attstattarget and is adjusted only downwards.
My original thought was to keep that property for tsvectors (i.e. store
at most statistics_target lexemes) and advise people to set it high for
their tsvector columns (e.g. 100x their default).
Also, the existing code decides which elements are worth storing as most
common ones by discarding those that are not frequent enough (that's
where num_mcv can get adjusted downwards). I mimicked that for lexemes
but maybe it just doesn't make sense?
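
Referring back to the "set it high for tsvector columns" advice above: the
per-column knob in question is set with something like the statement below
(table and column names made up).

ALTER TABLE documents ALTER COLUMN body_tsv SET STATISTICS 1000;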

> But in any case, given a target number of lexemes to accumulate,
> I'd suggest pruning with that number as the bucket width (pruning
> distance). Or perhaps use some multiple of the target number, but
> the number itself seems about right.

Fine with me, I'm too tired to do the math now, so I'll take your word
for it :)

Cheers,
Jan

--
Jan Urbanski
GPG key ID: E583D7D2

ouden estin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Generating code coverage reports

It should be possible to make it work for a VPATH build with appropriate arguments to gcov and lcov, but currently it expects the object files and generated data files to be in the build directory.

You need access to the build tree to generate coverage statistics and to generate the report with "make coverage" after running the tests or application.

-- Michelle
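
Concretely, the sequence described above boils down to something like
this, run from the build tree:

./configure --enable-coverage
make && make check
make coverage        # writes the HTML report into coverage/ at the top level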

Tom Lane wrote:
> Michelle Caisse <Michelle.Caisse@Sun.COM> writes:
>> I have a patch that I will be submitting to add to the build system the
>> capability of reporting on test code coverage metrics for the test
>> suite.
>
> Cool.
>
>> To generate coverage statistics, you run configure with
>> --enable-coverage and after building and running tests, you do make
>> coverage. The process generates data files in the same directories as
>> source & object files and produces a coverage directory at the top level
>> with the html files.
>
> How does that work with a VPATH build? Are you trying to say that you
> still have to have the original build tree around in order to collect
> coverage data?
>
> 			regards, tom lane

--
Michelle Caisse
Sun Microsystems
California, U.S.
http://sun.com/postgresql

[HACKERS] [WIP] collation support revisited (phase 1)

Hi,

after long discussion with Mr. Kotala, we've decided to redesign our collation support proposal.
For those of you who aren't familiar with my WIP patch and comments from other hackers here's the original mail: http://archives.postgresql.org/pgsql-hackers/2008-07/msg00019.php

In a few sentences - I'm writing collation support for PostgreSQL that is almost independent of the collating function used. I will implement POSIX locales, but a switch to ICU will be quite easy. Collations and character sets defined by the SQL standard will be hard-coded, so we avoid them being missing in some implementations.

The whole project will be divided into two phases:

phase 1
Implement "sort of framework" so the PostgreSQL will have basic guts (pg_collation & pg_charset catalogs, CREATE COLLATION, add collation support for each type needed) and will support collation at database level. This phase has been accepted as a Google Summer of Code project.

phase 2
Implement the rest - full collation at the column level. I will continue working on this after finishing phase one, and it will be my master's thesis.

How will the first part work?

Catalogs
- new catalogs pg_collation and pg_charset will be defined
- pg_collation and pg_charset will contain the SQL standard collations + an optional default collation (when one other than the SQL standard one is set)
- pg_type, pg_attribute, pg_namespace will be extended with references to default records in pg_collation and pg_charset

initdb
- pg_collation & pg_charset will contain the pre-defined records required by the SQL standard and optionally one non-standard record set when running initdb (the one using the system locales)
- these records will be referenced by pg_type, pg_attribute and pg_namespace in the relevant columns and will be considered the default collation, which will be inherited

CREATE DATABASE ... COLLATE ...
- after copying the new database the collation will be the default (the same as the cluster collation) or changed by the COLLATE clause. Then we update the pg_type, pg_attribute and pg_namespace catalogs
- reindex database

When switching to a database, the database collation will be retrieved from the text type in pg_type. This part should be the only one that will be deleted when proceeding with phase 2. But that will take a while :-)
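
To illustrate the user-visible part of phase 1, the kind of statement it aims to support would be along these lines (exact syntax still to be settled; names illustrative):

CREATE DATABASE mydb COLLATE "cs_CZ";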

Thanks for all your comments

    Regards

        Radek Strnad

Re: [INTERFACES] libPQ -Canceling Blocking Calls

Keary Suska <hierophant@pcisys.net> writes:
> Is it possible to use PQcancel() to cancel a blocking call, such as a call
> to PQexec?

Sure. Look at psql for an example. You probably need to call PQcancel
from a signal handler, but there's a signal-safe variant for that.

regards, tom lane

--
Sent via pgsql-interfaces mailing list (pgsql-interfaces@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-interfaces

Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

Jan Urbański <j.urbanski@students.mimuw.edu.pl> writes:
> Still, there's a decision to be made: after how many lexemes should the
> pruning occur?

The way I think it ought to work is that the number of lexemes stored in
the final pg_statistic entry is statistics_target times a constant
(perhaps 100). I don't like having it vary depending on tsvector width
--- why for example should a column having a few wide tsvectors get a
bigger stats entry than one with many narrow ones? (Not to mention the
issue of having to estimate the average or max width before you can
start the counting run.)

But in any case, given a target number of lexemes to accumulate,
I'd suggest pruning with that number as the bucket width (pruning
distance). Or perhaps use some multiple of the target number, but
the number itself seems about right. The LC paper says that the
bucket width w is equal to ceil(1/e) where e is the maximum frequency
estimation error, and that the maximum number of table entries needed
is log(eN)/e after N lexemes have been scanned. For the values of e
and N we are going to be dealing with, this is likely to work out to
a few times 1/e, in other words the table size is a few times w.
(They prove it's at most 7w given reasonable assumptions about data
distribution, regardless of how big N gets; though I think our values
for N aren't large enough for that to matter.)

The existing compute_minimal_stats code uses a table size of twice the
target number of values, so setting w to maybe a half or a third of the
target number would reproduce the current space usage. I don't see a
problem with letting it get a little bigger though, especially since we
can expect that the lexemes aren't very long. (compute_minimal_stats
can't assume that for arbitrary data types...)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [GENERAL] apache permission denied

ALTER ROLE apache LOGIN

It now works! Thank you Devrim and Peter for your help.


Devrim GÜNDÜZ wrote:
> On Thu, 2008-07-10 at 11:49 -0700, Chris Cosner wrote:
>> DBI connect('dbname=db','',...) failed: FATAL: role "apache" is not
>> permitted to log in at /home/www/cgi-bin/db.lib line 1635
>
> What about:
>
> ALTER ROLE apache LOGIN;
>
> -HTH.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Jan Urbański wrote:
>> Oh, one important thing. You need to choose a bucket width for the LC
>> algorithm, that is decide after how many elements will you prune your
>> data structure. I chose to prune after every twenty tsvectors.

> Do you prune after X tsvectors regardless of the numbers of lexemes in
> them? I don't think that preserves the algorithm properties; if there's
> a bunch of very short tsvectors and then long tsvectors, the pruning
> would take place too early for the initial lexemes. I think you should
> count lexemes, not tsvectors.

Yeah. I haven't read the Lossy Counting paper in detail yet, but I
suspect that the mathematical proof of limited error doesn't work if the
pruning is done on a variable spacing. I don't see anything very wrong
with pruning intra-tsvector; the effects ought to average out, since the
point where you prune is going to move around with respect to the
tsvector boundaries.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

Alvaro Herrera wrote:
> Jan Urbański wrote:
>
>> Oh, one important thing. You need to choose a bucket width for the LC
>> algorithm, that is decide after how many elements will you prune your
>> data structure. I chose to prune after every twenty tsvectors.
>
> Do you prune after X tsvectors regardless of the numbers of lexemes in
> them? I don't think that preserves the algorithm properties; if there's
> a bunch of very short tsvectors and then long tsvectors, the pruning
> would take place too early for the initial lexemes. I think you should
> count lexemes, not tsvectors.

Yes, that's what I was afraid of. I'm not sure why I was reluctant to
prune in the middle of a tsvector, maybe it's just in my head.
Still, there's a decision to be made: after how many lexemes should the
pruning occur?

--
Jan Urbanski
GPG key ID: E583D7D2

ouden estin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

Jan Urbański wrote:

> Oh, one important thing. You need to choose a bucket width for the LC
> algorithm, that is decide after how many elements will you prune your
> data structure. I chose to prune after every twenty tsvectors.

Do you prune after X tsvectors regardless of the numbers of lexemes in
them? I don't think that preserves the algorithm properties; if there's
a bunch of very short tsvectors and then long tsvectors, the pruning
would take place too early for the initial lexemes. I think you should
count lexemes, not tsvectors.


--
Alvaro Herrera

http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Generating code coverage reports

Michelle Caisse <Michelle.Caisse@Sun.COM> writes:
> I have a patch that I will be submitting to add to the build system the
> capability of reporting on test code coverage metrics for the test
> suite.

Cool.

> To generate coverage statistics, you run configure with
> --enable-coverage and after building and running tests, you do make
> coverage. The process generates data files in the same directories as
> source & object files and produces a coverage directory at the top level
> with the html files.

How does that work with a VPATH build? Are you trying to say that you
still have to have the original build tree around in order to collect
coverage data?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] gsoc, text search selectivity and dllist enhancments

Tom Lane wrote:
> Jan Urbański <j.urbanski@students.mimuw.edu.pl> writes:
>> Do you think it's worthwhile to implement the LC algorithm in C and send
>> it out, so others could try it out? Heck, maybe it's worthwhile to
>> replace the current compute_minimal_stats() algorithm with LC and see
>> how that compares?
>
> Very possibly. I repeat that the current implementation of
> compute_minimal_stats is very ad-hoc code and wasn't written with an eye
> to high performance. Replacing it with an algorithm that someone
> actually thought about might well be worth doing.

Here's a patch that combines both patches included here:
http://archives.postgresql.org/message-id/48649261.5040703@students.mimuw.edu.pl
and adds a C implementation of the Lossy Counting algorithm.

It defines two typanalyze functions: ts_typanalyze_std and
ts_typanalyze_lc, so you can see what statistics are gathered by each of
them. It's meant for easy application to HEAD, updating pg_type and running
ANALYZE on a few tables with tsvectors (i.e. testing, not committing).

My observations are: the LC algorithm beats the previous one by a fairly
large margin (20-30%) timewise. The results are almost identical; I got
discrepancies of about 0.05 for some lexemes' frequencies. I intend to
stick with LC for tsvectors, and that'll allow me to throw away the Dllist
changes.

If I want to keep my GSoC schedule I won't be able to implement LC for
general statistics gathering, but it's trivial. If no one gets around to it
I can do it after the Summer of Code (if only to see how it'll work).

Oh, one important thing. You need to choose a bucket width for the LC
algorithm, that is decide after how many elements will you prune your
data structure. I chose to prune after every twenty tsvectors. You might
consider:
- picking some other arbitrary value
- making it depend on the largest tsvector size
- making it depend on the statistics_target
- pruning after each X lexemes instead of after each Y tsvectors,
because now the buckets will vary in width and you can argue that the
order of input makes a difference again. OTOH the situation here is a
bit different: you get streams of mutually different elements (lexemes
inside a tsvector are all different) and pruning in the middle of such
stream might be unfair for lexemes that are still to be processed. Hmm,
dunno.

Cheers,
Jan

--
Jan Urbanski
GPG key ID: E583D7D2

ouden estin

Re: [HACKERS] [PATCHES] WIP: executor_hook for pg_stat_statements

ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't want the tag there at all, much less converted to a pointer.
>> What would the semantics be of copying the node, and why?
>>
>> Please justify why you must have this and can't do what you want some
>> other way.

> In my pg_stat_statements plugin, the tag is used to cache hash values of
> SQL strings in PlannedStmt. It is not necessarily needed because the hash
> value is re-computable from debug_query_string. It is just for avoiding
> the work. In addition, we see different SQLs in debug_query_string in
> PREPARE/EXECUTE and DECLARE/FETCH. Hashed SQL cache can work on those
> commands.

Actually, that aspect of the plugin is 100% broken anyway, because it
assumes that debug_query_string has got something to do with the query
being executed. There are any number of scenarios where this is a bad
assumption.

I wonder whether we ought to change things so that the real query
source text is available at the executor level. Since we are (at least
usually) storing the query text in cached plans, I think this might just
require some API refactoring, not extra space and copying. It would
amount to a permanent decision that we're willing to pay the overhead
of keeping the source text around, though.

Also, after looking at the patch more closely, was there a good reason
for making the hook intercept ExecutePlan rather than ExecutorRun?
ExecutePlan was never intended to have a stable public API --- its
argument list is just a happenstance of what ExecutorRun needs to
fetch for its own purposes. I think we should keep it private and
have ExecutorRun do

    if (hook)
        hook(...);
    else
        standard_ExecutorRun(...);

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[INTERFACES] libPQ -Canceling Blocking Calls

Is it possible to use PQcancel() to cancel a blocking call, such as a call
to PQexec?

TIA,

Keary Suska
Esoteritech, Inc.
"Demystifying technology for your home or business"

--
Sent via pgsql-interfaces mailing list (pgsql-interfaces@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-interfaces

Re: [GENERAL] Regex problem

On Thu, Jul 10, 2008 at 1:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Scott Marlowe" <scott.marlowe@gmail.com> writes:
>> ...Which is not surprising. It's greedy. So, I turn off the greediness
>> of the first + with a ? and then I get this
>
>> select substring (notes from E'LONG DB QUERY.+?time: [0-9]+.[0-9]+')
>> from table where id=1;
>
>> LONG DB QUERY (db1, 4.9376289844513): UPDATE force_session SET
>> last_used_timestamp = 'now'::timestamp WHERE orgid = 15723 AND
>> session_id = 'f5ca5ec95965e8ac99ec9bc31eca84c6New session created
>> time: 5.0
>
>> Now, I'm pretty sure that with the [0-9]+.[0-9]+ I should be getting
>> 5.03999090194 at the end.
>
> You're getting bit by the fact that the initial non-greedy quantifier
> makes the entire regex non-greedy --- see rules in section 9.7.3.5:
> http://developer.postgresql.org/pgdocs/postgres/functions-matching.html#POSIX-MATCHING-RULES
>
> If you know that there will always be something after the first time
> value, you could do something like
>
> E'(LONG DB QUERY.+?time: [0-9]+\\.[0-9]+)[^0-9]'
>
> to force the issue about how much the second and third quantifiers
> match.

Thanks Tom, that's the exact answer I needed. Now, it's back to the
bit mines...
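
For the archives, plugging Tom's pattern back into the original query gives
something like the following (untested); with the capturing parens,
substring() returns just the parenthesized part:

select substring (notes from E'(LONG DB QUERY.+?time: [0-9]+\\.[0-9]+)[^0-9]')
from table where id=1;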

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general