m***@heroku.com
2013-02-15 01:49:48 UTC
The following bug has been logged on the website:
Bug reference: 7883
Logged by: Maciek Sakrejda
Email address: ***@heroku.com
PostgreSQL version: 9.1.8
Operating system: Ubuntu 12.04 64-bit
Description:
We ran into a customer database giving us the error above ("WAL contains
references to invalid pages") when replicating from 9.1.7 to 9.1.8 and
attempting to fail over to the 9.1.8 replica. I noticed several fixes to WAL
replay in 9.1.8; could they be a factor in this case? We're trying again
with a fresh replica; hopefully that will just work. Logs from the incident
are below.
Thanks,
Maciek
Feb 15 00:49:12 wal_e.worker.s3_worker INFO MSG: could not locate object
while performing wal restore#012 DETAIL: The absolute URI that could
not be located is
s3://wal-e-[redacted]/wal-e-backups/timeline-0e0b390f-cb3f-4192-8cdb-fced4d54b0a2/wal_005/000000010000003C00000094.lzo.#012
HINT: This can be normal when Postgres is trying to detect what
timelines are available during restoration.
Feb 15 00:49:12 [1300-1] [COPPER] LOG: invalid magic number 0000 in log
file 60, segment 148, offset 0
Feb 15 00:49:13 wal_e.worker.s3_worker INFO MSG: could not locate object
while performing wal restore#012 DETAIL: The absolute URI that could
not be located is
s3://wal-e-[redacted]/wal-e-backups/timeline-0e0b390f-cb3f-4192-8cdb-fced4d54b0a2/wal_005/000000010000003C00000094.lzo.#012
HINT: This can be normal when Postgres is trying to detect what
timelines are available during restoration.
Feb 15 00:49:13 [1301-1] [COPPER] LOG: redo done at 3C/930000B0
Feb 15 00:49:13 [1302-1] [COPPER] LOG: last completed transaction was at
log time 2013-02-14 22:35:05.338681+00
Feb 15 00:49:15 wal_e.worker.s3_worker INFO MSG: completed download and
decompression#012 DETAIL: Downloaded and decompressed
"s3://wal-e-[redacted]/wal-e-backups/timeline-0e0b390f-cb3f-4192-8cdb-fced4d54b0a2/wal_005/000000010000003C00000093.lzo"
to "pg_xlog/RECOVERYXLOG"
Feb 15 00:49:15 [1303-1] [COPPER] LOG: restored log file
"000000010000003C00000093" from archive
Feb 15 00:49:15 wal_e.worker.s3_worker INFO MSG: could not locate object
while performing wal restore#012 DETAIL: The absolute URI that could
not be located is
s3://wal-e-[redacted]/wal-e-backups/timeline-0e0b390f-cb3f-4192-8cdb-fced4d54b0a2/wal_005/00000002.history.lzo.#012
HINT: This can be normal when Postgres is trying to detect what
timelines are available during restoration.
Feb 15 00:49:16 [1304-1] [COPPER] LOG: selected new timeline ID: 2
Feb 15 00:49:16 wal_e.worker.s3_worker INFO MSG: could not locate object
while performing wal restore#012 DETAIL: The absolute URI that could
not be located is
s3://wal-e-[redacted]/wal-e-backups/timeline-0e0b390f-cb3f-4192-8cdb-fced4d54b0a2/wal_005/00000001.history.lzo.#012
HINT: This can be normal when Postgres is trying to detect what
timelines are available during restoration.
Feb 15 00:49:16 [1305-1] [COPPER] LOG: archive recovery complete
Feb 15 00:49:16 [1306-1] [COPPER] WARNING: page 37956 of relation
base/16385/16430 was uninitialized
Feb 15 00:49:16 [1307-1] [COPPER] PANIC: WAL contains references to
invalid pages
Feb 15 00:49:17 [3-1] [COPPER] LOG: startup process (PID 7) was terminated
by signal 6: Aborted
Feb 15 00:49:17 [4-1] [COPPER] LOG: terminating any other active server
processes
Feb 15 00:49:17 [37-1] collectd [COPPER] WARNING: terminating connection
because of crash of another server process
Feb 15 00:49:17 [37-2] collectd [COPPER] DETAIL: The postmaster has
commanded this server process to roll back the current transaction and exit,
because another server process exited abnormally and possibly corrupted
shared memory.
Feb 15 00:49:17 [37-3] collectd [COPPER] HINT: In a moment you should be
able to reconnect to the database and repeat your command.
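
For anyone cross-checking the logs above: the "log file 60, segment 148" in
the invalid-magic-number message and the LSN in "redo done at 3C/930000B0"
can both be mapped to archive segment names. Below is a minimal sketch,
assuming the pre-9.3 naming scheme (timeline, log id, and segment number,
each printed as 8 hex digits) and the default 16 MB segment size. The helper
names are hypothetical and are not part of PostgreSQL or WAL-E.

```python
# Assumed default WAL segment size; a server built with a different
# segment size would need a different value here.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_segment_name(timeline, log_id, segment):
    """Build a WAL segment file name from its three numeric parts."""
    return "%08X%08X%08X" % (timeline, log_id, segment)


def segment_for_lsn(timeline, lsn):
    """Return the segment file name containing an LSN such as '3C/930000B0'."""
    log_hex, offset_hex = lsn.split("/")
    log_id = int(log_hex, 16)
    offset = int(offset_hex, 16)
    # The segment number is the byte offset within the log file divided
    # by the segment size.
    return wal_segment_name(timeline, log_id, offset // WAL_SEGMENT_SIZE)


if __name__ == "__main__":
    # "log file 60, segment 148" on timeline 1
    print(wal_segment_name(1, 60, 148))
    # "redo done at 3C/930000B0" on timeline 1
    print(segment_for_lsn(1, "3C/930000B0"))
```

Under those assumptions, the first call gives the name of the segment that
WAL-E could not locate, and the second gives the segment that was restored
from the archive just before the new timeline was selected.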
--
Sent via pgsql-bugs mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs