Discussion:
Excessive space allocations in PostgreSQL 9.1.6 system files causing the file system to run out of space.
Kevin Grittner
2013-02-27 15:55:01 UTC
> We have a Postgres database that was recently upgraded from 8.4.3
> to 9.1.6.  We have noticed unusual growth in the data files and
> generated a script to perform the following actions.
> 1. Query pg_class for all records
> 2. Generate a file listing of all Postgres data files
> 3. Compare the two lists and eliminate all files that are
>    contained within pg_class
> There are 17359 data files.  After running the script, there are
> 5802 data files remaining that are not listed in pg_class.  Due
> to the size of the (5802) data files (~4TB), I am not comfortable
> deleting them from the file system.  Does Postgres 9.1.6
> catalog every data file in pg_class?  Or does it leave some data
> files off of this table?  If so, how can I determine if I have
> stale, unnecessary data files on my file system?
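For reference, a minimal sketch of the comparison such a script performs
for a single database.  The data directory path and database name are
assumptions for this installation; pg_class is per-database, so this has
to be repeated for each database, and it does not yet account for the
suffixed file names discussed below.

    #!/bin/sh
    # Sketch only: list relation files on disk that no pg_class row points to.
    # PGDATA and DB are assumptions; adjust for your cluster.
    PGDATA=/opt/PostgreSQL/9.1/data
    DB=mydb

    # OID of the database, which names its subdirectory under base/.
    DBOID=$(psql -AtU postgres -c \
        "SELECT oid FROM pg_database WHERE datname = '$DB'")

    # Relation file names the catalogs know about.
    psql -AtU postgres -d "$DB" -c \
        "SELECT relfilenode FROM pg_class WHERE relfilenode <> 0" \
        | sort -u > /tmp/known

    # File names actually present for that database.  Non-relation files
    # such as PG_VERSION and pg_filenode.map will also show up here.
    ls "$PGDATA/base/$DBOID" | sort -u > /tmp/ondisk

    # Names on disk with no matching pg_class entry.
    comm -13 /tmp/known /tmp/ondisk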
Yeah, it's good to be cautious -- deleting a needed file can render
your database cluster unusable.  Be sure you have a good backup you
can go back to if you delete the wrong thing.

What directories are you looking in?

For a database or tablespace directory, are you excluding all files
whose names start with a filename derived from pg_class, followed by
a dot or underscore and more characters?
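
For instance, continuing the sketch above, the on-disk names could be
normalized before the comparison so that the extra segments and forks
collapse back to the bare relfilenode (GNU/BSD extended-regex sed
syntax assumed):

    # Relation files larger than 1GB are split into <relfilenode>.1, .2, ...
    # and each relation can also have _fsm and _vm (and, in 9.1, _init)
    # fork files.  Strip those suffixes so every variant collapses to the
    # bare relfilenode before comparing against pg_class.
    ls "$PGDATA/base/$DBOID" \
        | sed -E 's/(_fsm|_vm|_init)?(\.[0-9]+)?$//' \
        | sort -u > /tmp/ondisk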

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Kevin Grittner
2013-03-01 16:00:30 UTC
> We did use pg_upgrade with the hard link option.  We are not sure
> if we ran the cleanup script.
> Can we run this script now, even though it's months after we did
> the upgrade?
> Everything in the .../19177 directories represents data files
> migrated over from Postgres 8.4.3.  All new files get placed into
> the .../PG_9.1_201105231/16411 directories.
> The vast majority of the "orphan" files are from the
> /opt/PostgreSQL/9.1/data/user_data/19177 directory.
I don't have any reason to expect that you *can't* run the script
at this point; but being a cautious person, I would do this at a
point where I was confident I could recover from a backup, and I
would read through the scripts carefully before applying them.

What you want to be really careful that you *don't* do is to modify
or truncate any of the hard-linked files, as they are quite likely
to still be just another name for the same file that is in use for
production under the newer version.  You want to simply remove the
older directory entry pointing to the file.

http://www.linfo.org/hard_link.html
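
As an extra sanity check before removing anything, the link count
reported by stat shows whether a file under the old directory still has
a second name in the live 9.1 cluster (the path is taken from the
directory mentioned above; GNU stat assumed):

    # A link count of 2 or more means the new cluster still holds its own
    # directory entry for the same data, so removing the old name does not
    # touch the data itself.  A count of 1 means this is the only
    # remaining name for that file.
    stat -c '%h %n' /opt/PostgreSQL/9.1/data/user_data/19177/* | sort -n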
--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Kevin Grittner
2013-03-07 15:14:40 UTC
> Does anyone know the name/location of the pg_upgrade cleanup
> scripts?  We upgraded from 8.4.3 to 9.1.6.
Sorry I didn't answer this sooner, but I didn't know off-hand and
hoped that someone who did would jump in.  Since that didn't happen
...

The file is delete_old_cluster.sh on most platforms.  The suffix is
.bat on Windows.

It can be run anytime after the upgrade *as long as you have not
moved the new cluster to the old location*.

Thanks to Bruce Momjian for telling me in chat when I asked just now.
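
On a typical Linux install the generated script is little more than an
unconditional removal of the old cluster's data directory (plus the
old-version subdirectories of any tablespaces), roughly along these
lines; the path shown is only an assumption for this installation:

    #!/bin/sh
    # Roughly what pg_upgrade leaves behind as delete_old_cluster.sh.
    # The old data directory path below is an assumption; read the script
    # that was actually generated before running it.
    rm -rf '/opt/PostgreSQL/8.4/data'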

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Kevin Grittner
2013-03-07 19:56:00 UTC
> It looks like this script deletes all of the data file
> directories (and the data files) from the Postgres 8.4.3
> instance.  Since we used the "-k" option, which calls for hard
> links rather than copying files to the new cluster, wouldn't this
> be deleting the files?  If so, the DB would be corrupted beyond
> recovery if we run it.  Or am I missing something?
You need to read up on hard links a bit more.  Here's one source:

http://www.linfo.org/hard_link.html

In summary, a hard link allows multiple directory entries to point
to the same data location.  Until you delete the last directory
entry the file remains.
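
A quick way to see this behaviour in any scratch directory:

    # Two directory entries for the same data; the shared inode number
    # and the link count of 2 make that visible.
    echo 'some data' > original
    ln original second_name        # hard link, not a copy
    ls -li original second_name    # same inode, link count 2
    rm original                    # removes one name only
    cat second_name                # the data is still reachable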
--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company