科学网—标签 - postgresql

相关帖子	版块	作者	回复/查看	最后发表

hillpig 2016-10-11 15:23

上一篇使用PostgreSQL中的row_to_json()直接获得JSON（PG JSON系列1）主要针对现有的非json/jsonb类型如何输出json结果，这一篇主要写一下在数据库设计中直接使用json/jsonb数据类型（而不是json/jsonb处理函数）带来的问题。我们的问题是：设定一个典型的JSON/JSONB使用场景，以及演示如何做增删改查首先来看JSON/JSONB的区别，参考 https://www.postgresql.org/docs/9.6/static/datatype-json.html ： There are two JSON data types: json and jsonb. They accept almost identical sets of values as input. The major practical difference is one of efficiency. The json data type stores an exact copy of the input text, which processing functions must reparse on each execution; whilejsonb data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed. jsonb also supports indexing, which can be a significant advantage. Because the json type stores an exact copy of the input text, it will preserve semantically-insignificant white space between tokens, as well as the order of keys within JSON objects. Also, if a JSON object within the value contains the same key more than once, all the key/value pairs are kept. (The processing functions consider the last value as the operative one.) By contrast, jsonb does not preserve white space, does not preserve the order of object keys, and does not keep duplicate object keys. If duplicate keys are specified in the input, only the last value is kept. In general, most applications should prefer to store JSON data as jsonb, unless there are quite specialized needs, such as legacy assumptions about ordering of object keys. 最后一句话很明确，基本上使用的时候，都用JSONB。原因可以这么理解 json 存储是纯粹字符串，jsonb的存储和访问做了解析和优化，等后面我有时间专门写一篇jsonb和json是如何在内核里实现的，以及他们到底有啥区别，此处不做纠结。首先用JSONB来存储什么呢，我感觉和多维数组相关的比较合适，用ARRAY类型吧，输出的时候还得做个json转换，麻烦。在数据库设计的时候，有时候为了一小件事情设置个外键关联表有点不值得，感觉有点大材小用，例如下面的演出活动的票价套餐信息(父母帮应该给我广告费，顺便也提一下我们的“两小童”)：那设计数据库的时候，就有两种方案，一种是常规的做一个关联表（TActivityPricePackage），如下图：另外一种就是用jsonb来表示票价套餐（pricepackage），这种方法感觉到设计又简洁，还避免获取的时候做JOIN（乖乖，还能减少JAVA程序员工作，是不是也搞好了团队协作了），如下图：为了简洁的数据库设计的理想而奋斗，下面我们来做个实验：运行环境：PG 9.6，WIN8.1 64，Tomcat 8.0，JDK 8 数据库表： -- Table: TActivity -- 演出活动表 DROP TABLE IF EXISTS TActivity CASCADE; CREATE TABLE TActivity ( id bigint DEFAULT nextval('jsontest_uuid_seq') PRIMARY KEY,-- 活动id title character varying(128) NOT NULL,-- 活动名称 pricepackage jsonb NOT NULL -- 价格套餐,格式如： )WITH ( OIDS=FALSE ); 测试数据： -- Table: TActivity -- 演出活动表 insert into TActivity values(1,'演出活动标题1',' '); 好，我分别演示如何做增删改查。 1.先说增第一次Insert的时候，除了上面测试数据里面直接字符串的方式： insert into TActivity values(1,'演出活动标题1',' '); 以外，根据（ https://www.postgresql.org/docs/9.6/static/functions-json.html ）还可以有如下方法： json_build_object json_object json_build_array json_build_object和 json_object生成一个名值对的json对象比较方便，而 json_build_array只能生成比较简单的json array，正如官方文档例子里所展示的： json_build_array(1,2,'3',4,5) 综上我们发现，在第一次insert含有数组的时候，直接用字符串吧。那如果想给jsonb数组增加一条数据怎么办？我们可以使用连接操作符||： update tactivity set pricepackage = pricepackage || '{packagename:成人票新增3,price:189,stock:100}' where id = 1 或者新值在前面： update tactivity set pricepackage = '{packagename:成人票新增3,price:189,stock:100}' || pricepackage where id = 1 也可以使用函数 jsonb_insert，如在数组的最后增加一个： update tactivity set pricepackage = jsonb_insert(pricepackage, '{-1}', '{packagename:成人票新增100,price:189,stock:100}', true) where id = 1 其中-1表示从数组后面数第一个，即倒数第一个，true表示在该元素的后面插入。或在数组的头部插入：update tactivity set pricepackage = jsonb_insert(pricepackage, '{0}', '{packagename:成人票新增00,price:189,stock:100}', false) where id = 1 综上我们发现，给jsonb数组最后面或者最前面增加一条数据时，还是使用操作符 || 使用起来更加自然，不易出错。而如果想在jsonb数组的中间增加一条记录，则使用jsonb_insert()。 2.改如果我们想修改jsonb中的一个数据怎么办？假设我们现在pricepackage中jsonb数组已有两条数据： update tactivity set pricepackage = ' ' where id = 1 我们先看第一种情况，如何修改JSONB数组中的第1条完整的记录呢？即我想把：{price:189,packagename: 成人票0 ,stock:100} 替换成：{price:189,packagename: 成人票00 ,stock:100} 如何写呢？正常写法应该是： update tactivity set pricepackage = jsonb_set(pricepackage,'{0}','{price:189,packagename:成人票00,stock:100}', false ) where id = 1 但是我发现9.6.0关于jsonb_set()修改数组顶级元素的时候有个bug，参考我的另外一篇博文： 9.6.0版本里jsonb_set()有个bug，截至到2016.10.12 这里的false得替换成true才行，即： update tactivity set pricepackage = jsonb_set(pricepackage,'{0}','{price:189,packagename:成人票00,stock:100}', true ) where id = 1 而在9.5.4里却是正常的，即直接使用 false 即可。我们再看第二种情况，如何修改JSONB数组中的第1条记录中的某个值呢？如，我想把：{price:189,packagename: 成人票0 ,stock:100} 替换成：{price:189,packagename: 成人票000 ,stock:100} 如何写呢？正常写法应该是： update tactivity set pricepackage = jsonb_set(pricepackage,'{0,packagename}',' 成人票000 ',false) where id = 1 这回PG 9.6.0跑对了。我们再看第三种情况，如果我不知道是数组中的哪一条记录呢？即我想通过一个查找定位到该记录呢，其实这个问题是json的查找定位问题，我们放在JSON的查询那一部分讲。 3.删官方文档里关于JSONB的删除只有操作符： - text Delete key/value pair or string element from left operand. Key/value pairs are matched based on their key value. '{a: b}'::jsonb - 'a' - integer Delete the array element with specified index (Negative integers count from the end). Throws an error if top level container is not an array. ' '::jsonb - 1 #- text '::jsonb #- '{1,b}' 这里针对第2，3种情况做个演示： update tactivity set pricepackage = pricepackage - 0 where id = 1 以及： update tactivity set pricepackage = pricepackage #- '{0,stock}' where id = 1 4.查 JSON查找访问方面的操作，我们从字符串匹配/数值比较/ 这两类比较典型的来演示。假设数据库表TActivity有如下记录： delete from tactivity; insert into TActivity values(1,'演出活动标题1',' '); insert into TActivity values(2,'演出活动标题2',' '); 关于字符串匹配，我们演示的问题是：找出符合条件“ packagename = 儿童票（3-5岁） ”的那些活动？那SQL语句为: select * from TActivity a where a.pricepackage @ ' '::jsonb; 这里的 @ ' '::jsonb，注意必须是' , http://francs3.blog.163.com/blog/static/4057672720157153719971/, francs,PostgreSQL9.5：JSONB 数据类型: 支持元素修改，新增，删除码字码了半天，手指头都麻了，如果觉得对您还有用的话，赞助一下辛苦费吧：

个人分类: postgresql|16562 次阅读|0 个评论

使用PostgreSQL中的row_to_json()直接获得JSON（PG JSON系列1）

hillpig 2016-10-10 17:37

本系列博文讲述使用PG JSON/JSONB类型考虑的问题，这是第一篇（我尽量写的简单，让初学者跟着做很容易也能重复实验），看看不使用JSON/JSONB，仅仅使用row_to_json()函数能帮我们做什么。 JSON数据类型在当前互联网浪潮中得到了大量的采用，见下图GOOGLE指数对比JSON和XML，从全球范围来看：蓝色表示JSON的趋势，成明显的上升趋势。从中国范围来看， JSON 搜索热度明显高于XML： PG也从9.2开始正式引入JSON类型，那自然就问：既然DB支持了JSON和JSON相关函数，是不是中间服务层（如PHP,JAVA）通过JDBC可以直接获取到JSON数据，而无需再用org.json和json-lib库把以前的行数据进行转换？我们来做个实验：运行环境：PG 9.6，WIN8.1 64，Tomcat 8.0，JDK 8 -- 创建数据库 --postgres登陆pg，使用如下sql创建数据库和用户名密码： CREATE ROLE jsontest PASSWORD 'jsontest.dba' CREATEDB NOSUPERUSER CREATEROLE LOGIN; CREATE DATABASE jsontestdb OWNER jsontest ENCODING 'UTF8'; -- 用jsontest登录，然后是DDL： -- 创建一个全局的序列 DROP SEQUENCE IF EXISTS jsontest_uuid_seq CASCADE; CREATE SEQUENCE jsontest_uuid_seq START 1; -- Table: TUsers, 用户表 DROP TABLE IF EXISTS TUsers CASCADE; CREATE TABLE TUsers ( id bigint DEFAULT nextval('jsontest_uuid_seq') PRIMARY KEY,-- 用户id realName character varying(64) -- 真实姓名 )WITH ( OIDS=FALSE ); CREATE INDEX TUsers_cellphone_idx ON TUsers (realName); -- Table: TProject,用户创建的项目表 DROP TABLE IF EXISTS TProject CASCADE; CREATE TABLE TProject ( id bigint DEFAULT nextval('jsontest_uuid_seq') PRIMARY KEY,-- 活动id title character varying(256) NOT NULL UNIQUE, -- 项目名称,设置为UNIQUE，避免混淆 creatorId integer DEFAULT NULL REFERENCES TUsers (id) match simple on delete SET NULL -- 活动创建的用户id )WITH ( OIDS=FALSE ); CREATE INDEX TProject_creator_idx ON TProject( creatorId ); -- 测试数据： insert into TUsers values(1,'test1'); insert into TUsers values(2,'test2'); insert into TUsers values(3,'test3'); insert into TProject values(1,'测试项目1',1); insert into TProject values(2,'测试项目2',1); insert into TProject values(3,'测试项目3',1); DB设计图（在Eclipse用ERMaster反向生成）如下：好，JAVA这一块，我们分多种情况来分析： 1. 数据库最多只会返回一行数据，然后转换成JSON对象：如通过用户id获得用户JSON对象 public class User { public User() { super(); } /* * 根据用户id获取用户信息 */ public String getUserInfoFromId(String userId){ String res = null ; if(null == userId || userId.isEmpty() ) return res; Connection conn; Statement stm; ResultSet rs; DBPool dbp = new DBPool(); conn = dbp.getConnection(); try { stm = conn.createStatement(); String sql = SELECT row_to_json(a.*) from TUsers a where a.id = +userId; rs = stm.executeQuery(sql); if(rs != null rs.next()){ res = rs.getString(1); } } catch (SQLException e) { e.printStackTrace(); } finally{ dbp.closeConnection(); } return res; } } 上面的核心是SQL：SELECT row_to_json(a.*) from TUsers a where a.id = 1 那如果我获得的不是a.*，即不需要一整行数据（如密码不想返回）怎么办？如下会报错： SELECT row_to_json(a.id,a.realName) from TUsers a where a.id = 1 解决办法有： SELECT row_to_json(a.*) from (select b.id,b.realName from TUsers b where b.id = 1 ) a 或者使用WITH： WITH myInfo AS (select a.id,a.realName from TUsers a where a.id = 1 ) SELECT row_to_json(b.*) from myInfo b 或者： SELECT (SELECT row_to_json(_) from (select a.id, a.realName) as _) from TUsers a where a.id = 1 关于这个问题可以参考： http://dba.stackexchange.com/questions/27732/set-names-to-attributes-when-creating-json-with-row-to-json 注意的是，不能用row()函数，因为无法产生别名，即正如https://www.postgresql.org/docs/9.6/static/functions-json.html里所演示的那样： row_to_json(row(1,'foo')) {f1:1,f2:foo} 所以综合以上，建议用WITH方法，简洁明了。 public class UserWS extends HttpServlet ： protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { String rtn = ; request.setCharacterEncoding(UTF-8); response.setContentType(text/json; charset=UTF-8); response.setCharacterEncoding(UTF-8); String action = request.getParameter(action); if(action.equalsIgnoreCase(getuserinfo)) {//其他人通过用户ID获取用户信息 String userid = (String) request.getParameter(userid); User u = new User(); String userJson = u.getUserInfoFromId(userid); if(null != userJson){//存在该用户信息 rtn = new StringBuilder() .append( {issuccessful:true,data:) .append((null==userJson)?:userJson) .append(}) .toString(); }else{ rtn = {issuccessful:false,errorcode:NOEXIST_USER}; } } String callbackFunName = request.getParameter(callbackparam); if(null != callbackFunName ){ response.getOutputStream().write(StringUtil.wrapCrossDomain(rtn,callbackFunName).getBytes(UTF-8)); }else{ response.getOutputStream().write(rtn.getBytes(UTF-8)); } } 测试一下，当我们发送： http://localhost:8080/jsonTest/user?action=getuserinfouserid=1 返回： { issuccessful : true , data : { id : 1 , realname : test1 } } 2. 数据库返回多行数据：如获取用户参与的项目 WITH myProjects AS (select a.id,a.title from TProject a where a.creatorId = 1) SELECT row_to_json(b.*) from (SELECT array_to_json(array(select row_to_json(myProjects.*) from myProjects),false) as myProjects) b 测试一下： http://localhost:8080/jsonTest/project?action=getmyprojectsuserid=1 返回： { issuccessful:true, data:{ myprojects: } } 3. 返回某些表的一行数据，加上某些表的多行数据：如获取用户基本信息以及用户参与的项目 WITH myInfo AS (select id,realName from TUsers where id = 1 ), -- 一行数据 myProjects AS (select a.id,a.title from TProject a,myInfo b where a.creatorId = b.id) -- 多行数据 SELECT row_to_json(x.*) from ( select c.*,d.* from myInfo c, (SELECT array_to_json(array(select row_to_json(myProjects.*) from myProjects),false) as myProjects) d -- 把多行数据生成一行一列json数组 ) x 该SQL返回： { myprojects : , id : 1 , realname : test1 } 总结一下：仅仅使用PG的row_to_json()，即可轻松返回需要的各种SQL数据。这样子使得中间层（JAVA/PHP）的代码变得简化，无需org.json和json-lib库再做一次封装，这是架构师要注意的一个重要趋势。码字码了半天，手指头都麻了，如果觉得对您还有用的话，赞助一下辛苦费吧：

个人分类: postgresql|17699 次阅读|1 个评论

openbsd下安装postgresql

outcrop 2013-10-17 15:40

文档有点问题，只好自己多摸索了下。 1、安装服务器及管理端 pkg_add -vv postgresql-server pkg_add -vv pgadmin3 2、设置数据库用户及数据存放位置。openbsd中，默认/var的空间很小 useradd -m postgres passwd postgres mkdir /home/pgsql /home/pgsql/var chown -R postgres:users /home/pgsql su postgres initdb -D /home/pgsql/var exit 3、设置rc.local自启动pg，pg手册关于openbsd启动设置的文档错了 if ; then su postgres -c '/usr/local/bin/pg_ctl -D /home/pgsql/var start -l /home/pgsql/logfile -s' echo -n 'postgresql' fi 默认的pgamin3支持到pg 9.1，但ports里的是9.2.3；会报错，但能用，毕竟不是常用pgadmin.

个人分类: 计算机应用技术|4061 次阅读|0 个评论

[转载]MySQL和PostgreSQL比较

bioseq 2012-9-13 17:06

特性 MySQL PostgreSQL 实例通过执行 MySQL 命令（mysqld）启动实例。一个实例可以管理一个或多个数据库。一台服务器可以运行多个 mysqld 实例。一个实例管理器可以监视 mysqld 的各个实例。通过执行 Postmaster 进程（pg_ctl）启动实例。一个实例可以管理一个或多个数据库，这些数据库组成一个集群。集群是磁盘上的一个区域，这个区域在安装时初始化并由一个目录组成，所有数据都存储在这个目录中。使用 initdb 创建第一个数据库。一台机器上可以启动多个实例。数据库数据库是命名的对象集合，是与实例中的其他数据库分离的实体。一个 MySQL 实例中的所有数据库共享同一个系统编目。数据库是命名的对象集合，每个数据库是与其他数据库分离的实体。每个数据库有自己的系统编目，但是所有数据库共享 pg_databases。数据缓冲区通过 innodb_buffer_pool_size 配置参数设置数据缓冲区。这个参数是内存缓冲区的字节数，InnoDB 使用这个缓冲区来缓存表的数据和索引。在专用的数据库服务器上，这个参数最高可以设置为机器物理内存量的 80%。 Shared_buffers 缓存。在默认情况下分配 64 个缓冲区。默认的块大小是 8K。可以通过设置 postgresql.conf 文件中的 shared_buffers 参数来更新缓冲区缓存。数据库连接客户机使用 CONNECT 或 USE 语句连接数据库，这时要指定数据库名，还可以指定用户 id 和密码。使用角色管理数据库中的用户和用户组。客户机使用 connect 语句连接数据库，这时要指定数据库名，还可以指定用户 id 和密码。使用角色管理数据库中的用户和用户组。身份验证 MySQL 在数据库级管理身份验证。基本只支持密码认证。 PostgreSQL 支持丰富的认证方法：信任认证、口令认证、Kerberos 认证、基于 Ident 的认证、LDAP 认证、PAM 认证加密可以在表级指定密码来对数据进行加密。还可以使用 AES_ENCRYPT 和 AES_DECRYPT 函数对列数据进行加密和解密。可以通过 SSL 连接实现网络加密。可以使用 pgcrypto 库中的函数对列进行加密/解密。可以通过 SSL 连接实现网络加密。审计可以对 querylog 执行 grep。可以在表上使用 PL/pgSQL 触发器来进行审计。查询解释使用 EXPLAIN 命令查看查询的解释计划。使用 EXPLAIN 命令查看查询的解释计划。备份、恢复和日志 InnoDB 使用写前（write-ahead）日志记录。支持在线和离线完全备份以及崩溃和事务恢复。需要第三方软件才能支持热备份。在数据目录的一个子目录中维护写前日志。支持在线和离线完全备份以及崩溃、时间点和事务恢复。可以支持热备份。 JDBC 驱动程序可以从参考资料下载 JDBC 驱动程序。可以从参考资料下载 JDBC 驱动程序。表类型取决于存储引擎。例如，NDB 存储引擎支持分区表，内存引擎支持内存表。支持临时表、常规表以及范围和列表类型的分区表。不支持哈希分区表。由于PostgreSQL的表分区是通过表继承和规则系统完成了，所以可以实现更复杂的分区方式。索引类型取决于存储引擎。MyISAM：BTREE，InnoDB：BTREE。支持 B-树、哈希、R-树和 Gist 索引。约束支持主键、外键、惟一和非空约束。对检查约束进行解析，但是不强制实施。支持主键、外键、惟一、非空和检查约束。存储过程和用户定义函数支持 CREATE PROCEDURE 和 CREATE FUNCTION 语句。存储过程可以用 SQL 和 C++ 编写。用户定义函数可以用 SQL、C 和 C++ 编写。没有单独的存储过程，都是通过函数实现的。用户定义函数可以用 PL/pgSQL（专用的过程语言）、PL/Tcl、PL/Perl、PL/Python 、SQL 和 C 编写。触发器支持行前触发器、行后触发器和语句触发器，触发器语句用过程语言复合语句编写。支持行前触发器、行后触发器和语句触发器，触发器过程用 C 编写。系统配置文件 my.conf Postgresql.conf 数据库配置 my.conf Postgresql.conf 客户机连接文件 my.conf pg_hba.conf XML 支持有限的 XML 支持。有限的 XML 支持。数据访问和管理服务器 OPTIMIZE TABLE —— 回收未使用的空间并消除数据文件的碎片 myisamchk -analyze —— 更新查询优化器所使用的统计数据（MyISAM 存储引擎） mysql —— 命令行工具 MySQL Administrator —— 客户机 GUI 工具 Vacuum —— 回收未使用的空间 Analyze —— 更新查询优化器所使用的统计数据 psql —— 命令行工具 pgAdmin —— 客户机 GUI 工具并发控制支持表级和行级锁。InnoDB 存储引擎支持 READ_COMMITTED、READ_UNCOMMITTED、REPEATABLE_READ 和 SERIALIZABLE。使用 SET TRANSACTION ISOLATION LEVEL 语句在事务级设置隔离级别。支持表级和行级锁。支持的 ANSI 隔离级别是 Read Committed（默认 —— 能看到查询启动时数据库的快照）和 Serialization（与 Repeatable Read 相似 —— 只能看到在事务启动之前提交的结果）。使用 SET TRANSACTION 语句在事务级设置隔离级别。使用 SET SESSION 在会话级进行设置。 MySQL相对于PostgreSQL的劣势： MySQL PostgreSQL 最重要的引擎InnoDB很早就由Oracle公司控制。目前整个MySQL数据库都由Oracle控制。 BSD协议，没有被大公司垄断。对复杂查询的处理较弱，查询优化器不够成熟很强大的查询优化器，支持很复杂的查询处理。只有一种表连接类型:嵌套循环连接(nested-loop),不支持排序-合并连接(sort-merge join)与散列连接(hash join)。都支持性能优化工具与度量信息不足提供了一些性能视图，可以方便的看到发生在一个表和索引上的select、delete、update、insert统计信息，也可以看到cache命中率。网上有一个开源的pgstatspack工具。 InnoDB的表和索引都是按相同的方式存储。也就是说表都是索引组织表。这一般要求主键不能太长而且插入时的主键最好是按顺序递增，否则对性能有很大影响。不存在这个问题。大部分查询只能使用表上的单一索引;在某些情况下，会存在使用多个索引的查询,但是查询优化器通常会低估其成本,它们常常比表扫描还要慢。不存在这个问题表增加列，基本上是重建表和索引，会花很长时间。表增加列，只是在数据字典中增加表定义，不会重建表存储过程与触发器的功能有限。可用来编写存储过程、触发器、计划事件以及存储函数的语言功能较弱除支持pl/pgsql写存储过程，还支持perl、python、Tcl类型的存储过程：pl/perl，pl/python，pl/tcl。也支持用C语言写存储过程。不支持Sequence。支持不支持函数索引，只能在创建基于具体列的索引。不支持物化视图。支持函数索引，同时还支持部分数据索引，通过规则系统可以实现物化视图的功能。执行计划并不是全局共享的, 仅仅在连接内部是共享的。执行计划共享 MySQL支持的SQL语法(ANSI SQL标准)的很小一部分。不支持递归查询、通用表表达式（Oracle的with 语句）或者窗口函数（分析函数）。都支持不支持用户自定义类型或域(domain) 支持。对于时间、日期、间隔等时间类型没有秒以下级别的存储类型可以精确到秒以下。身份验证功能是完全内置的，不支持操作系统认证、PAM认证，不支持LDAP以及其它类似的外部身份验证功能。支持OS认证、Kerberos 认证、Ident 的认证、LDAP 认证、PAM 认证不支持database link。有一种叫做Federated的存储引擎可以作为一个中转将查询语句传递到远程服务器的一个表上,不过,它功能很粗糙并且漏洞很多有dblink，同时还有一个dbi-link的东西，可以连接到oracle和mysql上。 Mysql Cluster可能与你的想象有较大差异。开源的cluster软件较少。复制(Replication)功能是异步的,并且有很大的局限性.例如,它是单线程的(single-threaded),因此一个处理能力更强的Slave的恢复速度也很难跟上处理能力相对较慢的Master. 有丰富的开源cluster软件支持。 explain看执行计划的结果简单。 explain返回丰富的信息。类似于ALTER TABLE或CREATE TABLE一类的操作都是非事务性的.它们会提交未提交的事务，并且不能回滚也不能做灾难恢复 DDL也是有事务的。 PostgreSQL主要优势：　　1. PostgreSQL完全免费，而且是BSD协议，如果你把PostgreSQL改一改，然后再拿去卖钱，也没有人管你，这一点很重要，这表明了PostgreSQL数据库不会被其它公司控制。oracle数据库不用说了，是商业数据库，不开放。而MySQL数据库虽然是开源的，但现在随着SUN被oracle公司收购，现在基本上被oracle公司控制，其实在SUN被收购之前，MySQL中最重要的InnoDB引擎也是被oracle公司控制的，而在MySQL中很多重要的数据都是放在InnoDB引擎中的，反正我们公司都是这样的。所以如果MySQL的市场范围与oracle数据库的市场范围冲突时，oracle公司必定会牺牲MySQL，这是毫无疑问的。　　2. 与PostgreSQl配合的开源软件很多，有很多分布式集群软件，如pgpool、pgcluster、slony、plploxy等等，很容易做读写分离、负载均衡、数据水平拆分等方案，而这在MySQL下则比较困难。 3. PostgreSQL源代码写的很清晰，易读性比MySQL强太多了，怀疑MySQL的源代码被混淆过。所以很多公司都是基本PostgreSQL做二次开发的。 4. PostgreSQL在很多方面都比MySQL强，如复杂SQL的执行、存储过程、触发器、索引。同时PostgreSQL是多进程的，而MySQL是线程的，虽然并发不高时，MySQL处理速度快，但当并发高的时候，对于现在多核的单台机器上，MySQL的总体处理性能不如PostgreSQL，原因是MySQL的线程无法充分利用CPU的能力。目前只想到这些，以后想到再添加，欢迎大家拍砖。 PostgreSQL与oracle或InnoDB的多版本实现的差别 PostgreSQL与oracle或InnoDB的多版本实现最大的区别在于最新版本和历史版本是否分离存储，PostgreSQL不分，而oracle和InnoDB分，而innodb也只是分离了数据,索引本身没有分开。 PostgreSQL的主要优势在于： 1. PostgreSQL没有回滚段，而oracle与innodb有回滚段，oracle与Innodb都有回滚段。对于oracle与Innodb来说，回滚段是非常重要的，回滚段损坏，会导致数据丢失，甚至数据库无法启动的严重问题。另由于PostgreSQL没有回滚段，旧数据都是记录在原先的文件中，所以当数据库异常crash后，恢复时，不会象oracle与Innodb数据库那样进行那么复杂的恢复，因为oracle与Innodb恢复时同步需要redo和undo。所以PostgreSQL数据库在出现异常crash后，数据库起不来的几率要比oracle和mysql小一些。 2. 由于旧的数据是直接记录在数据文件中，而不是回滚段中，所以不会象oracle那样经常报ora-01555错误。 3. 回滚可以很快完成，因为回滚并不删除数据，而oracle与Innodb，回滚时很复杂，在事务回滚时必须清理该事务所进行的修改，插入的记录要删除，更新的记录要更新回来(见row_undo函数)，同时回滚的过程也会再次产生大量的redo日志。 4. WAL日志要比oracle和Innodb简单，对于oracle不仅需要记录数据文件的变化，还要记录回滚段的变化。 PostgreSQL的多版本的主要劣势在于： 1、最新版本和历史版本不分离存储，导致清理老旧版本需要作更多的扫描，代价比较大，但一般的数据库都有高峰期，如果我们合理安排VACUUM，这也不是很大的问题，而且在PostgreSQL9.0中VACUUM进一步被加强了。　　2、由于索引中完全没有版本信息，不能实现Coverage index scan，即查询只扫描索引，直接从索引中返回所需的属性，还需要访问表。而oracle与Innodb则可以; 进程模式与线程模式的对比 PostgreSQL和oracle是进程模式，MySQL是线程模式。进程模式对多CPU利用率比较高。进程模式共享数据需要用到共享内存，而线程模式数据本身就是在进程空间内都是共享的，不同线程访问只需要控制好线程之间的同步。线程模式对资源消耗比较少。所以MySQL能支持远比oracle多的更多的连接。对于PostgreSQL的来说，如果不使用连接池软件，也存在这个问题，但PostgreSQL中有优秀的连接池软件软件，如pgbouncer和pgpool，所以通过连接池也可以支持很多的连接。堆表与索引组织表的的对比 Oracle支持堆表，也支持索引组织表 PostgreSQL只支持堆表，不支持索引组织表 Innodb只支持索引组织表索引组织表的优势：表内的数据就是按索引的方式组织，数据是有序的，如果数据都是按主键来访问，那么访问数据比较快。而堆表，按主键访问数据时，是需要先按主键索引找到数据的物理位置。索引组织表的劣势：索引组织表中上再加其它的索引时，其它的索引记录的数据位置不再是物理位置，而是主键值，所以对于索引组织表来说，主键的值不能太大，否则占用的空间比较大。对于索引组织表来说，如果每次在中间插入数据，可能会导致索引分裂，索引分裂会大大降低插入的性能。所以对于使用innodb来说，我们一般最好让主键是一个无意义的序列，这样插入每次都发生在最后，以避免这个问题。由于索引组织表是按一个索引树，一般它访问数据块必须按数据块之间的关系进行访问，而不是按物理块的访问数据的，所以当做全表扫描时要比堆表慢很多，这可能在OLTP中不明显，但在数据仓库的应用中可能是一个问题。　 PostgreSQL9.0中的特色功能： PostgreSQL中的Hot Standby功能也就是standby在应用日志同步时，还可以提供只读服务，这对做读写分离很有用。这个功能是oracle11g才有的功能。 PostgreSQL异步提交（Asynchronous Commit）的功能：　这个功能oracle中也是到oracle11g R2才有的功能。因为在很多应用场景中，当宕机时是允许丢失少量数据的，这个功能在这样的场景中就特别合适。在PostgreSQL9.0中把synchronous_commit设置为false就打开了这个功能。需要注意的是，虽然设置为了异步提交，当主机宕机时，PostgreSQL只会丢失少量数据，异步提交并不会导致数据损坏而数据库起不来的情况。MySQL中没有听说过有这个功能。 PostgreSQL中索引的特色功能： PostgreSQL中可以有部分索引，也就是只能表中的部分数据做索引，create index 可以带where 条件。同时PostgreSQL中的索引可以反向扫描，所以在PostgreSQL中可以不必建专门的降序索引了。

3938 次阅读|0 个评论

[转载]转发：Postgresql 9.0 Replication - Step by Step

hillpig 2012-2-21 09:38

本文转发自：http://linuxhow-tos.blogspot.com/2010/10/postgresql-90-replication-step-by-step_14.html由于国内用户无法访问blogspot，由于这篇技术博文还不错，故转发。 Postgresql 9.0 Replication - Step by Step Postgresql 9.0 Streaming Replication - Big Competition Now Postgresql 9.0 has come with in-built replication features. NOW A BIG COMPETITOR FOR MYSQL This method for those who wants to satisfy them by implementing Asynchronous mode of replication Who wants Synchronous method, and then have to wait for 9.1 to come in existence. I tried to build a Postgresql Master-Slave replication system with new feature “Streaming Replication (SR)”, and this I have to say, this is very easy and stable method for asynchronous method for replication While building the SR, I felt, this type of SR is like Mysql Master-slave replication, Slaves are getting updates from master from log files (binlog in cas e of mysql and xlog in postgres) The above statement can be taken as rough idea when comparing Mysql and Postgresql replication method Steps Included: 1. Download and build, install Postgresql from source. 2. Initialize database on master 3. Some tweaks in postgresql.conf (On Master) 4. Start backup mode on master 5. Take snapshot of Data (/usr/local/postgres/data) directory on master 6. Sync snapshot to slave and remove postmaster.pid on slave 7. Some tweaks in postgresql.conf and recovery.conf (On Slaves) 8. At end start the Database on Master and slave So here are the steps: Download the Database: (On Master) # wget http://wwwmaster.postgresql.org/redir/198/h/source/v9.0.0/postgresql-9.0.0.tar.bz2 # tar xzf postgresql-9.0.0.tar.bz2 If you have fresh CentOS/other linux version, then maybe you have to download and install required packages… In my case I am Using CentOS 5.3 64 Bit Run the below command as “ROOT” # yum install -y glibc gcc libtermcap-devel readline-devel zlib-devel # useradd postgres # chown –R postgres:postgres postgresql-9.0.0 # su – postgres # cd postgresql-9.0.0 (extracted directory) Run the following command to compile and install Postgresql # ./configure –prefix=/usr/local/postgres –with-perl # gmake # su – # gmake install # chown -R postgres:postgres /usr/local/postgres # mkdir /usr/local/postgres/data # chown postgres /usr/local/postgres/data # su - postgres # /usr/local/postgres/bin/initdb -D /usr/local/postgres/data After the initializing the data directory, we have to tweak--- “/usr/local/postgres/data/postgresql.conf“ file I am showing you required things only listen_addresses = '*' max_connections = 500 shared_buffers = 32MB wal_level = hot_standby archive_mode = on archive_command = 'cp %p /usr/local/postgres/data/pg_xlogarch/%f' max_wal_senders = 1 ####( this setting for 1 slave can connect with Master) wal_keep_segments = 32 datestyle = 'iso, mdy' lc_messages = 'en_US.UTF-8' lc_monetary = 'en_US.UTF-8' lc_numeric = 'en_US.UTF-8' lc_time = 'en_US.UTF-8' default_text_search_config = 'pg_catalog.english' # mkdir /usr/local/postgres/data/pg_xlogarch/ # su – postgres # cd /usr/local/postgres/ # vi data/pg_hba.conf (add the below line) host replication all 172.32.1.2/32 trust Now start the Database Instance on Master: As user postgres # /usr/local/postgres/bin/pg_ctl -D /usr/local/postgres/data -l logfile start After starting the database Instance, start the backup mode on DB Master # cd/usr/local/postgres/bin/ # ./psql –c SELECT pg_start_backup('initial backup for SR') database-name database-name = database name for which you want replication # cd /usr/local/postgres/ # tar -czf data.tar.gz data/ Exit from Backup mode # /usr/local/postgres/bin/psql –c SELECT pg_stop_backup() database-name # rsync –avz data.tar.gz postgres@172.32.1.2:/usr/local/postgrsql/ (run this command after the slave’s Postgresql installation) On SLAVE: Same as master download and install Postgresql on slave Just do not initialize the database: As user Postgres # cd /usr/local/postgrsql # tar xzf data.tar.gz # chown –R postgres:postgres /usr/local/postgres/data/pg_xlogarch # cd data # rm –rf postmaster.pid # vi /usr/local/postgrsql/data/Postgresql.conf In short: listen_addresses = '*' # what IP address(es) to listen on; max_connections = 500 # (change requires restart) shared_buffers = 32MB # min 128kB hot_standby = on datestyle = 'iso, mdy' lc_messages = 'en_US.UTF-8' # locale for system error message lc_monetary = 'en_US.UTF-8' # locale for monetary formatting lc_numeric = 'en_US.UTF-8' # locale for number formatting lc_time = 'en_US.UTF-8' # locale for time formatting default_text_search_config = 'pg_catalog.english' Now create a “ recovery.conf ” under /usr/local/postgres/data/ and put the below lines: (In Slave) standby_mode = 'on' primary_conninfo = 'host=172.32.1.1 port=5432 user=readonly' ### restore_command = 'cp /usr/local/postgres/data/ pg_xlogarch /%f %p' ---- This is wrong entry Use this one: restore_command = 'cp /usr/local/postgres/data/ pg_xlog /%f %p' That's it you are done with configurations: Now Start the database instance on slave machine # /usr/local/postgres/bin/pg_ctl -D /usr/local/postgres/data -l logfile start If all goes well, the on master, you will see like this on Master: LOG: replication connection authorized: user=readonly host=172.32.1.2 port=55811 And on slave’s log file (logfile) LOG: streaming replication successfully connected to primary If fails, check logfile and send error, if you don't able to resolve. New Addition: If you want to check, which pgsql server is Master or Slave, then run this command: # /usr/local/postgres/bin/pg_controldata /usr/local/postgres/data/ (running on Master) pg_control version number: 903 Catalog version number: 201008051 Database system identifier: 5572911747953590263 Database cluster state: in production (This shows, it is master server) pg_control last modified: Fri 06 May 2011 05:09:20 PM IST Latest checkpoint location: 2/97482E60 ............ On Slave: Same Line will show as: Database cluster state: in archive recovery 加我私人微信，交流技术。

个人分类: postgresql|4170 次阅读|0 个评论

PostgreSQL9.1 Warm-Standby ---之基于拷贝WAL文件的方法

hillpig 2012-2-19 18:19

本教程是PostgreSQL Cluster系列教程的一部分，该系列包括： PostgreSQL9.1 PITR示例（该教程主要阐述DBA如何基于WAL日志做备份恢复） PostgreSQL9.1 Warm-Standby ---之基于拷贝WAL文件的方法 (file-based log shipping) P ostgreSQL9.1 Warm-Standby ---之基于流复制的方法 (streaming replication) PostgreSQL9.1 Warm-Standby ---之基于同步复制的方法 (Synchronous Replication) PostgreSQL9.1 Hot-Standby ---之基于拷贝WAL文件的方法 PostgreSQL9.1 Hot-Standby ---之基于流复制的方法 PostgreSQL9.1 Hot-Standby ---之基于同步复制的方法 PG9.1+pgpool-II3.1--之HA (Hot-Standby+Streaming Replication) PG9.1+pgpool-II3.1--之Load Balancing (when meeting large amounts of requests) PG9.1+pgpool-II3.1--之Parallel Query (when meeting large amounts of data) PostgreSQL9.1 HA --- 之Slony 好了，有了第一个教程，我们终于等来了高可用性（High Availability，HA）的介绍。说到HA，我们不得不提提Fault Tolerance，为什么这么说呢，这是因为Availablity肯定是和Unavailability对应，那为什么会有Unavailability呢，那是因为有许多Fault，包括硬件的和软件的，所以我们得先从Fault Tolerance说起。先来看看Fault Tolerance概念，按照传统，看看WIKI上的定义： Fault-tolerance or graceful degradation is the property that enables a system (often computer -based) to continue operating properly in the event of the failure of (or one or more faults within) some of its components. 哦，简单来说，就是系统有了failure了，还能继续运行。搞学术的人总想从历史的角度来寻找Fault Tolerance的出处，我们再看看1978年BRIAN RANDELL在其论文“ System Structure for Software Fault Tolerance ”中的解释： The concept of “fault-tolerant computing” has existed for a long time. The first book on the subject （ W. H. Pierce, Failure-Tolerant Computer Design. New York: Academic, 1965，此处我给补充）was published no less than ten years ago, but the notion of fault tolerance has remained almost exclusively the preserve of the hardware designer. Hardware structures have been developed which can “tolerate” faults, i.e., continue to provide the required facilities despite occasional failures, either transient or permanent, of internal components and modules. 所以我们从上面来看，Fault Tolerance的理念至少在计算机出现的早期（1965年之前）就有了，在硬件设计时是必须考虑的。咱们不扯多了（有兴趣的朋友，可以多写写），接下来的问题是HA和Fault Tolerance又是什么关系？我们再来看看YENNUN HUANG， CHANDRA KINTALA在1995年的论文 Software Fault Tolerance in the Application Layer 中所写的一段描述： From a user’s point of view, fault tolerance has two dimensions: availability and data consistency of the application. For example, users of telephone switching systems demand continuous availability whereas bank teller machine customers demand the highest degree of data consistency. Safety critical real-time systems such as nuclear power reactors and flight control systems need highest levels in both availability and data consistency. Most other applications have lower degrees of requirements for fault-tolerance in both dimensions; see Figure 10.1（这里我把这个图截下来，供大家看看）。上面的话的意思是从用户的角度来看， fault tolerance有两个方面，一个是可用性（availability），一个是数据一致性（data consistency）。有些应用可能注重高的数据一致性，例如银行柜台机（ bank teller machine），有些应用可能注重高的可用性，如电话交换系统（telephone switching system），有的二者都要求高，如核电厂和飞行控制系统（nuclear power reactors ， flight control system）。怎么样，HA和 fault tolerance的关系明白了吧。下面列出一些参考资料，有兴趣的朋友可以继续深入下去。 Tandem Computers wiki: High Availability Oracle Database High Availability James A. Katzman， Tandem 16: A Fault-Tolerant Computing System BRIAN RANDELL， System Structure for Software Fault Tolerance Jim Grey, High-Availability Computer Systems Flavin Cristian, Understanding fault-tolerant distributed systems Mary Baker, Mark Sullivan, The Recovery Box: Using Fast Recovery to Provide High Availability in the UNIX Environment FAULT-TOLERANT COMPUTING Hui-I Hsiao，A Performance Study of Three High Availability Data Replication Strategies http://h20223.www2.hp.com/nonstopcomputing/cache/76385-0-0-0-121.html 好了，我们大体明白了什么是HA之后，那如果我们做HA的话需要做什么呢？根据Jim Grey的 High-Availability Computer Systems 中描述的： High availability requires systems designed to tolerate faults to detect a fault, report it, mask it, and then continue service while the faulty component is repaired off line. 这就要求我们需要探测一个故障（fault），报告，屏蔽并在离线修复该故障时继续提供有效的服务。关于fault，我们多说两句，那到底有什么故障呢，哪些故障最容易出现，根据 Mary Baker, Mark Sullivan, The Recovery Box: Using Fast Recovery to Provide High Availability in the UNIX Environment ，提到的一张表，如下：其中谈到In 1990, software errors accounted for 62% of Tandem system failures, while only 7% were caused by hardware. 即软件错误在1990年的Tandem系统中占了62%（有谁能给我一个现在fault发生率的报告？多谢）。虽然我们这里不去过多的探讨硬件具有Fault Tolerance的计算机，但是还是列出一些，如1950–1954 年建的SAPO恐怕是第一台具有Fault Tolerance的计算机，以及让Fault Tolerance深入人心的Tandem Computer（参阅 FAULT-TOLERANT COMPUTING ，第3页）。为了简化本文，关于HA的基础就到此为止，下面看看如何使用warm-standby（基于拷贝WAL文件的方法）来实施HA。首先我们来看看warm-standby的含义，根据http://www.postgresql.org/docs/9.1/interactive/high-availability.html，其中的一段话： Servers that can modify data are called read/write, master or primary servers. Servers that track changes in the master are called standby or slave servers. A standby server that cannot be connected to until it is promoted to a master server is called a warm standby server, and one that can accept connections and serves read-only queries is called a hot standby server. 从中我们知道，warm-standby在系统出现fault的时候，可以提升为master，即可以接受客户端的connect连接，并提供数据库的读写，角色如同一个master一样。而根据 25.5 的说明（如下），hot-standby则在为archive recovery 或 standby mode时，只能接受可读的query。 Hot Standby is the term used to describe the ability to connect to the server and run read-only queries while the server is in archive recovery or standby mode. This is useful both for replication purposes and for restoring a backup to a desired state with great precision （这一句什么意思？）. 尽管，网上还有一些关于cold/warm/hot server的定义，如这和这儿，为保持本文的简易性，此处不再详细分析这些概念上的区别（感兴趣的朋友可以继续深入研究）。好，那接下来我们看看warm-standby的原理，下面几张图截自 Getting ready for PostgreSQL 9.1 ，第28，29和30页：正常运行时： Failover时： Switchover时: 之所以列出这三张图，就是想分析分析，我们需要PostgreSQL配置哪些内容，以支持这三项（正常运行/Failover/Switchover ）。首先我们对环境说明一下： /home/postgres/db/master/pgsql 目录是master数据库的目录，端口为5432 /home/postgres/db/standby/pgsql 目录是一台standy数据库的目录，端口为6432 /home/postgres/archive 是master-standby的WAL日志中转地点 /home/postgres/ archive_failover / 是Switchover阶段standby-master的WAL日志中转地点 /home/postgres/base 是master-standby的基础备份库的目录 /home/postgres/trigger 使用pg_standby的trigger文件存放处关于上面的第1点，我们在教程一（ PostgreSQL9.1 PITR示例）里已经说过如何安装了（但是这里有一点不同的是我们将使用PostgreSQL的扩展模块 pg_standby，所以编译和安装时稍微不同，请参考:15.4. Installation Procedure ），并生成了100万条数据，即执行了： $ cd /home/postgres/develop/ $ rm -fr postgresql-9.1.2 $ tar zxf postgresql-9.1.2.tar.gz $ cd postgresql-9.1.2 $ ./configure --prefix=/home/postgres/db/master/pgsql --with-includes=/usr/local/readline/include --with-libraries=/usr/local/readline/lib $ make world $ make install-world $ /home/postgres/db/master/pgsql/bin/initdb -D /home/postgres/db/master/pgsql/data $ /home/postgres/db/master/pgsql/bin/postmaster -D /home/postgres/db/master/pgsql/data $ /home/postgres/db/master/pgsql/bin/createdb mydb $ /home/postgres/db/master/pgsql/bin/psql mydb mydb=# create table foo(id bigint); mydb=# insert into foo select * from generate_series(1,1000000); 关于第2点，类似于第1点，只是端口（我们把默认的端口5432改为6432，以和master区别开来）不同，这里简要列一个安装步骤： $ cd /home/postgres/develop/ $ rm -fr postgresql-9.1.2 $ tar zxf postgresql-9.1.2.tar.gz $ cd postgresql-9.1.2 $ ./configure --prefix=/home/postgres/db/standby/pgsql --with-includes=/usr/local/readline/include --with-libraries=/usr/local/readline/lib --with-pgport=6432 $ make world $ make install-world $ /home/postgres/db/standby/pgsql/bin/initdb -D /home/postgres/db/standby/pgsql/data 然后修改postgresql.conf：port = 6432，保存。 $ /home/postgres/db/standby/pgsql/bin/postmaster -D /home/postgres/db/standby/pgsql/data $ /home/postgres/db/standby/pgsql/bin/createdb mydb --port=6432 好，最后查看一下是否安装成功： $ /home/postgres/db/standby/pgsql/bin/psql mydb --port=6432 （A）接下来我们就看看正常运行下，如何配置master和standby数据库（注意：写到这里你可以动手操作了）： master的postgresql.conf中三个参数： wal_level = archive archive_mode = on archive_command = 'cp %p /home/postgres/archive/%f' 并参考第一篇 PostgreSQL9.1 PITR示例对master做一次基础备份（ base backup，基础备份data目录）： mydb=# SELECT pg_start_backup('bak20120218'); $ cd /home/postgres/db/master/pgsql/ $ tar czvf /home/postgres/base/base_data.tar.gz data/ mydb=# SELECT pg_stop_backup(); mydb=# select pg_switch_xlog(); 由于刚安装的standy数据库此时什么都没有，而master可能已经运行了一段时间，这就需要对standy数据库的data目录使用上面基础备份的data目录覆盖： $ cd /home/postgres/db/standby/pgsql/ $ mv data data_bk $ tar -xzvf /home/postgres/base/base_data.tar.gz $ rm -r data/pg_xlog/ $ mkdir -p data/pg_xlog/archive_status $ rm data/postmaster.pid $ cp /home/postgres/db/standby/pgsql/share/recovery.conf.sample /home/postgres/db/standby/pgsql/data/recovery.conf $ cp /home/postgres/db/standby/pgsql/data_bk/postgresql.conf /home/postgres/db/standby/pgsql/data/postgresql.conf 然后配置standy让它处于一直接收WAL日志的standby_mode状态，即修改recovery.conf文件（参考： 25.2.4 Setting Up a Standby Server ）：（注意没有使用：restore_command = 'cp /home/postgres/archive/%f %p'，当然使用这个命令我测试过也可以，但此处使用了更为强大的postgresql的扩展pg_standby,关于如何使用pg_standby，请参阅F.31. pg_standby ） standby_mode = on restore_command = '/home/postgres/db/standby/pgsql/bin/pg_standby -d -s 2 -t /home/postgres/trigger/pgsql.trigger.6432 /home/postgres/archive %f %p %r' recovery_end_command = 'rm -f /home/postgres/trigger/pgsql.trigger.6432' 好，重启standby数据库（之前请先启动master库），看看是否成功： $ /home/postgres/db/standby/pgsql/bin/postmaster -D /home/postgres/db/standby/pgsql/data --port=6432 LOG: database system was interrupted; last known up at 2012-02-22 17:35:21 CST LOG: entering standby mode Trigger file: /home/postgres/trigger/pgsql.trigger.6432 Waiting for WAL file: 000000010000000000000006 WAL file path: /home/postgres/archive/000000010000000000000006 Restoring to: pg_xlog/RECOVERYXLOG Sleep interval: 2 seconds Max wait interval: 0 forever Command for restore: cp /home/postgres/archive/000000010000000000000006 pg_xlog/RECOVERYXLOG Keep archive history: 000000000000000000000000 and later running restore : OK LOG: restored log file 000000010000000000000006 from archive LOG: redo starts at 0/6000070 LOG: consistent recovery state reached at 0/7000000 Trigger file: /home/postgres/trigger/pgsql.trigger.6432 Waiting for WAL file: 000000010000000000000007 WAL file path: /home/postgres/archive/000000010000000000000007 Restoring to: pg_xlog/RECOVERYXLOG Sleep interval: 2 seconds Max wait interval: 0 forever Command for restore: cp /home/postgres/archive/000000010000000000000007 pg_xlog/RECOVERYXLOG Keep archive history: 000000010000000000000006 and later WAL file not present yet. Checking for trigger file... WAL file not present yet. Checking for trigger file... 若是，则成功。当然您可以分别通过如下命令来看master和standby的状态：看master： /home/postgres/db/master/pgsql/bin/pg_controldata /home/postgres/db/master/pgsql/data 是否Database cluster state: in production ？看standby： /home/postgres/db/standby/pgsql/bin/pg_controldata /home/postgres/db/standby/pgsql/data 是否Database cluster state: in archive recovery？当然您可以继续在master里插入一些新的数据，以检验看看standby的服务端是不是几乎同时在用WAL日志恢复。（B）接下来进入 Failover阶段情况下如何配置，在Failover时，master由于某种原因down掉，您可以通过发出下述指令来关闭（根据德哥的经验，见这里，千万不要用-m immediate来数据库）： $ /home/postgres/db/ master /pgsql/bin/pg_ctl stop -m fast -D /home/postgres/db/master/pgsql/data 确定master关机后，然后我们需要首先在提升standby为master之前做一些准备，即修改standby的postgresql.conf中三个参数(一定要注意我们新增了 archive_failover目录 )，此时配置的目的是让后面Switchover阶段时，原来的master可以使用新master的日志： wal_level = archive archive_mode = on archive_command = 'cp %p /home/postgres/ archive_failover /%f' 然后准备提升standby为master，即：在trigger目录下创建pgsql.trigger.6432文件，文件内容为空（当然也可以写smart）， $ cat pgsql.trigger.6432 $ /home/postgres/db/ standby /pgsql/bin/pg_ctl promote -D /home/postgres/db/ standby /pgsql/data （注意： promote命令发出后，可能要等一会（十几秒钟）才能让standby切换成master）然后重新启动新的master库（即原来的promote之后的standby库），以让postgresql.conf生效。然后维护master库，把fault修正后，进入下一阶段。（C）Switchover阶段，这个时候原来的master需要重新启动为standby模式，此时配置原来的master和原来的standby一样。步骤为：把新master做一次基础备份（基础备份data目录）： $ /home/postgres/db/standby/pgsql/bin/createdb mydb --port=6432 mydb=# SELECT pg_start_backup('bak20120220'); $ cd /home/postgres/db/ standby /pgsql/ $ tar czvf /home/postgres/base/ base_data_switchover.tar.gz data/ mydb=# SELECT pg_stop_backup(); mydb=# select pg_switch_xlog(); 然后拷贝到老master那里： $ cd /home/postgres/db/master/pgsql/ $ mv data data_bk $ tar -xzvf /home/postgres/base/ base_data_switchover.tar.gz $ rm -r data/pg_xlog/ $ mkdir -p data/pg_xlog/archive_status $ rm data/postmaster.pid $ cp /home/postgres/db/master/pgsql/data_bk/postgresql.conf /home/postgres/db/master/pgsql/data/postgresql.conf 然后把postgresql.conf下面的几行都注释掉： #wal_level = archive #archive_mode = on #archive_command = 'cp %p /home/postgres/archive_failover/%f' 并为原有的master新建recovery.conf文件，内容和原来的standby的recovery.conf文件类似（只是使用不同的/home/postgres/ archive_failover /目录,trigger也变为 /home/postgres/trigger/ pgsql.trigger.5432 ）： standby_mode = on restore_command = '/home/postgres/db/ master /pgsql/bin/pg_standby -d -s 2 -t /home/postgres/trigger/ pgsql.trigger.5432 /home/postgres/ archive_failover %f %p %r' recovery_end_command = 'rm -f /home/postgres/trigger/ pgsql.trigger.5432 ' 然后启动原master，此时以standby模式运行： $ /home/postgres/db/master/pgsql/bin/postmaster -D /home/postgres/db/master/pgsql/data 查看是否一切正常。即你在新的master里做DML如插入新数据操作时，在新的standby里是否看到在用日志恢复。（D）如果有必要，我们再来一次主库和备库角色切换的操作（推荐，但此处不再演示）。即关闭新主库，激活老主库。写到这里你可能还有一些疑问，应用服务器有什么机制可以探测master down掉？有没自动化工具来对Failover和Switchover阶段自动执行？PostgreSQL9.1的25.3. Failover 中只是简单的说明了一下，本身不是很详细，本文不再讨论，有兴趣的朋友可以专门写写。至此完毕。参考： PostgreSQL 9.1 Allow standby recovery to switch to a new timeline automatically Postgresql 9.0 Replication - Step by Step ，看不到该文的，请看转载 http://blog.chinaunix.net/link.php?url=http://www.pgcon.org%2F2008%2Fschedule%2Fevents%2F76.en.html Instagram 架构分析笔记加我私人微信，交流技术。

个人分类: postgresql|11445 次阅读|0 个评论

[转载]PostgreSQL可以弥补MySQL数据库的哪些缺点

pgwater 2011-12-12 17:20

PostgreSQL可以弥补MySQL数据库的哪些缺点 (2010-04-05 22:27) 分类： 01.入门 PostgreSQL由于是类似 Oracle 的多进程框架，所以能支持高并发的应用场景，这点与Oracle数据库很像，所以把Oracle DBA 转到PostgreSQL数据库上是比较容易的，毕竟PostgreSQL数据库与Oracle数据库很相似。同时，PostgreSQL数据库的源代码要比MySQL数据库的源代码更容易读懂，如果团队的C语言能力比较强的知，就能在PostgreSQL数据库上做开发，比方说实现类似greenplum的系统，这样也能与现在的分布式趋势接轨。为了说明PostgreSQL的功能，我下面简要对比一下PostgreSQL数据库与MySQL数据库之间的差异：我们先借助Jametong翻译的"从Oracle迁移到Mysql之前必须知道的50件事"，看一看如何把Oracle转到MySQL中的困难： 50 things to know before migrating Oracle to MySQL by Baron Schwartz,Translated by Jametong 1. 对子查询的优化表现不佳. 2. 对复杂查询的处理较弱 3. 查询优化器不够成熟 4. 性能优化工具与度量信息不足 5. 审计功能相对较弱 6. 安全功能不成熟,甚至可以说很粗糙.没有用户组与角色的概念,没有回收权限的功能(仅仅可以授予权限).当一个用户从不同的主机/网络以同样地用户名/密码登录之后,可能被当作完全不同的用户来处理.没有类似于Oracle的内置的加密功能. 7. 身份验证功能是完全内置的.不支持LDAP,Active Directory以及其它类似的外部身份验证功能. 8. Mysql Cluster可能与你的想象有较大差异. 9. 存储过程与触发器的功能有限. 10. 垂直扩展性较弱. 11. 不支持MPP(大规模并行处理). 12. 支持SMP(对称多处理器),但是如果每个处理器超过4或8个核(core)时,Mysql的扩展性表现较差. 13. 对于时间、日期、间隔等时间类型没有秒以下级别的存储类型. 14. 可用来编写存储过程、触发器、计划事件以及存储函数的语言功能较弱. 15. 没有基于回滚(roll-back)的恢复功能,只有前滚(roll-forward)的恢复功能. 16. 不支持快照功能. 17. 不支持数据库链(database link).有一种叫做Federated的存储引擎可以作为一个中转将查询语句传递到远程服务器的一个表上,不过,它功能很粗糙并且漏洞很多. 18. 数据完整性检查非常薄弱,即使是基本的完整性约束，也往往不能执行。 19. 优化查询语句执行计划的优化器提示非常少. 20. 只有一种表连接类型:嵌套循环连接(nested-loop),不支持排序-合并连接(sort-merge join)与散列连接(hash join). 21. 大部分查询只能使用表上的单一索引;在某些情况下,会存在使用多个索引的查询,但是查询优化器通常会低估其成本,它们常常比表扫描还要慢. 22. 不支持位图索引(bitmap index).每种存储引擎都支持不同类型的索引.大部分存储引擎都支持B-Tree索引. 23. 管理工具较少,功能也不够成熟. 24. 没有成熟能够令人满意的IDE工具与调试程序.可能不得不在文本编辑器中编写存储过程,并且通过往表(调试日志表)中插入记录的方式来做调试. 25. 每个表都可以使用一种不同的存储引擎. 26. 每个存储引擎在行为表现、特性以及功能上都可能有很大差异. 27. 大部分存储引擎都不支持外键. 28. 默认的存储引擎(MyISAM)不支持事务,并且很容易损坏. 29. 最先进最流行的存储引擎InnoDB由Oracle拥有. 30. 有些执行计划只支持特定的存储引擎.特定类型的Count查询,在这种存储引擎中执行很快,在另外一种存储引擎中可能会很慢. 31. 执行计划并不是全局共享的,,仅仅在连接内部是共享的. 32. 全文搜索功能有限, 只适用于非事务性存储引擎. Ditto用于地理信息系统/空间类型和查询. 33. 没有资源控制.一个完全未经授权的用户可以毫不费力地耗尽服务器的所有内存并使其崩溃,或者可以耗尽所有 CPU 资源. 34. 没有集成商业智能(business intelligence), OLAP **数据集等软件包. 35. 没有与Grid Control类似的工具( http://solutions. mysql .com/go. php ?id=1296t=s ) 36. 没有类似于RAC的功能.如果你问”如何使用Mysql来构造RAC”,只能说你问错了问题. 37. 不支持用户自定义类型或域(domain). 38. 每个查询支持的连接的数量最大为61. 39. MySQL支持的SQL语法(ANSI SQL标准)的很小一部分.不支持递归查询、通用表表达式（Oracle的with 语句）或者窗口函数（分析函数）.支持部分类似于Merge或者类似特性的SQL语法扩展,不过相对于Oracle来讲功能非常简单. 40. 不支持功能列(基于计算或者表达式的列,Oracle11g 开始支持计算列,以及早期版本就支持虚列(rownum,rowid)). 41. 不支持函数索引,只能在创建基于具体列的索引. 42. 不支持物化视图. 43. 不同的存储引擎之间,统计信息差别很大,并且所有的存储引擎支持的统计信息都只支持简单的基数(cardinality)与一定范围内的记录数(rows-in-a-range). 换句话说，数据分布统计信息是有限的.更新统计信息的机制也不多. 44. 没有内置的负载均衡与故障切换机制. 45. 复制(Replication)功能是异步的,并且有很大的局限性.例如,它是单线程的(single-threaded),因此一个处理能力更强的Slave的恢复速度也很难跟上处理能力相对较慢的Master. 46. Cluster并不如想象的那么完美.或许我已经提过这一点,但是这一点值得再说一遍. 47. 数据字典(INFORMATION_SCHEMA)功能很有限,并且访问速度很慢(在繁忙的系统上还很容易发生崩溃). 48. 不支持在线的Alter Table操作. 49. 不支持Sequence. 50. 类似于ALTER TABLE或CREATE TABLE一类的操作都是非事务性的.它们会提交未提交的事务,并且不能回滚也不能做灾难恢复.Schame被保存在文件系统上,这一点与它使用的存储引擎无关. PostgreSQL数据库可以解决以上问题中的： 1. 对子查询的优化表现不佳 2. 对复杂查询的处理较弱 3. 查询优化器不够成熟 PostgreSQL完全支持SQL-92标准，对SQL的支持也很全面，可以支持复杂的SQL查询。 4. 性能优化工具与度量信息不足 PostgreSQL提供了执行计划和详细的cost值，可以方便看到SQL的执行效率。 9. 存储过程与触发器的功能有限. PostgreSQL提供了完善的存储过程和触发器支持。 11. 不支持MPP(大规模并行处理) 而PostgreSQL是类似Oracle数据库的架构，是多进程的架构，而不像MySQL是多线程的架构，所以能支持MPP。 18. 数据完整性检查非常薄弱,即使是基本的完整性约束，也往往不能执行。 PostgreSQL提供完善的数据完整性检查机制，支持外键。 20. 只有一种表连接类型:嵌套循环连接(nested-loop),不支持排序-合并连接(sort-merge join)与散列连接(hash join). 而PostgreSQL则支持这些表连接类型 21. 大部分查询只能使用表上的单一索引;在某些情况下,会存在使用多个索引的查询,但是查询优化器通常会低估其成本,它们常常比表扫描还要慢. PostgreSQL数据不存在这个问题，假设表T的两个字段col1的col2上有两个索引,idx_1和idx_2，那么select * from t where col1=:a and col2=:b;查询时，PostgreSQL数据库有可能把这个查询转化为select * from t where col1=:a intersect select * from t where col2=:b，这样两个索引都可以使用上。 25. 每个表都可以使用一种不同的存储引擎. 26. 每个存储引擎在行为表现、特性以及功能上都可能有很大差异. 27. 大部分存储引擎都不支持外键. 28. 默认的存储引擎(MyISAM)不支持事务,并且很容易损坏. 29. 最先进最流行的存储引擎InnoDB由Oracle拥有. 30. 有些执行计划只支持特定的存储引擎.特定类型的Count查询,在这种存储引擎中执行很快,在另外一种存储引擎中可能会很慢. PostgreSQL只有一种存储引擎，所以不存在上面的情况。而PostgreSQL支持完善的事务。 32. 全文搜索功能有限, 只适用于非事务性存储引擎. Ditto用于地理信息系统/空间类型和查询. PostgreSQL数据库支持全文搜索，支持更多类型的索引，如B-tree,R-tree, Hash, GiST, GIN，R-tree,GIST,GIN索引可用于空间类型和查询。 37. 不支持用户自定义类型或域(domain). PostgreSQL支持丰富的类型，同时也支持自定义类型。 39. MySQL支持的SQL语法(ANSI SQL标准)的很小一部分.不支持递归查询、通用表表达式（Oracle的with 语句）或者窗口函数（分析函数）.支持部分类似于Merge或者类似特性的SQL语法扩展,不过相对于Oracle来讲功能非常简单. 这些PostgreSQL数据库都支持，如窗口函数。 41. 不支持函数索引,只能在创建基于具体列的索引. PostgreSQL支持函数索引 49. 不支持Sequence. PostgreSQL支持sequence 50. 类似于ALTER TABLE或CREATE TABLE一类的操作都是非事务性的.它们会提交未提交的事务,并且不能回滚也不能做灾难恢复.Schame被保存在文件系统上,这一点与它使用的存储引擎无关. PostgreSQL不存在这个问题。

个人分类: postgres|3945 次阅读|0 个评论

PostgreSQL：最高级的开源数据库

热度 2 outcrop 2011-10-25 18:17

PostgreSQL官方网站宣称： PostgreSQL: The world's most advanced open source database ——这句话不是吹牛。 PostgreSQL很多优秀的特性与表现，超过了Mysql甚至诸多商业数据库。PostgreSQL能运行于Windows、Linux以及BSD等平台，详细特性请参考其官方网站： http://www.postgresql.org/ 但遗憾的是，这款来自加州大学伯克利分校计算机系的数据库，在国内得到的关注很少，在中国的学术界关注可能更少——我曾经搜索过期刊网。用着盗版的MSSQL、Oracle等产品，而放弃了如此优秀的一款开源数据库，实在是让人心痛。值得一提的是，PostgreSQL是以BSD授权发布，特别适合科研。作为一名微软认证数据库管理员（MCDBA），我向您郑重推荐 PostgreSQL。嗯，还有Mysql也不错。

个人分类: 开放源代码工具|6105 次阅读|8 个评论

rhel6+postgresql8.4+postgis1.4+eclipse CDT3.6 调试环境搭建

hillpig 2011-4-26 16:11

上次写过在Ubuntu9上面的搭建环境（ http://blogold.chinaunix.net/u2/81513/showart_2168880.html ），这次写写在Red Hat Enterprise Linux 6（RHEL 6）上面的调试搭建环境，如无特殊说明，下面均在32位机器上进行。首先说明一下，我是放弃了在CentOS 5.6上搭建环境之后才选择RHEL 6的，整整折腾了2天的bug终于放弃在CentOS 5.6上继续鼓捣，RHEL 6的可靠性、易用性让我最终选择了它，尽管搭建环境解决其中出现的bug我做了两遍实验，折腾了足足3天。但是经过下面的步骤，您可以避免大量的环境设置问题，约在6个小时左右的时间全部设置完毕。注意：本文如无特殊说明，下载的安装程序均为源代码，之所以没有使用rpm包，这是血的教训，最主要的原因是库依赖问题。 1.安装RHEL 6 下载DVD光盘： http://www.ha97.com/3710.html 2.7G，2M的带宽，用迅雷5下大概3.5小时，下载完毕后一定要使用md5工具（dos下的命令行工具md5sum ）看看是否文件完整，我就曾经用firefox的下载的不完整过一次，害得我白刻录了一张DVD。然后用光盘刻录工具（windows下的如nero，8.0从这里下载），刻录成光盘以便安装。安装过程具体参考（焦振清. RHEL6安装配置图片详解）,我安装的英文版（如果用到中文的话，只需要设置输入法足够），这里主要说明的有：创建postgres账户，以后用该账户作为调试使用在“自定义安装的服务器类型”中做如下选择（图片不清楚，可以拷贝图片地址，在新页面中查看）：上图中，一定要选择“Desktop”和“Customize now”，接下来, 语言增加中文：在Development里选择：Additional Development, Development Tools, Server Platform Development三项，如下图：还要注意，如果您今后还想装VirtualBox，一定要把Additional Development中的Optional pcakages中的SDL-devel-1.2.14-2.el6.i686包选中，默认的是不选的。说明：如果您在rhel6安装时没有选择该SDL-devel-1.2.14-2.el6.i686包，在以后需要装VirtualBox需要依赖该包时，按照如下步骤安装：准备好rhel6的ISO安装光盘，例如在如下目录： /home/yourname/Downloads/rhel-server-6.0-i386-dvd.iso 挂载 mount -t iso9660 -o loop /home/yourname/Downloads/rhel-server-6.0-i386-dvd.iso /media/cdrom/ 在System-Administration-Add/Remove Software里，搜索SDL，勾选SDL-devel-1.2.14-2.el6.i686安装即可。好，安装完成后重启，使用postgres账户登录。 2.安装Postgreql 8.4 说明：之所以先把Postgresql安装起来，是因为先让它跑起来，后面再用Eclipse CDT设置调试环境。下载源代码： http://wwwmaster.postgresql.org/download/mirrors-ftp/source/v8.4.8/postgresql-8.4.8.tar.gz 到自己的目录里，如/home/postgres/development/ 。先看安装说明：在文档 http://www.postgresql.org/docs/8.4/interactive/install-requirements.html ，中说明需要的安装前置条件如下： You need an ISO/ANSI C compiler (at least C89-compliant). Recent versions of GCC ... tar is required to unpack the source distribution, in addition to either gzip or bzip2. In addition, gzip is required to install the documentation. The GNU Readline library is used by default. It allows psql (the PostgreSQL command line SQL interpreter) to remember each command you type,... The zlib compression library will be used by default.... 我们通过如下命令来看看我们安装的RHEL6是否已经满足这些条件，下面的命令分别对应上面的要求： gcc --version 我们发现：gcc (GCC) 4.4.4 20100726 (Red Hat 4.4.4-13) tar --version 我们发现：tar (GNU tar) 1.23 通过系统菜单System-Administration-Add/Remove Software 搜索Readline，我们发现：readline-6.0-3.el6 通过系统菜单System-Administration-Add/Remove Software 搜索zlib，我们发现：zlib-1.2.3-25.el6等几个文件好，条件全部满足，我们开始准备安装PostgreSQL： #2.1切换到root用户,创建postgresql默认的安装目录（root用户）： su #此处需要输入root的密码 mkdir /usr/local/pgsql chown postgres /usr/local/pgsql #2.2编译安装（注意：postgres 用户）。说明：尽管Postgresql的文档中使用gmake来编译，此处我们仍然用make方法：在上面的命令行窗口中，退出root： exit 进入postgres环境： cd /home/postgres/development/ tar zxf postgresql-8.4.8.tar.gz cd postgresql-8.4.8 ./configure --enable-depend --enable-cassert --enable-debug make make install #2.3启动Postgresql服务器，看是否成功（postgres用户）：命令行执行： /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data logfile 21 另外启动一个命令窗口,创建一个测试数据库实例mydb： /usr/local/pgsql/bin/createdb mydb 然后： /usr/local/pgsql/bin/psql mydb mydb=#select version(); 看看是否成功？若成功，别先高兴，革命成功了十分之一。不成功，您给我留言。 #2.4 设置PATH路径，重启机器，再次进入postgres用户（postgres用户）：修改PATH的目的是以后不需要每次执行命令都输入全路径，如/usr/local/pgsql/bin/postmaster 命令设置完之后，只需要输入postermaster就可以了。 gedit /home/postgres/.bash_profile 原有的PATH=....,在最后增加路径/usr/local/pgsql/bin，修改后的为： PATH=$PATH:$HOME/bin:/usr/local/pgsql/bin 保存后重启，再次用postgres账户登录。 3.安装并设置Eclipse CDT 3.6 若仁兄您不喜欢用IDE开发环境，就喜欢用命令行调试，那您可以参考我另一篇文章：使用GDB和DDD调试POSTGRESQL 。下载Eclipse CDT 3.6（我用的Galileo）： http://www.eclipse.org/downloads/packages/eclipse-ide-cc-developers/galileosr2 ，假定保存在/home/postgres/development/，并解压缩安装前提：jdk1.6，由于RHEL 6已经安装了java-1.6.0-openjdk-1....,满足该条件。然后，我们新建一个Eclipse的工作区目录：/home/postgres/workspace 为方便使用，建议在桌面建立启动快捷方式，注意创建的时候在Command选项填入： env GDK_NATIVE_WINDOWS=1 /home/postgres/develop/eclipse/eclipse 加入env GDK_NATIVE_WINDOWS=1的原因是为了避免启动不了图形界面（此问题在Ubuntu9里面存在，我没试过在RHEL 6里是否仍然有问题，故为保险起见，加入该参数）最后，根据 http://wiki.postgresql.org/wiki/Working_with_Eclipse 设置Eclipse的调试环境。该文档非常细致，建议自己仔细看看（主要是要注意设置路径，见下面Arguemnts设置），这里再简单说明一下重点：一般会设定两个调试配置，一个是为Postmaster进程调试使用的，即数据库从启动到接收sql客户端请求的调试，另外一个调试配置是为postgres进程使用的，我们都知道，当客户端sql请求到达服务端的时候，postmaster会fork一个postgres后台进程，这就是为什么调试postgres的原因。 3.1首先我们来看第一个为Postmaster进程调试使用的配置，参考下图，在C/C++ Application选项下建立调试实例：另外注意的一点是由于我们的Postgresql 默认的数据库运行目录为/usr/local/pgsql/data：注意设定Arguemnts选项为： -D /usr/local/pgsql/data 如下图： 3.2 第二个调试使用的配置是为调试Postmaster进程使用的，如下图，在C/C++ Attach to Application选项下建立调试实例：那具体怎么调试呢，首先您需要开启两个命令窗口：一个是Postmaster数据库服务器： /usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data 另外一个： /usr/local/pgsql/bin/psql mydb 然后转到Eclipse里运行为postgres设置的这个调试设置，过一小会后，是不是关联上了，这时您在从psql里执行一条sql语句，是不是可以在你设置的断点处调试了。 4.安装并设置PostGIS 1.4 如果您的工作和PostGIS没有关系，请略过此节。下载PostGIS 1.4.2： http://postgis.refractions.net/download/postgis-1.4.2.tar.gz ，假定保存在/home/postgres/development/，并解压缩在该目录下。另外PostGIS 1.5我也试过，也可以调试。安装前置条件：我们根据文档： http://postgis.refractions.net/documentation/manual-1.4/ch02.html#id2701180 ： Required： PostgreSQL 8.2 or higher... GNU C compiler (gcc). ... GNU Make (gmake or make).... Proj4 reprojection library, version 4.5.0 or greater. The Proj4 library is used to provide coordinate reprojection support within PostGIS. Proj4 is available for download from http://trac.osgeo.org/proj/ . GEOS geometry library, version 3.0.0 or greater. The GEOS library is used to provide geometry tests (ST_Touches(), ST_Contains(), ST_Intersects()) and operations (ST_Buffer(), ST_Union(), ST_Difference()) within PostGIS. GEOS is available for download from http://trac.osgeo.org/geos/ . 知，前三项我们都已经安装，故还需要安装 Libgeos, Proj，如上面所述，这两个软件下载地址如下： http://download.osgeo.org/proj/proj-4.7.0.tar.gz http://download.osgeo.org/geos/geos-3.2.2.tar.bz2 下载下来之后假定保存在/home/postgres/development/，并解压缩在该目录下。 4.1 安装proj4 （postres账户） su mkdir /usr/local/proj4 chown postgres /usr/local/proj4 exit cd /home/postgres/development tar zxf proj-4.7.0.tar.gz cd proj-4.7.0 ./configure --prefix=/usr/local/proj4 make make install 4.2 安装geos （postres账户） su mkdir /usr/local/geos chown postgres /usr/local/geos exit cd /home/postgres/development tar jxf geos-3.1.1.tar.bz2 cd geos-3.1.1 ./configure --prefix=/usr/local/geos make make install 4.3 安装PostGIS （postres账户） su mkdir /usr/local/postgis chown postgres /usr/local/postgis exit cd /home/postgres/development tar zxf postgis-1.4.2.tar.gz cd postgis-1.4.2 #下面的 --prefix=/usr/local/postgis 似乎没起作用（尽管如此，还是加入该参数），结果是需要使用root权限进行安装 ./configure --prefix=/usr/local/postgis --with-pgconfig=/usr/local/pgsql/bin/pg_config --with-projdir=/usr/local/proj4 --with-geosconfig=/usr/local/geos/bin/geos-config --enable-debug make make install 4.4 设置库搜索路径（切换到root账户，设置完后需重启机器）首先切换到root账户，注意这里不是用su切换，而是使用左面右上角的postgres下面的Switch user按钮 gedit /etc/ld.so.conf 在最后添加两行（注意：只需要两行）： /usr/local/geos/lib /usr/local/proj4/lib 保存后，运行ldconfig命令，使以上修改生效: /sbin/ldconfig 重启机器，再次使用postgres账户登入。 4.5 测试一下在mydb数据库实例中导入对PostGIS的支持（postres账户）启动一个命令窗口： postmaster -D /usr/local/pgsql/data 再启动一个命令窗口 createlang plpgsql mydb psql -d mydb -f /usr/local/pgsql/share/contrib/postgis.sql psql -d mydb -f /usr/local/pgsql/share/contrib/spatial_ref_sys.sql 如果没有问题，则表示PostGIS成功（要知道我历经千辛万苦才没有错误的到这一步的，都是geos，proj4库路径找不到惹得祸）。关闭数据库服务： /usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data 以后统一使用如下命令来启动数据库服务器： postmaster -D /usr/local/pgsql/data 4.6 按照和PostGIS方式，在Eclipse CDT里同样建立一个调试项目可以类似根据 http://wiki.postgresql.org/wiki/Working_with_Eclipse 设置Eclipse对PostGIS的调试环境，此处略。 5. 安装pgAdmin 3 这里安装pgAdmin 3，这是为了方便客户端使用。下载pgAdmin 3 v1.12.2 ： http://wwwmaster.postgresql.org/download/mirrors-ftp/pgadmin3/release/v1.12.2/src/pgadmin3-1.12.2.tar.gz ，假定保存在/home/postgres/development/，并解压缩在该目录下。安装前置条件：我们根据文档解压后的源代码目录中的安装说明文档（/home/postgres/development/pgadmin3-1.12.2/install） : You will need: - A modern development environment: - GTK 2.2 or above. - GNU automake 1.9 or above. - GNU autoconf 2.59 or above. - GNU gcc 3.4 or above. - wxGTK 2.8.x from http://www.wxwidgets.org/ - libxml2 2.6.18 or above from http://www.xmlsoft.org/ - libxslt 1.1.x or above from http://www.xmlsoft.org/ - PostgreSQL 8.4 or above from http://www.postgresql.org/ 我们一样一样看，GTK2.2，GNU那几项RHEL6都已经默认安装，wxGTK 没有，libxml2有，libxslt已经安装，故我们还需要安装wxGTK 2.8.12，下载地址： http://sourceforge.net/projects/wxwindows/files/2.8.12/wxGTK-2.8.12.tar.gz/download ，假定保存在/home/postgres/development/，并解压缩在该目录下。 5.1 安装wxGTK 2.8.12 cd /home/postgres/development tar zxf wxGTK-2.8.12.tar.gz cd wxGTK-2.8.12 ./configure --with-gtk --enable-gtk2 --enable-unicode make su -c make install # Install wxWidgets contrib modules. cd contrib/ make su -c make install 首先切换到root账户，注意这里不是用su切换，而是使用左面右上角的postgres下面的Switch user按钮 gedit /etc/ld.so.conf 在最后添加一行： /usr/local/lib 保存后，运行ldconfig命令，使以上修改生效: /sbin/ldconfig 退出root账户，回到postgres账户中。 5.2 安装pgAdmin3 cd /home/postgres/development tar zxf pgadmin3-1.12.2.tar.gz cd pgadmin3-1.12.2 ./configure make su -c make install 5.3 桌面建立一个launcher 设置命令路径为：/usr/local/pgadmin3/bin/pgadmin3 6. 安装QGIS 1.6.0 (未完待续，暂时出错，请不要使用下面代码) QGIS 1.6.0 主要用于查看空间数据库，下载地址： http://qgis.org/downloads/qgis-1.6.0.tar.bz2 ，假定保存在/home/postgres/development/，并解压缩在该目录下。根据解压缩后的install文件说明，QGIS所依赖的库为： Required build deps: - Qt = 4.4.0 - Proj = 4.4.x - GEOS = 3.0 - Sqlite3 = 3.0.0 - GDAL/OGR = 1.4.x - Qwt = 5.0 其中Proj, GEOS, Sqlite3,都已经安装，而Qt 默认安装版本是3，GDAL和Qwt没有安装。故先下载： GDAL 1.8.0 http://download.osgeo.org/gdal/gdal180.zip qwt 5.2.1 http://sourceforge.net/projects/qwt/files/qwt/5.2.1/qwt-5.2.1.tar.bz2/download 6.1 安装GDAL 1.8 cd /home/postgres/development tar zxf gdal180.zip cd gdal-1.8.0 ./configure make su -c make install 6.2 安装qwt 5.2.1 cd /home/postgres/development tar zxf qwt-5.2.1.tar.bz2 cd qwt-5.2.1 qmake make su -c make install 6.3 安装qgis 1.6.0 参考：在Linux环境下编译安装配置PostGIS/PostgreSQL全过程加我私人微信，交流技术。

个人分类: postgresql|10369 次阅读|0 个评论

灌实验数据

hillpig 2010-9-9 00:15

1.初始化 POSTGRESQL 数据目录和原始数据库：打开cmd，执行： initdb -D /usr/local/pgsql/data postgres -D /usr/local/pgsql/data logfile 21 createdb mydb createlang plpgsql mydb 2.安装PostGIS, 打开cmd，执行： cd ~/develop/postgis-1.4.1 ./configure --enable-depend --enable-cassert --enable-debug make make install psql -d mydb -f /usr/local/pgsql/share/contrib/postgis.sql psql -d mydb -f /usr/local/pgsql/share/contrib/spatial_ref_sys.sql 3.创建pois表启动Postmaster cp /home/postgres/develop/sql/pg_hba.conf /usr/local/pgsql/mydb/pg_hba.conf cp /home/postgres/develop/sql/postgresql.conf /usr/local/pgsql/data/postgresql.conf postmaster -D /usr/local/pgsql/data 再开一个cmd，执行： CREATE TABLE pois ( uid integer not null, name VARCHAR(128), catcode VARCHAR(32) not null, catname VARCHAR(32), others VARCHAR(32) ) WITH ( OIDS = FALSE ) ; SELECT AddGeometryColumn('pois', 'location', 4214, 'POINT', 2); 然后开我的工具，灌数据。完毕。加我私人微信，交流技术。

个人分类: postgresql|3305 次阅读|0 个评论

group by 语句从parsetree到portal全过程内存结构

hillpig 2010-5-5 11:09

上次写完了最简单的： select catcode from pois; Postgresql Executor 阶段内存结构，这回我们重新开始看看group by语句的内存结构，从parsetree，到plantree到portal执行中的所有相关的内存结构。先定义我们的表，这回数据表稍微比原先的复杂些： CREATE TABLE pois ( uid integer not null, name VARCHAR(128), catcode VARCHAR(32) not null, catname VARCHAR(32), others VARCHAR(32) ); 执行 select count(*) from pois group by catcode; 1.parsetree debug得到的parsetree list如下： parsetree_list 0x08e39a5c type T_List length 1 head 0x08e39a48 data {...} ptr_value 0x08e399bc type T_SelectStmt distinctClause 0x00000000 intoClause 0x00000000 targetList 0x08e3988c type T_List length 1 head 0x08e39878 data {...} ptr_value 0x08e3984c type T_ResTarget name 0x00000000 indirection 0x00000000 val 0x08e39820 type T_FuncCall funcname 0x08e39804 type T_List length 1 head 0x08e397f0 data {...} ptr_value 0x08e397dc type T_String val {...} ival 149133256 str 0x08e397c8 int_value 149133276 oid_value 149133276 next 0x00000000 tail 0x08e397f0 args 0x00000000 agg_star true agg_distinct false func_variadic false over 0x00000000 location 7 location 7 int_value 149133388 oid_value 149133388 next 0x00000000 tail 0x08e39878 fromClause 0x08e398fc type T_List length 1 head 0x08e398e8 data {...} ptr_value 0x08e398bc type T_RangeVar catalogname 0x00000000 schemaname 0x00000000 relname 0x08e398a8 *relname 'p' inhOpt INH_DEFAULT istemp false alias 0x00000000 location 21 int_value 149133500 oid_value 149133500 next 0x00000000 tail 0x08e398e8 whereClause 0x00000000 groupClause 0x08e399a0 type T_List length 1 head 0x08e3998c data {...} ptr_value 0x08e3992c type T_ColumnRef fields 0x08e39970 type T_List length 1 head 0x08e3995c data {...} ptr_value 0x08e39948 type T_String val {...} ival 149133592 str 0x08e39918 *str 'c' int_value 149133640 oid_value 149133640 next 0x00000000 tail 0x08e3995c location 35 int_value 149133612 oid_value 149133612 next 0x00000000 tail 0x08e3998c havingClause 0x00000000 windowClause 0x00000000 withClause 0x00000000 valuesLists 0x00000000 sortClause 0x00000000 limitOffset 0x00000000 limitCount 0x00000000 lockingClause 0x00000000 op SETOP_NONE all false larg 0x00000000 rarg 0x00000000 int_value 149133756 oid_value 149133756 next 0x00000000 tail 0x08e39a48 parstree如下图：其中：主要是多了group by 子句，及targetlist中的funccall数据结构，注意到funccall指向的Value中count函数并没有包含参数catcode。 2.Querytree 这回我们先通过日志（关于如何查看postgresql的日志，请参考配置postgresql服务器以查看日志）看看query是什么样子： DETAIL: {QUERY :commandType 1 :querySource 0 :canSetTag true :utilityStmt :resultRelation 0 :intoClause :hasAggs true :hasWindowFuncs false :hasSubLinks false :hasDistinctOn false :hasRecursive false :cteList :rtable ( {RTE :alias :eref {ALIAS :aliasname pois :colnames (uid name catcode catname others location) } :rtekind 0 :relid 17229 :inh true :inFromCl true :requiredPerms 2 :checkAsUser 0 :selectedCols (b 11) :modifiedCols (b) } ) :jointree {FROMEXPR :fromlist ( {RANGETBLREF :rtindex 1 } ) :quals } :targetList ( {TARGETENTRY :expr {AGGREF :aggfnoid 2803 :aggtype 20 :args :agglevelsup 0 :aggstar true :aggdistinct false :location 7 } :resno 1 :resname count :ressortgroupref 0 :resorigtbl 0 :resorigcol 0 :resjunk false } {TARGETENTRY :expr {VAR :varno 1 :varattno 3 :vartype 1043 :vartypmod 36 :varlevelsup 0 :varnoold 1 :varoattno 3 :location 35 } :resno 2 :resname :ressortgroupref 1 :resorigtbl 0 :resorigcol 0 :resjunk true } ) :returningList :groupClause ( {SORTGROUPCLAUSE :tleSortGroupRef 1 :eqop 98 :sortop 664 :nulls_first false } ) :havingQual :windowClause :distinctClause :sortClause :limitOffset :limitCount :rowMarks :setOperations } 值得注意的是groupclause中的eqop 98 和sortop 664 项。从select * from pg_operator where oid = 98; 可以查到： 98;=;11;10;b;TRUE;TRUE;25;25;16;98;531;texteq;eqsel;eqjoinsel 可见该操作符对字符串（我们的catecode就属于该类）运算等于操作从select * from pg_operator where oid = 664; 可以查到： 664;;11;10;b;FALSE;FALSE;25;25;16;666;667;text_lt;scalarltsel;scalarltjoinsel 可见，用于排序的操作符为，且对text类型的数据进行排序。另外一个参数是targetlist中aggfnoid 2803 mydb=# select oid,proname from pg_proc where proname = 'count'; oid | proname ------+--------- 2803 | count 2147 | count (2 rows) 发现2803为count函数。另外一个问题是在parsetree中我们只有count(*)一个target，为什么到了querytree里变成了两个了,实际上单单解析parsetree的targetlist只得到第一个，接着解析group by的时候才增加的第2个。可以通过如下调用明白其中的道理： postgresql Thread gdb Debugger (5/4/10 3:42 PM) (Suspended) Thread (Suspended) 13 findTargetlistEntrySQL99() /home/postgres/develop/.../src/backend/parser/parse_clause.c:1423 12 transformGroupClause() /home/postgres/develop/.../src/backend/parser/parse_clause.c:1457 11 transformSelectStmt() /home/postgres/develop/.../src/backend/parser/analyze.c:838 10 transformStmt() /home/postgres/develop/.../src/backend/parser/analyze.c:190 0x08127e3f 9 parse_analyze() /home/postgres/develop/.../src/backend/parser/analyze.c:93 0x0812a32f 8 pg_analyze_and_rewrite() /home/postgres/develop/.../src/backend/tcop/postgres.c:606 7 exec_simple_query() /home/postgres/develop/.../src/backend/tcop/postgres.c:918 6 PostgresMain() /home/postgres/develop/.../src/backend/tcop/postgres.c:3617 0x0827a75e 5 BackendRun() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:3449 4 BackendStartup() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:3063 3 ServerLoop() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:1387 2 PostmasterMain() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:1040 1 main() /home/postgres/develop/.../src/backend/main/main.c:188 在findTargetlistEntrySQL99（）中： static TargetEntry *findTargetlistEntrySQL99(ParseState *pstate, Node *node, List **tlist){ TargetEntry *target_result; ListCell *tl; Node *expr; /* * Convert the untransformed node to a transformed expression, and search * for a match in the tlist. NOTE: it doesn't really matter whether there * is more than one match. Also, we are willing to match an existing * resjunk target here, though the SQL92 cases above must ignore resjunk * targets. */ expr = transformExpr(pstate, node); foreach(tl, *tlist) { TargetEntry *tle = (TargetEntry *) lfirst(tl); if (equal(expr, tle-expr)) return tle; } /* * If no matches, construct a new target entry which is appended to the * end of the target list. This target is given resjunk = TRUE so that it * will not be projected into the final tuple. */ target_result = transformTargetEntry(pstate, node, expr, NULL, true); //就是这里实现的 *tlist = lappend(*tlist, target_result); return target_result; } 本段小结，我们画出我们的query tree如下：此图中，我们注意到第2项targetentry的ressortgroupref =1和resjunk = true。关于ressortgroupref ，注释做了很好的说明： Index ressortgroupref;/* nonzero if referenced by a sort/group clause */ 注意Index ressortgroupref=1 最终赋值为1（初始化为0）是在如下assignSortGroupRef() 函数中，对应的堆栈为： postgresql Thread gdb Debugger (5/4/10 4:46 PM) (Suspended) Thread (Suspended) 14 assignSortGroupRef() /home/postgres/develop/.../src/backend/parser/parse_clause.c:2091 13 addTargetToGroupList() /home/postgres/develop/.../src/backend/parser/parse_clause.c:2062 12 transformGroupClause() /home/postgres/develop/.../src/backend/parser/parse_clause.c:1497 11 transformSelectStmt() /home/postgres/develop/.../src/backend/parser/analyze.c:838 10 transformStmt() /home/postgres/develop/.../src/backend/parser/analyze.c:190 0x08127e3f 9 parse_analyze() /home/postgres/develop/.../src/backend/parser/analyze.c:93 0x0812a32f 8 pg_analyze_and_rewrite() /home/postgres/develop/.../src/backend/tcop/postgres.c:606 7 exec_simple_query() /home/postgres/develop/.../src/backend/tcop/postgres.c:918 6 PostgresMain() /home/postgres/develop/.../src/backend/tcop/postgres.c:3617 0x0827a75e 5 BackendRun() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:3449 4 BackendStartup() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:3063 3 ServerLoop() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:1387 2 PostmasterMain() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:1040 1 main() /home/postgres/develop/.../src/backend/main/main.c:188 0x081e7ee7 关于resjunk = true，即最终不发送到客户端。另外，rewritten parse tree:阶段没有对我们的parse tree做任何修改，因为我们没有针对view的数据表，所以没有重写规则。 3.Plantree 先看日志： STATEMENT: select count(*) from pois group by catcode; LOG: plan: DETAIL: {PLANNEDSTMT :commandType 1 :canSetTag true :planTree {AGG :startup_cost 19.07 :total_cost 19.11 :plan_rows 3 :plan_width 2 :targetlist ( {TARGETENTRY :expr {AGGREF :aggfnoid 2803 :aggtype 20 :args :agglevelsup 0 :aggstar true :aggdistinct false :location 7 } :resno 1 :resname count :ressortgroupref 0 :resorigtbl 0 :resorigcol 0 :resjunk false } {TARGETENTRY :expr {VAR :varno 65001 :varattno 3 :vartype 1043 :vartypmod 36 :varlevelsup 0 :varnoold 1 :varoattno 3 :location 35 } :resno 2 :resname :ressortgroupref 1 :resorigtbl 0 :resorigcol 0 :resjunk true } ) :qual :lefttree {SEQSCAN :startup_cost 0.00 :total_cost 16.05 :plan_rows 605 :plan_width 2 :targetlist ( {TARGETENTRY :expr {VAR :varno 1 :varattno 1 :vartype 23 :vartypmod -1 :varlevelsup 0 :varnoold 1 :varoattno 1 :location -1 } :resno 1 :resname :ressortgroupref 0 :resorigtbl 0 :resorigcol 0 :resjunk false } {TARGETENTRY :expr {VAR :varno 1 :varattno 2 :vartype 1043 :vartypmod 132 :varlevelsup 0 :varnoold 1 :varoattno 2 :location -1 } :resno 2 :resname :ressortgroupref 0 :resorigtbl 0 :resorigcol 0 :resjunk false } {TARGETENTRY :expr {VAR :varno 1 :varattno 3 :vartype 1043 :vartypmod 36 :varlevelsup 0 :varnoold 1 :varoattno 3 :location -1 } :resno 3 :resname :ressortgroupref 0 :resorigtbl 0 :resorigcol 0 :resjunk false } {TARGETENTRY :expr {VAR :varno 1 :varattno 4 :vartype 1043 :vartypmod 36 :varlevelsup 0 :varnoold 1 :varoattno 4 :location -1 } :resno 4 :resname :ressortgroupref 0 :resorigtbl 0 :resorigcol 0 :resjunk false } {TARGETENTRY :expr {VAR :varno 1 :varattno 5 :vartype 1043 :vartypmod 36 :varlevelsup 0 :varnoold 1 :varoattno 5 :location -1 } :resno 5 :resname :ressortgroupref 0 :resorigtbl 0 :resorigcol 0 :resjunk false } {TARGETENTRY :expr {VAR :varno 1 :varattno 6 :vartype 16394 :vartypmod -1 :varlevelsup 0 :varnoold 1 :varoattno 6 :location -1 } :resno 6 :resname :ressortgroupref 0 :resorigtbl 0 :resorigcol 0 :resjunk false } ) :qual :lefttree :righttree :initPlan :extParam (b) :allParam (b) :scanrelid 1 } :righttree :initPlan :extParam (b) :allParam (b) :aggstrategy 2 :numCols 1 :grpColIdx 3 :grpOperators 98 :numGroups 3 } :rtable ( {RTE :alias :eref {ALIAS :aliasname pois :colnames (uid name catcode catname others location) } :rtekind 0 :relid 17229 :inh false :inFromCl true :requiredPerms 2 :checkAsUser 0 :selectedCols (b 11) :modifiedCols (b) } ) :resultRelations :utilityStmt :intoClause :subplans :rewindPlanIDs (b) :returningLists :rowMarks :relationOids (o 17229) :invalItems :nParamExec 0 } 先根据这些信息生成plantree图，如下：对于图中我们发现了几个有意思的事情，一个是Agg中已经知道了最终结果有3个group，即Long numGroups=3这是怎知道的呢？另外一个有意思的事情是Agg中的AggStrategy Aggstrategy=AGG_HASHED，这个hash策略是什么意思呢？我们先来看第一个。numGroups是在get_variable_numdistinct() 中计算的。调用堆栈为： Thread (Suspended) 16 get_variable_numdistinct() /home/postgres/develop/.../src/backend/utils/adt/selfuncs.c:4227 0x082e2036 15 add_unique_group_var() /home/postgres/develop/.../src/backend/utils/adt/selfuncs.c:2831 0x082e2c2c 14 estimate_num_groups() /home/postgres/develop/.../src/backend/utils/adt/selfuncs.c:2979 0x082e2d79 13 query_planner() /home/postgres/develop/.../src/backend/optimizer/plan/planmain.c:276 0x0821e529 12 grouping_planner() /home/postgres/develop/.../src/backend/optimizer/plan/planner.c:1006 0x0821fc23 11 subquery_planner() /home/postgres/develop/.../src/backend/optimizer/plan/planner.c:481 0x082219af 10 standard_planner() /home/postgres/develop/.../src/backend/optimizer/plan/planner.c:190 0x08221f5b 9 pg_plan_query() /home/postgres/develop/.../src/backend/tcop/postgres.c:697 0x0827885e 8 pg_plan_queries() /home/postgres/develop/.../src/backend/tcop/postgres.c:756 0x08278963 7 exec_simple_query() /home/postgres/develop/.../src/backend/tcop/postgres.c:921 0x08278df2 6 PostgresMain() /home/postgres/develop/.../src/backend/tcop/postgres.c:3617 0x0827a75e 5 BackendRun() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:3449 0x0824440d 4 BackendStartup() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:3063 0x0824440d 3 ServerLoop() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:1387 0x0824440d 2 PostmasterMain() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:1040 0x08246f18 1 main() /home/postgres/develop/.../src/backend/main/main.c:188 0x081e7ee7 我们来看看这个函数： double get_variable_numdistinct(VariableStatData *vardata) { double stadistinct; double ntuples; /* * Determine the stadistinct value to use. There are cases where we can * get an estimate even without a pg_statistic entry, or can get a better * value than is in pg_statistic. */ if (HeapTupleIsValid(vardata-statsTuple)) { /* Use the pg_statistic entry */ Form_pg_statistic stats; stats = (Form_pg_statistic) GETSTRUCT(vardata-statsTuple); stadistinct = stats-stadistinct; } else if (vardata-vartype == BOOLOID) { /* * Special-case boolean columns: presumably, two distinct values. * * Are there any other datatypes we should wire in special estimates * for? */ stadistinct = 2.0; } else { /* * We don't keep statistics for system columns, but in some cases we * can infer distinctness anyway. */ if (vardata-var IsA(vardata-var, Var)) { switch (((Var *) vardata-var)-varattno) { case ObjectIdAttributeNumber: case SelfItemPointerAttributeNumber: stadistinct = -1.0; /* unique */ break; case TableOidAttributeNumber: stadistinct = 1.0; /* only 1 value */ break; default: stadistinct = 0.0; /* means unknown */ break; } } else stadistinct = 0.0; /* means unknown */ /* * XXX consider using estimate_num_groups on expressions? */ } /* * If there is a unique index for the variable, assume it is unique no * matter what pg_statistic says; the statistics could be out of date, or * we might have found a partial unique index that proves the var is * unique for this query. */ if (vardata-isunique) stadistinct = -1.0; /* * If we had an absolute estimate, use that. */ if (stadistinct 0.0) return stadistinct; /* * Otherwise we need to get the relation size; punt if not available. */ if (vardata-rel == NULL) return DEFAULT_NUM_DISTINCT; ntuples = vardata-rel-tuples; if (ntuples = 0.0) return DEFAULT_NUM_DISTINCT; /* * If we had a relative estimate, use that. */ if (stadistinct 0.0) return floor((-stadistinct * ntuples) + 0.5); /* * With no data, estimate ndistinct = ntuples if the table is small, else * use default. */ if (ntuples DEFAULT_NUM_DISTINCT) return ntuples; return DEFAULT_NUM_DISTINCT; } 基本思想是使用pg_statistic.stadistinct 来设置 Agg.numGroups。我们 select starelid,staattnum,stadistinct from pg_statistic where starelid = 17229;看看： starelid | staattnum | stadistinct ----------+-----------+------------- 17229 | 1 | -1 17229 | 2 | -0.945455 17229 | 3 | 3 17229 | 4 | 3 17229 | 5 | 1 17229 | 6 | -1 (6 rows) 发现catcode（对应第3项）确实只有3个distinct value。接下来我们看第二个问题：Agg中的AggStrategy Aggstrategy=AGG_HASHED？这个主要是在函数： choose_hashed_grouping(root, tuple_fraction, limit_tuples, cheapest_path, sorted_path, dNumGroups, agg_counts); 中计算的，主要是比较seq+hash 和seq+sort的代价，如果seq+hash代价小，则选择seq+hash 。该阶段中最后通过： result_plan = (Plan *) make_agg(root, tlist, (List *) parse-havingQual, AGG_HASHED, numGroupCols, groupColIdx, extract_grouping_ops(parse-groupClause), numGroups, agg_counts.numAggs, result_plan); 生成最终的agg plan，并把leftchild赋值seq对应的plan。另这里上传了一张包含plantree和解析成plantree过程中用到的数据结构图。 4.Portal中用到的数据结构走到这里了，我们才发现，原来这里的数据结构最复杂。先把我们的图摆上：更清晰的大图可以从这里下载。图中我们发现几个有趣的问题：第1个问题： AggState中的 FmgrInfo *eqfunctions=67 FmgrInfo *hashfunctions=400 这两个字段是什么意思？我们来查查: mydb=# select oid,proname from pg_proc where oid = 400 or oid= 67; oid | proname -----+---------- 67 | texteq 400 | hashtext (2 rows) 可见 eqfunction= 67主要是实现了group by中对catcode 实施“=”操作符具体采用的函数。 hashfunctions=400则表明了需要使用的hash函数。对这两项的赋值是在： void execTuplesHashPrepare(int numCols, Oid *eqOperators, FmgrInfo **eqFunctions, FmgrInfo **hashFunctions) 实现的。调用堆栈为： Thread (Suspended) 13 execTuplesHashPrepare() /home/postgres/develop/.../src/backend/executor/execGrouping.c:239 12 ExecInitAgg() /home/postgres/develop/.../src/backend/executor/nodeAgg.c:1316 0x081c94ed 11 ExecInitNode() /home/postgres/develop/.../src/backend/executor/execProcnode.c:260 0x081bbd01 10 InitPlan() /home/postgres/develop/.../src/backend/executor/execMain.c:835 0x081ba116 9 standard_ExecutorStart() /home/postgres/develop/.../src/backend/executor/execMain.c:219 0x081bb0af 8 PortalStart() /home/postgres/develop/.../src/backend/tcop/pquery.c:539 0x0827e3b3 7 exec_simple_query() /home/postgres/develop/.../src/backend/tcop/postgres.c:953 0x08278c0e 6 PostgresMain() /home/postgres/develop/.../src/backend/tcop/postgres.c:3615 0x0827a75e 5 BackendRun() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:3449 0x0824440d 4 BackendStartup() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:3063 0x0824440d 3 ServerLoop() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:1387 0x0824440d 2 PostmasterMain() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:1040 0x08246f18 1 main() /home/postgres/develop/.../src/backend/main/main.c:188 0x081e7ee7 第2个有意思的问题是关于 agg中的 type=20 我们通过： mydb=# select typname,typlen,typinput,typoutput from pg_type where oid = 20; typname | typlen | typinput | typoutput ---------+--------+----------+----------- int8 | 8 | int8in | int8out (1 row) 来看看。第3个有意思的问题是关于AggStatePerAggData.transfn 中的 fn_oid=1219 我们通过 mydb=# select proname from pg_proc where oid= 1219; proname --------- int8inc (1 row) 知道该函数为递增函数。详细实现在/backend/utils/adt/int8.c中。加我私人微信，交流技术。

个人分类: postgresql|7290 次阅读|0 个评论

配置postgresql服务器以查看日志

hillpig 2010-5-5 01:50

参考PostgreSQL 8.4.3 Documentation ： 18.7. Error Reporting and Logging（http://www.postgresql.org/docs/8.4/interactive/runtime-config-logging.html）来进行设置，我的设置如下，大家可以做一个参考： # - Where to Log - #log_destination = 'stderr' logging_collector = on log_directory = 'pg_log' log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # - When to Log - #client_min_messages = notice log_min_messages = debug5 # - What to Log - debug_print_parse = on debug_print_rewritten = on debug_print_plan = on debug_pretty_print = on #log_checkpoints = off log_connections = on log_disconnection = on 这样子当你的服务器运行的时候，你在/usr/local/pgsql/data/pg_log 目录下，就可以看到生成的日志了。加我私人微信，交流技术。

个人分类: postgresql|12093 次阅读|1 个评论

postgresql中plantree内存结构

hillpig 2010-5-2 03:01

上次写完了 postgresql中querytree内存结构，接下来就是又让你激动，又让你觉得神秘的plantree了。先看看src/backend/optimizer的readme： Optimizer ========= These directories take the Query structure returned by the parser, and generate a plan used by the executor. The /plan directory generates the actual output plan, the /path code generates all possible ways to join the tables, and /prep handles various preprocessing steps for special cases. /util is utility stuff. /geqo is the separate genetic optimization planner--- it does a semi-random search through the join tree space, rather than exhaustively considering all possible join trees. 哈，所以可以先看看相应的目录下的文件，有个大概印象。在看看readme下面的： Optimizer Data Structures ------------------------- PlannerGlobal - global information for a single planner invocation PlannerInfo - information for planning a particular Query (we make a separate PlannerInfo node for each sub-Query) RelOptInfo - a relation or joined relations RestrictInfo - WHERE clauses, like x = 3 or y = z (note the same structure is used for restriction and join clauses) Path - every way to generate a RelOptInfo(sequential,index,joins) SeqScan - a plain Path node with pathtype = T_SeqScan IndexPath - index scans BitmapHeapPath - top of a bitmapped index scan TidPath - scan by CTID AppendPath - append multiple subpaths together ResultPath - a Result plan node (used for FROM-less SELECT) MaterialPath - a Material plan node UniquePath - remove duplicate rows NestPath - nested-loop joins MergePath - merge joins HashPath - hash joins EquivalenceClass - a data structure representing a set of values known equal PathKey - a data structure representing the sort ordering of a path 上面把optimizer用到的数据结构都说明了用途。我们再看看8.4 的document：在43.5.1. Generating Possible Plans有： The planner/optimizer starts by generating plans for scanning each individual relation (table) used in the query. The possible plans are determined by the available indexes on each relation. There is always the possibility of performing a sequential scan on a relation, so a sequential scan plan is always created. 看样子 sequential scan plan是必须的了，由于我们pois表没建索引，所以我们可以想象，我们的plan是没有index相关的path的。等有时间再写一篇涉及index和整个优化器有关的文章。好，有了这些基本概念，我们开工。再次写一下我们的前提条件：假定我们数据库中已经有如下表，并填充了数据： CREATE TABLE pois ( uid integer not null, catcode VARCHAR(32) not null, ); 现在我们用psql发送请求：select catcode from pois; 程序已经执行完了querytree_list = pg_analyze_and_rewrite(parsetree, query_string, NULL, 0); 接下来就是执行我们神秘的 stmt = (Node *) pg_plan_query(query, cursorOptions, boundParams); 了。首先看调用堆栈： postgresql Thread gdb Debugger (5/1/10 10:16 AM) (Suspended) Thread (Suspended) 17 set_plain_rel_pathlist() /home/postgres/develop/.../src/backend/optimizer/path/allpaths.c:264 16 set_rel_pathlist() /home/postgres/develop/.../src/backend/optimizer/path/allpaths.c:201 15 set_base_rel_pathlists() /home/postgres/develop/.../src/backend/optimizer/path/allpaths.c:157 14 make_one_rel() /home/postgres/develop/.../src/backend/optimizer/path/allpaths.c:93 13 query_planner() /home/postgres/develop/.../src/backend/optimizer/plan/planmain.c:252 12 grouping_planner() /home/postgres/develop/.../src/backend/optimizer/plan/planner.c:1006 11 subquery_planner() /home/postgres/develop/.../src/backend/optimizer/plan/planner.c:481 10 standard_planner() /home/postgres/develop/.../src/backend/optimizer/plan/planner.c:190 9 pg_plan_query() /home/postgres/develop/.../src/backend/tcop/postgres.c:697 8 pg_plan_queries() /home/postgres/develop/.../src/backend/tcop/postgres.c:756 7 exec_simple_query() /home/postgres/develop/.../src/backend/tcop/postgres.c:921 6 PostgresMain() /home/postgres/develop/.../src/backend/tcop/postgres.c:3617 5 BackendRun() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:3449 4 BackendStartup() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:3063 3 ServerLoop() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:1387 2 PostmasterMain() /home/postgres/develop/.../src/backend/postmaster/postmaster.c:1040 1 main() /home/postgres/develop/.../src/backend/main/main.c:188 我之所以把堆栈写到 set_plain_rel_pathlist()来，是因为这是建立我们最终顺序扫描path的调用函数。具体调用代码为： static void set_plain_rel_pathlist (PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte){ if (rel-reloptkind == RELOPT_BASEREL relation_excluded_by_constraints(root, rel, rte)) { set_dummy_rel_pathlist(rel); return; } check_partial_indexes(root, rel); set_baserel_size_estimates(root, rel); if (create_or_index_quals(root, rel)) { check_partial_indexes(root, rel); set_baserel_size_estimates(root, rel); } add_path(rel, create_seqscan_path(root, rel)); create_index_paths(root, rel); create_tidscan_paths(root, rel); set_cheapest(rel); } add_path(rel, create_seqscan_path(root, rel)); 即为实际增加我们的path的调用，本身该调用非常简单，就是估算从硬盘读page和时间+处理所有tuple的时间，然后填充path 。我把程序执行完下面这条语句之前的数据结构画张图，如下： line1064： result_plan = optimize_minmax_aggregates(root, tlist, best_path); 下图是最终的内存结构图，为完整起见，我把PlannerInfo及相应的数据结构都保留下来：图太大，picasa不让传超过1M的图片，只好上传到filefront.com/，看大图的可以从这里下载。下图是只包含plan tree的图：至此，全部结束。加我私人微信，交流技术。

个人分类: postgresql|7487 次阅读|0 个评论

关于postgresql的page结构

hillpig 2010-4-26 11:39

参考资料： http://www.google.com/url?sa=tsource=webct=rescd=98ved=0CCoQFjAHOFourl=http%3A%2F%2Fwww.cse.unsw.edu.au%2F~cs9315%2F03s2%2Flectures%2Fweek13%2Fnotes.psrct=jq=postgresql+page+pageheader+tupleei=wQjVS-69DYeK8gS_vtmuDwusg=AFQjCNGiV8SAUdsZeE0X47LbIzegtaAE_gsig2=_J_55sFdtOotITI0jBJUhQ http://ikubo.x0.com/PostgreSQL/pdf/IK14_vacuum.pdf http://www.opensolaris.cz/system/files/czosug19_pg.pdf 加我私人微信，交流技术。

个人分类: music|6 次阅读|0 个评论

postgresql中query tree内存结构

hillpig 2010-4-7 23:52

（由于picasa web的一些更新导致某些大图看不到，现已将文中大图已重新打包，见页面底部）话说上次分析了parsetree的内存结构（参看：postgresql中parsetree内存结构 http://www.sciencenet.cn/m/user_content.aspx?id=309594 ），那么在postgresql的执行流程中接下来就是生成querytree了，那么querytree在内存中是什么样子的呢？先看调用堆栈： Thread (Suspended) 9 parse_analyze() /home/postgres/develop/postgresql-snapshot/src/backend/parser/analyze.c:82 8 pg_analyze_and_rewrite() /home/postgres/develop/postgresql-snapshot/src/backend/tcop/postgres.c:606 7 exec_simple_query() /home/postgres/develop/postgresql-snapshot/src/backend/tcop/postgres.c:918 6 PostgresMain() /home/postgres/develop/postgresql-snapshot/src/backend/tcop/postgres.c:3616 5 BackendRun() /home/postgres/develop/postgresql-snapshot/src/backend/postmaster/postmaster.c:3449 4 BackendStartup() /home/.../src/backend/postmaster/postmaster.c:3063 3 ServerLoop() /home/postgres/develop/postgresql-snapshot/src/backend/postmaster/postmaster.c:1387 2 PostmasterMain() /home/postgres/.../postmaster/postmaster.c:1040 1 main() /home/postgres/develop/postgresql-snapshot/src/backend/main/main.c:188 0x081e7ee7 再把我们的前提条件摆出来一下：假定我们数据库中已经有如下表，并填充了数据： CREATE TABLE pois ( uid integer not null, catcode VARCHAR(32) not null, ); 现在我们用psql发送请求：select catcode from pois; 程序已经执行完了 parsetree_list = pg_parse_query(query_string); 接下来这位老先生querytree_list = pg_analyze_and_rewrite(parsetree, query_string, NULL, 0); 就是我们要拜访的主了。推荐你最好知道到这一步之前，系统在内存中是什么样子的？都初始化了哪些数据结构？具体可参照： PostgresMain()中重要的几个初始化好了，我们开工。我们先看看： Query *parse_analyze(Node *parseTree, const char *sourceText, Oid *paramTypes, int numParams){ ParseState *pstate = make_parsestate(NULL); Query *query; Assert(sourceText != NULL); /* required as of 8.4 */ ... query = transformStmt(pstate, parseTree); free_parsestate(pstate); return query; } 看样子， ParseState和 Query 我们是非要看看不可了。先看Query，如下图：看样子，Querytree是集大成者，无论你是delete，还是update，还是select，经过parsetree之后都要在他老人家这个山头路过。从图中看来，我们的parsetree看样子要和Querytree里的targetlist和rtable字段有个对应了。我们再来简单看看 ParseState ，如下图：如同postgresql中很多个其他类似的*state一样， ParseState 主要用于记录生成Query tree这个阶段的状态。其中很有意思的一个数据结构是RangeTbleEntry，那么这个RangeTbleEntry是何方神圣呢？先别急，我们根据我们的调用流程慢慢分析一下。这里先把RangeTableEntry图画出来：我们一路跟踪执行流程，看看 query = transformStmt(pstate, parseTree) -- result = transformSelectStmt(pstate, n); 在该函数中，我们比较感兴趣的是把from和target都做了哪些手脚。具体调用函数就是： /* process the FROM clause */ transformFromClause(pstate, stmt-fromClause); /* transform targetlist */ qry-targetList = transformTargetList(pstate, stmt-targetList); 1.我们先来分析 transformFromClause(pstate, stmt-fromClause); 该函数中对每一个from clause中的Rangevar：Node *n = lfirst(fl); 执行： n = transformFromClauseItem(pstate, n, rte, rtindex,relnamespace, containedRels); 并赋值： pstate-p_joinlist = lappend(pstate-p_joinlist, n); pstate-p_relnamespace = list_concat(pstate-p_relnamespace,relnamespace); pstate-p_varnamespace = lappend(pstate-p_varnamespace, rte); 所以我们来看看 transformFromClauseItem();函数。该函数中，又调用rte = transformTableEntry(pstate, rv); 函数来执行具体的转换工作。转换完之后，赋值： *top_rte = rte; *top_rti = rtindex; *relnamespace = list_make1(rte); *containedRels = bms_make_singleton(rtindex); rtr = makeNode(RangeTblRef); rtr-rtindex = rtindex; return (Node *) rtr; 返回。所以我们再来看 transformTableEntry(pstate, rv)函数。该函数中继而调用： RangeTblEntry *rte= addRangeTableEntry(pstate, r, r-alias,interpretInhOption(r-inhOpt), true); 该函数中才是主要创建RangeTblEntry的地方，主要是调用了rel = parserOpenTable(pstate, relation, lockmode); 函数。从字面上看该函数无非是在parser阶段打开一个relation，因为这里的RangeVar只有relname ='pois'，故我们猜想，只能通过relname来打开一个relation了。只能通过查询pg_class里面的该relname对应的tuple继而找到oid，继而读入该relation了。好，猜想归猜想，我们来看看 parserOpenTable(pstate, relation, lockmode);里面有哪些调用，主要是 rel = try_heap_openrv(relation, lockmode); 。好，那么我们就来看看这个函数。该函数只是对Relation r = try_relation_openrv(relation, lockmode); 的封装。让我们来看看该函数： relOid = RangeVarGetRelid(relation, true); return relation_open(relOid, lockmode); 里面继而调用r = RelationIdGetRelation(relationId);这个是不是很熟悉，我们在load_critical_index(ClassOidIndexId, RelationRelationId);早就打过交道了。该函数主要是通过查询pg_class表和pg_attribute系统表生成relation-rd_rel 和 relation-rd_att属性。好，我们主要就来看看 RangeVarGetRelid(relation, true);了。该函数主要调用relId = RelnameGetRelid(relation-relname); 来获取该relation的oid。该函数很有意思，我们来详细看看： Oid RelnameGetRelid(const char *relname){ Oid relid; ListCell *l; recomputeNamespacePath(); foreach(l, activeSearchPath) { Oid namespaceId = lfirst_oid(l); relid = get_relname_relid(relname, namespaceId); if (OidIsValid(relid)) return relid; } /* Not found in path */ return InvalidOid; } 这里，我们提前先把namespace的概念再引用一下：根据postgresql 8.4的文档44.25. pg_namespace，我们有如下定义： The catalog pg_namespace stores namespaces. A namespace is the structure underlying SQL schemas: each namespace can have a separate collection of relations, types, etc. without name conflicts. 我们知道，namespace是出于schema（schema我们暂时认为一个用户对应一个schema）之下的用于组织relation的逻辑结构。例如和收费相关的relation我们组织成一个namespace，和学生相关的所有relation我们组织成一个namespace等等。在 recomputeNamespacePath()中，根据默认的namepace搜索路径\$user\,public，扫描pg_namespace表，获得相应的namespace oid。我们先看看pg_namespace表中有哪些namespace： mydb=# select oid,* from pg_namespace; oid | nspname | nspowner | nspacl -------+--------------------+----------+------------------------------------- 11 | pg_catalog | 10 | {postgres=UC/postgres,=U/postgres} 99 | pg_toast | 10 | 2200 | public | 10 | {postgres=UC/postgres,=UC/postgres} 11061 | pg_temp_1 | 10 | 11062 | pg_toast_temp_1 | 10 | 11326 | information_schema | 10 | {postgres=UC/postgres,=U/postgres} (6 rows) 看样子，只有对应 public 的oid= 2200一个了。然后再增加一个系统默认的 11 | pg_catalog ，所以最终有两项namespace oid赋给 static List *activeSearchPath 。 11 和2200。然后我们看看 relid = get_relname_relid(relname, namespaceId); 调用。在该函数中调用return GetSysCacheOid(RELNAMENSP, PointerGetDatum(relname), ObjectIdGetDatum(relnamespace), 0, 0); 参数RELNAMENSP 对应syscache中的第37项，一查我们发现就是pg_class项嘛。这样子我们就在pg_class系统表中根据relname和namespace oid来查找相应的rel oid了。显然： relname namespace 'pois' '11' 对应 pg_catalog 不存在，因为我们的pois表不在系统namespace中 'pois' '2200' 对应 public 存在，系统默认把我们的表创建在该public的namespace中。这样子千辛万苦得到对应relname 为'pois'的 relation oid为17229. 至此，我们的分析告一段落，先看看我们生成的pois表在内存中的Relation是什么样子，如下图：下图表明我们的ParseState中到现在都生成了哪些属性项。此时还没把Query赋值。 2.再来看看 qry-targetList = transformTargetList(pstate, stmt-targetList); 这里的targetlist就是对应sql语句“select catcode from pois”中的catcode 了。该函数主要调用 transformTargetEntry(pstate, res-val, NULL, res-name, false) 我们来看看这个函数。该函数主要包含： expr = transformExpr(pstate, node); colname = FigureColname(node); makeTargetEntry((Expr *) expr, (AttrNumber) pstate-p_next_resno++, colname, resjunk); 我们先来看看函数 transformExpr ()，其主要调用： result = transformColumnRef(pstate, (ColumnRef *) expr); 进而调用 Node * colNameToVar(ParseState *pstate, char *colname, bool localonly, int location) 进而调用 Node * scanRTEForColumn(ParseState *pstate, RangeTblEntry *rte, char *colname, int location) 最终创建Var数据结构返回。该变量经debug得到的各项属性为： expr 0x0a082bc4 xpr {...} varno 1 varattno 1 vartype 1043 vartypmod 36 varlevelsup 0 varnoold 1 varoattno 3 location 7 接下来我们看看 colname = FigureColname(node); 调用。该函数主要抽取colname ='pois' 接下来 makeTargetEntry((Expr *) expr, (AttrNumber) pstate-p_next_resno++, colname, resjunk); 该函数只是用输入的参数填充新创建的TargetEntry。回到最开始的pg_analyze_and_rewrite() 函数中的第二步。 /* * (2) Rewrite the queries, as necessary */ querytree_list = pg_rewrite_query(query); 由于对rules的rewrite规则仅限于对view的重写，故我们的pois 表经过pg_rewrite_query(query);之后的query还是原来的query，debug结果表明我们的分析也是对的，故此处不再分析rules规则重写。至此，关于Query我们全部分析完，形成的Query如下图：至此，结束。文中大图已重新打包： postgresql中query tree内存结构.zip 加我私人微信，交流技术。

个人分类: postgresql|10378 次阅读|0 个评论

postgresql中parse tree内存结构

热度 3 hillpig 2010-4-7 03:01

（置顶：请到这里下载本文的后4张图片）根据 http://momjian.us/main/writings/pgsql/internalpics.pdf Page 11，在postgresql的执行流程中Parse Statement 阶段输出结果为parsetree_list（定义为List *parsetree_list;），这个parsetree_list在内存中到底是什么样子的呢？这就牵涉到postgresql的parse阶段，我们来分析分析。在exec_simple_query() 中，我们知道 parsetree_list = pg_parse_query(query_string); 而在pg_parse_query()中，List *raw_parsetree_list; raw_parsetree_list = raw_parser(query_string);而在 raw_parser()中，进而调用base_yyparse();完成。整个调用堆栈为： 9 raw_parser() /home/postgres/develop/postgresql-snapshot/src/backend/parser/parser.c:51 0x0813c1a7 8 pg_parse_query() /home/postgres/develop/postgresql-snapshot/src/backend/tcop/postgres.c:559 7 exec_simple_query() /home/postgres/develop/postgresql-snapshot/src/backend/tcop/postgres.c:827 6 PostgresMain() /home/postgres/develop/postgresql-snapshot/src/backend/tcop/postgres.c:3616 5 BackendRun() /home/postgres/develop/postgresql-snapshot/src/backend/postmaster/postmaster.c:3449 4 BackendStartup() /.../src/backend/postmaster/postmaster.c:3063 3 ServerLoop() /home/postgres/develop/postgresql-snapshot/src/backend/postmaster/postmaster.c:1387 2 PostmasterMain() /.../src/backend/postmaster/postmaster.c:1040 1 main() /home/postgres/develop/postgresql-snapshot/src/backend/main/main.c:188 既然用到了base_yyparse()，我们就不能不看看lexyacc（对应为flexBison）了。关于如何快速泡上美丽的lexyacc妹妹，请参考我的另一篇文章：一天之内不再畏惧lexyacc之必备参考资料 http://www.sciencenet.cn/m/user_content.aspx?id=309595 。好，不怕了lexyacc之后，我们开工。假定我们数据库中已经有如下表，并填充了数据： CREATE TABLE pois ( uid integer not null, catcode VARCHAR(32) not null, ); 现在我们用psql发送请求：select catcode from pois; 那么接下来我们分析分析，这个parstree到底是什么？让我们来分析分析gram.y中相应的代码（以下红色字体，皆为gram.y中代码） extern List *parsetree; /* final parse result is delivered here */ 既然是在外部定义，那么这个还记得在哪定义的吗？其实就是在调用函数raw_parser() 所在的文件parse.c中定义的：List *parsetree; /* result of parsing is left here */ 接下来： stmtblock: stmtmulti { parsetree = $1; } stmtmulti: stmtmulti ';' stmt { if ($3 != NULL) $$ = lappend($1, $3); else $$ = $1; } | stmt { if ($1 != NULL) $$ = list_make1($1); else $$ = NIL; } ; 好，上面的意思是调用$$ = list_make1($1);来生成parsetree 。那我们看看 list_make1()。在pg_list.h中：#define list_make1(x1) lcons(x1, NIL) 进而在list.c中： List * lcons(void *datum, List *list) { Assert(IsPointerList(list)); if (list == NIL) list = new_list(T_List); else new_head_cell(list); lfirst(list-head) = datum; check_list_invariants(list); return list; } 进而： static List * new_list(NodeTag type) { List *new_list; ListCell *new_head; new_head = (ListCell *) palloc(sizeof(*new_head)); new_head-next = NULL; /* new_head-data is left undefined! */ new_list = (List *) palloc(sizeof(*new_list)); new_list-type = type; new_list-length = 1; new_list-head = new_head; new_list-tail = new_head; return new_list; } 很简单，对吧。注意palloc（）函数表明我们的parstree建立在heap上，没有建在shared memory里。 stmt : AlterDatabaseStmt | AlterDatabaseSetStmt | AlterDomainStmt | AlterFdwStmt | AlterForeignServerStmt | AlterFunctionStmt ....... | SelectStmt ....... | /*EMPTY*/ { $$ = NULL; } ; 我们期待已久的SelectStmt在里面了。接下来： SelectStmt: select_no_parens %prec UMINUS | select_with_parens %prec UMINUS ; select_with_parens: '(' select_no_parens ')' { $$ = $2; } | '(' select_with_parens ')' { $$ = $2; } ; 我们当然是属于不带括号的了，那么朝下看： select_no_parens: simple_select { $$ = $1; } | select_clause sort_clause { insertSelectOptions((SelectStmt *) $1, $2, NIL, NULL, NULL, NULL); $$ = $1; } ..... ; 接着： simple_select: SELECT opt_distinct target_list into_clause from_clause where_clause group_clause having_clause window_clause { SelectStmt *n = makeNode(SelectStmt); n-distinctClause = $2; n-targetList = $3; n-intoClause = $4; n-fromClause = $5; n-whereClause = $6; n-groupClause = $7; n-havingClause = $8; n-windowClause = $9; $$ = (Node *)n; } | .... ; 那么至此，大概框架就有了，示意图如下。内存结构图如下：革命成功了十分之一，继续： target_list: target_el { $$ = list_make1($1); } | target_list ',' target_el { $$ = lappend($1, $3); } ; target_el: a_expr AS ColLabel { $$ = makeNode(ResTarget); $$-name = $3; $$-indirection = NIL; $$-val = (Node *)$1; $$-location = @1; } /* ... | a_expr { $$ = makeNode(ResTarget); $$-name = NULL; $$-indirection = NIL; $$-val = (Node *)$1; $$-location = @1; } | '*' { ColumnRef *n = makeNode(ColumnRef); n-fields = list_make1(makeNode(A_Star)); n-location = @1; $$ = makeNode(ResTarget); $$-name = NULL; $$-indirection = NIL; $$-val = (Node *)n; $$-location = @1; } ; 好， | a_expr 即为我们要找的。继续追踪a_expr ： a_expr: c_expr { $$ = $1; } | a_expr TYPECAST Typename { $$ = makeTypeCast($1, $3, @2); } ... ; c_expr: columnref { $$ = $1; } | AexprConst { $$ = $1; } ... ; 接下来： columnref: relation_name { $$ = makeColumnRef($1, NIL, @1); } | relation_name indirection { $$ = makeColumnRef($1, $2, @1); } ; 然后： relation_name: SpecialRuleRelation { $$ = $1; } | ColId { $$ = $1; } ; ColId: IDENT { $$ = $1; } | unreserved_keyword { $$ = pstrdup($1); } | col_name_keyword { $$ = pstrdup($1); } ; ^_^，终于找到了IDENT了，因为我们稍作分析就知道“select catcode from pois”中的catcode可通过scan.I中的如下规则识别为IDENT： {identifier} { const ScanKeyword *keyword; char *ident; SET_YYLLOC(); /* Is it a keyword? */ keyword = ScanKeywordLookup(yytext); if (keyword != NULL) { yylval.keyword = keyword-name; return keyword-value; } /* * No. Convert the identifier to lower case, and truncate * if necessary. */ ident = downcase_truncate_identifier(yytext, yyleng, true); yylval.str = ident; return IDENT; } 看源码中一段注释，就知道IDENT是干嘛的了：IDENT is the lexeme returned by the lexer for identifiers that match no known keyword. 通过gram.y自带的makeColumnRef()函数，我们可以知道该Node的内存存储结构。 static Node * makeColumnRef(char *colname, List *indirection, int location) { /* * Generate a ColumnRef node, with an A_Indirection node added if there * is any subscripting in the specified indirection list. However, * any field selection at the start of the indirection list must be * transposed into the fields part of the ColumnRef node. */ ColumnRef *c = makeNode(ColumnRef); int nfields = 0; ListCell *l; c-location = location; foreach(l, indirection) { if (IsA(lfirst(l), A_Indices)) { A_Indirection *i = makeNode(A_Indirection); if (nfields == 0) { /* easy case - all indirection goes to A_Indirection */ c-fields = list_make1(makeString(colname)); i-indirection = check_indirection(indirection); } else { /* got to split the list in two */ i-indirection = check_indirection(list_copy_tail(indirection, nfields)); indirection = list_truncate(indirection, nfields); c-fields = lcons(makeString(colname), indirection); } i-arg = (Node *) c; return (Node *) i; } else if (IsA(lfirst(l), A_Star)) { /* We only allow '*' at the end of a ColumnRef */ if (lnext(l) != NULL) yyerror(improper use of \*\); } nfields++; } /* No subscripting, so all indirection gets added to field list */ c-fields = lcons(makeString(colname), indirection); return (Node *) c; } 至此，我们分析告一小段落，数据结构示意图如下(看不清图，请到这里下载本文的后4张图片 )：此时，内存结构图如下(点击看大图)：接下来我们分析from 子句。 from_clause: FROM from_list { $$ = $2; } | /*EMPTY*/ { $$ = NIL; } ; from_list: table_ref { $$ = list_make1($1); } | from_list ',' table_ref { $$ = lappend($1, $3); } ; 接下来： table_ref: relation_expr { $$ = (Node *) $1; } ... ; relation_expr: qualified_name { /* default inheritance */ $$ = $1; $$-inhOpt = INH_DEFAULT; $$-alias = NULL; } ... ; qualified_name: relation_name { $$ = makeNode(RangeVar); $$-catalogname = NULL; $$-schemaname = NULL; $$-relname = $1; $$-location = @1; } ... ; relation_name: SpecialRuleRelation { $$ = $1; } | ColId { $$ = $1; } ; 哈，又见“ColId”。还记得下面的规则吗： ColId: IDENT { $$ = $1; } | unreserved_keyword { $$ = pstrdup($1); } | col_name_keyword { $$ = pstrdup($1); } ; 这样，我们得出最终的示意图：最终的内存结构图为：那上面的分析以及结果图是否正确呢？我们debug运行一下： Normal07.8 磅02falsefalsefalseEN-USZH-CNX-NONEMicrosoftInternetExplorer4/* Style Definitions */table.MsoNormalTable{mso-style-name:普通表格;mso-tstyle-rowband-size:0;mso-tstyle-colband-size:0;mso-style-noshow:yes;mso-style-priority:99;mso-style-qformat:yes;mso-style-parent:;mso-padding-alt:0cm 5.4pt 0cm 5.4pt;mso-para-margin:0cm;mso-para-margin-bottom:.0001pt;mso-pagination:widow-orphan;font-size:10.5pt;mso-bidi-font-size:11.0pt;font-family:Calibri,sans-serif;mso-ascii-font-family:Calibri;mso-ascii-theme-font:minor-latin;mso-fareast-font-family:宋体;mso-fareast-theme-font:minor-fareast;mso-hansi-font-family:Calibri;mso-hansi-theme-font:minor-latin;mso-bidi-font-family:Times New Roman;mso-bidi-theme-font:minor-bidi;mso-font-kerning:1.0pt;}parsetree_list 0x0a1b46c0 type T_List length 1 head 0x0a1b46ac data {...} ptr_value 0x0a1b4620 type T_SelectStmt distinctClause 0x00000000 intoClause 0x00000000 targetList 0x0a1b4594 type T_List length 1 head 0x0a1b4580 data {...} ptr_value 0x0a1b4554 type T_ResTarget name 0x00000000 indirection 0x00000000 val 0x0a1b44f4 type T_ColumnRef fields 0x0a1b4538 type T_List length 1 head 0x0a1b4524 data {...} ptr_value 0x0a1b4510 type T_String val {...} ival 169559264 str 0x0a1b44e0 int_value 169559312 oid_value 169559312 next 0x00000000 tail 0x0a1b4524 location 7 location 7 int_value 169559380 oid_value 169559380 next 0x00000000 tail 0x0a1b4580 fromClause 0x0a1b4604 type T_List length 1 head 0x0a1b45f0 data {...} ptr_value 0x0a1b45c4 type T_RangeVar catalogname 0x00000000 schemaname 0x00000000 relname 0x0a1b45b0 inhOpt INH_DEFAULT istemp false alias 0x00000000 location 20 int_value 169559492 oid_value 169559492 next 0x00000000 tail 0x0a1b45f0 whereClause 0x00000000 groupClause 0x00000000 havingClause 0x00000000 windowClause 0x00000000 withClause 0x00000000 valuesLists 0x00000000 sortClause 0x00000000 limitOffset 0x00000000 limitCount 0x00000000 lockingClause 0x00000000 op SETOP_NONE all false larg 0x00000000 rarg 0x00000000 int_value 169559584 oid_value 169559584 next 0x00000000 tail 0x0a1b46ac 表明我们的分析完全正确。至此，结束。加我私人微信，交流技术。

个人分类: postgresql|13968 次阅读|5 个评论

Postmaster的Shared Memory中的shared buffer pool内存结构

hillpig 2010-4-3 08:13

上次写完了 Postmaster的Shared Memory中的shmem index table 内存结构（ http://www.sciencenet.cn/m/user_content.aspx?id=307647 ），接下来比较重要的另外一个初始化是 shared buffer pool内存结构。初始化后内存结构如下图。. . 说明：由于最后一步初始化 BufferStrategyControl时非常简单，图中只是把 BufferStrategyControl结构加入到图中，没有在ShmemIndex 相应的结构中体现，不影响大局。至此，结束。

个人分类: postgresql|5465 次阅读|0 个评论

Postmaster的Shared Memory中的shmem index table 内存结构

hillpig 2010-3-31 12:09

我们知道，Postmaster的Shared Memory中的shmem index table 是一个dynamic hash table，所以理解该hash table初始化时在内存中的结构对于理解postmaster 的shared memory有重要帮助。调用流程： PostmasterMain-reset_shared(int)-CreateSharedMemoryAndSemaphores()-InitShmemIndex(); 先看PostgreSQL总体内存结构：在momjian的Inside PostgreSQL Shared Memory http://momjian.us/main/writings/pgsql/inside_shmem.pdf 第11页中Shared Memory Creation中，我们可以了解到PostgreSQL总体内存结构：关于Heap，Stack，Shared Memory的关系和在内存中的位置，请参考我的另外一篇文章： http://blog.chinaunix.net/u2/81513/showart.php?id=2203403 由于Postgresql对malloc的替换实现palloc，底层仍然采用glib c的malloc调用，所以我们可以得出凡是使用palloc分配的内存都是在heap上的（即采用brk系统调用所申请），也就是上图中的data向下的箭头所指。关于共享内存的实现，通常是调用os接口mmap()实现。把os中内存页面映射到进程空间中。 pa=mmap(addr, len, prot, flags, fildes, off); 其中addr参数如果为null的话（通常如此），则映射到内存空间的地址由系统设定。由于Postmaster中也是设置的null，故我们不能准确知道shared memory的起始地址。但无妨后面的分析。调用InitShmemIndex()完之后，Postmaster的内存hash index table结构初始化为下图：至此，结束。加我私人微信，交流技术。

个人分类: postgresql|8344 次阅读|0 个评论

PostgreSQL学习

热度 1 guodanhuai 2009-8-1 11:58

最近在学习如何在Linux下安装使用postgresql，先简单介绍一下postgresql名门之秀（origined from University of California,Berkley）： PosgreSQL号称是世界上最先进的开源数据库(The world's most advanced open source database ),具有超过15年的积极的发展和经验证的框架，在稳定性、数据一致性和正确性方面赢得了良好的声誉，可以运行在几乎所有的主流操作系统，包括Linux，UNIX(AIX,BSD,HP-UX,SGI IRIX, Mac OS X,Solaris,Tru64)和Windows，来自多个国家的支持使得postgresql支持多语言的关键字、合并，显示、触发和可存储的处理过程她包括SQL92和SQL99的大部分数据类型，包括integer，numberic，Boolean，char，varchar，date，interval，和timestamp，还支持二进制大对象，包括图形，声音和录像。就地理信息系统而言，她具有天然的支持能力，如他支持几何对象和几何抽象对象，如point，line，lseg(line segment)，box，path，polygon，circle，route等等，具体的使用方法，有待进一步学习，在以后的博文中进一步充实...... 具有天然支持以下开发语言的接口：C/C++，Java，.Net，Perl，Python，Ruby，Tcl，ODBC。在Postgresql的官方文档中，频繁地使用native这个词，我想这也许是出自他们对postgresql开放性和广泛支持的自信吧。 1 安装安装比较简单，按tutorial中的说明就可以解决了 2 配置安装好了之后，由于我的客户机和数据库服务器在不同的网段，所以必须进行必要的配置以下是配置的过程： /usr/local/pgsql/data/下有两个配置文件一为postgresql.conf（声明主服务器配置文件）首先修改这个文件，其中有段话为： #listen_addresses = 'localhost' # what IP address(es) to listen on; 意思是说数据库侦听的地址是localhost，这样如果你使用IP地址的话就不能有效地访问了，吧这里修改成数据库服务器的IP地址，并将前面的#去掉；另一为pg_hba.conf （声明用于主机为基础的认证配置文件的文件名）这个文件是设置远程访问认证机制的下面的内容是我在网上找的，写得挺好：文件上面是介绍文件如何设置的 # type database user cidr-address method 分别指：连接类型，可使用的数据库名，使用者，CIDR地址，验证方法 Type可选择的是local和host Database是连接用户的数据库名字，可以是Postgre的一个具体数据库名，也可以用all来允许用户访问所有的数据库 User可以指定某个具体的用户来连接postgre数据库的用户名，当然也可以用all来代表全部的用户名。 CIDR-address 是 IP 地址和掩码的另一种表示方法， postgres 是通过这个字段来了解，允许哪些 IP 或 IP 网段连接此服务器的。这个掩码和子网掩码是一个道理，只不过是用一个小于等于 32 的正数来表示，表示的正是子网掩码中高几位是 1 （这一段我也没搞得太清楚，）如 255.255.255.0 就是 24 ，说明高 24 位是 1 把子网掩码换成二进制 255.255.252.0 11111111.11111111.11111100.00000000 不就是 22 位吗 22 位表示网络 , 后面有 10 个 0 可以表示主机 IP,2 的 10 次方 =1024 个那 255.255.255.224 子网掩码 , CIDR 前缀是多少呢 , 能表示多少主机 ? 算 Method表示认证方法，即auth-method，有trust、reject、MD5(要求客户端提供一个MD5加密的口令进行认证)，crypt，password(未加密的口令进行认证)选择项设置好后，重启并装入数据库 #netstat nl 就可以看到监听程序监听的地址不是原来默认的127.0.0.1：5432了。

个人分类: Technology|5596 次阅读|1 个评论

Java连接PostgreSQL(1)

guodanhuai 2009-7-31 10:57

PostgreSQL 是一款及其优秀的开源的数据库软件，这可以从她系出名门它，就可以看出来。Java提供了数据库的多种连接方式，JDBC是其中的较适用的一种。下面是在下在最近的工作中学习到的如何在Java环境下连接Postgresql的心得。 JDBC-posgresql.Driver的API函数很多，JDBC.API.Turtorial.and.Reference 3rd Edition就有洋洋洒洒的1345页，看完它，呵呵估计下一个日全食都要快来了，^--^ 在学习软件中，看到自己的Hello World的小程序，比读一大堆的说明文档似乎更能激励自己。网上也有类似的介绍文章，看了之后，有些关键步骤没有说清楚，很打击积极性。下面就是第一个连接代码，so easy！ 1、下载最新的JDBC.postegresql.driver http://jdbc.postgresql.org/download.html#current 视不同的情况确定下载的版本，我下载的是postgresql-8.4-701.jdbc4.jar，下载后将其复制在您的工程文件夹，以便于管理。 2、建立一个Java工程，并建立一个Test.java的main启动程序文件，这在eclipse中是很简单的； 3、在工程属性中添加jdbc.postgresql.driver这一步，很重要方法project properties Java Build Path Libraries Add External JARs 4、在main中加入以下代码： public static void main(String[] args) { // TODO Auto-generated method stub System.out.print(welcome java); System.out.print( this is a test ); try { Class.forName( org.postgresql.Driver ).newInstance(); String url = jdbc:postgresql://IPAddress:port/cas_data ; Connection con = DriverManager.getConnection(url, username , password ); //注意这里用户名和密码前后都不能有空格，否则会报错 Statement st = con.createStatement(); String sql = select lon,lat from sc_status ; ResultSet rs = st.executeQuery(sql); while (rs.next()) { System.out.print(rs.getString( 1 )); System.out.println(rs.getString( 2 )); } rs.close(); st.close(); con.close(); } catch (Exception ee) { System.out.print(ee.getMessage()); } } 这只是一个很基本的连接，为了提高连接的速度和降低服务器的开销，可以设置连接池，这些在网上都有介绍，不是本文的重点。后续的文章，将介绍在其他环境中如何建立postgresql的连接。【Reference】 http://www.blogjava.net/wujun/archive/2006/06/30/55924.html http://jdbc.postgresql.org/documentation/docs.html JDBC API Tutorial and Reference, Third Edition

个人分类: Linux|11955 次阅读|0 个评论

帐号		自动登录	找回密码
密码			注册

关闭安全验证

标签: postgresql

相关帖子

相关日志

关闭 安全验证

标签: postgresql

相关帖子

相关日志

关闭安全验证