Installing Sphinx + PostgreSQL + Chinese Word Segmentation
I. Required files
mmseg-0.7.3.tar.gz: Chinese word segmentation library
http://www.coreseek.com/uploads/sources/mmseg-0.7.3.tar.gz
sphinx-0.9.8-rc2.tar.gz: Sphinx 0.9.8-rc2 source code
http://www.sphinxsearch.com/downloads/sphinx-0.9.8-rc2.tar.gz
sphinx-0.98rc2.zhcn-support.patch: patch that adds word segmentation support to Sphinx
http://www.coreseek.com/uploads/sources/sphinx-0.98rc2.zhcn-support.patch
fix-crash-in-excerpts.patch: patch that prevents a crash in excerpts when segmentation is enabled
http://www.coreseek.com/uploads/sources/fix-crash-in-excerpts.patch
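The four downloads can be scripted. The following is only a sketch: the URLs are copied from the list above and may no longer resolve, and the function name fetch_all and the DRY_RUN variable are made up for this example.

```shell
#!/bin/sh
# URLs copied from the file list in section I; availability is not guaranteed.
URLS="http://www.coreseek.com/uploads/sources/mmseg-0.7.3.tar.gz
http://www.sphinxsearch.com/downloads/sphinx-0.9.8-rc2.tar.gz
http://www.coreseek.com/uploads/sources/sphinx-0.98rc2.zhcn-support.patch
http://www.coreseek.com/uploads/sources/fix-crash-in-excerpts.patch"

# fetch_all DIR (hypothetical helper): download each file into DIR unless a
# file of that name is already there. With DRY_RUN=1 it only prints its plan.
fetch_all() {
    dest=$1
    mkdir -p "$dest" || return 1
    for url in $URLS; do
        file=$dest/$(basename "$url")
        if [ -f "$file" ]; then
            echo "have $file"
        elif [ "${DRY_RUN:-0}" = 1 ]; then
            echo "would fetch $url"
        else
            # Remove a partial file on failure so a rerun retries the download.
            wget -O "$file" "$url" || { rm -f "$file"; return 1; }
        fi
    done
}
```

Setting DRY_RUN=1 and calling fetch_all with a target directory lists what would be downloaded; without DRY_RUN the same call performs the downloads.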
II. Installation
1. Install libmmseg
- tar -zxvf mmseg-0.7.3.tar.gz
- cd mmseg-0.7.3
- ./configure --prefix=/usr/local/mmseg
- make
- make install
- cd ..
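mmseg is now installed. Run it without arguments to check that it prints its usage text (with the prefix above, the binary should be under /usr/local/mmseg/bin unless that directory is already on your PATH):
- /usr/local/mmseg/bin/mmseg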
- Coreseek COS(tm) MM Segment 1.0
- Copyright By Coreseek.com All Right Reserved.
- Usage: mmseg <option> <file>
- -u <unidict> Unigram Dictionary
- -r Combine with -u, used a plain text build Unigram Dictionary, default Off
- -b <Synonyms> Synonyms Dictionary
- -h print this help and exit
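If the mmseg command is not found, the binary is probably not on PATH. A small sketch for locating it follows, assuming the usual autotools layout in which make install places programs under PREFIX/bin. The helper name find_mmseg is made up for this example.

```shell
#!/bin/sh
# find_mmseg [PREFIX] (hypothetical helper): print the path of the mmseg
# binary. It looks under PREFIX/bin first (an assumption about the install
# layout), then falls back to whatever is on PATH.
find_mmseg() {
    prefix=${1:-/usr/local/mmseg}
    if [ -x "$prefix/bin/mmseg" ]; then
        echo "$prefix/bin/mmseg"
    elif command -v mmseg >/dev/null 2>&1; then
        command -v mmseg
    else
        echo "mmseg not found under $prefix/bin or on PATH" >&2
        return 1
    fi
}
```

Running "$(find_mmseg)" with no arguments should then print the usage text shown above.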
2. Apply the Sphinx patches
Apply two patches before building Sphinx; both are required for Chinese support.
- tar -zxvf sphinx-0.9.8-rc2.tar.gz
- cd sphinx-0.9.8-rc2
# Download the Chinese support patch
wget http://cloud.github.com/downloads/cogentsoft/zbs/sphinx-0.98rc2.zhcn-support.patch
patch -p1 < sphinx-0.98rc2.zhcn-support.patch
# Download the crash-fix patch
wget http://cloud.github.com/downloads/cogentsoft/zbs/fix-crash-in-excerpts.patch
patch -p1 < fix-crash-in-excerpts.patch
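A patch that fails halfway leaves the source tree in a mixed state, so it can help to test each patch first. The sketch below assumes GNU patch, whose --dry-run and --forward options it relies on. The helper name apply_patch is made up for this example.

```shell
#!/bin/sh
# apply_patch FILE (hypothetical helper): apply FILE with -p1 in the current
# directory, but only after a dry run succeeds.
# --dry-run reports what would happen without changing any file.
# --forward makes patch skip, rather than ask about, a patch that looks
# reversed or already applied.
apply_patch() {
    p=$1
    if patch -p1 --forward --dry-run < "$p" >/dev/null 2>&1; then
        patch -p1 --forward < "$p"
    else
        echo "$p does not apply cleanly; source tree left untouched" >&2
        return 1
    fi
}
```

From inside the Sphinx source directory, the two patches would be applied in the same order as above: apply_patch sphinx-0.98rc2.zhcn-support.patch, then apply_patch fix-crash-in-excerpts.patch.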
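3. Check that the PostgreSQL development package is installed:
1) Check the Linux distribution and version: lsb_release -a
2) On RHEL 5, if there is no pgsql directory under /usr/include/, the PostgreSQL development package is not installed, and Sphinx will not find the header and library files it needs. Install the matching development package, for example postgresql-devel-8.1.4-1.1.i386.rpm (for RHEL 5).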
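The directory check described above can be written as a short script. This is a sketch that tests only for the pgsql directory, as the text does; the helper name has_pgsql_dev is made up for this example.

```shell
#!/bin/sh
# has_pgsql_dev [INCLUDE_ROOT] (hypothetical helper): succeed if a pgsql
# directory exists under INCLUDE_ROOT (default /usr/include).
has_pgsql_dev() {
    root=${1:-/usr/include}
    [ -d "$root/pgsql" ]
}

if has_pgsql_dev; then
    echo "found /usr/include/pgsql"
else
    echo "no /usr/include/pgsql; install the PostgreSQL development package first"
fi
```

On an RPM-based system, rpm -q postgresql-devel is another way to see whether the package is installed.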
4. Install Sphinx
- cd /root/lemp/sphinx-0.9.8-rc2
- ./configure --prefix=/usr/local/sphinx --without-mysql --with-pgsql \
--with-pgsql-includes=/usr/include/pgsql/ --with-pgsql-libs=/usr/lib/pgsql/ \
--with-mmseg-includes=/usr/local/mmseg/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg/lib/ --with-mmseg
- make
- make install
5. Edit pg_hba.conf
Add the following line to pg_hba.conf:
host all all 127.0.0.1/32 trust
Otherwise indexer --all fails with: ERROR: index 'test1': sql_connect: FATAL: no pg_hba.conf entry for host "127.0.0.1", user "foo", database "test", SSL off
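To avoid adding the rule twice, the edit can be scripted. The sketch below takes the path to pg_hba.conf as an argument, because its location depends on the installation. The helper name allow_local_trust is made up for this example.

```shell
#!/bin/sh
# allow_local_trust FILE (hypothetical helper): append the trust rule to FILE
# unless an identical line is already present. The match is exact, so a rule
# written with different spacing is not recognised as a duplicate.
allow_local_trust() {
    hba=$1
    rule='host all all 127.0.0.1/32 trust'
    if grep -qxF "$rule" "$hba"; then
        echo "rule already present in $hba"
    else
        printf '%s\n' "$rule" >> "$hba"
        echo "rule appended to $hba"
    fi
}
```

PostgreSQL uses the first record in pg_hba.conf that matches a connection, so a rule appended at the end has no effect if an earlier line already matches 127.0.0.1. The server must also reload its configuration before the change applies. Note that trust lets any local user connect over 127.0.0.1 without a password.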
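6. Test the installation by following the Sphinx test procedure in the reference "sphinx+mysql+中文分词" (item 4 in the references):
1) Architecture design for full-text search (search engine) over tens of millions of records with Sphinx + MySQL: http://blog.s135.com/post/360/
4) Sphinx + MySQL + Chinese word segmentation: http://www.bsdlover.cn/html/83/n-3283.html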
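The referenced test procedure is written for MySQL, so the source section of sphinx.conf has to be changed to PostgreSQL. The sketch below writes a minimal configuration. The index name test1, user foo and database test come from the error message in step 5. The table documents, its columns, and the index path are assumptions for illustration. The segmentation options added by the Chinese patch are not listed in this article, so they are left out here.

```shell
#!/bin/sh
# write_test_conf FILE (hypothetical helper): write a minimal sphinx.conf
# with one PostgreSQL source and one index.
write_test_conf() {
    cat > "$1" <<'EOF'
source src1
{
    type      = pgsql
    sql_host  = 127.0.0.1
    sql_user  = foo
    sql_pass  =
    sql_db    = test
    # Hypothetical table. The first column must be a unique positive integer document ID.
    sql_query = SELECT id, title, content FROM documents
}

index test1
{
    source = src1
    # Assumed location under the install prefix; adjust to an existing directory.
    path   = /usr/local/sphinx/var/data/test1
}
EOF
}
```

After writing the file with write_test_conf /usr/local/sphinx/etc/sphinx.conf, the index would be built with /usr/local/sphinx/bin/indexer --config /usr/local/sphinx/etc/sphinx.conf --all.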
About the author:
Name: 商云方. Profile: Consultant, HAND Zhangjiang Technology Center. Contact: yunfang.shang@hand-china.com. Permanent link to this article: http://blog.retailsolution.cn/archives/2362
Comments on this article:
1) Installing Sphinx with PostgreSQL support
http://www.postneo.com/2009/02/06/sphinx-search-with-postgresql