[1] Dai Guangrong (戴光荣), Song Yuchun (宋玉春). Applications of hash algorithms and semantic mapping in C-E sentential alignment [J]. Journal of Fujian University of Technology, 2014, 12(05): 454-458, 463. [doi:10.3969/j.issn.1672-4348.2014.05.009]

Application of Hash Algorithms and Semantic Mapping in Corpus Alignment

Journal of Fujian University of Technology [ISSN: 2097-3853 / CN: 35-1351/Z]

Volume:
Vol. 12
Issue:
No. 05, 2014
Pages:
454-458, 463
Section:
Literature and Linguistics
Publication date:
2014-11-10

Article Info

Title:
Applications of hash algorithms and semantic mapping in C-E sentential alignment
Author(s):
Dai Guangrong (戴光荣), Song Yuchun (宋玉春)
School of Humanities, Fujian University of Technology (福建工程学院人文学院)
Keywords:
hash algorithm; dictionary-based semantic mapping; alignment technology; parallel corpus
CLC number:
TP391
DOI:
10.3969/j.issn.1672-4348.2014.05.009
Document code:
A
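Summary:
This paper examines two key techniques in the design of Chinese-English sentence-level alignment software: the hash algorithm and dictionary-based semantic mapping. The hash algorithm lets the software quickly extract the required equivalent senses from the large body of English-Chinese entry information in a dictionary. Combined with semantic mapping, it allows the keywords of the sentences to be aligned to be identified semantically, which effectively improves Chinese-English sentence alignment.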
Abstract:
Automatic sentential alignment of Chinese and English texts is critically important for building Chinese-English parallel corpora. Alignment quality directly affects the reliability of related research such as machine translation, bilingual lexicography and contrastive language studies. This research applies two key technologies of alignment software design, namely the hash algorithm and semantic mapping, to the sentential alignment of Chinese and English texts. Hash algorithms enable high-speed retrieval of information from the knowledge database; combined with semantic mapping, they allow semantic recognition of the keywords of the sentences to be aligned, leading to effective E-C corpus alignment. This research may shed new light on Chinese/English sentential alignment.
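
The abstract describes the approach only in outline, so the following is a minimal sketch of the general idea rather than the authors' software. It assumes a toy English-Chinese lexicon held in a Python `dict` (a hash table, so each lookup takes constant time on average), and it treats "semantic mapping" as checking whether any dictionary equivalent of an English keyword occurs in a candidate Chinese sentence. The names `LEXICON`, `keywords`, `mapping_score` and `best_match`, and the lexicon entries themselves, are hypothetical.

```python
import re

# Hypothetical toy lexicon: English headword -> set of Chinese equivalents.
# A dict is a hash table, so lookups stay fast as the lexicon grows.
LEXICON = {
    "corpus": {"语料库"},
    "alignment": {"对齐"},
    "sentence": {"句子", "句"},
    "dictionary": {"词典", "字典"},
    "hash": {"哈希", "散列"},
}


def keywords(english_sentence):
    """Lowercase the sentence and keep only words that have a lexicon entry."""
    words = re.findall(r"[a-z]+", english_sentence.lower())
    return [w for w in words if w in LEXICON]


def mapping_score(english_sentence, chinese_sentence):
    """Fraction of English keywords with an equivalent in the Chinese sentence."""
    keys = keywords(english_sentence)
    if not keys:
        return 0.0
    hits = sum(
        1
        for w in keys
        # Substring test, because the Chinese text is not word-segmented here.
        if any(equiv in chinese_sentence for equiv in LEXICON[w])
    )
    return hits / len(keys)


def best_match(english_sentence, chinese_candidates):
    """Index of the highest-scoring candidate (first wins ties); list must be non-empty."""
    scores = [mapping_score(english_sentence, c) for c in chinese_candidates]
    return max(range(len(scores)), key=scores.__getitem__)


english = "The hash dictionary speeds up sentence alignment."
candidates = ["今天天气很好。", "哈希词典加快了句子对齐的速度。"]
print(best_match(english, candidates))
```

This sketch ignores inflection (for example, "sentences" would not match "sentence") and word segmentation. A practical aligner would also need to combine such lexical scores with other cues, such as sentence length or position.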

Last Update: 2014-10-25