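A Collection of Terminology Databases and Corpora
In translation practice, we often run into words and expressions that no dictionary covers. Terminology databases and corpora can help in those cases.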
Keywords to Understand China:
http://www.china.org.cn/chinese/china_key_words/
Standardized Term Base for Translating Discourse with Chinese Characteristics:
http://210.72.20.108/index/index.jsp
China Core Vocabulary:
https://www.cnkeywords.net/index
Key Concepts in Chinese Thought and Culture:
https://www.chinesethought.cn/TermBase.aspx
United Nations terminology database (UNTERM):
https://unterm.un.org/UNTERM/portal/welcome
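TermOnline:
https://www.termonline.cn/index
National Academy for Educational Research (NAER) terminology database:
https://terms.naer.edu.tw/download/
Chinese-English Dictionary of Ming Government Official Titles:
https://escholarship.org/uc/uci_libs
Chinese Standardized Terms:
https://shuyu.cnki.net/#/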
Grand Dictionnaire Terminologique:
https://gdt.oqlf.gouv.qc.ca/
TERMIUM:
https://www.btb.termiumplus.gc.ca/tpv2alpha/alpha-eng.html?lang=eng
LingoSail TermBox:
http://termbox.lingosail.com/
Microsoft terminology database:
https://www.microsoft.com/zh-cn/language
World Health Organization (WHO) terminology database:
https://www.who.int/home/cms-decommissioning
Electronic engineering glossary:
https://www.maximintegrated.com/cn/glossary/definitions.mvp/terms/all
FreeMdict, 100 GB of offline dictionaries for download:
https://downloads.freemdict.com/
OneDict (patent terminology database):
http://www.onedict.com/
National standard "Logistics Terms":
https://logistics.nankai.edu.cn/_upload/article/76/83/1c5da71e4b8e9838ae0843c8cb3d/3a1617ed-acfb-4504-9e18-c079e98e6154.pdf
Winter Olympics terminology lookup site:
owgt.lingosail.com/
Music terminology lookup:
http://dictionary.t-classical.com/
European Union Language and terminology:
https://eur-lex.europa.eu/summary/glossary.html?locale=en
IATE (Interactive Terminology for Europe) EU’s terminology database:
https://iate.europa.eu/home
Hong Kong legal glossary (Chinese-English):
https://www.elegislation.gov.hk/glossary/chi
Magic Search:
https://magicsearch.org/
Linguee:
https://www.linguee.com/
The Free Dictionary:
https://www.thefreedictionary.com/
Glosbe:
https://glosbe.com/
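Within China
BCC Corpus:
http://bcc.blcu.edu.cn/
Corpus Online:
http://www.aihanyu.org/cncorpus/index.aspx
Center for Chinese Linguistics, Peking University (CCL):
ccl.pku.edu.cn
BFSU Corpus Linguistics (Beijing Foreign Studies University):
bfsu-corpus.org/
Balanced Corpus of Modern Chinese:
https://www.sinica.edu.tw/SinicaCorpus/
Classical Chinese corpus / tagged corpus of early Mandarin Chinese / electronic Chinese classics:
https://www.sinica.edu.tw/ch
Treebank database:
http://treebank.sinica.edu.tw/
Sou Wen Jie Zi (搜文解字):
http://words.sinica.edu.tw/
Media Language Corpus (MLC):
https://ling.cuc.edu.cn/RawPub/
Shared corpus resources from the Harbin Institute of Technology (HIT) Information Retrieval Lab:
http://ir.hit.edu.cn/demo/ltp/Sharing_Plan.htm
Synchronous corpus of Chinese across pan-Chinese regions (LiVaC):
http://www.livac.org/index.php?lang=sc
Chinese Linguistic Data Consortium:
http://www.chineseldc.org/
Academia Sinica tagged corpus of early Mandarin Chinese:
http://lingcorpus.iis.sinica.edu.tw/early/
Chinese-English parallel corpus of Dream of the Red Chamber (Hongloumeng):
http://corpus.usx.edu.cn/hongloumeng/images/shiyongshuoming.htm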
Outside China
BNC (British National Corpus):
http://www.natcorp.ox.ac.uk/
BOE (the Bank of English, the Collins English corpus):
http://www.collinslanguage.com/language-resources/dictionary-datasets/
ANC (American National Corpus):
https://www.anc.org/
Lancaster Corpus of Mandarin Chinese (LCMC):
http://ota.oucs.ox.ac.uk/scripts/download.php?otaid=2474
Sketch Engine multilingual corpora:
https://www.sketchengine.eu/
BASE (British Academic Spoken English Corpus):
https://warwick.ac.uk/fac/soc/al-archive-deleted/research/base
Lextutor:
http://www.lextutor.ca/
My Memory:
https://mymemory.translated.net/
TAUS:
https://datamarketplace.taus.net/
TTMEM:
https://www.ttmem.com/terminology/download-translation-memory/
TinyTM:
http://tinytm.sourceforge.net/
DGT Translation Memory:
https://magmatranslation.com/en/free-translation-memory/
European Parliament Proceedings Parallel Corpus 1996-2011:
https://statmt.org/europarl/
University of Maryland Parallel Corpus Project: The Bible:
http://users.umiacs.umd.edu/~resnik/parallel/bible.html
Aligned Hansards of the 36th Parliament of Canada:
https://www.isi.edu/research_groups/nlg/home
EU Publication Offices:
https://op.europa.eu/en/web/general-publications/publications
Wikimedia Downloads:
https://dumps.wikimedia.org/backup-index.html
United Nations Parallel Corpus:
https://conferences.unite.un.org/UNCorpus/
European language pairs:
https://www.statmt.org/wmt13/translation-task.html#download
parallel corpus search:
http://paralela.clarin-pl.eu/
UM-Corpus: A Large English-Chinese Parallel Corpus (Natural Language Processing & Portuguese-Chinese Machine Translation Laboratory):
http://nlp2ct.cis.umac.mo/um-corpus/um-corpus-license.html
Clarin Parallel corpora:
https://www.clarin.eu/resource-families/parallel-corpora
The PKU 863 Chinese-English Parallel Corpus:
https://www.lancaster.ac.uk/fass/projects/corpus/863parallel/
BYU corpora:
https://corpus.byu.edu/
A collection of translated literature:
https://opus.nlpl.eu/Books.php
A collection of EU Translation Memories provided by the JRC:
https://opus.nlpl.eu/DGT.php
Documents from the Catalan Government:
https://opus.nlpl.eu/DOGC.php
European Central Bank corpus:
https://opus.nlpl.eu/ECB.php
European Medicines Agency documents:
https://opus.nlpl.eu/EMEA.php
The EU bookshop corpus:
https://opus.nlpl.eu/EUbookshop.php
The European constitution/European Parliament Proceedings:
https://opus.nlpl.eu/EUconst.php
French-English Giga-Word Corpus:
https://opus.nlpl.eu/giga-fren.php
GNOME localization files:
https://opus.nlpl.eu/GNOME.php
News stories in various languages:
https://opus.nlpl.eu/GlobalVoices.php
English WaC corpus:
https://opus.nlpl.eu/hrenWaC.php
JRC-Acquis- legislative EU texts:
https://opus.nlpl.eu/JRC-Acquis.php
KDE4 – KDE4 localization files (v.2):
https://opus.nlpl.eu/KDE4.php
KDEdoc – the KDE manual corpus:
https://opus.nlpl.eu/KDEdoc.php
MBS – Belgisch Staatsblad corpus:
https://opus.nlpl.eu/MBS.php
memat – Xhosa/English parallel data:
https://opus.nlpl.eu/memat.php
MontenegrinSubs – Montenegrin movie subtitles:
https://opus.nlpl.eu/MontenegrinSubs.php
MultiUN – Translated UN documents:
https://opus.nlpl.eu/MultiUN.php
News Commentary, v9.0, v9.1:
https://opus.nlpl.eu/News-Commentary-v11.php
OfisPublik – Breton – French parallel texts:
https://opus.nlpl.eu/OfisPublik.php
OO – the OpenOffice.org corpus:
https://opus.nlpl.eu/OpenOffice-v2.php
OpenOffice.org 3 corpus:
https://opus.nlpl.eu/OpenOffice-v3.php
OpenSubtitles – the opensubtitles.org corpus:
https://opus.nlpl.eu/OpenSubtitles-v1.php
OpenSubtitles2016 – snapshot from 2016:
https://opus.nlpl.eu/OpenSubtitles-v2016.php
OpenSubtitles2018 – new complete version:
http://opus.nlpl.eu/OpenSubtitles-v2018.php
ParaCrawl corpus:
https://opus.nlpl.eu/ParaCrawl.php
ParCor – A Parallel Pronoun-Coreference Corpus/PHP – the PHP manual corpus:
http://opus.nlpl.eu/ParCor
Regeringsförklaringen – a tiny example corpus:
http://opus.nlpl.eu/RF.php
SETIMES – A parallel corpus of the Balkan languages:
http://opus.nlpl.eu/SETIMES.php
SPC – Stockholm Parallel Corpora:
https://opus.nlpl.eu/SPC.php
Tatoeba – A DB of translated sentences:
http://opus.nlpl.eu/Tatoeba.php
TedTalks hr-en:
http://opus.nlpl.eu/TedTalks.php
TED Talks 2013:
http://opus.nlpl.eu/TED2013.php
Tanzil – A collection of Quran translations:
http://opus.nlpl.eu/Tanzil.php
TEP – The Tehran English-Persian subtitle corpus:
http://opus.nlpl.eu/TEP.php
Ubuntu – Ubuntu localization files:
http://opus.nlpl.eu/Ubuntu.php
UN – Translated UN documents:
http://opus.nlpl.eu/UN.php
Wikipedia – translated sentences from Wikipedia:
http://opus.nlpl.eu/Wikipedia.php
WikiSource – (small en-sv sample only):
http://opus.nlpl.eu/WikiSource.php
WMT News Test Sets:
http://opus.nlpl.eu/WMT-News.php
The Xhosa – English Navy corpus:
http://opus.nlpl.eu/XhosaNavy.php