Lexical Computing

Lexical Computing CZ s.r.o. is a Czech subsidiary of Lexical Computing Limited, a research company founded by Dr. Adam Kilgarriff in 2003. It works at the intersection of corpus and computational linguistics and is committed to an empiricist approach to the study of language, in which corpora play a central role: for a very wide range of linguistic questions, a suitable corpus, if one is available, will help our understanding. Its strapline is ‘corpora for all’. Its leading corpus query tool is the Sketch Engine (http://www.sketchengine.co.uk), which incorporates ‘word sketches’: one-page, corpus-driven summaries of a word’s grammatical and collocational behaviour. The lead users of the Sketch Engine have been dictionary publishers, and it is in day-to-day use for lexicography at Oxford University Press, Cambridge University Press, Collins, Macmillan, Cornelsen and the Instituut voor Nederlandse Lexicologie (INL, Institute of Dutch Lexicology), among others.

To provide corpus services, Lexical Computing needs corpora. As of December 2014, the company has large corpora for over 70 languages. (‘Large’ means over 20 million words; in most cases the corpora exceed 100 million words, and for the top 20 languages they exceed 1 billion words.) For the most part, these corpora are collected from the web (Lexical Computing is a leading player in the ‘web as corpus’ initiative) and have involved collaboration with experts in the languages in question.

Main location: Brno, Czech Republic