Sunday, April 18, 2010

Spurious corpus failures in IMS Corpus Workbench

At the moment I'm working with Open CWB, also known as the IMS Corpus Workbench. For the last couple of days I have been getting a lot of corpus errors, seemingly for no reason:

[no corpus]> E
Warning:
Data access error (CL: can't load and/or create necessary data)
Perhaps the corpus E is not accessible from the machine you are using.
CQP Error:
Corpus ``E'' is undefined

Looking at the corpus registry, it turned out that cwb-encode had stored the corpus data and info locations as relative paths. Consequently, the corpus only worked when cqp was invoked from the same directory the corpus had been encoded from.

So if you give relative paths when running cwb-encode, make sure to fix up the HOME and INFO lines in the registry file afterwards so that they hold absolute paths!
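
If you have several corpora to repair, the fix-up can be scripted. The following is only a rough sketch in Python, not anything that ships with CWB: the function name absolutize_registry and the base_dir argument (the directory cwb-encode was run from) are my own invention, and it assumes that each HOME or INFO line holds a single unquoted path. Lines with a quoted path are left alone.

```python
import os
import re

# Matches registry lines such as "HOME data/e" or "INFO data/e/.info".
_PATH_LINE = re.compile(r"^(HOME|INFO)(\s+)(\S.*?)\s*$")


def absolutize_registry(text, base_dir):
    """Return registry text with relative HOME/INFO paths made absolute.

    base_dir is assumed to be the directory cwb-encode was run from,
    i.e. the directory the relative paths were relative to.
    """
    fixed = []
    for line in text.splitlines():
        match = _PATH_LINE.match(line)
        if match:
            key, gap, path = match.groups()
            # Assumption: quoted paths are left untouched rather than guessed at.
            if not path.startswith('"') and not os.path.isabs(path):
                path = os.path.normpath(os.path.join(base_dir, path))
                line = f"{key}{gap}{path}"
        fixed.append(line)
    return "\n".join(fixed) + "\n"
```

To use it, read the registry file for your corpus, pass its contents through absolutize_registry together with the directory you encoded from, and write the result back. Keep a backup of the original registry file first.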


-a