Leximetric data coding techniques aim to measure cross-national and inter-temporal variations in the content of legal rules, thereby facilitating statistical analysis of legal systems and their social and economic impacts. In this paper we explain how leximetric methods were used to create the CBR Labour Regulation Index (CBR-LRI), an index and related dataset of labour laws from around the world spanning the period from 1970 to 2013. Datasets of this kind must, we suggest, observe certain conventions of transparency and validity if they are to be usable in statistical analysis. The theoretical framework informing the construction of the dataset, and the types of questions which it is designed to answer, should be made explicit. The choices involved in the selection of indicators, the definition of coding algorithms, and the aggregation and weighting of data to create composite measures must then be spelled out. In addition, primary legal sources should be referenced, and it should be clear how they were used to generate reported values. With these points in mind we provide an overview of the CBR-LRI dataset’s main features and structure, discuss issues of weighting, and present some initial findings on what it reveals about global trends in labour regulation.