Paper title
Information retrieval and structural complexity of legal trees
Paper authors
Paper abstract
We introduce a model for the retrieval of information hidden in legal texts. These are typically organised in a hierarchical (tree) structure, which a reader interested in a given provision needs to explore down to the "deepest" level (articles, clauses, ...). We assess the structural complexity of legal trees by computing the mean first-passage time a random reader takes to retrieve information planted in the leaves. The reader is assumed to skim through the content of a legal text based on their interests/keywords, and to be drawn towards the sought information based on keyword affinity, i.e. how well the chapter/section headers of the hierarchy seem to match the informational content of the leaves. Using randomly generated keyword patterns, we investigate the effect of two main features of the text, its horizontal and vertical coherence, on the search time, and consider ways to validate our results using real legal texts. We obtain numerical and analytical results. The latter are based on a mean-field approximation at the level of patterns, which leads to an explicit expression for the complexity of legal trees as a function of the structural parameters of the model. Policy implications of our results are briefly discussed.