Matteo Pagani 04/14/2026 Blog


Why in-house legal needs to start optimizing data

It’s now more important than ever that legal departments optimize the data they handle. Here’s why, and how to go about it.

Why: optimized data empowers the legal team to make a step change in the value it delivers to the business.

How: through a process that matures your data from unstructured to structured and optimized, by way of categorization, standardization and automation. The steps are outlined below. But first, some definitions.

Unstructured vs structured data

Broadly speaking, unstructured data is just that: there’s no structure around it, and nothing is classified, labelled or grouped. It’s hard to search, compare and analyze, it lacks necessary context, and it’s all but impossible to automate.

And of course, unstructured data is everywhere that there’s ‘free text’: in documents, in contracts, in emails, in one-off LLM chats.

Structured data, by contrast, is data organized into an agreed format with agreed rules. That format and its rules are referred to as a ‘schema’. Once data is structured it becomes accessible, and thus far more valuable to the legal team and the business.

From unstructured to categorized

How does unstructured data become structured? The short answer is by replacing a document-centric system with a work-centric one.

In a document-centric system, you’re storing documents in repositories, but with nothing linking each document to, for example, a specific matter, or to the business’s structure or priorities.

Work-centric systems, meanwhile, start off by grouping data into defined categories, each of which is a unit of work. Contract documents, values, risk bracket and jurisdiction all attach to this unit of work. Examples of units of work include Projects, Matters, Cases and Employee Files. This tagging can be done manually but is increasingly done by LLMs. You’ve now created semi-structured data.

You can now link work to different teams with different priorities, providing a level of governance not possible in a document-centric system. You’re also gaining the capacity to search and filter the data, which enables some basic reporting.
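
To make the idea of a unit of work concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Matter` and `Document` classes, the field names and the sample values are hypothetical and do not come from any particular system. It simply shows documents and attributes attached to a matter, and a basic filter over those attributes.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str


@dataclass
class Matter:
    """One unit of work. All field names here are illustrative assumptions."""
    name: str
    team: str
    jurisdiction: str
    risk_bracket: str  # e.g. "low", "medium" or "high"
    contract_value: float
    documents: list[Document] = field(default_factory=list)


def filter_matters(matters, **criteria):
    """Return the matters whose attributes equal every given criterion."""
    return [
        m for m in matters
        if all(getattr(m, key) == value for key, value in criteria.items())
    ]


# Made-up sample data: three matters, one with a document attached.
matters = [
    Matter("Supplier agreement", "Commercial", "South Africa", "high", 250_000.0,
           [Document("Master services agreement.docx")]),
    Matter("Office lease", "Real Estate", "South Africa", "low", 40_000.0),
    Matter("Reseller agreement", "Commercial", "Italy", "medium", 90_000.0),
]

high_risk = filter_matters(matters, risk_bracket="high")
commercial = filter_matters(matters, team="Commercial")

print([m.name for m in high_risk])
print([m.name for m in commercial])
```

Because every document hangs off a matter, and every matter carries a team, a jurisdiction and a risk bracket, questions like "which high-risk matters does the Commercial team own?" become a simple filter rather than a trawl through folders.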

Standardization for consistency

Standardization comes next: it imposes consistency on the data. This is where data becomes truly structured and measurable.

Standardization is the process of aligning formats, definitions and values – for instance, imposing a universal date format, e.g. MM-DD-YYYY; and deciding on a single value that represents, for example, where I’m from. Is it ‘South Africa’, ‘S Africa’, ‘S. Africa’, ‘SA’, ‘RSA’, ‘ZA’ or something else?
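
As a rough sketch of what that looks like in practice, the Python below maps the country variants above to one agreed value and rewrites dates into MM-DD-YYYY. The alias table, the list of accepted input date formats and the function names are all assumptions made for this example; a real data set would need its own.

```python
from datetime import datetime

# Hypothetical alias table: every known variant maps to one agreed value.
COUNTRY_ALIASES = {
    "south africa": "South Africa",
    "s africa": "South Africa",
    "s. africa": "South Africa",
    "sa": "South Africa",
    "rsa": "South Africa",
    "za": "South Africa",
}

# Input formats assumed to occur in the raw data (an illustrative choice).
KNOWN_DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y")


def standardize_country(raw: str) -> str:
    """Return the agreed country value, or raise if the variant is unknown."""
    key = " ".join(raw.strip().lower().split())
    if key not in COUNTRY_ALIASES:
        raise ValueError(f"Unrecognized country value: {raw!r}")
    return COUNTRY_ALIASES[key]


def standardize_date(raw: str) -> str:
    """Parse a date in any known format and return it as MM-DD-YYYY."""
    for fmt in KNOWN_DATE_FORMATS:
        try:
            parsed = datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue  # not this format, try the next one
        return parsed.strftime("%m-%d-%Y")
    raise ValueError(f"Unrecognized date value: {raw!r}")


print(standardize_country(" RSA "))
print(standardize_country("S. Africa"))
print(standardize_date("2026-04-14"))
print(standardize_date("14.04.2026"))
```

Unknown values raise an error rather than passing through silently, on the assumption that a human should decide how each new variant is mapped.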

Standardization also entails resolving duplicate or conflicting information and introducing rules around things like how data sit together, what counts as a valid range and which fields are required.
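
Rules of this kind (required fields, valid ranges, duplicate or conflicting records) can be expressed as simple checks. The Python below is a minimal sketch under stated assumptions: the field names, the value bounds and the sample records are all invented for illustration.

```python
# Assumed rules, for illustration only.
REQUIRED_FIELDS = ("matter_id", "jurisdiction", "contract_value")
VALUE_RANGE = (0, 1_000_000_000)  # assumed lower and upper bounds


def validate(record: dict) -> list[str]:
    """Return a list of rule violations for one record (empty if none)."""
    problems = []
    for name in REQUIRED_FIELDS:
        if record.get(name) in (None, ""):
            problems.append(f"missing required field: {name}")
    value = record.get("contract_value")
    if isinstance(value, (int, float)):
        low, high = VALUE_RANGE
        if not low <= value <= high:
            problems.append(f"contract_value out of range: {value}")
    return problems


def find_conflicts(records: list[dict]) -> list[str]:
    """Return matter ids that appear more than once with differing content."""
    first_seen = {}
    conflicts = set()
    for record in records:
        key = record.get("matter_id")
        if key is None:
            continue  # a missing id is reported by validate(), not here
        if key in first_seen and first_seen[key] != record:
            conflicts.add(key)
        first_seen.setdefault(key, record)
    return sorted(conflicts)


records = [
    {"matter_id": "M-001", "jurisdiction": "South Africa", "contract_value": 250000},
    {"matter_id": "M-001", "jurisdiction": "Italy", "contract_value": 250000},
    {"matter_id": "M-002", "jurisdiction": "", "contract_value": -5},
]

for record in records:
    print(record["matter_id"], validate(record))
print(find_conflicts(records))
```

The checks only flag problems. Deciding which of two conflicting records is correct is left to a person or to a separate, explicit rule.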

Automation for speed

Now that you’ve got structured data, you can start automating tasks, while at the same time making sure there’s still a role for humans to review automations when it matters.

You can introduce clause libraries by matter and work type. You can introduce playbooks that define actions that will, for example, consistently handle document drafting; produce faster answers to recurring questions; or review large volumes of documents quickly. The impact can be nothing short of transformative.

Optimized for good

The final stop on the data maturity journey is optimization, which is the part where you and AI are running reports and using the data generated to, for example, strip out process bottlenecks and reduce document cycle times.
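
As one illustration of such a report, the Python sketch below computes average cycle time per work type from opened and closed dates and picks out the slowest type. The sample records and the function name are made up for the example, and a real report would draw on whatever fields your own schema defines.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Illustrative records: (work type, date opened, date closed).
matters = [
    ("NDA", date(2026, 1, 5), date(2026, 1, 8)),
    ("NDA", date(2026, 1, 12), date(2026, 1, 17)),
    ("Supplier agreement", date(2026, 1, 6), date(2026, 1, 26)),
    ("Supplier agreement", date(2026, 2, 2), date(2026, 2, 12)),
]


def average_cycle_days(rows):
    """Return the mean number of days from opened to closed, per work type."""
    durations = defaultdict(list)
    for work_type, opened, closed in rows:
        durations[work_type].append((closed - opened).days)
    return {work_type: mean(days) for work_type, days in durations.items()}


report = average_cycle_days(matters)
slowest = max(report, key=report.get)

for work_type, days in report.items():
    print(f"{work_type}: {days} days on average")
print(f"Slowest work type: {slowest}")
```

A report like this only works because the earlier steps gave every matter a work type and consistently formatted dates.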

By this point, structured data is enabling the legal department to be far more productive, timely and accurate. It’s reducing risk and cost, and your people get to shift their efforts from administrative tasks to value-adding activity.

The other big advantage of structured data lies in how it connects the work and output of the legal department to the organization’s wider context, concerns, strategies and targets – aka Operational Intelligence. It can ensure that Legal is able to prioritize what matters to the business, and that in itself is worth a great deal.

 

