How EntrD Optimized Its Data Extraction Process with an LLM

Learn how EntrD saves time and extracts multiple types of data simultaneously and more accurately, enhancing its data anonymization process.

EntrD is an IT company specializing in data security and privacy solutions, particularly in the technology sector. Their expertise is in developing software that maintains the balance of data representativeness and reliability while achieving compliance with privacy standards.
Employees
11-50

About EntrD

Founded in 2014 with its headquarters in Heerenveen, Friesland, EntrD is an IT company specializing in data security and privacy solutions, particularly in the technology sector. 
 
EntrD’s expertise is in developing software that maintains the balance of data representativeness and reliability while achieving compliance with privacy standards.

EntrD’s Software Solutions: FileFactory & DataFactory

FileFactory is EntrD’s comprehensive Software as a Service (SaaS) platform designed to revolutionize the control organizations have over their documents and files. It simplifies enterprise data management by enhancing document searchability, enabling the masking or deletion of sensitive content, and automating proper document classification. These features make FileFactory an essential tool for organizations requiring efficient handling of digital documents, improving tasks like blurring text in documents, managing digital archives, and responding to data requests by eliminating superfluous information.
 
DataFactory complements FileFactory as an automated data masking tool that quickly and securely anonymizes personal data within databases or applications. This solution specifically addresses the need for GDPR compliance and protects privacy by concealing sensitive data, such as personal customer details that might otherwise be exposed in a data breach. Its primary use is in creating secure, non-reversible test datasets for training, research, and analytical purposes.

The Importance of Data Masking and Anonymization

Data masking plays a crucial role in transforming sensitive data into a format that’s usable for testing and analysis without compromising privacy. This methodology ensures that sensitive data is shielded from potential unauthorized access while retaining its integrity for practical applications. 

The key features of data masking include: 

  • Protection of Sensitive Data: By replacing real data with fictitious but realistic data, it ensures that sensitive information is not exposed. 
  • Usability: The masked data remains functional and can be used for development, testing, and training purposes without compromising security. 
  • Compliance: Helps organizations comply with data protection regulations by ensuring that sensitive data is not used inappropriately. 
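To make the idea of replacing real data with fictitious but realistic data concrete, here is a minimal Python sketch. It is an illustration only and not a description of how DataFactory works: the field names, the replacement name pools, the `salt` parameter, and the `mask_record` helper are all assumptions made for this example. Non-sensitive fields are left untouched so the record stays usable for testing.

```python
import hashlib

# Hypothetical pools of fictitious replacement values (assumptions for this sketch).
FAKE_FIRST_NAMES = ["Anna", "Bram", "Carla", "Daan", "Eva", "Finn"]
FAKE_LAST_NAMES = ["Jansen", "de Vries", "Bakker", "Visser", "Smit", "Meijer"]


def _pick(pool, value, salt):
    """Deterministically choose a replacement from pool for the given value."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).digest()
    return pool[int.from_bytes(digest[:4], "big") % len(pool)]


def mask_record(record, salt="demo-salt"):
    """Return a copy of record with personal fields replaced by fictitious ones."""
    masked = dict(record)
    masked["first_name"] = _pick(FAKE_FIRST_NAMES, record["first_name"], salt)
    masked["last_name"] = _pick(FAKE_LAST_NAMES, record["last_name"], salt)
    # Build a plausible email from the fictitious name so the field stays usable.
    local = f"{masked['first_name']}.{masked['last_name']}".lower().replace(" ", "")
    masked["email"] = f"{local}@example.com"
    return masked


customer = {"id": 42, "first_name": "Pieter", "last_name": "Hoekstra",
            "email": "p.hoekstra@realmail.nl", "order_total": 129.95}
masked_customer = mask_record(customer)
```

Because the replacement is derived from a salted hash, the same input always maps to the same fictitious value, which keeps related rows consistent across a test dataset. A production tool would need much larger value pools and a secret salt; this sketch does not attempt that.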

Document masking

Document masking is the practice of obscuring confidential information within documents to protect individual privacy and minimize risks such as data breaches. 

This technique involves replacing sensitive details with fictitious or placeholder data, ensuring compliance with privacy regulations and maintaining the overall structure and usefulness of the documents for collaboration and analysis purposes.  
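As a rough illustration of placeholder-based document masking, the Python sketch below replaces two kinds of sensitive details with typed placeholders while leaving the rest of the sentence intact. The regular expressions, the placeholder labels, and the `mask_document` helper are assumptions chosen for this example; they are deliberately simplistic and are not FileFactory's detection logic.

```python
import re

# Illustrative patterns only (assumptions); real documents need far broader detection.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{10,30}\b"),
}


def mask_document(text):
    """Replace each match with a bracketed placeholder naming its type."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Contact j.doe@example.com about account NL91ABNA0417164300."
masked = mask_document(sample)
print(masked)  # Contact [EMAIL] about account [IBAN].
```

Typed placeholders keep the document's structure readable: a reviewer can still see that an email address and an account number were present without seeing their values.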

Overcoming Challenges with Innovative Solutions

EntrD recognized a key challenge in the process of data extraction: the limitations of Named Entity Recognition (NER) technology in identifying and anonymizing various types of sensitive data accurately.
"Our collaboration with DataNorth has led to a rapid and pragmatic realization of an AI classification system. This allows our customers to access their valuable data faster and easier."
Eric Hoefman
Managing Partner EntrD

To address this challenge, EntrD engaged in a strategic partnership with DataNorth AI to enhance their capabilities in data anonymization and document masking.

DataNorth AI created an innovative solution using a large language model (LLM) to improve entity extraction accuracy. This approach involved a multi-step process:

  • OCR Technology: First, optical character recognition (OCR) converts the text in each document into a machine-readable format.
  • Preprocessing: The extracted text is then preprocessed to correct common errors and formatted for LLM analysis.
  • LLM Application: The LLM analyzes the preprocessed text to identify and extract personal and financial information accurately.
  • Post-processing: This stage cross-references the LLM-extracted data with the initial OCR results to ensure consistency and accuracy.
  • Result Delivery: Finally, EntrD receives the processed data, ready for anonymization and further use in their solutions.
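The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not DataNorth AI's implementation: `run_ocr` and `call_llm` are hypothetical stand-ins that return canned data instead of calling a real OCR engine or model, and the sample text, entity types, and prompt wording are invented for the example. The stubbed LLM response deliberately includes one entity that does not appear in the document, to show what the post-processing cross-check is for.

```python
import json
import re


def run_ocr(document_id):
    """Hypothetical stand-in for an OCR engine; returns canned, noisy text."""
    return "Name:  Jan  de Boer\nIBAN: NL91 ABNA 0417 1643 00\nAmount: EUR 1,250.00"


def preprocess(raw_text):
    """Collapse runs of spaces and tabs, trim each line, drop empty lines."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in raw_text.splitlines())
    return "\n".join(line for line in lines if line)


def call_llm(prompt):
    """Hypothetical stand-in for an LLM call; a real system would query a model."""
    return json.dumps([
        {"type": "PERSON", "value": "Jan de Boer"},
        {"type": "IBAN", "value": "NL91 ABNA 0417 1643 00"},
        {"type": "PERSON", "value": "Piet Jansen"},  # simulated hallucination
    ])


def extract_entities(clean_text):
    """Ask the (stubbed) LLM for entities and parse its JSON answer."""
    prompt = "List the personal and financial entities as JSON:\n" + clean_text
    return json.loads(call_llm(prompt))


def postprocess(entities, ocr_text):
    """Keep only entities whose value occurs in the whitespace-normalized OCR text."""
    normalized = " ".join(ocr_text.split())
    return [e for e in entities if " ".join(e["value"].split()) in normalized]


def run_pipeline(document_id):
    """OCR -> preprocessing -> LLM extraction -> cross-check -> result."""
    raw = run_ocr(document_id)
    clean = preprocess(raw)
    entities = extract_entities(clean)
    return postprocess(entities, raw)


result = run_pipeline("doc-001")
```

In this sketch the cross-check is a plain substring comparison, so the invented "Piet Jansen" entity is dropped while the two entities that actually occur in the OCR output are delivered. A real system would likely need fuzzier matching to tolerate OCR errors.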

Improvement from the LLM Solution

The improvements realized through this partnership and the integration of advanced LLM technology enable EntrD to enhance its anonymization and document masking processes significantly. This advancement fortifies EntrD’s position as a provider of advanced and reliable data security solutions within the tech industry. 

"With DataNorth’s results-oriented approach, 
we have achieved advanced AI integration in a short time."
Hielke de Jong
Managing Partner EntrD

EntrD will be participating in the fourteenth edition of CorporatiePlein, held on 12 September 2024 at Expo Houten, an event dedicated to the digital transformation of housing corporations. There, they will have the opportunity to showcase the results of implementing the LLM solution.
