G2 takes pride in showing unbiased reviews on user satisfaction in our ratings and reports. We do not allow paid placements in any of our ratings, rankings, or reports.
AI-Driven Data Anonymization For Knowledge Management -> Nymiz detects sensitive data in unstructured files (doc, docx, xls, xlsx, jpg, tlf, png, pdf) and also in structured data (databases), and
IBM InfoSphere Optim Data Privacy protects privacy and supports compliance using extensive capabilities to de-identify sensitive information across applications, databases and operating systems
Tumult Analytics is an open-source Python library making it easy and safe to use differential privacy; enabling organizations to safely release statistical summaries of sensitive data. Tumult Analyti
Salesforce Shield is a suite of products that provides an extra level of security and protection above and beyond what’s already built into Salesforce. Salesforce Shield capabilities help improve data
Tonic.ai offers a developer platform for data de-identification, synthesis, subsetting, and provisioning to keep test data secure, accessible, and in sync across testing and development environments.
Private AI is at the forefront of privacy solutions, providing an advanced machine learning (ML) system that identifies, redacts, and replaces personally identifiable information (PII) across a wide s
Very Good Security (“VGS”) makes it easy for customers to collect, protect and share sensitive financial data in a way that accelerates revenue, eliminates risk, ensures compliance, and drives profita
Data security and privacy for data in use by both mission-critical and line-of-business applications.
KIProtect makes it easy to ensure compliance and security when working with sensitive or personal data.
Evervault is a developer-first platform that helps payment providers and merchants collect, process, and share sensitive cardholder data without ever exposing it in plaintext. Its modular building blo
PRIVACY VAULT is intended to support industries that collect and process personal profiles, high-velocity consumer activity and IoT data, plus unstructured documents, images, voice and video.
Privacy1 is a software company in Stockholm and London that develops technologies for practical management of personal data. Our mission is to be an enabler to make data protection easier and accessib
brighter AI provides anonymization solutions based on state-of-the-art deep learning technology to protect every identity in public. We develop game-changing image and video anonymization software to
Sensitive Data Discovery, Data Masking. Access Controls.
Aircloak enables organisations to gain flexible and secure insights into sensitive data sets through a smart, automatic, on-demand anonymization engine. It ensures compliance for both internal analyst
Data anonymization is a type of information sanitization in order to protect privacy. It is the process of either encrypting or removing personally identifiable information from a data set so that the
BizDataX makes data masking and data anonymization simple by cloning production data, or extracting only a subset of it, and masking it along the way, which makes GDPR compliance easier to achieve.
Baffle's solution goes beyond simple encryption to truly close gaps in the data access model. The technology protects against some of the most recent high profile attacks. It's easy to deploy, requi
PCI Vault is a vendor-neutral, zero-knowledge, PCI DSS level 1 compliant environment by SnapBill, Inc. It is a SaaS solution offering credit card Tokenization as a Service (TaaS) combined with its ow
Anonomatic PII Vault is a set of tools organizations may use to comply with local and international data privacy obligations while simultaneously utilizing the full value of their data. Accessible a
Truata delivers privacy-enhanced data management and analytics solutions to help companies unlock business growth while protecting customer privacy. Truata was founded in 2018, with investment from M
AuricVault Tokenization is a payment processing software that associates tokens with secure encrypted data. It encrypts the data it receives and then stores the encrypted data along with a random set
Sudo Platform is an API-first developer-focused ecosystem that delivers the tools necessary to empower our partners’ users and end consumers with the necessary capabilities to protect and control thei
The Anonos Data Embassy platform is the only technology that eliminates the tradeoff between data protection and data utility. The patented software uniquely combines statutory pseudonymization, synth
GDPR Compliance for Zendesk is a powerful application designed to facilitate seamless compliance with data protection regulations, particularly the stringent General Data Protection Regulation (GDPR) standards in Europe. This
Enhance data protection by de-sensitizing and de-identifying sensitive data, and pseudonymize data for privacy compliance and analytics. Obscured data retains context and referential integrity remain
Privacy by Design for data: bring your privacy policies into data and make sure purpose, transformation and processing are aligned and safe to use in our Privacy Streams.
The heart of our basic service is the pseudonymisation of personal data. To do this, we make use of irreversible pseudonyms and make as little use of behavioural data as possible. The pseudonymised da
DOT Anonymizer helps anonymize huge data volumes thanks to high-performance execution.
Privacy Analytics, an IQVIA company, enables organizations to unlock the value of sensitive data for secondary purposes without compromising personal information.
RansomDataProtect offers a secure solution for sending and sharing sensitive and personal data with confidence. With an easy-to-use interface and robust encryption options, RansomDataProtect ensures t
Vormetric Tokenization and related solutions dramatically reduce the cost and effort of meeting compliance mandates such as GDPR and PCI DSS. Vormetric Tokenization with Dynamic Data Masking protects
Wizuda is the innovation leader in providing software solutions that allow organisations to take control and track all data flows, internally and externally, to create an environment of compliance and
Protecting critical IBM i applications from downtime and guarding against data loss with simple, scalable, full-featured high availability and disaster recovery products.
Enabling your organization to comply with cybersecurity regulations and strengthen IBM i security by controlling access to systems and data, enforcing data privacy, monitoring for compliance, and asse
Build customer trust by keeping their data safe and your processes compliant with GDPR, CCPA, LGPD, and HIPAA data privacy regulations. ✔ Privacy Rights Automation Automate Data Portability, Right To
DATAMIMIC: Unleash the Power of AI in Model-Based Test Data Generation and Privacy Protection. Specializing in enterprise-grade test data creation and obfuscation, with enhanced handling of complex JS
Enigma Vault is a SaaS based payment card, data (plaintext), and file tokenization solution. You provide the data, we provide the token. For payment cards we have secure entry forms that can be embedd
Immuta enables organizations to unlock value from their cloud data by protecting it and providing secure access. The Immuta Data Security Platform provides sensitive data discovery, security and acces
Imperva Data Masking pseudonymizes and anonymizes sensitive data via data masking so you can safely use de-identified data in development, testing, data warehouses, and analytical data stores.
The PHEMI Trustworthy Health DataLab is a unique, cloud-based, integrated big data management system that allows healthcare organizations to enhance innovation and generate value from healthcare data
Protegrity’s data protection solutions and products can improve your business capabilities through protected and dynamic data sharing. Our comprehensive range of methods enable you to unlock your data
Redgate Data Masker enhances your data security with an automated masking approach, reducing significant time to provisioning to benefit the users of your non-production databases. Masking sensitive d
Enables competitively-sensitive parties to exchange information directly, and in a fundamentally new way.
StratoKey is a Cloud Access Security Broker (CASB) that specializes in securing cloud and SaaS applications with Encryption, Monitoring, Analytics and Defence (EMAD™).
Titaniam Protect makes it possible to search, analyze, aggregate, manipulate and transact sensitive data while maintaining granular, flexible, and adaptive protection for specific data and usage scena
Protect the privacy of your employees and clients. Dataheroes Privacy Guard anonymizes all your privacy-sensitive data. Nothing can be traced back to people, while your data remains usable for testing
NavInfo Europe's face and license plate anonymizer detects and blurs recognisable faces and license plates in images. The blurring of faces and license plates helps to reach global privacy standards.
De-identify sensitive data in your organization quickly and easily with HushHush Data Masking components. Use a variety of algorithms to satisfy multiple scenarios and compliance requirements, includi
Celantur offers high quality, cost-effective and flexible image and video anonymization solutions to comply with the GDPR and other data protection laws. By using machine learning, we can detect and b
iMinimize maps the data flow within your enterprise, thereby providing you with accurate knowledge of data proliferation, and equips you to act on Subject Rights Requests in an automated and comprehen
At Privitar, we are all about helping our clients maximize their innovation capabilities through the use of safe data for analytics. Privitar has been helping customers around the globe increase the
Rixon Technology is re-imagining what effective data security and compliance look like through unique industry-leading solutions. Rixon’s innovative approach provides organizations with a high degree
SecuPi Data Security Platform enables seamless data security over on-prem & cross-cloud, consistently applied over business applications, analytical workloads & privileged users. SecuPi is ser
Semele’s robust set of masking methods protects sensitive data in lower environments where enhanced access and reduced security controls make it more vulnerable to breach.
LeapYear’s differentially private system protects some of the world’s most sensitive datasets, including social media data, medical information, and financial transactions. The system ensures analysts
Wald.ai is an enterprise solution that connects employees to advanced AI assistants while prioritizing data protection and regulatory compliance. Wald.ai offers access to multiple AI models includ
5G speeds will create new experiences for consumers. Programmable networks, slicing, and narrowband devices will mean new classes of business applications. In sum, 5G will transform the role of mobile
Enkrypt AI's Data Risk Audit is a comprehensive solution designed to ensure that your data is primed and secure for integration into Generative AI (Gen AI) applications. By identifying and mitigating
Enkrypt AI's Risk Removal with Guardrails is a comprehensive solution designed to enhance the security and reliability of Generative AI applications. By implementing customized, enterprise-ready guard
Anonymization.ai is a German AI that automatically anonymizes or pseudonymizes sensitive data in all kinds of documents while keeping the original document format. It makes GDPR-compliant document pro
CIB PoP (Protect our Privacy) is an AI-driven solution designed to ensure data privacy in document processing. By using intelligent algorithms, it automatically anonymizes and pseudonymizes personal d
From the makers of the award-winning IRI FieldShield and CellShield data masking software in the IRI Data Protector suite and IRI Voracity platform comes IRI DarkShield, a compatible and affordable on
IRI FieldShield® is powerful and affordable data discovery and data masking software for PII, PHI, PI, CSI, CUI, etc. in structured and semi-structured sources, big and small. FieldShield utilities in
TokenEx is an enterprise-class tokenization platform that offers virtually unlimited flexibility in how customers can access, store and secure data. TokenEx works with multiple data-acceptance channel
Mage™ Static Data Masking has been developed working alongside our customers, to address the specific needs and requirements they have. Mage™ Static Data Masking has been designed to deliver adequate
PieEye is Brand Friendly Privacy. Automating Data Subject Requests, Cookie compliance, Sensitive Data Discovery, Anonymization and Privacy Impact Assessments, PieEye is an easy to implement Data Priva
Piiano provides engineering infrastructure to safeguard customers’ sensitive data, preempt breaches, and comply with privacy regulations. Our vault and code scanner shift privacy and security left an
Redactable is a cloud-based document redaction tool that helps organizations efficiently and securely remove sensitive information from PDF documents. This AI-powered solution streamlines the redactio
Redgate Test Data Manager is a test data management solution from the Database DevOps experts. We’ve simplified reliable and secure test data provisioning, enabling high-quality software releases, an
Automatically discover and redact sensitive data everywhere
Shaip Data is a modern platform designed to gather high-quality, ethical data for training AI models. It has three main parts: Shaip Manage, Shaip Work, and Shaip Intelligence. The platform makes wor
For any business processing payments and the associated sensitive data including PII and PHI, ShieldConex® is the vaultless PCI-compliant shared tokenization solution that protects transactions throug
Skyflow for GenAI is a data privacy solution designed to enable organizations to develop and train large language models (LLMs) without exposing sensitive data or diverting resources from core product
Skyflow for Payments is a comprehensive data privacy vault designed to secure sensitive financial data across various payment processes, including card acceptance, card issuance, customer data managem
Vormetric Vaultless Tokenization with Dynamic Data Masking dramatically reduces the cost and effort required to comply with security policies and regulatory mandates like PCI DSS.
xtendr facilitates secure data sharing and collaboration between teams, departments, and organisations - generating powerful insights without ever compromising on privacy. Using a combination of bes
Data de-identification tools remove direct and indirect identifiers, including sensitive data and personally identifying information, from datasets to reduce the risk that the data can be re-identified. De-identification is particularly important for companies working with sensitive and highly regulated data, such as healthcare organizations handling protected health information (PHI) in medical records, or companies handling financial data.
To comply with internal policies and with data privacy and data protection regulations, companies may be prohibited from analyzing datasets that include sensitive and personally identifiable information (PII). However, if the sensitive data is removed so that individuals can no longer be identified, the dataset may become usable. For example, data de-identification software can remove people's names, addresses, protected health information, tax identification numbers, Social Security numbers, account numbers, and other personally identifying or sensitive data from datasets, enabling companies to extract analytical value from the remaining de-identified data.
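The removal step described above can be illustrated with a minimal Python sketch. It is not how any listed product works; the field names and sample records are hypothetical, and a real dataset's identifier columns would have to be inventoried first.

```python
# Hypothetical field names chosen for illustration only.
DIRECT_IDENTIFIERS = {"name", "address", "ssn", "tax_id", "account_number"}


def drop_direct_identifiers(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records with the listed identifier fields removed."""
    return [
        {key: value for key, value in record.items() if key not in identifiers}
        for record in records
    ]


# Made-up sample records.
records = [
    {"name": "Ana Ruiz", "ssn": "000-00-0001", "zip": "94110", "age": 34, "diagnosis": "asthma"},
    {"name": "Ben Cho", "ssn": "000-00-0002", "zip": "94110", "age": 36, "diagnosis": "flu"},
]

deidentified = drop_direct_identifiers(records)
```

Note that fields such as `zip` and `age` survive this step. They are indirect identifiers, so dropping direct identifiers alone may not be enough for a dataset to count as de-identified.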
Before using de-identified datasets, companies should understand the risk of the data being re-identified. Re-identification risks include differencing attacks, in which bad actors use what they already know about people to determine whether a specific individual's personal data is included in a dataset, and reconstruction attacks, in which someone combines data from other sources to reconstruct the original dataset. When evaluating data de-identification methods, it is important to understand the degree of anonymity they provide. One common measure is k-anonymity: a dataset is k-anonymous when every record is indistinguishable from at least k-1 other records on its identifying attributes.
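As a rough illustration of measuring k-anonymity, the sketch below groups records by their quasi-identifier values and reports the size of the smallest group. The column names and rows are hypothetical, and deciding which columns count as quasi-identifiers is a judgment call that the code does not make.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    if not records:
        return 0
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers) for record in records
    )
    return min(groups.values())


# Made-up rows that have already been generalized (ZIP truncated, age banded).
rows = [
    {"zip": "941**", "age_band": "30-39", "diagnosis": "asthma"},
    {"zip": "941**", "age_band": "30-39", "diagnosis": "flu"},
    {"zip": "941**", "age_band": "40-49", "diagnosis": "flu"},
]

k = k_anonymity(rows, ["zip", "age_band"])
```

Here `k` is 1 because the single record in the `40-49` band is unique on its quasi-identifiers, so that record would need further generalization or suppression before the dataset could be called 2-anonymous.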
The following are some core features within data de-identification tools:
Anonymization: Some data de-identification solutions offer statistical data anonymization methods, including k-anonymity, low-count suppression, and noise insertion. When working with sensitive data, particularly regulated data, the degree of anonymization required and the techniques used to achieve it must be considered. The more anonymized the data is, the lower the risk of re-identification. However, the more anonymous a dataset is made, the lower its utility and accuracy.
Tokenization or pseudonymization: Tokenization or pseudonymization replaces sensitive data with a token value, and the mapping back to the original value is stored outside the production dataset. This effectively de-identifies the dataset in use, but the original data can be restored when needed.
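The tokenization feature above can be sketched in a few lines of Python. The `TokenVault` class, the `tok_` prefix, and the sample record are hypothetical; the in-memory dictionaries stand in for a separate, access-controlled store and are not meant as a production design.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a token store kept outside the production dataset."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        # Reuse the existing token so the same value always maps to the same
        # token, which keeps joins across tables working.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)  # random, not derived from the value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token):
        """Restore the original value; raises KeyError for an unknown token."""
        return self._token_to_value[token]


vault = TokenVault()
record = {"ssn": "000-00-0001", "balance": 1250}  # made-up record
protected = {**record, "ssn": vault.tokenize(record["ssn"])}
```

Because the token is random rather than computed from the value, the protected record reveals nothing about the original on its own. Anyone with access to the vault can reverse it, which is why tokenized data is usually treated as pseudonymized rather than anonymized.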
The biggest benefit of using data de-identification tools is enabling analyses of data that would otherwise be prohibited from use. This allows companies to extract insights from their data while following data privacy and protection regulations by protecting sensitive information.
Data usability for data analysis: Enables companies to analyze datasets and extract value from datasets that would otherwise be unable to be processed due to the sensitivity of data contained within them.
Regulatory compliance: Global data privacy and protection regulations require companies to treat sensitive data differently than non-sensitive data. If a dataset can be made non-sensitive using data de-identification software techniques, it may no longer be in the scope of data privacy or data protection regulations.
Data de-identification solutions are used by people analyzing production data or those creating algorithms. De-identified data can also be used for safe data sharing.
Data managers, administrators, and data scientists: These professionals interact with datasets regularly and will likely work with data de-identification software tools.
Qualified experts: Under HIPAA, qualified experts can provide an expert determination attesting that a dataset is de-identified and that the risk of re-identification is very small, based on generally accepted statistical and scientific methods.
Depending on the type of data protection a company is looking for, alternatives to data de-identification tools may be considered. For example, data masking may be a better option for companies that want to limit who can view sensitive data within applications. If the data merely needs to be protected in transit or at rest, encryption software may be a good choice. If privacy-safe testing data is needed, synthetic data may be an alternative.
Data masking software: Data masking software obfuscates the data while retaining the original data. The mask can be lifted to reveal the original dataset.
Encryption software: Encryption software protects data by converting plaintext into unreadable ciphertext, which can only be decrypted using the appropriate key.
Synthetic data software: Synthetic data software helps companies create artificial datasets, including images, text, and other data, from scratch using computer-generated imagery (CGI), generative adversarial networks (GANs), and heuristics. Synthetic data is most commonly used for testing and training machine learning models.
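To make the contrast with de-identification concrete, the data masking alternative described above can be sketched as a display-time function: the stored value is left untouched and only the copy shown to the user is obscured. The function name, its defaults, and the sample value are hypothetical.

```python
def mask(value, visible=4, mask_char="*"):
    """Return a display copy with all but the last `visible` characters hidden."""
    if visible <= 0:
        return mask_char * len(value)
    hidden = max(len(value) - visible, 0)
    return mask_char * hidden + value[-visible:]


stored_value = "4111111111111111"  # made-up test number; the original stays in storage
shown_to_user = mask(stored_value)
```

Lifting the mask simply means reading `stored_value` with the right permissions. A de-identified dataset differs in that the original value is no longer present at all.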
Software solutions can come with their own set of challenges.
Minimizing re-identification risks: Simply removing personal information from a dataset may not be enough for the dataset to be considered de-identified. Indirect personal identifiers (contextual personal information within the data) may be used to re-identify a person in the data. Re-identification can happen by cross-referencing one dataset with another, by singling out specific factors that relate to a known individual, or through general inferences from data that tend to correlate. De-identifying both direct and indirect identifiers, introducing noise (random data), and generalizing the data by reducing its granularity and analyzing it in aggregate can help prevent re-identification.
Meeting regulatory requirements: Many data privacy and data protection laws do not specify technical requirements for what is considered de-identified or anonymous data, so it is up to companies to understand the technical capabilities of their software solutions and how that relates to adhering to data protection regulations.
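The mitigation steps named under re-identification risk above (generalizing, analyzing in aggregate, and introducing noise), together with low-count suppression, can be sketched as follows. The field names, the threshold of 2, and the noise scale are hypothetical choices, and the sketch makes no claim to provide a formal guarantee such as differential privacy or to satisfy any particular regulation.

```python
import random
from collections import Counter


def generalize(record):
    """Reduce granularity: keep a 3-digit ZIP prefix and bucket age into decades."""
    decade = (record["age"] // 10) * 10
    return {"zip3": record["zip"][:3], "age_band": f"{decade}-{decade + 9}"}


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws is Laplace-distributed.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_counts(records, min_count=2, scale=1.0, seed=None):
    """Aggregate generalized records, suppress small groups, and add noise to the rest."""
    rng = random.Random(seed)
    counts = Counter(tuple(sorted(generalize(r).items())) for r in records)
    released = {}
    for group, count in counts.items():
        if count < min_count:
            continue  # low-count suppression: small groups are not released
        released[group] = max(0, round(count + laplace_noise(scale, rng)))
    return released


# Made-up records: three people fall into one group, one person is alone.
people = [
    {"zip": "94110", "age": 34},
    {"zip": "94117", "age": 36},
    {"zip": "94109", "age": 31},
    {"zip": "10001", "age": 72},
]
summary = noisy_counts(people, seed=7)
```

Only the `941` / `30-39` group appears in `summary`; the lone record in the `100` / `70-79` group is suppressed. A larger `scale` lowers the risk that an exact count leaks but also lowers accuracy, which is the utility trade-off this guide describes for anonymization.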
Users must determine their specific needs for data de-identification tools. They can answer the questions below to get a better understanding:
Create a long list
Buyers can visit G2’s Data De-identification Software category, read reviews about data de-identification products, and determine which products fit their businesses’ specific needs. They can then create a list of products that match those needs.
Create a short list
After creating a long list, buyers can review their choices and eliminate some products to create a shorter, more precise list.
Conduct demos
Once buyers have narrowed down their software search, they can connect with the vendor to view demonstrations of the software product and how it relates to their company’s specific use cases. They can ask about the de-identification methods. Buyers can also ask about integrations with their existing tech stack, licensing methods, and pricing—whether fees are based on the number of projects, databases, executions, etc.
Choose a selection team
Buyers must determine which team is responsible for implementing and managing this software. Often, that may be someone from the data team. It is important to have a representative from the financial team on the selection committee to ensure the license is within budget.
Negotiation
Buyers should get specific answers about the license cost and how it is priced, including whether pricing for the data de-identification software is based on dataset size, features, or executions. They must keep in mind the company's data de-identification needs for today and the future.
Final decision
The final decision will come down to whether the software solution meets the technical requirements, how usable it is, how it will be implemented and supported, the expected return on investment, and more. Ideally, the data team will make the final decision, with input from other stakeholders such as software development teams.