During the preparation phase, the input and output data intended for advanced analytics can be masked in such a way that they remain cryptic to an outsider while still being efficiently analyzable.
To do so, we first propose and agree with you on the business objectives, data formats, data volumes, and the whole procedure of masking, analytics, and translation back to clear data. Data masking is then carried out by data transformation scripts that we provide for your individual IT platform. We publish the source code of those scripts so that you can verify exactly how they work. Once handed over, the scripts remain under your control. They serve not only for data masking but also for joining the results of data analytics and for decrypting them back to clear data. This way, we never get and never need access to your clear data!
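The round trip described above (mask, analyze, translate back) can be sketched as follows. This is a minimal illustration only, not the scripts we actually hand over: the keyed HMAC-SHA256 tokens, the function names `make_token`, `mask_records`, and `unmask_results`, and the in-memory lookup table are all assumptions made for this example. The secret key and the lookup table stay on your side.

```python
import hashlib
import hmac


def make_token(secret_key: bytes, value: str) -> str:
    """Derive a stable, keyed token for a clear value (hypothetical helper)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_records(records, key_attrs, secret_key):
    """Replace the listed attributes with tokens.

    Returns the masked records plus a token-to-clear-value lookup table
    that never leaves the data owner's platform.
    """
    lookup = {}
    masked = []
    for rec in records:
        out = dict(rec)
        for attr in key_attrs:
            # Prefix with the attribute name so equal values in different
            # attributes do not share a token.
            token = make_token(secret_key, f"{attr}:{rec[attr]}")
            lookup[token] = rec[attr]
            out[attr] = token
        masked.append(out)
    return masked, lookup


def unmask_results(results, key_attrs, lookup):
    """Translate tokens in analytics results back to clear values."""
    return [
        {k: (lookup[v] if k in key_attrs else v) for k, v in row.items()}
        for row in results
    ]
```

Because the same clear value always maps to the same token, results computed on masked data can be joined back to the original records by token.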
If needed, we will cluster your data before masking. This counters decryption attacks based on rare and known attribute values, such as rare zip codes or rare employee characteristics. The objective is full k-anonymity, which ensures that every query matching any record returns at least k records, regardless of the combination of text/key attributes used as a filter. This way, even the records of prominent employees with exceptional salaries or zip codes stay anonymous. The same applies to other master data.
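To make the k-anonymity objective concrete, the sketch below checks whether every attribute combination occurs at least k times and, if not, coarsens zip codes digit by digit until it does. Prefix generalization stands in here for the clustering step; it is an assumption for illustration, as are the names `is_k_anonymous` and `generalize_zip` and the attribute name `zip`.

```python
from collections import Counter


def is_k_anonymous(records, quasi_ids, k):
    """True if every combination of quasi-identifier values occurs >= k times."""
    counts = Counter(tuple(r[a] for a in quasi_ids) for r in records)
    return all(c >= k for c in counts.values())


def generalize_zip(records, k, zip_attr="zip"):
    """Shorten zip codes from the right until the zip attribute is k-anonymous.

    Removed digits are replaced by '*'. Assumes k >= 1 and string zip codes.
    """
    if len(records) < k:
        raise ValueError("fewer than k records: k-anonymity cannot be reached")
    width = max(len(r[zip_attr]) for r in records)
    for keep in range(width, -1, -1):
        coarse = [
            dict(r, **{zip_attr: r[zip_attr][:keep].ljust(width, "*")})
            for r in records
        ]
        if is_k_anonymous(coarse, [zip_attr], k):
            return coarse
    # Not reached: with keep == 0 all records share one fully masked value.
    return coarse
```

For example, the zip codes 10115, 10117, 80331, and 80333 with k = 2 become 1011*, 1011*, 8033*, and 8033*. A single rare zip code would force coarser groups for all records, which is why a real clustering step would treat outliers more selectively.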
Metric attributes such as an employee's age or salary often exhibit a statistical distribution over value ranges that is typical of the business. We mask these data so that their distributions become approximately normal, in order to counter distribution-based decryption attacks. Thus, all metric attributes are transformed to have not only almost the same statistical distribution but also a very analytics-friendly one.
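One common way to give a metric attribute an approximately normal distribution is a rank-based inverse normal transform, sketched below with the Python standard library. This is an illustrative assumption and not necessarily the transformation our scripts apply; the function name `to_normal_scores` is hypothetical.

```python
from statistics import NormalDist


def to_normal_scores(values):
    """Map values to standard normal quantiles according to their rank.

    The ordering of the values is preserved, but their original spacing
    (and therefore the shape of the original distribution) is not.
    Tied values receive neighbouring scores in input order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # (rank - 0.5) / n lies strictly between 0 and 1, as inv_cdf requires.
        scores[idx] = std_normal.inv_cdf((rank - 0.5) / n)
    return scores
```

Applied to salaries such as 30000, 32000, 35000, 40000, and 250000, the outlier no longer stands out by its magnitude: it simply receives the highest normal score. Translating scores back to clear values would require keeping the rank-to-value mapping on your side.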