Further Information

Semantic Overlay Architecture (SOyA) is a platform for authoring and publishing data models that also provides validation and transformation functionality. It builds on the W3C Resource Description Framework (RDF) and related semantic web technologies to provide a lightweight approach to data integration and exchange. At the core of SOyA is a YAML-based data model that describes data structures through bases and optional overlays, which add further information and context.

The anonymisation process is designed to support GDPR-compliant handling of personal data through a configurable, ontology-driven approach. It begins by fetching a JSON-LD configuration from a knowledge graph. SPARQL queries are used to determine the anonymisation type and data type defined for each attribute. The input data is restructured by attribute, and for every attribute a matching anonymiser (e.g., masking, generalization, or randomization) is instantiated, depending on the available implementations, and applied. Generalization, for example, assigns values to buckets to reduce identifiability, randomization introduces controlled noise, and masking hides values entirely. This modular design keeps the process flexible and extensible, and the entire service is accessible via a documented API. For more details, visit the GitHub repository: https://github.com/OwnYourData/anonymisation-service.
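The three anonymiser types and the per-attribute restructuring described above can be illustrated with a minimal Python sketch. It is not the service's implementation: all function names, the placeholder string, the default bucket count, and the Gaussian noise model are assumptions chosen for illustration only.

```python
import random
from typing import Callable, Dict, List, Optional


def mask(value: object) -> str:
    """Masking: hide the value entirely behind a fixed placeholder (assumed)."""
    return "*****"


def generalize(value: float, minimum: float, maximum: float, buckets: int = 4) -> str:
    """Generalization: replace a value with the bucket (range) it falls into.

    The min/max bounds must be known up front; the bucket count is an assumption.
    """
    width = (maximum - minimum) / buckets
    # Clamp the index so out-of-range values and value == maximum stay in range.
    index = max(0, min(int((value - minimum) / width), buckets - 1))
    low = minimum + index * width
    return f"{low:g}-{low + width:g}"


def randomize(value: float, scale: float, rng: Optional[random.Random] = None) -> float:
    """Randomization: add zero-mean Gaussian noise (noise model is an assumption)."""
    rng = rng or random.Random()
    return value + rng.gauss(0.0, scale)


def anonymise(records: List[dict], config: Dict[str, Callable]) -> List[dict]:
    """Apply one anonymiser per configured attribute; other attributes pass through."""
    if not records:
        return []
    # Restructure the row-oriented input by attribute (one list per attribute).
    columns = {name: [record[name] for record in records] for name in records[0]}
    for name, anonymiser in config.items():
        columns[name] = [anonymiser(value) for value in columns[name]]
    # Rebuild row-oriented records from the anonymised columns.
    return [dict(zip(columns, row)) for row in zip(*columns.values())]
```

In this sketch, `anonymise([{"name": "Alice", "age": 37}], {"name": mask, "age": lambda v: generalize(v, 0, 100)})` masks the name and places the age into the range 25-50. The sketch also assumes that every record carries the same attributes.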

Follow these steps to anonymise your dataset using the Anonymisation Service:

  1. Provide your raw data in a cleaned, well-structured format, either by uploading a file or by pasting it into the input field.
    → See example: data.json
  2. Create a SOyA structure that defines your dataset and how it should be anonymised:
    • Use the public SOyA repository to create a new structure: https://soya.ownyourdata.eu
    • First, describe the data structure in the bases section by following this tutorial
      Note: make sure to define min/max values for attributes that use `generalization`.
    • Then, define an OverlayClassification (see the Classification section in the tutorial) to specify the anonymisation methods for each attribute.
  3. Enter the name of your SOyA structure into the Model field.
  4. Click "Anonymise" to process your dataset.

This website is a frontend for the underlying anonymisation service. You can also use the service directly through its REST API by calling the following endpoint:

  • POST https://anonymizer.go-data.at/api/anonymise: provide the data set and a reference to the SOyA structure in the body of a POST request;
    example:
    cat input.json | curl -H 'Content-Type: application/json' -d @- -X POST https://anonymizer.go-data.at/api/anonymise
  • data format of input.json
    {
      "configurationURL": "https://soya.ownyourdata.eu/AnonymisationDemo",
      "data": [...]
    }
    example file: input.json
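The same request can be sketched in Python using only the standard library. This is a hedged illustration rather than an official client: the function names are hypothetical, and the sketch assumes that the endpoint answers with a JSON body, which the text above does not state.

```python
import json
import urllib.request

# Endpoint taken from the description above.
ENDPOINT = "https://anonymizer.go-data.at/api/anonymise"


def build_request(configuration_url: str, data: list) -> urllib.request.Request:
    """Build the POST request with the documented body layout."""
    body = json.dumps({"configurationURL": configuration_url, "data": data})
    return urllib.request.Request(
        ENDPOINT,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def anonymise_remote(configuration_url: str, data: list, timeout: float = 30.0):
    """Send the request and decode the reply (assumes the reply is JSON)."""
    request = build_request(configuration_url, data)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `anonymise_remote("https://soya.ownyourdata.eu/AnonymisationDemo", records)` would then submit `records`, a list of objects matching the referenced SOyA structure. Building the request separately from sending it makes it possible to inspect the payload without contacting the service.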

The Swagger API documentation for this service is available here: https://anonymizer.go-data.at/swagger-ui/index.html
A Docker image for local deployment can be downloaded here: https://hub.docker.com/r/oydeu/anonymizer

This service is a proof of concept that demonstrates anonymisation using the overlay capabilities of SOyA: it showcases a simple yet machine-readable format for describing datasets and uses SOyA's built-in mechanisms for anonymisation.

We encourage everyone to report issues or contribute pull requests on the public GitHub repository.

This project has received funding from the program “Datenökosysteme für die Energiewende” of the Federal Ministry for Climate Action, Environment, Energy, Mobility, Innovation and Technology (BMK) under grant number 905128. Learn more about the OwnYourData Anonymiser.