Data Supply Chain Security

Terminology

Glossary of Key Terms

  • Accountabilities: The responsibilities for data-related decisions, outlining who bears the duty for these decisions within an organization.
  • Attack Tree: A hierarchical, tree-based method for modeling and analyzing potential attack scenarios (see the sketch after this glossary).
  • Attack Vector: The path or method by which an attacker can gain access to a system or network.
  • Automated Threat Modeling: A method of threat modeling in which, once the system has been modeled, its security is evaluated automatically, without further human intervention.
  • Black-box Access: A scenario where an attacker lacks knowledge of the model’s architecture, training process, or training data and can interact with the model solely through its inputs and outputs.
  • Blockchain: A decentralized, immutable ledger that facilitates peer-to-peer transactions without central intermediaries, offering data suppression resistance.
  • Certified Robustness: Techniques that provide provable guarantees that a model is resilient to poisoning, often achieved by bounding the gradient of a neural network.
  • Clean-label Attack: A poisoning attack where the adversary uses labels that are consistent with the data, making the attack less obvious.
  • Compromised Insider: An authorized user whose account has been exploited by an external attacker to gain unauthorized access to systems.
  • Data-Centric Threat Modeling: An approach to threat modeling that focuses on the security of particular data instances, instead of hosts, operating systems, or applications.
  • Data Governance: A system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models. It is the formal framework for managing an organization’s data assets.
  • Data Governance Council: A group responsible for balancing the interests of various stakeholders and making binding decisions related to data governance.
  • Data Integrity: Maintaining the accuracy and consistency of data throughout its lifecycle, safeguarding its reliability.
  • Data Poisoning: An adversarial attack on machine learning models where an attacker manipulates training data to degrade model performance or introduce backdoors.
  • Data Quality: How well data meets its usage requirements in a given context, ensuring it is suitable for its intended purpose.
  • Data Steward: An individual responsible for overseeing data quality, access controls, and compliance within a data governance framework.
  • Data Suppression: The blocking or prohibition of information deemed unacceptable for dissemination.
  • Data Suppression Resistance: Techniques and strategies used to bypass or circumvent data suppression measures, ensuring access to information and freedom of expression.
  • Decision Rights: The authority to make specific decisions about data handling, indicating who within an organization is authorized to make these decisions.
  • Deepfake: Synthetic media (text, images, audio, video) manipulated or generated using artificial intelligence and deep learning.
  • Defensive Freezing: A psychophysiological response to perceived threats that causes immobility and bradycardia (slowed heart rate), impacting decision-making.
  • Denial of Service (DoS) Attack: An attack designed to disrupt access to resources or cause delays, often by overwhelming a system with traffic.
  • Dirty-label Attack: A poisoning attack where the labels used in the manipulated training data are inconsistent with the features of the data.
  • Disclosure of Information: A LINDDUN threat category in which sensitive information is exposed through privacy threats.
  • Discoverability: A factor in the DREAD model assessing how easy it is for attackers or others to find vulnerabilities or threats.
  • Discriminator (GAN): The neural network in a GAN that assesses the authenticity of the synthetic content generated by the generator.
  • DREAD (Damage Potential, Reproducibility, Exploitability, Affected Users, Discoverability): A threat rating model that helps organizations prioritize threats based on specific criteria (see the scoring sketch after this glossary).
  • Encrypted Server Name Indication (ESNI): A technology aiming to encrypt the SNI field in TLS headers to enhance privacy and resist censorship.
  • Exfiltration Attack: An attack that involves copying and transferring data from secure systems, often without authorization.
  • Feature Poisoning: A poisoning attack that involves manipulating the features (attributes) of the training data.
  • Generative Adversarial Network (GAN): A machine learning framework consisting of two competing neural networks (a generator and a discriminator) used to create deepfakes.
  • Generator (GAN): The neural network in a GAN that produces synthetic content.
  • Gray-box Access: A scenario where an attacker has limited access to the model or training process, such as knowledge of the model’s architecture or general data type, but not the specific training data.
  • Honeytoken: A decoy designed to look like legitimate data that is intended to attract unauthorized access and detect insider activities (see the sketch after this glossary).
  • HTTP Filtering: A data suppression technique that restricts access to web pages based on URLs or keywords.
  • Identifiability: A factor in LINDDUN assessing the risk that privacy threats can reveal a user’s real-world identity.
  • Indiscriminate Poisoning: A type of data poisoning attack designed to degrade the overall performance of a model across all or most data points by introducing noise or incorrect information.
  • Information Disclosure: A threat category in STRIDE and LINDDUN where sensitive data is exposed to unauthorized parties.
  • Insider Threat: A security risk originating from within the organization, often by malicious, compromised, or careless insiders.
  • IP Address Blocklists: A fundamental method that censors use to deny traffic to or from specific servers.
  • Jailbreak-tuning: Fine-tuning a language model on manipulated data so that it ignores its safety protocols.
  • Keyword Filtering: A data suppression technique that scans data streams for specific words or phrases, blocking or modifying data when a match is found (see the sketch after this glossary).
  • Label Poisoning: A poisoning attack where the attacker alters only the labels of the training data (see the sketch after this glossary).
  • LINDDUN (Linkability, Identifiability, Non-repudiation, Detectability, Disclosure of Information, Unawareness, Non-compliance): A threat modeling approach that focuses on identifying and addressing privacy threats.
  • Linkability: A factor in LINDDUN assessing the ability to connect different pieces of data or activities related to an individual.
  • Long Short-Term Memory (LSTM): A type of recurrent neural network (RNN) that excels at analyzing sequential data over time.
  • Malicious Insider: An authorized user who intentionally harms the organization’s systems and data.
  • Mel-frequency cepstral coefficients (MFCCs): Features derived from audio signals that are used in speech and audio recognition (see the extraction sketch after this glossary).
  • Microsoft Threat Modeling Tool: A free tool from Microsoft for building and analyzing threat models, built around the STRIDE methodology.
  • Multimodal deepfakes: Deepfakes that combine different forms of media, such as images, audio, and text, to create more convincing content.
  • NIST: The National Institute of Standards and Technology, a U.S. federal agency whose guidance (such as draft SP 800-154, Guide to Data-Centric System Threat Modeling) supports the practice of threat modeling.
  • Non-repudiation: The inability to deny having performed a particular action. In LINDDUN, it relates to the difficulty in verifying the origin of manipulated content.
  • Nudging: A technique that influences behavior by structuring choices in a specific way, which can impact decision-making.
  • Obfuscation: Techniques used to disguise network traffic, making it appear harmless to evade detection by censors.
  • OCTAVE (Operationally Critical Threat, Asset, and Vulnerability Evaluation): An organization-focused threat modeling method that takes a holistic view of risk assessment and management.
  • OWASP Threat Dragon: An open-source threat modeling tool.
  • PASTA (Process for Attack Simulation and Threat Analysis): A risk-centric threat modeling approach that emphasizes collaboration between business and technical teams.
  • Perturbation Attack: An attack that manipulates data by adding noise, scaling, watermarking, or utilizing triggers to influence data points.
  • Phrenology: A pseudoscientific practice attempting to assess personality and character from skull shape and size.
  • Privilege Escalation: Gaining unauthorized access to sensitive resources or higher levels of system control.
  • Protocol Fingerprinting: An advanced data suppression technique that can identify and block specific protocols or applications.
  • Protocol Mimicry: An obfuscation technique where network traffic is disguised to resemble a different protocol.
  • PyTM: An open-source, Pythonic framework for threat modeling in which the system and its threats are described in code.
  • Recurrent Neural Network (RNN): A type of neural network used for analyzing sequential data.
  • Regulatory Compliance: Adherence to laws, regulations, and industry standards related to data, requiring policies and procedures to ensure legal and regulatory obligations are met.
  • Rent-Seeking: In economic theory, the pursuit of economic advantage through legislation or regulation that artificially reduces competition, rather than through productive activity.
  • Repudiation: A threat category in STRIDE where a user can deny having performed a particular action, making it difficult to trace malicious actions.
  • Risk Mitigation: Establishing policies and controls to reduce potential risks associated with data, including non-compliance, security breaches, and inadequate oversight.
  • Social Threat Learning: The process where an individual’s decision-making is biased by observing or receiving information about potential threats.
  • Spoofing: Techniques used to impersonate or misrepresent the identity of a person, device, or system. In STRIDE, it refers to an attacker pretending to be someone or something else.
  • STRIDE (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege): A threat modeling approach that classifies threats into distinct categories.
  • Sunk Costs: Fixed costs linked to assets that cannot be redeployed and lack salvage value, which can be exploited by regulators.
  • Targeted Poisoning: A type of data poisoning attack designed to make the model misclassify specific inputs in a desired way.
  • Temporal Aggregation: A defense strategy that uses data timestamps to build models more resistant to poisoning, by taking into account the timing and duration of attacks.
  • Text-to-Speech (TTS): Technology that converts written text into synthesized speech.
  • Threat Modeling: A structured, proactive, and continuous approach to identifying, analyzing, and mitigating potential threats.
  • Threats to Authority: Challenges to the legitimacy and credibility of the data governance body.
  • Threats to Decision-Making: Factors that can undermine the ability to make well-informed, data-driven decisions within data governance.
  • TLS-Based Filtering: A data suppression method that targets the Server Name Indication (SNI) within TLS headers to block access to specific websites.
  • Unawareness: A vulnerability in LINDDUN where users might not identify manipulated content or privacy threats.
  • Unlearning: Techniques that aim to eliminate the effects of poisoned data by fine-tuning models, although they are not always effective in completely removing the consequences of data poisoning.
  • VAST (Visual, Agile, and Simple Threat Modeling): A threat modeling approach designed to integrate seamlessly into agile software development processes, emphasizing visual representation and collaboration.
  • Vulnerability: A weakness in a system that can be exploited by a threat.
  • Voice Conversion: Altering one speaker’s voice to sound like another.
  • VPN (Virtual Private Network): A tool for resisting data suppression by routing traffic through an encrypted tunnel to a server in a different location, concealing the user’s IP address and communications.
  • Web 3.0: The next generation of the internet, characterized by decentralization, semantic web principles, and user empowerment.
  • White-box Access: A scenario where an attacker has complete access to the model’s architecture, parameters, and training data.
  • Zero-Knowledge Proofs (ZKPs): Cryptographic techniques that allow a user to prove a statement is true without revealing the underlying information, enhancing data suppression resistance (see the protocol sketch after this glossary).
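
Illustrative Code Sketches

The sketches below illustrate a handful of the terms above. Each is a minimal, hedged example in Python: the names, parameters, and data are invented for illustration and are not drawn from any particular system or source.

Attack Tree. A sketch of an attack tree as an AND/OR data structure: leaves are atomic attack steps with a feasibility flag, and internal nodes combine their children's results. The scenario and node names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    kind: str = "LEAF"          # "LEAF", "AND", or "OR"
    feasible: bool = False      # meaningful only for leaves
    children: list["Node"] = field(default_factory=list)

    def achievable(self) -> bool:
        """Return True if this (sub)goal can be achieved."""
        if self.kind == "LEAF":
            return self.feasible
        results = (child.achievable() for child in self.children)
        return all(results) if self.kind == "AND" else any(results)

# Root goal: poison a training set; it succeeds if either branch works.
root = Node("Poison training data", "OR", children=[
    Node("Compromise ingestion pipeline", "AND", children=[
        Node("Phish data engineer", feasible=True),
        Node("Alter ingestion job", feasible=False),
    ]),
    Node("Submit poisoned samples to a public source", feasible=True),
])

print(root.achievable())  # True: the public-source leaf is feasible
```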
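
DREAD. A toy DREAD scorer that averages the five ratings into a single risk score. The 1-10 scale and the equal weighting are common conventions assumed here, not requirements stated in the glossary.

```python
from statistics import mean

def dread_score(damage, reproducibility, exploitability,
                affected_users, discoverability):
    """Average the five DREAD ratings (each 1-10) into one risk score."""
    ratings = [damage, reproducibility, exploitability,
               affected_users, discoverability]
    if not all(1 <= r <= 10 for r in ratings):
        raise ValueError("each DREAD rating must be between 1 and 10")
    return mean(ratings)

# Hypothetical threat: label poisoning via a public data source.
print(dread_score(damage=8, reproducibility=6, exploitability=7,
                  affected_users=9, discoverability=4))  # 6.8
```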
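
Honeytoken. A sketch of planting a decoy credential that no legitimate process should ever request, raising an alert when it is read. The secret names and the logging-based alert are illustrative assumptions.

```python
import logging
import secrets

logging.basicConfig(level=logging.WARNING)

HONEYTOKENS = {
    # Looks like a real API key but is never used by any real system.
    "svc-backup-api-key": secrets.token_hex(16),
}

def fetch_secret(name: str, store: dict) -> str:
    """Return a secret; raise an alert if a honeytoken is touched."""
    if name in HONEYTOKENS:
        logging.warning("honeytoken accessed: %s (possible insider activity)", name)
    return store.get(name, "")

store = {**HONEYTOKENS, "real-db-password": "hunter2"}
fetch_secret("svc-backup-api-key", store)   # triggers the alert
```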
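
Keyword Filtering. A sketch of the suppression technique as a censor might deploy it: every message in a stream is scanned, and any message matching the blocklist is dropped. The keyword list and messages are invented.

```python
BLOCKED_KEYWORDS = {"protest", "vpn"}

def keyword_filter(stream):
    """Yield only the messages that contain no blocked keyword."""
    for message in stream:
        if set(message.lower().split()) & BLOCKED_KEYWORDS:
            continue                     # suppress the whole message
        yield message

messages = ["meet at the protest tonight", "lunch at noon?"]
print(list(keyword_filter(messages)))    # ['lunch at noon?']
```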
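
Label Poisoning. A sketch of a dirty-label attack using NumPy: a fraction of binary training labels is flipped while the features are left untouched. The 10% poisoning rate and binary labels are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def poison_labels(y: np.ndarray, rate: float = 0.1) -> np.ndarray:
    """Return a copy of binary labels y with `rate` of them flipped."""
    y_poisoned = y.copy()
    n_flip = int(rate * len(y))
    victims = rng.choice(len(y), size=n_flip, replace=False)
    y_poisoned[victims] = 1 - y_poisoned[victims]   # flip 0 <-> 1
    return y_poisoned

y_clean = rng.integers(0, 2, size=100)   # clean binary labels
y_bad = poison_labels(y_clean)
print(int((y_clean != y_bad).sum()))     # 10 labels flipped
```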
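
Mel-frequency cepstral coefficients. A sketch of MFCC extraction using the third-party librosa library (the glossary names no specific tool, so this choice is an assumption). The snippet downloads one of librosa's example clips on first run.

```python
import librosa

# Fetch a short example clip shipped with librosa (downloads on first use).
y, sr = librosa.load(librosa.example("trumpet"))

# 13 coefficients per frame is a common choice for speech/audio features.
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
print(mfccs.shape)   # (13, number_of_frames)
```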
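
Zero-Knowledge Proofs. A toy run of the Schnorr identification protocol, a classical zero-knowledge proof of knowledge: the prover demonstrates knowledge of a secret exponent x satisfying y = g^x mod p without revealing x. The toy-sized parameters and single round omit the safeguards a real deployment would need.

```python
import secrets

p = 2**127 - 1                 # a Mersenne prime (toy-sized modulus)
g = 3                          # public base (illustrative choice)
x = secrets.randbelow(p - 1)   # prover's secret
y = pow(g, x, p)               # prover's public key

# One interactive round:
r = secrets.randbelow(p - 1)   # prover picks a random nonce
t = pow(g, r, p)               # ... and sends the commitment t
c = secrets.randbelow(2**64)   # verifier replies with a random challenge
s = (r + c * x) % (p - 1)      # prover answers; s alone leaks nothing about x

# Verifier accepts iff g^s == t * y^c (mod p), never learning x.
assert pow(g, s, p) == (t * pow(y, c, p)) % p
print("proof accepted")
```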