Unlock Artificial Intelligence Potential – The Power Of Pristine Data

The integration of Artificial Intelligence (AI) into cybersecurity has ushered in a new era of sophisticated threat detection, proactive vulnerability assessments, and automated incident response. As organizations increasingly rely on AI to bolster their defenses, the fundamental principle remains that the quality of the data used to train these advanced systems directly determines their effectiveness. The old saying “garbage in, garbage out” (GIGO) holds true here; to unlock artificial intelligence potential – the power of pristine data is of the utmost importance.


Part 2 – The Perils of Poor Data Hygiene: Undermining AI Training and Performance

Neglecting data hygiene can have severe consequences for those who depend on AI for information. Data hygiene directly affects the training and performance of AI models, particularly in the critical domain of cybersecurity. Several common data quality issues can significantly undermine the effectiveness of even the most sophisticated AI algorithms.

Missing Data

One prevalent issue is incomplete data sets, in particular the presence of missing values in datasets (https://www.geeksforgeeks.org/ml-handling-missing-values/). This is a common occurrence in real-world data collections due to factors such as technical debt, software bugs, human error, or privacy concerns. The absence of data points for certain variables can significantly harm the accuracy and reliability of AI models. The lack of complete information also reduces the effective sample size available for training and tuning, potentially diminishing a model’s ability to generalize. Furthermore, and slightly more complicated: if the reasons behind the missing data points are not random, the introduction of bias becomes a real-world concern, because a model might learn skewed relationships from the incomplete data set. Ultimately, mishandling missing values can lead to biased and unreliable results, significantly hindering the overall performance of AI models.

Incomplete data can prevent ML models from identifying crucial patterns or relationships that exist within the full dataset. Addressing missing values typically involves either:

  • Removing data: deleting the rows or columns containing the missing elements. This comes with the risk of shrinking the dataset and potentially introducing biased results if the data is not missing at random.
  • Imputation techniques: employing imputation techniques to fill in the missing values with estimated data (e.g., a mean, median, or model-based prediction). While this preserves the dataset size, it can introduce its own form of bias if the estimates are inaccurate. A minimal sketch of both approaches follows.
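
Below is a minimal sketch of both approaches using pandas and scikit-learn; the DataFrame, column names, and values are illustrative assumptions rather than a real dataset.

```python
# Minimal sketch: dropping vs. imputing missing values with pandas/scikit-learn.
# Column names and data are illustrative, not from any specific dataset.
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame({
    "bytes_sent":   [1200, np.nan, 4800, 250, np.nan],
    "duration_sec": [30, 45, np.nan, 12, 60],
    "label":        [0, 1, 0, 0, 1],
})

# Option 1: removal -- simple, but shrinks the dataset and can bias results
# if the data is not missing completely at random.
dropped = df.dropna()

# Option 2: imputation -- preserves dataset size, but the estimates themselves
# can introduce bias if they are inaccurate.
imputer = SimpleImputer(strategy="median")
feature_cols = ["bytes_sent", "duration_sec"]
imputed = df.copy()
imputed[feature_cols] = imputer.fit_transform(df[feature_cols])

print(f"original rows: {len(df)}, after dropna: {len(dropped)}")
print(imputed)
```

Whether to drop or impute depends on how much data you can afford to lose and on the nature of the missingness, discussed next.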

The fact that missing data can systematically skew a model’s learning process, leading to inaccurate and potentially biased outcomes, highlights the importance of understanding the nature of the missingness. The types of missingness are:

  • Missing Completely At Random (MCAR)
  • Missing At Random (MAR)
  • Missing Not At Random (MNAR)

Understanding which of these applies directly shapes the strategy for addressing the issue. Arbitrarily filling in missing values without understanding the underlying reasons can be more detrimental than beneficial.

Duplicate Data

Moving beyond missing data elements, another significant challenge is duplicate data within training datasets. While collecting massive datasets has become easier, the presence of duplicate records can considerably degrade quality and ultimately the performance and accuracy of AI models trained on that data, leading to biased outcomes. Duplicate entries also skew model evaluation: when exact or near-duplicate records exist in both training and validation sets, a model’s performance on unknown data is overestimated. Conversely, if a model performs poorly on a duplicated data point, it can artificially deflate the overall performance metrics. Furthermore, duplicate data can lead to overfitting, where a model becomes overly specialized and fails to capture underlying patterns in new, unseen data. This is particularly true with exact or near duplicates, which can reinforce patterns that may not hold across a broader data set.

The presence of duplicate data is also computationally expensive, increasing training costs through unnecessary overhead during preprocessing and training. Additionally, duplicate data can lead to biased feature importance, artificially skewing the importance assigned to certain features if they are consistently associated with duplicated instances. In essence, duplicate entries distort the underlying distribution of a larger data set, which lowers the accuracy of probabilistic models. It is worth noting that the impact of duplicate data is not always negative and can be context-dependent. In some specific scenarios, especially with unstructured data, duplicates might indicate underlying issues with data processing pipelines (https://indicodata.ai/blog/should-we-remove-duplicates-ask-slater/). For Large Language Models (LLMs), the repetition of high-quality examples may appear as near-duplicates, and this can sometimes help reinforce important patterns (https://dagshub.com/blog/mastering-duplicate-data-management-in-machine-learning-for-optimal-model-performance/). This nuanced view suggests that intimate knowledge of a given data set, and of the goals of an AI model, is necessary when deciding how to handle duplicate data.
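
The following is a minimal sketch of one common mitigation: dropping exact duplicates and checking for records that leak across a train/validation split. The records and the naive split are illustrative only.

```python
# Minimal sketch: removing exact duplicates and checking for train/validation
# overlap that would inflate evaluation metrics. Data is illustrative.
import pandas as pd

df = pd.DataFrame({
    "src_ip":  ["10.0.0.5", "10.0.0.5", "10.0.0.9", "10.0.0.7"],
    "payload": ["GET /a",   "GET /a",   "POST /b",  "GET /a"],
    "label":   [0, 0, 1, 0],
})

# Drop exact duplicate rows before splitting.
deduped = df.drop_duplicates()

# Naive split for illustration only (use a proper splitter in practice).
train = deduped.iloc[: len(deduped) // 2]
valid = deduped.iloc[len(deduped) // 2 :]

# Check that no record appears in both sets -- duplicates shared across the
# split overestimate performance on unseen data.
overlap = pd.merge(train, valid, how="inner")
print(f"rows removed as duplicates: {len(df) - len(deduped)}")
print(f"records leaking across the split: {len(overlap)}")
```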

Inconsistent Data

Inconsistent data, or a data set characterized by errors, inaccuracies, or irrelevant information, poses a significant threat to the reliability of AI models. Even the most advanced and sophisticated models will yield unsatisfactory results if trained on data of poor quality. Inconsistent data can lead to inaccurate predictions, resulting in flawed decision-making with contextually significant repercussions. For example, an AI model used to decide whether an email is dangerous might incorrectly assess risk, leading to business-impacting results. Similarly, in security operations, a log-analyzing AI system trained on erroneous data could incorrectly classify nefarious activity as benign.

Incomplete or skewed data can introduce bias if the training data does not adequately represent the diversity of the real-world population. This can perpetuate existing biases, affecting fairness and inclusivity. Dealing with inconsistent data often necessitates significant time and resources for data cleansing. This leads to operational inefficiencies and delays in project timelines. Inconsistent data can arise from various sources, including encoding issues, human error during processing, unhandled software exceptions, variations in how data is recorded across different systems, and a general lack of standardization. Addressing this issue requires establishing uniform data standards and robust data governance policies throughout an organization to ensure that data is collected, formatted, and stored consistently. The notion of GIGO accurately describes the direct relationship between the quality of input data and the reliability of the output produced by AI engines.
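
As a concrete illustration, the sketch below standardizes inconsistently recorded values (case, whitespace, and synonymous labels) before training; the field names and mappings are assumptions for the example, not a prescribed schema.

```python
# Minimal sketch: standardizing inconsistently recorded values before training.
# Field names and mappings are illustrative.
import pandas as pd

df = pd.DataFrame({
    "severity": ["High", "HIGH ", "hi", "Low", "  low", "medium"],
    "country":  ["US", "U.S.", "United States", "GB", "UK", "gb"],
})

severity_map = {"hi": "high", "med": "medium"}
country_map = {"u.s.": "us", "united states": "us", "uk": "gb"}

def normalize(series: pd.Series, mapping: dict) -> pd.Series:
    """Trim whitespace, lowercase, then map known variants to a canonical form."""
    cleaned = series.str.strip().str.lower()
    return cleaned.replace(mapping)

df["severity"] = normalize(df["severity"], severity_map)
df["country"] = normalize(df["country"], country_map)
print(df.drop_duplicates())
```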

Here is a table summarizing some of what was covered in Part 2 of this series:

| Data Quality Issue | Impact on Model Training | Potential Consequences |
| --- | --- | --- |
| Missing Values | Reduced sample size, introduced bias, analysis limitations | Biased and unreliable results, missed patterns |
| Duplicate Data | Biased evaluation, overfitting, increased costs, biased feature importance | Inflated accuracy, poor generalization |
| Inconsistent Data | Unreliable outputs, skewed predictions, operational inefficiencies, regulatory risks | Inaccurate decisions, biased models |

Part 3 will cover cybersecurity applications and how bad data impacts the ability to unlock artificial intelligence potential – the power of pristine data.

Unlock Artificial Intelligence Potential – The Power Of Pristine Data



Part 1 – Defining Data Hygiene and Fidelity in the Context of AI and Machine Learning

Outside of areas like unsupervised learning, the foundation of any successful AI application lies in the data that fuels its models and learning processes. In cybersecurity, the stakes are exceptionally high. Consider a small security operations team with a disproportionate scope of responsibility. Rightfully so, this team may rely on a Generative Pre-trained Transformer (GPT) experience to balance the team size against that scope. If the GPT back-end data source is not solid, this team could suffer from inaccuracies and time sinks that lead to suboptimal results. The need for clean data is paramount. This goal encompasses two key concepts:

  • Data Hygiene
  • Data Fidelity

Data Hygiene

Data Hygiene refers to the processes required to ensure that data is “clean”, meaning it is free from errors, inaccuracies, and inconsistencies (https://www.telusdigital.com/glossary/data-hygiene). Several essential aspects contribute to good data hygiene:

  • Accuracy: This is fundamental, ensuring that the information is correct and devoid of mistakes such as misspellings or incorrect entries. More importantly, accuracy has a direct impact on keeping bias out of any learning model.
  • Completeness: This is equally vital to accuracy in terms of what feeds a given model. Datasets must contain all the necessary information and avoid missing values that could skew results.
  • Consistency: Consistency ensures uniform data formatting and standardized entries across different datasets, preventing contradictions. This has a direct impact on the effectiveness of back-end queries. For example, date formats vary internationally; to create an effective time-range query, those stored values must be formatted consistently (a minimal sketch follows this list).
  • Timeliness: This dictates that the data must be current and relevant for the specific purpose of training an AI model. This doesn’t exclusively mean current based on the data timestamp; legacy data also needs to be available in a timely fashion.
  • De-duplication: Removing duplicate records is crucial to maintain accuracy, avoid redundancy, and minimize any potential bias in the model training process.
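
To make the consistency point concrete, here is a minimal sketch that normalizes mixed date formats to ISO-8601 UTC so time-range queries behave predictably. The source formats and timestamps are assumptions for illustration.

```python
# Minimal sketch: normalizing mixed date formats to ISO-8601 (UTC) so that
# time-range queries behave consistently. The source formats are illustrative.
from datetime import datetime, timezone

# Known source formats, e.g. US-style, European-style, and ISO-8601.
KNOWN_FORMATS = ["%m/%d/%Y %H:%M", "%d.%m.%Y %H:%M", "%Y-%m-%dT%H:%M:%SZ"]

def to_iso_utc(raw: str) -> str:
    """Try each known format and return a canonical ISO-8601 UTC string."""
    for fmt in KNOWN_FORMATS:
        try:
            dt = datetime.strptime(raw.strip(), fmt).replace(tzinfo=timezone.utc)
            return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp format: {raw!r}")

timestamps = ["03/04/2024 14:22", "2024-04-03T14:25:00Z", "04.03.2024 14:30"]
normalized = [to_iso_utc(t) for t in timestamps]
print(normalized)  # all values now share one sortable, queryable format
```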

Implementing a robust data hygiene strategy for an AI project yields numerous benefits, including improved accuracy of AI models, reduced bias, and ultimately savings in the time and financial resources that organizations would otherwise spend correcting unsatisfactory results (https://versium.com/blog/ais-achilles-heel-the-consequence-of-bad-data). Very much like cybersecurity endeavors themselves, data hygiene cannot be an afterthought. The consistent emphasis on these core hygiene attributes highlights their fundamental importance for any data-driven application, especially in the critical field of AI. Moreover, maintaining data hygiene is not a one-time effort. It is a continuous set of processes and a commitment that involves regular audits and possible rebuilds of data systems, standardization of data input fields, automation of cleansing processes to detect anomalies and duplicates, and continuous processes to prevent deterioration of quality. This continuous maintenance is essential in dynamic environments such as cybersecurity, where data can quickly become outdated or irrelevant.

Data Fidelity

Data Fidelity focuses on integrity, accurately representing data from its original source while retaining its original meaning and necessary detail (https://www.qualityze.com/blogs/data-fidelity-and-quality-management). It is driven by several key attributes:

  • Accuracy: In the context of data fidelity, accuracy means reflecting the true characteristics of the data source without distortion. The data has a high level of integrity and has not been tampered with.
  • Granularity: This refers to maintaining the required level of detail in the data. This is particularly important in cybersecurity where subtle nuances in event logs or network traffic can be critical. A perfect example in the HTTP realm is knowing that a particular POST had a malicious payload but not seeing the payload itself.
  • Traceability: This is another important aspect, allowing for the tracking of data back to its origin. This can prove vital for understanding the context and reliability of the information as well as providing reliable signals for a forensics exercise.

Synthetic data is a reality at this point and is increasingly used to populate parts of model training datasets. Because of this, statistical similarity to the original, real-world data is a key measure of fidelity. High data fidelity is crucial for AI and Machine Learning (ML): it ensures models learn from data that accurately mirrors the real-world situations they aim to analyze and predict.
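
One hedged way to gauge that statistical similarity for a single numeric feature is a two-sample Kolmogorov-Smirnov test. The sketch below uses synthetic lognormal samples purely for illustration; a real fidelity assessment would examine many features and their correlations.

```python
# Minimal sketch: gauging the statistical similarity of a synthetic feature to
# its real-world counterpart with a two-sample Kolmogorov-Smirnov test.
# The distributions here are illustrative.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)

# Pretend "real" and "synthetic" request sizes (bytes) for one feature.
real_feature = rng.lognormal(mean=6.0, sigma=0.5, size=5_000)
synthetic_feature = rng.lognormal(mean=6.1, sigma=0.6, size=5_000)

statistic, p_value = ks_2samp(real_feature, synthetic_feature)

# A large KS statistic (and tiny p-value) suggests the synthetic data drifts
# from the real distribution -- i.e., lower fidelity for this feature.
print(f"KS statistic: {statistic:.3f}, p-value: {p_value:.3g}")
```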

This is particularly critical in sensitive fields like cybersecurity, where even minor deviations from the true characteristics of data could lead to flawed security assessments or missed threats (https://www.qualityze.com/blogs/data-fidelity-and-quality-management). The concept of fidelity extends beyond basic accuracy to include the level of detail and the preservation of statistical properties. This becomes especially relevant when dealing with synthetically generated data or when aiming for explainable AI models. 

The specific interpretation of “fidelity” can vary depending on the particular AI application. For instance, in intrusion detection, it might refer to the granularity of some data captured from a specific event. Yet in synthetic data generation, “fidelity” emphasizes the statistical resemblance to some original data set. In explainable AI (XAI), “fidelity” pertains to the correctness of the explanations provided by a model (https://arxiv.org/html/2401.10640v1). While accuracy remains a core component, the precise definition and emphasis of fidelity are context-dependent, reflecting diverse ways in which AI can be applied to the same field.

Here is a table summarizing some of what was covered in Part 1 of this series:

| Concept | Definition | Key Attributes | Importance for AI/ML |
| --- | --- | --- | --- |
| Data Hygiene | Process of ensuring data is clean | Accuracy, Completeness, Consistency, Timeliness, De-duplication | Improves accuracy, reduces bias, better performance |
| Data Fidelity | Accuracy of data representing its source | Accuracy, Granularity, Traceability, Statistical Similarity | Ensures models learn from accurate and detailed data, especially nuanced data |

Part 2 will cover the perils of poor data hygiene and how this negatively impacts the ability to unlock artificial intelligence potential – the power of pristine data.

Identity Risk Intelligence and its role in Disinformation Security


From Indicators to Identity: A CISO's guide to identity risk intelligence and its role in disinformation security

The power of signals, or indicators, is evident to those who understand them. They are the basis for identity risk intelligence and its role in disinformation security. For years, cybersecurity teams have anchored their defenses on Indicators of Compromise (IOCs), such as IP addresses, domain names, and file hashes, to identify and neutralize threats.

Technical artifacts offer security value, but on their own they are weak against advanced threats. Attackers can seamlessly spoof their traffic sources and rapidly cycle through operational infrastructure. Malicious IP addresses change quickly, making reactive blocking continuously futile; a flagged IP might be a transient Tor (The Onion Router) node rather than the actual attacker. Similarly, the static nature of malware file hashes makes them susceptible to trivial alterations. Attackers can modify a file’s hash in mere seconds, effectively evading signature-based detection systems. The proliferation of polymorphic malware, which automatically changes its code after each execution, further exacerbates this issue, rendering traditional hash-based detection methods largely ineffective.

Cybersecurity teams that subscribe to voluminous threat intelligence feeds face an overwhelming influx of data, a substantial portion of which rapidly loses its relevance. These massive “blacklists” of IOCs quickly become outdated or irrelevant due to the ephemeral nature of attacker infrastructure and the ease of modifying malware signatures. This data overload presents a significant challenge for security analysts and operations teams, making it increasingly difficult to discern genuine threats from the surrounding noise and to construct effective proactive protective mechanisms. The overload obscures critical signals and renders traditional intelligence far less effective: it details the attack but often misses the responsible actor. Critically, this approach provides little to no insight into how to prevent similar attacks from occurring in the future.

The era of readily identifying malware before user execution is largely behind us. Contemporary security breaches frequently involve elements that traditional IOC feeds cannot reveal – most notably, compromised identities. Verizon’s 2024 Data Breach Investigations Report (DBIR) indicated that the use of stolen credentials has been a factor in nearly one-third (31%) of all breaches over the preceding decade (https://www.verizon.com/about/news/2024-data-breach-investigations-report-emea). This statistic is further underscored by Varonis’ 2024 research, which revealed that 57% of cyberattacks initiate with a compromised identity (https://www.varonis.com/blog/the-identity-crisis-research-report).

Essentially, attackers are increasingly opting to log in rather than hack in. These crafty adversaries exploit exposed valid username and password combinations, whether obtained through phishing campaigns, purchased on dark web marketplaces, or harvested from previous data breaches. With these compromised credentials, attackers can impersonate legitimate users and quietly bypass numerous security controls. This approach extends to authenticated session objects, effectively nullifying the security benefits of Multi-Factor Authentication (MFA) in certain scenarios. While many CISOs advocate for MFA as a panacea for various security challenges, the reality is that it does not address the fundamental risks associated with compromised identities. IOCs and traditional defenses miss attacks from seemingly legitimate, compromised users. This paradigm shift necessitates a proactive and forward-thinking approach to cybersecurity, leading strategists to pivot towards identity-centric cyber intelligence.

Identity intelligence shifts the focus from technical IOCs to monitoring digital entities. Security teams now ask “which identities are compromised?” instead of just blocking IPs. This evolved approach involves establishing connections between various signals, including usernames, email addresses, and even passwords, across a multitude of data breaches and leaks to construct a more comprehensive understanding of both risky identities and the threat actors employing them, along with their associated tactics. The volume of data analyzed directly determines this approach’s efficacy; more data leads to richer and more accurate intelligence. An unusual login, for example, can trigger a check for compromised credentials via identity intelligence. Furthermore, the analysis can be enriched by examining historical data to identify patterns of misuse; recurring patterns elevate anomalies to significant events that indicate broader attacks. This data correlation provides contextual awareness that traditional intelligence lacks.

Fundamentally, identity signals play a crucial role in distinguishing legitimate users from imposters or synthetic identities operating within an environment. In an era characterized by remote and hybrid work models, widespread adoption of cloud services, and the ease of leveraging Virtual Private Network (VPN) services, attackers are increasingly attempting to create synthetic identities – fictitious users, IT personnel, or contractors – to infiltrate organizations. They may also target and compromise the identities of valid users within a given environment.

While traditional indicators like the source IP address of a login offer little value in determining whether a user truly exists within an organization’s Active Directory (AD) or whether that user is a genuine employee versus a fabricated identity, an identity-centric approach excels in this area. This excellence is achieved by meticulously analyzing multiple attributes associated with an identity, such as the employee’s email address, phone number, or other Personally Identifiable Information (PII), against extensive data stores of known breached data and fraudulent identities. Identity risk intelligence can unearth data on identities that simply appear risky. For example, if an email address with no prior legitimate online presence suddenly appears across numerous unrelated breach datasets, it could strongly suggest a synthetic profile.

Some advanced threat intelligence platforms now employ entity graphing to visually map and correlate these intricate and seemingly unrelated signals. Entity graphing involves constructing a network of relationships between various signals – connecting email addresses to passwords, passwords to specific data breaches, usernames to associated online personas, IP addresses to user accounts, and so forth. These interconnected graphs can become highly complex, yet they possess a remarkable ability to reveal hidden links that would remain invisible to a human analyst examining raw data.

An entity graph might reveal that a single Gmail address links multiple accounts across different companies and surfaces within criminal forums, strongly implicating a single threat actor who orchestrates activities across various environments. Often, these email addresses utilize convoluted strings for the username component to deliberately obfuscate the individual’s real name. By pivoting on identity-focused nodes within the graph, analysts can uncover associations between threat actors who employ obscure data points. The resulting intelligence is of high fidelity, sometimes pointing not merely to isolated threat artifacts but directly to the human adversary orchestrating a malicious campaign. This represents a new standard for threat intelligence, one where understanding the identity of the individual behind the keyboard is as critical as comprehending the specific Tactics, Techniques, and Procedures (TTPs) they employ.
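
A minimal sketch of the idea follows, using networkx to link fictional identity signals and then pivot outward from a single email address; the signals, breach names, and aliases are invented for illustration and do not reflect any vendor's graph model.

```python
# Minimal sketch: correlating identity signals with an entity graph using
# networkx. The emails, breaches, and aliases are entirely fictional.
import networkx as nx

G = nx.Graph()

# Nodes are typed signals; edges record where two signals were observed together.
signals = [
    ("email:xk77q@gmail.com", "breach:acme-2023"),
    ("email:xk77q@gmail.com", "breach:globex-2024"),
    ("email:xk77q@gmail.com", "alias:darkroute"),
    ("alias:darkroute", "forum:exploit-market"),
    ("password:Winter2023!", "breach:acme-2023"),
    ("password:Winter2023!", "email:j.doe@corp.example"),
]
G.add_edges_from(signals)

# Pivot on a single identity-focused node and walk outward two hops to surface
# hidden links a human analyst might miss in raw feeds.
pivot = "email:xk77q@gmail.com"
related = dict(nx.single_source_shortest_path_length(G, pivot, cutoff=2))
for node, distance in sorted(related.items(), key=lambda kv: kv[1]):
    print(f"{distance} hop(s): {node}")
```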

The power of analyzing signals for threat intelligence is not a new concept. For example, the NSA’s ThinThread project in the 1990s aimed to analyze massive amounts of phone and email metadata to identify potential threats (https://en.wikipedia.org/wiki/ThinThread). ThinThread was designed to sort through this data, encrypt US-related communications for privacy, and use automated systems to audit how analysts handled the information. By analyzing relationships between callers and their contacts, the system could identify potential threats, and only then would the data be decrypted for further analysis.

Despite rigorous testing and demonstrating superior data-sorting capabilities compared to existing systems, ThinThread was discontinued shortly before the 9/11 attacks. The core component of ThinThread, known as MAINWAY, which focused on analyzing communication patterns, was later deployed and became a key part of the NSA’s domestic surveillance program. This historical example illustrates the potential of analyzing seemingly disparate signals to gain critical insights into potential threats, a principle that underpins modern identity risk intelligence.

Real-World Example: North Korean IT Workers Using Disinformation/Synthetic Identities for Cyber Espionage

No recent event more clearly underscores the urgent need for identity-centric intelligence than the numerous documented cases of North Korean intelligence operatives nefariously infiltrating companies by masquerading as remote IT workers. While this scenario might initially sound like a plot from a Hollywood thriller, it is unfortunately a reality that many organizations have fallen victim to. Highly skilled agents from North Korea meticulously craft elaborate fake personas, complete with fabricated online presences, counterfeit resumes, stolen personal data, and even AI-generated profile pictures, all to secure employment at companies in the West. Once these operatives successfully gain employment, data exfiltration, or at the very least the attempt thereof, becomes virtually inevitable. In some particularly insidious cases, these malicious actors diligently perform the IT work they were hired to do, effectively keeping suspicions at bay for extended periods.

In 2024, U.S. investigators corroborated the widespread nature of this tactic, revealing compelling evidence that groups of North Korean nationals had fraudulently obtained employment with American companies by falsely presenting themselves as citizens of other countries (https://www.justice.gov/archives/opa/pr/fourteen-north-korean-nationals-indicted-carrying-out-multi-year-fraudulent-information). These operatives engaged in the creation of entirely synthetic identities to successfully navigate background checks and interviews. They acquired personal information, either by “borrowing” or purchasing it from real citizens, and presented themselves as proficient software developers or IT specialists available for remote work. In one particularly concerning confirmed case, a North Korean hacker secured a position as a software developer for a cybersecurity company by utilizing a stolen American identity further bolstered by an AI-generated profile photo – effectively deceiving both HR personnel and recruiters. This deceptive “employee” even successfully navigated multiple video interviews and passed typical scrutiny.

In certain instances, the malicious actors exhibited a lack of subtlety and wasted no time in engaging in harmful activities. Reports suggest that North Korean actors exfiltrated sensitive proprietary data within mere days of commencing employment. They often stole valuable source code and other confidential corporate information, which they then used for extortion. In one instance, KnowBe4, a security training firm, discovered that a newly hired engineer on their AI team was covertly downloading hacking tools onto the company network (https://www.knowbe4.com/press/knowbe4-issues-warning-to-organizations-after-hiring-fake-north-korean-employee). Investigators later identified this individual as a North Korean operative utilizing a fabricated identity, and proactive monitoring systems allowed them to apprehend him in time by detecting the suspicious activity.

For HR teams, CISOs, and CTOs alike, the lesson is that traditional security measures fail against this kind of sophisticated insider threat. Early detection of synthetic insiders is crucial to preventing late-stage damage. This is precisely where the intrinsic value of identity risk intelligence becomes evident. By proactively incorporating identity risk signals early in the screening process, organizations can identify red flags indicating a potentially malicious imposter before they gain access to the internal network. For example, an identity-centric approach might have flagged the KnowBe4 hire as high-risk even before onboarding by uncovering inconsistencies or prior exposure of their personal data. Conversely, the complete absence of any historical data breaches associated with an identity could also be a suspicious indicator. Consider the types of disinformation security that identity intelligence enables:

  • Digital footprint verification – by leveraging extensive breach and darknet databases, security analysts and operators can thoroughly investigate whether a job applicant’s claimed identity has any prior history. If an email address or name appears exclusively in breach data associated with entirely different individuals, or if a supposed U.S.-based engineer’s records trace back to IP addresses in other countries, these discrepancies should immediately raise concerns. In the context of disinformation security, digital footprint verification helps to identify inconsistencies that suggest a fabricated identity used to spread false information or gain unauthorized access. Digital footprint analysis involves examining a user’s online presence across various platforms to verify the legitimacy of their identity. Inconsistencies or a complete lack of a genuine online presence can be indicative of a synthetic identity.
  • Proof of life or Synthetic identity detection – advanced platforms possess the capability to analyze combinations of PII to determine the chain of life, or the likelihood of an identity being genuine versus fabricated. For instance, if an individual’s social media presence is non-existent or their provided photo is identified as AI-generated (as was the case with the deceptive profile picture used by the hacker at KnowBe4), these are strong indicators of a synthetic persona. This is a critical aspect of disinformation security, as threat actors often use AI-generated profiles to create believable but fake identities for malicious purposes. AI algorithms and machine learning techniques play a crucial role in detecting these subtle anomalies within vast datasets. Behavioral biometrics, which analyzes unique user interaction patterns with devices, can further aid in distinguishing between genuine and synthetic identities.
  • Continuous identity monitoring – even after an individual is hired, the continuous monitoring of their activity and credentials can expose anomalies. For example, if a contractor’s account suddenly appears in a credential dump online, identity-focused alerts should immediately notify security teams. For disinformation security, this allows for the detection of compromised accounts that might be used to spread malicious content or propaganda.

These types of sophisticated disinformation campaigns underscore the critical importance of linking cyber threats to identity risk intelligence. Static IOCs would fail to reveal the inherent danger of a seemingly “normal” user account that happens to belong to a hostile actor. However, identity-centric analysis – meticulously vetting the true identity of an individual and determining whether their digital persona has any connections to known threat activity – can provide defenders with crucial early warnings before an attacker gains significant momentum.

This is threat attribution in action. By prioritizing identity signals, attribution of suspicious activity to the actual threat actor becomes possible. The Lazarus Group, for instance, uses social engineering tactics on platforms like LinkedIn, distributing malware and stealing credentials via the platform, which highlights the need for identity-focused monitoring even on professional networks. Similarly, APT29 (Cozy Bear) employs advanced spear-phishing campaigns, underscoring the importance of verifying the legitimacy of individuals and their digital footprints.

The Role of Identity Risk Intelligence in Strengthening Security Posture

To proactively defend against the evolving landscape of modern threats, organizations must embrace disinformation security strategies and seamlessly integrate identity-centric intelligence directly into their security operations. The core principle is to enrich every security decision with valuable context about identity risk. This means that whenever a security alert is triggered, or an access request is initiated, the security ecosystem should pose the additional critical question: “is this identity potentially compromised or fraudulent?”. By adopting this proactive approach, companies can transition from a reactive posture to a proactive one in mitigating threats:

  • Early compromised credential detection – imagine an employee’s credentials leaking in a third-party breach. Traditional security would miss this until active login attempts occur, whereas identity risk intelligence raises an alert as soon as the credentials are detected in breach data or dark web dumps. This early warning allows the security team to take immediate and decisive action, such as forcing a password reset or invalidating active sessions. Integrating these timely identity risk signals into Security Information and Event Management (SIEM) and Security Orchestration, Automation and Response (SOAR) systems enables such alerts to trigger automated responses without requiring manual intervention (a minimal sketch of such a response follows this list). Taking this further, one can proactively enrich Single Sign-On (SSO) systems and web application authentication frameworks with real-time identity risk intelligence. The following table illustrates recent high-profile data breaches where compromised credentials played a significant role:

    Table 1: Recent High-Profile Data Breaches Involving Compromised Credentials (2024-2025)

| Organization | Date | Estimated Records Compromised | Attack Vector | Reference |
| --- | --- | --- | --- | --- |
| Change Healthcare | Feb 2024 | 100M+ | Compromised Credentials | Reference |
| Snowflake | May 2024 | 165+ Orgs | Compromised Credentials | Reference |
| AT&T | Apr 2024 | 110M | Compromised Credentials | Reference |
| Ticketmaster | May 2024 | 560M | Compromised Credentials (implied) | Reference |
| UK Ministry of Defence | May 2024 | 270K | Compromised Credentials (potential) | Reference |
| New Era Life Insurance Companies | Feb 2025 | 335K | Hacking | Reference |
| Hospital Sisters Health System | Feb 2025 | 882K | Cyberattack | Reference |
| PowerSchool | Feb 2025 | 62M | Cyberattack | Reference |
| GrubHub | Feb 2025 | Undisclosed | Compromised Third-Party Account | Reference |
| DISA Global | Feb 2025 | 3.3M | Unauthorized Access | Reference |
| Finastra | Nov 2024 & Feb 2025 | 400GB & 3.3M | Unauthorized Access | Reference |
| Legacy Professionals LLP | Feb 2025 | 215K | Suspicious Activity | Reference |
| Bankers Cooperative Group, Inc | Aug 2024 | Undisclosed | Compromised Email | Reference |
| Medusind Inc. | Jan 2025 | 112K | Data Seizure | Reference |
| TalkTalk | Jan 2025 | 18.8M | Third-Party Supplier Breach | Reference |
| Gravy Analytics | Jan 2025 | Millions | Unauthorized Access | Reference |
| Unacast | Jan 2025 | Undisclosed | Misappropriated Key | Reference |
  • Identity risk posture for users – leading providers offer something like an “Identity Risk Posture” Application Programming Interface (API). This yields a categorized value representing the level of exposure or risk associated with a given identity. The score is derived from meticulous analysis of a vast amount of data about that identity across the digital landscape; for instance, the types of exposed attributes, the categories of breaches, and the recency of the data are all analyzed. A CISO’s team can strategically use such a posture value to prioritize decisions and security actions. For example, a Data Security Posture Management (DSPM) solution identifies a series of users with access to specific data resources. If the security team identifies any of those users as having a high-risk posture, they could take action: investigations, the mandate of hardware MFA devices, or even more frequent and specialized security awareness training.
  • Threat attribution and hunting – identity-centric intelligence significantly empowers threat hunters to connect seemingly disparate signals, security events, and incidents. In the event of a phishing attack, a traditional response might conclude by simply blocking the sender’s email address and domain. However, incorporating identity data into the analysis might reveal that the phishing email address previously registered an account on a popular developer forum, and the username on that forum corresponds to a known alias of a specific cybercrime group. This enriched attribution helps establish a definitive link between attacks and specific threat actors or groups. Knowing precisely who is targeting your organization enables you to tailor your defenses and incident response processes more effectively. Moreover, a security team can then proactively hunt for specific traces within a given environment. This type of intelligence introduces a new dimension to threat attribution, transforming anonymous attacks into attributable actions by identifiable adversaries.
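
As referenced above, here is a minimal sketch of what a SOAR-style automated response to a compromised-credential alert might look like. The alert schema, the identity provider API URL, and both endpoints are hypothetical; real identity providers and SOAR platforms expose their own APIs.

```python
# Minimal sketch of a SOAR-style automated response to a compromised-credential
# alert. The alert schema, IDP_API URL, and endpoints are hypothetical; real
# identity providers and SOAR platforms each have their own APIs.
import requests

IDP_API = "https://idp.internal.example/api/v1"   # hypothetical identity provider API
API_TOKEN = "REPLACE_ME"                           # pulled from a secrets manager in practice

def handle_credential_exposure(alert: dict) -> None:
    """Force a reset and kill sessions when credentials appear in breach data."""
    user = alert["username"]
    headers = {"Authorization": f"Bearer {API_TOKEN}"}

    # 1. Invalidate active sessions so a stolen session object is useless.
    requests.post(f"{IDP_API}/users/{user}/sessions/revoke", headers=headers, timeout=10)

    # 2. Force a password reset on next login.
    requests.post(f"{IDP_API}/users/{user}/force-password-reset", headers=headers, timeout=10)

    # 3. Leave a trail for the SOC and any follow-on investigation.
    print(f"[containment] {user}: sessions revoked, reset forced "
          f"(source: {alert.get('breach_source', 'unknown')})")

# Example alert as it might arrive from an identity risk intelligence feed;
# calling handle_credential_exposure(example_alert) would run the containment steps.
example_alert = {"username": "j.doe", "breach_source": "darkweb-dump-2025-02"}
```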

Integrating identity risk signals into security tools via API is a best practice. Effective solutions offer API access to vast identity intelligence datasets. These APIs provide real-time alerts and comprehensive risk posture data based on a vast data lake of compromised identities and related data points (e.g., infostealer data). Tailored intelligence feeds continuously provide actionable data to security operations. This enables security teams to answer critical questions such as:

  • Which employee credentials have shown up in breaches, data leaks, and/or underground markets?
  • Is an executive’s personal email account being impersonated or misused?
  • Is an executive’s personal information being used to create synthetic, realistic looking public email addresses?
  • Are there any fake social media profiles impersonating our brand or our employees?

These identity risk questions exceed traditional network security’s scope. They bring crucial external insight – information about internet activity that could potentially threaten the organization – into internal defense processes.

Furthermore, identity-centric digital risk intelligence significantly strengthens an organization’s ability to progress towards a Zero Trust (ZT) security posture. ZT security models operate on the fundamental principle of “never trust, always verify” – particularly as it relates to user identities. Real-time information about a user’s identity compromise allows the system to dynamically adjust trust levels. For example, if an administrator account’s risk posture rapidly changes from low to high, a system can require re-authentication until investigation and resolution. This dynamic and adaptive response dramatically reduces the window of opportunity for attackers. Proactive interception of stolen credentials and fake identities replaces reactive breach response.
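
A minimal sketch of that dynamic trust adjustment appears below; the posture levels, thresholds, and decisions are illustrative assumptions, not any vendor's actual scoring or policy model.

```python
# Minimal sketch: adjusting trust dynamically from an identity risk posture.
# The posture values and decision thresholds are illustrative assumptions,
# not any vendor's actual scoring model.
from enum import Enum

class Posture(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3

def authentication_decision(user: str, posture: Posture, is_admin: bool) -> str:
    """Map identity risk posture to a Zero Trust style access decision."""
    if posture is Posture.LOW:
        return "allow"
    if posture is Posture.MEDIUM:
        # Step up verification rather than silently trusting the session.
        return "require_mfa"
    # HIGH risk: force re-authentication and hold access until investigated,
    # with a tighter response for privileged accounts.
    return "block_pending_investigation" if is_admin else "require_reauth"

print(authentication_decision("svc-admin", Posture.HIGH, is_admin=True))
print(authentication_decision("j.doe", Posture.MEDIUM, is_admin=False))
```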

Embracing Identity-Centric Intelligence: A Call to Action

The landscape of cyber threats is in a constant state of evolution, and our defenses must adapt accordingly. IOCs alone fail against modern attackers, and identity-focused threats demand stronger protection. For CIOs, CISOs, and CTOs, identity-centric intelligence is now a critical strategic necessity, as is understanding identity risk intelligence and its role in disinformation security. This necessary shift does not require abandoning your existing suite of security tools; rather, it involves empowering them, where appropriate, with richer context and more identity risk intelligence signals.

By seamlessly integrating identity risk data into every aspect of security operations, from authentication workflows to incident response protocols, security teams gain holistic visibility into an attack, moving beyond fragmented views. Threat attribution capabilities then become significantly enhanced, as cybersecurity teams can more accurately pinpoint who is targeting their organization. Identifying compromised credentials or accounts speeds incident response, enabling faster breach containment. Ultimately, an organization can transition into both proactive and disinformation security strategies.

Several key questions warrant honest and critical consideration:

  • How well do we truly know our users and their associated identities?
  • How quickly can we detect an adversary if they were operating covertly amongst our legitimate users?

If either of these questions elicits uncertainty, it is time to rigorously evaluate how identity risk intelligence can effectively bridge that critical gap. I recommend you begin by exploring solutions that aggregate breach data and provide actionable insights, such as a comprehensive risk score or posture, which your current security ecosystem can seamlessly leverage.

Identity-centric intelligence is vital against sophisticated attacks, surpassing traditional methods for breach detection. CISOs enhance breach prevention by viewing identity risk holistically and moving beyond basic IOCs. The North Korean infiltration cases and the breaches above highlight the urgent need for identity-focused security. Implementing identity risk intelligence, entity graphing, and Zero Trust builds a proactive, resilient security posture. Understanding and securing identities equips organizations to navigate complex future threats effectively. Fundamentally, this requires understanding identity risk intelligence and its role in disinformation security.

Cybersecurity Empowers Businesses to Soar, this is how


Cybersecurity empowers businesses to soar, this is how. The modern day notion that “cybersecurity is a business enabler” is a very popular one. The problem is that most of the people singing that tune are cybersecurity leaders trying to get their message out. The C-Suite and business folks may not agree, or even see this as actually being the case. So, exactly how is it that cybersecurity is a business enabler? This is an exploration of some concepts, with examples.

Cybersecurity teams typically focus on implementing guardrails and protective controls. That is the overt nature of the business. To be an enabler, the opposite (opening things up) may need to be a focal area. Reducing or eliminating excessive or unnecessary guardrails and controls could prove very effective. The targets of this exercise are those that hinder innovation and new initiatives but don’t add any tangible value. Sometimes these exist in the form of technical debt because a predecessor thought they made sense.

In modern day business cybersecurity is no longer just about protective mechanisms. Therefore, as technology advances, and adds value to businesses, cybersecurity must be prioritized so that the organizational goal isn’t “business”, but “safe business”. Safe business implies that asset and customer protection are important areas for an organization. Safe business is where cybersecurity empowers businesses to soar, this is how:

Risk Reduction

Any cyber event (attack, incident, etc.) can have a significant negative impact both operationally and financially. There is also the potential reputational consequence, although there are studies that refute this; here is one example. To be objective, the reputational impact of a cybersecurity event cannot be solely measured by stock price. Regardless, robust cybersecurity measures are a common way to reduce cyber-related risk, in turn providing some protection to organizational reputation and financial health (avoidance of costly legal fees, loss of revenue, etc.).

Example

An organization goes through the pain of implementing native, end-to-end encryption (E2EE) at the database level. This protects customer data both at rest and in transit. This added layer of protection raises the work factor for any nefarious entity targeting the organization that has custodial responsibilities over the relevant data sets. The move stands to increase confidence in the organization, among other benefits, by reducing the risk of data leakage and/or exposure.

Asset protection

Cybersecurity controls and protective mechanisms can protect an organization’s assets. The definition of asset is subjective to a given organization, but generally covers data (customer data, PII, intellectual property, etc), people, technology equipment, etc. By protecting assets, and preventing data breaches, a business can maintain a level of integrity related to their assets. Moreover, the organization can avoid potential financial and reputational damage. Once an organization does not have to worry about that potential damage it can operate in a safe, and focused, fashion.

Example

A data discovery exercise reveals that specific columns in a database store personally identifying information. This data is stored in the clear. Cybersecurity works with the respective engineering teams to implement native column level encryption and then appropriately modify all the relevant touch-points (apps, APIs, etc). This strong protective mechanism provides asset protection of sensitive data from nefarious entities. This protection in turn builds trust with customers and partners, enabling the business to grow.
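
As a hedged illustration of the idea (not the specific engineering work described above), the sketch below encrypts a sensitive field at the application layer using the cryptography library's Fernet recipe; a production deployment would typically rely on native database column encryption and a key management service.

```python
# Minimal sketch: application-level encryption of a sensitive column before it
# is written to storage, using the `cryptography` library's Fernet recipe.
# This only illustrates the idea; production systems would use native database
# column encryption and a proper key management service.
from cryptography.fernet import Fernet

# In practice the key lives in a KMS/HSM, never in code.
key = Fernet.generate_key()
fernet = Fernet(key)

def encrypt_field(plaintext: str) -> bytes:
    """Encrypt a single column value prior to persisting it."""
    return fernet.encrypt(plaintext.encode("utf-8"))

def decrypt_field(ciphertext: bytes) -> str:
    """Decrypt a column value for an authorized read path."""
    return fernet.decrypt(ciphertext).decode("utf-8")

# Illustrative row: only the PII column is transformed, other columns untouched.
row = {"customer_id": 42, "ssn": encrypt_field("123-45-6789")}
print(row["ssn"])                   # ciphertext at rest
print(decrypt_field(row["ssn"]))    # plaintext only on an authorized path
```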

Adherence to regulations

I will never confuse compliance with security. However, there is business value in being compliant with regulations, when appropriate. A number of industries have regulations regarding data privacy and security, often encountered in the form of HIPAA, GDPR, and CCPA. One way cybersecurity can add value to, or enable, a business is by ensuring adherence to the appropriate regulations. A company that can comply with regulations will have a good chance of avoiding fines, penalties, and legal issues. This section should be self-explanatory and needs no example.

Differentiation

Many organizations have been through the unfortunate circumstances that come with some sort of negative security event. The fact that they have is an indicator of some gap, or deficiency, in their security posture. Conversely, some organizations are rarely heard of within the context of negative security events. All organizations have security gaps, but the latter have probably invested more resources in differentiating themselves from the others. It is cybersecurity that can provide this competitive advantage to an organization or company. By implementing strong protective measures, an entity can demonstrate to customers and partners that it takes security seriously. This in turn makes it a more appealing business partner.

Example

An organization engages in honest, objective, and continuous assessments and penetration tests against its customer-facing environments. Those reports, untouched by the organization, are then published for the world to consume. This type of transparency shows goodwill and confidence on the part of the organization. Moreover, it differentiates the organization from competitors that haven’t been as forthcoming and transparent.

Customer / partner confidence

Customers, and partners, are becoming more aware of cyber risks. These risks are becoming part of normal life. As such, potential partners and customers now prioritize cybersecurity when considering engaging in business. By implementing effective cybersecurity measures, a company can improve the confidence these potential customers and partners have in it. The goal is to use that as a solid basis for business relationships. Over time this will also lead to increased loyalty and trust. Modern day customers, and partners, will trust a company that takes cybersecurity seriously and is committed to protecting their personal data.

Example

A data discovery exercise exposes many years of technical debt by way of dangling backup files. These files are mostly unencrypted, since file encryption wasn’t a big thing a decade ago. Inside the backup files there is personal data from databases. There is obvious risk here. Moreover, there is needless spending on the necessary storage. The intelligence gathered from discovery leads to engagements with the relevant engineering teams to clean this up. Sharing this discovery, and the subsequent action, with relevant parties demonstrates a commitment to data protection. This protects the organization from potential and needless data exposure and builds trust with customers and partners, enabling growth.

Enabling innovation

As companies continue to move forward in competitive fashion, innovation is a differentiator. Safe innovation makes cybersecurity involvement even more critical than under normal circumstances. Sound cybersecurity mechanisms can enable a company to innovate with confidence. This could consist of adopting new technologies, such as cloud computing, Internet of Things (IoT), and generative artificial intelligence. By implementing cybersecurity measures that align with business strategies, a company can improve their agility, level of innovation, and competitiveness.

Example

An IoT manufacturing company is building sensors. Those sensors will automatically send telemetry data to a cloud-based ecosystem for storage and eventual analytics. The cybersecurity team works with software developers to make sure that data gets transmitted in the safest possible manner: a combination of orthogonal encryption covering both transmission streams and payloads. Given that, self-contained executables can be compiled natively for multiple platforms, and that protected mode of transmission becomes portable. The hardware design team can then shift gears and change embedded platforms as needed without worrying about the data transport mechanism. This improves research & development, production efficiency, and agility, reduces cost, and enables the business to scale.
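
To illustrate the portability argument, here is a minimal sketch of application-layer payload encryption that travels with the data regardless of transport or embedded platform; the payload fields and key handling are assumptions for the example.

```python
# Minimal sketch: encrypting an IoT telemetry payload at the application layer
# so the protection travels with the data regardless of transport (MQTT, HTTP)
# or the embedded platform underneath. Payload fields and key handling are
# illustrative; real devices would provision keys securely at manufacture time.
import json
from cryptography.fernet import Fernet

device_key = Fernet.generate_key()      # provisioned per device in practice
cipher = Fernet(device_key)

def build_encrypted_telemetry(sensor_id: str, reading: float) -> bytes:
    """Serialize and encrypt one telemetry sample; send the result over any transport."""
    payload = json.dumps({"sensor_id": sensor_id, "celsius": reading}).encode("utf-8")
    return cipher.encrypt(payload)

# The cloud side decrypts with the same per-device key, independent of how the
# bytes arrived (TLS stream, message queue, batched upload, etc.).
blob = build_encrypted_telemetry("line-3-temp", 72.4)
print(json.loads(cipher.decrypt(blob)))
```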

Conclusion

In conclusion, cybersecurity is no longer a technology practice, nor is it just a set of defensive measures. It has become a legitimate business enabler that, when done right, can bring significant benefits to a company. From reducing enterprise risk to protecting corporate assets and ensuring adherence to regulations; from creating differentiators to improving customer/partner confidence; from enabling innovation to enhancing competitive advantage, strong cybersecurity measures can help a company thrive in today’s digital age. Hence, cybersecurity empowers businesses to soar, this is how.

Compliance does not equate to security, or protected


Compliance does not equate to security, or protected. They are not the same thing, and the dividing line has become somewhat blurred. Parts of the cybersecurity industry are in crisis due to an over-reliance on compliance frameworks. The unfortunate reality is that many cybersecurity leaders rely on these frameworks because they just don’t know any better. Maybe they were never practitioners and approach cybersecurity from a purely business perspective. Yes, we cybersecurity leaders need to have a business perspective. But we also need to know our craft and be able to drive true protective initiatives. Over-focusing on compliance efforts may actually hurt security. Even worse, it may give an organization a false sense of being protected.

Compliance has sort of lost its way

My understanding is that compliance initiatives never intended to give a false sense of security. The intentions were to:

  • provide leaders and practitioners guided mechanisms that could promote better security
  • provide a maturity gauge
  • possibly expose areas of deficiency

One of the areas where they intended to help was the introduction of industry standards. The well-intended thought process was that adherence to some well-defined standards could make organizations “more secure”.

While I commend the intent, the end result, our current state, could not have been foreseen. Earlier generations of cybersecurity leaders, those who came up through the practitioner ranks, treated these as mere guiding tools. Today, there are some with great-sounding titles who rely on these as evidence of their brilliance and fine work. The problem is that those unfortunate organizations have been led to believe that they are secure and protected.

This creates a misleading culture of compliance=security. Those who know no better push this rhetoric, leave a trail of insecure environments, and move their careers forward. The funniest part here is that often those leaders bring in an external entity to assess whether the organization in question is compliant. This is supposed to somehow add credibility to this weak approach. And this box-checking exercise is often pursued in lieu of encouraging a real focus on security and protective controls.

Secure companies might be compliant

A secure company may or may not be compliant. I have encountered environments that are well protected and have never been through any compliance exercise. I have also seen compliant organizations go through the unpleasant experience of one or more negatively impacting incidents. Look no further than Uber. According to this, they have received, and maintain, the highly coveted, and difficult to get, ISO 27001 certification. Yet here is an incomplete list of incidents showing that this does not equate to security: 2014, 2016, 2022.

How they differ

Objectives and target audience

The objective of a typical compliance exercise is to measure an organization against a model, or set of recommendations. These might be industry standards. Audiences interested in these types of results can range from internal audit to the C-Suite. Generally speaking, technologists are not interested in these data points. But, some industries place great weight on these types of results and will not even consider business opportunities with organizations that do not have them.

Security, on the other hand, either strives to reach certain levels of Confidentiality, Integrity, and Availability (CIA) and/or to be Distributed, Immutable, and Ephemeral (DIE). The objectives here revolve around threats, risks, and building a resilient organization. The audience is made up of those who see security by way of actual protection and resilience.

Rules of engagement

In security, there are none. Nefarious actors hardly ever play fair.

Compliance, however, has clear rules of engagement. In some cases the compliance exercise is even based on the honor system, think of the “self attestation” documents we have put together over the years. There is also a cadence where organizations know when relevant cycles begin and end. These are structured practices attempting to measure environments where the real battlefields have very little structure.

External motivations

When it comes to compliance, the external entities (auditors, assessors, etc) involved are motivated to generate a report or certificate. In some cases there are degrees of freedom for cutting corners or adjusting the wording of controls. This opens up many possibilities that aren’t all positive. Things can get manipulated so that a specific outcome (leaning towards the positive) is achieved. These folks are running a business and looking for repeat business.

The external entities on the security side are motivated purely by results (destruction, theft of your data, etc.).

Focal areas

Security is usually focused on protection against threats. This focus could very well reach very granular levels. Granularity levels aside, protection could be both active and reactive in nature. Prioritization plays a big role here. This is especially so in organizations of larger sizes. A cybersecurity leader ultimately directs protective dollars to focal areas of priority.

Conversely, compliance takes a broad and high level look at things, treating all areas equally and so there is very little by way of focus. These exercises are more about checking boxes against a list of requirements that are all of equal importance.

Rate of change

Compliance is generally static and operates on long cycles. Moreover, the process requirements don’t change often at all. The challenge here is that internal changes are not factored in until subsequent cycles, and that could be in one year’s time. In the world of security that feels like an eternity.

Security needs to keep up with constant changes. Simply factor in automated deployments and/or ephemeral cloud entities and it’s easy to see the impact of rate of change. Externally, nefarious entities are always changing, improving, adapting, learning. This means that an organization’s attack surface, and risk profile, are constantly changing. This alone negates the usefulness of a point in time compliance validation.

Organizational culture and the mindset

Mindset

Organizational culture drives the mindset people adopt at work. This can be a subtle dynamic, but it is powerful nonetheless. A healthy relationship between compliance and security is the right mindset and approach, as each is important. Compliance is necessary for modern-day business; without it, an organization may not even be able to bid for certain business opportunities. Different from this modern-day construct is security. Security is necessary to actually protect an organization, its assets, users, and customers. The nebulous tier that exists between the two is that compliance often represents the only set of data points for external entities (e.g., potential business partners, consumers, etc.) to consume and construct a maturity picture from.

A company can have the best product on the planet, but if it cannot prove to a potential business partner that the product is safe to use, the opportunity may not flourish. Here are a few simple examples:

  • SOC 2 reports represent proof of security maturity in the United States of America (USA)
  • In the United Kingdom there is generally the requirement of a Cyber Essentials certification
  • To interact with credit card providers, an organization needs to prove (at varying degrees) adherence to the Payment Card Industry Data Security Standard (PCI-DSS).

Culture

The organizational culture point here is that there are security-first and compliance-first mindsets. One actually protects while the other aims to prove that protection exists and is effective. Security protects the organization, while compliance proves it.

This space has evolved toward an erroneous approach: a compliance-first mindset is treated as if it means an organization is somehow secure. Sadly, this mindset merely provides protection against auditors, not attackers. There is no framework I am aware of that actually leads to anything protective. To expect security, or protection, from some compliance initiative or framework is foolish and amateurish. Worse yet, it can lead to a false sense of security for an organization.

Does a compliance-first mindset hurt security?

Yes, compliance leads to a false sense of security

Security hurts. It adds time to external projects (such as software development), costs money, and takes up human resources. Moreover, it is never a static exercise, as the entire space is always in flux (new sophisticated attacks, evolving internal changes, etc.). Falling back to compliance as a way to alleviate this pain is a mistake, but it happens. Compliance provides executive leadership a convenient way to create the illusion of security. This attitude can actually hurt a security program and minimize the effectiveness and/or support of projects that add real protective value.

Yes. Compliance often wins a prioritization battle

Theoretically, compliance and security should be synchronized in the goal of improving an organization’s ability to conduct business. More often than not, the two become competing entities for priority. At that point the synchronized goal ceases to exist. In so many organizations, when this prioritization conflict arises, compliance will win. Some executive will compare cost, effort, and ultimate benefit, and will fall for the allure of a compliance report or certificate. The perception at the executive level comes down to that person seeing the certificate or report as adding more value to the top/bottom line.

Yes – Top security talent sees compliance as soul-draining

Compliance work is not exciting to talented security practitioners, especially younger ones. Let’s face it, that is boring work compared to blue/red teaming, for instance. These folks come to the table ready to protect an organization’s environment. They absolutely do not come to an organization to check boxes on a document while adding screenshots for evidential purposes. From this perspective, compliance hurts security by becoming an obstacle in the way of real protective work. Worse yet, it creates an environment that is unpleasant for talented security team members.

Is a compliance-first mindset shocking?

Absolutely not. It is fair to say that entities (humans, companies, etc.) will generally pick the path of least resistance and pain in order to reach some goal. Contextually, compliance represents that path towards what the security-uneducated consider to be something positive. We cannot blame these people for confusing compliance and security. To the security-uneducated, the two can actually look identical. After all, they sound alike, talk the same game, and are sometimes spoken of interchangeably by some who should know better. Some executives mistakenly perceive a strong correlation between compliance and security. Because of this, a checkbox exercise can easily seem like the right thing to do.

Those of us in the security-educated space cannot take a judgemental approach to this unfortunate wave of development. It is entirely on us to educate these people and do our part to clear up the confusion that is permeating this subset of our domains. Educating these people selfishly benefits us: it future-proofs us from executives getting excited after reading the marketing campaign that promises to secure their business with an ISO-27001 certificate.

Security-first mindset and making compliance work for you

Pursuing an organization-wide security-first mindset is a must in today’s world. Nothing is digitally safe and you should trust nothing. In order to foster a pervasive cultural change towards a security-first mindset, you have to be very much in sync with the organization’s culture. More importantly, you need to understand the business itself so that your work enhances it by adding safety. The last thing you want to do is push a security-first mindset incorrectly, where it hinders business operations.

Internally, an understanding of the business factors in an understanding of the organization’s crown jewels and its holistic (inside-out and outside-in) attack surface. The key question is: what do we want to protect in this organization? The mindset discussed here then carries a subliminal bias towards always having the organization as a whole leaning in that direction.

Measuring the effectiveness of this mindset and cultural change is very important. This has to be continuous and happen over time. The results may not be linear as they can be impacted by many external factors (M&A, etc). But the key here is you have a deep connection to, and understanding of, the business and its culture.

Compliance frameworks, processes, and standards cannot provide answers to the questions that lead to the necessary, and change-impacting, connections discussed here. So compliance becomes a tool you manipulate and control in order to make it beneficial. Make it a value add to your security program and the organization as a whole. Make those processes become sources of valuable data points that, for example, can point you at deficient areas. There is inherent value if you learn anything from one of these exercises. This is certainly more beneficial than a checkbox exercise that becomes a pass/fail journey.

Final thoughts

It is understandable that many leaders see compliance frameworks, processes, and standards as something useful. They add structure to a space they may not truly understand. Don’t forget that a percentage of cyber and information security leaders did not come up the ranks as practitioners. They, understandably so, perceive compliance exercises as valuable. For those in the know, the value add from the compliance space comes in the form of posture improvements based on provided data points. However, when compliance efforts are prioritized, and treated as a source of security, harmful situations will arise.

I ask my peers to ponder this question: who exactly is your adversary? You are security-first in nature if your adversary is an actual nefarious entity. Contrarily, you are not if your adversary is some auditor from whom you need a passing grade. We should all be striving to implement a security-first mindset, regardless of the state of compliance within an organization, because compliance does not equate to security, or to being protected.

Attack Surface Management in the Cloud Era

Attack Surface Management in the Cloud Era: The Many Angles to Consider. This was the title of my talk at an invitation-only conference (2/21/2023) with “Executive Insights”. In this blog I share the material while trying to recount what I actually said during the live session. The session was 15 minutes long with 5 minutes of Q&A, hence the content is not intended to be granular or exhaustive.

Note: This is an ultra important topic for cybersecurity leaders who are truly focused on understanding their ecosystem and where to focus protective resources (human and dollars).

Slide 1

Attack surface management is a complex endeavor. Unfortunately a number of products in this space have marketed very well. In turn, some people in cybersecurity think that the purchase and deployment of some specific product actually gives them a handle on their attack surface. What I would like to share with you now are some angles to consider so that you may possibly expand your program to cover your organization in a more thorough way.

Slide 2

External perspective

The overt, and best known, space in regards to attack surface management is that of the external perspective. By that I mean what your organization looks like to the outside world, in particular to nefarious actors. This outside-in perspective generally focuses on public, internet-facing resources, especially for businesses that actually sell, or host, web-based products. The angles here mostly focus on which hosts are public facing and, in turn, which ports are actively listening per host. A good solution here actually probes the listening ports to verify the service being hosted. For the sake of accuracy and thoroughness, please don’t assume that everyone adheres to IANA’s well-known port list. It is entirely possible, for instance, to host an SSH server on TCP port 80, which would inaccurately imply that a web server is at play.
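To make that last point concrete, here is a minimal, illustrative sketch of service verification via banner grabbing. The host address is a placeholder, and real scanners do far more (protocol-specific probes, TLS handshakes, etc.); this only shows why trusting the port number alone is a mistake.

```python
import socket

def grab_banner(host: str, port: int, timeout: float = 3.0) -> str:
    """Connect to host:port and return whatever banner the service volunteers."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        try:
            return sock.recv(256).decode(errors="replace").strip()
        except socket.timeout:
            return ""  # silent services (e.g. plain HTTP) need a protocol-specific probe instead

# Example: an SSH daemon announces itself as "SSH-2.0-..." regardless of the
# port it listens on, so an "open port 80" is not necessarily a web server.
banner = grab_banner("203.0.113.10", 80)  # placeholder host (TEST-NET-3)
if banner.startswith("SSH-"):
    print("Port 80 is NOT a web server; it is running SSH:", banner)
```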

Shadow IT

A benefit of focusing on this outside-in angle is that it is a great way to expose shadow IT elements, if they exist and are hosting in a public facing capacity. I should also state that for this set of angles to be effective this has to be a continuous process such that new public facing hosts and ports are discovered in a rapid fashion. There are many products that serve this space and provide relevant solutions.

B2C / B2B

From the external perspective the natural focus is on the Business to Consumer (B2C) angle. This is where the space is predominantly made up of end-users/customers and web applications. All of what makes up a web app stack comes into play there. But from a Business to Business (B2B) perspective there is the less familiar area of APIs. Whether you run a REST shop or a GraphQL shop, there are unique challenges when protecting APIs. Some of those challenges revolve around authentication, authorization, and possible payload encryption in transit. For instance, is TLS enough to protect data in transit? Or do you consider an orthogonal level of protection via payload encryption, something like JSON Web Encryption (JWE) (if you use JSON Web Tokens (JWT)), for instance? It’s certainly an angle that needs consideration.
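As a rough illustration of what payload encryption on top of TLS can look like, here is a minimal sketch using the jwcrypto Python library. The shared symmetric key, the chosen algorithms, and the payload fields are all assumptions made for the example; real B2B deployments would layer in proper key management, rotation, and algorithm policy.

```python
# Minimal JWE sketch (pip install jwcrypto). Assumes a pre-shared 256-bit
# symmetric key between the two businesses; key management is deliberately
# simplified for illustration.
import json
from jwcrypto import jwk, jwe
from jwcrypto.common import json_encode

key = jwk.JWK.generate(kty="oct", size=256)  # in practice, distributed out of band

payload = json.dumps({"invoice_id": "12345", "amount": 250.00})

# Encrypt the payload itself, independent of the TLS channel it travels over.
token = jwe.JWE(payload.encode("utf-8"),
                json_encode({"alg": "A256KW", "enc": "A256CBC-HS512"}))
token.add_recipient(key)
wire_format = token.serialize()  # what the API actually transmits

# The receiving business decrypts with the same key.
received = jwe.JWE()
received.deserialize(wire_format, key=key)
print(received.payload.decode("utf-8"))
```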

Slide 3

Covertly there are a number of angles that require attention. Starting with the humans, let’s focus on insiders (meaning those inside of your network).

Insiders

There are threats and there are employees. Sometimes an employee becomes a threat, and the angle here is that they are already inside your network. This angle typically carries high risk because employees have authenticated access to sensitive systems and data. To complicate this, enter hybrid or work-from-home setups. Now your attack surface has expanded to areas outside of your control. Home networks are clearly high-risk environments, and since our traditional network perimeters no longer exist, those home networks are now part of your attack surface. Imagine a home network with kids downloading all kinds of crazy stuff. Then imagine a user that has your hardened work laptop at home. Then imagine the day that person feels like accessing work content from their personal machine. Or, better yet, they figure out how to copy crypto certificates and VPN software to their personal machine. Now the angle is an insecure machine, with direct access to your corporate network, that in turn has direct access to your cloud environments.

Non-generic computing devices

In reference to non-standard computing devices, there are risks based on this equipment being generally unknown to traditional IT teams. Imagine the HVAC controllers, motors, or PLCs required to operate the building you work in. Most of those devices are networked with IP addresses; they typically reside on your network and are part of your attack surface. Now let’s also consider the administrators of said equipment and the fact that they have remote access capabilities. Some VPN paths bring traffic in via your cloud environments with paths back to the building equipment. That is one angle. Then there are direct remote access scenarios, which put people onto your network and, in turn, create the possibility of access to your cloud environment. Misconfigurations like these happen all the time and are angles to be considered when studying your attack surface.

SaaS

Supply chain angles are now a big thing. Actually they have been for some time but recently they are getting a lot of industry attention. Let’s start with SaaS solutions. Are they your responsibility from a security perspective? Maybe, maybe not. But, your data goes into these systems. While it is an organizationally subjective decision if your data in a SaaS solution is part of your attack surface, it is an angle to consider. You should at least scrutinize the security configuration of the SaaS components that get accessed by your employees. The goal is to make sure the tightest security configurations possible are in use. Too often consultants get brought in to set SaaS environments up and they may take the easiest path to a solution, meaning the least secure. It happens.

SaaS integrations are even riskier. Now your data is being exchanged by N number of SaaS solutions outside of your control and, again, it’s your data being sent around. Were those integrations, typically API based, configured to protect your data or purely to be functional? After all my years in the industry, I can tell you that what I have seen puts me firmly in the cynical category when it comes to others doing what is most secure on your behalf.

FOSS

Part of modern day supply chains is open source code. We all know of the higher profile cases that involved negative events based on abuse of things like open source libraries. The angles here vary but I assure you most cloud environments are full of open source code. It is an angle that cannot be avoided in the modern world.

Slide 4

Alternate ingress pathways

A typical cloud setup only accepts sensitive network connectivity from specific networks. This is by design, so that sensitive communication paths, like SSH, are not accessible via the public internet. Typically these specific networks are our corporate networks and those, in turn, get remotely accessed via VPNs. It could also be the case that your VPN traffic itself flows into, and/or through, your cloud environment. So an angle to scrutinize is exactly where VPN connections land on your network; this may be leaving pathways open to your cloud infrastructure.

Another pathway of concern is direct, browser-based access to your cloud infrastructure. A successful authentication could give a user a privileged console for them to get work done. If this account gets compromised then there is substantial risk. The real danger with this ability is that it allows users to log in and do their work from personal machines that may not have the same protective controls as a work computer.

Privileged Users

SREs and sysadmins typically have elevated privileges and the ability to make impactful changes in cloud environments. The scripts and tools they use need to be considered part of your attack surface, because I am sure your teams didn’t write every piece of software they use. And so these scripts, the SRE’s machine, etc., all become possible alternate access paths to sensitive elements.

Database Administrators (DBAs) are valued team members. They typically have the most dangerous levels of access from a data perspective. This is obvious given their role, but it should also raise your risk alarms. Imagine a DBA working from a home-based machine that, for instance, has a keylogger on it. This is a machine that will also have data dumps on it at any given time. And of course we all know that DBAs and software engineers always sanitize data dumps for local working copies *[Sarcasm]*.

Git

Git – there are so many known examples of sloppy engineering practices around hard-coded, and in turn leaked, elements of sensitive data. One important angle to study is the use of secrets managers. Analysis must take place to sniff out hard-coded credentials, API keys, passwords, encryption keys, etc., and then re-engineering must take place to remove those angles from your attack surface. The removal is, obviously, a migration to a secrets manager as opposed to statically storing sensitive elements where they are easy to access.
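For illustration, here is a minimal sketch of the kind of sweep meant above. The regular expressions are assumptions, not an exhaustive rule set, and in practice a dedicated scanner (covering full git history, not just the working tree) is the better tool; this only shows the shape of the analysis.

```python
# Minimal hard-coded-secret sweep over a working tree. Illustrative patterns only.
import re
from pathlib import Path

PATTERNS = {
    "AWS access key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "Private key header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "Generic assignment": re.compile(r"(?i)(password|api[_-]?key|secret)\s*[:=]\s*['\"][^'\"]{8,}['\"]"),
}

def scan(repo_root: str) -> None:
    for path in Path(repo_root).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            for label, pattern in PATTERNS.items():
                if pattern.search(line):
                    print(f"{path}:{lineno}: possible {label}")

scan(".")  # anything found here is a candidate for migration into a secrets manager
```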

SSH

SSH tunnels and back-doors are both interesting and challenging. Unquestionably they represent a set of angles you need to hone in on. Detecting whether some SSH traffic is a reverse tunnel is not trivial, yet from an attack surface perspective it is hard to point at something that introduces this much risk. This scenario can expand your attack surface in a dangerous and stealthy way.

Ephemeral port openings

Temporary openings are a real problem. Ephemeral entities within cloud deployments are challenging enough, but the legitimate ones are typically not public facing. So, for example, let’s say you have containerized your web servers, and you are using elastic technology and possibly orchestration to successfully react to traffic spikes. That is an acceptable ephemeral use case, and one that is typically behind protective layers (Web Application Firewall, etc.). But what happens when the human factor creates bypasses to controls in order to facilitate ease of use in a specific scenario?

Story – In talking with some members of a start-up playing in this space, they told me about one of their interesting findings on a project. They discovered a case where a specific host, on a specific port, was open on specific days for a limited amount of time. Their investigation revealed that the SRE and DBA teams had an automated process that opened the port so a remote maintenance script could hit a database server directly. The SRE/DBA teams felt the exposure was so limited that there was no risk. It is an interesting angle from an attack surface perspective, and maybe one more people need to look for.
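One way to catch this kind of ephemeral opening is to scan the same targets on a schedule and alert on any port that is open now but was not open in the previous pass. The sketch below is illustrative only: the hosts and port range are placeholders, and in practice a proper scanner or the cloud provider’s flow logs would do this job far better.

```python
# Minimal sketch: diff the set of open ports between scheduled scans so that
# short-lived openings (like a maintenance window) surface as alerts.
import socket

TARGETS = ["203.0.113.10", "203.0.113.11"]   # placeholder hosts
PORTS = range(1, 1025)                        # illustrative range

def open_ports(host: str, timeout: float = 0.5) -> set[int]:
    found = set()
    for port in PORTS:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            if sock.connect_ex((host, port)) == 0:
                found.add(port)
    return found

def diff_scan(previous: dict[str, set[int]]) -> dict[str, set[int]]:
    current = {host: open_ports(host) for host in TARGETS}
    for host in TARGETS:
        new = current[host] - previous.get(host, set())
        if new:
            print(f"ALERT: {host} has newly opened ports {sorted(new)}")
    return current  # persist this and feed it into the next scheduled run

baseline = diff_scan({})  # first pass establishes the baseline
```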

Slide 5

Data

Data. The real target and the heart of our challenges. This is also where multiple angles exist. Let’s start with some simple questions, each of which should make you realize the angles at play …. In regards to all of the data you are responsible for, do you definitively know where it all is? Do you know all of the databases at play? Do you know every location where files are stored? Within those answers, do you truly know where all of your Personally Identifiable Information (PII) and/or Protected Health Information (PHI) data is? If you don’t, then those are angles you need to cover ASAP.

Once you know where the data is … then for each location you need to at least map ingress pathways. What can touch database X? Web apps, admin scripts, APIs, people? And what about egress pathways? Once someone touches a data store how can they exfiltrate? The angles within the ingress / egress challenge can be vast and some skilled analysis needs to take place in order to properly understand this part of your attack surface.
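One practical way to start answering “what can touch database X?” is to keep a simple, version-controlled inventory of data stores and their ingress/egress pathways, so the gaps stand out. The structure below is an illustrative sketch under my own assumed field names, not a standard or a product.

```python
# Illustrative data-store inventory: for each store, record what it holds and
# every ingress/egress path, so unknowns surface during review.
from dataclasses import dataclass, field

@dataclass
class DataStore:
    name: str
    classification: list[str]                           # e.g. ["PII"], ["PHI"]
    ingress: list[str] = field(default_factory=list)    # what can write/read into it
    egress: list[str] = field(default_factory=list)     # how data can leave it

inventory = [
    DataStore("customers-db", ["PII"],
              ingress=["web-app", "admin-scripts", "partner-api"],
              egress=["nightly-backup-bucket", "analytics-export"]),
    DataStore("hr-files-bucket", ["PII", "PHI"],
              ingress=["hr-saas-integration"],
              egress=["unknown"]),                       # an angle to investigate
]

for store in inventory:
    if "unknown" in store.egress or not store.ingress:
        print(f"Review needed: {store.name} has unmapped pathways")
```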

In Transit

Another data related angle to consider is that of your data in transit. But not to the outside world, on the inside of your cloud environment. More often than not the following scenario is real …. You have strongly protected TLS 1.2+ streams from the outside to a tier of load balancers controlled by your cloud provider. The load balancers terminate the relevant sockets and in turn terminate the stream encryption at play. From that point to the back-end all of the streams are in the clear. A lot of people assume that is a trusted part of the network. I am not in the business of trust and so that angle bothers me and I push for encrypted streams on the inside of my cloud environments. Otherwise that part of your attack surface is susceptible to prying eyes.
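A quick way to test that assumption of internal trust is to probe the hop behind the load balancer and see whether it actually speaks TLS, and at what version. This is a minimal sketch; the internal address is a placeholder, and certificate verification is disabled only because this is a reachability check, not a trust decision.

```python
# Minimal probe: does the internal hop behind the load balancer speak TLS at all,
# and which version does it negotiate?
import socket
import ssl

def tls_version(host: str, port: int, timeout: float = 5.0):
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE   # probe only; not a trust decision
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version()  # e.g. "TLSv1.3"
    except ssl.SSLError:
        return None                   # endpoint answered, but not in TLS

backend = ("10.0.12.34", 8443)        # placeholder internal address
version = tls_version(*backend)
print(f"{backend[0]}:{backend[1]} -> {version or 'cleartext / no TLS'}")
```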

Meta-data

Finally I would like to stress the potential value of proper analysis of meta-data. Take encrypted channels of communication as an example. If an encrypted egress path is identified then some skillful reconstruction of meta-data can yield some valuable intelligence even though you obviously can’t see the content that was moved along the discovered channels.
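As a simple illustration of what that reconstruction can look like, the sketch below aggregates flow records (who sent how much, to where, on what cadence) without ever seeing payloads. The record shape is an assumed, simplified flow-log format, and the alert threshold is arbitrary.

```python
# Even without payload visibility, flow meta-data shows who moved how much data,
# where it went, and when. Records use an assumed, simplified flow-log shape.
from collections import defaultdict

flows = [  # (src, dst, dst_port, bytes_out, hour_of_day) -- illustrative records
    ("10.0.1.5", "198.51.100.7", 443, 1_200_000_000, 2),
    ("10.0.1.5", "198.51.100.7", 443, 900_000_000, 3),
    ("10.0.2.9", "203.0.113.44", 443, 40_000, 14),
]

totals = defaultdict(int)
for src, dst, port, nbytes, hour in flows:
    totals[(src, dst, port)] += nbytes

for (src, dst, port), nbytes in sorted(totals.items(), key=lambda kv: -kv[1]):
    flag = "  <-- investigate" if nbytes > 500_000_000 else ""
    print(f"{src} -> {dst}:{port} moved {nbytes / 1e9:.2f} GB{flag}")
```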

Thank you for having me, I will take questions now …

Simplify, this is the key to effective cybersecurity messaging

Simplify, this is the key to effective cybersecurity messaging.

Simplify, this is the key to effective cybersecurity messaging. In cybersecurity leadership, effective messaging/communication will make or break you. I have written about the need to translate in our industry. This needs to be augmented by this notion of simplicity.

I often share the following challenge with my peers and/or those coming up in this field …. All you have is a white piece of paper and three colored crayons. With this toolkit design an effective visual and then construct the messaging around it. If you can effectively work with these limited tools then you are on a positive path to effectively relay a digestible message to a varied audience.

The audience matters. More importantly, their experience, knowledge base, and focal areas matter. A simple messaging pattern can act as a common denominator that positively impacts an audience of varied executives and/or directors. There are other human elements that come into play, for example emotional responses to events. I speak about some of this with SC Magazine here.

Another challenge, one that is very popular with entrepreneurs, is the development and relaying of an elevator pitch. If you can’t sell your program or effectively relay a message within a very tight time frame (e.g. 30 seconds), then your communication patterns may need some attention. I always think back to my startup days when we pitched to investors and had to go out there and raise money. A simple, clear, super concise, and focused message was the tactic, technique, and procedure for success. While most CISOs are not running around raising money, they can benefit from developing this art/skill.

Ultimately remember to simplify, this is the key to effective cybersecurity messaging.

Cuban Refugee to Cybersecurity Leader, My unique journey

Cuban Refugee to Cybersecurity Leader. From Cuba to the streets of New York to global Cybersecurity leadership.
Src: https://techcrunch.com/wp-content/uploads/2018/04/nyc-enterprise.png

Hispanic Heritage is celebrated in the U.S. at this time of the year (September). That makes this article, about my unique journey – Cuban Refugee to Cybersecurity Leader, that much more special. Growing up on the streets of New York as a Cuban immigrant was an instructive experience. Hispanic Executive (https://hispanicexecutive.com) published this article.

We discuss some of the challenges we (Latino immigrants) encounter coming up in the U.S. professional/corporate ranks. Most immigrants run into similar, sometimes worse, situations. I am cognizant of that. Personally, my many years of training in Judo brought me many relevant benefits. This is due to the discipline of the training, but also to the flowing, yielding, and toughness developed along the Judo journey.

Commonly a path to a technology career is neither linear nor straightforward. The article touches upon this and how I flow with my life’s directed momentum. My journey traverses multiple disciplines within technology. The challenges have been abundant. The lessons learned, priceless. The federal government space, the corporate world, the cybersecurity startup / product world, and consulting for some high profile entities, are worlds I have lived in.

Furthermore, I speak about two other areas in the article that are of paramount importance to me. Those areas are translation and balance.

Translation is of great importance and is relevant in multiple directions. Undoubtedly a successful leader is constantly translating information because part of achieving success is building bridges between entities that don’t speak the same language (i.e. business and tech). I blogged about some of this earlier in the year.

Unquestionably, balance can be elusive, especially within the context of work / life balance. Judo training forces one to understand their own balance on multiple fronts, physically, spiritually and emotionally. Within the context of cybersecurity leadership I speak of the need for balance between technical capacity and business acumen.

Interview: Transforming and Securing Education Through Tech

Transforming and Securing Education Through Tech
https://cybermagazine.com/magazine/cyber-september-2022

Without a doubt, I had a great time talking shop (Transforming and Securing Education Through Tech) with the team from Cyber Magazine (https://cybermagazine.com). You can read the interview at this link to the magazine article.

We talk about why security is critical to the present & future of education, especially considering that the face of education is changing.

Had a great Q&A Session with Education Technology Insights

Src: https://unsplash.com/photos/HwWBTd21wiA?utm_source=unsplash&utm_medium=referral&utm_content=creditShareLink

I recently had a great Q&A session with Education Technology Insights where I shared some thoughts. The subject was Cybersecurity: some general thoughts on what is happening currently and what may be coming. This was enjoyable in that it had me step back a bit and think about the bigger, more abstract picture.

The questions they asked me:

1. What are some of the major challenges and trends that have been impacting the Cybersecurity space lately?

2. What keeps you up at night when it comes to some of the major predicaments in the Cybersecurity space?

3. Can you tell us about the latest project that you have been working on and what are some of the technological and process elements that you leveraged to make the project successful?

4. Which are some of the technological trends which excite you for the future of the Cybersecurity space?

5. How can the budding and evolving companies reach you for suggestions to streamline their business?

The name of the article with my perspectives is “Protecting Critical Space Assets from Cyber Threats” and it can be found here: link.