A Unified Approach to Managing AI Risks
By Andrew Burt, Mike Schiller, & Ben Lorica.
Artificial intelligence has a problem. In the last decade, companies have begun to deploy AI widely and have come to rely on a host of different tools and infrastructure. There are tools for data collection, cleaning, and storage. There are tools for model selection and training. There are tools for orchestration, and for performance monitoring and auditing. There are tools that focus on traditional predictive AI models. And there are other tools that focus on generative AI.
There are so many—too many—tools to help companies deploying AI.
But no tools provide a unified approach to ensuring AI models align to the law, regulations, or company values. What companies need is a unified alignment platform that addresses all AI risks, not just some of them. It is not enough to ensure that a model does not violate privacy restrictions, for example, if that model is overtly biased. It’s not enough to focus solely on traditional predictive AI or solely on generative AI systems. Companies use both and must address the liabilities of both.
But at the present moment, companies only have access to siloed tools that manage some but not all AI risks, forcing them to cobble together a complicated suite of tools if they are serious about ensuring their AI does not cause harm, violate laws, or tarnish their reputation.
Aligning AI Behavior
When we say “alignment,” we aren’t just referring to a model’s performance objectives, and we are not speculating about artificial general intelligence, its potential risks, or abstract society-level value alignment. To us, the AI alignment problem is much more specific.
At a high level, model objectives generally have two parts: do this, and then don’t do that. So there are positive directives, such as be helpful, and then there are prohibitions, such as don’t be harmful. Positive directives generally align to performance objectives, which receive most of the attention from data scientists. If a model can’t perform well, it won’t get deployed, which is why performance tends to get attention first. The best models must have high helpfulness with low harmfulness. Prohibiting certain model behavior is where the issues come in.
The best models must have high helpfulness with low harmfulness
There is a long list of things AI systems should not do—just like there is a long list of things software systems in general should avoid. For AI systems, this list has become siloed, meaning that each harm is addressed by separate teams with separate tools, if it is addressed at all. Companies have teams to address privacy issues, which are typically the responsibility of the chief privacy officer. Cybersecurity issues are addressed by information security teams, which report to the chief information security officer. Bias issues tend to fall on legal teams, who are the most familiar with antidiscrimination laws. Copyright concerns are similarly addressed by lawyers.
But what happens when all of these teams need to work together to manage AI? What tools can they use to align all of these requirements and ensure that AI systems do not violate the law or generate harm? Today, companies simply don’t have any good options, which is why it takes so long, and so much manual effort, for companies to manage AI risks holistically. And this is why existing efforts to manage AI risks cannot scale.
The Need for an Alignment Platform
What companies need is a single place to manage all of these risks—not simply to ensure high performance but to align against all the legal, compliance, and reputational risks that continue to grow and become more complex.
This need is already apparent in the growing number of AI incidents—just take one look at resources like the AI Incident Database and it’s clear that companies are struggling to manage all of these risks. Survey after survey has shown that companies are aware of this issue, with executives repeatedly identifying risk as one of the main barriers to adopting AI. We have helped companies manage these risks for years, and in our experience the main cause is a fragmented approach to managing AI risk, in which siloed teams use disconnected tools to keep track of their AI. An alignment platform is the only way for companies to achieve this kind of alignment at scale.
Alignment platforms are composed of three main layers:
The Workflow Management Layer. There is a range of teams and actors involved in getting AI into production—including legal, responsible AI, privacy, data science, product, IT, and sometimes even vendors. All of these teams need to know what to do, when to do it, and how to document their efforts. Without a standardized way to manage all of these teams and requirements, every model ends up being managed differently in practice, creating risk and confusion and undermining serious efforts to scale. Some models need to be reviewed and approved manually, but many more can be assessed through automated review processes to enable AI deployment at scale. Finally, it’s rare to drive model harmfulness to zero, which is why it’s important to have a consistent framework for assessing the tradeoff between helpfulness and harmfulness (a minimal sketch of such a check follows this list). All these different teams and requirements, in short, need to be aligned so they can be efficiently implemented and then scaled.
Analysis and Validation. Alignment efforts require testing to demonstrate their efficacy. Risks like bias, the use of sensitive data, toxicity, and more need to be quantitatively measured so that they can be addressed. This testing layer must be able to do two things: First, it must measure risks in predictive AI systems—those with binary, ranking, or multinomial outputs—as well as in generative AI systems, including multimodal models. Second, it must be able to conduct testing at specific points in time, such as during an AI impact assessment, and also to continuously test models for the same risks once they are in deployment. This isn’t just because model output may drift, but because legal and policy standards change over time as well. Think of this as ongoing monitoring, but for alignment. Finally, beyond measurement, this layer must enable mitigation of problematic model outputs, providing feedback to model developers, or directly to model training systems, to suppress harmful output.
Reporting Layer. Ensuring successful alignment is a big deal—when alignment efforts go wrong, consumers end up suffering significant harms, and legal and other types of liability can arise. This means that demonstrating what was done to minimize risks, and how these efforts were monitored and implemented, is critical for companies that want to protect their AI from risk. The reporting layer must capture logs of each test, as well as clearly and easily document alignment efforts and remediation steps taken for each AI system. Because so many different teams are involved in alignment, with so many levels of expertise, it is important that everything be simple, easily explainable, and readable, all in plain English.
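To make the tradeoff framework concrete, here is a minimal sketch, in Python, of what a single deployment gate might look like. The risk names, thresholds, and AlignmentDecision structure are illustrative assumptions for this article, not a prescribed standard or any particular vendor's implementation.

```python
from dataclasses import dataclass

# Illustrative per-risk limits a review team might agree on up front.
# These names and numbers are hypothetical, not a prescribed standard.
RISK_LIMITS = {
    "bias": 0.05,         # max tolerated disparity metric
    "toxicity": 0.01,     # max tolerated rate of toxic outputs
    "privacy_leak": 0.0,  # no tolerated leakage of sensitive data
}

@dataclass
class AlignmentDecision:
    approved: bool
    reasons: list[str]

def review_model(helpfulness: float, harm_scores: dict[str, float],
                 min_helpfulness: float = 0.8) -> AlignmentDecision:
    """Apply one consistent rule set to every model instead of ad hoc debate."""
    reasons = []
    if helpfulness < min_helpfulness:
        reasons.append(f"helpfulness {helpfulness:.2f} below {min_helpfulness}")
    for risk, score in harm_scores.items():
        limit = RISK_LIMITS.get(risk)
        if limit is not None and score > limit:
            reasons.append(f"{risk} score {score:.3f} exceeds limit {limit}")
    return AlignmentDecision(approved=not reasons, reasons=reasons)

# Example: a helpful model that still exceeds the toxicity limit is held back.
decision = review_model(0.91, {"bias": 0.03, "toxicity": 0.02, "privacy_leak": 0.0})
print(decision.approved, decision.reasons)
```

The specific numbers matter less than the principle: every team evaluates every model against the same documented rule set, so approvals are consistent and auditable.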
Alignment Platforms in Practice
We are acutely aware of the need for an alignment platform, which is why we built Luminos.AI. Throughout the last decade, we’ve watched companies adopt AI only to stumble—sometimes with serious consequences—in managing the requirements placed on these systems. In fact, the more AI models a company has, the harder it usually is to manage risks across all of those systems, and the more resource intensive and confusing its risk management efforts become.
Companies need one platform to manage all risks: performance, legal, compliance, and reputational
Luminos.AI was built from the ground up to manage these three essential layers of alignment.
The first, the workflow management layer, allows all the teams involved in building and deploying AI models to collaborate and standardize their efforts to drive efficiencies and enable their AI adoption to scale. It is not uncommon, for example, for customers to have months-long waits for every model that needs to be approved for deployment—some customers have waiting times of over a year! (Although this appears to be an outlier; more typical waiting times range from two to six months.) But these periods add up, meaning that the more AI systems a company wants to deploy, the worse this approval workflow process becomes. With the Luminos workflow management layer, companies can automatically review and approve AI systems, enabling them to deploy hundreds of systems when previously this was not possible.
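As a rough illustration of how automated review shortens those waits, the sketch below routes low-risk systems through automated checks and escalates everything else to human reviewers. The risk categories and routing rules are invented for the example; they are not Luminos’s actual logic.

```python
from enum import Enum

class ReviewPath(Enum):
    AUTO_APPROVE = "auto_approve"
    MANUAL_REVIEW = "manual_review"

# Hypothetical routing rules: which use cases always need human sign-off.
HIGH_STAKES_USE_CASES = {"credit_decisioning", "hiring", "medical_triage"}

def route_review(use_case: str, uses_sensitive_data: bool,
                 passed_automated_tests: bool) -> ReviewPath:
    """Send high-stakes or failing systems to humans; auto-approve the rest."""
    if use_case in HIGH_STAKES_USE_CASES or uses_sensitive_data:
        return ReviewPath.MANUAL_REVIEW
    if not passed_automated_tests:
        return ReviewPath.MANUAL_REVIEW
    return ReviewPath.AUTO_APPROVE

# A marketing-copy assistant with clean test results can skip the queue...
print(route_review("marketing_copy", False, True))   # ReviewPath.AUTO_APPROVE
# ...while a hiring screener always gets human review.
print(route_review("hiring", True, True))            # ReviewPath.MANUAL_REVIEW
```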
The second, the Luminos analysis and validation layer, allows risks to be quantified and aligned, meaning that companies no longer have to debate which tests to apply to each model and approve them manually. Without this layer, selecting and running the right tests adds weeks or months to alignment efforts, which is why we embed a range of different tests directly into the platform to enable testing at scale. Tests can also be customized so that teams can modify quantitative testing when they need to.
This layer is even more important for generative AI systems, which produce a volume of output that no human can possibly review. Our solution to this problem is to use our own AI systems—some of which are general, some of which are industry specific, and all of which can be fine-tuned to each customer’s data—to monitor and score every model output at an individual level to ensure alignment. This means that companies can obtain and store granular, transparent, and auditable alignment scores for every legal and risk requirement they need, every time a user interacts with a model.
Critically, measurements of both predictive and generative AI systems can be easily leveraged to mitigate a model’s risks by refining or fine-tuning the model to address specific harms. The platform is simple to integrate into model CI/CD pipelines to monitor alignment over time as both model outputs and risk profiles change, informing everyone with a stake in the system, not just the model team.
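One plausible way to wire this kind of monitoring into a CI/CD pipeline is sketched below: sample recent outputs, score each one, and fail the pipeline step when any risk drifts past its threshold. The score_output function is a stub standing in for whatever evaluator model or service a team actually uses, and the thresholds are assumptions.

```python
import statistics
import sys

THRESHOLDS = {"toxicity": 0.01, "pii_leak": 0.0}  # illustrative limits

def score_output(text: str) -> dict[str, float]:
    """Stub evaluator: in practice this would call a scoring model or service."""
    return {"toxicity": 0.0, "pii_leak": 0.0}

def alignment_gate(sampled_outputs: list[str]) -> bool:
    """Return True if mean risk scores stay under the agreed thresholds."""
    scores = [score_output(text) for text in sampled_outputs]
    for risk, limit in THRESHOLDS.items():
        mean_score = statistics.mean(s[risk] for s in scores)
        if mean_score > limit:
            print(f"FAIL: {risk} mean {mean_score:.4f} exceeds limit {limit}")
            return False
    print("PASS: all risk scores within limits")
    return True

if __name__ == "__main__":
    # In a real pipeline these would be outputs sampled from staging or production.
    recent_outputs = ["example output 1", "example output 2"]
    sys.exit(0 if alignment_gate(recent_outputs) else 1)
```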
The Luminos reporting layer is what proves that alignment has been successfully implemented. It consists of automatically generated reports, written in plain English and footnoted as if carefully put together by a human, that summarize how each AI system has been thoroughly managed for risk. We have seen customers use these reports to demonstrate compliance and even stave off lawsuits when needed. These reports are automatically stored as a system of record and kept up to date for as long as each model is in use, so they are always accessible when needed.
Finally, the platform provides an open API that other tools can build on and supports standard file formats, enabling flexible integration across the wide range of applications that work with AI systems. Tests, rules, and reporting are highly configurable, allowing each customer to adapt the system to their requirements, including integrating custom tests with our analysis and validation layer.
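The article does not document the API itself, but a common pattern for this kind of extensibility is a plugin-style registry that lets customers drop their own test functions in alongside built-in ones. The sketch below shows that generic pattern; the registry and function names are assumptions, not Luminos’s published interface.

```python
from typing import Callable

# A simple registry mapping test names to scoring functions.
CUSTOM_TESTS: dict[str, Callable[[str], float]] = {}

def register_test(name: str):
    """Decorator that lets a customer plug a bespoke test into the suite."""
    def wrapper(func: Callable[[str], float]) -> Callable[[str], float]:
        CUSTOM_TESTS[name] = func
        return func
    return wrapper

@register_test("contains_account_number")
def account_number_leak(output: str) -> float:
    """Toy customer-specific check: flag outputs that look like account numbers."""
    digits = [c for c in output if c.isdigit()]
    return 1.0 if len(digits) >= 10 else 0.0

def run_all_tests(output: str) -> dict[str, float]:
    """Run every registered test and return one score per test name."""
    return {name: test(output) for name, test in CUSTOM_TESTS.items()}

print(run_all_tests("Your balance for account 1234567890 is $42."))
# {'contains_account_number': 1.0}
```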
The Future of Alignment
The AI alignment platform is a new architecture and approach to ensuring AI systems behave as they should and are aligned with the right laws and the right values. Without AI alignment, companies cannot adopt AI. AI alignment platforms hold the key to this adoption, and Luminos.AI is leading the way.
This article first appeared on Luminos.AI.
Data Exchange Podcast
Postgres: The Swiss Army Knife of Databases. Timescale co-founders Ajay Kulkarni and Mike Freedman explain how Postgres has become a versatile powerhouse for time-series analytics, AI, and large-scale relational data. They showcase pgvectorscale for vector search, proving Postgres's ability to replace multiple specialized databases with its scalability, performance, and cost-effectiveness.
Unlocking the Power of Unstructured Data: From Vector Indexing to Distributed Training. LanceDB CEO and co-founder Chang She introduces Lance, an open-source data format built for multimodal AI. Lance surpasses formats like Parquet and ORC by optimizing storage and retrieval of complex data like images, videos, and embeddings, making it ideal for modern AI and machine learning.
The Highs and Lows of Generative AI
At the halfway mark of 2024, the landscape of generative AI continues to evolve rapidly, with both excitement and challenges shaping its trajectory. Building on my previous article about why generative AI projects fail, I've compiled a collection of observations about the current state of this transformative technology, drawing from recent reports, surveys, and conversations with industry experts.
The sentiment surrounding generative AI is mixed, reflecting both its immense potential and the hurdles it faces. On the negative side, high development and operational costs remain a significant barrier, particularly for smaller companies and startups. The looming power crunch and semiconductor bottlenecks pose additional challenges to AI's growth. There's also a risk of overhyped expectations leading to a potential bubble, reminiscent of past tech booms.
On the positive side, generative AI shows promise for significant productivity and efficiency gains. Established tech giants are leading investment and development, reducing the risk of underutilized capacity. Historical precedents of technological adoption cycles suggest that, despite initial skepticism, generative AI could drive transformative changes in the long run.
This spectrum of observations provides valuable clues for entrepreneurs and AI teams about the types of solutions and tools they should focus on building. Here's why it matters:
High costs and uncertain ROI are slowing AI adoption. Teams must optimize resource utilization and demonstrate tangible value quickly to secure continued support and funding.
There's a disconnect between perceived AI capabilities and reality. It's crucial to manage expectations, focus on achievable goals, and showcase realistic use cases to avoid disillusionment.
The current hype cycle is driving poor investment decisions. Unfortunately, many companies are investing in generative AI driven by a fear of missing out (FOMO). Instead of blindly following trends, teams need to align their solutions with specific business goals and existing processes.
The gap between AI infrastructure investment and revenue generation is widening. As this "AI investment hole" grows, teams must prioritize developing AI products that deliver clear, monetizable value to users and demonstrate a solid return on investment.
Strategic AI adoption is key. Companies must carefully assess how AI can improve their specific workflows and have a well-defined integration strategy.
LLMs are proving to be force multipliers for specific tasks. Identifying areas where these models can provide the most value allows teams to build solutions that significantly enhance productivity and efficiency.
Despite challenges, AI is demonstrably improving productivity in certain areas. This indicates that well-implemented generative AI solutions can drive substantial efficiency gains, allowing businesses to streamline operations and focus on higher-level strategic activities.
As you navigate the complex landscape of generative AI, it's clear that success lies in balancing enthusiasm with pragmatism. By focusing on strategic implementation, managing expectations, and prioritizing solutions that deliver tangible value, AI teams and entrepreneurs can position themselves to capitalize on the technology's immense potential while mitigating its risks.
Ben Lorica edits the Gradient Flow newsletter. He helps organize the AI Conference, the NLP Summit, Ray Summit, and the Data+AI Summit. He is the host of the Data Exchange podcast. You can follow him on LinkedIn, Twitter, Reddit, or Mastodon. This newsletter is produced by Gradient Flow.
The four pillars of AI alignment are known as the RICE principles:
Robustness: Refers to the AI system's ability to operate reliably in various environments and resist unexpected interference.
Interpretability: Requires us to understand the internal reasoning process of the AI system, especially opaque neural networks. Through interpretability tools, the decision-making process is made open and understandable to users and stakeholders, thereby ensuring the system's safety and operability.
Controllability: Ensures that the behavior and decision-making process of the AI system are subject to human supervision and intervention. This means that humans can correct deviations in system behavior in a timely manner, ensuring that the system remains aligned during deployment.
Ethicality: Requires that AI systems adhere to socially recognized ethical standards in their decisions and actions, respecting the values of human society.