Insights into Generative AI Technology

Dave Sloan, Chief Technology Officer in Microsoft’s Worldwide Public Sector team

Continuing our series of insights into the use of generative AI in the Public Sector, Dave Sloan, Chief Technology Officer in Microsoft’s Worldwide Public Sector team, shares more detail on the Azure OpenAI Service.

Introduction to Azure OpenAI Service for Public Sector

Public sector leaders are typically experts in their field and are distinguished by their focus on their ministry’s mission. They appropriately rely on their staff to tackle engineering issues and keep pace with technology trends.

Sometimes, however, the regular course of technology evolution produces disruptive innovations with the potential to fundamentally reshape how public sector organizations conduct their business and serve their citizens. These are not incremental improvements to existing systems, but completely new systems that open new realms of possibility.

Historically, this has included the dawn of such advancements as the internet, big data, mobile, and social media. These technical advancements rise to the level where even public sector executives need to build up a core level of understanding to properly envision the future of their own organization.

With generative AI tools bursting into the public consciousness, AI has finally crossed the threshold of essential executive competencies and is positioned to drive massive disruption across all industries, including the global public sector.

Generative AI is a subset of artificial intelligence that involves the use of algorithms and techniques to generate new data – things that have not existed in the world before being created by the models. This capability is based on previous advancements in AI, which focused on training machines to recognize patterns and make predictions based on existing data.

In the case of generative AI, that capacity for prediction is used to create content that the model predicts will be seen as realistic and satisfactory to the direction, or “prompt”, that it is given. During training, the model’s output is repeatedly compared with real examples and the model is adjusted to narrow the gap, a feedback loop that continues until the model can create data that is difficult to distinguish from data collected in the real world. This creates an extraordinarily lifelike natural language experience for users, one that approximates human conversation, across a very broad domain.
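To make the idea of "prediction used to create content" concrete, here is a minimal Python sketch. It is a toy illustration only, not how GPT models or Azure OpenAI Service are implemented: it simply records which word followed which in a tiny sample text and then extends a prompt one predicted word at a time. The function names and the sample sentence are assumptions made up for this example.

```python
import random
from collections import defaultdict


def build_bigram_model(text):
    """Record, for each word, the words that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(model, prompt_word, max_words=8, seed=0):
    """Extend the prompt one predicted word at a time."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    output = [prompt_word]
    for _ in range(max_words - 1):
        candidates = model.get(output[-1])
        if not candidates:
            break  # the model has never seen a continuation for this word
        output.append(rng.choice(candidates))
    return " ".join(output)


# Hypothetical training text, chosen only for illustration.
corpus = "the ministry serves the public and the public trusts the ministry"
model = build_bigram_model(corpus)
print(generate(model, "the"))
```

A real large language model replaces the simple lookup table with billions of learned parameters, but the basic loop of predicting the next piece of text and appending it is the same in spirit.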

The latest generative AI uses a type of large language model (LLM) called a Generative Pre-trained Transformer, or GPT, which has learned from vast amounts of text from the internet. Azure OpenAI offers a range of models for customer use. These models vary in size, and also in capabilities, cost, and complexity. Guidance as to which model is appropriate for specific conditions is included in the Azure OpenAI service documentation.

To provide a sense of how quickly the size, scope and sophistication of these models are advancing, one measure of the complexity of a model is how many parameters it manipulates. A parameter might be compared to a camera setting: you could think of zoom as one parameter, brightness as another, and focus as a third. You can adjust each of these parameters independently to optimize the picture that you take. An LLM is also trying to optimize its output, but it has many more parameters it can adjust to do so. GPT-2, released in 2019, had 1.5 billion parameters; GPT-3, released in 2020, has 175 billion. OpenAI has not disclosed the parameter count of GPT-4.
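As a rough illustration of what "counting parameters" means, the following Python sketch tallies the adjustable values in a very small, fully connected neural network. The layer sizes are hypothetical and far smaller than anything in a production model; the point is only that every connection (weight) and every offset (bias) is one parameter.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
        total += inputs * outputs  # one weight per input-to-output connection
        total += outputs           # one bias per output unit
    return total


# Hypothetical network: 3 inputs, a hidden layer of 4 units, 2 outputs.
print(count_dense_parameters([3, 4, 2]))  # (3*4 + 4) + (4*2 + 2) = 26
```

Even this toy network has 26 settings to tune. Scaling the same arithmetic to much wider and deeper layers is how parameter counts reach the billions.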

The model’s capabilities are not limited by content type. Current algorithms made available in the Azure OpenAI Service can create new natural language (GPT-3), new computer code (Codex), and will soon be able to create new images or alter existing pictures (DALL-E), all based on simple natural language textual prompts. VALL-E extends this ability to create synthesized audio clips for a given voice, with a specified text and emotional tone. The newly released GPT-4 encompasses much of this previous functionality, and can also interpret images and make deductions based on the context of these images.

This AI functionality is also being used by other software to make it both easier to access and more domain relevant to a variety of users. For example, embedded AI can now summarize Teams meetings that were missed, draft detailed email responses to correspondence, and create operational code in an integrated development environment.

In the future, we can expect to see generative AI being used in a wide range of industries, including the public sector. Given the nature of public sector scenarios, it is critical that careful consideration be given to the implications and responsible development and use of this technology, and that customers put safeguards in place to prevent misuse.

Overall, generative AI represents a significant advancement in the field of artificial intelligence and has the potential to revolutionize many aspects of our lives. Forward-looking public sector leaders will seek to gain a level of command over these capabilities and harness them to forge a bold vision of their organizations’ futures – and improve the lives of the people and missions they serve.

AI Compliance

The adoption of AI, and generative AI more specifically, is an exciting and pivotal moment for public sector leaders. However, the public sector has a high bar of legal compliance that must be met in order to ensure the use of AI reflects the values and priorities required for public trust. This article will try to address concerns frequently shared by public sector customers as they consider the adoption and use of AI for public sector services and missions.

Compliance, Security, and Safety

Azure OpenAI was designed with compliance and security in mind. One of the benefits of using Azure OpenAI is that it is built to take advantage of the security and compliance features that are already so well-established in Microsoft’s hyperscale cloud. This includes reliability, redundancy, availability, and scalability, all of which are designed into cloud services by default. Customers will likely benefit from aligning with existing and evolving international compliance standards as this both increases the speed of compliance and reduces the customer cost of compliance. This is especially true of compliance standards around AI which will need to adapt quickly to fast-moving innovation.

Microsoft is also committed to building safe AI systems. Azure OpenAI is designed to comply with our Responsible AI principles generally, and we provide specific guidance for Azure OpenAI usage through a Transparency Note. Furthermore, our standard Terms of Service are augmented with a Code of Conduct for Azure OpenAI where customers commit to avoid specific harmful uses and practices. Azure OpenAI is a limited access service so that Microsoft can verify permissible use cases and ensure customers will exert good faith efforts to comply with these expectations and restrictions.

Privacy and confidentiality

One of the most common concerns expressed by the public sector is that private personal data should not be exposed to, or used to enrich, non-public entities. By nature of their responsibilities, public sector organizations are entrusted with access to a broad array of deeply private and sensitive data about their citizens and employees. Additionally, government systems handle confidential, internal, and deliberative content that must be protected until it is ready for public release. Opportunities to use new technologies, no matter how powerful or helpful, cannot be considered if they violate that trust.

Microsoft believes that our customers’ data belongs to our customers. Azure OpenAI Service does not train on customer data. The large language models are trained on large corpora of text and images retrieved from public sources, not on private or customer data. Microsoft will also not assert copyright or intellectual property rights over the output of the model that customers or users generate with prompts.

When a customer uploads custom data to fine-tune the results of the GPT model, both the customer data and the results of the fine-tuned model are maintained in a protected area of the cloud, stored in the customer’s tenant, accessible only by that customer and separated by robust controls to prevent any other access. The customer data and results can additionally be encrypted by either Microsoft-managed or customer-managed encryption keys in a Bring Your Own Key format, if a customer so chooses.

Regardless of the encryption status, neither the raw data nor the fine-tuned model is shared with any other customers or with Microsoft. Both the raw data provided by the customer and the model generated from that contributed data can be deleted by the customer at any time. In most instances, Microsoft is able to support and troubleshoot any problems with the service without needing access to any customer data, such as the data that was uploaded for fine-tuning. In the rare cases where access to customer data is required, whether in response to a customer-initiated support ticket or a problem identified by Microsoft, customers can assert control over access to that data by using Customer Lockbox for Microsoft Azure. Customer Lockbox gives customers the ability to approve or reject any access request to their customer data.

Content filtering is the process by which responses are synchronously examined by automated means to determine if they should be filtered before being returned to a user. This examination happens without any need to store any data, and no humans review the prompts or the responses. (Prompts refer to the text provided by users as requests into the model, whereas responses refer to the data delivered back to the user.) Abuse monitoring is conducted by a separate process. By default, Microsoft temporarily stores request and response data for up to 30 days. This data may be accessed only by authorized Microsoft personnel to assist with debugging, and to protect against abuse or misuse of the system.
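The synchronous filtering step described above can be pictured with a short Python sketch. This is a simplified illustration under stated assumptions, not Microsoft's actual filtering logic: the blocked-term list, the function names, and the "[filtered]" placeholder are all hypothetical. What it shows is the shape of the process: the response is checked in-line, before it is returned, and nothing is written to storage along the way.

```python
# Hypothetical placeholder list; a real filter relies on trained classifiers.
BLOCKED_TERMS = {"example-banned-term"}


def is_allowed(text, blocked_terms=BLOCKED_TERMS):
    """Return True when the text contains none of the blocked terms."""
    lowered = text.lower()
    return not any(term in lowered for term in blocked_terms)


def respond(prompt, generate_fn):
    """Generate a response and examine it in-line before returning it.

    The check runs in memory only; neither the prompt nor the response
    is stored by this function.
    """
    response = generate_fn(prompt)
    return response if is_allowed(response) else "[filtered]"


# Usage with a stand-in for the model call (an assumption for this example).
print(respond("Summarize the meeting", lambda p: "Here is a short summary."))
```

Because the check happens inside the same request, the user either receives the response or a filtered placeholder, with no human involved in that decision.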

For customers who instantiate Azure OpenAI Service in Europe, any such human review of stored data is conducted exclusively by personnel in the European Economic Area. This human review may create a challenge for public sector customers, who need to strike a balance between the safety of the system and the risks of external access, even under controlled conditions. To accommodate that balance, Microsoft offers limited access features that allow approved customer use cases to opt out of these human review and data logging processes.

Sovereignty, Data Protection, and Data Residency

The technical infrastructure necessary to provision the large language models that support generative AI is unprecedented. The only venue for training and accessing these models is likely to be the huge datacenters of the global hyperscale cloud.

Public sector agencies that want to maintain access to this and other innovations need a thoughtful and balanced approach to risk management and resilience, as well as a modern procurement policy that allows them to use the advanced AI capabilities in the hyperscale cloud. The data processing for training Azure OpenAI Service fine-tuned models takes place in the region selected by the customer. It is important to recognize, however, that the real protection available to the public sector comes from the array of technical and policy tools that Microsoft places at customers’ disposal to secure their data, not from the physical location in which that data is processed.

Given the significant infrastructure that is required to power these models and the logistics of worldwide instantiation of the services, customers may be further encouraged to adopt a risk-based approach before establishing any associated data residency restrictions. Creating public sector requirements around data residency should be carefully weighed as they may create blockers to advanced functionality.

In terms of evaluating data sovereignty, Microsoft policies and public commitments have helped many customers to meet their sovereign requirements, especially in the public sector. Nothing about the use of Azure OpenAI Service creates any new challenge to sovereign requirements, and the robust tools to secure the fine-tuning training data allow for further mitigation of any concerns.

Compliance is never a fixed target, as governments have a legitimate responsibility to ensure that their society and interests are protected. Microsoft prides itself on developing solutions for public sector use that support multiple compliance goals and will continue to engage in dialogue with regulators around the world. Microsoft works hard to ensure that we are enabling innovation and advancing our customers’ visions in a way that is respectful of public sector values and national regulations.

About the Center of Expertise

Microsoft’s Public Sector Center of Expertise brings together thought leadership and research relating to digital transformation in the public sector. The Center of Expertise highlights the efforts and success stories of public servants around the globe, while fostering a community of decision makers with a variety of resources from podcasts and webinars to white papers and new research. Join us as we discover and share the learnings and achievements of public sector communities.
