By Gurdeep Singh, IT Tools Innovation Lead at Tryg
An AI sandbox enables the development of new AI technologies in a controlled environment, reducing the risk of violating laws, regulations and guidelines.
Based on a conversation at a recent meeting in our Oslo-based Automation & AI peer group, let me take you through the why, how, benefits, adoption strategy and also the risks of having an AI Sandbox.
Why an AI Sandbox?
An AI sandbox has proven to reduce the so-called 'time to market' for innovations and gives businesses increased legal certainty, thereby leading to more innovation.
The AI sandbox environment can be customized based on requirements at multiple levels:
Project level: This is where the configurations, controls, governance and audit requirements are set according to the project's deliverables, technologies and risk appetite.
Organization level: This enriches the project-level configuration with organization-level requirements that fall under the umbrella of IT and information security, responsible AI, local governance protocols and vision.
Regulatory level: This further enriches the configuration by providing guidelines and requirements for running the solution in production and in development. By treating these as a goalpost, project teams can adopt the recommendations and guidelines from the beginning.
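As a rough illustration of how the three levels build on each other, the sketch below combines the controls of each level and records which level first introduced each one. The control names and the effective_controls helper are hypothetical and only stand in for whatever your own governance defines.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical control names, for illustration only.
LEVELS: List[Tuple[str, List[str]]] = [
    ("project", ["audit_logging", "model_versioning"]),
    ("organization", ["rbac", "encryption_in_transit", "audit_logging"]),
    ("regulatory", ["human_oversight", "rbac"]),
]


def effective_controls(
    levels: Iterable[Tuple[str, Iterable[str]]]
) -> Dict[str, str]:
    """Map each control to the first level that requires it.

    Levels are processed in order, so a later level only adds controls
    that earlier levels have not already introduced.
    """
    origin: Dict[str, str] = {}
    for level_name, controls in levels:
        for control in controls:
            origin.setdefault(control, level_name)
    return origin


if __name__ == "__main__":
    for control, level in sorted(effective_controls(LEVELS).items()):
        print(f"{control}: required from the {level} level")
```

Keeping the origin of each control makes it easier to explain in an audit why a given guardrail exists in the sandbox.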
Why is an AI Sandbox required?
An AI sandbox is required because an AI solution involves several technical and functional considerations that must be addressed to ensure both its usability and its security during development and pre-production validation.
An AI sandbox gives more stability and control, and the ability to scale up, monitor and reduce time to market for a project, initiative or innovation.
In case of future audits and controls, the segregation of pre-production and production environments helps, as does sharing existing controls and guardrails across current and future AI projects.
The key components of an AI sandbox
When we create an AI sandbox, there are various technical and functional aspects and considerations that are supposed to be addressed to give a higher degree of control, security and usability.
Some of the key factors are listed below. First come some functional considerations and then, further down, you’ll find the technical ones.
Functional considerations
Data Sanitization: If the sandbox involves handling user data, ensure it is properly sanitized to prevent any malicious data from impacting the environment or other users. In case data harnessing or modeling is done in the environment, it should be controlled and contained. If possible, the harvesting, harnessing, sanitization and modeling should be done outside the sandbox and the results consumed as required. This keeps the sandbox lightweight and avoids data duplication that could lead to data leakage.
Input Validation: Validate and sanitize inputs to the AI models to prevent injection attacks and other potential security vulnerabilities. Input access should be based on role-based access control (RBAC) and go via a secured protocol. Data moving within a sandbox, between sandboxes, and between sandboxes and external environments should be encrypted (particularly traffic between sandboxes and between sandboxes and external environments).
Model Versioning and Rollback: Implement version control mechanisms for AI models so that you can easily revert to previous versions if necessary. Leverage enterprise code repositories connected to the sandbox if possible.
Testing and Quality Assurance: Thoroughly test the sandbox environment for potential vulnerabilities and bugs. Regularly update and patch the system to maintain security and functionality.
Dev and Testing Segregation: Development and testing environments, resources and test cases should be segregated in the sandbox. Tenants within the environments can help segregate the pre-production environments further.
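To make the input validation point more concrete, here is a minimal sketch, assuming the sandbox receives plain-text prompts. The role names, the length limit and the validate_request helper are illustrative assumptions, not part of any specific product, and checks like these are only a first line of defence rather than complete protection against injection attacks.

```python
import re
from dataclasses import dataclass

# Assumed values for illustration; agree on real ones with your security team.
MAX_PROMPT_CHARS = 2000
ALLOWED_ROLES = {"developer", "tester"}

# ASCII control characters, except tab, newline and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


@dataclass(frozen=True)
class SandboxRequest:
    role: str
    prompt: str


def validate_request(request: SandboxRequest) -> str:
    """Return a cleaned prompt, or raise if the request is not acceptable."""
    if request.role not in ALLOWED_ROLES:
        raise PermissionError(f"role {request.role!r} may not submit inputs")
    if not isinstance(request.prompt, str):
        raise TypeError("prompt must be a string")

    # Drop control characters and surrounding whitespace before size checks.
    cleaned = _CONTROL_CHARS.sub("", request.prompt).strip()
    if not cleaned:
        raise ValueError("prompt is empty after sanitization")
    if len(cleaned) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds the maximum allowed length")
    return cleaned
```

Rejecting a request outright, instead of silently repairing it, keeps the behaviour easy to audit and to test.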
Technical considerations
Isolation and Containment: The AI sandbox should be isolated from the rest of the system to prevent unauthorized access to sensitive data and unauthorized code execution. Containerization technologies like Docker, or virtualization, can be used to achieve this isolation. Exposure to data, applications and similar resources should go through a gateway or APIs, with proper tokenization and access controls.
Resource Management: Allocate appropriate resources to the sandbox, such as CPU, memory and disk space, to ensure it operates effectively without impacting other system components. The technical requirements and architecture should be agreed with the infrastructure and solution architect teams. In case the sandbox is not part of a container but part of a server farm, tenants should be created with dedicated resources. A future-ready resource setup is recommended to avoid conflicts over resources later.
Access Control: Implement strong access controls to limit who can access the sandbox environment. Role-based access control (RBAC) can be used to manage user permissions. Access should be time-limited and pruned regularly. It’s always advisable to create resource groups with different levels of access depending on the roles. Replicated, dummy or cloned access should be avoided so that each individual has the right access.
Monitoring and Logging: Implement comprehensive monitoring and logging mechanisms to track user activities, detect potential security issues or anomalies, and highlight user behavior. A single view for the admin or product owner helps in validating and monitoring. Trigger points for abnormalities should be identified beforehand and throughout the project, and notifications for them should be enabled. Comparative analysis against a baseline should be enabled as well, for regular review and reporting.
Automatic Shutdown: Set up automated processes to shut down the sandbox after a specified period of inactivity, reducing the risk of unauthorized access. It’s important that automatic shutdowns are agreed with the product owner or technical owner to ensure no disruption to the setup. Imaging and backup should be considered as well in case of mission-critical projects.
Secure APIs: If the AI sandbox allows external interactions through APIs, make sure they are secured using authentication, rate limiting and proper error handling. API usage should be logged as part of the monitoring and logging protocols, and tokens should be validated regularly.
Backup and Disaster Recovery: Regularly back up the sandbox environment to ensure data and models are recoverable in case of system failures or security incidents. It’s advisable that the BCP / DRP plan is agreed with the respective technical and product owners along with the BCP / DRP teams. In case of multiple tenants and server locations, the risk should be mitigated with backups on geographically separate servers if possible. The frequency of image backups should be agreed beforehand as per the BCP / DRP plans.
Communication Encryption: Ensure that all communications within the sandbox environment and between the sandbox and external systems are encrypted using secure protocols. The secure protocols in use, and the support for them provided by the base applications, should be assessed regularly.
Continuous Security Auditing: Regularly perform security audits to identify and address any vulnerabilities or weaknesses in the AI sandbox environment. Ensure the auditing team and the infrastructure and project teams have segregated duties and roles to avoid conflicts of interest. Guidelines for assessment should be agreed with the technical team and product owner.
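Two of the time-based controls above, time-limited access and automatic shutdown after inactivity, can be sketched in a few lines. This is only an illustration under stated assumptions: the AccessGrant structure, the role names and the helper functions are hypothetical, and a real setup would normally rely on the identity and scheduling features of the underlying platform.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class AccessGrant:
    user: str
    role: str             # hypothetical role name, e.g. "developer"
    expires_at: datetime  # every grant carries an expiry date


def is_allowed(grants: List[AccessGrant], user: str, role: str,
               now: datetime) -> bool:
    """True only if the user holds an unexpired grant for the role."""
    return any(
        g.user == user and g.role == role and g.expires_at > now
        for g in grants
    )


def prune_expired(grants: List[AccessGrant], now: datetime) -> List[AccessGrant]:
    """Regular pruning: keep only the grants that are still valid."""
    return [g for g in grants if g.expires_at > now]


def should_shut_down(last_activity: datetime, now: datetime,
                     idle_limit: timedelta) -> bool:
    """True once the sandbox has been idle for at least idle_limit."""
    return now - last_activity >= idle_limit
```

Passing the current time in as a parameter, instead of reading the clock inside each function, makes the rules easy to test and to replay during an audit. The idle limit itself is something to agree with the product owner or technical owner.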
Risks of an AI Sandbox
While an AI sandbox comes with plenty of short-term and long-term benefits, every setup also carries risks, or rather potential downsides.
Some of them include:
Duplication of resources: In the endeavor to set up the right AI sandbox and pre-production environment (PPE) for AI, technical teams can create duplicate sandboxes, or eventually an entire sandbox farm, overcomplicating the requirement and resulting in high maintenance costs and investments.
Synchronization issues: The prime objective of an AI sandbox is to provide a non-production setup that replicates the production setup. However, with ever-evolving technology and infrastructure, the environments must be kept in sync; otherwise the result is duplicated effort or a change in project scope.
Non-modular setup: It’s important that the setup is modular, agile and based on DevOps principles to ensure the AI sandbox can be reused in the long term for multiple projects. Failing to do so can impact cost and time.
Ready for your AI sandbox now?
Now that we have gone through the what, why and how of the AI sandbox, it might sound overwhelming; however, the benefits of the setup and framework are promising.
By considering these factors, you can design an AI sandbox environment that balances openness with security, allowing users to experiment and innovate while maintaining the integrity and safety of the overall system.
We have seen organizations face limitations with innovation and with building AI capabilities; setting up the right controls and an AI sandbox will help them on this journey.