Red Hat has updated Red Hat OpenShift AI, its cloud-based AI and machine learning platform, adding a model registry with model versioning and tracking capabilities, data drift detection and bias detection tools, and fine-tuning with low-rank adaptation (LoRA). Stronger security is also offered, Red Hat said.
Version 2.15 of Red Hat OpenShift AI will be generally available in mid-November. Features highlighted in the release include:
- A model registry, currently in technology preview, that provides a structured way to share, version, deploy, and track models, metadata, and model artifacts (a registration sketch follows this list).
- Data drift detection to track changes in the input data distributions of deployed ML models. This capability lets data scientists detect when the live data used for model inference differs significantly from the data the model was trained on, helping verify the model's ongoing reliability (illustrated in a sketch after this list).
- Bias detection tools to help data scientists and AI engineers monitor whether models are fair and unbiased. These predictive tools, from the open source TrustyAI community, also monitor model fairness during real-world deployments (a fairness-metric sketch follows this list).
- Fine-tuning with LoRA, enabling more efficient fine-tuning of large language models (LLMs) such as Llama 3. This lets organizations scale AI workloads while reducing costs and resource consumption (a LoRA sketch follows this list).
- Support for Nvidia NIM, a set of inference microservices that accelerates the delivery of generative AI applications (a client-side sketch follows this list).
- Support for AMD GPUs, with access to an AMD ROCm workbench image for model development on AMD hardware.
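OpenShift AI's model registry is based on the open source Kubeflow Model Registry project, which ships a Python client. The following is a minimal sketch of registering a model version with that client; the server address, author, model name, and storage URI are placeholders, and parameter names may differ across preview releases.

```python
# Minimal sketch: registering a model version with the Kubeflow
# model-registry Python client (pip install model-registry).
# The server address, author, and storage URI below are placeholders.
from model_registry import ModelRegistry

registry = ModelRegistry(
    server_address="https://model-registry.example.com",  # placeholder
    port=443,
    author="data-scientist@example.com",
)

registered = registry.register_model(
    "fraud-detector",                    # model name (placeholder)
    "s3://models/fraud-detector/1",      # where the artifacts live
    version="1.0.0",
    model_format_name="onnx",
    model_format_version="1",
    description="Fraud detection model, first tracked version",
)
print(registered.id)
```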
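Drift detection rests on statistically comparing training-time data with what the deployed model is seeing live. The sketch below illustrates that underlying idea with a two-sample Kolmogorov-Smirnov test from SciPy; it demonstrates the concept only and is not the TrustyAI or OpenShift AI API.

```python
# Illustration of the idea behind drift detection: compare the
# distribution of a feature at training time with the live inputs a
# deployed model receives. Uses SciPy's two-sample Kolmogorov-Smirnov
# test; synthetic data stands in for real training and inference logs.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)  # reference data
live_feature = rng.normal(loc=0.4, scale=1.0, size=1_000)      # shifted inputs

statistic, p_value = ks_2samp(training_feature, live_feature)
if p_value < 0.05:
    print(f"Drift detected (KS statistic={statistic:.3f}, p={p_value:.4f})")
else:
    print("No significant drift detected")
```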
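One common fairness metric of the kind TrustyAI supports is statistical parity difference (SPD), which compares a model's favorable-outcome rate between a privileged and an unprivileged group. A minimal NumPy sketch of that calculation (not the TrustyAI API itself) follows; a value near zero suggests the two groups are treated similarly.

```python
# Minimal sketch of statistical parity difference (SPD):
# SPD = P(favorable | unprivileged) - P(favorable | privileged).
# Toy data stands in for real model outputs and group labels.
import numpy as np

predictions = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])  # model outputs
group = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])        # 1 = privileged

rate_privileged = predictions[group == 1].mean()
rate_unprivileged = predictions[group == 0].mean()
spd = rate_unprivileged - rate_privileged

print(f"SPD = {spd:+.2f}")  # values near 0 indicate similar treatment
```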
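LoRA gets its efficiency by freezing the base model's weights and training only small low-rank adapter matrices, which sharply reduces the number of trainable parameters. Below is a minimal sketch using Hugging Face's peft library, one common way to apply LoRA (not necessarily what OpenShift AI uses internally); the base model and hyperparameters are illustrative.

```python
# Minimal sketch of applying LoRA with Hugging Face's peft library.
# The base model and hyperparameters are illustrative, not OpenShift
# AI defaults; training the adapted model is omitted for brevity.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

lora_config = LoraConfig(
    r=8,                                  # rank of the adapter matrices
    lora_alpha=16,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base model
```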
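NIM microservices expose an OpenAI-compatible HTTP API, so a deployed NIM can be called with standard clients. Here is a sketch using the openai Python package; the base URL, API key handling, and model name are placeholders for an actual deployment.

```python
# Sketch: calling a deployed NIM endpoint through its OpenAI-compatible
# API. The base_url and model id are placeholders; the api_key value
# depends on how the endpoint is secured.
from openai import OpenAI

client = OpenAI(
    base_url="http://nim-llama3.example.svc.cluster.local:8000/v1",  # placeholder
    api_key="not-needed-for-local-deployments",
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",  # placeholder model id
    messages=[{"role": "user", "content": "Summarize OpenShift AI 2.15."}],
)
print(response.choices[0].message.content)
```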
Red Hat OpenShift AI also adds capabilities for serving generative AI models, including the vLLM runtime for KServe, a Kubernetes-based model inference platform. Also new is support for KServe Modelcars, which adds Open Container Initiative (OCI) repositories as an option for storing and accessing model versions (sketched below). Additionally, private/public endpoint route selection in KServe lets organizations strengthen a model's security posture by routing traffic to internal endpoints when needed.
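With KServe Modelcars, a model is pulled from an OCI registry by pointing an InferenceService's storageUri at an oci:// reference. The sketch below expresses such a manifest as a Python dict that could be serialized to YAML and applied with kubectl; the runtime name, OCI image reference, and metadata are placeholders.

```python
# Sketch of a KServe InferenceService that serves a model with the vLLM
# runtime and pulls weights from an OCI registry (KServe Modelcars).
# Runtime name, OCI reference, and metadata are placeholders.
import yaml  # PyYAML

inference_service = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "llama-3-demo", "namespace": "demo"},
    "spec": {
        "predictor": {
            "model": {
                "modelFormat": {"name": "vLLM"},
                "runtime": "vllm-runtime",  # placeholder runtime name
                "storageUri": "oci://registry.example.com/models/llama-3:1.0",
            }
        }
    },
}

print(yaml.safe_dump(inference_service, sort_keys=False))
# Save the output and apply it with: kubectl apply -f <file>.yaml
```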