Many organizations struggle to create value from data while meeting the strict regulatory standards that govern sensitive data. For these organizations, handling large, complex data sets efficiently, securely and at scale is paramount. The collaboration between Red Hat and Cloudera offers a solution that helps organizations manage the complete data lifecycle, putting data to work faster and reducing time to value. With Cloudera Private Cloud on Red Hat OpenShift, organizations get aggregated and visualized data that can help them derive actionable insights in a security-focused, hybrid, open source environment. In this article, you’ll see how to take advantage of these capabilities for your AI strategy, using Cloudera workloads running on Red Hat OpenShift.
Introducing Cloudera on Red Hat OpenShift
Red Hat OpenShift is the industry’s leading hybrid cloud application platform powered by Kubernetes, combining a comprehensive set of tools and services to streamline the entire application lifecycle, from development to delivery to management of application workloads. It combines built-in security features with dedicated support, a trusted software supply chain and Red Hat Enterprise Linux as the operating foundation. With capabilities such as built-in monitoring, on-demand environments and centralized policy management, OpenShift is trusted by customers around the globe to run their workloads.
Cloudera Platform running on OpenShift delivers powerful analytic, transactional and machine learning (ML) workloads in a hybrid data platform. With a choice of traditional and elastic analytics and scalable object storage, Cloudera on-premises modernizes traditional monolithic cluster deployments in a powerful and efficient platform. It enables end users to protect sensitive data while unlocking the power of AI to increase innovation in their business. Three microservices power this platform:
Cloudera Data Warehouse (CDW)
Cloudera Data Engineering (CDE)
Cloudera AI
Cloudera AI allows developers to develop, deploy and manage AI models in a security-focused, scalable environment. Developers can then use these models to build AI agents and AI applications that address various business needs.
The power of Cloudera AI on Red Hat OpenShift
Cloudera AI provides the tools for data science teams to collaborate across the full data lifecycle. It delivers access to security-focused, trusted data pipelines, scalable compute resources and supporting tools. By running the platform on Red Hat OpenShift, Cloudera AI takes advantage of Kubernetes container orchestration to manage resources efficiently and scale workloads dynamically. This helps organizations build AI models with enterprise-grade security and compliance capabilities.
Running Cloudera AI on OpenShift allows data scientists, ML engineers, DevOps engineers and AI developers to collaborate by utilizing a shared workspace while protecting sensitive data.
Benefits of Cloudera AI on Red Hat OpenShift
Cloudera manages 25 exabytes of data and is used by 9 out of 10 of the largest global companies. With over a decade of experience developing solutions for customers with large data footprints, Cloudera is uniquely positioned to offer a generative AI (gen AI) solution that scales to customer needs. Customers can use Cloudera AI integrated with their organization’s data to impact their business decisions and generate successful outcomes.
Enterprise-grade security and compliance capabilities for sensitive data
Security is paramount when deploying AI models that use sensitive company data. Cloudera on OpenShift provides the following security features:
Security-focused containerized workloads: Namespaces in Kubernetes provide a mechanism to scope resources in a cluster. They give named resources a unique scope to avoid basic naming collisions, delegate management authority to trusted users and limit community resource consumption. By running AI and ML workloads within containers on Red Hat OpenShift, Cloudera AI helps ensure that each workload is isolated in its own namespace, reducing the risk of data breaches and vulnerabilities that could arise in a standard non-Kubernetes deployment model. Learn more about the various security features in Red Hat OpenShift.
Governance and compliance: Cloudera offers built-in capabilities for tracking data lineage, auditing access and complying with regulations such as GDPR, HIPAA and more. Apache Ranger, part of Cloudera’s Shared Data Experience (SDX), offers fine-grained access control to data in a secure Cloudera data lakehouse. Apache Atlas, also a component of SDX, provides auditing capability for security teams to track data usage, access and metadata management. These components give customers visibility into how data is accessed and used, which is essential for compliance and regulatory reporting on sensitive data sets. Red Hat also maintains various security and compliance certifications for Red Hat OpenShift. Together, Cloudera’s and Red Hat’s compliance capabilities help keep data safe on the platform.
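The namespace scoping described above can be illustrated with a small toy model. This is a sketch only, not the Kubernetes or OpenShift API: the `Cluster` class, its methods and the team and job names are hypothetical and exist purely to show that a resource is identified by its namespace and name together, so two teams can reuse the same name without colliding.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model (hypothetical): resources are keyed by (namespace, name)."""
    resources: dict = field(default_factory=dict)

    def create(self, namespace: str, name: str, spec: dict) -> None:
        key = (namespace, name)
        # A collision only happens when both namespace and name match.
        if key in self.resources:
            raise ValueError(f"{name!r} already exists in namespace {namespace!r}")
        self.resources[key] = spec

    def list_names(self, namespace: str) -> list:
        # Return only the names scoped to the requested namespace.
        return sorted(n for (ns, n) in self.resources if ns == namespace)


cluster = Cluster()
cluster.create("team-fraud", "training-job", {"image": "trainer:1"})
# Same resource name in a different namespace: no collision.
cluster.create("team-churn", "training-job", {"image": "trainer:2"})
print(cluster.list_names("team-fraud"))
```

Creating a second `training-job` inside `team-fraud` would raise an error in this model, while the same name in `team-churn` is accepted, which is the scoping behavior the paragraph describes.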
Efficient resource management and scalability
Building and deploying AI models requires significant computational resources. The demand fluctuates based on the stage and type of the AI project. Cloudera AI running on Red Hat OpenShift provides the necessary flexibility out of the box to scale up resources during peak workloads and scale down when resources are no longer required.
Using Kubernetes-based infrastructure with Red Hat OpenShift, Cloudera AI enables customers to handle large-scale AI tasks, such as:
Training deep learning models: Cloudera AI can efficiently train complex ML models by using the wide range of GPUs that Red Hat OpenShift supports, including heterogeneous GPU configurations.
Running inference at scale: Once trained, AI models can be deployed to production environments, where a continuous data stream is typically configured to enable real-time inference and the deployment scales automatically as demand increases.
AI agents and applications: Cloudera AI facilitates deploying and scaling AI-driven agents and applications, including interactive chatbots, virtual assistants, document summarization and autonomous decision-making systems.
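To make "scales automatically as demand increases" concrete, here is a minimal sketch of the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). Whether Cloudera AI uses this autoscaler for inference is an assumption, not something this article states, and the function name and sample numbers are hypothetical. The sketch also omits the tolerance band and minimum and maximum replica bounds that a real autoscaler applies.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Simplified Horizontal Pod Autoscaler rule (no tolerance, no min/max bounds)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)


# Hypothetical numbers: 4 inference replicas, each seeing 180 requests/s
# against a target of 100 requests/s per replica.
scale_up = desired_replicas(4, 180.0, 100.0)
# Demand drops to 50 requests/s per replica.
scale_down = desired_replicas(4, 50.0, 100.0)
print(scale_up, scale_down)
```

The same rule covers both directions: load above the target increases the replica count, and load below it reduces the count.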
Cloudera Quota Management gives administrators the power to control how resources within the Cloudera AI workbench are allocated across teams and projects. By setting quotas for CPU, memory and GPU usage for specific projects and workspaces, administrators can make sure resources are used efficiently and prevent a single team or project from monopolizing them. These capabilities are essential for ensuring that high-priority, SLA-driven workloads receive the resources they need, which helps reduce risk for the business.
Cloudera AI takes advantage of Cloudera Quota Management to enhance resource efficiency using:
Controlled resource allocation: Quota Management enables administrators to define resource limits for each user or project. These limits control provisioning to maintain consistency, reduce SLA impacts and prevent resource contention.
Dynamic scalability: The Cloudera Platform on Red Hat OpenShift uses a Kubernetes-based architecture to scale workloads automatically in response to real-time application demands. For example, resources such as CPUs, GPUs and memory can be dynamically increased during model training to enhance performance. Then, during less resource-intensive phases, such as tuning or testing, these resources can be scaled back to maintain efficiency while lowering cost. Cloudera Quota Management ensures that scaling only occurs within the administrator’s predefined resource limits, optimizing cost and preventing overuse.
Efficient use of GPUs: For compute-intensive AI tasks, Cloudera AI administrators can assign GPUs to accelerate processing for specific workspaces where required. Cloudera Data Platform works with OpenShift, which includes enhancements to Kubernetes so that users can more easily configure and use GPU resources to accelerate workloads.
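The idea that scaling only happens within administrator-defined limits can be sketched as a simple clamp. This is an illustration under stated assumptions, not Cloudera Quota Management's actual interface or behavior: the `Resources` type, the `clamp_to_quota` function and all numbers are hypothetical, and a real system might reject or queue an over-quota request rather than trim it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource bundle for a project or workspace."""
    cpu_cores: int
    memory_gib: int
    gpus: int


def clamp_to_quota(requested: Resources, quota: Resources) -> Resources:
    """Grant each resource up to, but never beyond, the administrator's quota."""
    return Resources(
        cpu_cores=min(requested.cpu_cores, quota.cpu_cores),
        memory_gib=min(requested.memory_gib, quota.memory_gib),
        gpus=min(requested.gpus, quota.gpus),
    )


# Hypothetical quota set by an administrator for one project.
quota = Resources(cpu_cores=32, memory_gib=128, gpus=2)
# A training phase asks for more CPU and GPU than the quota allows.
training_request = Resources(cpu_cores=48, memory_gib=96, gpus=4)
granted = clamp_to_quota(training_request, quota)
print(granted)
```

In this sketch the memory request fits and is granted in full, while the CPU and GPU requests are capped at the quota, so one project cannot consume resources reserved for others.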
Collaboration and operational efficiency
Developing AI models requires a collaborative approach across teams with different skill sets and backgrounds, including data scientists, ML engineers, IT operations, DevOps engineers and business stakeholders. Cloudera AI on Red Hat OpenShift helps unite these teams by providing a unified environment where everyone can collaborate and fuel innovation. Key collaborative features include:
Shared workspaces: Multiple team members can work in a shared environment, enabling them to easily share data sets, project artifacts, code and models.
Version control: Cloudera AI integrates with Git, Bitbucket and other version control tools so that teams can track changes and use different branches for development.
MLOps automation: Customers can automate the entire operations flow from model development to tuning, deployment and ongoing monitoring. This reduces the internal development effort required to move AI models from a lab to a production environment.
Conclusion: driving AI innovation with trusted data
The Cloudera AI data service, part of Cloudera on OpenShift, gives enterprises a powerful way to increase AI innovation by letting users train and tune AI models on securely managed sensitive data in Cloudera Data Platform (CDP). Combining Cloudera AI’s capabilities with Red Hat OpenShift’s robust, scalable container management infrastructure enables organizations to build, train and deploy AI models efficiently with a focus on security.
As organizations adopt AI to transform their business, the combination of Cloudera AI and Red Hat OpenShift offers a security-focused platform for handling sensitive data and delivering AI-powered insights at scale. Whether the customer is in finance, healthcare or any other data-intensive industry with regulatory and compliance needs, Cloudera AI on Red Hat OpenShift handles sensitive data with care while empowering customers to unlock the full potential of gen AI.