Infrastructure & Hardware - AI News
https://www.artificialintelligence-news.com/categories/how-it-works/infrastructure-hardware/
Thu, 16 Apr 2026 11:20:02 +0000

OpenAI Agents SDK improves governance with sandbox execution
https://www.artificialintelligence-news.com/news/openai-agents-sdk-improves-governance-sandbox-execution/
Thu, 16 Apr 2026 11:20:00 +0000

OpenAI is introducing sandbox execution that allows enterprise governance teams to deploy automated workflows with controlled risk.

Teams taking systems from prototype to production have faced difficult architectural trade-offs over where their workloads execute. Model-agnostic frameworks offered initial flexibility but failed to fully utilise the capabilities of frontier models. Model-provider SDKs remained closer to the underlying model, but often lacked sufficient visibility into the control harness.

To complicate matters further, managed agent APIs simplified the deployment process but severely constrained where the systems could run and how they accessed sensitive corporate data. To resolve this, OpenAI is introducing new capabilities to the Agents SDK, offering developers standardised infrastructure featuring a model-native harness and native sandbox execution.

The updated infrastructure aligns execution with the natural operating pattern of the underlying models, improving reliability when tasks require coordination across diverse systems. Oscar Health provides an example of this efficiency in handling unstructured data.

The healthcare provider tested the new infrastructure to automate a clinical records workflow that older approaches could not handle reliably. The engineering team required the automated system to extract correct metadata while correctly understanding the boundaries of patient encounters within complex medical files. By automating this process, the provider could parse patient histories faster, expediting care coordination and improving the overall member experience.

Rachael Burns, Staff Engineer & AI Tech Lead at Oscar Health, said: “The updated Agents SDK made it production-viable for us to automate a critical clinical records workflow that previous approaches couldn’t handle reliably enough.

“For us, the difference was not just extracting the right metadata, but correctly understanding the boundaries of each encounter in long, complex records. As a result, we can more quickly understand what’s happening for each patient in a given visit, helping members with their care needs and improving their experience with us.”

OpenAI optimises AI workflows with a model-native harness

To deploy these systems, engineers must manage vector database synchronisation, control hallucination risks, and optimise expensive compute cycles. Without standard frameworks, internal teams often resort to building brittle custom connectors to manage these workflows.

The new model-native harness helps alleviate this friction by introducing configurable memory, sandbox-aware orchestration, and Codex-like filesystem tools. Developers can integrate standardised primitives such as tool use via MCP, custom instructions via AGENTS.md, and file edits using the apply patch tool.

Progressive disclosure via skills and code execution using the shell tool also enables the system to perform complex tasks sequentially. This standardisation allows engineering teams to spend less time updating core infrastructure and focus on building domain-specific logic that directly benefits the business.
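The "progressive disclosure" idea can be sketched in a few lines of Python: the model initially sees only skill names and one-line summaries, and full instructions load only on invocation. The structures below are illustrative assumptions, not the SDK's actual skill format:

```python
# Sketch of progressive disclosure: the agent's context holds skill names
# and summaries up front; detailed instructions are loaded on demand.
# Skill names and fields here are made up for illustration.
SKILLS = {
    "summarise_record": {
        "summary": "Condense one patient encounter",
        "instructions": "Long, detailed prompt text loaded only when used...",
    },
    "extract_metadata": {
        "summary": "Pull structured fields from a file",
        "instructions": "Another long prompt, hidden until invocation...",
    },
}

def skill_index() -> list[str]:
    """What the model sees initially: names and summaries only."""
    return [f"{name}: {s['summary']}" for name, s in SKILLS.items()]

def load_skill(name: str) -> str:
    """Full instructions are disclosed only when the skill is invoked."""
    return SKILLS[name]["instructions"]

print(skill_index())
```

The payoff is context economy: the model pays the token cost of a skill's full instructions only for the steps that actually use it.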

Integrating an autonomous program into a legacy tech stack requires precise routing. When an autonomous process accesses unstructured data, it relies heavily on retrieval systems to pull relevant context.

To manage the integration of diverse architectures and limit operational scope, the SDK introduces a Manifest abstraction. This abstraction standardises how developers describe the workspace, allowing them to mount local files and define output directories.

Teams can connect these environments directly to major enterprise storage providers, including AWS S3, Azure Blob Storage, Google Cloud Storage, and Cloudflare R2. Establishing a predictable workspace gives the model exact parameters on where to locate inputs, write outputs, and maintain organisation during extended operational runs.
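A minimal sketch of what such a manifest might look like, with hypothetical field names (the article does not show the real API): it declares named mounts and an output directory, and rejects any path that escapes the declared workspace.

```python
from dataclasses import dataclass, field
from pathlib import Path

# Illustrative only: the SDK's actual Manifest fields are not shown in the
# article, so these names are assumptions.
@dataclass
class Manifest:
    mounts: dict[str, Path] = field(default_factory=dict)  # name -> local dir
    output_dir: Path = Path("out")

    def resolve(self, mount: str, relative: str) -> Path:
        """Resolve a path inside a declared mount; reject anything that
        escapes the workspace, so the agent cannot wander into
        unfiltered data."""
        root = self.mounts[mount].resolve()
        candidate = (root / relative).resolve()
        if not candidate.is_relative_to(root):
            raise PermissionError(f"{relative!r} escapes mount {mount!r}")
        return candidate

m = Manifest(mounts={"records": Path("/tmp/records")})
print(m.resolve("records", "patient_001.json"))
```

The path check is what gives governance teams their guarantee: every file the agent touches is provably inside a declared mount.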

This predictability prevents the system from querying unfiltered data lakes, restricting it to specific, validated context windows. Data governance teams can subsequently track the provenance of every automated decision with greater accuracy from local prototype phases through to production deployment.

Enhancing security with native sandbox execution

The SDK natively supports sandbox execution, offering an out-of-the-box layer so programs can run within controlled computer environments containing the necessary files and dependencies. Engineering teams no longer need to piece this execution layer together manually. They can deploy their own custom sandboxes or utilise built-in support for providers like Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel.

Risk mitigation remains the primary concern for any enterprise deploying autonomous code execution. Security teams must assume that any system reading external data or executing generated code will face prompt-injection attacks and exfiltration attempts.

OpenAI approaches this security requirement by separating the control harness from the compute layer. This separation isolates credentials, keeping them entirely out of the environments where the model-generated code executes. By isolating the execution layer, an injected malicious command cannot access the central control plane or steal primary API keys, protecting the wider corporate network from lateral movement attacks.
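The principle can be illustrated in plain Python: the control plane holds the credentials, and generated code runs in a child process whose environment has been scrubbed. This is a stand-in sketch, not the SDK's actual sandbox mechanism; the variable-prefix list is an assumption.

```python
import os
import subprocess
import sys

# Assumed prefixes for provider credentials -- illustrative, not exhaustive.
SECRET_PREFIXES = ("OPENAI_", "AWS_", "AZURE_")

def scrubbed_env() -> dict:
    """Copy of the current environment with provider credentials removed,
    so model-generated code in the sandbox never sees them."""
    return {k: v for k, v in os.environ.items()
            if not k.startswith(SECRET_PREFIXES)}

def run_in_sandbox(code: str) -> str:
    """Execute generated code in a child process with a credential-free
    environment (a local stand-in for a real sandbox provider)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=scrubbed_env(),
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout.strip()

os.environ["OPENAI_API_KEY"] = "sk-demo"   # credential lives in the control plane
print(run_in_sandbox(
    "import os; print('OPENAI_API_KEY' in os.environ)"
))  # → False: the sandboxed process cannot see the key
```

Even if a prompt injection convinces the model to emit `os.environ`-dumping code, there is nothing sensitive in that environment to exfiltrate.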

This separation also addresses compute cost issues regarding system failures. Long-running tasks often fail midway due to network timeouts, container crashes, or API limits. If a complex agent takes twenty steps to compile a financial report and fails at step nineteen, re-running the entire sequence burns expensive computing resources.

If the environment crashes under the new architecture, losing the sandbox container does not mean losing the entire operational run. Because the system state remains externalised, the SDK utilises built-in snapshotting and rehydration. The infrastructure can restore the state within a fresh container and resume exactly from the last checkpoint if the original environment expires or fails. Preventing the need to restart expensive, long-running processes translates directly to reduced cloud compute spend.
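A toy version of checkpoint-and-rehydrate, with an assumed JSON snapshot format (the SDK's real format is not public):

```python
import json
from pathlib import Path

# Illustrative sketch only: file layout and names are assumptions.
CHECKPOINT = Path("run_state.json")

def save_snapshot(state: dict) -> None:
    """Externalise run state so it survives the sandbox container."""
    CHECKPOINT.write_text(json.dumps(state))

def rehydrate() -> dict:
    """Restore state into a fresh container, or start from scratch."""
    if CHECKPOINT.exists():
        return json.loads(CHECKPOINT.read_text())
    return {"step": 0, "results": []}

def run_report(total_steps: int = 20) -> dict:
    """Checkpoint after every step; a crash costs at most one step."""
    state = rehydrate()
    for step in range(state["step"], total_steps):
        state["results"].append(f"step-{step}-done")  # placeholder work
        state["step"] = step + 1
        save_snapshot(state)
    return state

# Simulate a container that died after step 19 of 20 ...
save_snapshot({"step": 19, "results": [f"step-{i}-done" for i in range(19)]})
final = run_report(20)   # ... a fresh container replays only the final step
print(final["step"])     # → 20
```

Because the loop starts from the externalised `step` counter, resuming after a crash replays only the unfinished work rather than the whole twenty-step run.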

Scaling these operations requires dynamic resource allocation. The separated architecture allows runs to invoke single or multiple sandboxes based on current load, route specific subagents into isolated environments, and parallelise tasks across numerous containers for faster execution times.
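The fan-out pattern can be sketched with a thread pool standing in for sandbox dispatch; `run_subagent` is a hypothetical placeholder for sending a subtask to its own isolated environment:

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(task: str) -> str:
    """Stand-in for dispatching one subtask to an isolated sandbox."""
    return f"{task}: done"

def fan_out(tasks: list[str], max_sandboxes: int = 4) -> list[str]:
    """Parallelise independent subtasks across sandboxes, sizing the pool
    to current load (number of tasks) up to a cap."""
    workers = max(1, min(max_sandboxes, len(tasks)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_subagent, tasks))

print(fan_out(["lint", "test", "docs"]))  # → ['lint: done', 'test: done', 'docs: done']
```

`pool.map` preserves task order, so results line up with the submitted subtasks even though they executed concurrently.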

These new capabilities are generally available to all customers via the API, utilising standard pricing based on tokens and tool use without demanding custom procurement contracts. The new harness and sandbox capabilities are launching first for Python developers, with TypeScript support slated for a future release.

OpenAI plans to bring additional capabilities, including code mode and subagents, to both the Python and TypeScript libraries. The vendor intends to expand the broader ecosystem over time by supporting additional sandbox providers and offering more methods for developers to plug the SDK directly into their existing internal systems.

See also: Commvault launches a ‘Ctrl-Z’ for cloud AI workloads


Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and is co-located with other leading technology events including the Cyber Security & Cloud Expo. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

Cadence expands AI and robotics partnerships with Nvidia, Google Cloud
https://www.artificialintelligence-news.com/news/cadence-expands-ai-and-robotics-partnerships-with-nvidia-google-cloud/
Thu, 16 Apr 2026 10:00:00 +0000

Cadence Design Systems announced two AI-related collaborations at its CadenceLIVE event this week, expanding its work with Nvidia and introducing new integrations with Google Cloud. The Nvidia partnership focuses on combining AI with physics-based simulation and accelerated computing for robotic systems and system-level design.

The companies said the approach targets modelling and deployment in semiconductors and large-scale AI infrastructure, including robotic systems that Nvidia describes as physical AI.

Cadence is integrating its multi-physics simulation and system design tools with Nvidia’s CUDA-X libraries, AI models, and Omniverse-based simulation environment. The tools model thermal and mechanical interactions so engineers can assess how systems behave under real-world operating conditions. They also extend beyond chip design to cover infrastructure components like networking and power systems. The combined platform lets engineers simulate system behaviour before physical deployment. The companies said system performance depends on how compute, networking and power systems operate together.

The collaboration also includes robotics development. Cadence’s physics engines, which model how real-world materials interact, are being linked with Nvidia’s AI models used to train AI-driven robotic systems in simulated environments.

“We’re working with you across the board on robotic systems,” said Nvidia CEO Jensen Huang during the event.

Training robots in simulation reduces the need for real-world data collection. The companies said these datasets must be generated with physics-based models, not gathered from physical systems. Simulation-generated datasets are used to train models, with outcomes dependent on the accuracy of the underlying physics models.

“The more accurate (generated training data) is, the better the model will be,” said Cadence CEO Anirudh Devgan.

Nvidia said industrial robotics companies are using its Isaac simulation frameworks and Omniverse-based digital twin tools to test robotic systems before deployment. Companies including ABB Robotics, FANUC, YASKAWA, and KUKA are integrating these simulation tools into virtual commissioning workflows to test production systems in software prior to physical rollout.

Nvidia said these systems are used to model complex robot operations and entire production lines using physically accurate digital environments.

Chip design automation on cloud

Separately, Cadence introduced a new AI agent designed to automate later-stage chip design tasks. The agent focuses on physical layout, translating circuit designs into silicon implementations. It builds on an agent introduced earlier this year for front-end chip design, where circuits are defined in code-like descriptions; the new agent takes those designs through to physical layouts on silicon.

The system will be available through Google Cloud. Cadence said the integration combines its electronic design automation tools with Google’s Gemini models for automated design and verification workflows. The cloud deployment allows teams to run those workloads without relying on on-premise compute infrastructure.

Cadence’s ChipStack AI Super Agent platform uses model-based reasoning with native design tools to coordinate tasks in multiple design stages. The system can interpret design requirements and automatically execute tasks in different stages of the design process.

Cadence reported productivity gains of up to 10 times in early deployments in design and verification tasks. The company did not disclose specific customer implementations.

“We help build AI systems, and then those AI systems can help improve the design process,” Devgan said.

The companies said simulation tools are used to validate systems in virtual environments before physical deployment. Digital twin models allow engineers to test design trade-offs, evaluate performance scenarios, and optimise configurations in software.

They added that the cost and complexity of large-scale data centre infrastructure limit the use of trial-and-error deployment methods.

Quantum models announcement

In a separate announcement, Nvidia introduced a family of open-source quantum AI models called NVIDIA Ising. The models are named after the Ising model, a mathematical framework used to represent interactions in physical systems.
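For reference, the Ising model assigns an energy to a grid of ±1 spins from nearest-neighbour interactions, E = −J Σ s_i·s_j; a minimal computation (the lattice and coupling here are illustrative, unrelated to Nvidia's model internals):

```python
import itertools

def ising_energy(spins: list[list[int]], J: float = 1.0) -> float:
    """Energy of a 2D grid of +/-1 spins with nearest-neighbour coupling J.
    Aligned neighbours lower the energy; anti-aligned ones raise it."""
    rows, cols = len(spins), len(spins[0])
    energy = 0.0
    for i, j in itertools.product(range(rows), range(cols)):
        if i + 1 < rows:                       # vertical neighbour
            energy -= J * spins[i][j] * spins[i + 1][j]
        if j + 1 < cols:                       # horizontal neighbour
            energy -= J * spins[i][j] * spins[i][j + 1]
    return energy

aligned = [[1, 1], [1, 1]]      # all spins agree: the minimum-energy state
print(ising_energy(aligned))    # → -4.0
```

The same interaction structure is what makes the framework a natural fit for modelling coupled qubits, which is presumably why Nvidia borrowed the name.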

The models are designed to support quantum processor calibration and quantum error correction. Nvidia said the models deliver up to 2.5 times faster performance and three times higher accuracy in decoding processes used for error correction.

“AI is essential to making quantum computing practical,” Huang said. “With Ising, AI becomes the control plane – the operating system of quantum machines – transforming fragile qubits to scalable and reliable quantum-GPU systems.”


See also: Hyundai expands into robotics and physical AI systems



Strengthening enterprise governance for rising edge AI workloads
https://www.artificialintelligence-news.com/news/strengthening-enterprise-governance-for-rising-edge-ai-workloads/
Mon, 13 Apr 2026 13:02:01 +0000

Models like Google Gemma 4 are increasing enterprise AI governance challenges for CISOs as they scramble to secure edge workloads.

Security chiefs have built massive digital walls around the cloud: deploying advanced cloud access security brokers and routing every piece of traffic heading to external large language models through monitored corporate gateways. The logic was sound to boards and executive committees—keep the sensitive data inside the network, police the outgoing requests, and intellectual property remains safe from external leaks.

Google just obliterated that perimeter with the release of Gemma 4. Unlike massive parameter models confined to hyperscale data centres, this family of open-weight models targets local hardware. It runs directly on edge devices, executes multi-step planning, and can operate autonomous workflows entirely on-device.

On-device inference has become a glaring blind spot for enterprise security operations. Security analysts cannot inspect network traffic if the traffic never hits the network in the first place. Engineers can ingest highly classified corporate data, process it through a local Gemma 4 agent, and generate output without triggering a single cloud firewall alarm.

Collapse of API-centric defences

Most corporate IT frameworks treat machine learning tools like standard third-party software vendors. You vet the provider, sign a massive enterprise data processing agreement, and funnel employee traffic through a sanctioned digital gateway. This standard playbook falls apart the moment an engineer downloads an Apache 2.0 licensed model like Gemma 4 and turns their laptop into an autonomous compute node.

Google paired this new model rollout with the Google AI Edge Gallery and a highly optimised LiteRT-LM library. These tools drastically accelerate local execution speeds while providing highly structured outputs required for complex agentic behaviours. An autonomous agent can now sit quietly on a local machine, iterate through thousands of logic steps, and execute code locally at impressive speed.

European data sovereignty laws and strict global financial regulations mandate complete auditability for automated decision-making. When a local agent hallucinates, makes a catastrophic error, or inadvertently leaks internal code across a shared corporate Slack channel, investigators require detailed logs. If the model operates entirely offline on local silicon, those logs simply do not exist inside the centralised IT security dashboard.

Financial institutions stand to lose the most from this architectural adjustment. Banks have spent millions implementing strict API logging to satisfy regulators investigating generative machine learning usage. If algorithmic trading strategies or proprietary risk assessment protocols are parsed by an unmonitored local agent, the bank violates multiple compliance frameworks simultaneously.

Healthcare networks face a similar reality. Patient data processed through an offline medical assistant running Gemma 4 might feel secure because it never leaves the physical laptop. The reality is that unlogged processing of health data violates the core tenets of modern medical auditing. Security leaders must prove how data was handled, what system processed it, and who authorised the execution.

The intent-control dilemma

Industry researchers often refer to this current phase of technological adoption as the governance trap. Management teams panic when they lose visibility. They attempt to rein in developer behaviour by throwing more bureaucratic processes at the problem: mandating sluggish architecture review boards and forcing engineers to fill out extensive deployment forms before installing any new repository.

Bureaucracy rarely stops a motivated developer facing an aggressive product deadline; it just forces the entire behaviour further underground. This creates a shadow IT environment powered by autonomous software.

Real governance for local systems requires a different architectural approach. Instead of trying to block the model itself, security leaders must focus intensely on intent and system access. An agent running locally via Gemma 4 still requires specific system permissions to read local files, access corporate databases, or execute shell commands on the host machine.

Access management becomes the new digital firewall. Rather than policing the language model, identity platforms must tightly restrict what the host machine can physically touch. If a local Gemma 4 agent attempts to query a restricted internal database, the access control layer must flag the anomaly immediately.
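A minimal sketch of that access-control layer, with a made-up policy table and resource names: unauthorised reads are denied and logged as anomalies for the security team.

```python
# Illustrative policy sketch: police what the host can touch, not the model.
# Host and resource names are invented for this example.
POLICY = {
    "dev-laptop-42": {"public_docs", "build_cache"},
}

def check_access(host: str, resource: str, audit_log: list) -> bool:
    """Allow only resources explicitly granted to this host; record
    everything else as an anomaly for security review."""
    allowed = resource in POLICY.get(host, set())
    if not allowed:
        audit_log.append(("ANOMALY", host, resource))
    return allowed

log = []
check_access("dev-laptop-42", "restricted_patient_db", log)
print(log)  # → [('ANOMALY', 'dev-laptop-42', 'restricted_patient_db')]
```

The deny-by-default posture matters: a local agent the platform has never heard of gets no access at all, and every attempt leaves an audit trail.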

Enterprise governance in the edge AI era

We are watching the definition of enterprise infrastructure expand in real-time. A corporate laptop is no longer just a dumb terminal used to access cloud services over a VPN; it’s an active compute node capable of running sophisticated autonomous planning software.

The cost of this new autonomy is deep operational complexity. CTOs and CISOs face a requirement to deploy endpoint detection tools specifically tuned for local machine learning inference. They desperately need systems that can differentiate between a human developer compiling standard code and an autonomous agent rapidly iterating through local file structures to solve a complex prompt.

The cybersecurity market will inevitably catch up to this new reality. Endpoint detection and response vendors are already prototyping quiet agents that monitor local GPU utilisation and flag unauthorised inference workloads. However, those tools remain in their infancy today.
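One plausible heuristic such a tool might use (thresholds and samples here are invented for illustration) is flagging sustained GPU saturation that looks more like an inference loop than interactive development:

```python
def flag_inference(samples: list[float], threshold: float = 0.9,
                   min_run: int = 5) -> bool:
    """Flag a host whose GPU utilisation stays above `threshold` for
    `min_run` consecutive samples -- a crude signature of a local
    inference workload rather than bursty interactive use."""
    run = 0
    for u in samples:
        run = run + 1 if u >= threshold else 0
        if run >= min_run:
            return True
    return False

print(flag_inference([0.2, 0.95, 0.97, 0.96, 0.98, 0.99, 0.95]))  # → True
```

Real products would combine many more signals (process lineage, model-file hashes, memory patterns); this only illustrates why sustained utilisation is a useful starting point.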

Most corporate security policies written in 2023 assumed all generative tools lived comfortably in the cloud. Revising them requires an uncomfortable admission from the executive board that the IT department no longer dictates exactly where compute happens.

Google designed Gemma 4 to put state-of-the-art agentic skills directly into the hands of anyone with a modern processor. The open-source community will adopt it with aggressive speed. 

Enterprises now face a very short window to figure out how to police code they do not host, running on hardware they cannot constantly monitor. It leaves every security chief staring at their network dashboard with one question: What exactly is running on endpoints right now?

See also: Companies expand AI adoption while keeping control




IBM: How robust AI governance protects enterprise margins
https://www.artificialintelligence-news.com/news/ibm-how-robust-ai-governance-protects-enterprise-margins/
Fri, 10 Apr 2026 13:57:15 +0000

To protect enterprise margins, business leaders must invest in robust AI governance to securely manage AI infrastructure.

When evaluating enterprise software adoption, a recurring pattern dictates how technology matures across industries. As Rob Thomas, SVP and CCO at IBM, recently outlined, software typically graduates from a standalone product to a platform, and then from a platform to foundational infrastructure, altering the governing rules entirely.

At the initial product stage, exerting tight corporate control often feels highly advantageous. Closed development environments iterate quickly and tightly manage the end-user experience. They capture and concentrate financial value within a single corporate entity, an approach that functions adequately during early product development cycles.

However, IBM’s analysis highlights that expectations change entirely when a technology solidifies into a foundational layer. Once other institutional frameworks, external markets, and broad operational systems rely on the software, the prevailing standards adapt to a new reality. At infrastructure scale, embracing openness ceases to be an ideological stance and becomes a highly practical necessity.

AI is currently crossing this threshold within the enterprise architecture stack. Models are increasingly embedded directly into the ways organisations secure their networks, author source code, execute automated decisions, and generate commercial value. AI functions less as an experimental utility and more as core operational infrastructure.

The recent limited preview of Anthropic’s Claude Mythos model brings this reality into sharper focus for enterprise executives managing risk. Anthropic reports that this specific model can discover and exploit software vulnerabilities at a level matching few human experts.

In response to this power, Anthropic launched Project Glasswing, a gated initiative designed to place these advanced capabilities directly into the hands of network defenders first. From IBM’s perspective, this development forces technology officers to confront immediate structural vulnerabilities. If autonomous models possess the capability to write exploits and shape the overall security environment, Thomas notes that concentrating the understanding of these systems within a small number of technology vendors invites severe operational exposure.

With models achieving infrastructure status, IBM argues the primary issue is no longer exclusively what these machine learning applications can execute. The priority becomes how these systems are constructed, governed, inspected, and actively improved over extended periods.

As underlying frameworks grow in complexity and corporate importance, maintaining closed development pipelines becomes exceedingly difficult to defend. No single vendor can successfully anticipate every operational requirement, adversarial attack vector, or system failure mode.

Implementing opaque AI structures introduces heavy friction across existing network architecture. Connecting closed proprietary models with established enterprise vector databases or highly sensitive internal data lakes frequently creates massive troubleshooting bottlenecks. When anomalous outputs occur or hallucination rates spike, teams lack the internal visibility required to diagnose whether the error originated in the retrieval-augmented generation pipeline or the base model weights.

Integrating legacy on-premises architecture with highly gated cloud models also introduces severe latency into daily operations. When enterprise data governance protocols strictly prohibit sending sensitive customer information to external servers, technology teams are left attempting to strip and anonymise datasets before processing. This constant data sanitisation creates enormous operational drag. 

Furthermore, the spiralling compute costs associated with continuous API calls to locked models erode the exact profit margins these autonomous systems are supposed to enhance. The opacity prevents network engineers from accurately sizing hardware deployments, forcing companies into expensive over-provisioning agreements to maintain baseline functionality.

Why open-source AI is essential for operational resilience

Restricting access to powerful applications is an understandable human instinct that closely resembles caution. Yet, as Thomas points out, at massive infrastructure scale, security typically improves through rigorous external scrutiny rather than through strict concealment.

This represents the enduring lesson of open-source software development. Open-source code does not eliminate enterprise risk. Instead, IBM maintains it actively changes how organisations manage that risk. An open foundation allows a wider base of researchers, corporate developers, and security defenders to examine the architecture, surface underlying weaknesses, test foundational assumptions, and harden the software under real-world conditions.

Within cybersecurity operations, broad visibility is rarely the enemy of operational resilience. In fact, visibility frequently serves as a strict prerequisite for achieving that resilience. Technologies deemed highly important tend to remain safer when larger populations can challenge them, inspect their logic, and contribute to their continuous improvement.

Thomas addresses one of the oldest misconceptions regarding open-source technology: the belief that it inevitably commoditises corporate innovation. In practical application, open infrastructure typically pushes market competition higher up the technology stack. Open systems transfer financial value rather than destroying it.

As common digital foundations mature, the commercial value relocates toward complex implementation, system orchestration, continuous reliability, trust mechanics, and specific domain expertise. IBM’s position asserts that the long-term commercial winners are not those who own the base technological layer, but rather the organisations that understand how to apply it most effectively.

We have witnessed this identical pattern play out across previous generations of enterprise tooling, cloud infrastructure, and operating systems. Open foundations historically expanded developer participation, accelerated iterative improvement, and birthed entirely new, larger markets built on top of those base layers. Enterprise leaders increasingly view open-source as highly important for infrastructure modernisation and emerging AI capabilities. IBM predicts that AI is highly likely to follow this exact historical trajectory.

Looking across the broader vendor ecosystem, leading hyperscalers are adjusting their business postures to accommodate this reality. Rather than engaging in a pure arms race to build the largest proprietary black boxes, highly profitable integrators are focusing heavily on orchestration tooling that allows enterprises to swap out underlying open-source models based on specific workload demands. Highlighting its ongoing leadership in this space, IBM is a key sponsor of this year’s AI & Big Data Expo North America, where these evolving strategies for open enterprise infrastructure will be a primary focus.

This approach completely sidesteps restrictive vendor lock-in and allows companies to route less demanding internal queries to smaller and highly efficient open models, preserving expensive compute resources for complex customer-facing autonomous logic. By decoupling the application layer from the specific foundation model, technology officers can maintain operational agility and protect their bottom line.
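That routing decision can be sketched in a few lines, with hypothetical model names and a crude length heuristic standing in for real workload classification:

```python
# Hedged sketch of workload-based model routing: cheap internal queries go
# to a small open model; complex or customer-facing work gets the large one.
# Model names and the word-count heuristic are illustrative assumptions.
def route(query: str, customer_facing: bool = False) -> str:
    """Pick a model tier for a query based on a simple complexity proxy."""
    complex_query = customer_facing or len(query.split()) > 50
    return "large-frontier-model" if complex_query else "small-open-model"

print(route("summarise this ticket"))                  # → small-open-model
print(route("draft a reply", customer_facing=True))    # → large-frontier-model
```

Because the application layer only depends on the `route` decision, swapping either underlying model for a cheaper or newer one requires no change to the calling code, which is the lock-in-avoidance point the article makes.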

The future of enterprise AI demands transparent governance

Another pragmatic reason for embracing open models revolves around product development influence. IBM emphasises that narrow access to underlying code naturally leads to narrow operational perspectives; who gets to participate directly shapes what applications are eventually built.

Providing broad access enables governments, diverse institutions, startups, and varied researchers to actively influence how the technology evolves and where it is commercially applied. This inclusive approach drives functional innovation while simultaneously building structural adaptability and necessary public legitimacy.

As Thomas argues, once autonomous AI assumes the role of core enterprise infrastructure, relying on opacity can no longer serve as the organising principle for system safety. The most reliable blueprint for secure software has paired open foundations with broad external scrutiny, active code maintenance, and serious internal governance.

As AI permanently enters its infrastructure phase, IBM contends that identical logic increasingly applies directly to the foundation models themselves. The stronger the corporate reliance on a technology, the stronger the corresponding case for demanding openness.

If these autonomous workflows are truly becoming foundational to global commerce, then transparency ceases to be a subject of casual debate. According to IBM, it is an absolute, non-negotiable design requirement for any modern enterprise architecture.

See also: Why companies like Apple are building AI agents with limits

Banner for AI & Big Data Expo by TechEx events.

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and is co-located with other leading technology events including the Cyber Security & Cloud Expo. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

The post IBM: How robust AI governance protects enterprise margins appeared first on AI News.

]]>
KPMG: Inside the AI agent playbook driving enterprise margin gains https://www.artificialintelligence-news.com/news/kpmg-inside-ai-agent-playbook-enterprise-margin-gains/ Wed, 01 Apr 2026 15:24:01 +0000 https://www.artificialintelligence-news.com/?p=112839 Global AI investment is accelerating, yet KPMG data shows the gap between enterprise AI spend and measurable business value is widening fast. The headline figure from KPMG’s first quarterly Global AI Pulse survey is blunt: despite global organisations planning to spend a weighted average of $186 million on AI over the next 12 months, only […]

The post KPMG: Inside the AI agent playbook driving enterprise margin gains appeared first on AI News.

]]>
Global AI investment is accelerating, yet KPMG data shows the gap between enterprise AI spend and measurable business value is widening fast.

The headline figure from KPMG’s first quarterly Global AI Pulse survey is blunt: despite global organisations planning to spend a weighted average of $186 million on AI over the next 12 months, only 11 percent have reached the stage of deploying and scaling AI agents in ways that produce enterprise-wide business outcomes.

However, the central finding is not that AI is failing; 64 percent of respondents say AI is already delivering meaningful business outcomes. The problem is that “meaningful” is doing a lot of heavy lifting in that sentence, and the distance between incremental productivity gains and the kind of compounding operational efficiency that moves the needle on margin is, for most organisations, still substantial.

The architecture of a performance gap

KPMG’s report distinguishes between what it labels “AI leaders” (i.e. organisations that are scaling or actively operating agentic AI) and everyone else. The gap in outcomes between these two cohorts is striking.


Steve Chase, Global Head of AI and Digital Innovation at KPMG International, said: “The first Global AI Pulse results reinforce that spending more on AI is not the same as creating value. Leading organisations are moving beyond enablement, deploying AI agents to reimagine processes and reshape how decisions and work flow across the enterprise.”

Among AI leaders, 82 percent report that AI is already delivering meaningful business value. Among their peers, that figure drops to 62 percent. That 20-percentage-point spread might look modest in isolation, but it compounds quickly when you consider what it reflects: not just better tooling, but fundamentally different deployment philosophies.

The organisations in that 11 percent are deploying agents that coordinate work across functions, route decisions without human intermediation at every step, surface enterprise-wide insights from operational data in near real-time, and flag anomalies before they escalate into incidents.

In IT and engineering functions, 75 percent of AI leaders are using agents to accelerate code development versus 64 percent of their peers. In operations, where supply-chain orchestration is the primary use case, the split is 64 percent versus 55 percent. These are not marginal differences in tool adoption rates; they reflect different levels of process re-architecture.

Most enterprises that have deployed AI have done so by layering models onto existing workflows (e.g. a co-pilot here, a summarisation tool there…) without redesigning the process those tools sit inside. That produces incremental gains.

The organisations closing the performance gap have inverted this approach: they are redesigning the process first, then deploying agents to operate within the redesigned structure. The difference in return on AI spend between these two approaches, over a three-to-five-year horizon, is likely to be the defining competitive variable in several industries.

What $186 million actually buys—and what it does not

The investment figures in the KPMG data deserve scrutiny. A weighted global average of $186 million per organisation sounds substantial, but the regional variance tells a more interesting story.

ASPAC leads at $245 million, the Americas at $178 million, and EMEA at $157 million. Within ASPAC, organisations in China and Hong Kong are investing $235 million on average; within the Americas, US organisations are investing $207 million.

These figures represent planned spend across model licensing, compute infrastructure, professional services, integration, and the governance and risk management apparatus needed to operate AI responsibly at scale.

The question is not whether $186 million is too much or too little; it is what proportion of that figure is being allocated to the operational infrastructure required to derive value from the models themselves. The survey data suggests that most organisations are still underweighting this latter category.

Compute and licensing costs are visible and relatively easy to budget for. The friction costs – the engineering hours spent integrating AI outputs with legacy ERP systems, the latency introduced by retrieval-augmented generation pipelines built on top of poorly structured data, and the compliance overhead of maintaining audit trails for AI-assisted decisions in regulated industries – tend to surface late in deployment cycles and often exceed initial estimates.

Vector database integration is a useful example. Many agentic workflows depend on the ability to retrieve relevant context from large, unstructured document repositories in real time. Building and maintaining the infrastructure for this – selecting between providers such as Pinecone, Weaviate, or Qdrant, embedding and indexing proprietary data, and managing refresh cycles as underlying data changes – adds meaningful engineering complexity and ongoing operational cost that rarely appears in initial AI investment proposals. 

When that infrastructure is absent or poorly maintained, agent performance degrades in ways that are often difficult to diagnose, as the model’s behaviour is correct relative to the context it receives, but that context is stale or incomplete.
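As a rough illustration of why retrieval freshness matters, the sketch below pairs a toy cosine-similarity lookup with a staleness check. The document IDs, embeddings, and refresh ages are invented for the example; a production system would delegate all of this to a vector store such as those named above.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy index: document id -> (embedding, day the entry was last refreshed).
index = {
    "policy_v1": ([0.9, 0.1, 0.0], 10),
    "policy_v2": ([0.8, 0.2, 0.1], 95),
}

def retrieve(query_vec, today, max_age_days=30):
    """Return the best match, flagging stale entries instead of silently using them."""
    best_id, best_score = max(
        ((doc_id, cosine(query_vec, vec)) for doc_id, (vec, _) in index.items()),
        key=lambda pair: pair[1],
    )
    _, refreshed = index[best_id]
    stale = (today - refreshed) > max_age_days
    return best_id, best_score, stale
```

A retrieval can then surface not just the nearest document but whether its embedding has outlived its refresh window, which is exactly the failure mode described above: the model behaves correctly relative to stale context.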

Governance as an operational variable, not a compliance exercise

Perhaps the most practically useful finding in the KPMG survey is the relationship between AI maturity and risk confidence.

Among organisations still in the experimentation phase, just 20 percent feel confident in their ability to manage AI-related risks. Among AI leaders, that figure rises to 49 percent. 75 percent of global leaders cite data security, privacy, and risk as ongoing concerns regardless of maturity level—but maturity changes how those concerns are operationalised.

This is an important distinction for boards and risk functions that tend to frame AI governance as a constraint on deployment. The KPMG data suggests the opposite dynamic: governance frameworks do not slow AI adoption among mature organisations; they enable it. The confidence to move faster – to deploy agents into higher-stakes workflows, to expand agentic coordination across functions – correlates directly with the maturity of the governance infrastructure surrounding those agents.

In practice, this means that organisations treating governance as a retrospective compliance layer are doubly disadvantaged. They are slower to deploy, because every new use case triggers a fresh governance review, and they are more exposed to operational risk, because the absence of embedded governance mechanisms means that edge cases and failure modes are discovered in production rather than in testing.

Organisations that have embedded governance into the deployment pipeline itself (e.g. model cards, automated output monitoring, explainability tooling, and human-in-the-loop escalation paths for low-confidence decisions) are the ones operating with the confidence that allows them to scale.
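A minimal sketch of one such mechanism, a human-in-the-loop escalation path for low-confidence decisions, might look like the following; the threshold and field names are illustrative assumptions, not a reference implementation.

```python
from dataclasses import dataclass

@dataclass
class AgentDecision:
    action: str
    confidence: float  # model-reported confidence in [0, 1]

def route(decision, threshold=0.85):
    """Auto-approve high-confidence decisions; escalate the rest to a human queue.

    The 0.85 threshold is an invented illustrative value, not a recommendation;
    real deployments calibrate it per workflow and risk appetite."""
    if decision.confidence >= threshold:
        return "auto_execute"
    return "human_review"
```

Embedding a rule like this in the deployment pipeline means the failure mode (a low-confidence decision) is caught by design rather than discovered in production.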

“Ultimately, there is no agentic future without trust and no trust without governance that keeps pace,” explains Steve Chase, Global Head of AI and Digital Innovation at KPMG International. “The survey makes clear that sustained investment in people, training and change management is what allows organisations to scale AI responsibly and capture value.”

Regional divergence and what it signals for global deployment

For multinationals managing AI programmes across regions, the KPMG data flags material differences in deployment velocity and organisational posture that will affect global rollout planning.

ASPAC is advancing most aggressively on agent scaling; 49 percent of organisations there are scaling AI agents, compared with 46 percent in the Americas and 42 percent in EMEA. ASPAC also leads on the more complex capability of orchestrating multi-agent systems, at 33 percent.

The barrier profiles also differ in ways that carry real operational implications. In both ASPAC and EMEA, 24 percent of organisations cite a lack of leadership trust and buy-in as a primary barrier to AI agent deployment. In the Americas, that figure drops to 17 percent.

Agentic systems, by definition, make or initiate decisions without per-instance human approval. In organisational cultures where decision accountability is tightly concentrated at the senior level, this can generate institutional resistance that no amount of technical capability resolves. The fix is governance design; specifically, defining in advance what categories of decision an agent is authorised to make autonomously, what triggers escalation, and who carries accountability for agent-initiated outcomes.
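That kind of pre-agreed decision boundary can be sketched as a simple policy table; the categories, monetary limits, and owner roles below are hypothetical examples of the approach, not a prescribed schema.

```python
# Hypothetical policy: which decision categories an agent may take autonomously,
# the value above which it must escalate, and who is accountable in each case.
POLICY = {
    "reorder_stock":  {"autonomous": True,  "escalate_above": 10_000, "owner": "supply_chain_lead"},
    "issue_refund":   {"autonomous": True,  "escalate_above": 500,    "owner": "cs_manager"},
    "amend_contract": {"autonomous": False, "escalate_above": 0,      "owner": "legal"},
}

def authorise(category, value):
    """Return (allowed, accountable_owner) for an agent-initiated decision.

    Unknown categories and over-limit values escalate rather than execute."""
    rule = POLICY.get(category)
    if rule is None or not rule["autonomous"] or value > rule["escalate_above"]:
        owner = rule["owner"] if rule else "governance_board"
        return False, owner  # escalate to the named owner
    return True, rule["owner"]
```

Writing the policy down as data rather than prose also gives auditors a single artefact answering "what was this agent allowed to do, and who owned the outcome?"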

The expectation gap around human-AI collaboration is also worth noting for anyone designing agent-assisted workflows at a global scale.

East Asian respondents anticipate AI agents leading projects at a rate of 42 percent. Australian respondents prefer human-directed AI at 34 percent. North American respondents lean toward peer-to-peer human-AI collaboration at 31 percent. These differences will affect how agent-assisted processes need to be designed in different regional deployments of the same underlying system, adding localisation complexity that is easy to underestimate in centralised platform planning.

One data point in the KPMG survey that deserves particular attention from CFOs and boards: 74 percent of respondents say AI will remain a top investment priority even in the event of a recession. This is either a sign of genuine conviction about AI’s role in cost structure and competitive positioning, or it reflects a collective commitment that has not yet been tested against actual budget pressure. Probably both, in different proportions across different organisations.

What it does indicate is that the window for organisations still in the experimentation phase is not indefinite. If the 11 percent of AI leaders continue to compound their advantage (and the KPMG data suggests the mechanisms for doing so are in place) the question for the remaining 89 percent is not whether to accelerate AI deployment, but how to do so without compounding the integration debt and governance deficits that are already constraining their returns.

See also: Hershey applies AI across its supply chain operations


Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and is co-located with other leading technology events including the Cyber Security & Cloud Expo. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

The post KPMG: Inside the AI agent playbook driving enterprise margin gains appeared first on AI News.

]]>
SAP and ANYbotics drive industrial adoption of physical AI https://www.artificialintelligence-news.com/news/sap-and-anybotics-drive-industrial-adoption-physical-ai/ Tue, 31 Mar 2026 15:20:53 +0000 https://www.artificialintelligence-news.com/?p=112821 Heavy industry relies on people to inspect hazardous, dirty facilities. It’s expensive, and putting humans in these zones carries obvious safety risks. Swiss robot maker ANYbotics and software company SAP are trying to change that. ANYbotics’ four-legged autonomous robots will be connected straight into SAP’s backend enterprise resource planning software. Instead of treating a robot […]

The post SAP and ANYbotics drive industrial adoption of physical AI appeared first on AI News.

]]>
Heavy industry relies on people to inspect hazardous, dirty facilities. It’s expensive, and putting humans in these zones carries obvious safety risks. Swiss robot maker ANYbotics and software company SAP are trying to change that.

ANYbotics’ four-legged autonomous robots will be connected straight into SAP’s backend enterprise resource planning software. Instead of treating a robot as a standalone asset, this turns it into a mobile data-gathering node within an industrial IoT network.

This initiative shows that hardware innovation can now effectively connect with established business workflows. Underscoring that broader trend, SAP is sponsoring this year’s AI & Big Data Expo North America at the San Jose McEnery Convention Center, CA, an event that is fittingly co-located with the IoT Tech Expo and Intelligent Automation & Physical AI Summit.

When equipment breaks at a chemical plant or offshore rig, it costs a fortune. People do routine inspections to catch these issues early, but humans get tired and plants are massive. Robots, on the other hand, can walk the floor constantly, carrying thermal, acoustic, and visual sensors. Hook those sensors into SAP, and a hot pump instantly generates a maintenance request without waiting for a human to report it.

Cutting out the reporting lag

Usually, finding a problem and logging a work order are two disconnected steps. A worker might hear a weird noise in a compressor, write it down, and type it into a computer hours later. By the time the replacement part gets approved, the machine might be wrecked.

Connecting ANYbotics to SAP eliminates that delay. The robot’s onboard AI processes what it sees and hears instantly. If it hears an irregular motor frequency, it doesn’t just flash a warning on a separate screen, it uses APIs to tell the SAP asset management module directly. The system immediately checks for spare parts, figures out the cost of potential downtime, and schedules an engineer.

This automates the flow of information from the floor to management. It also means machinery gets judged on hard, consistent numbers instead of a human inspector’s subjective opinion.
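To illustrate the shape of such an integration, the sketch below assembles a hypothetical maintenance-notification payload from a sensor event. The field names and notification codes are assumptions for illustration only and do not reflect SAP's actual OData schema.

```python
import json

def build_maintenance_notification(equipment_id, reading, threshold, location):
    """Assemble a maintenance-notification payload from a robot sensor event.

    All field names and the 'M2' code are illustrative assumptions, not
    SAP's real schema; integration middleware would map them to the ERP's API."""
    return {
        "NotificationType": "M2",  # assumed breakdown-style notification code
        "Equipment": equipment_id,
        "ShortText": f"Sensor reading {reading} exceeds threshold {threshold}",
        "FunctionalLocation": location,
        # Escalate priority only when the reading is 50% over threshold.
        "Priority": "1" if reading > 1.5 * threshold else "2",
    }

payload = build_maintenance_notification(
    "PUMP-0042", reading=96.0, threshold=80.0, location="PLANT-A/LINE-3"
)
body = json.dumps(payload)  # would be POSTed to the ERP endpoint by middleware
```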

Putting robots in heavy industry isn’t like installing software in an office—companies have to deal with unreliable infrastructure. Factories usually have awful internet connectivity due to thick concrete, metal scaffolding, and electromagnetic interference.

To make this work, the setup relies on edge computing. It takes too much bandwidth to constantly stream high-def thermal video and lidar data to the cloud. So, the robots crunch most of that data locally. Onboard processors figure out the difference between a machine running normally and one that’s dangerously overheating. They only send the crucial details (i.e. the specific fault and its location) back to SAP.
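A simplified sketch of that edge-side reduction: the robot analyses a raw trace locally and emits only a compact fault summary, or nothing at all. The frequency values and tolerance are invented for the example.

```python
def summarise_on_edge(samples, normal_hz=50.0, tolerance=2.0):
    """Reduce a raw vibration trace to a compact fault summary on the robot,
    so only the finding (not the stream) crosses the network.

    The 50 Hz baseline and 2 Hz tolerance are illustrative assumptions."""
    deviations = [abs(s - normal_hz) for s in samples]
    worst = max(deviations)
    if worst <= tolerance:
        return None  # nothing to report; raw data never leaves the robot
    return {
        "fault": "irregular_motor_frequency",
        "peak_deviation_hz": round(worst, 2),
        "samples_analysed": len(samples),
    }
```

A healthy trace produces no network traffic at all; an anomalous one produces a message a few hundred bytes long instead of a continuous sensor stream.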

To handle the network issues, many early adopters build private 5G networks. This gives them the coverage they need across huge facilities where regular Wi-Fi fails. It also locks down access, keeping the robot’s data safe from interception.

Of course, security is a major issue. A walking robot packed with cameras is effectively a roaming vulnerability. Companies must use zero-trust network protocols to constantly verify the robot’s identity and limit what SAP modules it can touch. If the robot gets hacked, the system has to cut its connection instantly to stop the attackers from moving laterally into the corporate network.

These robots generate a massive amount of unstructured data as they walk around. Turning raw audio and thermal images into the neat tables SAP requires is difficult.

If companies don’t manage this right, maintenance teams will drown in alerts. A robot that is too sensitive might spit out hundreds of useless warnings a day, until the SAP dashboard is ignored entirely. IT teams have to set strict rules before turning the system on. They need exact thresholds for what triggers a real maintenance ticket and what just needs to be watched.

The setup usually uses middleware to translate the robot’s telemetry into SAP’s language. This software acts as a filter, throwing out the noise so only actual problems reach the ERP system. The data lake storing all this information also needs to be organised for future machine learning projects. Fixing broken machines is the short-term goal; the long-term payoff is using years of robot data to predict failures before they happen.
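The filtering rule such middleware applies can be as simple as a three-way triage, sketched below with invented temperature thresholds; real deployments would tune these per asset class.

```python
def classify_reading(temp_c, watch_at=70.0, ticket_at=85.0):
    """Three-way triage so the dashboard only shows real problems:
    discard normal readings, log 'watch' ones, raise a ticket for the rest.

    The 70/85 degree thresholds are illustrative assumptions."""
    if temp_c >= ticket_at:
        return "ticket"
    if temp_c >= watch_at:
        return "watch"
    return "discard"

# Middleware pass: only 'ticket' events are forwarded to the ERP system.
readings = [42.0, 71.5, 90.2, 65.0]
forward = [t for t in readings if classify_reading(t) == "ticket"]
```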

Ensuring a successful physical AI deployment

Dropping robots into a factory naturally makes people nervous. The project’s success often comes down to how human resources handles it. Workers usually look at the robots and assume layoffs are next.

Management has to be clear about why the robots are there. The goal is to get people out of dangerous areas like high-voltage zones or toxic chemical sectors to reduce injuries. The robot collects the data, and the human engineer shifts to analysing that data and doing the actual repairs.

This requires retraining. Workers who used to walk the perimeter now have to read SAP dashboards, manage automated tickets, and work with the robots. They have to trust the sensors, and management has to make sure operators know they can take manual control if something unexpected happens.

Companies need to take the rollout slowly. Because syncing physical robots with enterprise software is complicated, large-scale rollouts should start as small, targeted pilots.

The first test should be in one specific area with known hazards but rock-solid internet. This lets IT watch the data flow between the hardware and SAP in a controlled space. At this stage, the main job is making sure the data matches reality. If the robot sees one thing and SAP records another, the mismatch has to be audited and fixed daily.

Once the data pipeline actually works, the company can add more robots and connect other systems, like automated parts ordering. IT chiefs have to keep checking if their private networks can handle more robots, while security teams update their defences against new threats.

If companies treat these autonomous inspectors as an extension of their corporate data architecture, they get a massive amount of information about their physical assets. But pulling it off means getting the network infrastructure, the data rules, and the human element exactly right.

See also: The rise of invisible IoT in enterprise operations


Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and is co-located with other leading technology events including the Cyber Security & Cloud Expo. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

The post SAP and ANYbotics drive industrial adoption of physical AI appeared first on AI News.

]]>
Securing AI systems under today’s and tomorrow’s conditions https://www.artificialintelligence-news.com/news/quantum-resilient-ai-needs-migration-and-hardware-protected-data-enclaves/ Tue, 24 Mar 2026 15:22:00 +0000 https://www.artificialintelligence-news.com/?p=112759 Evidence cited in an eBook titled “AI Quantum Resilience”, published by Utimaco [email wall], shows organisations consider security risks as the leading barrier to effective adoption of AI on data they hold. AI’s value depends on data amassed by an organisation. However, there are security risks to building models and training them on that data. […]

The post Securing AI systems under today’s and tomorrow’s conditions appeared first on AI News.

]]>
Evidence cited in an eBook titled “AI Quantum Resilience”, published by Utimaco [email wall], shows organisations consider security risks as the leading barrier to effective adoption of AI on data they hold.

AI’s value depends on data amassed by an organisation. However, there are security risks to building models and training them on that data. These risks are in addition to better-publicised threats to intellectual property that exist around the point of inference (prompt engineering, for example).

The eBook’s authors state that organisations need to manage threats throughout their AI development and implementation processes. At the same time, companies can and should prepare to change their security protocols, changes that will become mandatory if quantum computing-powered decryption tools become easily available to bad actors.

Utimaco lists three areas under threat:

  • Training data can be manipulated by bad actors, degrading model outputs in ways that are hard to detect,
  • Models can be extracted or copied, eroding intellectual property rights,
  • Sensitive data used during training or inference can be exposed.

Current public-key cryptography will become vulnerable within the next ten years, the report’s authors attest, a period in which capable quantum systems may emerge. Regardless of the timescale, well-organised threat groups are thought to be harvesting encrypted data now and storing it for decryption when, or if, quantum facilities become available. Any dataset with long-term sensitivity, including model training data, financial records, or intellectual property, may therefore require protection against future decryption, Utimaco says.

A migration to quantum-resistant cryptography will affect protocols, key management, system interoperability, and performance, so any migration is likely to take several years. The report’s authors suggest what they term ‘crypto-agility’, which they define as the ability to change cryptographic algorithms without redesigning underlying systems. ‘Crypto-agility’ is based on the principle of hybrid cryptography – combining established algorithms with post-quantum methods, such as those suggested by NIST.
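Conceptually, the hybrid approach derives one session key from both a classical and a post-quantum shared secret, so the session stays safe if either scheme is later broken. The sketch below is a hand-rolled HKDF-style combiner for illustration only; production systems should use vetted libraries implementing NIST's standardised post-quantum KEMs rather than anything resembling this.

```python
import hashlib
import hmac

def combine_shared_secrets(classical_secret: bytes, pq_secret: bytes, info: bytes) -> bytes:
    """Derive one 32-byte session key from both a classical and a post-quantum
    shared secret, HKDF-style (extract then expand, one output block).

    Illustrative only: salt, labels, and structure are invented for the sketch."""
    ikm = classical_secret + pq_secret
    prk = hmac.new(b"hybrid-kdf-salt", ikm, hashlib.sha256).digest()  # extract
    okm = hmac.new(prk, info + b"\x01", hashlib.sha256).digest()      # expand
    return okm

key = combine_shared_secrets(b"ecdh-secret", b"mlkem-secret", b"session-42")
```

The point of the construction is that an attacker must recover both input secrets to reach the session key, which is what lets the classical algorithm remain in place during a multi-year migration.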

The eBook’s authors acknowledge that cryptography on its own doesn’t address all possible areas of risk. They advocate the use of hardware-based trust devices that can isolate cryptographic keys and sensitive operations from normal working environments.

If companies are developing their own AI tools and processes, that protection should extend throughout the AI lifecycle, from data ingestion through to training, model deployment, and inference in production. Hardware keys used to encrypt data and sign models can be generated and stored inside a secure hardware boundary. Model integrity can then be verified before deployment, and sensitive data processed during inference remains protected.

Hardware-based enclaves isolate workloads so that even system administrators with sufficient privileges can’t access any of the data being processed. Hardware modules can verify that the data enclave is in a trusted state before releasing keys – a process of external attestation – helping create a ‘chain of trust’ from hardware to application.
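The attestation gate can be pictured as a simple comparison of a reported enclave measurement against a known-good value before any key is released; the measurement input below is an invented placeholder, and real attestation involves signed evidence from the hardware rather than a bare hash.

```python
import hashlib

# Expected measurement of a trusted enclave build (placeholder for illustration).
TRUSTED_MEASUREMENT = hashlib.sha256(b"enclave-build-1.4.2").hexdigest()

def release_key(reported_measurement: str, key: bytes):
    """Release the key only if the enclave attests to a known-good state.

    If the reported measurement doesn't match, the key never leaves the
    hardware module."""
    if reported_measurement != TRUSTED_MEASUREMENT:
        return None
    return key
```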

Hardware-based key management produces tamper-resistant logs covering access and operations to support compliance frameworks such as the EU AI Act.

Many of the risks inherent in AI systems are well known if not already exploited. The risk from quantum computing’s ability to decrypt data currently considered safe is less immediate, but the implications should affect data and infrastructure decisions made today, Utimaco states. It advocates:

  • A strengthening of controls throughout the AI development and deployment lifecycle,
  • The introduction of ‘crypto-agility’ to allow transition to post-quantum security,
  • Establishing hardware-based trust mechanisms wherever high-value assets are in play.

(Image source: “Scanning electron micrograph of an apoptotic HeLa cell” by National Institutes of Health (NIH) is licensed under CC BY-NC 2.0. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc/2.0)


Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and co-located with other leading technology events. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

The post Securing AI systems under today’s and tomorrow’s conditions appeared first on AI News.

]]>
Goldman Sachs sees AI investment shift to data centres https://www.artificialintelligence-news.com/news/goldman-sachs-sees-ai-investment-shift-to-data-centres/ Tue, 17 Mar 2026 10:00:00 +0000 https://www.artificialintelligence-news.com/?p=112700 Artificial intelligence investment is entering a more selective phase as companies and investors look beyond early excitement and focus on the data centre infrastructure required to run AI systems. Recent analysis from Goldman Sachs suggests the market is moving toward what the firm describes as a “flight to quality.” In practice, investors are paying closer […]

The post Goldman Sachs sees AI investment shift to data centres appeared first on AI News.

]]>
Artificial intelligence investment is entering a more selective phase as companies and investors look beyond early excitement and focus on the data centre infrastructure required to run AI systems.

Recent analysis from Goldman Sachs suggests the market is moving toward what the firm describes as a “flight to quality.” In practice, investors are paying closer attention to companies that own and operate large data centres and computing infrastructure. Firms offering narrow AI tools or experimental software are receiving less attention.

Goldman Sachs expects spending on AI infrastructure to grow rapidly as companies expand computing capacity for model training and deployment. Hyperscale cloud firms are investing tens of billions of dollars each year in new data centres and computing hardware. Networking systems are also expanding to support this growth.

AI demand is reshaping the data centre market

Goldman Sachs Research estimates that AI workloads could account for about 30% of total data centre capacity in the next two years, as demand for computing power grows in cloud services and enterprise applications. The change reflects how AI tasks differ from traditional cloud workloads. Training large models requires thousands of chips running in parallel for extended periods. Inference, the process of generating responses or predictions, also requires steady computing power when services run.

Cloud providers and AI developers are now expanding data centre capacity at a pace not seen during earlier phases of cloud computing. Infrastructure demand extends beyond computing hardware. Energy supply is becoming a central issue in the AI race.

Goldman Sachs Research estimates that global data centre power demand could rise about 175% by 2030 compared with 2023 levels, driven largely by AI workloads. The firm says this increase would be roughly equal to adding the electricity demand of another top-10 power-consuming country to the global grid. Rising power demand is also pushing utilities and governments to consider new investment in energy infrastructure.
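To make the scale concrete: a 175% rise means 2030 demand would be 2.75 times the 2023 level. The baseline figure in the sketch below is an assumed round number for illustration, not one taken from the Goldman Sachs report.

```python
# Goldman Sachs Research's figure: ~175% growth in global data centre power
# demand by 2030 versus 2023. The baseline below is an assumed round number
# for illustration, not a figure from the report.
baseline_2023_twh = 400  # assumed 2023 baseline, TWh/year
growth = 1.75            # +175%

projected_2030_twh = baseline_2023_twh * (1 + growth)
added_demand_twh = projected_2030_twh - baseline_2023_twh
# Under these assumptions, demand grows from 400 to 1,100 TWh/year,
# i.e. 700 TWh/year of new demand on the global grid.
```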

Infrastructure limits are shaping AI strategy

The growing need for power and cooling is influencing where new AI data centres are built. Space requirements are also shaping site selection. Large facilities are often located near stable energy sources and high-capacity fibre networks. Some companies are building AI training clusters in remote areas where land and electricity are easier to secure. The location of data centres can also affect environmental impact. Academic research on AI infrastructure shows that cooling systems and geographic location can influence energy use and water consumption as much as hardware efficiency.

The limits are starting to affect how technology firms plan their AI strategies. Building new models or software is only part of the challenge. Companies must also ensure they have the infrastructure needed to run those systems reliably. In many cases, building that infrastructure takes years.

Construction of large data centres involves complex supply chains. Projects often require land acquisition and grid connections. Many also depend on long-term energy agreements. Shortages of electrical equipment and delays in grid expansion can slow new projects. The constraints help explain why investors are paying more attention to companies that already control large data centre networks.

A selective phase of the AI market

During the first wave of generative AI adoption, many companies saw their market value rise simply by associating themselves with AI. That phase is now beginning to change as investors reassess where AI growth will occur.

Investors are examining which companies have the infrastructure and revenue models needed to support long-term deployment. Data centre operators and chip manufacturers sit near the base of that ecosystem. Their services are required regardless of which AI applications gain traction.

During previous waves of computing growth, companies that built the underlying infrastructure often captured stable revenue. Software platforms, in contrast, rose and fell more quickly. A similar dynamic may now be forming in the AI sector.

Infrastructure expansion also raises new questions. Energy demand and grid capacity are becoming central issues for governments and industry planners. Environmental impact is also drawing closer scrutiny.

In the coming years, the AI economy may depend as much on power plants and cooling systems as it does on algorithms and software. That reality is shaping the next stage of the AI race.

(Photo by Lightsaber Collection)

See also: Goldman Sachs and Deutsche Bank test agentic AI for trade surveillance

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and is co-located with other leading technology events. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

The post Goldman Sachs sees AI investment shift to data centres appeared first on AI News.

How multi-agent AI economics influence business automation https://www.artificialintelligence-news.com/news/how-multi-agent-ai-economics-business-automation/ Thu, 12 Mar 2026 15:01:20 +0000

The post How multi-agent AI economics influence business automation appeared first on AI News.

Managing the economics of multi-agent AI now dictates the financial viability of modern business automation workflows.

Organisations progressing past standard chat interfaces into multi-agent applications face two primary constraints. The first issue is the thinking tax; complex autonomous agents need to reason at each stage, making the reliance on massive architectures for every subtask too expensive and slow for practical enterprise use.

Context explosion acts as the second hurdle; these advanced workflows produce up to 1,500 percent more tokens than standard formats because every interaction demands the resending of full system histories, intermediate reasoning, and tool outputs. Across extended tasks, this token volume drives up expenses and causes goal drift, a scenario where agents diverge from their initial objectives.
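The compounding effect of full-history resends is easy to see with a rough back-of-the-envelope model. The per-turn token figure below is an illustrative assumption, not a number from the article:

```python
# Rough sketch: cumulative tokens billed when each agent turn must
# resend the full conversation history plus new reasoning and tool
# output. The 2,000-token figure per turn is a hypothetical value.

def cumulative_tokens(turns, new_tokens_per_turn=2_000):
    """Total tokens billed across a workflow where turn n resends
    everything produced in turns 1..n-1 before adding its own output."""
    total = 0
    history = 0
    for _ in range(turns):
        total += history + new_tokens_per_turn  # resend history + new content
        history += new_tokens_per_turn          # history grows every turn
    return total

single_pass = 20 * 2_000           # if history were never resent
agentic = cumulative_tokens(20)    # with full-history resends
print(agentic, single_pass, agentic / single_pass)  # 420000 40000 10.5
```

Under these assumptions, a 20-turn workflow bills roughly ten times the tokens of the same work done without resends, the same order of magnitude as the token inflation described above.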

Evaluating architectures for multi-agent AI

To address these governance and efficiency hurdles, hardware and software developers are releasing highly optimised tools aimed directly at enterprise infrastructure.

NVIDIA recently introduced Nemotron 3 Super, an open architecture featuring 120 billion parameters (of which 12 billion remain active) that is specifically engineered to execute complex agentic AI systems.

Available immediately, NVIDIA’s framework blends advanced reasoning features to help autonomous agents finish tasks efficiently and accurately for improved business automation. The system relies on a hybrid mixture-of-experts architecture combining three major innovations to deliver up to five times higher throughput and twice the accuracy of the preceding Nemotron Super model.

Mamba layers provide four times the memory and compute efficiency, while standard transformer layers manage the complex reasoning requirements. A latent technique boosts accuracy by engaging four expert specialists for the cost of one during token generation. The system also predicts multiple future tokens at once, accelerating inference threefold.
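The sparse-activation idea behind a mixture-of-experts layer can be sketched in a few lines of NumPy. The expert count, dimensions, and router below are toy values and bear no relation to NVIDIA's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mixture-of-experts layer: many experts exist, but a router
# activates only the top-k per token, so compute scales with the
# active subset rather than the full parameter count.
n_experts, d_model, top_k = 10, 64, 2
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    logits = x @ router
    top = np.argsort(logits)[-top_k:]                 # pick top-k experts
    weights = np.exp(logits[top] - logits[top].max()) # stable softmax
    weights /= weights.sum()                          # over chosen experts only
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.standard_normal(d_model)
y = moe_forward(x)
print(y.shape)          # (64,) -- output computed using 2 of 10 experts
active_fraction = top_k / n_experts
print(active_fraction)  # 0.2 -- analogous to 12B active of 120B total
```

Only the routed experts' weight matrices are touched per token, which is why a 120-billion-parameter model can run with the compute profile of a much smaller one.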

Operating on the Blackwell platform, the architecture utilises NVFP4 precision. This setup reduces memory needs and makes inference up to four times faster than FP8 configurations on Hopper systems, all without sacrificing accuracy.

Translating automation capability into business outcomes

The system offers a one-million-token context window, allowing agents to keep the entire workflow state in memory and directly addressing the risk of goal drift. A software development agent can load an entire codebase into context simultaneously, enabling end-to-end code generation and debugging without requiring document segmentation.

Within financial analysis, the system can load thousands of pages of reports into memory, improving efficiency by removing the need to re-reason across lengthy conversations. High-accuracy tool calling ensures autonomous agents reliably navigate massive function libraries, preventing execution errors in high-stakes environments such as autonomous security orchestration within cybersecurity.

Industry leaders – including Amdocs, Palantir, Cadence, Dassault Systèmes, and Siemens – are deploying and customising the model to automate workflows across telecom, cybersecurity, semiconductor design, and manufacturing.

Software development platforms like CodeRabbit, Factory, and Greptile are integrating it alongside proprietary models to achieve higher accuracy at lower costs. Life sciences firms like Edison Scientific and Lila Sciences will use it to power agents for deep literature search, data science, and molecular understanding.

The architecture also powers the AI-Q agent, which holds the top position on the DeepResearch Bench and DeepResearch Bench II leaderboards, highlighting its capacity for multistep research across large document sets while maintaining reasoning coherence.

Finally, the model claimed the top spot on Artificial Analysis for efficiency and openness, with leading accuracy among models of its size.

Implementation and infrastructure alignment

The model is built to handle complex subtasks inside multi-agent systems, and deployment flexibility remains a priority for leaders driving business automation.

NVIDIA released the model with open weights under a permissive license, letting developers deploy and customise it across workstations, data centres, or cloud environments. It is packaged as an NVIDIA NIM microservice to aid this broad deployment from on-premises systems to the cloud.

The architecture was trained on synthetic data generated by frontier reasoning models. NVIDIA published the complete methodology, encompassing over 10 trillion tokens of pre- and post-training datasets, 15 training environments for reinforcement learning, and evaluation recipes. Researchers can further fine-tune the model or build their own using the NeMo platform.

Any executive planning a digitisation rollout must address context explosion and the thinking tax upfront to prevent goal drift and cost overruns in agentic workflows. Establishing comprehensive architectural oversight ensures these sophisticated agents remain aligned with corporate directives, yielding sustainable efficiency gains and advancing business automation across the organisation.

See also: Ai2: Building physical AI with virtual simulation data


MWC 2026: SK Telecom lays out plan to rebuild its core around AI https://www.artificialintelligence-news.com/news/mwc-2026-sk-telecom-lays-out-plan-to-rebuild-its-core-around-ai/ Mon, 02 Mar 2026 10:00:00 +0000

The post MWC 2026: SK Telecom lays out plan to rebuild its core around AI appeared first on AI News.

At MWC 2026 in Barcelona, SK Telecom outlined how it is rebuilding itself around AI, from its network core to its customer service desks. The shift goes beyond adding new AI tools. It involves rewriting internal systems, expanding data centre capacity to the gigawatt scale, and upgrading its own large language model to more than one trillion parameters.

At a press conference during MWC 2026, SK Telecom CEO Jung Jai-hun outlined what the company calls an “AI Native” strategy. The plan centres on reorganising infrastructure and making large investments so the company can help position Korea among the world’s top three AI powers.

“SKT is currently at a golden time of transformation, where the two tasks of ‘customer value innovation’ and ‘AI innovation’ intersect in a borderless, converged environment that goes beyond telecommunications,” Jung said. “SKT defines ‘the customer as the very essence of our business,’ and through innovation driven by AI, we will evolve into a company that makes meaningful contributions to our customers and to Korea.”

Rewriting telecom systems around AI at MWC 2026

At the core of the plan is a rebuild of SK Telecom’s integrated IT systems. The company said it will redesign sales, line management, and billing systems to be optimised for AI. The aim is to let the operator design and offer personalised plans and memberships based on each customer’s usage and behaviour patterns.

The company also plans to apply a Zero Trust security framework across its systems. This will include stronger authentication, access controls, network segmentation, and AI-based monitoring, according to the company’s briefing at MWC 2026.

For enterprises watching the telecom sector, this signals a broader shift. Telecom operators have long relied on legacy billing stacks and network management tools. Rebuilding those systems around AI could change how pricing, service design, and fault detection work in practice. It also raises questions about data governance and how customer data is used to train or tune AI models.

SK Telecom is also expanding its “autonomous network operations” strategy. The company said it will use AI to automate wireless quality management, traffic control, and network equipment operations. With AI-RAN technology, it aims to improve speed and reduce latency. These efforts were described in company materials shared during the press event.

A single AI agent across touchpoints

Another part of the strategy focuses on customer interaction. SK Telecom plans to redesign pricing, roaming, and membership services to make them simpler and more automated. It is developing what it calls an integrated AI agent to connect experiences across its main customer portal, T world, and its online store, T Direct Shop.

The company said the agent will analyse daily usage patterns and offer tailored suggestions across channels. It also plans to expand its AI Contact Center so customer service representatives can use AI tools during support calls.

Offline retail stores are part of the shift. SK Telecom said AI will help staff identify customer needs and offer recommendations after a store visit. It is also building “AI Personas” to analyse digital behaviour across customer segments and support conversational Q&A.

For enterprise leaders, this mirrors a wider pattern. Telecom operators are trying to move from reactive service models to predictive ones. The difference now is scale. By embedding AI into billing, customer service, and retail, SK Telecom is treating AI as an operating layer rather than a separate feature.

Building 1GW-class AI data centres

The infrastructure build-out is equally ambitious. SK Telecom said it will construct hyperscale AI data centres across Korea, targeting capacity that exceeds 1 gigawatt. It aims to attract global investment and position the country as a major AI data centre hub in Asia.

The company already operates a GPU cluster called Haein and applied its virtualisation solution, Petasus AI Cloud, to support GPU-as-a-service workloads last year. It now plans to offer that cloud solution globally.

SK Telecom also plans to build an AI data centre in Korea’s southwestern region in collaboration with OpenAI, according to the company’s announcement at MWC 2026.

On the model side, SK Telecom said its sovereign AI foundation model currently has 519 billion parameters, making it the largest in Korea. The company plans to upgrade it to more than one trillion parameters and add multimodal capabilities so it can process image, voice, and video data starting in the second half of the year.

CEO Jung framed the data centre and model build-out in national terms. “AIDC can be seen as the heart of Korea, and hyperscale LLMs as the brain,” he said. “By combining SKT’s AI capabilities with collaboration from domestic and global partners, we will lead true AI-native transformation for Korean customers and enterprises.”

For enterprise readers, the key issue is not parameter count alone. It is how such models will be applied in sectors like manufacturing. SK Telecom said it is working with SK hynix on a manufacturing-focused AI package that analyses process data in real time to reduce defect rates and improve equipment efficiency. The package will be offered as infrastructure, model, and solution.

Changing internal culture

The transformation also extends to internal operations. SK Telecom has built an “AX Dashboard” to track AI use across departments and individuals. It operates an “AI Board” to oversee AI transformation efforts and has created an “AI playground” where employees can build AI agents without coding. More than 2,000 AI agents are already in use across marketing, legal, and public relations, according to the company’s figures shared at the event.

“To drive future growth, we must reinvent our way of working from the ground up. SKT will fundamentally transform its corporate culture to be centred around AI,” Jung said.

For other enterprises, the takeaway is less about branding and more about structure. SK Telecom is tying infrastructure, models, applications, and internal governance into a single program. Whether it can execute at the scale it describes remains to be seen. What is clear is that AI is no longer positioned as a side project. It is becoming the operating model.

(Photo by PR Newswire)

See also: Nokia and AWS pilot AI automation for real-time 5G network slicing


ASML’s high-NA EUV tools clear the runway for next-gen AI chips https://www.artificialintelligence-news.com/news/asml-high-na-euv-production-ready-ai-chips/ Fri, 27 Feb 2026 06:00:00 +0000

The post ASML’s high-NA EUV tools clear the runway for next-gen AI chips appeared first on AI News.

The machine that will make tomorrow’s AI chips possible has just been declared ready for mass production – and the clock for the industry’s next leap has officially started. ASML, the Dutch company that holds a global monopoly on commercial extreme ultraviolet lithography equipment, confirmed this week that its High-NA EUV tools have crossed the threshold from technically impressive to genuinely production-ready.

The announcement was made exclusively to Reuters by ASML’s chief technology officer Marco Pieters ahead of a technical conference in San Jose.

Current-generation EUV machines are approaching the outer edge of what they can do for advanced AI chip production, meaning the semiconductors powering large language models and AI accelerators are bumping up against a physical ceiling. High-NA EUV tools are designed to break through it, letting chipmakers print finer, denser circuit patterns in fewer steps. That translates directly into more powerful and efficient chips for AI workloads.

“I think that it’s at an important point to look at the amount of learning cycles that have happened,” Pieters told Reuters, referring to the volume of customer testing the machines have now accumulated.

The numbers that matter

ASML’s case for readiness rests on three data points it plans to release publicly. The High-NA EUV tools have now processed 500,000 silicon wafers, achieved roughly 80% uptime – with a target of 90% by year-end – and demonstrated imaging precision capable of replacing multiple conventional patterning steps with a single High-NA pass.

Together, Pieters said, those figures signal that the tools are ready for manufacturers to begin qualification. The machines don’t come cheap. At approximately US$400 million per unit – double the cost of the previous EUV generation – they represent one of the most expensive pieces of capital equipment in industrial history.

TSMC and Intel are among the named early adopters.

A two-to-three-year runway

Technical readiness and manufacturing integration are two different things, and Pieters was careful to separate them. Despite the milestone, full integration into high-volume production lines is still expected to take two to three years as chipmakers work through qualification and process development.

“Chipmakers have all the knowledge to qualify these tools,” he said – a vote of confidence in the industry’s ability to move, even if the timeline remains measured.

The next generation of chip performance improvements is on the horizon, not yet in hand. But with ASML now saying the starting gun has fired, the race to integrate High-NA EUV into production has formally begun.

(Photo by ASML)

See also: 2025’s AI chip wars: What enterprise leaders learned about supply chain reality


Nokia and AWS pilot AI automation for real-time 5G network slicing https://www.artificialintelligence-news.com/news/nokia-and-aws-pilot-ai-automation-for-real-time-5g-network-slicing/ Wed, 25 Feb 2026 10:00:00 +0000

The post Nokia and AWS pilot AI automation for real-time 5G network slicing appeared first on AI News.

Telecom networks may soon begin adjusting themselves in real time, as operators test systems that allow AI agents to manage traffic and service quality, shifting AI from an advisory role into operational decision-making.

This week, Nokia and AWS presented a new network slicing system that uses AI agents to monitor network conditions and adjust resources automatically. The setup is being tested by telecom operators du in the United Arab Emirates and Orange in Europe and Africa, according to a joint announcement from Nokia.

Adaptive AI-driven networks

Network slicing lets operators create multiple virtual networks on the same physical infrastructure, each tuned for a different purpose. For example, a slice may be configured for emergency services or high-bandwidth consumer traffic. While slicing is part of the 5G standard, it has often required manual planning and fixed configurations, which limits how quickly networks can respond to changing demand.

The new system aims to close that gap by introducing AI agents that track network performance indicators like latency and congestion, and consider data like event schedules or weather conditions. Agents can then adjust network settings to keep services running to agreed performance levels, according to Nokia’s description of the pilot.
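In outline, such a system follows a monitor-decide-act loop over each slice's key performance indicators. The slice names, thresholds, and scaling policy below are hypothetical illustrations, not details from the Nokia-AWS pilot:

```python
from dataclasses import dataclass

# Hypothetical sketch of a slicing control loop: read KPIs for a
# slice, compare them against its service-level targets, and return
# a new resource allocation. All values are illustrative.

@dataclass
class SliceKPIs:
    latency_ms: float
    utilisation: float  # fraction of allocated capacity in use

@dataclass
class Slice:
    name: str
    max_latency_ms: float
    capacity_units: int

def adjust(slice_: Slice, kpis: SliceKPIs) -> int:
    """Return a new capacity allocation for the slice (toy policy)."""
    if kpis.latency_ms > slice_.max_latency_ms or kpis.utilisation > 0.8:
        return slice_.capacity_units + 2   # scale up before an SLA breach
    if kpis.utilisation < 0.3 and slice_.capacity_units > 1:
        return slice_.capacity_units - 1   # release idle capacity
    return slice_.capacity_units           # within targets: no change

emergency = Slice("emergency-services", max_latency_ms=20, capacity_units=4)
print(adjust(emergency, SliceKPIs(latency_ms=35, utilisation=0.6)))  # 6
print(adjust(emergency, SliceKPIs(latency_ms=10, utilisation=0.2)))  # 3
```

In the pilot described here, the "decide" step would be delegated to an AI agent that also weighs external signals such as event schedules or weather, rather than the fixed thresholds shown above.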

AWS said the solution combines Nokia’s slicing and automation tools with AI models delivered through Amazon Bedrock, its managed AI service platform. The companies describe the approach as “agentic AI”.

Autonomous connectivity

The interest in such systems reflects a long-standing challenge: 5G networks have delivered higher speeds and lower latency, but operators have struggled to turn those technical gains into new revenue streams. Research firm GSMA Intelligence notes many operators view network slicing as a potential source of enterprise income, though adoption has been slow due to operational complexity and uncertain demand.

If networks can adapt quickly to sudden demand, like a crowded stadium or emergency responders entering a disaster area, operators may be able to offer temporary connectivity or guaranteed service levels without manual setup.

Orange has previously said that enterprise customers expect connectivity to behave more like cloud computing, where resources can scale on demand. Systems that allow automated control of network resources could help move telecom services closer to that model.

Cloud platforms and telecom network operations

The tests also highlight how cloud providers are getting involved in telecom operations. Over the past few years, some operators have moved parts of their core networks onto public cloud platforms or built cloud-based control systems. Industry analysts at Dell’Oro Group report that telecom cloud spending is rising as operators modernise networks and adopt software-driven infrastructure.

Adding AI-driven control loops on top of cloud platforms represents the next step, with AI systems monitoring conditions and applying adjustments quickly.

The technology remains in a testing phase. Nokia’s announcement described the work with Orange as demonstrations and pilot rollouts. Questions remain about how such systems can be deployed, how operators will supervise automated decisions, and how regulators will view AI control of critical communication infrastructure.

Telecom networks carry critical traffic, so reliability and accountability remain central concerns. Operators typically introduce automation gradually, keeping human oversight in place while validating system behaviour under real conditions.

The experiments suggest that AI is beginning to function as an operational controller, adjusting physical and virtual resources in response to live events.

Enterprises that rely on private 5G networks for factories or large venues may gain access to connectivity that adjusts automatically. That could influence how businesses design applications that depend on stable, predictable network performance.

(Photo by M. Rennim)

See also: How e& is using HR to bring AI into enterprise operations


The post Nokia and AWS pilot AI automation for real-time 5G network slicing appeared first on AI News.

]]>