Perplexity AI Unveils Hybrid Inference System to Bridge Cloud and Edge
Perplexity AI unveiled a hybrid inference system at Computex 2026 that autonomously switches between edge and cloud models to optimize AI performance and costs.
Perplexity AI unveiled a hybrid inference system at Computex 2026 that autonomously switches between edge and cloud models to optimize AI performance and costs.
Cerebras Systems claims its hardware achieves inference speeds of nearly 1,000 tokens per second for trillion-parameter AI models, potentially outperforming traditional GPU-based providers and intensifying the competition in AI hardware.
Technological breakthroughs in embedding-space communication and agent-management systems are revolutionizing multi-agent AI efficiency and lowering operational costs for enterprise automation.
Startup Gimlet Labs has raised $80 million in Series A funding to develop technology that enables AI inference to run across diverse hardware architectures, reducing vendor lock-in.
Gimlet Labs has raised $80 million in Series A funding to tackle AI inference bottlenecks by allowing models to run across diverse hardware architectures simultaneously.