Lynx Analytics: Scaling Graph AI to solve complex business problems

Long story short

Lynx Analytics uses GenAI and graph theory to reveal hidden patterns in interconnected data and yield more accurate, context-rich insights. Its data science tool, LynxKite 2000:MM, helps domain experts connect complex datasets and deploy predictive Graph AI workloads ranging from single-GPU experiments to 1,000+ GPUs without writing code. Lynx Analytics uses Nebius’ elastic, vertically integrated AI infrastructure to accelerate delivery across projects and maintain high GPU utilization, keeping performance and costs optimized.

Lynx Analytics bridges industry and tech expertise to deliver AI-powered solutions to enterprises across life sciences, telecom, retail and finance. With a pioneering approach to GenAI, Lynx Analytics integrates graph technologies, LLMs and retrieval augmented generation (RAG) to enable deeply contextual and highly relevant answers that empower organizations to act with confidence.

By treating data as a living network instead of static spreadsheets, Lynx Analytics designs Graph AI projects that reveal hidden intelligence from complex contexts to support confident decision-making across industries. Their platform, LynxKite 2000:MM, empowers teams to natively map interconnected data points with Graph AI models to deliver more accurate, context-aware predictions for meaningful business outcomes.

This case study explores how LynxKite 2000:MM enables innovative problem-solving across domains — from tackling complex math problems to optimizing drug discovery pipelines and advancing predictive modeling in healthcare. We’ll also highlight future directions for innovative Graph AI applications and discuss how Nebius’ flexible GPU provisioning enables Lynx Analytics to scale AI-native workloads on LynxKite 2000:MM.

Advancing mathematical reasoning

Designing AI systems to enhance logical reasoning is one of the most exciting frontiers in ML today. A theorem is a mathematical statement that can be proven through rigorous logical steps, and proving one requires AI to orchestrate complex chains of reasoning and build airtight logical arguments.

Following in the footsteps of LLM-based formal proof systems, Lynx Analytics set up an automated theorem proving framework with LynxKite 2000:MM. By easily connecting datasets, integrating cutting-edge models and training neural networks, the team combined a wide range of graph technologies to unlock new capabilities in mathematical reasoning:

  • Complex multimodal data: With LynxKite 2000:MM, the team integrated tabular data, textual input and text-embedded graphs into a single analytical framework to enable more holistic logical reasoning.

  • Knowledge Graphs: The foundational layer of Graph AI, Knowledge Graphs map entities and relationships as nodes and edges, preserving semantic contexts for improved reasoning. In this project, LynxKite 2000:MM operated on millions of nodes to model proof steps, subgoals, branching proofs and relationships between theorems.

  • Graph RAG: RAG draws relevant information from large-scale knowledge sources so that LLMs can deliver better-informed responses. In general, the more accurate the retrieval, the better the LLM’s answer. Recent tests showed that a graph-based approach to RAG increased retrieval accuracy from 5% to 15%, a threefold improvement.

  • Graph Neural Networks (GNNs): Designed and trained in LynxKite 2000:MM and powered by Nebius AI Cloud, a complex GNN was used as a graph-based retriever to support more accurate predictions.
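To make the graph-based retrieval idea more concrete, here is a minimal sketch in plain Python. It is not LynxKite 2000:MM code: the toy graph, the theorem names and the `build_graph` and `retrieve_context` helpers are all hypothetical, and a fixed-depth neighborhood expansion stands in for the trained GNN retriever described above. The idea is simply that, given a seed statement, nearby nodes in the knowledge graph are collected as context for the LLM.

```python
from collections import deque

# Hypothetical toy knowledge graph: each edge reads "theorem depends on lemma".
EDGES = [
    ("triangle_inequality", "cauchy_schwarz"),
    ("cauchy_schwarz", "inner_product_axioms"),
    ("triangle_inequality", "norm_definition"),
    ("norm_definition", "inner_product_axioms"),
    ("bolzano_weierstrass", "monotone_convergence"),
]


def build_graph(edges):
    """Return an adjacency dict mapping each node to the nodes it depends on."""
    graph = {}
    for theorem, dependency in edges:
        graph.setdefault(theorem, set()).add(dependency)
        graph.setdefault(dependency, set())
    return graph


def retrieve_context(graph, seed, max_hops=2):
    """Collect nodes within max_hops of the seed, nearest first (breadth-first)."""
    distances = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[seed]  # the seed itself is the query, not retrieved context
    return sorted(distances, key=lambda n: (distances[n], n))


graph = build_graph(EDGES)
context = retrieve_context(graph, "triangle_inequality", max_hops=2)
print(context)
```

In a real Graph RAG setup, the seed would typically be chosen by embedding similarity to the query, and the expansion would be scored or learned rather than limited by a fixed hop count.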

Accelerating drug discovery and development with Graph AI

A graph-based approach accelerates a compound’s journey through pharma R&D stages, shortening timelines and reducing the risk of costly late-stage failures. The key is integrating omics data (such as DNA, RNA and protein sequences), scientific publications, clinical trial data and molecular and chemical structures as a dynamic network.

When analyzed by GNNs, this structure reveals hidden patterns that indicate high-potential drug candidates, deriving more accurate forecasts of trial success. Curated by Lynx Analytics, the examples below highlight Graph AI’s role in solving complex challenges in drug R&D.

Enhancing drug screening with GenAI

To identify promising drug candidates, researchers typically look into how well a compound binds to proteins that play a central role in specific diseases. An in-silico analysis usually involves complementary pipelines — relating proteins and molecules — to predict binding affinity.

With LynxKite 2000:MM, domain experts can investigate protein sequence similarities with Multiple Sequence Alignment (MSA) Search and predict their 3D structures with folding models like AlphaFold and OpenFold. On the molecular side, researchers can assess chemical properties and rank potential efficacy using generative models like GenMol. Finally, docking models like DiffDock simulate how well molecules attach to the folded protein structures, helping pinpoint high-potential drug candidates faster.

To deliver these cutting-edge models at users’ fingertips, LynxKite 2000:MM handles flexible scaling on Kubernetes, including Nebius’ Managed Kubernetes. By matching GPU deployments to workload intensity, Lynx Analytics ensures robust performance while optimizing costs.

Finding effective drug targets

A major bottleneck in drug discovery is identifying biological targets — most commonly proteins or cellular components — that, when acted upon by a drug, can help treat or manage a specific disease.

Traditional machine learning approaches might analyze biological data in a vacuum, sometimes leading to false positives. Take gene expression, for example: highly expressed genes in cancer patients aren’t always effective targets, as their elevated activity could be a symptom of the disease rather than its cause.

GNNs provide a multi-layered understanding of potential targets by linking genes to the proteins they encode, mapping relationships between proteins and connecting these interactions to disease-specific biological pathways. By prioritizing targets based on their context within disease-critical pathways and their similarity to successful targets, a GNN can filter out the noise and increase the likelihood of therapeutic success.
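The intuition behind this context-aware prioritization can be illustrated with a single round of neighborhood aggregation, the basic operation inside a GNN layer. The sketch below is a simplification under stated assumptions: the node names, scores and the `propagate` helper are hypothetical, and a fixed 50/50 mix of a node's own score and its neighbors' mean replaces the weights a real GNN would learn.

```python
def propagate(scores, neighbors, self_weight=0.5):
    """One round of mean aggregation: blend each node's score with its neighbors' mean."""
    updated = {}
    for node, score in scores.items():
        adjacent = neighbors.get(node, [])
        if adjacent:
            neighbor_mean = sum(scores[n] for n in adjacent) / len(adjacent)
        else:
            neighbor_mean = 0.0  # isolated nodes receive no supporting context
        updated[node] = self_weight * score + (1 - self_weight) * neighbor_mean
    return updated


# Hypothetical initial evidence, e.g. normalized expression or known-target labels.
scores = {
    "gene_a": 0.9,          # highly expressed, but outside the disease pathway
    "gene_b": 0.4,          # moderately expressed, next to known targets
    "known_target_1": 1.0,
    "known_target_2": 1.0,
    "bystander": 0.1,
}
neighbors = {
    "gene_a": ["bystander"],
    "gene_b": ["known_target_1", "known_target_2"],
    "known_target_1": ["gene_b", "known_target_2"],
    "known_target_2": ["gene_b", "known_target_1"],
    "bystander": ["gene_a"],
}

updated = propagate(scores, neighbors)
ranking = sorted(["gene_a", "gene_b"], key=updated.get, reverse=True)
print(ranking)
```

After one round, the moderately expressed gene that sits among known targets outranks the highly expressed but poorly connected one, which is the kind of false positive the paragraph above describes.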

Repurposing drugs for new indications

Promising compounds, despite passing initial safety and toxicology screens, often fail to show efficacy against their original target diseases. As a result, pharmaceutical companies accumulate large chemical libraries with up to millions of shelved molecules that never made it to market but could be repurposed for new therapeutic applications.

While traditional ML approaches sometimes match isolated features, such as diseases with similar gene expression signatures, GNNs uncover multi-step paths between approved compounds and new diseases with link prediction.

A core task in network analysis, link prediction calculates the probability of a missing link between two nodes. For example, GNNs can infer that an approved drug has a high probability of success in treating a new condition because the compound interacts with a protein that shares a biological pathway with a known driver of the disease, uncovering hidden therapeutic opportunities.
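As a rough illustration of the shared-pathway reasoning in this example, the sketch below scores a candidate drug-disease link by the overlap between the pathways of the drug's protein targets and the pathways of the disease's known drivers. This is a simple Jaccard heuristic swapped in for a learned GNN link predictor, and every name and relationship in it is hypothetical.

```python
# Hypothetical toy relationships; none of these come from real data.
DRUG_TARGETS = {"drug_x": {"protein_1"}, "drug_y": {"protein_4"}}
PROTEIN_PATHWAYS = {
    "protein_1": {"pathway_a", "pathway_b"},
    "protein_2": {"pathway_a"},
    "protein_3": {"pathway_b", "pathway_c"},
    "protein_4": {"pathway_d"},
}
DISEASE_DRIVERS = {"disease_k": {"protein_2", "protein_3"}}


def pathways_of(proteins):
    """Union of all pathways that any of the given proteins belongs to."""
    result = set()
    for protein in proteins:
        result |= PROTEIN_PATHWAYS.get(protein, set())
    return result


def link_score(drug, disease):
    """Jaccard overlap between drug-side and disease-side pathways, in [0, 1]."""
    drug_pathways = pathways_of(DRUG_TARGETS.get(drug, set()))
    disease_pathways = pathways_of(DISEASE_DRIVERS.get(disease, set()))
    union = drug_pathways | disease_pathways
    if not union:
        return 0.0
    return len(drug_pathways & disease_pathways) / len(union)


for drug in sorted(DRUG_TARGETS):
    print(drug, round(link_score(drug, "disease_k"), 3))
```

A GNN-based link predictor would learn such signals from node embeddings across many multi-step paths instead of relying on one hand-written overlap measure, but the ranking logic is comparable: candidates with stronger pathway-level connections to the disease score higher.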

Enhancing tumor modeling for better treatment predictions

Graph models can also integrate temporal data to understand how drugs or treatments will affect patients over time, reducing the amount of data required for accurate predictions by up to a factor of 8. Leaner and sharper predictions help life-saving treatments reach patients faster, shortening lead times from months to weeks.

By combining early tumor size measurements with omics data and graph embeddings, researchers were able to forecast future tumor growth or shrinkage more accurately than with empirical or simpler statistical models, according to a study presented at the 2024 International Conference on Learning Representations (ICLR).

Experiments like this show how dosing schedules, treatment duration and therapeutic combinations can be optimized in silico. By integrating patient-specific parameters, GNNs can better predict the likelihood of disease trajectories earlier in the process, helping physicians evaluate a patient’s level of risk and support important decisions in clinical trials.

When reproducing this research, Lynx Analytics expanded LynxKite 2000:MM’s neural network toolbox to diversify its advanced modeling capabilities, adding neural ODEs (ordinary differential equations), heterogeneous graph convolutions and LSTM (long short-term memory) networks. Lynx Analytics also expanded user control by enabling custom model inputs for detailed batching and real-time graph creation for individual samples.
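For readers unfamiliar with ODE-based modeling, the sketch below shows the general shape of the problem: tumor volume evolves according to a rate function, and a forecast is obtained by integrating that function forward in time. It is an illustrative assumption, not the model from the study or from LynxKite 2000:MM. A hand-written logistic growth term with a treatment kill rate stands in for the rate function that a neural ODE would learn from data, and all parameter values are hypothetical.

```python
def growth_rate(volume, rate, capacity, kill_rate):
    """Logistic growth minus a treatment-induced kill term (hypothetical form)."""
    return rate * volume * (1.0 - volume / capacity) - kill_rate * volume


def simulate(v0, rate, capacity, kill_rate, days, dt=0.1):
    """Integrate the tumor volume forward with explicit Euler steps."""
    steps = int(round(days / dt))
    volume = v0
    trajectory = [volume]
    for _ in range(steps):
        volume += dt * growth_rate(volume, rate, capacity, kill_rate)
        trajectory.append(volume)
    return trajectory


# Hypothetical parameters: same tumor with and without an effective treatment.
untreated = simulate(v0=1.0, rate=0.2, capacity=10.0, kill_rate=0.0, days=30)
treated = simulate(v0=1.0, rate=0.2, capacity=10.0, kill_rate=0.3, days=30)
print(f"untreated: {untreated[-1]:.2f}, treated: {treated[-1]:.2f}")
```

In a neural ODE, `growth_rate` would be replaced by a neural network whose inputs can include patient-specific features such as omics data and graph embeddings, and whose parameters are fitted to early tumor size measurements.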

How Nebius AI Cloud supercharges LynxKite 2000:MM’s Graph AI performance

LynxKite 2000:MM relies on Nebius AI Cloud for fast provisioning, simple scaling and rapid deployment, matching GPU capacity to each workload’s architecture, size and training or inference needs.

With Managed Kubernetes, shared filesystems and Slurm support, Nebius enables LynxKite 2000:MM to move from 1 to over 1,000 GPUs while sustaining over 80% utilization and eliminating DevOps overhead.

  • Elastic provisioning for variable loads: LynxKite 2000:MM powers research and analytics projects of all sizes, on demand — making rapid delivery essential for team adoption. Nebius’ flexible provisioning adapts to LynxKite 2000:MM’s evolving HPC requirements, handling unpredictable demand spikes without redundancies and keeping both performance and costs optimized.

  • Faster deployment: With Nebius, Lynx Analytics deploys production-ready clusters quickly and confidently, with minimal onboarding overhead. Shared file systems across multiple VMs ensure a unified data layer, eliminating re-ingest friction and accelerating time-to-insight.

  • Kubernetes for distributed workloads: Lynx Analytics uses VMs with Managed Kubernetes to automate container orchestration, simplifying the management of distributed applications. This approach enables LynxKite 2000:MM to automatically manage the lifecycle of containerized models, ensuring reliable execution of complex Graph AI workflows with high GPU utilization.

  • Slurm integration: Nebius provides managed Slurm capabilities via Soperator, enabling Lynx Analytics to add Slurm support to LynxKite 2000:MM without building the setup from scratch — and allowing engineers to use this popular GPU cluster management tool without the operational burden.

Looking ahead: Evolving graphs, emerging datasets and promising applications

Graph analytics applications in life sciences are progressing rapidly on many fronts. Here are a few promising developments to watch, according to Lynx Analytics:

  • Temporal graphs: Instead of relying purely on “static” genomics data, protein-protein interaction networks and expression snapshots, the dynamic nature of biology can be better represented by Knowledge Graphs that encode changes over time. By incorporating how gene and protein expression changes according to disease or treatment progressions, environmental exposures and feedback loops, the life sciences field is set to benefit from exciting new insights.

  • Beyond molecular and omics data: Emerging data sources span medical imaging, electronic health records, wearable technologies and real-world evidence from claims and registries. With more biological data available every year, graph representations can greatly improve indication selection and outcome prediction by connecting these emerging data modalities.

  • Under-studied domains: Not all diseases and biological targets are equally represented in public data. By using structural similarity, evolutionary relationships or cross-species data, Graph AI has the potential to help infer plausible relationships in overlooked areas via “transfer learning” from established fields, potentially improving our knowledge of rare diseases and under-explored targets like disordered proteins and membrane components.
