Crafting Cypher Queries from Question Phrases

The High Plains Computing AI team recently completed a project on generative AI. The goal of the project was to create a tailored solution for querying graph databases.

The project aims to utilize generative AI for creating high-quality graph language queries (Cypher queries), enabling interactive usage through BI tools. Human intent in English guides the deep learning network to generate and execute Cypher queries, retrieving results for dashboards and reports. This document provides insight into the architecture and key attributes of this design.

Business Value and Impact

We recognize substantial opportunities in machine-generated queries. Numerous applications already use rule engines to translate object models into SQL (object-relational mapping, or ORM), including widely used Java, Python, Go, and web frameworks. Our focus is on producing domain-specific queries, since our client's domain centers on networks and security.

This compact generative model won’t adversely affect engineers’ roles; rather, it serves as an extra tool for BI and security experts.

Graph Databases

A graph database stores nodes and relationships rather than tables or documents, which aligns with how people naturally describe complex systems. The illustration below (Figure-1) depicts a graph of a company's computer network and the computing resources it owns.

Cypher Query Language

Cypher, Neo4j's graph query language, retrieves data from the graph. Loosely modeled on SQL, it matches nodes linked by relationships.

Using the earlier computer network illustration (Figure-1), a Cypher query could count the GPU compute nodes the company owns in the US-WEST region, as shown below.

MATCH (node:Node {region:'US-WEST-2'})-[rel:NodeComponents]-(component:NodeComponent)
WHERE component.GPUCount > 0
RETURN node.location, node.last_patch_date, node.ip_address

Sequence-to-sequence learning and generative AI

Sequence-to-sequence transformer models, a distinct class of NLP models, handle entire word sequences like sentences and paragraphs. They excel in tasks like machine translation, capturing sentence attributes in one language, and using this knowledge to produce an equivalent sentence in another. These models also find utility in generating text summaries and filling in text based on prompts.

The overarching structure of sequence-to-sequence models entails: 

  • The encoder block transforms a sentence into a deep machine-learned representation that captures sentence structure and inter-word relationships. Given the sentence “I was born in Ireland where the language they teach to all kids in school is ________,” the model can accurately predict the likely word “Gaelic.” Remarkably, although “Ireland” sits far from the blank, the encoder’s attention mechanism lets it establish a robust connection between Ireland and Gaelic.
  • The decoder block is trained to predict sequences in the target language. It uses the partial target sequence generated so far, together with the encoder’s encoded sentence, to complete the target sequence.

During inference, the decoder initially has no target sequence. Instead, it is seeded with a start token that marks the beginning of generation. Conditioned on the encoder’s output, it predicts the first token; with that token in hand, it uses the preceding tokens and the encoding to predict the next one. This loop of feeding previously generated target-language tokens and the encoded source sentence back to the decoder continues until the sequence is complete.
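The iterative decoding just described can be sketched as a toy greedy loop. Here `predict_next` is a stub with canned answers standing in for the real decoder; the tokens and signature are purely illustrative:

```python
START, END = "<s>", "</s>"

def predict_next(encoded_source, partial_target):
    """Stub for the decoder: returns the next target token given the
    encoder output and the target tokens generated so far.
    (A real model would score the whole vocabulary here.)"""
    canned = {1: "MATCH", 2: "(n:Node)", 3: "RETURN", 4: "n", 5: END}
    return canned[len(partial_target)]

def greedy_decode(encoded_source, max_len=16):
    """Autoregressive inference: seed with the start token, then feed
    previously generated tokens back in until the end token appears."""
    target = [START]
    for _ in range(max_len):
        token = predict_next(encoded_source, target)
        if token == END:
            break
        target.append(token)
    return target[1:]  # drop the start token
```

Calling `greedy_decode` on an encoded source sentence yields the target tokens one at a time, exactly mirroring the loop described above.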

The diagram below (Figure-2) illustrates the encoder/decoder framework employed for English-to-Spanish translation in both the training and inference stages.

T5 Transformer

The T5 Transformer model was proposed to extend transformer architecture-based transfer learning further.

See: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Unlike BERT, which functions solely as an encoder, and GPT-3, which serves as a decoder, the T5 transformer operates as an encoder-decoder model. Google manages an up-to-date iteration, T5 v1.1, and offers pre-trained transformers in multiple sizes, with the largest comprising a staggering 11 billion parameters.

Training the T5 Transformer

T5 is trained with teacher forcing, where the teacher provides both the input and target sequences. To transform an English sentence into Cypher, training needs pairs of questions and their corresponding Cypher queries. The following example shows an English question and the corresponding Cypher query.

How many Windows-based computers are at high risk

MATCH (node:Node {os:'Windows'})-[rel:NodeRisks]-(risk:RiskAssessment)
WHERE risk.level > 5
RETURN node.location, node.last_patch_date, node.ip_address

For training, the team captured 5,000 queries about the client’s network from the reporting and dashboard system and associated them with roughly 7,000 questions. A much larger dataset would be needed for more complex graph networks.
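One way to stage such question–query pairs for teacher-forced training is to attach a task prefix to each source sentence, as T5 does for its built-in tasks. The prefix string and field names below are illustrative, not the team's actual schema:

```python
TASK_PREFIX = "translate English to Cypher: "  # illustrative task prefix

def make_training_pairs(examples):
    """Turn (question, cypher) tuples into source/target sequences
    for teacher-forced sequence-to-sequence training."""
    return [
        {"source": TASK_PREFIX + question.strip(),
         "target": cypher.strip()}
        for question, cypher in examples
    ]

pairs = make_training_pairs([
    ("How many Windows-based computers are at high risk",
     "MATCH (node:Node {os:'Windows'}) RETURN count(node)"),
])
```

Each resulting record carries the prefixed English question as the source sequence and the Cypher query as the target the decoder learns to emit.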

Components of the Solution

The following diagram shows a high-level overview of the solution and its components.

A more detailed review of the components of the solution shown in Figure-3:

A: Pre-trained transformer: The team used the Base variant of a pre-trained T5 v1.1 transformer trained on the C4 dataset, a cleaner version of the Common Crawl dataset used extensively by most language models. This pre-trained model has roughly 220 million parameters.

B: English annotation and sample queries: This was done by capturing the most frequent queries from the production system and manually creating corresponding English questions.

C: Fine-tuning the T5 model: The hand-crafted dataset was split into train, validation, and test sets and used to fine-tune the pre-trained transformer.

D: Custom loss function: As part of the sequence-to-sequence translation, the team customized the standard ROUGE-L (Recall-Oriented Understudy for Gisting Evaluation) loss function to add extra penalties for ill-formed Cypher queries and to reduce the loss when relations between nodes are correctly identified.

E: Post-processing: For details, see the Post Processing section below.
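The extra penalty in item D can be approximated with a simple well-formedness check on the decoded query. This sketch (keyword and bracket-balance checks only, with an assumed penalty weight) is a stand-in for the team's actual loss term, not a reconstruction of it:

```python
def malformed_penalty(query, weight=2.0):
    """Extra loss penalty for queries that are obviously ill-formed:
    missing MATCH/RETURN clauses, or unbalanced brackets."""
    penalty = 0.0
    if "MATCH" not in query or "RETURN" not in query:
        penalty += weight
    for open_ch, close_ch in ("()", "[]", "{}"):
        if query.count(open_ch) != query.count(close_ch):
            penalty += weight
    return penalty

def total_loss(rouge_l_loss, generated_query):
    """Combine the sequence-level ROUGE-L loss with the extra penalty."""
    return rouge_l_loss + malformed_penalty(generated_query)
```

A well-formed query contributes no penalty, so the combined loss reduces to the plain ROUGE-L term; malformed generations are pushed away harder during fine-tuning.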

Post Processing

The post-processing pipeline was built with spaCy pipelines. It parses generated queries, validates their syntax, filters out blacklisted queries, substitutes supplementary tokens for placeholders (e.g., translating “my” to a customer account ID for questions like “show me all my networks”), and performs any remaining post-processing needed to make queries executable.
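A minimal version of these checks (without spaCy, and with an assumed deny-list and an assumed `$ACCOUNT_ID` placeholder convention) might look like:

```python
# Assumed deny-list of mutating clauses the model must never execute.
BLACKLIST = ("DELETE", "DETACH", "DROP")

def post_process(query, account_id):
    """Validate and rewrite a generated query before execution:
    reject blacklisted clauses, check bracket balance, and resolve
    the account placeholder to the caller's account ID."""
    upper = query.upper()
    if any(word in upper for word in BLACKLIST):
        raise ValueError("query rejected by blacklist")
    for open_ch, close_ch in ("()", "[]", "{}"):
        if query.count(open_ch) != query.count(close_ch):
            raise ValueError("unbalanced brackets in generated query")
    # e.g. "show me all my networks" -> filter on the caller's account
    return query.replace("$ACCOUNT_ID", account_id)
```

In the real pipeline these steps run as spaCy pipeline components; this sketch only shows the validation and substitution logic in isolation.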

The general structure of spaCy pipelines is shown in the following figure (Figure-4).

Streamlit UI

A simple yet efficient UI was built with Streamlit for the following purposes:

  1. Testing the model
  2. Adding more training data
  3. Training and exporting the model for serving

More information is available on the Streamlit website.

Model Serving for Client Applications

The team served the model with TensorFlow Serving and placed a REST proxy in front of it built with FastAPI.

FastAPI is a REST application development framework for creating Python-based API endpoints.

The API model follows a straightforward structure, with the request payload containing a collection of English language questions, and the response containing the corresponding generated Cypher queries.
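Given that payload shape, a thin client helper might look like the following. The field names `questions` and `queries` are assumptions about the actual schema, chosen to match the description above:

```python
import json

def build_request(questions):
    """Request payload: a collection of English-language questions."""
    return json.dumps({"questions": list(questions)})

def parse_response(body):
    """Response payload: the corresponding generated Cypher queries."""
    return json.loads(body)["queries"]
```

A caller builds one request for a batch of questions and reads back one generated Cypher query per question, keeping the proxy stateless.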

TensorFlow Serving is a versatile, high-performance serving framework tailored for deploying machine learning models in production. It streamlines the deployment of new algorithms and experiments, maintaining consistent server architecture and APIs. While it seamlessly integrates with TensorFlow models, TensorFlow Serving can also be expanded to serve different models and data types. Further details are available on TensorFlow Serving site.
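TensorFlow Serving exposes a REST predict endpoint at `/v1/models/<name>:predict` that accepts a JSON body with an `instances` list. A payload builder for this deployment might look like the sketch below; the model name `cypher_t5` and the `input_text` feature name depend on the exported serving signature and are assumptions:

```python
def predict_url(host, model_name="cypher_t5"):
    """URL of TensorFlow Serving's REST predict endpoint
    (8501 is TF Serving's default REST port)."""
    return f"http://{host}:8501/v1/models/{model_name}:predict"

def predict_payload(questions):
    """Body for the predict call: one instance per question."""
    return {"instances": [{"input_text": q} for q in questions]}
```

The FastAPI proxy described above would translate its own request schema into this shape, forward it to TF Serving, and decode the returned predictions into Cypher strings.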


In conclusion, this project shows the power of machine-generated queries to enhance graph-language querying. The use of generative AI offers efficiency and accuracy in data retrieval. By integrating the T5 transformer, fine-tuning, and post-processing, the team delivered a comprehensive solution. Combining graph databases, sequence-to-sequence learning, and TensorFlow Serving, this project represents a significant stride toward efficient and effective data querying and manipulation.
