Technical Background

FoodAtlas Behind the Scenes

FoodAtlas is an AI-powered tool that maps the complex relationships between food, chemicals, and diseases. It not only identifies the types and quantities of chemicals in the foods we consume but also explores their potential health impacts.

Our system continuously monitors new research, extracting data on chemical concentrations and disease correlations. This data is cross-referenced with established databases, such as PubChem, and incorporated into our knowledge graph.

The following provides a brief overview of some of the methods and technologies used. For a detailed look behind the scenes, refer to our first publication.


An example image of a knowledge graph. The graph contains an enormous number of nodes and edges connecting them. Some nodes are highlighted, such as Soybean, Cow, or Tomato. A portion of the image is magnified to illustrate the connections between Garlic, Garlic root, Allicin, and Saponins.

Knowledge Graph

FoodAtlas uses a knowledge graph to systematically store and organize a vast network of interconnected entities, including foods, chemicals, diseases, and their relationships. Each connection is represented as a triplet, a structured entry in our knowledge base that consists of a head entity, a tail entity, and the relationship between them. To make food-related insights more reliable, FoodAtlas also incorporates rich metadata, including detailed entity information and supporting evidence for each relationship.
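As a rough illustration of the triplet structure described above, the following Python sketch models an entity and a triplet with attached evidence. The class names, field names, and identifiers are hypothetical and chosen for illustration only; they are not taken from the FoodAtlas codebase.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    entity_id: str    # hypothetical internal identifier
    name: str
    entity_type: str  # "food", "chemical", or "disease"


@dataclass
class Triplet:
    head: Entity
    relation: str
    tail: Entity
    # Supporting evidence for the relationship, e.g. source sentences or references.
    evidence: list = field(default_factory=list)


garlic = Entity("e1", "Garlic", "food")
allicin = Entity("e2", "Allicin", "chemical")

# One knowledge base entry: head entity, relationship, tail entity, plus evidence.
triplet = Triplet(garlic, "contains", allicin, evidence=["hypothetical source sentence"])
print(triplet.head.name, triplet.relation, triplet.tail.name)
```

Keeping the evidence on the triplet rather than on the entities reflects the idea that support is attached to each relationship individually.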

Graph Semantics

A node is either a Food, a Chemical, or a Disease.

An edge describes the relationship between two nodes. FoodAtlas captures "contains" relations, i.e., which chemicals are found in which foods, as well as "is a" relations, which link a part of a food to the food it belongs to. A chemical may in turn correlate positively or negatively with a disease.

A graphic illustrating the semantic relationships of a knowledge graph. Three nodes (food, chemical, and disease) are shown, connected through edges. The edges cover all possible relations between two nodes: a self-referencing 'is-a' relation, which is possible for both foods and chemicals; a 'contains' relation between a food and a chemical; and a 'positively / negatively correlates' relation from chemical to disease and from food to disease. The latter is dashed to indicate that it is a work in progress.
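The semantics above amount to a small schema of permitted edge types. A minimal sketch, assuming hypothetical relation names and lowercase type labels, could check a proposed edge against that schema. The food-to-disease relation is left out here because the text marks it as work in progress.

```python
# Allowed (head type, relation, tail type) combinations, following the description above.
# The relation names are illustrative assumptions, not FoodAtlas identifiers.
ALLOWED_EDGES = {
    ("food", "contains", "chemical"),
    ("food", "is_a", "food"),
    ("chemical", "is_a", "chemical"),
    ("chemical", "positively_correlates", "disease"),
    ("chemical", "negatively_correlates", "disease"),
}


def is_valid_edge(head_type: str, relation: str, tail_type: str) -> bool:
    """Return True if the edge matches one of the allowed schema entries."""
    return (head_type, relation, tail_type) in ALLOWED_EDGES


print(is_valid_edge("food", "contains", "chemical"))  # allowed by the schema
print(is_valid_edge("disease", "contains", "food"))   # not part of the schema
```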

Pipeline

Our pipeline uses state-of-the-art AI models to extract and quantify food connections. The two major steps are (a) knowledge extraction, i.e., converting literature into food-chemical relations, and (b) knowledge graph construction, which adds metainformation and new information to our knowledge base.

Knowledge Extraction

1. Filter Documents: Relevant, peer-reviewed literature is filtered using a list of more than 1,200 keywords.

2. Predict Relevant Sentences: Sentences likely to contain food information are predicted using BioBERT.

3. Extract Relations: Sentences are processed by GPT-4 to extract food-chemical relations.
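The three extraction steps can be sketched as a simple chain of functions. This is a toy outline under stated assumptions, not the FoodAtlas implementation: the keyword set is a made-up stand-in for the real list of more than 1,200 keywords, and the sentence scorer and relation extractor are passed in as plain callables that stand in for BioBERT and GPT-4 respectively.

```python
import re

# Hypothetical stand-in for the real keyword list.
KEYWORDS = {"garlic", "tomato", "soybean"}


def filter_documents(docs, keywords=KEYWORDS):
    """Step 1: keep documents that mention at least one keyword."""
    return [d for d in docs if any(k in d.lower() for k in keywords)]


def split_sentences(doc):
    """Naive sentence splitter on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]


def extract(docs, score_sentence, extract_relations, threshold=0.5):
    """Steps 2 and 3: score sentences, then extract relations from relevant ones."""
    relations = []
    for doc in filter_documents(docs):
        for sentence in split_sentences(doc):
            if score_sentence(sentence) >= threshold:
                relations.extend(extract_relations(sentence))
    return relations


# Toy stand-ins for the models; a real system would call a classifier and an LLM here.
def toy_scorer(sentence):
    return 1.0 if "contains" in sentence.lower() else 0.0


def toy_extractor(sentence):
    match = re.match(r"(\w+) contains (\w+)", sentence)
    return [(match.group(1), "contains", match.group(2))] if match else []


docs = ["Garlic contains allicin. It was a sunny day.", "Stock prices rose sharply."]
relations = extract(docs, toy_scorer, toy_extractor)
print(relations)
```

Passing the models in as callables keeps the control flow testable without loading any model, which is the only reason for this design in the sketch.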

Knowledge Graph Construction

4. Data Conversion: Output is converted into triplets, the building blocks of the knowledge graph data structure.

5. Entity Linking: Triplets are linked to existing corresponding entities, or new ones are created.

6. Metadata Injection: Metadata such as concentration values, food parts, external references, and quality scores is compiled.
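The construction steps can be illustrated with a small in-memory graph. In this sketch, entity linking is reduced to matching on a normalized name, which is a deliberate simplification and an assumption rather than the method FoodAtlas uses; the class, method names, and metadata keys are likewise hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    entities: dict = field(default_factory=dict)  # normalized name -> entity id
    triplets: list = field(default_factory=list)

    def link_entity(self, name):
        """Entity linking: reuse the id of a known entity or create a new one."""
        key = name.strip().lower()
        if key not in self.entities:
            self.entities[key] = f"e{len(self.entities)}"
        return self.entities[key]

    def add_relation(self, head, relation, tail, **metadata):
        """Data conversion and metadata injection: store a triplet with its metadata."""
        triplet = {
            "head": self.link_entity(head),
            "relation": relation,
            "tail": self.link_entity(tail),
            "metadata": metadata,
        }
        self.triplets.append(triplet)
        return triplet


kg = KnowledgeGraph()
# The metadata values below are placeholders, not real measurements.
kg.add_relation("Garlic", "contains", "Allicin", food_part="bulb", quality_score=None)
kg.add_relation("garlic ", "contains", "Saponins")

# Both relations share one head entity, because "Garlic" and "garlic " normalize to the same key.
print(len(kg.entities), kg.triplets[0]["head"] == kg.triplets[1]["head"])
```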

FoodAtlas was created and is maintained by AIFS at the University of California, Davis.

About AIFS

The AI Institute for Next Generation Food Systems, or AIFS, aims to meet growing demands on our food supply by increasing efficiency with AI and bioinformatics across the entire system, from growing crops through consumption. We are dedicated to creating AI applications for a healthier, more sustainable planet from farm to fork.


This work is supported by AFRI Competitive Grant no. 2020-67021-32855/project accession no. 1024262 from the USDA National Institute of Food and Agriculture.

2025 AIFS. All rights reserved.