Technical Background

FoodAtlas Behind the Scenes

FoodAtlas is an AI-powered tool that maps the complex relationships between food, chemicals, and diseases. It not only identifies the types and quantities of chemicals in the foods we consume but also explores their potential health impacts.

Our system continuously monitors new research and extracts data on chemical concentrations and disease correlations. This data is cross-referenced with established databases, such as PubChem, and incorporated into our knowledge graph.

The following is a brief overview of some of the methods and technologies used. For a detailed look behind the scenes, refer to our first publication.



Knowledge Graph

FoodAtlas uses a knowledge graph to store and organize a large number of interconnected entities, including foods and chemicals, and the relations between them. Each connection is a triplet (one row in our knowledge base) consisting of a head, a tail, and the relationship between them.

In the closeup, Garlic connects to three entities and Garlic root to one, forming four triplets.
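The triplet structure described above can be sketched in a few lines of Python. This is only an illustration, not the FoodAtlas schema: the class name `Triplet`, the helper `neighbors`, the use of `contains` for every edge, and the tail names `chemical_a` to `chemical_d` are all placeholders, since the entities shown in the closeup are not named in the text.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    """One row of the knowledge base: head, relation, tail."""
    head: str
    relation: str
    tail: str


# Placeholder tails: the actual entities in the closeup are not named here.
triplets = [
    Triplet("Garlic", "contains", "chemical_a"),
    Triplet("Garlic", "contains", "chemical_b"),
    Triplet("Garlic", "contains", "chemical_c"),
    Triplet("Garlic root", "contains", "chemical_d"),
]


def neighbors(head: str, rows: list[Triplet]) -> list[str]:
    """Return the tails of all triplets whose head matches `head` exactly."""
    return [t.tail for t in rows if t.head == head]


print(len(triplets))                   # total number of triplets
print(neighbors("Garlic", triplets))   # entities connected to Garlic
```

Storing each connection as a flat row keeps the graph easy to filter and export; a graph library could be layered on top if traversal queries are needed.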

Graph Semantics

A node is either a Food, a Chemical, or a Disease.

An edge describes the relationship between two nodes. FoodAtlas captures contains relations, i.e., which chemicals are found in which foods. Those chemicals may in turn positively or negatively correlate with a disease. Our next iteration adds taxonomic information with is a relations, and extends positively/negatively correlates to foods that contain chemicals associated with diseases.
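One way to make these semantics concrete is a small table of allowed edge types. The sketch below covers only the current relations described above (contains from Food to Chemical, and positively/negatively correlates from Chemical to Disease). The names `NodeType`, `ALLOWED_EDGES`, and `is_valid_edge`, as well as the exact relation strings, are assumptions made for illustration.

```python
from enum import Enum


class NodeType(Enum):
    FOOD = "Food"
    CHEMICAL = "Chemical"
    DISEASE = "Disease"


# Relation name -> set of permitted (head type, tail type) pairs.
# Relation strings are illustrative, not the identifiers FoodAtlas uses.
ALLOWED_EDGES = {
    "contains": {(NodeType.FOOD, NodeType.CHEMICAL)},
    "positively correlates": {(NodeType.CHEMICAL, NodeType.DISEASE)},
    "negatively correlates": {(NodeType.CHEMICAL, NodeType.DISEASE)},
}


def is_valid_edge(head: NodeType, relation: str, tail: NodeType) -> bool:
    """Check whether a relation is permitted between the two node types."""
    return (head, tail) in ALLOWED_EDGES.get(relation, set())


print(is_valid_edge(NodeType.FOOD, "contains", NodeType.CHEMICAL))
print(is_valid_edge(NodeType.FOOD, "contains", NodeType.DISEASE))
```

Supporting the planned is a relation or food-to-disease correlations would then amount to adding entries to the table rather than changing the check.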


Pipeline

Our pipeline uses state-of-the-art AI models to extract and quantify food connections. The two major steps are (a) knowledge extraction, i.e., converting literature into food-chemical relations, and (b) knowledge graph construction, which adds metadata and new information to our knowledge base.

Knowledge Extraction

1. Filter Documents
Relevant, peer-reviewed literature is filtered using a list of more than 1,200 keywords.

2. Predict Relevant Sentences
Sentences likely to contain food information are predicted using BioBERT.

3. Extract Relations
Sentences are processed by GPT-4 to extract food-chemical relations.
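The three extraction steps can be sketched as a chain of plain functions. This is a simplified illustration under stated assumptions, not the FoodAtlas implementation: no model is actually called, the `score` and `extract` callables are stand-ins for BioBERT and GPT-4 respectively, the two-word `KEYWORDS` set is a hypothetical substitute for the real keyword list, and all function names are invented for this example.

```python
import re
from typing import Callable

Relation = tuple[str, str, str]

# Hypothetical stand-in for the real list of more than 1,200 keywords.
KEYWORDS = {"garlic", "onion"}


def filter_documents(docs: list[str], keywords: set[str] = KEYWORDS) -> list[str]:
    """Step 1: keep documents that mention at least one keyword."""
    return [d for d in docs if any(k in d.lower() for k in keywords)]


def split_sentences(doc: str) -> list[str]:
    """Naive sentence splitter on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]


def predict_relevant(sentences: list[str], score: Callable[[str], float],
                     threshold: float = 0.5) -> list[str]:
    """Step 2: keep sentences the classifier scores at or above the threshold."""
    return [s for s in sentences if score(s) >= threshold]


def extract_relations(sentences: list[str],
                      extract: Callable[[str], list[Relation]]) -> list[Relation]:
    """Step 3: collect the relations the extractor returns for each sentence."""
    return [rel for s in sentences for rel in extract(s)]


def run_extraction(docs, score, extract) -> list[Relation]:
    kept = filter_documents(docs)
    sentences = [s for d in kept for s in split_sentences(d)]
    return extract_relations(predict_relevant(sentences, score), extract)


# Toy stand-ins so the sketch runs without any model.
def toy_score(sentence: str) -> float:
    return 1.0 if "contains" in sentence else 0.0


def toy_extract(sentence: str) -> list[Relation]:
    pairs = re.findall(r"(\w+) contains (\w+)", sentence)
    return [(food, "contains", chemical) for food, chemical in pairs]


docs = ["Garlic contains allicin. The weather was mild.", "Stock prices rose."]
relations = run_extraction(docs, toy_score, toy_extract)
print(relations)
```

Passing the classifier and extractor in as callables keeps each stage testable on its own and lets the toy functions be swapped for real model calls later.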

Knowledge Graph Construction

4. Data Conversion
Output is converted into triplets, the building blocks of the knowledge graph data structure.

5. Entity Linking
Triplets are linked to existing corresponding entities, or new ones are created.

6. Metadata Injection
Metadata such as concentration values, food parts, external references, and quality scores is compiled.
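The construction steps can be illustrated with a minimal sketch. It makes several assumptions that are not taken from FoodAtlas itself: entity linking is reduced to an exact match on a lowercased, trimmed name (a real system would presumably match against external identifiers and synonyms), the ID format `e0`, `e1`, ... is invented, and `EntityRegistry`, `build_triplets`, and the metadata keys are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class EntityRegistry:
    """Maps normalized entity names to IDs, creating new IDs on demand."""
    ids: dict[str, str] = field(default_factory=dict)

    def link(self, name: str) -> str:
        key = name.strip().lower()
        if key not in self.ids:
            # No existing entity matches, so create a new one.
            self.ids[key] = f"e{len(self.ids)}"
        return self.ids[key]


def build_triplets(relations, registry: EntityRegistry) -> list[dict]:
    """Convert (head, relation, tail, metadata) tuples into linked triplet rows."""
    rows = []
    for head, relation, tail, metadata in relations:
        rows.append({
            "head": registry.link(head),
            "relation": relation,
            "tail": registry.link(tail),
            "metadata": dict(metadata),  # copy so rows do not share state
        })
    return rows


# Placeholder input: names and metadata values are illustrative only.
extracted = [
    ("Garlic", "contains", "chemical_a", {"food_part": "root"}),
    ("garlic ", "contains", "chemical_b", {}),
]
registry = EntityRegistry()
rows = build_triplets(extracted, registry)
print(rows)
```

Because both spellings of the food normalize to the same key, the two rows share one head ID while each chemical receives its own.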

FoodAtlas was created and is maintained by AIFS at the University of California, Davis.

About AIFS

The AI Institute for Next Generation Food Systems (AIFS) aims to meet growing demands in our food supply by increasing efficiencies using AI and bioinformatics across the entire system, from growing crops through consumption. We are dedicated to creating AI applications for a healthier, more sustainable planet from farm to fork.


This work is supported by AFRI Competitive Grant no. 2020-67021-32855/project accession no. 1024262 from the USDA National Institute of Food and Agriculture.

2024 AIFS. All rights reserved.