Transformer models are taking advantage of GPU compute.
Transformer-based models have proven very efficient to train on GPUs because they parallelize the ingestion of large amounts of data. Attention mechanisms let the model focus on specific parts of the input sequence while processing it, improving its ability to understand and generate complex patterns.
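The core operation behind this is attention over all positions at once. Below is a minimal sketch of scaled dot-product attention in plain NumPy; the function name, shapes, and toy data are illustrative assumptions, not taken from any specific library.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: arrays of shape (seq_len, d_model)."""
    d_k = Q.shape[-1]
    # Similarity of every query with every key, computed for all
    # positions in one matrix product -- this is what maps so well
    # onto GPU parallelism.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns the scores into attention weights per position.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output vector is a weighted mix of the value vectors,
    # i.e. the model "focuses" on the most relevant positions.
    return weights @ V

# Toy usage: 4 tokens with 8-dimensional embeddings, self-attention.
x = np.random.randn(4, 8)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```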
"Looking at previous words only”
Advertisement
Luke, I am your best worst mother
"Looking at all words at once"
Luke, I am your father
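The difference between the two patterns comes down to the attention mask. The sketch below, assuming the figure's example sentence, contrasts a causal mask (each word sees only previous words) with full attention (every word sees every other word); the variable names are hypothetical.

```python
import numpy as np

tokens = ["Luke,", "I", "am", "your", "father"]
n = len(tokens)

# Full attention: every position may attend to every other position.
full_mask = np.ones((n, n), dtype=bool)

# Causal attention: position i may only attend to positions <= i,
# i.e. "looking at previous words only".
causal_mask = np.tril(np.ones((n, n), dtype=bool))

print(causal_mask.astype(int))
# [[1 0 0 0 0]
#  [1 1 0 0 0]
#  [1 1 1 0 0]
#  [1 1 1 1 0]
#  [1 1 1 1 1]]
```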
Models like GPT-3 have been trained on terabytes of public text data. Yet these data sets pale in comparison to the full body of text-based content humans have created. Future state-of-the-art (SOTA) models will be trained on as-yet-untapped non-public and unstructured data.
[Figure: state-of-the-art LLMs have been trained on only a tiny fraction of human-created text; non-public text data such as emails remains largely untapped.]